Qset Generator #1582

Open
wants to merge 30 commits into
base: dev/10.3.0

cayb0rg
Contributor

@cayb0rg cayb0rg commented May 8, 2024

Up-to-date details of this PR available here:

#1582 (comment)

Qset Generator

Uses OpenAI's gpt-3.5-turbo model to generate a qset based on the structure of the widget's demo.json, and the dall-e-2 model to generate images.

Setup

Generate an OpenAI key and add it to your .env file like:

OPENAI_API_KEY=<your_key_here>

Run Materia as usual. Note: this PR adds the openai-php/client package via Composer, which may require reinstalling Materia if you already have it installed.

Create a widget. The generator only works on an instance that already has an ID, so new instances must be saved before using it.

Click "Generate Questions" in the action toolbar. Enter a descriptive topic (at least 3 words). You may also choose to extend the current qset, which keeps the current questions, and to include images. Including images takes much longer. See Widget Compatibility for what is supported.

Appending to the current qset does not always produce the exact number of questions requested. It also takes longer than generating a new qset with the same number of questions.

A log file called openai_usage.txt will be generated and stored in the /public folder with the generation times, tokens, costs, and more. I've attached an example openai_usage.txt from testing. Prompt tokens count the input prompt, which includes the demo qset, your topic, and additional instructions. Completion tokens count the output, i.e. what the model generated.
openai_usage.txt

gpt-3.5-turbo costs:

  • Input is $0.50 / 1M tokens
  • Output is $1.50 / 1M tokens
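
To get a feel for what these rates mean per request, here is a minimal sketch in Python (not the PR's PHP). The function name and the sample token counts are hypothetical; only the two rates come from the list above.

```python
# Rates quoted above for gpt-3.5-turbo, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.50
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one generation request in US dollars."""
    input_cost = prompt_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = completion_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Hypothetical token counts, not taken from the attached log.
print(f"${estimate_cost_usd(2_000, 1_000):.4f}")
```

The prompt and completion token counts recorded in openai_usage.txt could be plugged in the same way.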

Widget Compatibility

This list will be updated as individual widgets are tested.

  • Adventure: ❌
  • Crossword
    • Text: ✅
      • Appendable: ✅
    • Images: N/A
  • Enigma
    • Text: ✅ (Generates questions but not category names, and can only generate a maximum of 3 questions at a time)
      • Appendable: ✅
    • Images: ❌ (Generates images, but times out)
  • Equation Sandbox: ❌
  • Flashcards
    • Text: ✅
      • Appendable: ❌
    • Images: ✅ (Most of the time)
  • Guess The Phrase
    • Text: ✅
      • Appendable: ✅
    • Images: N/A
  • Labeling
    • Text: ❌ (The AI just replaces the text in the demo qset, so while it "works", it doesn't work as desired)
    • Images: ❌
  • Last Chance Cadet
    • Text: ✅
      • Appendable: ✅
    • Images: N/A
  • Matching: Because the demo is based on "Spanish Verbs", the generated questions default to language and vocabulary matching. A long, specific topic works best.
    • Text: ✅
      • Appendable: ✅
    • Images: N/A
  • Sequencer
    • Text: ✅
      • Appendable: ✅
    • Images: N/A
  • Survey
    • Text: ✅
      • Appendable: ✅
    • Images: ❌
  • Sort It Out!
    • Text: ✅
      • Appendable: ✅
    • Images: ✅
  • Syntax Sorter
    • Text: ✅
      • Appendable: ❌
    • Images: N/A
  • This or That: Text-based answers store their text in the asset field, making text-based answers nearly impossible to generate correctly with the current prompt.
    • Text: ✅ (Works only with a custom prompt, to be defined in the install.yaml: "Each answer is stored in the answer's options > asset. Inside asset, set the 'materiaType' attribute to 'text', the 'type' attribute to 'text', and the 'value' attribute to the actual answer.")
    • Images: ✅
  • Word Search
    • Text: ✅
      • Appendable: ✅
    • Images: N/A
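
For reference, the structure that the This or That custom prompt asks the model to produce can be sketched like this. It is written in Python for illustration only; the helper name is hypothetical, and only the options > asset nesting and its three attributes come from the prompt text. Anything else a real answer object carries is left out.

```python
def make_text_answer_options(answer_text: str) -> dict:
    """Build the options > asset structure described by the custom prompt."""
    return {
        "asset": {
            "materiaType": "text",  # per the prompt: set to 'text'
            "type": "text",         # per the prompt: set to 'text'
            "value": answer_text,   # the actual answer
        }
    }


# Hypothetical answer object: only the "options" key is shown.
answer = {"options": make_text_answer_options("Mitochondria")}
print(answer)
```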

More on images...

Generating images is pricey, at $0.016 per 256x256 image. The time it takes to generate them is even more exorbitant; however, you can still test it on This or That if you'd like. Also, OpenAI will yell at us if it is asked to generate images with real people in them, which may happen by accident if the questions the model generated contain real people's names.
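
For a sense of scale, a rough comparison in Python: how many gpt-3.5-turbo output tokens cost the same as one image. Both prices are the ones quoted in this description; the function name is hypothetical.

```python
IMAGE_USD = 0.016              # per 256x256 image, as quoted above
OUTPUT_USD_PER_MILLION = 1.50  # gpt-3.5-turbo output rate, as quoted earlier


def image_cost_in_output_tokens(images: int = 1) -> float:
    """Return how many output tokens cost the same as `images` images."""
    return images * IMAGE_USD * 1_000_000 / OUTPUT_USD_PER_MILLION


print(round(image_cost_in_output_tokens()))
```

At these rates, one image costs about as much as 10,667 output tokens.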

Furthermore, image generation is not compatible with most widgets, mainly because widgets store their assets in weird places, which makes them hard to track and manipulate. Therefore, images will most likely never make it to production, but feel free to try it out!

To-Do

  • For question generation to work, the widget must be saved first. Disable the question generator until a new instance is saved.
  • Make UI Pretty™
  • Create various error states
  • Test more widgets like Secret Spreadsheet, Radar Grapher, and more
  • Add fields to install.yaml on generation compatibility. Disable in creator if not compatible.
  • Add prompt field to install.yaml
  • Action bar in creator is now overflowing on smallish screens
  • Clean up.

@clpetersonucf clpetersonucf changed the base branch from master to dev/10.2.0 June 12, 2024 17:49
@clpetersonucf
Member

This is amazing. As discussed:

  • Widgets should have to opt-in to allow question generation, instead of having it enabled globally or by default
  • Question IDs generated by the LLM should be null instead of randomly generated, to prevent collisions
  • The logs generated by prompt requests should be added to the .gitignore
  • The Generate Questions option should be disabled globally when the API key isn't set (or, alternatively, we add a second env variable to explicitly enable or disable the feature)
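
The null-ID point could look something like the sketch below. It is written in Python for illustration rather than the PHP used in Materia, and it assumes the generated qset decodes to nested dicts and lists with "id" keys. That shape, the function name, and the sample data are all hypothetical. Note that this version clears every "id" key it finds; a real implementation may want to target question and answer IDs only.

```python
from typing import Any


def clear_generated_ids(node: Any) -> Any:
    """Return a copy of a decoded qset with every "id" value set to None."""
    if isinstance(node, dict):
        return {
            key: None if key == "id" else clear_generated_ids(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [clear_generated_ids(item) for item in node]
    return node  # strings, numbers, booleans, and None pass through unchanged


# Hypothetical generated qset with made-up IDs from the model.
generated = {
    "items": [
        {"id": "a1b2", "answers": [{"id": "c3d4", "text": "Paris"}]},
    ]
}
print(clear_generated_ids(generated))
```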

Some additional thoughts:

  • We might want to consider a variant of the Save History Keep/Cancel action bar overlay for this feature, to address that this is the generated qset and to highlight that if the content didn't load correctly (because the creator encountered an error with the generated qset, etc) the user can cancel and go back
  • Assuming we add a flag in the install.yaml, would it also be possible to add custom prompts, to better address variations in individual widget engines? Would this improve some of the compatibility issues with certain widgets?

@cayb0rg
Contributor Author

cayb0rg commented Jun 12, 2024

Thanks for the feedback! Included some fixes for the things we discussed.

We might want to consider a variant of the Save History Keep/Cancel action bar overlay for this feature, to address that this is the generated qset and to highlight that if the content didn't load correctly (because the creator encountered an error with the generated qset, etc) the user can cancel and go back

Didn't even notice this, haha. We're definitely not "Previewing Prior Save". Apart from changing the title, are you suggesting changing "Select Cancel to go back to the version you were working on. Select Keep to commit to using this version." and/or altering the behavior of the "Cancel" button?

Assuming we add a flag in the install.yaml, would it also be possible to add custom prompts, to better address variations in individual widget engines? Would this improve some of the compatibility issues with certain widgets?

This could work for sure!

@clpetersonucf
Member

Didn't even notice this, haha. We're definitely not "Previewing Prior Save". Apart from changing the title, are you suggesting changing "Select Cancel to go back to the version you were working on. Select Keep to commit to using this version." and/or altering the behavior of the "Cancel" button?

The behavior of Keep and Cancel should probably stay the same; I'm just thinking of adding some additional language to clarify that the creator loaded the questions generated by AI, and that if the creator did not load properly you can hit Cancel to return to where you were. Something to that effect, since the likelihood of a creator-breaking qset is higher.

@cayb0rg
Contributor Author

cayb0rg commented Jun 18, 2024

Added custom prompts to the install.yaml for widgets, which fixed the problem This or That was having! I also added a custom confirmation dialog and a bit of additional language to the question generator dialog.

@clpetersonucf clpetersonucf changed the base branch from dev/10.2.0 to dev/10.3.0 July 16, 2024 19:19
@clpetersonucf
Member

clpetersonucf commented Aug 21, 2024

Major changes/additions since Cay's departure:

  • Moved bulk of generation logic from the API endpoint methods into a new Widget_Question_Generator class.
  • OpenAI PHP client will initialize differently depending on whether the provider generation configuration is set to openai or azure_openai, in order to support both options.
  • Lots of refinements to prompt construction.
  • Reworked image generation to make multiple consecutive requests to DALL-E 2 to generate images associated with discrete and different descriptions. Image assets are now returned as base64 strings which are properly bootstrapped into assets. Reworked assign asset method to recursively attach an asset to a qset based on the description string.
  • Moved some configurations into additional environment variables.
  • Adjustments to qset hot-swapping in the creator to allow for qset generation with unsaved instances.
  • Removed the is_generable API endpoint and bundled the is_generable flag as a widget class property. The property is a composite of the widget's is_generable flag and the system's global flag for question generation.
  • The generation feature is available for widget creators when the is_generable flag returns true. Widget engines must opt in for question generation via the is_generable property in their install.yaml.
  • The widget engine is_generable flag is now globally disabled when a user is not authenticated or doesn't have the basic_author role.
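
As a rough illustration of how the last three points combine, the composite flag can be thought of as a conjunction of the widget's opt-in, the system's global flag, and the user checks. The sketch below is in Python, not the PR's PHP; the User type, parameter names, and function signature are hypothetical, and only the is_generable name and the basic_author role come from the notes above.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical stand-in for the current user."""
    authenticated: bool
    roles: set = field(default_factory=set)


def is_generable(widget_opted_in: bool, globally_enabled: bool, user: User) -> bool:
    """Composite flag: every condition must hold for generation to be offered."""
    return (
        widget_opted_in                   # is_generable set in install.yaml
        and globally_enabled              # system-wide generation flag
        and user.authenticated            # no anonymous users
        and "basic_author" in user.roles  # role named in the notes above
    )


author = User(authenticated=True, roles={"basic_author"})
print(is_generable(True, True, author))
```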

Compatible Widget Engines

How do I update a widget to support question generation?

  1. Add the is_generable: Yes property to a widget's install.yaml
  2. Ensure history validation passthrough is supported in the creator's onSaveClicked(mode) method: if mode is 'history', validation should be bypassed.
  3. If required, add a generation_prompt field to the meta_data section of a widget's install.yaml. This prompt should provide clarifying instructions to facilitate the proper and consistent formatting of a qset.

@clpetersonucf clpetersonucf marked this pull request as ready for review September 3, 2024 20:47

2 participants