Sign up for an account
Sign up for an account at https://app.getplum.ai
Datasets
To get started, Plum AI needs to ingest a dataset of inputs to your LLM and the outputs it produced.
This dataset is used to evaluate your application, which will drive the improvements you’ll make.
There are multiple ways to upload data to Plum:
- Upload a file containing the data
- Use the Python SDK: `pip install plum-sdk`
- Use our API (see the API reference)
Here’s how to upload a JSON file directly to Plum:
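As a minimal sketch, the client and method names below (`PlumClient`, `upload_data`) and the payload shape are illustrative assumptions, not the SDK's confirmed interface; check the API reference for the exact names:

```python
import json

# Hypothetical client/method names -- verify against the API reference.
from plum_sdk import PlumClient

# Each record pairs an input sent to your LLM with the output it returned.
with open("dataset.json") as f:
    examples = json.load(f)  # e.g. [{"input": "...", "output": "..."}, ...]

client = PlumClient(api_key="YOUR_PLUM_API_KEY")
client.upload_data(examples)
```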
Evaluation workflow
Generate evaluation metrics
Based on the uploaded data and your system prompt, Plum AI can get you started with quantitative evaluation of your outputs.
To get started, click on “Generate Evaluation Metrics” for a set of metrics tailored for your specific use case.
You can edit these metrics or add your own.
Use generated metrics
Next, you can run an evaluation on your dataset using these metrics.
This will give you a quantitative understanding of how well your LLM is performing based on the data provided.
Click “Run Evaluation” and Plum AI will provide a statistical analysis of your outputs within seconds.
You now have a snapshot of your LLM application’s performance, which you can track over time.
Beyond the scores themselves, Plum AI also explains why and where your LLM is underperforming on specific metrics.
This allows you to iterate and improve particular aspects of your LLM performance over time.
Fine-tuning workflow
Generate synthetic data driven by the evaluation scores
Now that you have evaluations tailored to your preferences, you may want to fine-tune your LLM.
Providers such as OpenAI and Anthropic let you fine-tune models on a dataset of positive examples that you upload to their platforms.
Plum AI can leverage your evaluation results to give you exactly the right data for fine-tuning a model.
Choose the size of the synthetic dataset you want to generate from your initial seed dataset.
For reference, fine-tuning a model typically requires at least 100 examples.
Click the “Generate” button to generate synthetic data based on evaluation scores.
Click on “Download in OpenAI’s .jsonl format” to download the synthetic dataset in the format required by OpenAI’s platform.
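For reference, each line of the downloaded file is a standalone JSON object in OpenAI's chat fine-tuning format; the content below is illustrative:

```jsonl
{"messages": [{"role": "system", "content": "You are a helpful support agent."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "Go to Settings > Account and choose Reset password."}]}
```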
Upload the synthetic data to a major LLM provider, such as OpenAI, via its fine-tuning API
Once you have a train.jsonl from the previous step, you can optionally create another file, validation.jsonl, from real held-out data that you didn't use in the seed dataset.
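If your held-out examples are already in the same chat format, writing validation.jsonl takes a few lines of Python (the held_out.json file name is illustrative):

```python
import json

# Assumes held_out.json contains examples in the same
# {"messages": [...]} chat format used for train.jsonl.
with open("held_out.json") as f:
    held_out = json.load(f)

with open("validation.jsonl", "w") as f:
    for example in held_out:
        f.write(json.dumps(example) + "\n")
```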
- Go to the OpenAI fine-tuning page: https://platform.openai.com/finetune
- Click “Create”.
- Upload new training data.
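If you prefer the API over the web UI, the same steps look roughly like this with OpenAI's Python SDK (the base model name is just an example; pick any model OpenAI currently supports for fine-tuning):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training file (and, optionally, the validation file).
train_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
val_file = client.files.create(file=open("validation.jsonl", "rb"), purpose="fine-tune")

# Start the fine-tuning job.
job = client.fine_tuning.jobs.create(
    training_file=train_file.id,
    validation_file=val_file.id,
    model="gpt-4o-mini-2024-07-18",  # example base model
)
print(job.id)
```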
After around 15 minutes, the fine-tuning run completes, and OpenAI provides the ID of your customized model, which you can start using right away.
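Once the job finishes, pass the returned model ID as the model parameter in a normal chat completion call (the ID below is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="ft:gpt-4o-mini-2024-07-18:your-org::example-id",  # placeholder fine-tuned model ID
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)
print(response.choices[0].message.content)
```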
Congratulations! You’ve completed one round of fine-tuning using Plum AI.
Unlock your data flywheel: generate a new set of data using your fine-tuned model, create synthetic data using Plum AI, and start another round of fine-tuning.