POST /augment
Augment seed data to generate synthetic data
curl --request POST \
  --url https://beta.getplum.ai/v1/augment \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "multiple": 1,
  "eval_results_id": "<string>",
  "include_gathered": false,
  "pair_query": {
    "latest_n_pairs": 123,
    "pair_labels": [
      "<string>"
    ]
  },
  "target_metric_idx": [
    123
  ]
}'
{
  "synthetic_data_id": "<string>",
  "created_at": "<string>",
  "seed_data_size": 123,
  "synthetic_data_size": 123,
  "system_prompt": "<string>",
  "target_metrics": [
    "<string>"
  ]
}

Authorizations

Authorization
string
header
required

Body

application/json
target_metric_idx
integer[]
required

Array of indices of target metrics for redrafting synthetic data (from the evaluation results)

multiple
integer
default:1

Number of synthetic examples to generate per seed example (max 50)

Required range: x <= 50
eval_results_id
string

ID of evaluation results to use for target metrics (will use latest if not provided)

include_gathered
boolean
default:false

If true, includes gathered pairs (high-scoring and positively critiqued) in the synthetic dataset

pair_query
object

Optional query parameters to filter seed dataset pairs. The request example shows two fields: latest_n_pairs (integer) and pair_labels (string[]).
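
The body fields above can be assembled and checked on the client before sending. The following is a minimal sketch using only the Python standard library, not an official client. The helper names (`build_augment_payload`, `build_request`, `send`) are hypothetical. The URL and header layout are copied from the curl example, and the client-side checks mirror only the constraints documented here (`target_metric_idx` is required, `multiple` must not exceed 50).

```python
import json
import urllib.request

# URL taken from the curl example on this page.
AUGMENT_URL = "https://beta.getplum.ai/v1/augment"


def build_augment_payload(target_metric_idx, multiple=1, eval_results_id=None,
                          include_gathered=False, pair_query=None):
    """Build the JSON body for POST /augment (hypothetical helper)."""
    # target_metric_idx is the only field documented as required.
    if target_metric_idx is None:
        raise ValueError("target_metric_idx is required")
    if not all(isinstance(i, int) for i in target_metric_idx):
        raise ValueError("target_metric_idx must be a list of integers")
    # Documented range: x <= 50. No lower bound is documented, so none is enforced.
    if multiple > 50:
        raise ValueError("multiple must be <= 50")

    payload = {
        "target_metric_idx": list(target_metric_idx),
        "multiple": multiple,
        "include_gathered": include_gathered,
    }
    # Optional fields are omitted when unset so the server-side defaults apply.
    if eval_results_id is not None:
        payload["eval_results_id"] = eval_results_id
    if pair_query is not None:
        payload["pair_query"] = pair_query
    return payload


def build_request(api_key, payload):
    """Wrap the payload in a POST request with the headers from the curl example."""
    return urllib.request.Request(
        AUGMENT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `build_request(api_key, build_augment_payload([0, 2], multiple=3))` prepares a request for three synthetic examples per seed pair, targeting the metrics at indices 0 and 2 of the latest evaluation results.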

Response

Data successfully augmented

synthetic_data_id
string

ID of the generated synthetic dataset

created_at
string

Timestamp when the synthetic data was created

seed_data_size
integer

Number of pairs in the original seed dataset

synthetic_data_size
integer

Total number of pairs in the synthetic dataset (including gathered pairs if include_gathered is true)

system_prompt
string

System prompt used for the synthetic data

target_metrics
string[]

Array of target metrics that were used for redrafting
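
The response fields above map naturally onto a small typed record. The sketch below is illustrative only: `AugmentResult` and `parse_augment_response` are hypothetical names, and it assumes every documented field is present in a successful response. `created_at` is kept as a string because its timestamp format is not specified here.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class AugmentResult:
    """Typed view of the documented /augment response fields."""
    synthetic_data_id: str
    created_at: str          # format not documented, so left unparsed
    seed_data_size: int
    synthetic_data_size: int
    system_prompt: str
    target_metrics: List[str]


def parse_augment_response(body: str) -> AugmentResult:
    """Decode a JSON response body into an AugmentResult.

    Raises KeyError if a documented field is missing (assumption:
    all fields are always returned on success).
    """
    data = json.loads(body)
    return AugmentResult(
        synthetic_data_id=data["synthetic_data_id"],
        created_at=data["created_at"],
        seed_data_size=data["seed_data_size"],
        synthetic_data_size=data["synthetic_data_size"],
        system_prompt=data["system_prompt"],
        target_metrics=list(data["target_metrics"]),
    )
```

Comparing `synthetic_data_size` with `seed_data_size` gives a quick sanity check on the result. Note that `synthetic_data_size` also counts gathered pairs when `include_gathered` is true.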
