# Get Started: Running AI Task Builder Batches via Prolific’s API

# What you’ll accomplish

* Upload raw data points (e.g., text, image URLs) into an **AI Task Builder Dataset**
* Create an **AI Task Builder Batch** and attach **Instructions** (free text and/or multiple choice)
* Convert your dataset into **tasks** and wait until the batch is **READY**
* Create and **publish a Prolific study** that references the batch
* Pull **annotated responses** from the batch

# Prerequisites

* Your **workspace_id**
* Your **API token** (`Authorization: Token ...`)
* A decision on how many **tasks per participant** to show (via `tasks_per_group`)
* Your **study** settings (reward, sample size, timing, targeting)

# Step-by-step guide

## Create a Dataset

Create a container that will hold the datapoints participants will annotate. The response includes an `id` (your `dataset_id`) and a `status` (one of `UNINITIALISED|PROCESSING|READY|ERROR`).

## Request an Upload URL for Your Data

Upload one or more files (CSV/JSONL/etc.) that contain the datapoints. **The response body is structured as follows:**

```json
{
  "upload_url": "string",
  "expires_at": "string",  // ISO-8601 DateTime string
  "http_method": "string", // PUT in all instances
  "dataset_id": "string"
}
```

## Upload Your Data (files → dataset)

Upload the file containing your datapoints to the signed URL:

```bash
curl -X PUT "UPLOAD_URL_FROM_PREVIOUS_STEP" \
  --data-binary "@{path_to_file}"
```

**Poll the dataset status** until the file has been processed and the response reports:

```json
{"status": "READY"}
```

## Create a Batch

Batches bind a dataset and instructions into something you can attach to a Prolific study. You can also provide richer task details here: the `task_introduction` and `task_steps` fields support basic HTML, so they can carry complex introductions and step lists.

**Note:** `dataset` is optional; it can be omitted and added later via update.

Save the returned `id` as `BATCH_ID`.
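The "poll until `READY`" step above can be sketched as a small helper. This is a minimal sketch in Python: `fetch_status` is a hypothetical callable you would implement yourself (a GET to the dataset endpoint with your API token, reading `status` from the JSON body); only the status values `UNINITIALISED|PROCESSING|READY|ERROR` come from this guide.

```python
import time
from typing import Callable


def wait_until_ready(fetch_status: Callable[[], str],
                     poll_interval: float = 2.0,
                     max_attempts: int = 30) -> str:
    """Poll a status-returning callable until it reports READY.

    Fails fast on ERROR rather than polling pointlessly.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status == "READY":
            return status
        if status == "ERROR":
            raise RuntimeError("Processing failed (status=ERROR)")
        time.sleep(poll_interval)
    raise TimeoutError("Resource did not reach READY in time")


# Stubbed fetcher simulating PROCESSING -> READY. In practice this lambda
# would issue an authenticated GET and return response_json["status"].
_statuses = iter(["UNINITIALISED", "PROCESSING", "READY"])
result = wait_until_ready(lambda: next(_statuses), poll_interval=0.0)
print(result)  # READY
```

The same helper works unchanged for the batch-status polling step later in this guide, since batches report the same status values.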
### Update a Batch (optional)

You can update a batch’s name, task details, or associated dataset after creation. All fields are optional; include only what you want to update.

### Duplicate a Batch (optional)

You can create a copy of an existing batch, either with or without its dataset.

```bash
# Duplicate with the same dataset (the dataset is shared, not copied)
curl -X POST https://api.prolific.com/api/v1/data-collection/batches/BATCH_ID/duplicate \
  -H "Authorization: Token YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Copy of My Batch" }'

# Or duplicate without a dataset (requires a new upload)
curl -X POST https://api.prolific.com/api/v1/data-collection/batches/BATCH_ID/duplicate \
  -H "Authorization: Token YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Copy of My Batch", "upload_new_dataset": true }'
```

If `name` is omitted, the duplicate is named `[Original Batch Name] (Copy)`.

**Note:** When duplicating with a dataset, both batches reference the same dataset; it is not duplicated.

## Add Instructions to the Batch

You can create **multiple** instructions. Supported types:

* `multiple_choice` (requires `options: [{label, value}, ...]`)
* `free_text` (optional `placeholder_text_input`)

You can later **PUT** the same endpoint to update instructions (include the complete payload, as it replaces all instructions on the batch) or **GET** it to read them back.

## Set Up the Batch (build tasks, set tasks per participant)

Configure how datapoints become tasks and how many tasks each participant receives.

**Note:** `dataset_id` is required and the dataset must be in `READY` status.
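As a sketch of how an instructions payload might be assembled before sending it, the helpers below build the two supported types. The `type`, `options: [{label, value}]`, and `placeholder_text_input` fields come from this guide; the `instruction` key holding the prompt text is an assumption for illustration — check the API reference for the exact field name.

```python
import json


def multiple_choice(prompt, options):
    """Build a multiple_choice instruction from (label, value) pairs."""
    return {
        "type": "multiple_choice",
        "instruction": prompt,  # field name assumed, not confirmed by this guide
        "options": [{"label": label, "value": value} for label, value in options],
    }


def free_text(prompt, placeholder=None):
    """Build a free_text instruction with an optional placeholder."""
    inst = {"type": "free_text", "instruction": prompt}
    if placeholder is not None:
        inst["placeholder_text_input"] = placeholder
    return inst


payload = {
    "instructions": [
        multiple_choice(
            "Which sentiment fits best?",
            [("Positive", "A"), ("Negative", "B"), ("Neutral", "C")],
        ),
        free_text("Explain your choice.", placeholder="Type your reasoning here"),
    ]
}
print(json.dumps(payload, indent=2))
```

Because a PUT to the instructions endpoint replaces everything on the batch, building the complete payload in one place like this helps avoid accidentally dropping an instruction on update.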
Now **poll the batch status** until it’s **READY**:

```json
{"status": "READY"}
```

Possible values: `UNINITIALISED|PROCESSING|READY|ERROR`.

## Create a Prolific Study that References the Batch

When the batch is **READY**, create a standard study, but include these two fields:

* `"data_collection_method": "DC_TOOL"`
* `"data_collection_id": "BATCH_ID"`

## Publish the Study

Other actions available include `PAUSE`, `START` (resume), and `STOP`.

## Retrieve Annotated Responses

Pull back the participant answers as they come in. The response body looks like this:

```json
{
  "responses": [
    {
      "id": "...",
      "created_at": "...",
      "batch_id": "BATCH_ID",
      "participant_id": "...",
      "task_id": "...",
      "response": {
        "instruction_id": "UUID",
        "type": "multiple_choice" | "free_text",
        "answer": "B" | "free-text string"
      }
    }
  ]
}
```

# Operational Notes & Tips

* **Statuses**: Datasets and batches both progress through `UNINITIALISED → PROCESSING → READY` (or `ERROR`). Don’t attach a batch to a study until it’s `READY`.
* **Instructions**: You can mix multiple instruction types within the same batch (e.g., two `multiple_choice` plus one `free_text`). Once a batch has been set up (via the setup endpoint above), you can no longer edit its instructions.
* **`tasks_per_group`**: Controls how many tasks a single participant receives in one “sitting” (which counts as a single Submission on Prolific). Increase it to reduce study overhead and per-task price dispersion; decrease it if tasks are long.
* **Targeting & Reward**: As with any Prolific study, all participants share the same targeting and reward.
* **Dataset requirements**: Datasets do not need to be in `READY` status when creating, updating, or duplicating batches. However, the dataset **must** be `READY` before running the setup endpoint.
* **Batch duplication**: When duplicating a batch with a dataset, the dataset is shared between both batches, not duplicated. If you need separate datasets, use `upload_new_dataset: true`.
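Once responses start arriving, the shape shown under "Retrieve Annotated Responses" lends itself to simple aggregation. A minimal sketch, using only fields documented in this guide: tally `multiple_choice` answers per instruction and collect `free_text` answers for review.

```python
from collections import Counter, defaultdict


def tally_responses(payload):
    """Split annotated responses into per-instruction tallies and free text."""
    choice_counts = defaultdict(Counter)  # instruction_id -> Counter of answers
    free_texts = defaultdict(list)        # instruction_id -> list of strings
    for record in payload["responses"]:
        resp = record["response"]
        if resp["type"] == "multiple_choice":
            choice_counts[resp["instruction_id"]][resp["answer"]] += 1
        else:  # free_text
            free_texts[resp["instruction_id"]].append(resp["answer"])
    return dict(choice_counts), dict(free_texts)


# Sample data matching the documented response shape (IDs are placeholders).
sample = {"responses": [
    {"id": "1", "batch_id": "BATCH_ID", "participant_id": "p1", "task_id": "t1",
     "response": {"instruction_id": "i1", "type": "multiple_choice", "answer": "B"}},
    {"id": "2", "batch_id": "BATCH_ID", "participant_id": "p2", "task_id": "t1",
     "response": {"instruction_id": "i1", "type": "multiple_choice", "answer": "B"}},
    {"id": "3", "batch_id": "BATCH_ID", "participant_id": "p1", "task_id": "t2",
     "response": {"instruction_id": "i2", "type": "free_text", "answer": "Looks fine."}},
]}
counts, texts = tally_responses(sample)
print(counts["i1"]["B"])  # 2
```

Since responses arrive incrementally while the study runs, you would typically re-fetch and re-tally on a schedule rather than once at the end.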