# Exporting Batch Data

Once participants have completed your batch study, you can export all responses and uploaded files as a single ZIP archive.

Exports are generated asynchronously — the API responds immediately with a job ID, and the archive is built out-of-band. This keeps the request fast even for large batches with many uploaded files.

## Workflow overview

<Steps>
  <Step>
    Request an export by sending a `POST` request to the export endpoint. The API returns a job ID immediately.
  </Step>

  <Step>
    Poll for completion by sending `GET` requests with your job ID until the status is `complete` or `failed`.
  </Step>

  <Step>
    Download the ZIP archive from the presigned URL included in the `complete` response.
  </Step>

  <Step>
    Work with the exported data — load `responses-by-submission.jsonl` or `responses-by-task.jsonl` for analysis, or extract files from the `files/` directory.
  </Step>
</Steps>
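Under the same base URL and `Token` auth scheme used in the polling example later on this page, the four steps can be sketched end-to-end in Python (a minimal sketch with deliberately simple error handling and a fixed 3-second poll interval; the `export_batch` helper name is illustrative):

```python
import time
import requests

BASE = "https://api.prolific.com/api/v1/data-collection"

def export_batch(batch_id: str, token: str, dest: str = "export.zip") -> str:
    """Request, poll, and download a batch export. Returns the path to the ZIP."""
    headers = {"Authorization": f"Token {token}"}

    # Step 1: request an export. A 200 response means a recent export already
    # exists and carries the download URL directly; a 202 returns an export_id.
    r = requests.post(f"{BASE}/batches/{batch_id}/export", headers=headers)
    r.raise_for_status()
    data = r.json()

    # Step 2: poll until the job leaves the generating state
    # (skipped entirely when the status is already complete).
    while data["status"] == "generating":
        time.sleep(3)
        r = requests.get(
            f"{BASE}/batches/{batch_id}/export/{data['export_id']}",
            headers=headers,
        )
        r.raise_for_status()
        data = r.json()

    if data["status"] != "complete":
        raise RuntimeError(f"Export failed for batch {batch_id}")

    # Step 3: download the ZIP from the presigned URL (no auth header needed).
    with requests.get(data["url"], stream=True) as dl:
        dl.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in dl.iter_content(chunk_size=1 << 16):
                f.write(chunk)
    return dest
```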

## Using the Prolific CLI

The [Prolific CLI](/tooling/cli) handles the full request, poll, and download flow in a single command:

```bash
prolific batch export <batch-id>
```

By default the archive is saved to `<batch-id>-export-<YYYYMMDD-HHMMSS>.zip` in the current directory. Use `--output` to specify a path:

```bash
prolific batch export <batch-id> --output ./my-export.zip
```

The command requires the `PROLIFIC_TOKEN` environment variable to be set and researcher access to the batch's workspace.

## Requesting an export

```bash
POST /api/v1/data-collection/batches/{batch_id}/export
```

No request body is required.

### Responses

If a new export job is created, the API returns `202 Accepted`:

```json
{
  "status": "generating",
  "export_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```

If a recent export for this batch already exists, the API returns `200 OK` with the download URL immediately — you can skip polling and go straight to the download step.

```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```

<Note>
  Export requests are idempotent. Re-sending `POST` for a batch that is already generating or has a complete export returns the existing job ID or download URL rather than triggering a new generation.
</Note>

## Polling for completion

Use the `export_id` from the `POST` response to check the status of your export job.

```bash
GET /api/v1/data-collection/batches/{batch_id}/export/{export_id}
```

Poll at a reasonable interval (every 3–5 seconds) until the status changes.

| Status       | Meaning                                                    | Next step                     |
| ------------ | ---------------------------------------------------------- | ----------------------------- |
| `generating` | The archive is still being built                           | Continue polling              |
| `complete`   | The archive is ready — `url` and `expires_at` are included | Download the ZIP              |
| `failed`     | Generation failed or the archive was deleted               | Retry by sending `POST` again |

### Complete response

```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```

The `url` is a presigned HTTPS link valid for 24 hours. Re-poll the `GET` endpoint to receive a refreshed URL if it has expired.
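Because the URL is presigned, the download itself needs no `Authorization` header. A minimal sketch of a streaming download (the `download_export` helper name and chunk size are illustrative) that keeps memory use flat for large archives:

```python
import requests

def download_export(url: str, dest: str = "export.zip") -> str:
    """Stream the presigned URL to disk without loading the ZIP into memory."""
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 16):
                f.write(chunk)
    return dest
```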

### Polling example

```python
import time
import requests

def poll_export(batch_id, export_id, token, timeout=600):
    headers = {"Authorization": f"Token {token}"}
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.get(
            f"https://api.prolific.com/api/v1/data-collection/batches/{batch_id}/export/{export_id}",
            headers=headers,
        )
        r.raise_for_status()
        data = r.json()
        if data["status"] == "complete":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"Export failed for batch {batch_id}")
        time.sleep(3)
    raise TimeoutError("Export did not complete within the timeout period")
```

## Archive format

The downloaded ZIP contains the following structure:

```
export.zip
├── responses-by-submission.jsonl
├── responses-by-task.jsonl
├── batch.json
├── README.md
└── files/
    └── {submission_id}_{instruction_id}_{index}.{ext}
```

The `files/` directory is only present if participants uploaded files. The `README.md` inside the archive contains a quick-start guide and a pandas example.

The archive ships two JSONL files covering the same underlying data from different angles — choose whichever suits your analysis:

| File                            | One record per         | Best for                                    |
| ------------------------------- | ---------------------- | ------------------------------------------- |
| `responses-by-submission.jsonl` | Participant submission | Quality review, per-participant analysis    |
| `responses-by-task.jsonl`       | Task                   | Content-centric analysis, agreement metrics |

### responses-by-submission.jsonl

Each line is a JSON object representing one participant's submission, with all of their responses across every task they completed.

```json
{
  "submission_id": "sub-abc123",
  "participant_id": "part-def456",
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "created_at": "2026-03-19T10:30:00Z",
  "responses": {
    "task-id-1": {
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "A red leather wallet with two card slots"
      },
      "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d": {
        "type": "file_upload",
        "description": "Upload a photo of the item",
        "files": [
          {
            "name": "photo.jpg",
            "path": "files/sub-abc123_0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d_0.jpg"
          }
        ]
      }
    }
  }
}
```

### responses-by-task.jsonl

Each line is a JSON object representing one task, with all participant responses for that task collected together.

```json
{
  "task_id": "task-xyz789",
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "responses": [
    {
      "submission_id": "sub-abc123",
      "participant_id": "part-def456",
      "created_at": "2026-03-19T10:30:00Z",
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "A red leather wallet with two card slots"
      }
    },
    {
      "submission_id": "sub-ghi789",
      "participant_id": "part-jkl012",
      "created_at": "2026-03-19T11:15:00Z",
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "Red bifold wallet, appears to be leather"
      }
    }
  ]
}
```

The `path` for file uploads is relative to the archive root, so it can be used directly after extraction.
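Because the task-centric file groups every participant's answer to the same instruction, it is the natural input for agreement metrics. A minimal sketch (the `free_text_answers` helper is illustrative, and an inline copy of the sample record above stands in for reading the real file) that collects all free-text answers per instruction:

```python
import json

def free_text_answers(task_record: dict) -> dict:
    """Collect all free_text answers per instruction ID for one task record."""
    answers: dict = {}
    for response in task_record["responses"]:
        # Instruction responses sit alongside submission metadata keys,
        # so keep only dict values that carry a response type.
        for key, value in response.items():
            if isinstance(value, dict) and value.get("type") == "free_text":
                answers.setdefault(key, []).append(value["value"])
    return answers

# One line of responses-by-task.jsonl (the sample record above, abbreviated):
record = json.loads("""
{"task_id": "task-xyz789",
 "responses": [
   {"submission_id": "sub-abc123",
    "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {"type": "free_text",
      "description": "Briefly describe the item shown",
      "value": "A red leather wallet with two card slots"}},
   {"submission_id": "sub-ghi789",
    "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {"type": "free_text",
      "description": "Briefly describe the item shown",
      "value": "Red bifold wallet, appears to be leather"}}]}
""")
print(free_text_answers(record))
# {'0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d':
#  ['A red leather wallet with two card slots',
#   'Red bifold wallet, appears to be leather']}
```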

#### Response value shapes

| Instruction type                 | Value fields                                        |
| -------------------------------- | --------------------------------------------------- |
| `free_text`                      | `value: string`                                     |
| `free_text_with_unit`            | `value: string`, `unit: string`                     |
| `multiple_choice`                | `values: string[]`                                  |
| `multiple_choice_with_free_text` | `values: { option: string, explanation: string }[]` |
| `file_upload`                    | `files: { name: string, path: string }[]`           |

### batch.json

Batch metadata and a list of all instructions, useful for mapping instruction IDs to their descriptions and types.

```json
{
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Product Image Annotation",
  "exported_at": "2026-03-19T10:30:00Z",
  "instructions": [
    {
      "id": "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d",
      "type": "free_text",
      "description": "Briefly describe the item shown"
    },
    {
      "id": "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d",
      "type": "file_upload",
      "description": "Upload a photo of the item"
    }
  ]
}
```
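A small sketch of that ID-to-description mapping, using an inline copy of the sample `batch.json` above in place of reading the extracted file:

```python
import json

# In practice: json.load(open("export/batch.json")); inlined here for brevity.
batch = json.loads("""
{"batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
 "instructions": [
   {"id": "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d",
    "type": "free_text",
    "description": "Briefly describe the item shown"}]}
""")

# Map instruction UUIDs to human-readable labels for analysis output.
lookup = {inst["id"]: inst["description"] for inst in batch["instructions"]}
print(lookup["0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d"])
# Briefly describe the item shown
```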

## Working with the exported data

### Load responses with pandas

```python
import pandas as pd

# Participant-centric view
by_submission = pd.read_json("responses-by-submission.jsonl", lines=True)

# Task-centric view (useful for inter-annotator agreement)
by_task = pd.read_json("responses-by-task.jsonl", lines=True)
```

### Extract uploaded files

```python
import zipfile

with zipfile.ZipFile("export.zip") as z:
    z.extractall("export/")

# Files are at: export/files/{submission_id}_{instruction_id}_{index}.{ext}
```

The `submission_id` prefix in each filename lets you match files back to their submission record in `responses-by-submission.jsonl`.
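That matching can be sketched with a small filename parser (the `parse_export_filename` helper is illustrative, and it assumes IDs contain no underscores, as in the samples above):

```python
from pathlib import Path

def parse_export_filename(name: str) -> tuple:
    """Split files/{submission_id}_{instruction_id}_{index}.{ext} into parts."""
    stem = Path(name).stem  # drop the directory-relative path's extension
    submission_id, instruction_id, index = stem.split("_")
    return submission_id, instruction_id, int(index)

print(parse_export_filename(
    "files/sub-abc123_0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d_0.jpg"
))
# ('sub-abc123', '0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d', 0)
```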

### Handle all response types

```python
```python
for record in by_submission.itertuples():
    for task_id, task_responses in record.responses.items():
        for instruction_id, response in task_responses.items():
            match response["type"]:
                case "free_text":
                    print(response["value"])
                case "free_text_with_unit":
                    # Includes a unit field alongside the value
                    print(response["value"], response["unit"])
                case "multiple_choice":
                    print(response["values"])
                case "multiple_choice_with_free_text":
                    for v in response["values"]:
                        print(v["option"], v["explanation"])
                case "file_upload":
                    for f in response["files"]:
                        print(f["path"])
```

## Notes

* **Presigned URL expiry:** download URLs are valid for 24 hours. Re-poll `GET` to receive a refreshed URL.
* **Retry on failure:** a `failed` export can be retried by sending `POST` again.
* **Active responses only:** deleted and `no_submission` responses are excluded from the export.

***

By using AI Task Builder, you agree to our [AI Task Builder Terms](https://prolific.notion.site/Researcher-Terms-7787f102f0c541bdbe2c04b5d3285acb).