Exporting Batch Data

Once participants have completed your batch study, you can export all responses and uploaded files as a single ZIP archive.

Exports are generated asynchronously — the API responds immediately with a job ID, and the archive is built out-of-band. This keeps the request fast even for large batches with many uploaded files.

Workflow overview

1. **Request an export** by sending a POST request to the export endpoint. The API returns a job ID immediately.
2. **Poll for completion** by sending GET requests with your job ID until the status is `complete` or `failed`.
3. **Download the ZIP archive** from the presigned URL included in the `complete` response.
4. **Work with the exported data** — load `responses-by-submission.jsonl` or `responses-by-task.jsonl` for analysis, or extract files from the `files/` directory.

Using the Prolific CLI

The Prolific CLI handles the full request, poll, and download flow in a single command:

```shell
prolific batch export <batch-id>
```

By default the archive is saved to `<batch-id>-export-<YYYYMMDD-HHMMSS>.zip` in the current directory. Use `--output` to specify a path:

```shell
prolific batch export <batch-id> --output ./my-export.zip
```

Requires the `PROLIFIC_TOKEN` environment variable and researcher access to the batch’s workspace.

Requesting an export

```text
POST /api/v1/data-collection/batches/{batch_id}/export
```

No request body is required.
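A minimal request in Python might look like the following sketch, assuming token authentication via the `Authorization: Token` header (as used in the polling example later in this page); the helper names are illustrative, not part of any SDK:

```python
import requests

API_BASE = "https://api.prolific.com/api/v1"

def export_url(batch_id: str) -> str:
    """Build the export endpoint URL for a batch."""
    return f"{API_BASE}/data-collection/batches/{batch_id}/export"

def request_export(batch_id: str, token: str) -> dict:
    """POST with no body; returns the parsed JSON response (202 or 200)."""
    r = requests.post(export_url(batch_id), headers={"Authorization": f"Token {token}"})
    r.raise_for_status()
    return r.json()
```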

Responses

If a new export job is created, the API returns 202 Accepted:

```json
{
  "status": "generating",
  "export_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```

If a recent export for this batch already exists, the API returns 200 OK with the download URL immediately — you can skip polling and go straight to the download step.

```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```

Export requests are idempotent. Re-sending POST for a batch that is already generating or has a complete export returns the existing job ID or download URL rather than triggering a new generation.
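Because of this idempotency, a client can handle both response shapes with one dispatcher over the documented fields. A hypothetical helper (plain Python, no SDK assumed):

```python
def handle_export_response(body: dict) -> tuple[str, str]:
    """Map a POST response body to the next action.

    Returns ("download", url) when a recent export already exists (200 OK),
    or ("poll", export_id) when a job is new or still generating (202 Accepted).
    """
    if body["status"] == "complete":
        return ("download", body["url"])
    return ("poll", body["export_id"])
```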

Polling for completion

Use the export_id from the POST response to check the status of your export job.

```text
GET /api/v1/data-collection/batches/{batch_id}/export/{export_id}
```

Poll at a reasonable interval (every 3–5 seconds) until the status changes.

| Status | Meaning | Next step |
| --- | --- | --- |
| `generating` | The archive is still being built | Continue polling |
| `complete` | The archive is ready — `url` and `expires_at` are included | Download the ZIP |
| `failed` | Generation failed or the archive was deleted | Retry by sending POST again |

Complete response

```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```

The url is a presigned HTTPS link valid for 24 hours. Re-poll the GET endpoint to receive a refreshed URL if it has expired.
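Downloading is a plain GET against the presigned URL (no `Authorization` header is needed, since the URL itself carries the signature), and `expires_at` can be checked before reusing a cached URL. A sketch with hypothetical helper names:

```python
from datetime import datetime, timezone

import requests

def url_expired(expires_at: str) -> bool:
    """True if an ISO-8601 timestamp such as "2026-03-20T10:30:00Z" is in the past."""
    ts = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    return datetime.now(timezone.utc) >= ts

def download_export(url: str, dest: str = "export.zip") -> str:
    """Stream the archive to disk in 1 MiB chunks to avoid buffering it in memory."""
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)
    return dest
```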

Polling example

```python
import time

import requests

def poll_export(batch_id, export_id, token, timeout=600):
    headers = {"Authorization": f"Token {token}"}
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.get(
            f"https://api.prolific.com/api/v1/data-collection/batches/{batch_id}/export/{export_id}",
            headers=headers,
        )
        r.raise_for_status()
        data = r.json()
        if data["status"] == "complete":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"Export failed for batch {batch_id}")
        time.sleep(3)
    raise TimeoutError("Export did not complete within the timeout period")
```

Archive format

The downloaded ZIP contains the following structure:

```text
export.zip
├── responses-by-submission.jsonl
├── responses-by-task.jsonl
├── batch.json
├── README.md
└── files/
    └── {submission_id}_{instruction_id}_{index}.{ext}
```

The files/ directory is only present if participants uploaded files. The README.md inside the archive contains a quick-start guide and a pandas example.

The archive ships two JSONL files covering the same underlying data from different angles — choose whichever suits your analysis:

| File | One record per | Best for |
| --- | --- | --- |
| `responses-by-submission.jsonl` | Participant submission | Quality review, per-participant analysis |
| `responses-by-task.jsonl` | Task | Content-centric analysis, agreement metrics |

responses-by-submission.jsonl

Each line is a JSON object representing one participant’s submission, with all of their responses across every task they completed.

```json
{
  "submission_id": "sub-abc123",
  "participant_id": "part-def456",
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "created_at": "2026-03-19T10:30:00Z",
  "responses": {
    "task-id-1": {
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "A red leather wallet with two card slots"
      },
      "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d": {
        "type": "file_upload",
        "description": "Upload a photo of the item",
        "files": [
          {
            "name": "photo.jpg",
            "path": "files/sub-abc123_0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d_0.jpg"
          }
        ]
      }
    }
  }
}
```

responses-by-task.jsonl

Each line is a JSON object representing one task, with all participant responses for that task collected together.

```json
{
  "task_id": "task-xyz789",
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "responses": [
    {
      "submission_id": "sub-abc123",
      "participant_id": "part-def456",
      "created_at": "2026-03-19T10:30:00Z",
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "A red leather wallet with two card slots"
      }
    },
    {
      "submission_id": "sub-ghi789",
      "participant_id": "part-jkl012",
      "created_at": "2026-03-19T11:15:00Z",
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "Red bifold wallet, appears to be leather"
      }
    }
  ]
}
```

The path for file uploads is relative to the archive root, so it can be used directly after extraction.
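Because each line of `responses-by-task.jsonl` holds every response for one task, per-task aggregates fall out of a single pass over the file. For example, counting submissions per task with only the standard library (the helper name is illustrative):

```python
import json

def responses_per_task(path: str = "responses-by-task.jsonl") -> dict:
    """Count how many submissions responded to each task."""
    counts = {}
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            counts[record["task_id"]] = len(record["responses"])
    return counts
```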

Response value shapes

| Instruction type | Value fields |
| --- | --- |
| `free_text` | `value: string` |
| `free_text_with_unit` | `value: string`, `unit: string` |
| `multiple_choice` | `values: string[]` |
| `multiple_choice_with_free_text` | `values: { option: string, explanation: string }[]` |
| `file_upload` | `files: { name: string, path: string }[]` |

batch.json

Batch metadata and a list of all instructions, useful for mapping instruction IDs to their descriptions and types.

```json
{
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Product Image Annotation",
  "exported_at": "2026-03-19T10:30:00Z",
  "instructions": [
    {
      "id": "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d",
      "type": "free_text",
      "description": "Briefly describe the item shown"
    },
    {
      "id": "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d",
      "type": "file_upload",
      "description": "Upload a photo of the item"
    }
  ]
}
```
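Since response records are keyed by instruction ID, `batch.json` is handy for turning those IDs back into readable labels. One possible helper (the function name is an assumption, not part of any SDK):

```python
import json

def load_instruction_map(path: str = "batch.json") -> dict:
    """Map each instruction ID to its (type, description) pair."""
    with open(path) as f:
        batch = json.load(f)
    return {i["id"]: (i["type"], i["description"]) for i in batch["instructions"]}
```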

Working with the exported data

Load responses with pandas

```python
import pandas as pd

# Participant-centric view
by_submission = pd.read_json("responses-by-submission.jsonl", lines=True)

# Task-centric view (useful for inter-annotator agreement)
by_task = pd.read_json("responses-by-task.jsonl", lines=True)
```

Extract uploaded files

```python
import zipfile

with zipfile.ZipFile("export.zip") as z:
    z.extractall("export/")

# Files are at: export/files/{submission_id}_{instruction_id}_{index}.{ext}
```

The submission_id prefix in each filename lets you match files back to their submission record in responses-by-submission.jsonl.
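Given that naming scheme, the filename itself carries the join keys. A small parser, sketched under the assumption (consistent with the examples above) that the IDs themselves contain no underscores:

```python
def parse_export_filename(name: str) -> dict:
    """Split "{submission_id}_{instruction_id}_{index}.{ext}" into its parts.

    Splits from the right, so this assumes the IDs contain no underscores.
    """
    stem, ext = name.rsplit(".", 1)
    submission_id, instruction_id, index = stem.rsplit("_", 2)
    return {
        "submission_id": submission_id,
        "instruction_id": instruction_id,
        "index": int(index),
        "ext": ext,
    }
```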

Handle all response types

```python
for record in by_submission.itertuples():
    for task_id, task_responses in record.responses.items():
        for instruction_id, response in task_responses.items():
            match response["type"]:
                case "free_text" | "free_text_with_unit":
                    print(response["value"])
                case "multiple_choice":
                    print(response["values"])
                case "multiple_choice_with_free_text":
                    for v in response["values"]:
                        print(v["option"], v["explanation"])
                case "file_upload":
                    for f in response["files"]:
                        print(f["path"])
```

Notes

  • Presigned URL expiry: download URLs are valid for 24 hours. Re-poll GET to receive a refreshed URL.
  • Retry on failure: a failed export can be retried by sending POST again.
  • Active responses only: deleted and no_submission responses are excluded from the export.
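Putting these notes together, the request, poll, and download steps plus a retry on failure can be sketched as one driver with the HTTP calls passed in as callables; substitute your own client functions for `request_fn`, `poll_fn`, and `download_fn` (the names and structure here are illustrative, not a library API):

```python
import time

def export_batch(request_fn, poll_fn, download_fn, max_attempts=2, poll_interval=3):
    """Drive request -> poll -> download, re-sending POST once if generation fails.

    request_fn() and poll_fn() return parsed JSON bodies as documented above;
    download_fn(url) fetches the archive and returns its local path.
    """
    for _ in range(max_attempts):
        body = request_fn()
        if body["status"] == "complete":  # a recent export already exists (200 OK)
            return download_fn(body["url"])
        while True:
            body = poll_fn()
            if body["status"] == "complete":
                return download_fn(body["url"])
            if body["status"] == "failed":
                break  # fall through and re-send the POST
            time.sleep(poll_interval)
    raise RuntimeError("Export failed after retries")
```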
