# Exporting Batch Data

Once participants have completed your batch study, you can export all responses and uploaded files as a single ZIP archive.

Exports are generated asynchronously — the API responds immediately with a job ID, and the archive is built out-of-band. This keeps the request fast even for large batches with many uploaded files.

## Workflow overview

<Steps>
  <Step>
    Request an export by sending a `POST` request to the export endpoint. The API returns a job ID immediately.
  </Step>

  <Step>
    Poll for completion by sending `GET` requests with your job ID until the status is `complete` or `failed`.
  </Step>

  <Step>
    Download the ZIP archive from the presigned URL included in the `complete` response.
  </Step>

  <Step>
    Work with the exported data — load `responses-by-submission.jsonl` or `responses-by-task.jsonl` for analysis, or extract files from the `files/` directory.
  </Step>
</Steps>
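Under the same base URL and `Token` auth scheme used in the polling example later on this page, the four steps can be sketched end-to-end in Python (a minimal sketch with deliberately simple error handling and a fixed 3-second poll interval; the `export_batch` helper name is illustrative):

```python
import time
import requests

BASE = "https://api.prolific.com/api/v1/data-collection"

def export_batch(batch_id: str, token: str, dest: str = "export.zip") -> str:
    """Request, poll, and download a batch export. Returns the path to the ZIP."""
    headers = {"Authorization": f"Token {token}"}

    # Step 1: request an export. A 200 response means a recent export already
    # exists and carries the download URL directly; a 202 returns an export_id.
    r = requests.post(f"{BASE}/batches/{batch_id}/export", headers=headers)
    r.raise_for_status()
    data = r.json()

    # Step 2: poll until the job leaves the generating state
    # (skipped entirely when the status is already complete).
    while data["status"] == "generating":
        time.sleep(3)
        r = requests.get(
            f"{BASE}/batches/{batch_id}/export/{data['export_id']}",
            headers=headers,
        )
        r.raise_for_status()
        data = r.json()

    if data["status"] != "complete":
        raise RuntimeError(f"Export failed for batch {batch_id}")

    # Step 3: download the ZIP from the presigned URL (no auth header needed).
    with requests.get(data["url"], stream=True) as dl:
        dl.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in dl.iter_content(chunk_size=1 << 16):
                f.write(chunk)
    return dest
```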

## Using the Prolific CLI

The [Prolific CLI](/tooling/cli) handles the full request, poll, and download flow in a single command:

```bash
prolific batch export <batch-id>
```

By default the archive is saved to `<batch-id>-export-<YYYYMMDD-HHMMSS>.zip` in the current directory. Use `--output` to specify a path:

```bash
prolific batch export <batch-id> --output ./my-export.zip
```

The command requires the `PROLIFIC_TOKEN` environment variable to be set and researcher access to the batch's workspace.

## Requesting an export

```bash
POST /api/v1/data-collection/batches/{batch_id}/export
```

No request body is required.

### Responses

If a new export job is created, the API returns `202 Accepted`:

```json
{
  "status": "generating",
  "export_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```

If a recent export for this batch already exists, the API returns `200 OK` with the download URL immediately — you can skip polling and go straight to the download step.

```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```

<Note>
  Export requests are idempotent. Re-sending `POST` for a batch that is already generating or has a complete export returns the existing job ID or download URL rather than triggering a new generation.
</Note>

## Polling for completion

Use the `export_id` from the `POST` response to check the status of your export job.

```bash
GET /api/v1/data-collection/batches/{batch_id}/export/{export_id}
```

Poll at a reasonable interval (every 3–5 seconds) until the status changes.

| Status       | Meaning                                                    | Next step                     |
| ------------ | ---------------------------------------------------------- | ----------------------------- |
| `generating` | The archive is still being built                           | Continue polling              |
| `complete`   | The archive is ready — `url` and `expires_at` are included | Download the ZIP              |
| `failed`     | Generation failed or the archive was deleted               | Retry by sending `POST` again |

### Complete response

```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```

The `url` is a presigned HTTPS link valid for 24 hours. Re-poll the `GET` endpoint to receive a refreshed URL if it has expired.
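Because the URL is presigned, the download itself needs no `Authorization` header. A minimal sketch of a streaming download (the `download_export` helper name and chunk size are illustrative) that keeps memory use flat for large archives:

```python
import requests

def download_export(url: str, dest: str = "export.zip") -> str:
    """Stream the presigned URL to disk without loading the ZIP into memory."""
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 16):
                f.write(chunk)
    return dest
```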

### Polling example

```python
import time
import requests

def poll_export(batch_id, export_id, token, timeout=600):
    headers = {"Authorization": f"Token {token}"}
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.get(
            f"https://api.prolific.com/api/v1/data-collection/batches/{batch_id}/export/{export_id}",
            headers=headers,
        )
        r.raise_for_status()
        data = r.json()
        if data["status"] == "complete":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"Export failed for batch {batch_id}")
        time.sleep(3)
    raise TimeoutError("Export did not complete within the timeout period")
```

## Archive format

The downloaded ZIP contains the following structure:

```
export.zip
├── responses-by-submission.jsonl
├── responses-by-task.jsonl
├── batch.json
├── README.md
└── files/
    └── {submission_id}_{instruction_id}_{index}.{ext}
```

The `files/` directory is only present if participants uploaded files. The `README.md` inside the archive contains a quick-start guide and a pandas example.

The archive ships two JSONL files covering the same underlying data from different angles — choose whichever suits your analysis:

| File                            | One record per         | Best for                                    |
| ------------------------------- | ---------------------- | ------------------------------------------- |
| `responses-by-submission.jsonl` | Participant submission | Quality review, per-participant analysis    |
| `responses-by-task.jsonl`       | Task                   | Content-centric analysis, agreement metrics |

### responses-by-submission.jsonl

Each line is a JSON object representing one participant's submission, with all of their responses across every task they completed.

```json
{
  "submission_id": "sub-abc123",
  "participant_id": "part-def456",
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "created_at": "2026-03-19T10:30:00Z",
  "responses": {
    "task-id-1": {
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "A red leather wallet with two card slots"
      },
      "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d": {
        "type": "file_upload",
        "description": "Upload a photo of the item",
        "files": [
          {
            "name": "photo.jpg",
            "path": "files/sub-abc123_0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d_0.jpg"
          }
        ]
      }
    }
  }
}
```

### responses-by-task.jsonl

Each line is a JSON object representing one task, with all participant responses for that task collected together.

```json
{
  "task_id": "task-xyz789",
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "responses": [
    {
      "submission_id": "sub-abc123",
      "participant_id": "part-def456",
      "created_at": "2026-03-19T10:30:00Z",
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "A red leather wallet with two card slots"
      }
    },
    {
      "submission_id": "sub-ghi789",
      "participant_id": "part-jkl012",
      "created_at": "2026-03-19T11:15:00Z",
      "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
        "type": "free_text",
        "description": "Briefly describe the item shown",
        "value": "Red bifold wallet, appears to be leather"
      }
    }
  ]
}
```

The `path` for file uploads is relative to the archive root, so it can be used directly after extraction.
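Because the task-centric file groups every participant's answer to the same instruction, it is the natural input for agreement metrics. A minimal sketch (the `free_text_answers` helper is illustrative, and an inline copy of the sample record above stands in for reading the real file) that collects all free-text answers per instruction:

```python
import json

def free_text_answers(task_record: dict) -> dict:
    """Collect all free_text answers per instruction ID for one task record."""
    answers: dict = {}
    for response in task_record["responses"]:
        # Instruction responses sit alongside submission metadata keys,
        # so keep only dict values that carry a response type.
        for key, value in response.items():
            if isinstance(value, dict) and value.get("type") == "free_text":
                answers.setdefault(key, []).append(value["value"])
    return answers

# One line of responses-by-task.jsonl (the sample record above, abbreviated):
record = json.loads("""
{"task_id": "task-xyz789",
 "responses": [
   {"submission_id": "sub-abc123",
    "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {"type": "free_text",
      "description": "Briefly describe the item shown",
      "value": "A red leather wallet with two card slots"}},
   {"submission_id": "sub-ghi789",
    "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {"type": "free_text",
      "description": "Briefly describe the item shown",
      "value": "Red bifold wallet, appears to be leather"}}]}
""")
print(free_text_answers(record))
# {'0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d':
#  ['A red leather wallet with two card slots',
#   'Red bifold wallet, appears to be leather']}
```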

#### Response value shapes

| Instruction type                 | Value fields                                        |
| -------------------------------- | --------------------------------------------------- |
| `free_text`                      | `value: string`                                     |
| `free_text_with_unit`            | `value: string`, `unit: string`                     |
| `multiple_choice`                | `values: string[]`                                  |
| `multiple_choice_with_free_text` | `values: { option: string, explanation: string }[]` |
| `file_upload`                    | `files: { name: string, path: string }[]`           |

### batch.json

Batch metadata and a list of all instructions, useful for mapping instruction IDs to their descriptions and types.

```json
{
  "batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Product Image Annotation",
  "exported_at": "2026-03-19T10:30:00Z",
  "instructions": [
    {
      "id": "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d",
      "type": "free_text",
      "description": "Briefly describe the item shown"
    },
    {
      "id": "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d",
      "type": "file_upload",
      "description": "Upload a photo of the item"
    }
  ]
}
```
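A small sketch of that ID-to-description mapping, using an inline copy of the sample `batch.json` above in place of reading the extracted file:

```python
import json

# In practice: json.load(open("export/batch.json")); inlined here for brevity.
batch = json.loads("""
{"batch_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
 "instructions": [
   {"id": "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d",
    "type": "free_text",
    "description": "Briefly describe the item shown"}]}
""")

# Map instruction UUIDs to human-readable labels for analysis output.
lookup = {inst["id"]: inst["description"] for inst in batch["instructions"]}
print(lookup["0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d"])
# Briefly describe the item shown
```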

## Working with the exported data

### Load responses with pandas

```python
import pandas as pd

# Participant-centric view
by_submission = pd.read_json("responses-by-submission.jsonl", lines=True)

# Task-centric view (useful for inter-annotator agreement)
by_task = pd.read_json("responses-by-task.jsonl", lines=True)
```

### Extract uploaded files

```python
import zipfile

with zipfile.ZipFile("export.zip") as z:
    z.extractall("export/")

# Files are at: export/files/{submission_id}_{instruction_id}_{index}.{ext}
```

The `submission_id` prefix in each filename lets you match files back to their submission record in `responses-by-submission.jsonl`.
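That matching can be sketched with a small filename parser (the `parse_export_filename` helper is illustrative, and it assumes IDs contain no underscores, as in the samples above):

```python
from pathlib import Path

def parse_export_filename(name: str) -> tuple:
    """Split files/{submission_id}_{instruction_id}_{index}.{ext} into parts."""
    stem = Path(name).stem  # drop the directory-relative path's extension
    submission_id, instruction_id, index = stem.split("_")
    return submission_id, instruction_id, int(index)

print(parse_export_filename(
    "files/sub-abc123_0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d_0.jpg"
))
# ('sub-abc123', '0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d', 0)
```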

### Handle all response types

```python
```python
for record in by_submission.itertuples():
    for task_id, task_responses in record.responses.items():
        for instruction_id, response in task_responses.items():
            match response["type"]:
                case "free_text":
                    print(response["value"])
                case "free_text_with_unit":
                    # Includes a unit field alongside the value
                    print(response["value"], response["unit"])
                case "multiple_choice":
                    print(response["values"])
                case "multiple_choice_with_free_text":
                    for v in response["values"]:
                        print(v["option"], v["explanation"])
                case "file_upload":
                    for f in response["files"]:
                        print(f["path"])
```

## Notes

* **Presigned URL expiry:** download URLs are valid for 24 hours. Re-poll `GET` to receive a refreshed URL.
* **Retry on failure:** a `failed` export can be retried by sending `POST` again.
* **Active responses only:** deleted and `no_submission` responses are excluded from the export.

***

By using AI Task Builder, you agree to our [AI Task Builder Terms](https://prolific.notion.site/Researcher-Terms-7787f102f0c541bdbe2c04b5d3285acb).