# Exporting Collection Data
**Early Access**
AI Task Builder Collections are an early-access feature that may be enabled on your workspace upon request.
To request access or contribute towards the feature's roadmap, visit our help center at [https://researcher-help.prolific.com/en/](https://researcher-help.prolific.com/en/) and drop us a message in the chat. Your activation request will be reviewed by our team.
Note: This feature is under active development and you may encounter bugs.
Once participants have completed your collection study, you can export all responses and uploaded files as a single ZIP archive.
Exports are generated asynchronously — the API responds immediately with a job ID, and the archive is built out-of-band. This keeps the request fast even for large collections with many uploaded files.
## Workflow overview
1. **Request an export** by sending a `POST` request to the export endpoint. The API returns a job ID immediately.
2. **Poll for completion** by sending `GET` requests with your job ID until the status is `complete` or `failed`.
3. **Download the ZIP archive** from the presigned URL included in the `complete` response.
4. **Work with the exported data** — load `responses.jsonl` for analysis or extract files from the `files/` directory.
## Using the Prolific CLI
The [Prolific CLI](/tooling/cli) handles the full request, poll, and download flow in a single command:
```bash
prolific collection export
```
By default the archive is saved to `-export-.zip` in the current directory. Use `--output` to specify a path:
```bash
prolific collection export --output ./my-export.zip
```
Requires the `PROLIFIC_TOKEN` environment variable and researcher access to the collection's workspace.
## Requesting an export
```bash
POST /api/v1/data-collection/collections/{collection_id}/export
```
No request body is required.
### Responses
If a new export job is created, the API returns `202 Accepted`:
```json
{
  "status": "generating",
  "export_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```
If a recent export for this collection already exists, the API returns `200 OK` with the download URL immediately — you can skip polling and go straight to the download step.
```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```
Export requests are idempotent. Re-sending `POST` for a collection that is already generating or has a complete export returns the existing job ID or download URL rather than triggering a new generation.
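Because the `POST` can return either outcome, client code needs to branch on the status code. A minimal sketch in Python — the field names (`status`, `export_id`, `url`) come from the response examples above, and `next_step` is an illustrative helper, not part of any SDK:

```python
def next_step(status_code: int, body: dict) -> tuple[str, str]:
    """Decide what to do after POST .../export.

    Returns ("download", url) when a fresh export already exists (200 OK),
    or ("poll", export_id) when a new job was started (202 Accepted).
    """
    if status_code == 200 and body.get("status") == "complete":
        return ("download", body["url"])
    if status_code == 202:
        return ("poll", body["export_id"])
    raise ValueError(f"Unexpected export response: {status_code} {body}")
```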
## Polling for completion
Use the `export_id` from the `POST` response to check the status of your export job.
```bash
GET /api/v1/data-collection/collections/{collection_id}/export/{export_id}
```
Poll at a reasonable interval (every 3–5 seconds) until the status changes.
| Status | Meaning | Next step |
| ------------ | ---------------------------------------------------------- | ----------------------------- |
| `generating` | The archive is still being built | Continue polling |
| `complete` | The archive is ready — `url` and `expires_at` are included | Download the ZIP |
| `failed` | Generation failed or the archive was deleted | Retry by sending `POST` again |
### Complete response
```json
{
  "status": "complete",
  "url": "https://...",
  "expires_at": "2026-03-20T10:30:00Z"
}
```
The `url` is a presigned HTTPS link valid for 24 hours. Re-poll the `GET` endpoint to receive a refreshed URL if it has expired.
### Polling example
```python
import time

import requests


def poll_export(collection_id, export_id, token, timeout=600):
    headers = {"Authorization": f"Token {token}"}
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.get(
            f"https://api.prolific.com/api/v1/data-collection/collections/{collection_id}/export/{export_id}",
            headers=headers,
        )
        r.raise_for_status()
        data = r.json()
        if data["status"] == "complete":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"Export failed for collection {collection_id}")
        time.sleep(3)
    raise TimeoutError("Export did not complete within the timeout period")
```
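Once polling returns a URL, the archive can be streamed to disk with the standard library. A sketch; `download_export` is an illustrative name, and streaming avoids holding a large ZIP fully in memory:

```python
import shutil
import urllib.request


def download_export(url: str, dest: str) -> str:
    """Stream the export ZIP from the presigned URL to dest and return dest."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```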
## Archive format
The downloaded ZIP contains the following structure:
```
collection-export-{collection_id}-{YYYYMMDDTHHMMSS}/
├── responses.jsonl
├── collection.json
├── README.md
└── files/
    └── {submission_id}_{instruction_id}_{index}.{ext}
```
The `files/` directory is only present if participants uploaded files. The `README.md` inside the archive contains a quick-start guide and a pandas example.
### responses.jsonl
Each line is a JSON object representing one submission. Response values are keyed by instruction ID.
```json
{
  "submission_id": "sub-abc123",
  "participant_id": "part-def456",
  "collection_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "created_at": "2026-03-19T10:30:00Z",
  "responses": {
    "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d": {
      "type": "free_text",
      "description": "Briefly describe the skin condition shown in your image",
      "value": "Small red patch on left forearm, slightly raised"
    },
    "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d": {
      "type": "file_upload",
      "description": "Upload a clear photo of the affected area",
      "files": [
        {
          "name": "photo.jpg",
          "path": "files/sub-abc123_0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d_0.jpg"
        }
      ]
    }
  }
}
```
The `path` for file uploads is relative to the archive root, so it can be used directly after extraction.
#### Response value shapes
| Instruction type | Value fields |
| -------------------------------- | --------------------------------------------------- |
| `free_text` | `value: string` |
| `free_text_with_unit` | `value: string`, `unit: string` |
| `multiple_choice` | `values: string[]` |
| `multiple_choice_with_free_text` | `values: { option: string, explanation: string }[]` |
| `file_upload` | `files: { name: string, path: string }[]` |
### collection.json
Collection metadata and a list of all instructions, useful for mapping instruction IDs to their descriptions and types.
```json
{
  "collection_id": "0192a3b4-c5d6-7e8f-9a0b-1c2d3e4f5a6b",
  "name": "Skin Condition Image Collection",
  "exported_at": "2026-03-19T10:30:00Z",
  "instructions": [
    {
      "id": "0192a3b4-e7f8-7a0b-1c2d-3e4f5a6b7c8d",
      "type": "free_text",
      "description": "Briefly describe the skin condition shown in your image"
    },
    {
      "id": "0192a3b5-e3f4-7a5b-6c7d-9e0f1a2b3c4d",
      "type": "file_upload",
      "description": "Upload a clear photo of the affected area"
    }
  ]
}
```
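Since `responses.jsonl` keys responses by instruction ID, a lookup built from `collection.json` makes records easier to interpret. A sketch using only the fields shown above; `instruction_lookup` is an illustrative helper:

```python
import json


def instruction_lookup(collection_json_path: str) -> dict:
    """Map each instruction ID to its (type, description) pair."""
    with open(collection_json_path) as f:
        meta = json.load(f)
    return {i["id"]: (i["type"], i["description"]) for i in meta["instructions"]}
```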
## Working with the exported data
### Load responses with pandas
```python
import pandas as pd
df = pd.read_json("responses.jsonl", lines=True)
print(df.head())
```
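Each `responses` cell holds a nested dict, so a long format with one row per (submission, instruction) pair is often easier to analyse. A sketch assuming the record shape documented above; `flatten_responses` is an illustrative helper:

```python
import pandas as pd


def flatten_responses(df: pd.DataFrame) -> pd.DataFrame:
    """One row per (submission_id, instruction_id), response fields as columns."""
    rows = [
        {"submission_id": rec.submission_id, "instruction_id": iid, **resp}
        for rec in df.itertuples()
        for iid, resp in rec.responses.items()
    ]
    return pd.DataFrame(rows)
```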
### Extract uploaded files
```python
import zipfile

with zipfile.ZipFile("export.zip") as z:
    z.extractall("export/")

# Files are at: export/files/{submission_id}_{instruction_id}_{index}.{ext}
```
The `submission_id` prefix in each filename lets you match files back to their submission record in `responses.jsonl`.
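The filename pattern can also be split back into its parts. A sketch that assumes, as in the examples above, that instruction IDs contain no underscores; `parse_export_filename` is an illustrative helper:

```python
import os


def parse_export_filename(name: str) -> tuple[str, str, int, str]:
    """Split "{submission_id}_{instruction_id}_{index}.{ext}" into its parts."""
    stem, ext = os.path.splitext(name)
    # Split at the two rightmost underscores, so the instruction ID and index
    # separate cleanly from the submission ID prefix.
    submission_id, instruction_id, index = stem.rsplit("_", 2)
    return submission_id, instruction_id, int(index), ext
```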
### Handle all response types
```python
# Requires Python 3.10+ (structural pattern matching)
for record in df.itertuples():
    for instruction_id, response in record.responses.items():
        match response["type"]:
            case "free_text" | "free_text_with_unit":
                print(response["value"])
            case "multiple_choice":
                print(response["values"])
            case "multiple_choice_with_free_text":
                for v in response["values"]:
                    print(v["option"], v["explanation"])
            case "file_upload":
                for f in response["files"]:
                    print(f["path"])
```
## Notes
* **Presigned URL expiry:** download URLs are valid for 24 hours. Re-poll `GET` to receive a refreshed URL.
* **Retry on failure:** a `failed` export can be retried by sending `POST` again.
* **Active responses only:** deleted and `no_submission` responses are excluded from the export.
***
By using AI Task Builder, you agree to our [AI Task Builder Terms](https://prolific.notion.site/Researcher-Terms-7787f102f0c541bdbe2c04b5d3285acb).