
Jobs

Zatabase includes a built-in job orchestrator for running compute tasks alongside your data. Jobs execute as native processes or Docker containers, with real-time log streaming, artifact management, and cancellation support. This eliminates the need for a separate job queue (Celery, Sidekiq, Bull) for data-adjacent workloads.

Create a job by specifying the command, working directory, and optional environment variables:

curl -s -X POST https://your-project.zatabase.io/v1/jobs \
-H "Authorization: Bearer $ZATABASE_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "nightly-export",
"command": "python3 export.py --format parquet",
"working_dir": "/opt/scripts",
"environment": {
"OUTPUT_DIR": "/tmp/exports",
"ZATABASE_TOKEN": "'"$ZATABASE_TOKEN"'"
},
"timeout_seconds": 3600
}' | jq
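The same request can be built from a script. The following is a minimal sketch using only the Python standard library; the function names `build_create_job_request` and `create_job` are hypothetical helpers, and the host is the placeholder from the curl example. Building the request is kept separate from sending it so the payload can be inspected first.

```python
import json
import urllib.request

# Placeholder host taken from the curl example; replace with your project URL.
BASE_URL = "https://your-project.zatabase.io"


def build_create_job_request(token, name, command, working_dir,
                             environment=None, timeout_seconds=3600):
    """Build (but do not send) a POST /v1/jobs request. Hypothetical helper."""
    payload = {
        "name": name,
        "command": command,
        "working_dir": working_dir,
        "timeout_seconds": timeout_seconds,
    }
    # Environment variables are optional, so only include them when given.
    if environment:
        payload["environment"] = environment
    return urllib.request.Request(
        f"{BASE_URL}/v1/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_job(build_create_job_request(token, "nightly-export", "python3 export.py --format parquet", "/opt/scripts"))` would submit the job from the example above.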

The response includes a job ID that you use to monitor progress:

{
"job_id": "01HQR...",
"status": "queued",
"created_at": "2026-03-01T12:00:00Z"
}

queued -> running -> completed
                  -> failed
                  -> cancelled

Jobs transition through these states automatically. A queued job is picked up by the next available worker. Running jobs stream their stdout and stderr to the log endpoint. When the process exits, the job moves to completed (exit code 0) or failed (nonzero exit code). A job stopped through the cancel endpoint ends in cancelled.

List all jobs:

curl -s https://your-project.zatabase.io/v1/jobs \
-H "Authorization: Bearer $ZATABASE_TOKEN" | jq

Fetch a single job by ID:

curl -s https://your-project.zatabase.io/v1/jobs/{job_id} \
-H "Authorization: Bearer $ZATABASE_TOKEN" | jq
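A script that needs to block until a job finishes can poll the single-job endpoint until the status reaches one of the three terminal states. The sketch below assumes the job record carries a `status` field like the create response does; `wait_for_job` and its `fetch_job` argument are hypothetical names, and the HTTP call is passed in as a callable so the loop itself stays independent of any client library.

```python
import time

# Terminal states from the lifecycle diagram.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_job, job_id, poll_seconds=2.0, max_polls=None,
                 sleep=time.sleep):
    """Poll until the job reaches a terminal state and return its record.

    fetch_job is any callable that takes a job ID and returns the job as a
    dict with a "status" key (assumed shape), for example a thin wrapper
    around GET /v1/jobs/{job_id}.
    """
    polls = 0
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATES:
            return job
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(
                f"job {job_id} still {job['status']} after {polls} polls"
            )
        sleep(poll_seconds)
```

For long-running jobs, the WebSocket log stream is likely a better fit than tight polling; a loop like this is mainly useful in scripts that only care about the final status.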

Connect to the WebSocket endpoint for real-time log output:

websocat "wss://your-project.zatabase.io/v1/jobs/{job_id}/logs?token=$ZATABASE_TOKEN"

Each message is a JSON object with stream (stdout or stderr), line, and timestamp fields.
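As an illustration of consuming these messages, the sketch below turns one raw message into a printable line. The function name `format_log_message` is hypothetical, and the sample timestamp format is an assumption based on the `created_at` value shown earlier; only the three field names come from the description above.

```python
import json


def format_log_message(raw):
    """Convert one JSON log message into a single printable line."""
    message = json.loads(raw)
    stream = message["stream"]
    # Only the two documented stream names are expected.
    if stream not in ("stdout", "stderr"):
        raise ValueError(f"unexpected stream: {stream!r}")
    return f"{message['timestamp']} [{stream}] {message['line']}"
```

A WebSocket client would call this once per received text frame, for example to route stderr lines to a separate log file.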

Cancel a running job with a configurable grace period:

curl -s -X POST https://your-project.zatabase.io/v1/jobs/{job_id}/cancel \
-H "Authorization: Bearer $ZATABASE_TOKEN" \
-H "Content-Type: application/json" \
-d '{"timeout_seconds": 10}'

Zatabase sends SIGTERM to the process and waits for the specified timeout. If the process does not exit, it is killed with SIGKILL.
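To check locally how a script behaves under this sequence, the same SIGTERM-then-SIGKILL pattern can be reproduced with Python's `subprocess` module. This is a sketch of the general technique on POSIX systems, not Zatabase's actual implementation; `stop_gracefully` is a hypothetical helper name.

```python
import signal
import subprocess


def stop_gracefully(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    proc is a subprocess.Popen instance. Returns the exit code; on POSIX a
    process ended by a signal reports the negative signal number.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait()
```

A return value of `-signal.SIGTERM` means the process exited within the grace period, while `-signal.SIGKILL` means it had to be killed. A job that needs to flush output or remove temporary files should handle SIGTERM and finish that work within the timeout passed to the cancel endpoint.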

Jobs can produce artifacts that are stored in Zatabase’s content-addressed filesystem. Artifacts are indexed by SHA-256 hash, so identical files are stored only once.
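The deduplication follows directly from content addressing: the storage key is derived from the bytes themselves, so two uploads with identical content map to the same key. The sketch below illustrates the idea with an in-memory dictionary; the helper names are hypothetical, and the hex encoding of the digest is an assumption, since the exact key format Zatabase uses is not stated here.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def put(store: dict, data: bytes) -> str:
    """Store data under its content address; identical data is kept once."""
    key = content_address(data)
    store.setdefault(key, data)
    return key


def file_content_address(path, chunk_size=1 << 20):
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Computing `file_content_address` before and after a download is also a simple way to verify that an artifact arrived intact.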

Upload an artifact:

curl -s -X POST https://your-project.zatabase.io/v1/jobs/{job_id}/artifacts \
-H "Authorization: Bearer $ZATABASE_TOKEN" \

List a job's artifacts:

curl -s https://your-project.zatabase.io/v1/jobs/{job_id}/artifacts \
-H "Authorization: Bearer $ZATABASE_TOKEN" | jq

Download an artifact:

curl -O https://your-project.zatabase.io/v1/jobs/{job_id}/artifacts/{artifact_id} \
-H "Authorization: Bearer $ZATABASE_TOKEN"

The ZWORKER_MODE environment variable controls how jobs are executed:

Mode     Description
auto     Detect Docker availability; fall back to local if unavailable
local    Execute jobs as native child processes
docker   Execute jobs inside Docker containers

Local mode is fastest and simplest for trusted workloads. Docker mode provides process isolation, resource limits, and reproducible environments for untrusted or multi-tenant workloads.
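The auto fallback can be pictured as a small decision function. The sketch below is illustrative only: how Zatabase actually detects Docker is not specified here, so checking for a `docker` executable on the PATH is an assumption, as is treating an unset ZWORKER_MODE as auto. `resolve_worker_mode` is a hypothetical name.

```python
import os
import shutil

VALID_MODES = ("auto", "local", "docker")


def resolve_worker_mode(mode=None, docker_available=None):
    """Return the effective mode, "local" or "docker", for a ZWORKER_MODE value."""
    if mode is None:
        # Assumption: an unset variable behaves like "auto".
        mode = os.environ.get("ZWORKER_MODE", "auto")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown ZWORKER_MODE: {mode!r}")
    if mode != "auto":
        return mode
    if docker_available is None:
        # Assumption: a docker CLI on the PATH counts as "available".
        docker_available = shutil.which("docker") is not None
    return "docker" if docker_available else "local"
```

Setting the mode explicitly to local or docker skips detection entirely, which makes behavior predictable across machines that differ in whether Docker is installed.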

Note: In production deployments, container-based job execution is handled by ZLayer, Zatabase’s native orchestration fabric. The docker worker mode is primarily intended for local development and testing.

Typical use cases:

  • ETL pipelines: Export data from Zatabase, transform it, and import the results back
  • ML training: Run training jobs close to the data and store model artifacts in Zatabase
  • Report generation: Produce scheduled reports and store them as artifacts
  • Data validation: Run validation scripts against ingested data and flag anomalies