Send a document and a JSON schema. Get perfectly typed JSON back. One API call, any file format, pay-as-you-go.
From unstructured file to strict JSON in seconds.
Pass any file via API or SDK—PDFs, images, emails, or spreadsheets. No preprocessing required.
Provide a JSON Schema or your saved ClicheFactory batch model JSON. Our engine handles the OCR and maps the fields automatically.
Get typed, schema-validated data back in seconds. Ready to drop straight into your database.
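To make "schema-validated" concrete, here is a minimal Python sketch of what a schema and a matching result might look like. The invoice fields, the sample response body, and the `check_against_schema` helper are all hypothetical and are not part of the ClicheFactory API. The helper covers only top-level `required` and `type`, a small subset of JSON Schema.

```python
import json

# Hypothetical invoice schema; the field names are illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "paid": {"type": "boolean"},
    },
    "required": ["invoice_number", "total"],
}

# Mapping from JSON Schema type names to Python types.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(data, schema):
    """Return a list of problems; an empty list means the data passed.

    Checks only top-level 'required' and 'type' keywords.
    """
    if not isinstance(data, dict):
        return ["expected a JSON object"]
    problems = []
    for name in schema.get("required", []):
        if name not in data:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in data:
            continue
        value = data[name]
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = isinstance(value, bool) and spec["type"] in ("number", "integer")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {spec['type']}, got {type(value).__name__}"
            )
    return problems


# A made-up response body standing in for what an extraction call might return.
response_body = '{"invoice_number": "INV-001", "total": 129.5, "paid": false}'
record = json.loads(response_body)
problems = check_against_schema(record, INVOICE_SCHEMA)
print(problems or "record matches the schema")
```

In practice a full JSON Schema validator would replace the helper; the point is only that the schema you send doubles as the contract you can check the result against before writing to a database.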
Unified extraction for your entire document pipeline.
Define your extraction model as JSON Schema or batch JSON in the app — get structured, validated JSON back.
Single-call extraction handles documents up to ~100 pages. For longer files, the Python SDK chunks and merges them.
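As a rough illustration of chunk-and-merge for long files, the sketch below splits a page count into ranges of at most 100 pages and combines per-chunk results. This is not the SDK's actual implementation: `page_chunks`, `merge_results`, and the merge rule (concatenate list fields, keep the first non-null scalar) are assumptions made for the example.

```python
def page_chunks(total_pages, max_pages=100):
    """Split pages 1..total_pages into consecutive inclusive (start, end) ranges."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages >= 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


def merge_results(chunk_results):
    """Merge per-chunk dicts that share one schema.

    Assumed rule: list fields are concatenated in chunk order,
    scalar fields keep the first non-null value seen.
    """
    merged = {}
    for result in chunk_results:
        for key, value in result.items():
            if isinstance(value, list):
                merged[key] = (merged.get(key) or []) + value
            elif merged.get(key) is None:
                merged[key] = value
    return merged


# A 250-page file becomes three calls' worth of page ranges.
ranges = page_chunks(250)

# Made-up per-chunk outputs, standing in for extraction results.
merged = merge_results([
    {"vendor": "Acme", "line_items": [{"sku": "A1"}]},
    {"vendor": None, "line_items": [{"sku": "B2"}, {"sku": "C3"}]},
])
print(ranges)
print(merged)
```

Fields that span a chunk boundary, such as a table row split across pages, need more care than this rule provides.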
Keep sensitive documents on your own hardware. Use Local Mode to process files locally and bring your own LLM API key. ClicheFactory orchestrates the parsing, but your data never hits our servers.
cf extract.
Start free with 50 pages. Pay per page after that, with no subscriptions.
| Mode | Full Service | BYOK | Best For |
|---|---|---|---|
| Fast | 10 credits | 2 credits | High-volume, latency-sensitive |
| Standard (popular) | 40 credits | 5 credits | General use, best accuracy/cost |
| Robust | 80 credits | 10 credits | High-stakes, verification pass |
1000 credits = $1.00 USD. Credits never expire.
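For a quick estimate, the arithmetic can be sketched in Python. This assumes the credit amounts in the table are charged per page, which the table itself does not state, and that the 50 free pages are simply subtracted from the page count. The function names are made up for the example.

```python
from decimal import Decimal

# Credits per mode, copied from the pricing table: (Full Service, BYOK).
CREDITS = {
    "fast": (10, 2),
    "standard": (40, 5),
    "robust": (80, 10),
}
CREDITS_PER_USD = 1000  # 1000 credits = $1.00 USD


def credits_for(pages, mode, byok=False, free_pages=0):
    """Credits needed, assuming the table's rates apply per page."""
    full_service, bring_your_own_key = CREDITS[mode]
    rate = bring_your_own_key if byok else full_service
    billable_pages = max(pages - free_pages, 0)
    return billable_pages * rate


def credits_to_usd(credits):
    """Convert credits to dollars using exact decimal arithmetic."""
    return Decimal(credits) / Decimal(CREDITS_PER_USD)


# 500 pages in Standard mode: 500 * 40 = 20,000 credits = $20.00.
standard_cost = credits_to_usd(credits_for(500, "standard"))
# The same 500 pages with your own key: 500 * 5 = 2,500 credits = $2.50.
byok_cost = credits_to_usd(credits_for(500, "standard", byok=True))
print(f"Full Service: ${standard_cost:.2f}, BYOK: ${byok_cost:.2f}")
```

BYOK credits cover only the ClicheFactory side; charges from your LLM provider are separate and not included in this estimate.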
Need domain experts to review extractions or build ground-truth datasets? Don't build internal tools. Define your extraction model in the app or upload your own JSON Schema, and we generate a strict, type-validated web UI for your non-technical team.
Bring your own LLM key (OpenAI, Gemini, or Anthropic) and train custom extraction pipelines on your document types. Upload labeled examples, train in minutes, deploy an artifact. Use it in the SDK, CLI, or API — one artifact ID, that's it.
50 free pages. No credit card. Full API access.