Quick Start
- Select an engine from the dropdown and configure it (model path, API key, etc.).
- Click Load Model and wait for the green status badge.
- Upload an image by dragging it onto the upload area or clicking to browse.
- Optionally click Segment to preview line detection before transcribing.
- Click Transcribe. Lines appear one by one as they are processed.
- Export the result as TXT, CSV, or PAGE XML.
Source Code
The public Polyscriptor source code is available on GitHub.
This Hugging Face Space runs a curated hosted demo configuration.
Engines
| Engine | Best for |
| --- | --- |
| CRNN-CTC | Fastest; works well on Church Slavonic, Glagolitic, Ukrainian with trained models |
| TrOCR | HuggingFace Transformer OCR; good general-purpose accuracy |
| Qwen3-VL | Large vision-language model; best quality but slow, needs GPU |
| Kraken | Classical HTR; good for Latin scripts |
| Party | Whole-page transformer; requires PAGE XML with line segmentation |
| Commercial APIs | OpenAI / Gemini / Claude — cloud inference, no local GPU needed |
| OpenWebUI | Locally hosted models via OpenWebUI/Ollama |
Segmentation
- Kraken Classical — default line segmentation in this Hugging Face CPU demo.
- HPP — horizontal projection profile fallback.
- Kraken Neural / blla — available on the full server setup, but not enabled in this Space.
- PAGE XML upload — skip segmentation entirely by uploading an existing PAGE XML annotation (e.g. from Transkribus).
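For readers unfamiliar with the HPP fallback, the idea can be sketched as follows: sum the ink pixels in each image row, then treat every run of consecutive rows with ink as one text line. This is a minimal illustration, not the Polyscriptor implementation; the function name `hpp_line_bands` and the parameters `min_ink` and `min_height` are hypothetical, and the input is assumed to be an already binarized page (1 = ink, 0 = background).

```python
from typing import List, Tuple


def hpp_line_bands(
    binary: List[List[int]],
    min_ink: int = 1,
    min_height: int = 2,
) -> List[Tuple[int, int]]:
    """Split a binarized page into text-line bands using a horizontal
    projection profile.

    binary     -- rows of 0/1 pixels (assumption: 1 = ink, 0 = background)
    min_ink    -- hypothetical threshold: ink pixels a row needs to count as text
    min_height -- hypothetical threshold: bands shorter than this are dropped as noise

    Returns (top, bottom) row indices per band, with bottom exclusive.
    """
    # The projection profile: amount of ink in every row.
    profile = [sum(row) for row in binary]

    bands: List[Tuple[int, int]] = []
    start = None  # first row of the band currently being tracked
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y  # a band begins
        elif ink < min_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))  # a band ends at the first blank row
            start = None

    # Close a band that runs to the bottom edge of the page.
    if start is not None and len(profile) - start >= min_height:
        bands.append((start, len(profile)))
    return bands
```

A projection profile only separates lines cleanly when the text is roughly horizontal and the lines do not touch, which is presumably why it serves as a fallback rather than the default here.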
Tips
- Click a transcription line to highlight the corresponding bounding box in the image.
- Confidence badges: high ≥ 90%, mid ≥ 75% (and below 90%), low < 75%.
- Line-segmenting engines (CRNN-CTC, TrOCR, Kraken) use the segmentation method above. Page-level engines (Party, Qwen3-VL, Commercial APIs) do their own segmentation.
- API keys can be saved on the server — enter the key once, check Save key on server.
- Uploads are kept for 24 hours, then cleaned up automatically.
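The confidence thresholds listed above can be written as a small lookup. This is a sketch for anyone post-processing exported results, not the code the app uses; the function name `confidence_badge` is hypothetical, and it assumes confidence is given as a fraction between 0 and 1 rather than a percentage.

```python
def confidence_badge(confidence: float) -> str:
    """Map a line confidence to the badge labels described in the Tips.

    Assumption: `confidence` is a fraction in [0, 1], so 0.90 means 90%.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.90:
        return "high"
    if confidence >= 0.75:
        return "mid"
    return "low"
```

For example, `confidence_badge(0.82)` falls in the middle band and returns `"mid"`.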
Keyboard