Polyscriptor HTR

Quick Start

Select an engine from the dropdown and configure it (model path, API key, etc.).
Click Load Model and wait for the green status badge.
Upload an image or PDF by dragging it onto the upload area or clicking to browse.
Optionally click Segment to preview line detection before transcribing.
Click Transcribe. Lines appear one by one as they are processed.
Export the result as TXT, CSV, or PAGE XML.
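
If you want to post-process results outside the app, the TXT and CSV exports can be approximated with a few lines of Python. This is a minimal sketch, not Polyscriptor's own export code: the per-line dictionary shape (`text`, `confidence`) and the CSV column layout are assumptions made for illustration.

```python
import csv
import io


def export_txt(lines):
    """Join transcribed lines into plain text, one transcription line per row."""
    return "\n".join(line["text"] for line in lines)


def export_csv(lines):
    """Write line number, text, and confidence as CSV (hypothetical columns)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["line", "text", "confidence"])
    for number, line in enumerate(lines, start=1):
        writer.writerow([number, line["text"], f"{line['confidence']:.2f}"])
    return buffer.getvalue()


# Hypothetical transcription result: one dict per detected line.
lines = [
    {"text": "first line", "confidence": 0.97},
    {"text": "second, with comma", "confidence": 0.81},
]

print(export_txt(lines))
print(export_csv(lines))
```

The `csv` module quotes any field that contains a comma, so transcribed punctuation does not break the column layout.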

Engines

Engine	Best for
CRNN-CTC	Fastest; works well on Church Slavonic, Glagolitic, and Ukrainian when trained models are available
TrOCR	HuggingFace Transformer OCR; good general-purpose accuracy
Qwen3-VL	Large vision-language model; best quality but slow, needs GPU
Kraken	Classical HTR; good for Latin scripts
Party	Whole-page transformer; requires PAGE XML with line segmentation
Commercial APIs	OpenAI / Gemini / Claude — cloud inference, no local GPU needed
OpenWebUI	Locally hosted models via OpenWebUI/Ollama

Segmentation

Kraken Classical — fast, reliable for single-column pages.
Kraken Neural (blla) — handles multi-column layouts and complex page structures. Use Max columns to tune how many sub-columns to detect.
HPP — horizontal projection profile, very fast fallback.
PAGE XML upload — skip segmentation entirely by uploading an existing PAGE XML annotation (e.g. from Transkribus).
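
To illustrate why HPP is so fast, here is a minimal sketch of the general horizontal projection profile technique: count the ink pixels in each pixel row, then treat every run of non-empty rows as one text line. It is not the app's actual implementation; the function name `hpp_line_bands`, the `min_ink` threshold, and the list-of-lists binary image are assumptions chosen to keep the example self-contained.

```python
def hpp_line_bands(binary_image, min_ink=1):
    """Return (top, bottom) row ranges whose ink count per row is >= min_ink.

    binary_image is a list of rows, each a list of 0 (background) or 1 (ink).
    The bottom index is exclusive.
    """
    profile = [sum(row) for row in binary_image]  # ink pixels per row
    bands = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                   # a text line begins on this row
        elif ink < min_ink and start is not None:
            bands.append((start, y))    # the text line ended on the previous row
            start = None
    if start is not None:               # the last line touches the bottom edge
        bands.append((start, len(profile)))
    return bands


# Tiny 5x4 "page": two text lines separated by one blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]

print(hpp_line_bands(page))  # [(1, 3), (4, 5)]
```

Because the profile collapses each row to a single number, the method needs only one pass over the image. The same property explains why it suits clean single-column pages best: skewed or multi-column layouts leave no blank rows between lines.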

Tips

Click a transcription line to highlight the corresponding bounding box in the image.
Confidence badges: high ≥90%, mid ≥75%, low <75%
Line-segmenting engines (CRNN-CTC, TrOCR, Kraken) use the segmentation method above. Page-level engines (Party, Qwen3-VL, Commercial APIs) do their own segmentation.
API keys can be saved on the server — enter the key once, check Save key on server.
Uploads are kept for 24 hours, then cleaned up automatically.
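
The confidence badge thresholds listed above can be reproduced when filtering exported results. The sketch below assumes confidence is stored as a fraction between 0 and 1; the function name `confidence_badge` is hypothetical and not part of the app.

```python
def confidence_badge(confidence):
    """Map a confidence in [0, 1] to the badge tiers: high, mid, or low."""
    if confidence >= 0.90:
        return "high"
    if confidence >= 0.75:
        return "mid"
    return "low"


# Example: flag lines that probably need manual review.
scores = [0.97, 0.90, 0.82, 0.75, 0.40]
print([confidence_badge(score) for score in scores])
```

Lines tagged `low` are the natural candidates for manual correction before export.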

Keyboard

Esc — close this dialog