How it works

How Transcribe prepares a page before transcription.

Transcribe runs three preparation steps on each page: detect where the writing is, decide whether it's one page or two, and divide each page into distinct tiles. The transcription model then reads one tile at a time.



STEP 01 · DETECT

Detect the lines of writing.

Transcribe sends the image to a computer-vision service, which returns a list of quadrilaterals — one per detected line of writing. These boxes mark where the handwriting is on the page; they are not the final transcription.

Input: 1 image · arbitrary size
Service: Azure Read API
Returns: 84 lines · 4-corner polygons
Carries: rough transcription, used as hints downstream
Skips: blank margins · calibration cards
Sometimes misses: faint pencil · heavy bleed-through · tightly overlapping cursive

[Figure: Azure Read v3.2 quadrilaterals on a sample spread (1800 × 1255), n = 84 detections, average height 32 px. Each box is one line of writing. Fold marked at x = 1863 px.]
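A minimal sketch of how the detection result could be turned into line boxes. It assumes the service response follows the Read v3.2 layout, where each line carries a flat eight-number `boundingBox` under `analyzeResult.readResults[].lines[]`. The names `LineBox` and `parse_lines` are hypothetical and are not taken from Transcribe's code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LineBox:
    """One detected line: four (x, y) corners plus the rough text hint."""
    corners: tuple
    hint: str

    @property
    def x_span(self):
        # Horizontal extent of the quadrilateral, used later for the split check.
        xs = [x for x, _ in self.corners]
        return min(xs), max(xs)

    @property
    def height(self):
        ys = [y for _, y in self.corners]
        return max(ys) - min(ys)


def parse_lines(result):
    """Collect every detected line from a Read-style response dict (assumed shape)."""
    boxes = []
    for page in result["analyzeResult"]["readResults"]:
        for line in page.get("lines", []):
            flat = line["boundingBox"]  # [x1, y1, x2, y2, x3, y3, x4, y4]
            corners = tuple((flat[i], flat[i + 1]) for i in range(0, 8, 2))
            boxes.append(LineBox(corners, line.get("text", "")))
    return boxes


# Tiny hand-made response, for illustration only.
sample = {"analyzeResult": {"readResults": [{"lines": [
    {"boundingBox": [10, 10, 200, 12, 200, 42, 10, 40], "text": "To the Honble"},
    {"boundingBox": [10, 60, 200, 60, 200, 92, 10, 92], "text": "for the County"},
]}]}}
boxes = parse_lines(sample)
print(len(boxes), "boxes, average height", mean(b.height for b in boxes), "px")
```

The text hint is kept alongside the geometry because the page above says the rough transcription is reused downstream.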

STEP 02 · SPLIT

Determine whether the image is one page or two.

For each vertical slice of the image, Transcribe counts how many detected lines pass through it. A single page produces a roughly flat profile. A two-page spread produces a deep valley at the gutter. Six checks must all pass before the image is split; otherwise it is treated as a single page.

Signal: x-crossings count per column
Checks: 6 / 6 must pass
Threshold: valley depth ≥ 60% · balance ± 15%
On split: cut at gutter · pages processed independently
On miss: treat as one page · no harm done
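The crossing profile and the two named thresholds can be sketched as follows. This is an illustration under assumptions, not Transcribe's implementation: the page lists six checks but names only valley depth and balance, so only those two appear here. The definitions used (depth as one minus valley over peak, balance as the difference in line counts on each side over the total, and searching only the middle half of the image) are guesses, and `crossing_profile` and `find_gutter` are hypothetical names.

```python
def crossing_profile(spans, width):
    """Count, for each pixel column, how many line boxes cross it.

    spans is a list of (x_min, x_max) pairs, one per detected line.
    """
    diff = [0] * (width + 1)
    for x0, x1 in spans:
        lo, hi = max(0, int(x0)), min(width, int(x1) + 1)
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    profile, running = [], 0
    for d in diff[:width]:
        running += d
        profile.append(running)
    return profile


def find_gutter(spans, width, min_depth=0.60, max_imbalance=0.15):
    """Return the gutter x position, or None to treat the image as one page."""
    profile = crossing_profile(spans, width)
    peak = max(profile, default=0)
    lo, hi = width // 4, 3 * width // 4  # assumption: gutter lies near the middle
    if peak == 0 or lo >= hi:
        return None
    floor = min(profile[lo:hi])
    valley = [x for x in range(lo, hi) if profile[x] == floor]
    gutter = valley[len(valley) // 2]  # a column from the middle of the valley
    depth = 1 - floor / peak
    left = sum(1 for x0, x1 in spans if (x0 + x1) / 2 < gutter)
    right = len(spans) - left
    imbalance = abs(left - right) / len(spans)
    if depth >= min_depth and imbalance <= max_imbalance:
        return gutter
    return None


# Ten lines on each half of a 1000 px wide spread, with an empty gutter between.
spread = [(50, 450)] * 10 + [(550, 950)] * 10
print(find_gutter(spread, 1000))
```

Returning `None` on any failed check matches the fallback described above: an image that is not split is simply processed as a single page.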

STEP 03 · CROP

Divide each page into tiles.

Every pixel of the page is assigned to a tile. The number of tiles is chosen so each one contains roughly 150 words, with a maximum of three tiles per page. This ensures full coverage — even ink the line detector missed still reaches the transcription model.

Target: ~150 words per tile
Range: 1 – 3 tiles per page (hard ceiling)
Prefers: natural column / row breaks
Falls back to: even vertical bands
Ensures: full coverage · nothing thrown away

[Figure: a single page, 865 × 1320, divided into tile 01, tile 02, and tile 03 at roughly 150 words each. Every pixel is covered, with no orphaned ink.]
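A rough sketch of the tile count and the even-band fallback. Several details are assumptions: the word count is taken as given (it might come from the rough detector text), the count is rounded up, and "even vertical bands" is read as bands stacked from top to bottom. The preferred path of cutting at natural column or row breaks is not shown. `tile_count` and `even_bands` are hypothetical names.

```python
import math


def tile_count(word_count, target=150, max_tiles=3):
    """Tiles needed for about `target` words each, clamped to 1..max_tiles."""
    return max(1, min(max_tiles, math.ceil(word_count / target)))


def even_bands(width, height, n):
    """Split a page into n full-width bands as (left, top, right, bottom) boxes.

    Adjacent bands share an edge, so every pixel row lands in exactly one tile.
    """
    edges = [i * height // n for i in range(n + 1)]
    return [(0, edges[i], width, edges[i + 1]) for i in range(n)]


n = tile_count(450)
for box in even_bands(865, 1320, n):
    print(box)
```

Because the bands are derived from shared edges rather than from the detected boxes, ink the line detector missed still falls inside some tile.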

Result

What the model receives.

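Each tile is sent to the transcription model as a separate image, with metadata about which page and region it came from. The results are stitched back together in order. Because every tile is its own prompt, the tokens the LLM processes per page scale with the number of tiles (two tiles means roughly 2× the tokens), and the model never sees the full page as context.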

Input · one crop: crop 01 / 02 · 860 × 640 · 148 words
Cost: 2× tokens / page · 2 tiles → 2 prompts · no full-page context
Output · transcription: crop 01 · page 1 · spread 351 · done in 1.8 s

To the Honble the Circuit Court of the District of Columbia

for the County of Alexandria —

Humbly complaining your Orator Alexander Henderson Esq.

sheweth unto your Honours — That a certain John Forbes and

Bennett Forbes, whom your Orator prays may be made

defendants to this Bill of Complaint, are indebted to your

Orator in the sum of Three Thousand dollars or thereabouts

as will appear by your Orators account
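Stitching per-tile results back into page order might look like the sketch below. The `TileResult` fields stand in for the page-and-region metadata mentioned above. Their names, and joining tiles with a newline, are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileResult:
    page: int   # page number the tile was cut from
    tile: int   # position of the tile within that page
    text: str   # transcription returned for this tile


def stitch(results):
    """Group tile transcriptions by page and join them in tile order."""
    pages = {}
    for r in sorted(results, key=lambda r: (r.page, r.tile)):
        pages.setdefault(r.page, []).append(r.text.strip())
    return {page: "\n".join(parts) for page, parts in pages.items()}


# Results can arrive in any order, since each tile is a separate prompt.
arrived = [
    TileResult(1, 2, "for the County of Alexandria"),
    TileResult(1, 1, "To the Honble the Circuit Court of the District of Columbia"),
]
print(stitch(arrived)[1])
```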

Scale

Run a project at once.

Upload a folder of pages and process them as a batch. Pages run in parallel and progress is shown live. Batch transcription is 50% cheaper per page than one-off uploads.
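As an illustration of parallel pages with live progress, here is a small sketch using Python's standard thread pool. It is not how Transcribe schedules work. `run_batch`, `transcribe_page`, and `on_progress` are hypothetical names, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batch(pages, transcribe_page, max_workers=4, on_progress=None):
    """Run transcribe_page over all pages in parallel and report progress."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transcribe_page, p): p for p in pages}
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            if on_progress:
                on_progress(done, len(futures))  # e.g. update a progress bar
    return results


# Stand-in worker: a real one would call the transcription pipeline.
texts = run_batch(
    ["page_001.jpg", "page_002.jpg", "page_003.jpg"],
    transcribe_page=lambda name: f"text of {name}",
    on_progress=lambda done, total: print(f"{done} / {total}"),
)
```

Progress is reported as each page finishes rather than in upload order, which is what a live counter needs.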

[Screenshot: transcribe.app / projects / Mass_31_2 (Series J) during a batch run: 153 / 230 pages transcribed, 66% complete, about 3 min remaining, total cost $2.01.]

Verify

Edit the transcription beside the image.

Each page opens in a split-pane editor. Pan and zoom the image on one side, edit the text on the other. Save changes, mark the page verified, or copy the text out.

Mass_31_2 / page 143 · committee on Indian raids · crop 01 / 02 · UNVERIFIED

In reference to yê Compl^tt of yê Dep^tys of Hampshire, Concerning the wrong they Sustaine by Indians: The Comittee sees noe way for Injuryes past but to refer yê to a Course of Law: And for p^rvention thereof for future doe Judge meete, That theires Indians be forbid entertaining or harbouring of greate Numbers of strange Indians, unless they will Ingage to make satisfaction for what Injury they shall doe yê English in yê tyme of theire aboade w^th them. And that they be also acquainted How theire resorting & Living among the english Townes especially in this tyme of theire warrs w^th yê Mowhawke doth occasion much dammage to yê English many ways, & therefore that they be warned to observe S^d Lawes, & also to Shun all offence & p^rjudice to yê English…


Image quality

What works well — and what's less reliable.

Image quality is the biggest predictor of how much editing each page will need. A clean 300+ dpi scan typically needs only light corrections.

Works well

300+ dpi scans
Even lighting, no glare
Flat pages — no curl at the spine
A single hand throughout
English / Latin script
Modest marginalia

Less reliable but can still produce a useful draft

Phone snapshots at an angle
Faded pencil
Tightly packed cursive
Severe bleed-through
Multiple hands on one page
Heavily annotated or crossed-out text
Non-Latin scripts

Ship it

Export as .txt, .docx, or .pdf.

Choose a format, choose a page range, and optionally include the source images. PDF exports include a cover page and one section per page.

.txt

Plain text

One line per page. Pipe into anything — grep, search index, downstream model.

.docx

Word document

Headings per page, formatting preserved, ready to share with a researcher.

.pdf

Formatted PDF

Cover page, table of contents, optional source-image facsimiles alongside the typed text.
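Since the .txt export holds one line per page, a search over it reduces to a line scan. The sketch below assumes that line n corresponds to the n-th page of the exported range. `search_pages` is a hypothetical helper, not part of the product.

```python
def search_pages(lines, query):
    """Return (position, text) for every exported page that mentions query.

    lines can be any iterable of strings, such as an open export file.
    Matching ignores case.
    """
    needle = query.casefold()
    return [
        (n, line.rstrip("\n"))
        for n, line in enumerate(lines, start=1)
        if needle in line.casefold()
    ]


# In practice `lines` would be an open .txt export, read line by line.
export = [
    "To the Honble the Circuit Court of the District of Columbia\n",
    "Humbly complaining your Orator Alexander Henderson Esq.\n",
]
for position, text in search_pages(export, "orator"):
    print(position, text)
```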

Turn handwritten manuscripts into searchable text.
