OCR PDF — Extract Text from Scanned PDFs
Convert scanned PDF documents into searchable, selectable text using OCR. Supports 15+ languages. 100% free, processed in your browser.
Max file size: 50.0 MB
How to Extract Text from a Scanned PDF
Upload your scanned PDF
Select a scanned PDF or image-based PDF document from your device.
Choose the document language
Select the language of your document from 15+ supported languages for the best OCR accuracy.
Extract and download the text
Get the extracted text as a .txt file, copy it to your clipboard, or download a searchable PDF. All processing runs locally via Tesseract.js.
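The three steps above can be sketched as a small pipeline. This is only an illustration, not this site's actual code: `RecognizePage`, `OcrResult`, and `extractText` are hypothetical names, and the OCR engine is passed in as a function so the sketch does not depend on any particular library. It assumes the PDF pages have already been rendered to images.

```typescript
// Hypothetical shape of an OCR backend: takes one rendered page image and a
// language code, resolves to the recognized text for that page.
type RecognizePage = (pageImage: Uint8Array, language: string) => Promise<string>;

interface OcrResult {
  pages: string[]; // recognized text, one entry per page
  text: string;    // all pages joined, suitable for a .txt download
}

// Runs OCR over already-rendered page images, one page at a time.
async function extractText(
  pageImages: Uint8Array[],
  language: string,
  recognizePage: RecognizePage,
): Promise<OcrResult> {
  const pages: string[] = [];
  for (const image of pageImages) {
    // Sequential processing keeps memory use predictable on slow devices.
    const raw = await recognizePage(image, language);
    pages.push(raw.trim());
  }
  // Separate pages with a blank line in the plain-text output.
  return { pages, text: pages.join("\n\n") };
}
```

In a browser, `recognizePage` could plausibly wrap a Tesseract.js worker (created with `createWorker` and called through `worker.recognize`), with a PDF renderer such as PDF.js producing the page images first. Both of those choices are assumptions about a typical setup rather than a description of this tool.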
Making scanned documents searchable
OCR (optical character recognition) turns image-based PDFs — anything from a scanner, a phone photo, or a fax — into PDFs where you can select text, search with Ctrl+F, and copy passages. Without OCR, a scanned 50-page document is functionally a stack of pictures: there is no way to find a name, a date, or a dollar amount without reading every page.
Realistic expectations: Tesseract.js (the engine running in your browser here) handles clean printed English text at roughly 95–98% accuracy. It struggles with handwriting (use a specialized service for that), with fonts smaller than 8 point, and with documents scanned below 200 DPI. For best results, scan or photograph documents at 300 DPI or higher, with the page flat and well lit. OCR is slower than the other tools on this site (expect 2–8 seconds per page, depending on your device) because it runs a neural network in WebAssembly rather than manipulating PDF objects. Accuracy for non-English documents varies by language; the engine ships with English by default, so choose the document language before extracting. The searchable PDF output preserves the original page image and adds a hidden text layer behind it.
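The resolution and timing guidance above comes down to simple arithmetic. The sketch below uses the 200 DPI, 300 DPI, and 2–8 seconds-per-page figures from the text; the function names and the `ScanQuality` labels are made up for illustration, and the paper width must be supplied by the caller (8.5 inches for US Letter, about 8.27 for A4).

```typescript
// Thresholds taken from the guidance above.
const MIN_USABLE_DPI = 200;
const RECOMMENDED_DPI = 300;
const SECONDS_PER_PAGE = { min: 2, max: 8 };

// Effective resolution of a page image: pixels across divided by the
// physical width of the paper in inches.
function effectiveDpi(pixelWidth: number, paperWidthInches: number): number {
  return Math.round(pixelWidth / paperWidthInches);
}

type ScanQuality = "too-low" | "usable" | "recommended";

// Classifies a scan against the 200 DPI floor and the 300 DPI recommendation.
function classifyScan(dpi: number): ScanQuality {
  if (dpi < MIN_USABLE_DPI) return "too-low";
  if (dpi < RECOMMENDED_DPI) return "usable";
  return "recommended";
}

// Rough processing-time range for a document, in seconds.
function estimateSeconds(pageCount: number): { min: number; max: number } {
  return {
    min: pageCount * SECONDS_PER_PAGE.min,
    max: pageCount * SECONDS_PER_PAGE.max,
  };
}
```

For example, a US Letter page that is 2550 pixels wide works out to 300 DPI and is classified as "recommended", while the same page at 1275 pixels wide is 150 DPI and falls below the usable floor. A 50-page document at 2–8 seconds per page would take roughly 100 to 400 seconds.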