OCR (Optical Character Recognition) is the technology that converts an image of text — a photo of a page, a scanned document, a screenshot — into machine-readable text you can search, copy, edit, and reuse. An OCR app for iPhone runs this recognition on a mobile device rather than a desktop or a server, turning the camera into a text-capture tool for receipts, contracts, business cards, textbook pages, and handwriting. ScanLens performs OCR on-device using Apple's Vision framework and Neural Engine, so documents never leave the iPhone during recognition.
This page covers OCR text extraction specifically. For the full capture-and-save workflow, start with the document scanner for iPhone. For OCR inside a PDF file (an invisible text layer), see the searchable PDF page. For a comparison of scanner apps, see the PDF scanner app overview.
When you scan a document, ScanLens runs a text-recognition pipeline entirely on your iPhone.
All OCR processing happens locally using Apple's Neural Engine. Your documents never leave your iPhone during recognition, which keeps even sensitive materials private.
The OCR engine behind ScanLens is Apple's Vision framework — the same system-level text recognition that powers Live Text and the Camera app. Vision exposes a request, VNRecognizeTextRequest, that takes an image and returns recognized strings, each with a bounding box and a confidence score. ScanLens hands the scanned page to this request and uses the recognized text and its on-page coordinates to build copyable text and the invisible PDF text layer.
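The request described above can be sketched in a few lines of Swift. This is a minimal illustration of the public Vision API, not ScanLens's own code: `RecognizedLine` and `recognizeLines(in:)` are hypothetical names chosen for this example, and the page is assumed to arrive as a `CGImage`.

```swift
import Vision
import CoreGraphics

/// One recognized line of text, with Vision's confidence and bounding box.
/// (Hypothetical type for this example, not part of Vision.)
struct RecognizedLine {
    let text: String
    let confidence: Float     // 0.0 ... 1.0
    let boundingBox: CGRect   // normalized to 0...1, origin at bottom-left
}

/// Runs Vision text recognition on a single page image.
/// (Hypothetical helper name, not part of Vision or ScanLens.)
func recognizeLines(in pageImage: CGImage) throws -> [RecognizedLine] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate

    // The handler performs the request synchronously on the calling thread.
    let handler = VNImageRequestHandler(cgImage: pageImage, options: [:])
    try handler.perform([request])

    let observations = request.results ?? []
    return observations.compactMap { observation in
        // Each observation can carry several candidate readings; keep the best one.
        guard let best = observation.topCandidates(1).first else { return nil }
        return RecognizedLine(text: best.string,
                              confidence: best.confidence,
                              boundingBox: observation.boundingBox)
    }
}
```

Because `perform` blocks until recognition finishes, an app would normally call a helper like this off the main thread.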
Vision offers two recognition paths. The accurate path runs a neural network that reads whole words and lines in context, which is what catches messy spacing, varied fonts, and the run-on letters of cursive; a fast path trades accuracy for speed on simple, clean text. For document scanning the accurate path is the sensible default, and it is the one that leans most on the iPhone's machine-learning hardware.
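In the Vision API the two paths are selected with the request's `recognitionLevel` property. The sketch below shows one way to switch between them. `makeTextRequest(preferAccuracy:)` is a hypothetical helper, and pairing each level with `usesLanguageCorrection` is an assumption made for illustration, not a statement of how ScanLens configures its requests.

```swift
import Vision

/// Builds a text request for either of Vision's two recognition paths.
/// (Hypothetical helper and parameter names, used only for this example.)
func makeTextRequest(preferAccuracy: Bool) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest()
    if preferAccuracy {
        // Neural-network path: slower, reads words and lines in context.
        request.recognitionLevel = .accurate
        request.usesLanguageCorrection = true
    } else {
        // Fast path: suited to simple, clean text where speed matters.
        request.recognitionLevel = .fast
        request.usesLanguageCorrection = false
    }
    return request
}
```

A document scanner would likely call `makeTextRequest(preferAccuracy: true)`, while a live viewfinder preview might accept the fast path.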
That hardware is the Neural Engine, a block of the Apple silicon chip built to run neural networks quickly and at low power. Because text recognition is a neural-network task, Vision can run it on the Neural Engine rather than tying up the main CPU. That is why OCR on a recent iPhone typically finishes a clean page in well under a second and a multi-page scan in a few seconds, without the heat or battery drain a long upload would cost.
A scan moves through the same ordered steps every time. The photo is first normalized — perspective-corrected so the page sits square, then contrast-enhanced so ink separates from paper. Vision performs text detection, locating the regions that contain writing and discarding blank margins and graphics. Each region goes to recognition, where the neural model converts pixels into characters and groups them into words and lines. A language-model pass then resolves an ambiguous shape to a real word rather than a near-miss. Finally, ScanLens positions an invisible text layer behind the page image, which is what makes the exported PDF searchable — the same workflow described on the searchable PDF page, seen from the engine's side.
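The final step, placing text behind the page image, depends on converting Vision's bounding boxes into page coordinates. Vision reports each box normalized to the 0...1 range with the origin at the bottom-left corner. The sketch below shows that arithmetic under stated assumptions: the function names are hypothetical, and how ScanLens actually writes its PDF text layer is not documented here.

```swift
import Foundation

/// Scales a normalized box (0...1, origin bottom-left) to a page of the given
/// size, keeping the bottom-left origin that default PDF page space also uses.
func pageRect(forNormalized box: CGRect, pageSize: CGSize) -> CGRect {
    CGRect(x: box.origin.x * pageSize.width,
           y: box.origin.y * pageSize.height,
           width: box.size.width * pageSize.width,
           height: box.size.height * pageSize.height)
}

/// The same box expressed in a top-left-origin space, such as an on-screen view.
func topLeftRect(forNormalized box: CGRect, pageSize: CGSize) -> CGRect {
    let scaled = pageRect(forNormalized: box, pageSize: pageSize)
    // Flip vertically: distance from the top equals page height minus the box's top edge.
    return CGRect(x: scaled.origin.x,
                  y: pageSize.height - scaled.origin.y - scaled.size.height,
                  width: scaled.size.width,
                  height: scaled.size.height)
}
```

For example, a box covering the middle half of the page width at normalized `(0.25, 0.5, 0.5, 0.25)` on an 800 by 1000 page scales to `(200, 500, 400, 250)` in bottom-left space and to `(200, 250, 400, 250)` in top-left space.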
The whole pipeline runs on the iPhone itself. The page image, the recognized text, and any copy you make never leave the device during OCR — no upload, no account, no remote server briefly holding your document. That is a real distinction from web-based "image to text" converters and many App Store OCR tools, which send your photo to a server to recognize it. For a passport page, a medical letter, or a signed contract, the difference between "processed on my phone" and "uploaded to someone's server" is the whole point. It also means recognition works with no internet connection at all.
ScanLens OCR handles a wide range of languages and scripts, which makes it useful for international documents, academic research, and multilingual workflows.
For documents containing multiple languages, such as an English textbook with Japanese annotations, ScanLens automatically detects and processes each language without manual configuration.
An honest caveat: the language count is not a promise of uniform accuracy. Latin-script languages with large training corpora — English, Spanish, French, German — are the strongest. CJK scripts (Chinese, Japanese, Korean) recognize well on clean print but depend heavily on resolution, because each character carries far more detail per glyph than a Latin letter. Cyrillic (Russian, Ukrainian) is reliable for printed text. Arabic and other connected right-to-left scripts are harder, because the cursive joining and contextual letter forms complicate segmentation. Handwriting, in any script, is the hardest case and is covered on the scan handwriting to text page. Expect near-perfect results on clean printed Latin text, very good results on clean CJK and Cyrillic print, and budget for proofreading on Arabic, low-resolution CJK, and handwriting.
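One way to budget for that proofreading is to use the confidence score that comes back with each recognized line. The sketch below flags low-confidence lines for review. `ScoredLine`, `linesNeedingReview`, and the 0.5 threshold are all assumptions made for this example, not ScanLens behavior, and a sensible threshold would need tuning per script.

```swift
/// A recognized line paired with its confidence score.
/// (Hypothetical type for this example.)
struct ScoredLine {
    let text: String
    let confidence: Float   // 0.0 ... 1.0
}

/// Returns the lines whose confidence falls strictly below the threshold,
/// preserving their original order. The default of 0.5 is an arbitrary
/// starting point, not a recommended value.
func linesNeedingReview(_ lines: [ScoredLine],
                        below threshold: Float = 0.5) -> [ScoredLine] {
    lines.filter { $0.confidence < threshold }
}
```

A review screen could then highlight only the flagged lines instead of asking the reader to recheck the whole page.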
"OCR" gets used for three different things on iPhone. They overlap, but each fits a distinct job.
Live Text recognizes text inside Photos, the Camera viewfinder, and screenshots. Tap the text-selection glyph and copy the words. It is free, instant, and on every recent iPhone — the right tool for a phone number off a flyer, a Wi-Fi password from a card, or a few lines from a screenshot. Its limits show with documents: one image at a time, no multi-page batching, no searchable PDF, and no deskew or cleanup of a photographed page.
In-app OCR inside a scanner app is what ScanLens does. Here OCR is one stage of a document workflow: you capture or import a page, the app deskews and enhances it, recognizes the text, and keeps that text attached to the document as a searchable PDF layer, as copyable text, or as input to a PDF to Word hand-off. Recognition runs page by page across a whole document, and the text stays with the file you archive. It is the right tool when the text belongs to a document you want to keep, search, or send.
Some apps do nothing but OCR: paste an image, get text back. Many are web-backed, uploading the image to a server to recognize it. They are convenient for a single odd image, but they add a privacy cost and an internet dependency that an on-device, document-first tool avoids. If the content is sensitive, this is the category to be cautious with.
Rule of thumb: Live Text for a quick snippet, an in-app OCR feature when the text belongs to a document you are keeping. To pull text from a single photo specifically, the scan text from a photo walkthrough covers the fastest route.
ScanLens performs best on clean, well-lit printed documents and remains useful for many handwritten notes and mixed-layout pages. OCR quality depends on the source material, and when a scan comes back with errors the cause is almost always one of four factors — three of which are under your control at capture time.
Steady the phone, fill the frame, and find even light, and most printed pages recognize cleanly on the first try. Faded, creased, or stained originals stay the hardest cases for any scanner.
Once ScanLens extracts text from your documents, you can copy it, search it inside an exported PDF, or pass it to a PDF to Word hand-off.
OCR (Optical Character Recognition) converts images of text into machine-readable text. ScanLens analyzes the page layout, identifies text regions, and turns them into searchable, selectable text that you can reuse in PDFs, notes, and other documents.
ScanLens OCR supports over 50 languages including English, Spanish, French, German, Chinese (Simplified and Traditional), Japanese, Korean, Arabic, Hebrew, Russian, and many more. It handles both Latin and non-Latin scripts, and automatically detects multiple languages in the same document.
Yes, ScanLens can recognize handwritten text. Results depend on legibility, lighting, and scan quality, so the best results come from clear handwriting and strong contrast between ink and paper.
Yes, ScanLens embeds OCR text invisibly within PDFs, making them fully searchable. The visual appearance remains unchanged, but you can use Ctrl+F (or Cmd+F) in any PDF reader to search for any word or phrase in your scanned documents.
Yes, all OCR processing happens locally on your iPhone using Apple's Neural Engine. No internet connection is required, and your documents never leave your device during recognition, which keeps even sensitive documents private.
ScanLens uses Apple's Vision framework, the same on-device text recognition that powers Live Text and the Camera app. It runs a neural recognition model on the iPhone's Neural Engine, returning recognized text with on-page coordinates. ScanLens uses those coordinates to build copyable text and the invisible searchable layer in exported PDFs. Nothing is sent to a server to be recognized.
Live Text is excellent for grabbing a quick snippet from a single photo or screenshot, and it is built into every recent iPhone. A scanner app uses OCR as part of a document workflow: it deskews and enhances the page, recognizes text across a whole multi-page document, and keeps that text attached to the file as a searchable PDF layer or as input for PDF-to-Word. Use Live Text for a phone number off a flyer; use an in-app OCR feature when the text belongs to a document you want to search, archive, or send.
Recognition quality depends on the script and the source image. Latin-script languages like English, Spanish, French, and German are the strongest. Clean Chinese, Japanese, Korean, and Cyrillic print recognizes well but needs higher resolution because the characters carry more detail per glyph. Arabic and other connected right-to-left scripts are harder because the letters join and change shape in context. Handwriting is the hardest case in any script. For best results, use a sharp, well-lit, high-resolution capture and expect to proofread dense CJK at low resolution, Arabic, and any handwriting.
Download ScanLens free and try on-device OCR on your iPhone. See pricing and plans for the full feature set.