BizJournals Portfolio
Oct 31 2008 4:05pm EDT

Google Is Now Scanning Documents

Google has begun using optical character recognition (OCR) technology to index documents posted online that contain images of text, it announced yesterday on its blog.

Previously, only documents converted to PDFs containing text were indexed and included in results. Because scanned documents are only pictures of text, they are typically more difficult to interpret, and the pages can include wrinkles, smudges or stains.

This advance opens up a whole new collection of information, including many government and academic documents once hidden from public searches.

The news comes a few days after Google settled its book-scan suit, giving it the go-ahead to continue its book search project.

By Chris Snyder for Wired.com


 