Google Is Now Scanning Documents
Google has begun using Optical Character Recognition (OCR) technology to index documents posted online that contain images of text, the company announced yesterday on its blog.
Previously, only documents converted to PDFs with embedded text were indexed and included in results. Because scanned documents are only pictures of text, they are typically more difficult to interpret, and the pages can include wrinkles, smudges or stains.
This advancement opens up a whole new collection of information, including many government and academic documents once hidden from public searches.
The news comes a few days after Google settled its book-scan suit, giving it the go-ahead to continue its book search project.
By Chris Snyder for Wired.com