Google Docs adds OCR, converts images and PDFs to text
Google Docs continues to make the case for dumping your desktop work apps, this time with a useful new text recognition feature that converts PDFs or images into plain, editable text. This new OCR feature -- that's optical character recognition -- is quite accurate, and worked pretty well on some old college textbook scans I had lying around on my hard drive. Things get a bit tricky when you've got a page with multiple columns: your words might not end up in the right order, but they'll all be there, accurately recorded.
To use OCR, look for the "Convert text from PDF or image files to Google Docs documents" checkbox when you're uploading a file. The file will show up in Google Docs as a text document instead of its original format, so if you want to share the image, you'll have to upload it again with the box unchecked.
Google Operating System tested the new feature and didn't find it quite as accurate as I did. I agree with them that the loss of formatting is a problem, but in my tests the OCR did better than the 90% accuracy they noted in theirs. Your mileage, obviously, may vary. The typeface, font size and scan quality of your PDF will all affect the results, but it should definitely be easier than re-typing the whole thing by hand.
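If you want to put your own number on "accuracy," here is a minimal sketch in Python. It assumes character-level accuracy based on edit distance, which is one common way to score OCR output; neither this post nor the Google Operating System test says how the 90% figure was measured, and the function names below are made up for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Hypothetical scoring helper: 1 - (edit distance / reference length),
    clamped at 0. Assumes you have a hand-typed reference to compare against."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    typed_by_hand = "optical character recognition"
    from_ocr = "optica1 character recogniti0n"  # two misread characters
    print(f"{char_accuracy(typed_by_hand, from_ocr):.1%}")
```

To try it, type out a short passage from your scan by hand, paste in the text Google Docs produced for the same passage, and compare. Note that a score like this punishes scrambled column order heavily, even when every individual word was read correctly.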
We also previously covered Google Docs' OCR feature when it was still an experiment.

Comments (4)
m0r1arty Jun 26th 2010 7:01AM
Whilst the previous comments have inspired me to consider alternatives to fashion, I was just wondering: how long do you think it will be until there is a translating version of this on the go?
It would be very handy to have Google 'Goggle' type features available.
216 Jun 22nd 2010 7:38AM
Very useful
stanccia Aug 22nd 2010 9:24PM
I know this app. It uses the Tesseract engine. Here is another one you can try: free ocr. It only supports English though.
stanccia Aug 22nd 2010 10:01PM
The site is at: http://www.goodocr.com