Google’s Book Scanning: An Open Source Offering


One of the more impressive embedded vision implementations (IMHO) that I've come across, albeit one of the more potentially troubling from a copyright infringement perspective, is Google's book-scanning, de-warping system. As detailed in patent 7,508,978, filed in September 2004 and awarded to the company in March 2009 (with thanks to CNET-via-Slashdot for the summary):

Google's system uses two cameras and infrared light to automatically correct for the curvature of pages in a book. By constructing a 3D model of each page and then 'de-warping' it afterward, Google can present flat-looking pages online without having to slice books up or mash them onto a flatbed scanner. Stephen Shankland writes that the 'sophistication of the technology illustrates that would-be competitors who want to feature their own digitized libraries won't have a trivial time catching up to Google.' First, a book is placed on a flat surface, while above it, an infrared projector displays a special mazelike pattern onto the pages.

Next, two infrared cameras photograph the infrared pattern from different perspectives. 'The images can be stereoscopically combined, using known stereoscopic techniques, to obtain a three-dimensional mapping of the pattern,' according to the patent. 'The pattern falls on the surface of (the) book, causing the three-dimensional mapping of the pattern to correspond to the three-dimensional surface of the page of the book.'
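To make the stereo-then-de-warp idea a bit more concrete, here's a minimal Python sketch. It is not taken from the patent. It assumes an idealized pair of rectified cameras, where the depth of a pattern dot follows from the textbook relation Z = f * B / d, and it "flattens" a single row of the page by measuring arc length along the recovered surface. The function names, units and sample numbers are all hypothetical, chosen only for illustration.

```python
import math


def depth_from_disparity(focal_px, baseline_mm, disparity_px):
    """Depth (mm) of one pattern dot seen by two rectified cameras.

    Textbook relation Z = f * B / d; an assumed simplification,
    not the method described in the patent.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_mm / disparity_px


def flatten_profile(xs_mm, heights_mm):
    """Map samples across a curved page to positions on a flat page.

    Each sample's flattened position is the arc length along the
    surface from the first sample, approximated by summing the
    straight-line distances between neighboring samples.
    """
    if len(xs_mm) != len(heights_mm):
        raise ValueError("xs_mm and heights_mm must have the same length")
    flat = [0.0] if xs_mm else []
    for i in range(1, len(xs_mm)):
        dx = xs_mm[i] - xs_mm[i - 1]
        dz = heights_mm[i] - heights_mm[i - 1]
        flat.append(flat[-1] + math.hypot(dx, dz))
    return flat


# Hypothetical numbers: a bump that looks 6 mm wide from directly above
# but rises 4 mm in the middle actually spans 10 mm of paper.
print(flatten_profile([0.0, 3.0, 6.0], [0.0, 4.0, 0.0]))  # [0.0, 5.0, 10.0]
```

A real system would repeat this across the whole page and resample the captured image onto the flattened coordinates; the sketch only shows why knowing the page's height profile is enough to undo the apparent squeeze near the binding.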

Pretty cool technology. But it's also pretty complex, and therefore pricey, technology. For a simpler, brute-force scheme for surmounting curved and crooked pages, check out this just-open-sourced design from Google, which employs a scanner and a vacuum cleaner (!!!) and costs just ~$1,500 (with further reductions possible):

Engineers from Google's Books team have released the design plans for a comparatively reasonably priced (about $1500) book scanner on Google Code. Built using a scanner, a vacuum cleaner and various other components, the Linear Book Scanner was developed by engineers during the '20 percent time' that Google allocates for personal projects. The license is highly permissive, thus it's possible the design and building costs can be improved.

It sucks. But in a good way (unless, that is, you're an embedded vision aficionado who'd prefer a more elegant algorithmic approach).

Sorry, couldn't resist 😉
