The Internet Archive
See the following:
How to Use Sphinx to Give an Old Book New Life
By Joshua Allen Holm | November 29, 2016
The Internet Archive, Project Gutenberg, and Google Books are wonderful sources of historical books, but the finished products of their digitization efforts, while thorough and functional, lack that last bit of polish. For example, one of my interests is historical cooking, specifically Georgian and Regency British cookery and the contemporary period in American cookery, but the PDF versions of the relevant cookbooks are usually just basic black-and-white scans with no features that aid findability or searchability. The plain text versions, while more searchable, are not aesthetically pleasing and often contain numerous optical character recognition errors...