Text and Data Mining Databases

Guidance on using library-licensed resources for text and data mining.



The Text Creation Partnership provides full-text transcriptions of early print books from the following collections:

  • Early English Books Online (general access through ProQuest)
  • Eighteenth Century Collections Online (general access through Gale)
  • Early American Imprints: Evans, 1639-1800 (general access through Readex)

Each corpus is accessed separately from the web-based search platforms that Georgetown University users normally use to browse these collections. Visit the Text Creation Partnership website for more information about each corpus and how to access it.

Any text mining of these collections should be carried out through the Text Creation Partnership, not by scraping the individual databases on the ProQuest, Gale, or Readex platforms.
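As an illustration of working from transcription files rather than a vendor platform, the following is a minimal Python sketch that counts word frequencies in a single transcription. It assumes the file is TEI-encoded XML in the TEI P5 namespace, which may not match every file you obtain. The sample document, the `word_counts` function, and the tokenization rule are hypothetical and are not part of any Text Creation Partnership tooling.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for one transcription file obtained from the
# Text Creation Partnership; a real file would be read from disk instead.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>A Sample Title</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <p>The booke of the world, and the world of the booke.</p>
  </body></text>
</TEI>"""

# Assumption: the file uses the TEI P5 namespace. Files without a
# namespace would need plain tag names instead.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def word_counts(tei_xml: str) -> Counter:
    """Count lowercase alphabetic tokens in the <text> element only,
    so that header metadata is excluded."""
    root = ET.fromstring(tei_xml)
    text_el = root.find(".//tei:text", TEI_NS)
    if text_el is None:
        return Counter()
    # itertext() walks every nested element and yields its text content.
    text = " ".join(text_el.itertext())
    return Counter(re.findall(r"[a-z]+", text.lower()))


counts = word_counts(SAMPLE_TEI)
print(counts.most_common(3))
```

Early modern spelling varies widely (for example "booke" and "book"), so a real analysis would likely need a normalization step that this sketch omits.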

Quick Info:

Coverage: full collections, regardless of Georgetown University license
Registration required? No
API available?


Publication restrictions? No
Library permission required? No

Publisher Resources:

Library Resources Covered:

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.