
Text and Data Mining Databases

Guidance on using library-licensed resources for text and data mining.

HathiTrust Research Center


The HathiTrust Research Center provides access to the more than 13 million public domain works in the HathiTrust Digital Library. In addition to API access, it offers the following tools to assist researchers:

  • Portal and Workset Builder: a set of tools for assembling collections of digitized text and performing text analysis on them. 
  • HathiTrust+Bookworm: a tool for visualizing and analyzing word usage trends in the HathiTrust Digital Library. 
  • HTRC Data Capsule: a secure computing environment for performing researcher-driven text analysis on HathiTrust content. 

Access to the APIs and the above tools (except HathiTrust+Bookworm) requires a personal account on the HathiTrust Research Center website.

Quick Info:

Coverage: The entire HathiTrust public domain corpus (more than 13 million works)
Registration required? Yes
API available? Yes (multiple)
Publication restrictions? No
Library permission required? No

Publisher Resources:

Library Resources Covered:

  • Access to the HathiTrust Research Center is not dependent on Georgetown University's database licenses.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.