A very small but growing number of library databases and subscriptions offer some form of access to corpora for text and data mining, usually through defined methods. This guide provides information on specific collections licensed by Georgetown University Library that permit some form of text and data mining activity. Many publishers offer text and data mining access via an API that is separate from the search interfaces used for everyday access to these titles. Note that this information is subject to change by the publisher. Always review the provider's text and data mining documentation before beginning your project.
Note that many publishers do not permit text and data mining on their resources. Find out more about those resources.
Some providers listed in this guide limit the content available for text and data mining to metadata only, Georgetown-subscribed content only, Open Access content only, or a limited number of their products. Providers rarely permit traditional screen scraping.
Except for the resources listed below, under the terms and access policies provided for each, electronic resources licensed by Georgetown University usually:
Violation of these terms can easily result in access to the electronic resource being shut down for the entire campus, affecting your research and that of your fellow Georgetown University faculty and students around the world. Please review the Responsible Use of Electronic Resources policy for more information.
The Library actively advocates for its e-resource suppliers to provide text and data mining rights and support to Georgetown users.
If you are unsure of the text and data mining policies of a library resource, contact firstname.lastname@example.org in advance.
Many library database, e-book, and e-journal providers do not permit text and data mining on their products or do not have solutions available to Georgetown University. These include:
This work is licensed under a Creative Commons Attribution NonCommercial 4.0 International License.