HathiTrust Digital Library

A partnership of academic & research institutions, offering a collection of millions of titles digitized from libraries around the world.

What is the HathiTrust Research Center?

The HathiTrust Research Center (HTRC) provides an infrastructure to search, collect, analyze, and visualize the full text of nearly 3 million public domain works within the HathiTrust and is intended for nonprofit and educational researchers. It is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with HathiTrust, to help meet the technical challenges researchers face when dealing with massive amounts of digital text. It develops cutting-edge software tools and cyberinfrastructure to enable advanced computational access to the growing digital record of human knowledge.

The HTRC builds tools and services for scholars to perform research using data from the HathiTrust Digital Library. The Center is breaking new ground in the areas of text mining and non-consumptive research, allowing scholars to fully utilize HathiTrust content while preventing intellectual property misuse within the confines of current U.S. copyright law.

Tools from the HTRC

The HathiTrust Research Center creates tools for researchers to analyze the text of content in the HathiTrust Digital Library. These tools make text analysis easier, and many have easy-to-use interfaces.

Extracted Features

An unrestricted dataset of metadata and word counts for each page in the HathiTrust Digital Library.
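
To give a sense of what page-level word counts make possible, here is a minimal Python sketch that adds up per-page counts into totals for a whole volume. The `pages` records and the `token_counts` field name are hypothetical and simplified for illustration; the actual Extracted Features files are richer and are not laid out exactly like this.

```python
from collections import Counter

# Hypothetical, simplified page-level records (an assumed layout, not the
# real Extracted Features schema): each page maps a word to how many times
# it appears on that page.
pages = [
    {"page": 1, "token_counts": {"the": 12, "whale": 3, "sea": 2}},
    {"page": 2, "token_counts": {"the": 9, "whale": 5, "ship": 4}},
]


def volume_counts(pages):
    """Sum page-level word counts into one volume-level Counter."""
    total = Counter()
    for page in pages:
        total.update(page["token_counts"])
    return total


totals = volume_counts(pages)
print(totals.most_common(2))  # [('the', 21), ('whale', 8)]
```

Because only counts are stored, not the running text, this kind of analysis can be done without reading the pages themselves.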

HTRC Bookworm

The HathiTrust Bookworm lets you search for lexical trends over time across millions of volumes in HathiTrust. Search by date, publication country, language, words per volume, number of words, or resource type to see how often words have been used.

Text Analysis Algorithms

HTRC Algorithms are web-based, click-and-run tools to perform computational text analysis on volumes in the HathiTrust Digital Library. The algorithms can help you explore, analyze, and visualize public worksets or those you have created.

Data Capsules

A system of secure computing environments for performing researcher-driven text analysis on the HathiTrust corpus. All users may access public domain items, but access to in-copyright items is restricted to member-affiliated researchers.

HTRC Beta Tools

Experiment with HTRC in-development and beta tools. Some tools may eventually be integrated into HTRC's suite of tools, or they may be taken down. Some tools may require experience with programming.

Worksets and Datasets

Worksets

HTRC worksets are user-created collections of HathiTrust volumes to be treated as data and analyzed using HTRC tools and services. Worksets are curated by researchers, and they can be shared and cited to improve reproducibility.

Datasets

HTRC releases research datasets to facilitate text analysis using the HathiTrust Digital Library.

Creating Collections

What are collections?

HathiTrust Digital Library allows users to create lists, or collections, of items in HathiTrust. These collections can be searched independently of the rest of the repository and are a great way to gather related resources together. Each collection can be kept private or made public so that other users can view it. HathiTrust has many different types of collections publicly available. Examples of featured collections include Women Composers Collections, Records of the American Colonies, Islamic Manuscripts, and Ancestry and Genealogy.

Types of Collections

Temporary Collections

Without logging in, you can create temporary collections. These collections have all the same functionality as permanent collections, but are only available until the end of the browser session.

Permanent Collections

If you are logged into an institutional account, you can create permanent collections. Permanent collections will be available any time that you are logged in, and you can keep adding items to grow the collection. These collections can be shared via many different social media platforms and can be made public so they can be found on HathiTrust.

Creating and Adding to Collections

To create a collection:

  • To create a permanent collection, log in with your CWU ID. To create a temporary collection, you do not need to log in.
  • Click on the "Collections" or "My Collections" option on the top of any HathiTrust page
  • Click on "Create a New Collection"
  • Name the collection and give it a description. If you are logged in, you can choose whether it is public or private.
  • Click "add"

To add a work to a collection:

  • Via Full-text search:
    • Search for items using the full-text search
    • Each result has a box next to it; select the results that you would like to add
    • At the top of the results, choose the collection you want to add the selected results to
    • Click "add"
  • Via individual item page:
    • Search for an item
    • Click on either the "Full View" or "Limited (search only)" option to get to the individual item page
    • In the "Add to Collections" menu on the left-hand side, select a collection
    • Click "add"

HathiTrust collections may be used to save books for future reference, search across selected volumes, share with others, or download metadata for selected volumes.

For more information see:

Searching Collections

Clicking on the Collections tab at the top of the homepage brings up a list of all the collections created and made public by HathiTrust users. These public collections can be sorted in a variety of ways, including by number of items, title, and owner. You may also search public collections by keyword. There is no advanced search option for collections.