Description: An exploratory project using neural nets to expand the reach of metadata and encourage novel connections across text-based cultural heritage collections in the Library of Congress.
Situating Ourselves in Cultural Heritage

What if you could see the breadth of content across hundreds of thousands of historical documents at a glance? What if a machine could read all of them and help you find your own connections, and your own understanding of America?
This experimental project aimed to use, learn about, and help improve computational access to Library of Congress digital content, and to expose more of the Library’s rich collections to scholars and the public. Using machine learning, this exploratory tool serves as a proof of concept for finding potentially similar or related content outside of manual classification. It incorporates all available documents from the Reconstruction era (1865-1877), because how we tell the story of this period is critical to
As you begin an exploration of the collections, items will scatter across the screen. There are thousands of articles, letters, book chapters, manuscripts, and other text-rich items represented in the display, so some clusters or dots may represent several items; zoom in to see more.