Multilingual Tools for Digital Humanities Projects
April 29th is the Day of DH 2021, an initiative led by centerNet where digital humanities (DH) practitioners share their work with a global community through social media and blog posts. The 2021 organizing committee in the U.S. is led by Loyola University Chicago, a fellow Jesuit institution, and UCLA around the theme of “multilingual DH.” To participate in this online conversation, we’re highlighting our own Italian Pamphlet Digital Collection and sharing a selection of tools and resources that can help with multilingual DH projects.
Italian Pamphlet Collection at Fordham Libraries
Managing digital collections is one of the many ways that libraries and librarians participate in DH scholarship. Digitization allows Fordham Libraries to share aspects of our unique collection with scholars around the world through the web. For those who read Italian and are interested in the history of Italian unification, known as the Risorgimento, Fordham’s Digital Collections portal holds a collection of nearly 1,600 pamphlets on this topic. Published between 1815 and 1880, these pamphlets reflect the Catholic Church’s response to the changes occurring in 19th-century Italy.
Looking for a user-friendly tool to examine texts with your students? Voyant Tools is “a web-based reading and analysis environment for digital texts.” This set of tools can be helpful to anyone looking to explore patterns and structures of a text. The user interface can be configured in 10 languages and the overall platform can analyze multiple languages. Check out the Languages section of their documentation page to learn more about the options, as well as strengths and weaknesses.
For those who want to dig deeper to understand this set of tools, we recommend reading Hermeneutica: Computer-Assisted Interpretation in the Humanities by Geoffrey Rockwell and Stéfan Sinclair, the creators of Voyant Tools. Fordham Libraries owns this title in both print and electronic formats.
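For readers comfortable with a little scripting, the kind of word-frequency count that text analysis tools surface can be sketched in a few lines of Python. This is only an illustration, not how Voyant Tools works internally: the function name `term_frequencies`, the tiny stopword list, and the sample sentence (made up for this example, not drawn from the pamphlet collection) are all assumptions.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real project would use a fuller list suited to 19th-century Italian.
ITALIAN_STOPWORDS = {
    "il", "la", "l", "di", "e", "che", "per", "un", "una",
    "a", "in", "del", "dell", "della",
}


def term_frequencies(text, stopwords=ITALIAN_STOPWORDS):
    """Count how often each non-stopword appears in a text."""
    # [^\W\d_]+ matches runs of Unicode letters, so accented
    # characters such as "à" stay inside their words.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(token for token in tokens if token not in stopwords)


# Made-up sample sentence, not taken from the collection.
sample = "La libertà della Chiesa e la libertà dell'Italia: la Chiesa e l'Italia."
freqs = term_frequencies(sample)

# Show the three most frequent remaining words with their counts.
print(freqs.most_common(3))
```

Results from a sketch like this depend heavily on the stopword list and on how elided forms such as "dell'Italia" are split, which is one reason the language options in a tool's documentation are worth reading closely.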
Trying to transcribe text in scanned images to create a humanities dataset? Rather than typing out each word, ABBYY FineReader can help automate the process of recognizing text within images through OCR (optical character recognition). This proprietary software option can be adjusted to recognize multilingual and monolingual typed text in 195 languages. It can OCR text in paragraphs on a page or text in tabular or spreadsheet format. As with any automated process, the quality of OCR results will vary based on factors such as scan resolution, typeface, and the condition of the original page.
Visit the Digital Scholarship Workstation in Walsh Library to try out ABBYY FineReader for your DH projects. Feel free to consult a library liaison for DH for advice about OCR’ing a file and optimizing the language settings in ABBYY.
For advanced researchers, Multilingual DH is “a loosely-organized international network of scholars using digital humanities tools and methods on languages other than English.” They host an email list and a GitHub page including a list of natural language processing tools sorted by language.
Continue Exploring DH
Be sure to explore the global DH conversations happening today on Twitter and Instagram under the hashtags #DayOfDH2021 and/or #multilingualDH.
Curious about DH work happening at Fordham? See our Day of DH summary from 2020 and visit this blog regularly for updates about DH and Fordham Libraries. As always, feel free to Ask a Librarian or consult our Digital Humanities research guide if you have questions.
By Tierney Gleason, Reference and Digital Humanities Librarian