
Digital Humanities: Documenting

This guide provides general and specialized information on the theory, tools, methodologies, and communities of practice in the digital humanities.
Last Updated: Feb 23, 2024 4:22 PM

What is Digital Humanities Data?

Humanities scholars do not always think of their research as "data," but the material that humanists collect, analyze, and contextualize includes data of many types: text, images, numeric data, sound, video, artifacts, etc.  This data may not be digital in its original form, but it can be converted to a digital format and computationally analyzed and visualized in new ways.  It is increasingly common for humanities data to be "born-digital." 

Organizing and documenting data can be daunting, but establishing a system early in the research process saves time later, especially when it comes to grant applications and publishing.  This page lists some helpful resources.

Documenting Your Project

Regardless of what stage your digital project is at, you probably have computer file folders full of articles, images, notes, drafts, and possibly spreadsheets.  Creating documentation files that explain your project and your data will help ensure the project's sustainability and re-use.  Ideally, these files will be platform-agnostic (usable on many systems, such as Windows and Mac).

There are many different ways to document projects and data, and often there is no one-size-fits-all solution.  It is important to document both the project process (context, administrative info, workplan, workflows, etc.) and the digital content (files, metadata, code, etc.).
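As one illustration of a platform-agnostic documentation file, the sketch below assembles a plain-text README that records both project process and digital content.  It is a minimal example, not a standard: the function name `build_readme`, the section labels, and the sample project details are all hypothetical placeholders.

```python
from datetime import date


def build_readme(title, process, contents, updated=None):
    """Return plain-text README content for a project.

    `process` maps labels (context, workflow, ...) to short notes;
    `contents` maps file names to descriptions. Both layouts are
    illustrative assumptions, not an established template.
    """
    updated = updated or date.today().isoformat()
    lines = [title, "=" * len(title), "", f"Last updated: {updated}", ""]

    # Project process: context, administrative info, workflows, etc.
    lines.append("Project process")
    for label, note in process.items():
        lines.append(f"  {label}: {note}")
    lines.append("")

    # Digital content: files, metadata, code, etc.
    lines.append("Digital content")
    for filename, description in contents.items():
        lines.append(f"  {filename}: {description}")

    return "\n".join(lines) + "\n"


# Placeholder project details, for illustration only.
readme_text = build_readme(
    "Example Letters Project",
    {
        "Context": "Transcriptions of archival correspondence",
        "Workflow": "Photograph, transcribe, review, publish",
    },
    {
        "letters.csv": "One row per letter, with date and sender",
        "images/": "Research photographs, named by box and folder",
    },
    updated="2024-01-15",
)
print(readme_text)
```

Saving the result with `pathlib.Path("README.txt").write_text(readme_text, encoding="utf-8")` yields a file that opens in any text editor on any operating system.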

An Introduction:

The Project Documentation Checklist from the Socio-Technical Sustainability Roadmap provides an introduction to recordkeeping considerations.  

Additional Resources:

The links below and elsewhere on this page provide examples of processes or platforms that facilitate documentation.

Open Science Framework - OSF is a robust open-source platform for managing and documenting research projects.  Don't let the word "science" steer you away: it can be used for humanities projects as well.  Here is a post about how it is being used in the Eleanor Roosevelt Papers Project.

Preserving Your Research Data - This Programming Historian lesson provides concrete examples of how an individual scholar can effectively document digital research.  

GitHub - GitHub is a repository and version control platform that provides a range of functionalities.  It allows collaborative code development, review, and hosting with project management options (cards, notes, task lists).  GitHub Pages and Wikis provide space to write, record, and publish project documentation. 

Colectica for Excel - This tool allows you to document your spreadsheet data within an Excel workbook, using DDI (Data Documentation Initiative), an open standard for data documentation.

Organizing Images

Humanities scholars in many disciplines use images in their research or take photos of archival documents.  Taking good quality images and documenting their provenance can facilitate re-use for analysis, publishing, digital projects, teaching, and long-term preservation.  These resources provide guidance on specifications for taking photographs and organizing them.  

Tropy - Tropy is free, open-source software that allows you to organize and describe your research photographs.  You can annotate images, record metadata, group images, add tags and notes, and search for images.

Using Digital Tools for Archival Research - This is a research guide created by the University of Illinois Library Scholarly Commons.  Focused on using digital photography in the archives, it discusses how to get started, how to organize images, what software to use for editing, and more.  Included in the Organization section is a template for recording important metadata that can be filled out and stored with images. 
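A metadata record kept alongside research photographs can be as simple as a CSV file with one row per image.  The sketch below shows the idea in Python.  The column names are assumptions chosen for illustration and are not taken from the University of Illinois template, and the sample rows are placeholders.

```python
import csv
import io

# Hypothetical column names; adapt them to your archive and project.
FIELDS = [
    "filename",
    "repository",
    "collection",
    "box",
    "folder",
    "date_photographed",
    "notes",
]


def metadata_csv(records):
    """Return CSV text with a header row and one row per image record."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        # Fields missing from a record become empty cells.
        writer.writerow({field: record.get(field, "") for field in FIELDS})
    return buffer.getvalue()


# Placeholder records, for illustration only.
csv_text = metadata_csv(
    [
        {
            "filename": "IMG_0001.jpg",
            "repository": "Example Archive",
            "collection": "Example Family Papers",
            "box": "3",
            "folder": "12",
            "date_photographed": "2024-01-15",
            "notes": "Page 1 of 2",
        },
        {"filename": "IMG_0002.jpg"},
    ]
)
print(csv_text)
```

Storing the resulting file in the same folder as the images keeps the provenance information with the photographs, and a CSV file can be opened in any spreadsheet program.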

Documenting & Managing Data

Data can quickly get unwieldy in a digital project.  Managing data efficiently from the beginning can save time and hassle later.  These guides provide tips for stewarding your project data through its entire lifecycle.

Start Here: Research Data Management - This UB Libraries research guide provides guidance on responsible data management practices and includes information specific to UB regarding data privacy, security, and storage.

  • Data Management Plans (DMPs) - Data management plans provide a way to document the type of data you will be collecting for your project and establish procedures for stewarding data through the research lifecycle.  Plans are useful for any project and are typically a requirement for federal and international grant applications.  

Data Management for the Humanities - UCLA Library has an extensive research guide that is geared towards humanities data.  

Digital Humanities Data Curation - This is an introduction and community guide to digital humanities data curation.  Data curation is “the active and ongoing management of data throughout its entire lifecycle of interest and usefulness to scholarship” (Cragin et al. 2007).  Trevor Muñoz and Julia Flanders were the managing editors, and many experts in the field contributed to this guide.

Data Management Plan Tutorial - This tutorial by Penn State provides modules that explain the different components of a data management plan (DMP), which is often required by grant-funding agencies, including the NEH Office of Digital Humanities.


Broadly speaking, metadata is data about data.  We all create and interact with metadata in our daily lives, whether it is through social tagging, research documentation, or searching databases with keywords and subject terms.  In the digital humanities, metadata helps make a project "accessible, authoritative, interoperable, scalable, and preservable." (Gilliland, from Setting the Stage)
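To make "data about data" concrete, the sketch below describes a single image as a small metadata record and checks it for gaps.  The field names are borrowed from Dublin Core elements (title, creator, date, format, subject, rights).  Treating four of them as required is an assumption made for this example, and the record values are placeholders.

```python
import json

# Assumption for this example: these four fields must be filled in.
REQUIRED = ("title", "creator", "date", "rights")


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED if not record.get(field)]


# Placeholder record, for illustration only.
record = {
    "title": "Photograph of a letter (placeholder)",
    "creator": "Researcher name (placeholder)",
    "date": "2024-01-15",
    "format": "image/jpeg",
    "subject": ["correspondence", "example"],
    "rights": "",
}

print(missing_fields(record))  # ['rights']
print(json.dumps(record, indent=2))
```

A check like this is one small way to keep records consistent, and writing them out as JSON keeps them readable by both people and software.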

Ontologies and Metadata by Johanna Drucker, in Intro to Digital Humanities

Setting the Stage by Anne J. Gilliland, in Introduction to Metadata, edited by Murtha Baca

Documenting the Data in Penn State's Data Management Plan Tutorial

Messy, Clean, Missing: Data Complexity

"The Point of Collection" and "On Missing Datasets" by Mimi Onuoha

"Against Cleaning" by Katie Rawson and Trevor Muñoz

Digital Humanities was created by UB Libraries' 2018-2020 CLIR Postdoctoral Fellow, Heidi Dodson. It is currently maintained by Stacy Snyder. Guide content is licensed CC BY 4.0.
