
Processing and Description: Digital Material

Guide for UB Special Collections covering accessions, processing, description, and care of collection material.
Last Updated: Sep 27, 2021 4:22 PM

DIGITAL MATERIAL

Collection-level description of born-digital and hybrid collections

The following guidelines are adapted from the UC Guidelines for Born-Digital Archival Description, version 1.0, 2017. 

(University of California Systemwide Libraries. (2017). UC Guidelines for Born-Digital Archival Description. UC Office of the President: University of California Systemwide Libraries. Retrieved from https://escholarship.org/uc/item/9cg222jc.)

FIELD EAD TAG REQUIRED CONTENT EXAMPLES
Extent <extent> Yes/Repeatable

A quantitative measurement of processed digital content.

Include both the size of the digital material in GB as well as the total number of files that have been preserved.

Use “GB” instead of “Gigabytes”, “gigabyte”, “Gb”, or “GBs”.

When calculating size, round to three decimal places only when content is less than 1 GB.

For unprocessed material where capacity is unknown or difficult to estimate, include a count of the unprocessed media formats.

When describing an archival object that is represented by both digital and analog materials, choose Portion = "Part"

Hybrid Collection:

Extent: 109 linear feet
Extent: 985 GB

Container summary:

6 oversize boxes, 1 manuscript box, and 3400 GB (37,364 files)

 

Born-Digital Collection:

Extent: 3750 GB
Container summary: 58,439 digital files

Container summary: 3 unprocessed hard drives (100 GB) and 10 unprocessed 3.5" floppy disks
Abstract <abstract> Optional

Brief summary of the collection. If there is significant born-digital material present in a collection, the Abstract should reflect this.

The collection includes digital photographs, correspondence, including email, reports, meeting minutes, and other documentation created during professional involvement on University committees.
Appraisal Information <appraisal> Optional

Note any general information about actions relating to appraisal, deduplication, or weeding of digital files. Refer to or link to library policy if applicable. Do not include specific technical details about the process of de-duplication or weeding in Appraisal Information. Use the Processing Information Note to provide additional information if necessary.

Temporary and deleted files were removed from this collection, according to the Library’s digital preservation and privacy policies.
Arrangement <arrangement> Yes

Include a sentence or two about how the digital materials were organized and arranged.

Note whether or not the original order of the files has been maintained.

Note whether the digital materials have been segregated into their own series or subdivided into multiple series, and if so, on what basis (i.e., content, format, etc.). This is especially important to note in the context of hybrid collections, as well as in cases where there may be duplicative or overlapping material, which is often due to a donor’s migration and/or backup routines.

Exception: If the files have been re-arranged by the processor into a new folder structure or if file names were altered, include this information in the Processing Information note.

May repeat at series-level and sub-series-level.

Born-digital materials are integrated into their corresponding series based on content. The original order of the files is retained.

Sub-series arranged alphabetically by course title or subject. Born-digital materials are arranged in the order in which they were created by Professor Claude Welch.

Digital materials have been placed into the ‘writings’ series based on their content. No other changes have been made to the original arrangement, structure, or naming of the files.

Collection is arranged into series: Correspondence; Teaching materials; Photographs; Research and Publications.
Conditions Governing Access <accessrestrict> Optional

Use this field to inform researchers about special instructions needed to access the digital content.

Label = Special Viewing Instructions

Content = Access to digital material provided via the University Archives’ Digital Archives online repository.

Conditions Governing Use <userestrict> Optional This element is used in the same way as it is for describing physical materials.
Immediate Source of Acquisition <acqinfo> Optional

Record general information about the acquisition of born-digital material, such as the source, date, and type of acquisition.

Acquisition of born-digital content often involves technical processing after media has been physically transferred. The specific methods and processes of born-digital acquisition and data capture should be detailed in a Processing Information Note.
The digital files from the Arnold Berleant papers were donated by Arnold Berleant in June 2016 on one hard drive.
Physical Characteristics and Technical Requirements <phystech> Optional

Use this field to inform researchers about the physical or technical characteristics of digital materials that will affect their ability to access them.

If any portion of a collection contains digital material that cannot be readily accessed by researchers, then a note must be placed at every level of the collection (i.e., series, box, folder, etc.) to notify researchers that this is the case. This ensures that researchers will see the requirements necessary to access material.

May repeat at sub-series level

Unprocessed materials example:

University Archives cannot provide access to all media formats in Series I. Correspondence due to lack of required hardware. For more information, please contact lib-archives@buffalo.edu.

Please note that the University Archives is not able to provide access to two 3.5-inch floppy disks located in this series due to data corruption.

Processing Information <processinfo> Yes

If the digital materials were processed at a later date or by a different person than the rest of the collection, specify when and by whom they were processed.

Digital materials processed by Archives staff in 2018.

Processed by John Smith in February 2016. Digital materials processed in 2017 by Jane Doe, under the supervision of Sarah Jones.
Processing Information (continued) <processinfo> Yes

Information that helps future archivists and users understand where the materials came from, how they were created, and how they can be accessed. Record all decisions to migrate data from storage media, normalize file formats, redact or remove Personally Identifiable Information (PII), extract files, or alter filenames. Also record any material that could not be processed due to technical limitations.

Also note if processors have altered original file structure, deleted empty directories, changed (or not) permissions, etc.

This collection was processed in accordance with the University Archives processing guidelines/policies. For more information, see https://research.lib.buffalo.edu/digitalpreservation/processing

Digital files created by [insert donor name] were transferred to the University Archives on various types of storage media including compact discs and floppy disks. Where possible, digital content saved on storage media was migrated from the storage media, normalized to standard preservation and access formats, and transferred to a stable preservation environment following the University Libraries' Digital Preservation guidelines. Some folder titles were altered and file arrangement reorganized to assist researchers in locating and identifying digital content.

Some files have been redacted to protect personally identifiable information or Protected Health Information (PHI) in accordance with federal regulations and UB Libraries policies.

The original file structure has been maintained; duplicate files have not been deleted.
Scope and Content <scopecontent> Yes

Describe the functions that created the digital materials, the types of records described, and the date ranges. Sometimes a processor will edit an existing Scope and Content note. When adding description to an existing note, incorporate description of the born-digital material in a balanced way.

May repeat at series-level and sub-series-level

This series contains Arnold Berleant's digital manuscripts. Included are drafts, corrections, and limited correspondence with editors and peers.
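The Extent rules above (use “GB”, round to three decimal places only below 1 GB, and report a file count) can be sketched in a few lines of Python. This is only an illustration, not part of the guideline: the function names are hypothetical, the conversion of 1 GB to 10^9 bytes is an assumption the guide does not state, and rounding values of 1 GB or more to one decimal place is likewise an assumption chosen to match the examples.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes; the guide does not specify


def format_extent(size_bytes: int) -> str:
    """Return an extent statement such as '985 GB' or '0.482 GB'."""
    gb = size_bytes / BYTES_PER_GB
    if gb < 1:
        # Below 1 GB the guide asks for three decimal places.
        return f"{gb:.3f} GB"
    # Assumption: at or above 1 GB, keep at most one decimal place.
    rounded = round(gb, 1)
    number = str(int(rounded)) if rounded == int(rounded) else str(rounded)
    return f"{number} GB"


def extent_with_files(size_bytes: int, file_count: int) -> str:
    """Combine size and file count, as in the hybrid container summary."""
    return f"{format_extent(size_bytes)} ({file_count:,} files)"


print(format_extent(3_750_000_000_000))
print(extent_with_files(3_400_000_000_000, 37364))
```

The label is always written as “GB”, so the unit variants the guide warns against never appear in the output.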

 

Describing series and sub-series in born-digital and hybrid collections

The following guidelines are adapted from the UC Guidelines for Born-Digital Archival Description, version 1.0, 2017. 

(University of California Systemwide Libraries. (2017). UC Guidelines for Born-Digital Archival Description. UC Office of the President: University of California Systemwide Libraries. Retrieved from https://escholarship.org/uc/item/9cg222jc.)

FIELD EAD TAG REQUIRED CONTENT EXAMPLES
Date <unitdate> Yes/Repeatable

A Creation date signifies the date of the creation of the resource.

A Modified date signifies the most recent date of change to the resource. A Modified date is used only when describing digital materials and can be used in conjunction with a creation date.

A Normalized date signifies the date of file normalization. A normalized date is used when files are converted to a different file format for the purpose of long-term preservation. A normalized date may be used in conjunction with a creation and/or modified date.

Label = Creation

Expression = April 1, 2017

Type = Single

Begin = 2017-04-01

 

Label = Modified

Expression = April 24, 2017

Type = Single

Begin = 2017-04-24

 

Label = Normalized

Expression = May 30, 2017

Type = Single

Begin = 2017-05-30

Extent <extent> Yes

A quantitative measurement of processed digital content in this series/sub-series.

Include both the size of the digital material in GB as well as the total number of files that have been preserved.

Use “GB” instead of “Gigabytes”, “gigabyte”, “Gb”, or “GBs”.

When calculating size, round to three decimal places only when content is less than 1 GB.

For unprocessed material, estimate according to the maximum storage capacity of the media.

When describing an archival object that is represented by both digital and analog materials, choose Portion = "Part"

Born-Digital Collection:

Extent: 7 GB
Container summary: 1,526 files, 20 folders

Container summary: 3 unprocessed hard drives (100 GB) and 10 unprocessed 3.5" floppy disks

Arrangement <arrangement> Yes

Include a sentence or two about how the digital materials are organized and arranged.

Note whether or not the original order of the files was maintained during processing.

Note whether the digital materials are segregated into their own series or subdivided into multiple series, and if so, on what basis (i.e., content, format, etc.). This is especially important to note in the context of hybrid collections, as well as in cases where there may be duplicate material.

Exception: If the files were re-arranged by the processor into a new folder structure or if file names were altered, include this information in the Processing Information note.

May repeat at sub-series level

Born-digital materials are integrated into their corresponding series based on content. The original order of the files is retained.

Sub-series arranged alphabetically by course title or subject. Born-digital materials are arranged in the order in which they were created by Professor Claude Welch.

Digital materials have been placed into the ‘writings’ series based on their content. No other changes have been made to the original arrangement, structure, or naming of the files.

Conditions Governing Access <accessrestrict> Optional

Use this field to inform researchers about special instructions needed to access the digital content.

Label = Special Viewing Instructions

Content = Access to digital material provided via the University Archives’ Digital Archives online repository.

Physical Characteristics and Technical Requirements <phystech> Optional

Use this field to inform researchers about the physical or technical characteristics of digital materials that will affect their ability to access them.

If any portion of a collection contains digital material that cannot be readily accessed by researchers, then a note must be placed at every level of the collection (i.e., series, box, folder, etc.) to notify researchers that this is the case. This ensures that researchers will see the requirements necessary to access material.

May repeat at sub-series level

Unprocessed materials example:

University Archives cannot provide access to all media formats in Series I. Correspondence due to lack of required hardware. For more information, please contact lib-archives@buffalo.edu.

Please note that the University Archives is not able to provide access to two 3.5-inch floppy disks located in this series due to data corruption.

Processing Information <processinfo> Optional/Repeatable

Information that helps future archivists and users understand where the materials came from, how they were created, and how they can be accessed. Record all decisions to migrate data from storage media, normalize file formats, redact or remove Personally Identifiable Information (PII), extract files, or alter filenames. Also record any material that could not be processed due to technical limitations.

Also note if processors have altered original file structure, deleted empty directories, changed (or not) permissions, etc.

This collection was processed in accordance with the University Archives processing guidelines/policies.

Digital files created by [insert donor name] were transferred to the University Archives on various types of storage media including compact discs and floppy disks. Where possible, digital content saved on storage media was migrated from the storage media, normalized to standard preservation and access formats, and transferred to a stable preservation environment following the University Libraries' Digital Preservation guidelines. Some folder titles were altered and file arrangement reorganized to assist researchers in locating and identifying digital content.

Some files have been redacted to protect personally identifiable information or Protected Health Information (PHI) in accordance with federal regulations and UB Libraries policies.

Scope and Content <scopecontent> Yes

Describe the functions that created the digital materials, the types of records described, and the date ranges. Sometimes a processor will edit an existing Scope and Content note. When adding description to an existing note, incorporate description of the born-digital material in a balanced way.

May repeat at series-level and sub-series-level

This series contains Arnold Berleant's digital manuscripts. Included are drafts, corrections, and limited correspondence with editors and peers.
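The Date examples above pair a natural-language Expression with an ISO-formatted Begin value. A minimal Python sketch of that pairing for single dates follows; the function name and the dictionary layout are hypothetical, and the month name relies on Python's default (English) locale.

```python
from datetime import date

ALLOWED_LABELS = {"Creation", "Modified", "Normalized"}


def single_date(label: str, day: date) -> dict:
    """Build the four values shown in the Date examples for one date."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected date label: {label}")
    return {
        "Label": label,
        # e.g. 'April 1, 2017' (no zero padding on the day)
        "Expression": f"{day:%B} {day.day}, {day.year}",
        "Type": "Single",
        # e.g. '2017-04-01'
        "Begin": day.isoformat(),
    }


print(single_date("Creation", date(2017, 4, 1)))
```

Deriving both values from one `date` object keeps the Expression and Begin fields from drifting apart when a record is corrected.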

 

Describing files in born-digital and hybrid collections

The following guidelines are adapted from the UC Guidelines for Born-Digital Archival Description, version 1.0, 2017. 

(University of California Systemwide Libraries. (2017). UC Guidelines for Born-Digital Archival Description. UC Office of the President: University of California Systemwide Libraries. Retrieved from https://escholarship.org/uc/item/9cg222jc.)

FIELD EAD TAG REQUIRED CONTENT EXAMPLES
Date <unitdate> Yes/Repeatable

A Creation date signifies the date of the creation of the resource.

A Modified date signifies the most recent date of change to the resource. A Modified date is used only when describing digital materials and can be used in conjunction with a creation date.

A Normalized date signifies the date of file normalization. A normalized date is used when files are converted to a different file format for the purpose of long-term preservation. A normalized date may be used in conjunction with a creation and/or modified date.

Label = Creation

Expression = April 1, 2017

Type = Single

Begin = 2017-04-01

 

Label = Modified

Expression = April 24, 2017

Type = Single

Begin = 2017-04-24

 

Label = Normalized

Expression = May 30, 2017

Type = Single

Begin = 2017-05-30

Extent <extent> Yes

A quantitative measurement of processed digital content in this file.

Include both the size of the digital material in GB as well as the total number of digital files and folders that are preserved.

Use “GB” instead of “Gigabytes”, “gigabyte”, “Gb”, or “GBs”.

When calculating size, round to three decimal places only when content is less than 1 GB.

For unprocessed material, estimate according to the maximum storage capacity of the media.

When describing an archival object that is represented by both digital and analog materials, choose Portion = "Part"

Born-Digital Collection:

Extent: 7 GB
Container summary: 1,526 files, 20 folders

Container summary: 3 unprocessed hard drives (100 GB) and 10 unprocessed 3.5" floppy disks

Arrangement <arrangement> Yes

Include a sentence or two about how the digital materials were organized and arranged.

Note whether or not the original order of the files has been maintained.

Note whether the digital materials have been segregated into their own series or subdivided into multiple series, and if so, on what basis (i.e., content, format, etc.). This is especially important to note in the context of hybrid collections, as well as in cases where there may be duplicate material.

Exception: If the files were re-arranged by the processor into a new folder structure or if file names were altered, include this information in the Processing Information note.

The original order of the files is retained.

Sub-folders are arranged alphabetically by course title or subject.

Conditions Governing Access <accessrestrict> Optional

Use this field to inform researchers about special instructions needed to access the digital content.

Label = Special Viewing Instructions

Content = Access to digital material provided via the University Archives’ Digital Archives online repository.

Physical Characteristics and Technical Requirements <phystech> Optional

Use this field to inform researchers about the physical or technical characteristics of digital materials that will affect their ability to access them.

If any portion of a collection contains digital material that cannot be readily accessed by researchers, then a note must be placed at every level of the collection (i.e., series, box, folder, etc.) to notify researchers that this is the case. This ensures that researchers will see the requirements necessary to access material.

May repeat at sub-series level

Unprocessed materials example:

University Archives cannot provide access to all media formats in Series I. Correspondence due to lack of required hardware. For more information, please contact lib-archives@buffalo.edu.

Please note that the University Archives is not able to provide access to two 3.5-inch floppy disks located in this series due to data corruption.

Processing Information <processinfo> Optional/Repeatable

Information that helps future archivists and users understand where the materials came from, how they were created, and how they can be accessed. Record all decisions to migrate data from storage media, normalize file formats, redact or remove Personally Identifiable Information (PII), extract files, or alter filenames. Also record any material that could not be processed due to technical limitations.

Also note if processors have altered original file structure, deleted empty directories, changed (or not) permissions, etc.

Digital files created by [insert donor name] were transferred to the University Archives on various types of storage media including compact discs and floppy disks. Where possible, digital content saved on storage media was migrated from the storage media, normalized to standard preservation and access formats, and transferred to a stable preservation environment following the University Libraries' Digital Preservation guidelines. Some folder titles were altered and file arrangement reorganized to assist researchers in locating and identifying digital content.

Some files have been redacted to protect personally identifiable information or Protected Health Information (PHI) in accordance with federal regulations and UB Libraries policies.

Scope and Content <scopecontent> Yes

Describe the functions that created the digital materials, the types of records described, and the date ranges. Sometimes a processor will edit an existing Scope and Content note. When adding description to an existing note, incorporate description of the born-digital material in a balanced way.

Contains Arnold Berleant's digital manuscripts. Included are drafts, corrections, and limited correspondence with editors and peers.
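The file and folder counts used in a Container summary can be gathered by walking the transferred directory. The following Python sketch is one possible way to do this, assuming the files sit on a locally mounted file system; the function names are hypothetical and symbolic links are skipped as an illustrative choice, not a stated policy.

```python
import os


def summarize_directory(root: str) -> tuple[int, int, int]:
    """Return (file count, folder count, total size in bytes) under root."""
    files = folders = size = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)  # sub-folders only; root itself is not counted
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # assumption: links are not counted as preserved files
            files += 1
            size += os.path.getsize(path)
    return files, folders, size


def container_summary(files: int, folders: int) -> str:
    """Format counts in the style of the Container summary examples."""
    return f"{files:,} files, {folders:,} folders"


print(container_summary(1526, 20))
```

The byte total returned by `summarize_directory` can then be converted to GB for the Extent value.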

Basic Information

FIELD TAG XIP REQUIRED CONTENT EXAMPLE
Title <title> <title> Yes This element provides a word or phrase by which the material being described is known or can be identified. A title may be devised or formal. Title is often auto-filled by Preservica and is the name of the collection or deliverable unit being described and may include the date of creation. Newsletters 2005
Identifier <refid> <collectionref> Yes An identifier unique to this digital object. When describing a digital object stored in Preservica, the string will be a UUID: 32 hexadecimal digits displayed in five hyphen-separated groups. The identifier is located in the External Documents > Location Field, among other locations in the record. 464a6113-3deb-4903-805c-23a1db330c59
VRA Core level     Yes

Choose from drop-down list. Primarily for use in multi-level VRA Core compliant records to indicate that a description is about a collection, a work, or an image.

Work (a built or created object)

Collection (an aggregate of such objects)

Image (a visual surrogate of such objects)

Work

Collection

Image
Digital Object Type <typeOfResource>   Yes Choose a generic term indicating the basic content type of the digital object from a drop-down list.  
Language <language> <language> Optional The language of the material being described English
Restrictions     Optional Select this tick box if access restrictions exist  
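Because the Identifier example above has the shape of a UUID, a quick format check can catch copy-and-paste errors before a record is saved. This Python sketch uses the standard `uuid` module; the function name is hypothetical, and requiring the hyphenated form is an assumption based on the single example shown.

```python
import uuid


def looks_like_identifier(value: str) -> bool:
    """True if value is 32 hex digits in the hyphenated UUID layout."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # uuid.UUID also accepts unhyphenated and braced forms, so compare
    # against the canonical rendering to require the hyphenated layout.
    return str(parsed) == value.lower()


print(looks_like_identifier("464a6113-3deb-4903-805c-23a1db330c59"))
```

The check only validates the format; it does not confirm that the identifier exists in Preservica.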

 

File Versions record*

*Two file version records are generated automatically; one for the digital preservation staff system and one for the electronic record access system.

FIELD REQUIRED CONTENT EXAMPLES
Make Representative Optional Select for Electronic Record Access System file version  
File URI Yes File URI is supplied by Preservica. Edit the field to remove all text except the URL. https://buffalo.access.preservica.com/archive/sdb%3AdeliverableUnit|613cdbdf-978c-4871-9bd3-42a71a19aafa
Publish Optional

Select tick box to publish the Electronic Record Access System file version only.

 
Use Statement Yes

This statement indicates the use for which the digital object is intended.

Supplied by Preservica.

Digital Preservation Staff System

Electronic Record Access System
XLink Actuate Attribute Yes

This attribute is used during export to indicate how the digital object should display.

Choose from drop-down list

Used in conjunction with Show Attribute
onRequest
XLink Show Attribute Yes

This attribute is used during export to indicate where the target of a link should be displayed.

Choose from drop-down list

Used in conjunction with Actuate Attribute
new
File Format Name Optional

The name of the format for the file type(s)

Choose from drop-down list

Only applicable at the collection and deliverable unit level 

 
File Format Version Optional

The version of the format for the file type

 

Separate multiple versions using a semicolon

 

Leave blank for digital objects containing multiple file formats
1.4; 1.5
File Size No The size of the digital file  
Checksum No A digital signature used for monitoring the integrity and authenticity of a digital file  
Checksum method No The algorithm used for generating checksums  
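For readers unfamiliar with the Checksum and Checksum method fields, the sketch below shows how a checksum is typically computed for a file with Python's standard `hashlib` module. It is illustrative only: the guide does not say which algorithm Preservica applies, so the `sha256` default and the function name are assumptions.

```python
import hashlib


def file_checksum(path: str, method: str = "sha256") -> str:
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(method)  # 'method' corresponds to the Checksum method field
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recomputing the value later and comparing it with the stored one is how a change to the file would be detected.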

To create thumbnails add a third File Version record and fill in the following fields:

FIELD REQUIRED CONTENT EXAMPLES
File URI Yes

A URL to the thumbnail for the folder or file, including the Deliverable Unit UUID

https://buffalo.access.preservica.com/download/thumbnail/deliverableUnit_UUID

https://buffalo.access.preservica.com/download/thumbnail/deliverableUnit_8c3a0804-06d5-4e52-a843-6463a8781a81

Publish Yes Select tick box to publish the thumbnail.  
Use Statement Yes Select the appropriate Type. For an image thumbnail, select Image-Thumbnail. Image-Thumbnail
XLink Actuate Attribute Yes

This attribute is used during export to indicate how the thumbnail should display.

Choose from the drop-down list

Used in conjunction with Show Attribute

onLoad
XLink Show Attribute Yes

This attribute is used during export to indicate where the target of a link should be displayed.

Choose from drop-down list

Used in conjunction with Actuate Attribute

embed
File Format Name Optional

The name of the format for the file type

Choose from drop-down list

Only applicable at the collection and deliverable unit level

JPEG File Interchange Format
File Format Version No    
File Size No    
Checksum No    
Checksum method No    
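The thumbnail File URI follows the pattern shown in the table above: a fixed base address followed by `deliverableUnit_` and the Deliverable Unit UUID. A small Python sketch of assembling it is given here; the function name is hypothetical, and the base address is copied from the example rather than from Preservica documentation.

```python
import uuid

# Base address taken from the example in this guide.
THUMBNAIL_BASE = "https://buffalo.access.preservica.com/download/thumbnail/"


def thumbnail_uri(deliverable_unit_uuid: str) -> str:
    """Build the thumbnail File URI for a Deliverable Unit UUID."""
    # Parsing first rejects malformed identifiers (raises ValueError).
    canonical = str(uuid.UUID(deliverable_unit_uuid))
    return f"{THUMBNAIL_BASE}deliverableUnit_{canonical}"


print(thumbnail_uri("8c3a0804-06d5-4e52-a843-6463a8781a81"))
```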

 

Dates sub-record

FIELD REQUIRED CONTENT EXAMPLES
Label Yes/Repeatable

Describes the type of activity that the date signifies.

Modified is preferable to Creation when describing digital materials. Where both a modified and a creation date are present, the creation date should be displayed first.

A Normalized date signifies the date of file normalization. A normalized date is used when files are converted to a different file format for the purpose of long-term preservation. A normalized date may be used in conjunction with a creation and/or modified date.

Creation

Modified

Normalized

Expression Yes A natural language field used to express the date or date range of the materials August 20 2017-August 31 2017
Type Yes

Indicate the type for normalized date information, either a single date or a date range (inclusive or bulk).

Choose from a drop-down list

Single

Inclusive

Bulk

Begin Yes The date at which the material begins 2017-08-20
End Yes The date at which the material ends 2017-08-31
Certainty No    
Era Yes Indicates the period during which years are numbered, such as B.C. or C.E. Defaults to “ce”
Calendar Yes Indicates the system of reckoning time Defaults to “Gregorian”
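The Dates sub-record fields above can be illustrated with a date range. The Python sketch below fills in the Expression, Begin, and End values from two dates, applies the stated Era and Calendar defaults, and sorts records so that a Creation date is displayed first. The function names and dictionary layout are hypothetical, and placing Normalized after Modified is an assumption; the guide only states that Creation comes first.

```python
from datetime import date

# Creation first, as the guide requires; the rest of the order is an assumption.
DISPLAY_ORDER = {"Creation": 0, "Modified": 1, "Normalized": 2}


def _words(day: date) -> str:
    # Matches the example style 'August 20 2017' (default English locale).
    return f"{day:%B} {day.day} {day.year}"


def range_date(label: str, begin: date, end: date, date_type: str = "Inclusive") -> dict:
    """Build a Dates sub-record for a date range."""
    if end < begin:
        raise ValueError("End date precedes Begin date")
    return {
        "Label": label,
        "Expression": f"{_words(begin)}-{_words(end)}",
        "Type": date_type,
        "Begin": begin.isoformat(),
        "End": end.isoformat(),
        "Era": "ce",              # default stated in the table
        "Calendar": "Gregorian",  # default stated in the table
    }


def display_order(records: list[dict]) -> list[dict]:
    """Sort date records so that Creation is displayed first."""
    return sorted(records, key=lambda record: DISPLAY_ORDER[record["Label"]])


print(range_date("Creation", date(2017, 8, 20), date(2017, 8, 31))["Expression"])
```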

 

Extent sub-record

FIELD REQUIRED CONTENT EXAMPLES
Portion Yes Choose whole Whole
Number Yes

Open numeric field used to express the number of units in the extent statement

Express in gigabytes

For files less than 1 MB, round up to 0.001 GB

23.5

0.02
Type Yes

A term indicating the type of unit used to measure the extent of the materials.

Digital materials are measured in gigabytes
Gigabytes
Container Summary Yes Open text field describing the number of files and optionally, the number of folders

12,000 files

12,000 files, 82 folders