On February 15, 2010, in Research, by Chris Prom

After I spoke at the Society of Archivists’ Data Standards Group, a member of the audience asked whether I had been working to evaluate software suitable for appraising records, i.e. helping archivists or producers select records for deposit into a trusted digital repository. At the time I responded (somewhat off the cuff) that I had found particular file managers, renamers, and bulk deletion programs to be useful, but that I hadn’t really considered the question all that much.

But as I reflected on it later, the question seemed to grow more complex. Most, if not all, of the development work concerning digital repositories focuses on meeting the requirements of the OAIS reference model. However, the reference model itself has nothing to say about how records should be selected for deposit. On one hand, this makes sense, since each archives has a different focus. But appraisal (i.e. the selection of records for inclusion in an archives) has always been one of the most debated (or at least most heavily written about) archival topics.

Regardless of which theory or set of principles an archivist uses, appraisal will always be a process that requires careful and intelligent decision making by a person. Each repository has its own collection focus and documentation policy.

In the analog archives world, appraisal decisions are made both before and after formal accession/submission.  After surveying the records (either formally or informally), the archivist will work with a records producer to determine which classes (series) of records should be retained for their continuing administrative, legal, or historical value.  After submission, records may be subject to further appraisal, as particular files or items are removed (“weeded”) from those that are permanently retained.  If the files are poorly organized (or not organized at all), the archivist may reorganize the records or supply new file/folder titles to increase their accessibility.

For appraising electronic records, I think that four basic things are needed:

  • A set of tools to examine, identify, compare, delete, rename, and reorganize records at both high and low granularity
  • A tool to manage information concerning records surveys/assessments.
  • A tool to manage submission agreements  (perhaps based on the model described in the TAPER project).
  • A method or set of tools to ensure that appraisal actions are documented.

While it may be possible to program selection rules into a sophisticated selection algorithm, most repositories would be happy simply to have efficient ways for the archivist to appraise electronic records. So, we need a set of tools that allow us to examine, characterize, delete, and possibly reorder records quickly. Such tools would allow us to decide whether or not the records fall within the scope of the archives’ documentation policy, then take appropriate actions concerning them.
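As a rough illustration of the "examine and characterize" step, the sketch below walks an accession, totals files by extension, and groups exact duplicates by checksum. It is a minimal sketch under assumptions, not a description of any existing appraisal tool: the function names `sha256_of` and `survey_accession` are hypothetical, and SHA-256 is simply one reasonable choice of checksum.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def survey_accession(root):
    """Summarize files under root by extension and group exact duplicates.

    Returns (by_extension, duplicates). Both names are illustrative only.
    """
    root = Path(root)
    by_extension = defaultdict(lambda: {"count": 0, "bytes": 0})
    by_checksum = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Files with no extension are grouped under a placeholder key.
        ext = path.suffix.lower() or "(none)"
        by_extension[ext]["count"] += 1
        by_extension[ext]["bytes"] += path.stat().st_size
        by_checksum[sha256_of(path)].append(path.relative_to(root).as_posix())
    # Only checksums shared by two or more files are reported as duplicates.
    duplicates = {h: p for h, p in by_checksum.items() if len(p) > 1}
    return dict(by_extension), duplicates
```

A survey like this only reports what is there. Deciding which duplicate to keep, and whether a class of files falls within the documentation policy, remains a judgment for the archivist.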

PREMIS may be a model for documenting appraisal actions, but it tracks preservation actions (such as migrations and generation of checksums), and it tracks information for individual files, not for groups of files. EAD provides the <appraisal> and <processinfo> elements, in which the archivist can supply notes describing exactly what was done, although it is an open question how often those fields are actually used.
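To make the idea of documenting an action against a group of files concrete, here is a minimal sketch that appends one entry per appraisal action to a JSON Lines log. Everything in it is an assumption for illustration: the function name `record_appraisal_action` and the field names are hypothetical and are not drawn from PREMIS, EAD, or any other standard.

```python
import json
from datetime import datetime, timezone


def record_appraisal_action(log_path, action, paths, reason, agent):
    """Append one appraisal action, covering a group of files, to a log.

    The field names below are hypothetical, not taken from any standard.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,          # e.g. "delete", "rename", "retain"
        "agent": agent,            # who made the decision
        "reason": reason,          # free-text justification
        "file_count": len(paths),
        "paths": sorted(paths),    # the group of files acted upon
    }
    # One JSON object per line, appended so earlier entries are never rewritten.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

A log along these lines could later be summarized by hand into an EAD appraisal or processing note, so that the finding aid states what was removed and why.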

In my next post, I’ll point to some software that can be used for the types of appraisal tasks listed above. For now, my point is simply that the collective “we” has not put very much emphasis on developing tools to appraise, as opposed to preserve, electronic records. But this is a very big problem when you are staring at an accession of 25 GB in over 28,000 files.

  • Seth

    This question of effective electronic records appraisal has been on many of our minds for the past few years. As I read your last line “But, this is a very big problem when you are staring at an accession of 25 GB in over 28,000 files” I look at my desk, upon which sit ~220 disks (a mix of both 4.7 GB DVDs & 700 MB CDs) which were handed to one of our collectors as part of an organization’s closing, with no opportunity for early appraisal. This raised the question: do we even want these materials? I continually tell collectors to appraise based on content & context rather than simply the form or medium, fully realizing that we don’t have very good tools to assist them in this task.

  • cjh

    Georgia Tech’s PERPOS project has been developing such tools for NARA’s ERA.

  • 80gb

    Glad the question seems to be proving a fruitful avenue of research! My original motivation for asking it was really very simple – although I’d used various tools to examine and identify files to weed from a complex accession, I did not know of any which then allowed the archivist to move easily and seamlessly to the next step – actual deletion (and documenting that deletion). Secondly, your point about comparison is, I think, very important. The accession I was dealing with was not well organised and had multiple duplicates from different phases of network storage and on CDs. Identifying the duplicates is relatively straightforward. The decision about which to destroy (and which context, by extension, the archivist feels happy to eliminate) is much harder – Emory’s work on the overlapping circles of Salman Rushdie’s personal archive is interesting in this regard: