The Ideal Appraisal Tool

On April 1, 2010, in Research, by Chris Prom

As I’ve been working to pare my sets of test records down to a manageable set that might comprise the records to be included in a submission information package (SIP), I’ve more or less come to conclude that what archivists really need is a purpose-built tool for conducting records appraisal. Sure, it is possible to cobble together an approach to records appraisal using a variety of open source or paid tools, but none of those tools really allows you to go through and identify records quickly and easily.

What would an ideal e-records appraisal tool look like?

Before we get to that, allow me a short digression:

A Major Problem in Working with E-records

Most, if not all, applications that support electronic records work require that the records be in a more or less ‘archive ready’ form. For example, if you wish to run a batch conversion operation on records, they generally must have the correct file extension, be held in a single directory without other formats, etc. Furthermore, most of the applications used to view and display electronic records support only a number of defined formats. But records are not stored that way on users’ computers, and few if any donors will be able to arrange them ‘properly’ prior to deposit. Even if they could, asking donors to rearrange the files would undermine their provenance and original order.

The large-scale projects concerning digital preservation tend to focus on issues like migration, preservation planning, and the development of description, repository, and access systems. To my knowledge, no one is currently working on a tool that archivists can use to select and weed through records as part of appraisal or processing in anticipation of deposit, and to do so in a way that preserves original order. The closest thing is the SABA Copying Program, which I described yesterday and which is really useful only in Denmark. (In addition, Tufts University is focusing some attention on developing an application and content model for records surveys and submission agreements, which is a separate matter, but could support this other tool that I have in mind.)

In order to get records into the archives, and work with them once there, what is really needed is a tool that can be used to do the same type of appraisal that is done in traditional analog archival work both before archives are deposited and as they are being processed (e.g. weeded) or prepared for permanent storage.  Doing this type of work has been the biggest roadblock I’ve encountered in working with my ALA Office of Intellectual Freedom Files.

Based on the experience I’ve had working with my test records, I think the ideal appraisal tool would:

  • be platform independent
  • be able to be run from a USB key/CD
  • not delete records or modify them in any way on the source machine or folder, but simply copy them to a new folder as a SIP
  • include methods to manually exclude records from copying based on defined criteria or individual selection
  • include ways to bulk include/exclude files and folders
  • facilitate quick browsing and marking of folders/records, perhaps using an open source interface, like OS X’s ‘Cover Flow’, that displays the most common file types right in the interface (without needing to load external applications).
  • include tree analysis tools, to identify particular types of files, show how much space is occupied by folders or file types, and browse/mark them in a graphical interface.
  • include filtering options, so that records of only a particular date, extension, etc. can be browsed and marked at one time.
  • facilitate duplicate identification and marking for exclusion, across multiple folders.  It would allow duplicate marking based on defined criteria (e.g. retain only those highest in the folder hierarchy, etc.)
  • provide flexibility in how the records are copied into the SIP folder, to facilitate additional processing, e.g.:
    • allow user defined naming syntax
    • in original order
    • in folders sorted by date ranges
    • in folders sorted by extension or PUID
  • record all actions, including file exclusion decisions, renaming, resorting, etc., undertaken during the copying operation, using a simple tab-delimited or CSV file and also an XML syntax.  Decisions should be recorded for each item.  Items or folders that are not copied should be listed.  Some of this information may need to be entered manually, but most could be captured automatically during the copying process, in a manner similar to that used by the SABA Copying Program.  (Information from this operation can later be used, perhaps by a processing application, to reassemble the original order or to accompany the files as part of the AIP’s provenance information.)
  • provide some very simple forms which the archivist can use to manually supply accession name, inclusive dates, donor information, and the appraisal information/rationales used (“all files in x folder were not included because they were duplicates”, etc.)
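
To make a few of these requirements concrete, here is a rough Python sketch of the core copying step. It is only an illustration, not an existing tool: the function name `build_sip`, its parameters, and the CSV column names are all made up for this example. It assumes the SIP folder and the log file sit outside the source folder, treats files with identical SHA-256 checksums as duplicates, and retains the copy highest in the folder hierarchy. The source folder is only ever read.

```python
import csv
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_sip(source, sip_dir, log_path, excluded_extensions=()):
    """Copy selected files from source into sip_dir and log every decision.

    Hypothetical helper: the source folder is read but never changed.
    """
    source, sip_dir = Path(source), Path(sip_dir)
    excluded = {ext.lower() for ext in excluded_extensions}
    files = [p for p in source.rglob("*") if p.is_file()]
    # Shallowest paths first, so the copy highest in the hierarchy is retained.
    files.sort(key=lambda p: (len(p.relative_to(source).parts), str(p)))

    seen = {}  # checksum -> relative path of the retained copy
    rows = []
    for path in files:
        rel = path.relative_to(source)
        checksum = sha256_of(path)
        if path.suffix.lower() in excluded:
            action, reason = "excluded", f"extension {path.suffix.lower()}"
        elif checksum in seen:
            action, reason = "excluded", f"duplicate of {seen[checksum]}"
        else:
            seen[checksum] = rel.as_posix()
            target = sip_dir / rel  # same relative path: original order kept
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also tries to keep timestamps
            action, reason = "copied", ""
        rows.append([rel.as_posix(), action, reason, checksum])

    # One row per item, including the items that were not copied.
    with open(log_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["original_path", "action", "reason", "sha256"])
        writer.writerows(rows)
    return rows
```

A call such as `build_sip("donor_files", "sip", "sip_log.csv", excluded_extensions=[".tmp"])` (folder names are placeholders) would copy everything except `.tmp` files and duplicates, and leave behind a CSV listing what happened to each file and why. The browsing interface, the alternative sort orders, and the XML log from the list above are left out of this sketch.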

  • Richard Lehane

    Hi Chris,
    thanks for a very practical post – I love the list, you are ready to go out to tender!
    I was a little confused by your use of the term ‘appraisal’ in this context. At one point you write ‘appraisal or processing’: are you proposing a tool for file by file processing (like weeding through a very messy paper deposit for junk and copies) or for assessment of significance? Or for both?
    On your ‘Recommendations’ page you make a clear distinction between ‘appraisal/assessment’ and ‘processing’ (steps 5 and 6). Those two steps seem to be conflated here?

  • Chris Prom

    I guess the two terms are conflated a bit, but in general I was trying to indicate that while files are being processed, in most cases, some level of appraisal is done, to remove materials that will not be included in the final processed records. Some, but not all, of the tools I listed here would be useful for appraisal at a less granular level (such as the series level).
