Much to Learn. . .

On September 10, 2009, in Projects, Research, Software Reviews, by Chris Prom

As I review my notes for the Society of Archivists Conference, I’m struck by one paper in particular: that of Malcolm Todd.  He reviewed the digital preservation advisory services that The National Archives (TNA) provides to the broader archives community in the UK.  (As I’ve noted elsewhere, TNA takes a much more expansive role than NARA in providing services for professional archivists, including policy planning and tools development for the entire archives sector.) They are hoping to ramp up this activity by providing assistance to the broader UK community concerning electronic records and digital preservation planning and tools.

While many of the services and software that Mr. Todd reviewed were not new to me (e.g. DROID* and PRONOM), he provided a useful roadmap of the activities that TNA is undertaking to transfer knowledge, including involvement in the “Digital Preservation Roadshows” that are co-sponsored by the Society of Archivists, TNA, and other organizations.  He noted that there are plans to combine the work from TNA (PRONOM) and Harvard (JHOVE) in a combined Unified Digital Format Registry (UDFR).

There was much to chew on in his talk, but the most salient points I took away were these (just to be clear, these are my conclusions, not necessarily Malcolm’s):

  1. The digital curation and IT communities have far outpaced archivists in developing tools to facilitate digital preservation work.
  2. Digital preservation is a solvable problem, but it is only a small part of what we need to be effective in working with e-records. (I know, this point is relatively facile and in any case is not new.)
  3. With a few notable exceptions, few practicing archivists with actual ‘line’ experience have been heavily involved with standards and tool development or even in testing the tools developed to facilitate electronic records work.
  4. It is imperative that line archivists become more heavily involved in technical projects.  If we don’t do so, we will never influence the development of software, methods, and policies.
  5. There is way too much information for one person to read, assess, and assimilate, even if one limits oneself to one aspect of electronic records work, such as digital preservation.

As I’ve been reflecting on all this, I’ve also been reading a UNESCO-commissioned paper by Kevin Bradley from the National Library of Australia and his colleagues Junran Lei and Chris Blackall at the Australian Partnership for Sustainable Repositories (thanks to Peter van Gardener for the citation).  The paper provides a useful review (circa 2007) of the state of play concerning digital repository software. It provides recommendations as to how UNESCO might assist in developing a low-cost repository system that can be used in nearly any context (including that of smaller archives and developing nations).  In general the report is surprisingly upbeat and lays out a set of specific steps that could be taken to develop such a system.

Both Malcolm’s talk and the Australian report leave me with a distinct sense of dread:  archivists need to do much more to involve themselves in the nitty-gritty of systems design and workflow management.  There are many projects and tools that might be used as part of an integrated workflow for electronic records, but there is precious little work being done to tie them together into a software suite that archivists could use without years of study, training, and experimentation.

For example, the Bradley paper I mentioned above notes that there are many tools to ingest and manage technical and preservation metadata for simple archival objects, but the report is silent on the issue of how descriptive metadata should be generated and/or managed in such a system (it seems to imply that each file/object will have its own descriptive record but doesn’t say how it should be created).   Similarly, a tool like DROID or JHOVE might be useful as one small part of an electronic records workflow, since it is very useful to know what kind of file you are assessing or trying to preserve.  But let’s not kid ourselves: identifying file formats is only a very small part of our work, though obviously it has implications for appraisal, arrangement, description, preservation and access.
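To make concrete what "knowing what kind of file you are assessing" involves, here is a minimal sketch of signature-based format identification in Python. It is not how DROID or JHOVE are implemented and it does not use the PRONOM registry; it simply checks a file's leading bytes against a handful of widely documented "magic numbers". The names `SIGNATURES`, `identify_bytes`, and `identify_file` are hypothetical, chosen for illustration.

```python
# Illustrative only: a tiny, hand-picked signature table, not PRONOM.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP container (also used by OOXML and ODF)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy MS Office)"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format label for the given leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path) -> str:
    """Read the first few bytes of a file and identify its format."""
    with open(path, "rb") as fh:
        return identify_bytes(fh.read(16))
```

Even this toy version shows why identification is only a starting point: a ZIP signature, for instance, says nothing about whether the container holds a word-processed document or an arbitrary archive, let alone whether the contents are worth keeping.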

Nevertheless, if we want to work effectively with electronic records, I think we can come close to cobbling together a set of tools from existing software.  Admittedly, there are likely to be gaps.  One or more key functional requirements for good archival practice (such as appraisal methods) will be unmet, at least in the short term.  And we need to be careful that in picking and choosing from the smorgasbord of tools that others have created we do not electronically reincarnate the workflow and management issues that left us with staggering backlogs of paper files.

Let me be the first to admit that I have compiled a gigantic folder of raw ‘electronic records’ that I hope to appraise, arrange, describe, preserve and provide access to, at some future date.  At the same time, we can only gain the expertise we need to influence system design if we use, evaluate, criticize (constructively) and refine existing products and services.  (Only after we have done this might we consider developing new tools.)

Where am I going with this post?  Simply here: my first few weeks thinking about electronic records have shown me how much I don’t know.  They have also given me the idea for a feasible workplan for the next few months . . . more on that in my next post.

*Older versions of the DROID software and a description of the project are found here.

Tagged with:  

One of the more interesting sessions I attended at the SOA Conference was the talk by Viv Cothey regarding “Digital Curation at the Gloucestershire Archives”.  He noted that while many conceptual frameworks and large research projects are developing methods for digital curation, there is a real lack of tools that facilitate archival workflows in local government archives and other repositories that are not ‘big players’.   In addition, most of the tools that are out there need extensive training or have a steep learning curve.

Viv described the GAip software that he and his colleagues developed.  Basically, the GAip (Gloucestershire Archives ingest packager) software automatically creates an OAIS Submission Information Package (SIP) in something similar to the BagIt format.  Viv also described their work, which is sponsored by the Welsh government’s CyMaL, to integrate a SWORD deposit client into the GAip, so that the BagIt record and the files it represents can be deposited in any SWORD-compliant repository, such as a DSpace or Fedora implementation, at the click of a button.
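For readers who have not seen BagIt, the following Python sketch shows roughly what packaging a deposit as a bag involves. It is not GAip's code (which I have not seen), and GAip's output is only described as similar to BagIt. The sketch assumes the layout defined in the BagIt specification (RFC 8493): a `data/` payload directory, a `bagit.txt` declaration, and a checksum manifest. The function name `make_bag` is hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir, bag_dir) -> Path:
    """Copy source_dir into bag_dir/data and write BagIt-style tag files.

    Assumption: bag_dir/data does not exist yet (copytree refuses to overwrite).
    """
    source_dir, bag_dir = Path(source_dir), Path(bag_dir)
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # also creates bag_dir itself

    # Bag declaration: identifies the directory as a bag.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <relative path>" line per file.
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag_dir
```

Calling `make_bag("donor_cd", "accession_bag")` on a folder copied from a donor's CD would produce a self-describing package whose checksums can be re-verified after transfer. The SWORD deposit step is not sketched here.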

This seems like truly exciting work that should inform practice in many other projects.  One of the things that impressed me most about the project is that they seem to be matching it to the expressed needs and existing workflows in the repository, so that they can deal with relatively simple deposits of electronic records, such as a set of digital photographs or a group of word-processed documents that a donor might be supplying on a CD.  Typically, these would be described as an aggregate in the AIP/BagIt records, then deposited to the Library (much like an archives might describe a folder or box of correspondence or photos in the aggregate).  While the software may or may not be able to deal with more complex objects, such as datasets and downloaded websites, in my own experience, having a tool to deal with the relatively simple objects would, in itself, be a huge step forward.

At this time, there appears to be little information about GAip on the web; however, I hope to contact Viv and take a look at the software as part of my project.  So, stay tuned.

Society of Archivists Conference

On September 2, 2009, in Projects, Research, Software Reviews, by Chris Prom

I’m at the Society of Archivists Conference (UK), in Bristol, England, this week.  It is quite a bit more intimate than SAA; I’d say the attendance is around 200 altogether, which has made it quite easy to meet people involved in e-records and digital preservation projects.  There is a good conference blog that is up and running.

My initial impression is that there is quite a bit of interaction between the digital preservation/IT and the archival communities in the UK, including some really useful interaction between practitioners and developers.  Malcolm Todd from The National Archives and Clive Billenness, Programme Manager for the PLANETS project, both provided quite detailed descriptions of specific ways that archivists can take advantage of recently developed tools and can contribute to the software development process.  I was also really impressed by the talks by Viv Cothey from the Gloucestershire Archives, Steve Bailey from JISC, and Rachel Hardiman from Northumbria University (more on those later).

I’ll post my detailed notes later, but for now I’ll simply note that I was impressed by several of the presentations.  Not to provide too much prominence to this one, but the PLANETS work that Clive Billenness described in his talk on Tuesday holds a lot of potential.  In the past, I’ve seen a lot of people nod their heads knowingly when other people mention it, as if they understand the very important work that the Europeans are doing.  But I have to confess I knew a lot less about it than I should have when I dropped the name into my research proposal, and I wonder how much of their work is really known in the US. I didn’t find any mentions of it when searching Kate T’s blog, so maybe someone more plugged in than me can let me know. For example, did anyone talk about it at SAA?

Anyway, PLANETS, which has received a nice kiss from the EU in the form of 15 million euros of funding since the project’s inception and which has the backing of major corporate sponsors, is in the process of launching a testbed where a repository (or any government agency or person) can process sets of electronic stuff through a variety of tools, then compare the results to decide which tools might be most effective for the repository’s local situation.  It sounds like a really practical idea, so I’m all in favor.

The first tool they have released is PLATO, which is a preservation/decision support tool. I need to check it out a lot more closely, but I think you can run digital objects through it to make basic decisions as to the best approach to follow for the particular group of records you need to preserve.  I’ll be giving it a try once I am back in Dundee and away from this overpriced Marriott wireless.

Over the next several months, many other tools will be released by PLANETS.  I think it may save me a lot of time and hassle installing software, since the testbed will allow you to actually use tools with records and compare results using standardized criteria (I really need to look into this).  The approach the EU took with PLANETS really is very different than that which LC took with NDIIPP, and it is great that PLANETS is reaching out to the archival community at conferences like this.  When I talked to Clive about it after the session, he invited me to their training session in Sofia, Bulgaria in mid-September and seemed genuinely interested to get input from me and other practitioners.  I’m not sure I can make that since I have another commitment about that time, but it would definitely be worth attending one of their training sessions coming up in the near future.
