I spent three days last week in Geneva, where I attended the 8th European Conference on Digital Archiving. As part of the conference, I presented the first public report of my work. There is an interview with me on the conference blog, and I have also uploaded an extended version of my remarks, which I hope to turn into a more formal publication over the coming weeks.
I found the conference very useful and thought-provoking. I continue to be amazed by the depth and breadth of digital preservation work that is being done, and also by the range of research under way on the law and economics of electronic records. Of the sessions that I attended, these stand out as particularly useful:
- Robert Sharpe from Tessella made the argument that most digital preservation work can be automated, aside from preservation planning and assessment work. If an archives can a) define what obsolescence means, b) decide what to do about it (e.g. how to migrate or emulate), and c) evaluate whether a migration or emulation strategy has worked, the rest of the digital preservation process can be automated. He also described their product, Safety Deposit Box, version 4.0 of which was released at the conference. Although the product has to date been adopted mainly by large state archives, the Wellcome Library/Archives is also in the early stages of implementing it. I had a chance to speak briefly with Natalie Walters from the Wellcome Library, and while they have not yet set the system up, she stressed that of all the vendors they had spoken to regarding their project, Tessella were the only ones who seemed to understand digital preservation from an archivist's point of view. I am hoping that, as they develop their product, Tessella will integrate additional features that make it attractive to, and affordable for, smaller archives, and I hope to have some time to touch base with Robert or another of their representatives before I leave the UK.
- Osmo Palonen from the Mikkeli University of Applied Sciences in Finland presented a talk on a business model for a diversified digital repository. He described a program the university developed to ingest, store, and provide access to electronic records. The program is fully self-funded and has several clients throughout Finland. Initially, most of the clients were medical programs needing long-term storage of biomedical/radiological data, but they have now expanded into other areas. In his remarks, he stressed the importance of having a solid business model in place before beginning, and of charging clients from the very start; he seemed to feel that if they had begun the program without charging the full amount needed to support the service, it would have been impossible to set a realistic income level later. After his remarks were finished, I really had to wonder whether such a program could or would be successful in the United States. Certainly it shows an innovative public sector approach to developing revenue and services that fulfill a university's public service mission in a unique way. Unfortunately, I have been unable to locate a website regarding the project, which is not surprising given that the university website is in Finnish!
- Doreen Kerbuo Mageto from Oslo University presented a summary of her master’s thesis, “Cost Factors in Digital Preservation”. The purpose was to assess how institutions in Norway are determining the cost of digital preservation. She attempted to map perceived costs onto the OAIS functional model, with the aim of producing a relatively easy-to-apply cost model so that repositories could determine how much preservation services would cost, choose which to use, and make decisions regarding relative priority and funding. The results of her specific project are probably less important than the fact that she suggests a plausible, general, and relatively simple-to-apply method by which repositories can set budgets and measure costs as a percentage of the functions that an e-repository needs to fulfill. There is certainly a great need, since over 70% of the institutions she surveyed had no cost model in place for digital preservation. I got the impression listening to her that whatever institution she eventually lands at will be very fortunate to have her help in laying the basis for a strong and well-funded program.
- Jason Baron, Director of Litigation at NARA, gave a thought-provoking keynote on day 2. He first reviewed some recent case law regarding records discovery in US civil cases, noting that courts increasingly presume that records systems must capture an adequate record of correspondence, data, or information of any kind. In the current legal framework, “email plays a leading if not decisive role,” and courts have been very willing to sanction parties that cannot comply with a discovery request, even if malfeasance or negligence cannot be shown. As a result, he argued that “e-discovery is the hottest topic in law today,” and many businesses are changing their behavior so that systems automatically capture records as they are being produced. This will become increasingly necessary at all types of institutions, and from his perspective as a lawyer, the “least worst” solution is automatic capture and preservation of all emails within a set of parameters defined by the institution, because it will not require users to change their behavior. (He left unaddressed the issues that arise when people stop using institution-specific email services and migrate to Gmail or another service, which has been a particular problem for some politicians in the US.)
- Steve Bailey from JISC argued in a keynote address that just this problem (unaddressed by Baron) is becoming increasingly common in corporate, non-profit, and personal environments: other people are becoming responsible for our records, and increasingly records are not stored by function or subject relationship (or really any kind of systematic arrangement scheme) but are segregated by format. After a long rhetorical section asking what would happen if the output of Samuel Pepys were treated the same way, he noted that the division of videos to YouTube, documents to Google Docs, photos to Flickr, outsourced email and blogs, information on Twitter, and so forth likely poses a huge risk to future research. Fighting against this trend will only doom our profession to irrelevance, he argued, and he closed with four specific steps that we can take to ensure that records are preserved. Of his four points, the one that rang most true was his assertion that perhaps archives should strongly assert their role as document/photo/content holders. In most cases, the core business models of these companies do not rely on holding the data, but on indexing it and making it available; perhaps, he suggested, Google et al. would not mind being relieved of the storage costs if we as a profession can promise permanent access and preservation. Although Bailey did not mention it, Twitter’s recent announcement of a partnership with the Library of Congress makes it seem possible that such a relationship may evolve over time.
- Jim Suderman presented an overview of four attempts to apply the InterPARES principles (embodied in the chain of preservation model) at government archives in Canada. Jim’s talk served as a useful reminder that the InterPARES work can be applied in a relatively straightforward fashion once you determine at which point in the chain of custody a record lies. In addition, he provided some interesting background on the Archivematica project, in particular the point (new to me) that the City of Vancouver requires the use of open source software, thus (presumably) providing the impetus for the project.
- Luciana Duranti spoke about the Digital Record Forensics Project, a part of the InterPARES3 work. The project, which is a partnership between the Archives and Law faculties at UBC and the Vancouver Police Department forensics unit, seeks to apply the principles of diplomatics (within the context of current statute and case law regarding evidence), using digital forensics technology to help capture records and determine their authenticity; the hope is that a new branch of archival studies, “Digital Record Forensics”, will emerge. Whether that happens remains to be seen, but in the meantime it should be interesting to follow the results of this project. One of the most useful tools I have found for dealing with email is Aid4Mail, which is used extensively in the digital forensics/law enforcement community, and it appears likely that version 2.0 of the software, which should be released shortly, will significantly enhance the program’s utility. (I will be posting about the software after its release.)