Later this week, I’ll be attending a Society of Archivists “Digital Preservation Roadshow” in Edinburgh, and in mid-November I’ll be attending a three-day PLANETS training session in Bern, Switzerland. To prepare for both events, I spent a few days reviewing the PLANETS work in detail, reading general descriptions of the project as well as several of the more recent technical publications.
My general impression is that the PLANETS work is not very widely known in the US archival community, but there is a bit more awareness in the digital preservation and digital libraries arena. In any case, it has been useful for me to review the project outcomes to date, because the approach that this European project is taking toward digital preservation is very different from that of either the US National Archives (ERA) or the Library of Congress (NDIIPP).
The PLANETS project aims to develop a suite of services that national libraries and archives in the EU can use to plan digital preservation services and to manage their electronic collections. It is a four-year project, ending in May 2010, with a relatively low funding level (€15 million) and a limited number of research partners (16, including state archives, university research units, and industry/commercial organizations).
Although the work is aimed at national archives and libraries, the emphasis is on providing practical tools and services that integrate with existing systems and archival workflows. The best single overview of the project is this article by Adam Farquhar and Helen Hockx-Yu from Volume 2, Issue 2 of the International Journal of Digital Curation.
First and foremost, it is important to note that PLANETS is not a repository project, and it is not developing tools to ingest digital objects into a repository. Instead, its end goal is to deliver a set of ‘click and install’ tools and services that will allow a repository to administer, configure, and deploy preservation services and workflows. The project takes it as a given that each institution will select and implement its own repository software, whether from a commercial vendor, a locally built system, or an open source project such as Fedora.* PLANETS software includes the following:
- PLANETS testbed: A controlled environment in which project participants can run experiments, evaluate results, and share information concerning experiments, such as migrating a file from one format to another. Version 0.5 was released last year and can be installed on a local PC; however, the installation procedure looks quite involved. As the testbed evolves, it will include access to many if not all of the PLANETS pre-bundled services, listed below, and repositories will be able to register additional services. The testbed is directly applicable to my work, and it appears that a version of it is in pre-public release. The Bern meeting will describe how to use it, and after registering to use the tool and getting training in Bern, I will use it to carry out experiments using some of the e-records I have in my custody.
- A preservation planning tool (PLATO) to allow an organization to define, evaluate and implement a preservation plan for a specific group of records.
- Preservation characterization services to analyze digital objects, establish/extract their significant qualities into an XML format, and compare the qualities of original objects with objects on which some preservation action (such as migration) has taken place. Associated Tools: Core Registry (extension of PRONOM database), XCL standards and tools (Extractor, Comparator).
- Preservation Action Tools: PLANETS is planning to bundle a set of custom-built and third-party conversion and migration tools into a service that will allow repositories to easily undertake conversion actions for a set of specific file types. The status of this software is unknown; I could not find links on the site.
- Emulation Services: One of the most ambitious and potentially fruitful projects is the development of an Emulation Framework that will allow users to run operating systems and software from multiple systems. A detailed description of the technology was provided at the European Conference on Digital Libraries. The software might be loaded either to provide emulated access to a document in its original format or, more likely, to allow for automated conversion of documents to a standard format for ingest into a repository. The plan is to run a web service through which repositories could define workflows, run the emulator, then receive output through the web services (without knowing anything about the underlying process). The description is very complex, but it sounds like a plausible approach. The underlying technology for the web service is called GRATE (Global Remote Access to Emulation Services), and it uses VNC and the open source emulator QEMU. PLANETS is also developing the Dioscuri emulator for x86-based systems and is doing some work with the Universal Virtual Computer technology.
- An Interoperability Framework, most recently described here. It is also available for download. The Interoperability Framework provides a modular, service-oriented architecture under which individual repositories can tie together the PLANETS tools and services that best meet their needs. It appears that it is being designed so that it can be implemented on a single machine, on multiple servers, or even in a grid across several institutions. PLANETS promotional literature says that its open architecture allows for the integration of third-party services as well. Presumably, an institution could use it to integrate PLANETS into a repository, digital library, or archival storage software layer.
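To make the characterization idea in the list above a bit more concrete for myself, here is a minimal sketch of the extract, migrate, and compare cycle. It is not PLANETS code and does not use the XCL Extractor or Comparator; every name in it (`Characterization`, `extract_text_properties`, `migrate_to_unix_line_endings`, `compare`) is a hypothetical stand-in, and the "significant properties" are deliberately trivial ones for a plain-text file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characterization:
    """Hypothetical container for properties extracted from one object."""
    properties: dict


def extract_text_properties(data: bytes) -> Characterization:
    """Toy extractor: describe a UTF-8 text object by a few simple counts."""
    text = data.decode("utf-8")
    return Characterization({
        "character_count": len(text),
        "word_count": len(text.split()),
        "line_count": len(text.splitlines()),
    })


def migrate_to_unix_line_endings(data: bytes) -> bytes:
    """Toy preservation action: normalize CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def compare(original: Characterization, migrated: Characterization) -> dict:
    """Return the properties that changed, as name -> (before, after)."""
    names = original.properties.keys() | migrated.properties.keys()
    return {
        name: (original.properties.get(name), migrated.properties.get(name))
        for name in sorted(names)
        if original.properties.get(name) != migrated.properties.get(name)
    }


# Characterize the original, run the action, characterize again, then compare.
original = b"Dear Sir,\r\nPlease find the report attached.\r\n"
migrated = migrate_to_unix_line_endings(original)
differences = compare(
    extract_text_properties(original),
    extract_text_properties(migrated),
)
print(differences)
```

In this toy case only the character count changes, while the word and line counts survive the migration. Whether a change like that matters is exactly the kind of judgment a preservation plan would need to record.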
The existence of the interoperability framework and a relatively small number of closely defined tools makes the project results easier to understand and assimilate than the multitude of tools, services, and initiatives being supported by NDIIPP. In comparison to the ERA project, PLANETS shares a huge amount of information about the technologies being used and is actively disseminating them outside of its target user community of national libraries and archives. (Although it should be noted that NARA does link to some research projects concerning certain elements in the ERA systems, I believe that most elements of the ERA system are either proprietary or at least not applicable outside of the Federal context.)
In general, I am very impressed with the potential that the PLANETS software holds, although it is difficult to evaluate the full impact of the project until after the final tools are released. At the present moment, documentation is available for beta or alpha versions of some software, and other components are being completed. As such, implementation of a PLANETS instance, or even components of it, in a production environment would be premature for those institutions that are not directly engaged in the project. However, there are a significant number of positives to this project that make it worth keeping a close eye on. First, the software is being released under the GNU license, so any project outcomes could be adapted by other communities. Second, the project is specifically aimed at facilitating archival workflows, so it probably holds great direct relevance for archivists. Finally, there is a clear deployment path in mind, even though the project is likely to be extended.
In short, it seems likely to me that many tools and services emerging from the project will be applicable for ‘smaller’ archives. I’ll know a lot more about them by the end of next month. At the very least, I’m hoping I can use the testbed as part of my methodology for evaluating how different pieces of software and services facilitate parts of a workflow for electronic records processing, ingest, and preservation.
*The services listed below will be able to interact with an institution’s repository using a data registry service and a Digital Object Manager, so that the outcomes of a particular set of actions can be integrated back into the repository if so desired; there are current implementations based on Apache Jackrabbit and Fedora. It is a bit unclear, but the implementations seem to use OAI-PMH. However, institutions may be required to build a made-to-order digital object manager to translate requests and queries into a language that the PLANETS services can understand.
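As I understand it, a made-to-order digital object manager would act as an adapter between a local repository and the preservation services. The sketch below shows that pattern only; it is not the PLANETS interface. The class and method names (`DigitalObjectManager`, `list_ids`, `retrieve`, `store`, `InMemoryRepository`, `run_action`) are all my own assumptions, and an in-memory dictionary stands in for a real repository.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class DigitalObjectManager(ABC):
    """Hypothetical interface that a preservation service would call."""

    @abstractmethod
    def list_ids(self) -> List[str]: ...

    @abstractmethod
    def retrieve(self, object_id: str) -> bytes: ...

    @abstractmethod
    def store(self, object_id: str, data: bytes) -> None: ...


class InMemoryRepository:
    """Stand-in for a local repository that has its own vocabulary."""

    def __init__(self) -> None:
        self.records = {}

    def put_record(self, key: str, payload: bytes) -> None:
        self.records[key] = payload

    def get_record(self, key: str) -> bytes:
        return self.records[key]

    def keys(self) -> List[str]:
        return sorted(self.records)


class InMemoryObjectManager(DigitalObjectManager):
    """Adapter: translates the service-side calls into repository calls."""

    def __init__(self, repository: InMemoryRepository) -> None:
        self.repository = repository

    def list_ids(self) -> List[str]:
        return self.repository.keys()

    def retrieve(self, object_id: str) -> bytes:
        return self.repository.get_record(object_id)

    def store(self, object_id: str, data: bytes) -> None:
        self.repository.put_record(object_id, data)


def run_action(manager: DigitalObjectManager,
               action: Callable[[bytes], bytes]) -> None:
    """Apply an action to every object and write results back as new objects."""
    for object_id in manager.list_ids():
        manager.store(object_id + ".migrated", action(manager.retrieve(object_id)))


repo = InMemoryRepository()
repo.put_record("memo-001", b"line one\r\nline two\r\n")
run_action(InMemoryObjectManager(repo), lambda data: data.replace(b"\r\n", b"\n"))
print(repo.keys())
```

The point of the design is that `run_action` never touches the repository directly, so swapping in a different repository would mean writing a new adapter class and nothing else.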