Archive for category Software Reviews

HTTrack Evaluation

HTTrack is a free, open source website copier that you can download to your desktop and use to harvest websites. Because the web changes constantly, archivists are interested in having a way to take snapshots of websites so that we have a record of what these sites looked like and what information they contained. Finding straightforward and cost-effective ways of doing this is likely to be an essential part of archival work in the future.


Open Source Software (OSS) Evaluation Project

One of my major preoccupations is evaluating open source software (OSS) and the projects that develop OSS. For my Fulbright project, I settled on a rough-and-ready set of evaluation criteria, but some circumstances demand more rigor. Picking the wrong development framework or library, for example, could fatally wound an OSS development project. To help me (and hopefully the Libraries, Archives, and Museums community as a whole) get a better handle on OSS evaluation methods, I wrote a small grant application to the University of Illinois Library’s Research and Publication Committee.


Planets Testbed Review

Last week, I reviewed the Planets PLATO Preservation Planning tool. The Testbed is another Planets web service that can be used in planning preservation services and actions. Its purpose is to allow users to locate, select, and test services that can be used to undertake preservation actions, such as identification, characterization, and migration. It is a part of the Planets Interoperability Framework, and should be available for download and local installation after the end of the Planets project, May 30th. In the meantime, users can register for an account on the public site.

The Testbed includes areas to browse services, browse previous experiments, and conduct new experiments.


Using Windows File Managers as Appraisal Tools

After my experience with Mac Finder and Pathfinder, I spent some time today testing Windows file utilities to appraise records. In general, I did not find them to be quite as useful as Pathfinder (although I have not used the version of Windows Explorer that is included in Windows 7). Nevertheless, you may find some of the following tools helpful when attempting to weed or reorganize complex sets of electronic records. Any of these applications is useful to have around, since they eliminate most of the major problems with Windows Explorer (such as the infamous failure to complete a copy operation if one file fails due to a too-long path name).


More on using DROID for Appraisal: Evaluation

Over the past day, I have been testing tools for appraisal, using records from the American Library Association Office of Intellectual Freedom (OIF), the Freedom to Read Foundation (FTRF), and the Leroy J. Merritt Humanitarian Fund. The files are particularly appropriate for this purpose for three reasons: they represent the complete functioning of related groups within a larger organization, no prior appraisal has been conducted on them, and they are likely to have continuing value to the organization, as well as future research value for students, scholars, and members of the public.

Under a research/nondisclosure agreement, I was supplied a snapshot of the office’s working files on July 28, 2009. Although the files were given to me for research purposes only, it is possible that the Office of Intellectual Freedom will decide to include some of the files in the American Library Association Archives at the end of the research project.

The files comprise a complete electronic record of the office since the time that office began storing files on a shared server.  The folders use a deep file structure and include a wide range of file formats.  In addition, some of the materials are sensitive and will need to either be removed from the archives or placed under a restriction policy. (This is particularly the case for Merritt Fund materials, which include case files.)  For this reason, it is important that potentially private materials be identified and then segregated and removed from materials to be deposited, or placed under appropriate restriction policies, in agreement with the creating office.

Obviously, one needs a semi-automated way to identify potential files for inclusion. Such work could be completed either by an archivist or a records creator, but tools are needed to sort through these materials. As a result, I tested several approaches.


Using DROID for Appraisal

DROID, developed by the UK National Archives, is a tool that can also assist archivists in identifying file formats. It is sometimes used as part of processes to preserve electronic records. The FITS tools, for example, make use of it to extract information concerning the identity of the file type, and the proof of concept version of Archivematica stores some of the information that DROID extracts in the archival information package that it generates.

However, I think it may be equally valuable as part of an appraisal process, when an archivist is trying to understand the components of a particular series of records.

DROID reads internal header information from one or more files, then uses a sophisticated algorithm to compare that information to signature files stored in the PRONOM database. Based on the comparison, DROID declares whether a match is ‘positive,’ ‘tentative,’ or ‘unidentified.’ For each positive or tentative match, DROID provides the PRONOM Unique ID (PUID), MIME type, format, and version. The exact process that the software uses is described in the technical manuals for the system, but obviously the success of the process depends largely on the completeness of the database/signature file to which DROID refers.
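
To make the idea concrete, here is a minimal Python sketch of signature-based identification. It is not DROID's actual algorithm or API: the signature table, the `demo/...` identifiers (placeholders, not real PUIDs), and the `identify` function are all invented for illustration, and the sketch assumes that a header match counts as ‘positive’ while an extension-only match counts as ‘tentative.’

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Signature:
    format_id: str               # placeholder ID for this sketch, NOT a real PUID
    name: str                    # human-readable format name
    magic: bytes                 # byte sequence expected in the file header
    offset: int                  # position in the file where the bytes should appear
    extensions: Tuple[str, ...]  # file extensions commonly used for the format


# A tiny, hypothetical signature table; a real registry holds far more entries.
SIGNATURES = (
    Signature("demo/1", "PNG image", b"\x89PNG\r\n\x1a\n", 0, ("png",)),
    Signature("demo/2", "PDF document", b"%PDF-", 0, ("pdf",)),
    Signature("demo/3", "GIF image", b"GIF8", 0, ("gif",)),
)


def identify(header: bytes, filename: str) -> Tuple[str, Optional[Signature]]:
    """Classify a file from its leading bytes and its name."""
    # First pass: look for a byte-level match in the header.
    for sig in SIGNATURES:
        end = sig.offset + len(sig.magic)
        if header[sig.offset:end] == sig.magic:
            return "positive", sig
    # Second pass (assumption of this sketch): fall back to the extension.
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    for sig in SIGNATURES:
        if extension in sig.extensions:
            return "tentative", sig
    return "unidentified", None
```

Calling `identify(b"%PDF-1.4", "report.pdf")` yields a positive match on the PDF entry, while a file named `notes.pdf` whose header does not start with `%PDF-` comes back as tentative. As with the real tool, anything missing from the table cannot be identified at all.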

The tool is very helpful, but I don’t think many people outside of large-scale digital preservation projects are actually using it, since it is somewhat of a power tool and since its main purpose is to support preservation of digital objects in a repository. You can download versions of it for all major platforms from SourceForge; the stats provided seem to indicate that it has been downloaded around 8,000 times (version 4.0 1,600 times).

Aside from its use for digital preservation, it can also be used when assessing files for potential accession. In the future, DROID (or an application like it) could be even more useful. When the UDFR proposal and resources such as the PLANETS Core Registry (PCR) come to fruition, particular file formats could be linked to lists of software that can render and/or undertake preservation actions for particular file types. The PLANETS tools, such as PLATO and the Testbed, when they are released in May, may include some of this expanded functionality.

In any case, my full ‘evaluation’ of DROID, which I used to ID my test records, is after the break.


Installing OAIS Software: Archivematica

Over the past week, I’ve been in Cardiff, Wales, at a Forum for Fulbright fellows and scholars. If I get some time this weekend, I may post a few thoughts about it and/or some reflections on Scotland and my Fulbright experience to date. In the meantime, I’d like to update you on my adventures installing software for undertaking preservation actions within an OAIS environment.

For those of you who missed past postings, the tools I am evaluating wrap together a variety of open source tools to help archives with many aspects of the ingest, storage, and access process. So far, I’ve reviewed RODA and also DAITSS. Both of them, at least in their current forms, are difficult for anyone without server admin experience to install. Any archivist would need significant support to get them running. I may also try my hand at ISLANDORA (which looks like it would take even more work), but that may not prove necessary since I have been having so much good luck with Archivematica.

Unlike the other software, which is server based, Archivematica is a virtual appliance and runs inside VirtualBox or another virtualization engine that supports the Open Virtualization Format (such as VMware). It uses file-based storage, so it can be implemented within any existing file storage systems that are available or can be made available on the host computer. Until now, this project has been deliberately flying under the radar, although I’ve known about it since last fall, when the project manager, Peter van Garderen, contacted me.

Based on my initial experiences, Archivematica offers a credible, thoughtful, and I believe supportable model for facilitating archival work with electronic records. In fact, I like this project so much that I’ve chosen to become directly involved in development and will begin contributing code to it over the next few weeks.

Here’s why I like the project so much:


Maintaining Integrity

A few weeks ago, Alan Bell and I had an interesting conversation with Ian Angles, head of the Servers and Storage Unit at the University of Dundee’s Information and Communication Services (ICS). Ian is going to be helping me install and test repository applications, such as DSpace, Islandora, and RODA, over the early part of January. As part of our meeting, Ian, Alan, and I got into an interesting side conversation about fixity information.
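
For readers unfamiliar with the term, fixity information is typically a checksum recorded for each file so that later changes can be detected. The Python sketch below illustrates the general idea only. It assumes SHA-256 as the checksum, and the function names (`sha256_of`, `record_fixity`, `changed_files`) are invented for this example; it does not describe what ICS or any of the repository applications named above actually do.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(root: Path) -> dict:
    """Map each file under root (by relative path) to its checksum."""
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """List files that were added, removed, or altered since the manifest was made."""
    current = record_fixity(root)
    names = manifest.keys() | current.keys()
    return sorted(name for name in names if manifest.get(name) != current.get(name))
```

In use, one would call `record_fixity` at the time of deposit, store the resulting manifest somewhere separate from the files, and run `changed_files` against it on a schedule; an empty list means nothing has changed.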


Trustworthy Digital Objects

In case you haven’t read it yet, I’d recommend taking a look at Henry Gladney’s article “Long-Term Preservation of Digital Records: Trustworthy Digital Objects,” which was published in the most recent issue of the American Archivist. I had read a manuscript version of it on Gladney’s website, but held off on commenting since it was not yet published when I read it.


“Data Curation” and Faculty “Papers”

Last week, I had an interesting lunchtime conversation with Geoff Barton, who directs the bioinformatics group at the University of Dundee’s College of Life Sciences. Going into the conversation, I had hoped that it might prove possible to work with his group to identify one or more datasets and/or applications that would be suitable for a pilot deposit project for the ARMMS e-records repository. In the end, that did not prove as feasible as I had hoped, but in the process I gained a bit of insight into the particular challenges of working with the electronic ‘papers’ of faculty members.
