After my posting regarding SIARD last week, Hartwig Thomas, the developer at the Swiss Federal Archives who is most closely associated with SIARD, contacted me to see if he could diagnose the problems I ran into while converting two Access databases to the .siard format. As I suspected, the problems the application ran into were due to irregularities in the source Access databases. For example, the Challenged Books database that I was trying to convert had a field marked as #Deleted and thus was, in one sense, corrupt, even though it still opened in Access.
Using the appraisal tools I discussed last week, I discovered that the OIF files I am working with contain about 150 database files. Most of these are in Microsoft Access format, although a few are Paradox files. Using a free Paradox file viewer, I was able to quickly determine that the latter were contact databases, and decided not to undertake any migration or preservation work on them.
Similarly, I examined the 84 Access databases included in the accession record and quickly determined that many of them held duplicate information. Based on an examination of each database, I determined that the vast majority of them contained transactional information (such as orders of merchandise relating to Banned Books Week or conference registrations), and did not meet appraisal criteria for permanent retention. I therefore deleted those.
But one database in particular had enough evidential and informational value to suggest that it should be preserved permanently: a comprehensive database tracking book challenges that have been reported by librarians to the OIF. While the file is certainly readable using current versions of Microsoft Access, and while I will certainly retain a copy in the final SIP that I am preparing, prudence suggested that a copy should also be generated in a non-proprietary format, so that the data at least, if not the look and feel, is preserved without a dependency on proprietary software.
As I noted in a previous post, SIARD, developed by the Swiss Federal Archives, is one tool that can be used for database normalization. It is a platform-independent Java tool. After spending a bit of time working with it, I am impressed by its capabilities, but unfortunately, I ran into repeated and intractable problems in using the program with some large Microsoft Access databases that used poorly defined schemas and/or badly structured data. In the end, I could not get the software to create a normalized database for the Challenged Books database. (More on that after the jump.)
At the Planets workshop I am attending in Bern, Amir Bernstein from the Swiss Federal Archives demonstrated the SIARD Suite. SIARD is a set of Java applications that facilitate the preservation of information stored within relational databases. It can be run as a client application (on multiple platforms) or can be called and integrated with other services. It will even run from a USB drive!