Practical E-Records
Posts Tagged Databases
A Question of Data Conservancy/Curation
Posted by Chris Prom in Methods, Research on July 2, 2010
Bill Maher sprang by my desk this morning, as excited as a puppy about something he found in newly accessioned records that were discovered in the basement of our law school. The records, generated by a University committee, document a project to survey student incomes and expenses in the mid- to late 1950s. They include published reports, correspondence, raw survey results, and coding keys for the Illiac (mainframe computer) used to crunch the data (pdf).
More on SIARD
Posted by Chris Prom in Research on March 26, 2010
After my posting regarding SIARD last week, Hartwig Thomas, the developer at the Swiss Federal Archives who is most closely associated with SIARD, contacted me to see if he could diagnose the problems I ran into while converting two Access databases to the .siard format. As I suspected, the problems the application ran into were due to irregularities in the source Access databases (for example, the Challenged Books database that I was trying to convert had a field marked as #Deleted and thus was, in one sense, corrupt, even though it still opened in Access).
Using SIARD for Database Migration
Posted by Chris Prom in Research on March 19, 2010
Using the appraisal tools I discussed last week, I discovered that the OIF files I am working with contain about 150 database files. Most of these are in Microsoft Access format, although a few are Paradox files. Using a free Paradox file viewer, I was able to quickly determine that the latter were contact databases, and decided not to undertake any migration or preservation work on them.
Similarly, I examined the 84 Access databases included in the accession record and quickly determined that many of them held duplicate information. Based on an examination of each database, I determined that the vast majority of them contained transactional information (such as orders for merchandise relating to Banned Books Week or conference registrations) and did not meet appraisal criteria for permanent retention. I therefore deleted those.
But one database in particular had enough evidential and informational value to suggest that it should be preserved permanently: a comprehensive database tracking book challenges that have been reported by librarians to the OIF. While the file is certainly readable using current versions of Microsoft Access, and while I will certainly retain a copy in the final SIP that I am preparing, prudence suggested that a copy should also be generated in a non-proprietary format, so that the data at least, if not the look and feel, are preserved without a dependency on proprietary software.
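To make the idea of a non-proprietary copy concrete, here is a minimal sketch of one possible approach: writing every table to a CSV file and saving the table definitions alongside it. This is not how SIARD works, and it is not the process I used. It uses Python's built-in sqlite3 module as a stand-in for Access, and the function name export_tables and the output file names are hypothetical, chosen only for illustration.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables(db_path, out_dir):
    """Write each user table to <out_dir>/<table>.csv and the schema to schema.sql.

    Assumption: the source is a SQLite file standing in for the real database.
    Returns a dict mapping table name to the number of rows written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    try:
        tables = conn.execute(
            "SELECT name, sql FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        ).fetchall()
        # Keep the table definitions so column names and types travel with the data.
        (out / "schema.sql").write_text(
            "".join(f"{sql};\n" for _, sql in tables), encoding="utf-8"
        )
        counts = {}
        for name, _ in tables:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            with open(out / f"{name}.csv", "w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                # Header row: column names as reported by the database.
                writer.writerow([col[0] for col in cursor.description])
                rows = 0
                for row in cursor:
                    writer.writerow(row)
                    rows += 1
            counts[name] = rows
        return counts
    finally:
        conn.close()
```

A flat export like this keeps the data and the column definitions but loses relationships, queries, and forms, which is part of why a purpose-built format is attractive.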
As I noted in a previous post, SIARD, developed by the Swiss Federal Archives, is one tool that can be used for database normalization. It is a platform-independent Java tool. After spending a bit of time working with it, I am impressed by its capabilities, but unfortunately, I ran into repeated and intractable problems in using the program with some large Microsoft Access databases that used poorly defined schemas and/or badly structured data. In the end, I could not get the software to create a normalized database for the Challenged Books database. (More on that after the jump.)
“Data Curation” and Faculty “Papers”
Posted by Chris Prom in Research, Software Reviews on December 21, 2009
Last week, I had an interesting lunchtime conversation with Geoff Barton, who directs the bioinformatics group at the University of Dundee’s College of Life Sciences. Going into the conversation, I had hoped that it might prove possible to work with his group to identify one or more datasets and/or applications that would be suitable for inclusion in a pilot deposit project for a pilot ARMMS e-records repository. In the end, that did not prove as feasible as I had hoped, but in the process I gained a bit of insight into the particular challenges of working with the electronic ‘papers’ of faculty members.
Database Preservation: Solved?
Posted by Chris Prom in Research, Software Reviews on November 19, 2009
At the Planets workshop I am attending in Bern, Amir Bernstein from the Swiss Federal Archives demonstrated the SIARD Suite. SIARD is a set of Java applications that facilitate the preservation of information stored within relational databases. It can be run as a client application (on multiple platforms) or can be called and integrated with other services. It will even run from a USB drive!