Archive for category Methods

A Question of Data Conservancy/Curation

Bill Maher sprang by my desk this morning, as excited as a puppy about something he found in newly accessioned records that were discovered in the basement of our law school.  The records, generated by a University committee, document a project to survey student incomes and expenses in the mid- to late 1950s.  They include published reports, correspondence, raw survey results, and coding keys for the Illiac (mainframe computer) used to crunch the data (pdf).

Trustworthy Digital Objects

In case you haven’t read it yet, I’d recommend taking a look at Henry Gladney’s article “Long-Term Preservation of Digital Records: Trustworthy Digital Objects,” which was published in the most recent issue of the American Archivist.  I had read a manuscript version of it on Gladney’s website, but held off on commenting since it was not yet published when I read it.

E-Records Obligations, Activities, and Tools: Entity Relationships

Over the past week, I’ve been grappling with a question that on its surface seems relatively pedestrian, but that I think offers a key structure that can guide my research: how can I effectively track and evaluate the numerous pieces of software and services that a repository might use as part of a trustworthy system to accession, preserve, manage and provide access to electronic records?  As I was doing this, I spent a considerable amount of time mulling over the Tufts reports that I commented on last week, in particular Eliot and Kevin’s point that they wished they had used a database to track the various preservation requirements that they developed for a university context (expressed in OAIS sections and subsections).

As I was thinking things through (and getting a little more confused each time I did so), I finally got it into my head that each of the many preservation ‘requirements’ that a repository might need to implement was really an obligation that demanded specific actions by a particular agent (such as an institution, a human, a computer application, or an element in the preservation infrastructure).  A rationale for each obligation might be found in one or more standards, best practices documents, or guidelines (i.e. citations).  In order to show that each obligation has been satisfied, a particular agent undertakes or fulfills (with a certain degree of obligation) one or more activities or events (i.e. ‘actions’).  Certain of these actions may be facilitated by the use of particular pieces of hardware, software, services or infrastructure (i.e. tools), and each action may generate, modify or make use of various other resources (such as reports, logs, etc).

It seemed to me that the relationships between obligations, agents, activities, tools and resources would be best tracked in a database.  Once in a database format, the various requirements, activities and software entries could be added, deleted, updated, ordered, reordered, and searched in a public format.  As additional entities are identified, they could also be defined and added to the conceptual model.  For example, one might define a workflows entity to define and order the specific actions needed to complete a larger task.
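To make the idea concrete, here is a minimal sketch of how obligations, agents, actions, tools, citations and resources might map onto relational tables. It uses Python's built-in sqlite3 module. Every table name, column name and sample row is an assumption made for illustration and is not taken from the diagram linked below; the sample tool is a placeholder, not a recommendation.

```python
import sqlite3

# Hypothetical schema: the entity names follow the post, but every table
# and column name here is an assumption made for illustration.
SCHEMA = """
CREATE TABLE agent      (id INTEGER PRIMARY KEY, name TEXT NOT NULL, kind TEXT NOT NULL);
CREATE TABLE obligation (id INTEGER PRIMARY KEY, statement TEXT NOT NULL);
CREATE TABLE citation   (id INTEGER PRIMARY KEY,
                         obligation_id INTEGER NOT NULL REFERENCES obligation(id),
                         source TEXT NOT NULL);
CREATE TABLE tool       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE action     (id INTEGER PRIMARY KEY,
                         obligation_id INTEGER NOT NULL REFERENCES obligation(id),
                         agent_id INTEGER NOT NULL REFERENCES agent(id),
                         description TEXT NOT NULL);
-- An action may use several tools, and a tool may serve several actions.
CREATE TABLE action_tool (action_id INTEGER NOT NULL REFERENCES action(id),
                          tool_id INTEGER NOT NULL REFERENCES tool(id),
                          PRIMARY KEY (action_id, tool_id));
CREATE TABLE resource   (id INTEGER PRIMARY KEY,
                         action_id INTEGER NOT NULL REFERENCES action(id),
                         name TEXT NOT NULL);
"""


def build_demo():
    """Create an in-memory database holding one made-up obligation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # All rows below are placeholders, not real requirements or software.
    conn.execute("INSERT INTO agent VALUES (1, 'Processing archivist', 'human')")
    conn.execute("INSERT INTO obligation VALUES (1, 'Verify the fixity of accessioned files')")
    conn.execute("INSERT INTO citation VALUES (1, 1, 'placeholder standard citation')")
    conn.execute("INSERT INTO tool VALUES (1, 'checksum utility (placeholder)')")
    conn.execute("INSERT INTO action VALUES (1, 1, 1, 'Generate and record checksums at ingest')")
    conn.execute("INSERT INTO action_tool VALUES (1, 1)")
    conn.execute("INSERT INTO resource VALUES (1, 1, 'checksum log')")
    conn.commit()
    return conn


def tools_for_obligation(conn, obligation_id):
    """List the names of tools used by any action that satisfies an obligation."""
    rows = conn.execute(
        """
        SELECT DISTINCT tool.name
        FROM tool
        JOIN action_tool ON action_tool.tool_id = tool.id
        JOIN action ON action.id = action_tool.action_id
        WHERE action.obligation_id = ?
        ORDER BY tool.name
        """,
        (obligation_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `tools_for_obligation(build_demo(), 1)` walks from an obligation through its actions to the tools that support them, which is the kind of lookup the model is meant to make easy. A workflows entity could be added later as one more table that orders actions.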

So, as a first step toward implementing this in a database format, I spent some time learning about entity relationship modeling and resharpening my understanding of database design tools.  In the process, I downloaded the excellent MySQL Workbench and used it to design a proposed entity relationship model/database diagram.

Regardless of whether or not I eventually program a web-accessible database to track information in the way I have designed this model, the modeling exercise helped me to think through the various relationships and issues involved in dealing with the whole range of e-records work.  It makes me glad that I finally learned something about entity relationship modeling; I can see why many software developers use it before beginning a software project. It really forces a person to think about how a system might best work, or where potential pitfalls might arise in the subsequent development process.

Over time, I hope to develop this model into a database.  Such a database would be a useful way to provide information about practical tools that archivists might use.  It would offer not only information about requirements in an abstract sense, but would link them to specific actions and to tools that might help accomplish such actions.  Over time, such a database could grow through community involvement, the addition of evaluative information, and other features.  I’ll be sure to post additional information as I work on the database over the upcoming weeks. In the meantime, I’d be interested in any feedback as to whether such a database might be useful and whether my proposed model makes sense.

Download the entity relationship model/database diagram (pdf)

Barebones research methodology

Last year, when I was serving on my Library’s Executive Committee, we briefly considered a request from a junior faculty member that we run workshops on the question “How do I pick a research topic?”

It was a bit disarming to hear that someone working toward tenure needed direction on such a basic point, but it is still a good question.  In the past, when I’ve wanted to do some research, I’ve always tried to think of a practical problem that I or my colleagues are having, then turn that problem into a formal question. Of course, some questions are too big and some questions are too little. But some problems are just right.

In the past, I’ve thought a “just right” problem is one that I can express as a question I can investigate over six months, working about 15 hours a week.  Now that I’m on sabbatical, I have a bit more time to do research.  Nevertheless, I need some pretty clear limits since I’m dealing with a very complex area (electronic records), in which a huge number of people who are smarter than me are doing excellent work.

So here is my plan:

  1. Formulate research question: “What current tools, methods and software are most effective in helping archivists at under-resourced institutions identify, arrange, preserve and provide access to born-digital records that have been donated to a repository at the end of their period of active use?” (done, Sept 2009).
  2. Conduct literature review and software search, attend training events regarding digital preservation, and  develop lists of articles, software, tools and methods in the resources section of this blog (ongoing through November 2009).
  3. Assemble 4 sets of e-records typical of those that might need to be accessioned, arranged, preserved and provided for access at a university archives or other under-resourced repository  (3/4 Done).
    • a) Backlog of existing ‘one-off’ e-records accessions held by the University of Illinois Archives and Dundee ARMMS.
    • b) Email of Paul Lauterbur, Nobel Prize-winning chemist.
    • c) Office files of American Library Association’s Office of Intellectual Freedom.
    • d) Set still to be identified; likely a nonprofit organization or a faculty member at University of Dundee that is using participatory software (e.g. wikis, blogs, annotation/commenting systems, community image galleries, etc.)
  4. Develop simplified e-records processing workflow (based on Tufts/Yale project’s Requirements for Trustworthy Recordkeeping and Preservation, Ingest Guide, and Maintain Guide, as well as other resources). (October 2009)
  5. Match specific pieces of software to draft e-records processing workflow;  identify software gaps. (October-November 2009)
  6. Develop software/method evaluation criteria, which will use a two-phase process (Oct-Nov. 2009):
    • Brief comparison of program attributes to processing workflow/needs assessment.
    • In depth analysis of ‘top candidates’
  7. Use evaluation criteria to narrow complete list of software to a subset that will be evaluated in a formal test of software using live e-records. (early December 2009)
  8. Process e-records listed in step 3 using processing workflow, recording numeric evaluation and evaluative comments for each software application or method in subset, for its usefulness in working with  defined record types (images, documents, email, websites, etc). (December-January)
  9. Write formal evaluation paper summarizing methodology and results of my evaluation. (February 2010).
  10. Develop recommended list of tools; contribute to software development projects to assemble toolkit to facilitate e-records work at ‘under-resourced’ institutions.  (March-May 2010.)
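The two-phase evaluation in steps 6 through 8 could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the attribute sets, and the idea of averaging per-record-type scores are all hypothetical, since the plan above does not yet define the actual criteria or scoring scale.

```python
def shortlist(candidates, needs, top_n=2):
    """Phase 1 (hypothetical): rank tools by how many workflow needs they cover.

    candidates maps a tool name to the set of attributes it offers;
    needs is the set of attributes the processing workflow requires.
    Tools that cover no needs at all are dropped.
    """
    ranked = sorted(
        candidates.items(),
        key=lambda item: (-len(needs & item[1]), item[0]),  # most coverage first, then by name
    )
    return [name for name, attributes in ranked[:top_n] if needs & attributes]


def mean_score(scores):
    """Phase 2 (hypothetical): average the numeric scores given per record type."""
    return sum(scores.values()) / len(scores)
```

For example, with needs of `{"ingest", "checksum", "email"}`, a tool offering ingest and checksum support would rank ahead of one that only handles email, and a tool offering neither would be dropped before the in-depth phase. Evaluative comments would still be recorded alongside the numbers.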

About my research

For the past several years, I’ve felt a strong need to get a better handle on electronic records as part of my job as assistant university archivist at the University of Illinois. Like many archivists, I’ve found the issue intimidating for several reasons, not all of which are worth discussing here or now.

But one of the biggest problems is simply keeping up with technology, which is part of the reason for my original proposal to the Fulbright Commission for this research project. I was reminded of this on Friday when talking to my friend and colleague at the University of Dundee, Alan Bell. Alan pointed me to Mark Matienzo’s ArchivesBlogs, which syndicates posts relating to archives and records issues from other blogs.

On one hand, it was a bit humbling to realize that my blog is the 207th one on Mark’s list. (On the other hand, it’s nice to know that if I say something too embarrassing, most people won’t notice.)

I guess the fact that there are so many blogs makes me both a bit intimidated and a bit hopeful as I start off this project on electronic records.  Even leaving the blogs aside, there is an immense and growing published research literature which, at least when I am not on sabbatical, I would not have time to keep up with, much less master.

The bottom line is that while a whole lot of work is going on with electronic records and archives, it is difficult to say what works or doesn’t work on the practical level.

In my project statement, I noted that I would spend a considerable amount of time studying and learning about electronic research projects based in the UK. At the end of the day, I hope to assemble an open-source toolset. That will, of course, be a major part of my work here.

But since putting the project statement together last August, I’ve begun to sketch out a methodology for the initial stages of my project: I will assemble at least three sets of electronic records documenting a significant individual or organization and I will evaluate the facility of three different sets of tools in working with these records.

The three sets of tools that I plan to evaluate are broadly speaking: 1) software developed by the archival/records management community in the UK and Europe, 2) software developed by the archival/records management community in the US, and 3) other open source or commercial software developed for non-archival purposes.

My first step, which I’ll spend the next month working on, will be gathering a list of tools to evaluate, assembling relevant documentation, doing a literature review, and developing a detailed evaluation methodology.

Thus, this project serves a dual purpose: Not only do I hope to evaluate electronic records tools, but at the end I also hope to have made some progress in putting together sets of records that can more easily be preserved by an appropriate institution.

This blog’s purpose

The practical e-records blog is intended to share information concerning a research project I am directing at the Center for Archive and Information Studies (CAIS) at the University of Dundee. The project aims to evaluate software and conceptual models that archivists and records managers might use to identify, preserve, and provide access to electronic records.

Electronic records (as well as analog records) are the artifacts of daily existence left behind by organizations, families and people. They tell the world who we are and what we do. They record our thoughts and our aspirations, our fears and our dreams. As George Orwell pointed out many years ago, individuals and societies cannot retain accurate memories without accurate and complete records. As my colleague Pat Whatley here at CAIS pointed out to me, Virginia Woolf is reputed to have said, “Nothing has really happened until it is recorded.” Ironically, using online sources, I have been unable to verify the veracity of that quote.

The challenge for archivists lies in ensuring that future generations can verify the veracity of the digital traces we leave behind. The task is doubly tricky since the flood of materials being created is immense. Should we preserve everything? Hardly possible. But how do we decide what to preserve, and how do we preserve it, much less provide access to it?

In this blog, I hope to wrestle with these issues on a practical level and, at the end of the day, come up with a few techniques and recommendations that other archivists and records managers might find useful.  In addition, I’ll probably include a few reflections on the things that my family and I do in Scotland while we’re here on a Fulbright scholarship.

Hope you find it interesting enough to check back from time to time!
