Recently, I had the opportunity to meet with an administrator whose email will be provided to the University Archives for permanent retention. Since the materials in question are not being donated to the University, but are being preserved as records, we do not secure a deed of gift. Instead, the following form provided an effective way to gain information about the records and to inform the individual of actions the archives will be undertaking to preserve the records, as well as the management principles that lie behind those actions:
This past Monday, I spoke at the Museums and the Web “Deep Dive” on email preservation. At the session, I distributed the following handout, which is drawn largely from my Digital Preservation Coalition Tech Watch Report. I am posting it here, in response to a request at the seminar.
Selected Email Preservation Resources
David Bearman, “Managing Electronic Mail.” Archives and Manuscripts 22/1 (1994), pp. 28–50: outlines the major social, technical and legal issues that an email preservation project must address; is particularly useful in suggesting ways that system designs can support the effective implementation of policies.
Maureen Pennock, “Curating E-Mails: A Life-cycle Approach to the Management and Preservation of E-mail Messages,” 2006: Reviews the major challenges to email preservation and summarises some prospective approaches, with particular emphasis on the need to manage email effectively during its period of creation and active use; also outlines the major conceptual approaches that can be used to preserve email, with somewhat less description of particular tools or services. http://www.dcc.ac.uk/resources/curation-reference-manual/completed-chapters/curating-e-mails
Richard Cox, “Electronic Mail and Personal Recordkeeping.” In Personal Archives and a New Archival Calling: Readings, Reflections and Ruminations. Duluth, Minnesota: Litwin Books, pp. 201–42: Reviews the history of attempts that the archival profession has made in preserving email messages and their content, suggesting that the best approaches will understand and preserve them as the organic outcome of our professional and personal lives. Cox suggests that those wishing to preserve email draw on concepts and procedures from both the records management and manuscript archives traditions, but the chapter contains relatively little direct implementation advice.
Gareth Knight, InSPECT: Investigating Significant Properties of Electronic Content 2009: A report on email migration tools, completed for the InSPECT project, includes a description and analysis of the structure of an email message, identifying 14 properties of the message header and 50 properties of the message body that must be maintained during migration if an email is to be considered authentic and complete. The report also outlines a procedure for testing whether particular email migration tools preserve those properties and applies that procedure to three specific tools. http://www.significantproperties.org.uk/
Christopher Prom, Preserving Email, Digital Preservation Coalition Technology Watch Report: Provides a summary of social, legal, and technical challenges and opportunities for email preservation, reviews and explains internet standards and technologies for email exchange and storage, and recommends particular approaches to consider in an email preservation project. http://dx.doi.org/10.7207/twr11-01
- MailStore Home: http://www.mailstore.com/en/mailstore-home-email-archiving.aspx
- Aid4Mail: http://www.aid4mail.com/
- Muse: http://mobisocial.stanford.edu/muse/
- Outlook: http://office.microsoft.com/en-us/outlook-help/export-or-back-up-messages-calendar-tasks-and-contacts-HA102809683.asp
- Gmail Download: http://gmailblog.blogspot.com/2013/12/download-copy-of-your-gmail-and-google.html
- ePADD: https://library.stanford.edu/spc/more-about-us/projects-and-initiatives/epadd-project
- MailArchiva: https://www.mailarchiva.com/
- Mailstore Server: http://www.mailstore.com/en/mailstore-server.aspx
Exchange Server: A proprietary application developed and licensed by Microsoft Corporation, providing server-based email, calendar, contact and task management features. Exchange servers are typically used in conjunction with Microsoft Outlook or the Outlook Express web agent. Exchange servers use a proprietary storage format and messages sent using Exchange typically include extensive changes to the header of the file. Calendar entries, contacts, and tasks are also managed via extensions to the email storage packet. Depending on local system configuration, users may be able to connect to a specific Exchange server using an IMAP-aware client application.
Internet Message Access Protocol (IMAP): A code of procedures and behaviours regulating one method by which email user agents may connect with email servers and message transfer agents, allowing an individual to view, create, transfer, manage and delete messages. Typically contrasted with the POP3 protocol, IMAP is defined in the IETF’s RFC 3501. Email clients connecting to a server using IMAP usually leave a copy of the message on the server, unless the user explicitly deletes a message or has configured the client software with rules that automatically delete messages meeting defined criteria.
Multipurpose Internet Mail Extensions (MIME): A protocol for including non-ASCII information in email messages. Specified in IETF RFCs 2045, 2046, 2047, 4288, 4289 and 2049, MIME defines the precise method by which non-Latin characters, multipart bodies, attachments and inline images may be included in email messages. MIME is necessary because the base email format supports only seven-bit ASCII characters, not eight-bit data. It is also used in other communication exchange mechanisms, such as HTTP. Software such as message transfer agents, email clients, and web browsers typically include interpreters that convert MIME content to and from its native format, as needed.
PST: .pst is a file extension for local ‘personal stores’ written by the program Microsoft Outlook. PST files contain email messages and calendar entries using a proprietary but open format, and they may be found on local or networked drives of email end users. Several tools can read and migrate PST files to other formats.
Simple Mail Transfer Protocol (SMTP): A set of rules that defines how outgoing email messages are transmitted from one Mail Transfer Agent to another across the Internet, until they reach their final destination. Defined most recently in IETF RFC 5321.
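To make the MIME entry above a little more concrete, here is a minimal sketch in Python, using only the standard library's email package. It builds a message with a non-ASCII subject and body plus a small binary attachment, serializes it so that every byte stays within seven-bit ASCII, and then parses it back. The addresses, subject, attachment, and the helper names (build_sample_message, to_seven_bit_bytes, parse_message) are all invented for illustration; this is one way to see the encoding at work, not a preservation tool.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser

# Hypothetical sample values, chosen only to exercise MIME encoding.
SUBJECT = "Übergabe der Akten"
BODY = "Viele Grüße aus dem Archiv.\n"
ATTACHMENT_BYTES = b"\x00\x01\xff binary payload"

# A policy that restricts content transfer encodings to seven-bit safe ones,
# so non-ASCII text is stored as quoted-printable or base64.
SEVEN_BIT = policy.default.clone(cte_type="7bit")


def build_sample_message() -> EmailMessage:
    """Create a message with a non-ASCII subject, body, and an attachment."""
    msg = EmailMessage(policy=SEVEN_BIT)
    msg["From"] = "archivist@example.org"
    msg["To"] = "administrator@example.org"
    msg["Subject"] = SUBJECT
    msg.set_content(BODY)
    # Adding an attachment turns the message into a multipart/mixed container.
    msg.add_attachment(
        ATTACHMENT_BYTES,
        maintype="application",
        subtype="octet-stream",
        filename="sample.bin",
    )
    return msg


def to_seven_bit_bytes(msg: EmailMessage) -> bytes:
    """Serialize the message; headers are written as RFC 2047 encoded words."""
    return msg.as_bytes()


def parse_message(raw: bytes) -> EmailMessage:
    """Parse serialized bytes back into a message with decoded accessors."""
    return BytesParser(policy=policy.default).parsebytes(raw)


if __name__ == "__main__":
    raw = to_seven_bit_bytes(build_sample_message())
    print(raw.decode("ascii"))  # what is actually stored or transmitted
    parsed = parse_message(raw)
    print(parsed["Subject"])
    print(parsed.get_body(preferencelist=("plain",)).get_content())
```

Printing the serialized form shows why a migration tool has to interpret MIME correctly: the stored bytes look quite different from what the user saw, and the readable subject, body, and attachment only reappear once the message is decoded.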
I have not been able to post in quite a while, since I’ve been wrapped up with other duties. But, to break the silence, here is a brief piece outlining some basic email preservation options, which I recently wrote for publication in an upcoming edition of the Midwest Archives Conference Newsletter:
The prominent Atlantic journalist and blogger James Fallows recently described how an email hacker destroyed records having great personal value: his wife’s entire Gmail archives, covering many years of her life. Although Fallows’ story ended happily, with the records being recovered through insider connections at Google, it seems likely that, for the population generally, little email correspondence is currently being saved and preserved for its historical value.
In a follow-up piece, Fallows noted how the email records of a prominent journalist, records likely of great historical value, were similarly lost. At my own institution, one important university officer recently lost all email prior to 2010, apparently during a system migration. An important scholar with whom I’ve been in contact related a very similar story. The evidence I cite is anecdotal, but how many of our institutions are actually capturing records from email communications?
We in the archival community can and must help people save email in a way that makes it likely that email records will one day become research collections, openly accessible for their historical value. In order to do this, each institution will need to develop its own rationale for a long-term email preservation project in light of local needs, institutional profiles, mandates, and policies. Without denying the paramount importance of defining these policies, this article will provide some technical options that might be used to provide the building blocks for a set of local email preservation services, under the rubric of the two general approaches that are practicable using currently available technologies.
This piece doesn’t refer to email management per se, but it offers very good email management advice (and a bit of evidence that, in spite of predictions to the contrary, email use is truly embedded into our lives):
In order to understand how to preserve (or fix) something, you need to understand the lingo. Once you know what the key terms mean, you’ll understand a lot more about how systems interact toward a common goal.
In other words, during the process of putting together my email preservation guidelines, which are now nearing completion of the first draft, I found it extremely useful to actually define the terms used; it was not until completing this task that I felt like I was really beginning to understand exactly what email is and how email actually moves around the Internet and might be preserved. Although not all of the terms I ended up defining can fit in the final report, and although some of the texts need to be further refined, I put a copy of my full glossary under the resources section of this blog:
In my forthcoming guideline to email preservation, I make the point that far too many email ‘preservation’ systems or policy guidelines ask too much of users. Who is going to read, much less understand, an email policy document of five pages or more? And if they do, will users really make appropriate decisions about which particular messages to classify as ‘record’ or ‘non-record’ items (assuming they can even understand the distinction)?
To that end, I was refreshed to read this good advice in the New York Times. I really like the techniques it suggests for overcoming email overload, since they mesh well with a medium-term ‘do nothing’ preservation strategy, provided you keep messages on the server, have access to a more or less unlimited amount of server storage, and can come up with a way to migrate or capture messages at a later time, using either email migration software or an email archiving application.
If I have time later, I’ll expand a bit on some of the ideas laid out in this article.
Steve Bailey from JISC Infonet provided the fourth talk at the DPC Preserving Email Seminar in London, on July 29th. In a provocative set of remarks, Steve argued that the records management approach to email has shown little regard for users and has made the survival of a useful email record unlikely. He proposed an alternate way forward using new technologies such as “email archiving” software alongside a lightweight policy structure based on user needs and requirements. Below I’d like to summarize his thoughts as I interpreted them and am applying them to my email preservation guidance report.
Last week, I had the opportunity to interview Crawford Nielson regarding the email archiving routine used by the Social and Public Health Sciences Unit (SPHSU) of the Medical Research Council. The conversation revealed a fascinating approach to preserving a complete record of all email transactions in an organization, where the focus is on ensuring current access and potential long-term preservation.
In the third presentation at the DPC Preserving Email Seminar, Susan Thomas from the Bodleian Library presented case studies concerning acquisition and preservation workflows that are being used by the Bodleian Libraries.
The Bodleian Library is the main research library at the University of Oxford, and its units have been receiving a wide range of digital manuscripts, personal papers, and organizational records. With funding from the Andrew W. Mellon Foundation, the Bodleian’s FutureArch project is developing methods to deal systematically with hybrid (analog and born-digital) collections. This work is documented through a very useful blog.
For my US readers, I’d like to call out the very perceptive comments that James Lappin provides on his “Thinking Records” blog regarding the DPC Preserving Email event and in particular Steven Howard’s presentation. They really are worth a read, since James has a good perspective on where the records management market and trends are heading. In this respect, it is well worth reading James’s interesting comments regarding the MoReq2010 specification, the European analog to the DOD 5015.2 records management specification in the US, and to the ICA functional requirements for electronic recordkeeping. (As a side note, I understand from Adrian Cunningham’s comment on James’s blog that the ICA will be offering detailed implementation guidance for that standard, aimed at lower-resourced archives.)
Both of the points cited above, as well as the other thoughts recorded by James on his blog, are really useful for archivists trying to get a sense of emerging trends in records management practice and software implementation. They are helping to shape my Preserving Email Report, as I make the push to complete it over the next few weeks.