The sustainability of the archive

July 9, 2007

Citing the crucial need to access records on nuclear waste storage, or census returns, in five, 10 or even 100 years’ time, [Natalie Ceeney, chief executive of the National Archives] said: “This is a critical issue for us, and for UK society as a whole. We assume our personal records are secure, we expect our pensions to be paid, but anyone with a floppy disc even three or four years old is already having a hard time finding a computer that will open it.” [Source]

This is undoubtedly one of the most interesting and pertinent articles I’ve seen in the papers for a while: National Archive project to avert digital dark age.

First of all, it makes me nervous that Microsoft is a vocal partner in this. Isn’t the reliance on one or two companies’ proprietary formats what got us into this mess in the first place? MS are renowned for their distaste for open and accessible formats (witness their approach to web standards embodied in Internet Explorer, or the furore over the BBC’s MS-powered iPlayer), so while it is probably necessary that they should be involved to rescue these files, let’s hope the Archives have learnt their lesson and are moving towards the use of open, extensible, standards-based code.

I’m going to point again to this article about validation, because I think it says a lot of things very well about the importance of using this kind of code:

This is an attempt to make a code that can go decades and centuries, getting broader in scope without ever shutting out it’s early versions. Because that’s what we need the code to do: this code is for recording what we think. There are no paper backups of the web. Every day we put more on it that we’re not putting in our traditional medias. If we don’t use extensible code, then our current history evaporates with the next minor tech change. We’ve never had this problem before. Before a mark on a page could go centuries; there’d always be daylight to read it by. This is a new problem and it required a new solution. [Source]

This is as important in publishing as it is in other fields. As we move inevitably towards ebooks and beyond, it’s very easy to imagine a situation, twenty or thirty years from now, in which a decade-old literary work becomes inaccessible because it was composed on one computer, revised on others, and encoded in an obsolete, proprietary format for distribution – and never once written down on paper.

The solution, I’m afraid, is not to write everything down on paper – there’s too much of it now, and it’s wasteful and irresponsible to boot – but to make sure that we use the best, most open, most public formats right now, for everything we do.

Large sections of the music industry are already moving away from DRM-based systems (e.g. the latest version of iTunes). Publishers should take note and not go down the bad old routes which, as experience is beginning to show, don’t help anyone in the long run. The International Digital Publishing Forum published the latest version of their XML-based Open eBook Publication Structure Specification at the end of last year, and it scored its first victory a few weeks back with its inclusion in the new Adobe Digital Editions (although this still leaves open the possibility of DRM).

Yes, we need to find ways to make sure that authors and others are paid for their work, but we also need to make sure that their works – as well as those pension records and that nuclear waste data – are accessible to future generations. We owe them that.

Image detail from Illuminated by Chronicity, reproduced under CC Licence.
