PDF should be used to preserve information for the future

Press release from the Digital Preservation Coalition

Good news: the already popular PDF file format, adopted by consumers and businesses alike, is one of the most logical formats for preserving today’s electronic information for tomorrow.

According to the latest report released today by the Digital Preservation Coalition (DPC), Portable Document Format (PDF) is one of the best file formats to preserve electronic documents and ensure their survival for the future. This announcement will allow information officers to follow a standardised approach for preserving electronic documents.

Information management and long-term preservation are major issues facing consumers and businesses in the 21st century. This report is one of a series in which the Digital Preservation Coalition (DPC) aims to examine and address the challenges facing us.

This report reviews PDF and the newly introduced PDF/Archive (PDF/A) format as a potential solution to the problem of long-term digital preservation. It suggests adopting PDF/A for archiving electronic documents, as the standard will help preservation and retrieval in the future. It concludes that this can only succeed when combined with a comprehensive records management programme and formally established records procedures.

Betsy Fanning, author of the report and director of standards at AIIM, comments, “A standardised approach to preserving electronic documents would be a welcome development for organisations. Without this we could be walking blindly into a digital black hole.”

The National Archives works closely with the DPC on issues surrounding digital preservation and will continue to do so. Adrian Brown, head of digital preservation at The National Archives, said: “This report highlights the challenges we all face in a digital age. Using PDF/A as a standard will help information officers ensure that key business data survives. But it should never be viewed as the Holy Grail. It is merely a tool in the armoury of a well thought out records management policy.”

The report is a call to action: organisations need to act now and look hard at their information policies and procedures to anticipate the demand for their content (documents and records) in the future. Every organisation has different criteria, types and uses for documentation, so you need to find an approach that works for yours.

If you would like to read the full report please go to the Digital Preservation Coalition website. This can be accessed here: http://www.dpconline.org/graphics/reports/index.html#twr0802


Alternative File Formats for Storing Master Images

From Astrid Verheusen, National Library of the Netherlands

The Koninklijke Bibliotheek, National Library of the Netherlands, has published a report on possible alternative file formats for storing master images from mass digitisation projects. Uncompressed TIFFs, the KB’s preferred format so far, take up far too much storage capacity to be a viable storage strategy for the long term. The report is available from the KB website.

At the Koninklijke Bibliotheek mass digitisation projects are taking off. In the next four years millions of high resolution RGB master image files will be produced and will have to be (permanently) archived. However, if all projected 40 million images are to be stored as uncompressed TIFFs, the KB will need some 650 TB of storage capacity by 2011. This is quite a capacity challenge, and thus the need arose to develop a new strategy for storage of images.
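To put these figures in perspective, here is a rough back-of-the-envelope calculation of the average size of one master file. It assumes decimal units (1 TB = 1,000,000 MB) and that the 650 TB covers only the 40 million projected images. Neither assumption is confirmed by the report, and the function name is made up for illustration.

```python
def average_image_size_mb(total_tb: float, image_count: int) -> float:
    """Average size per image in MB, assuming 1 TB = 1,000,000 MB."""
    mb_per_tb = 1_000_000  # decimal units: an assumption, the report may count differently
    return total_tb * mb_per_tb / image_count


# Figures taken from the paragraph above: 650 TB for 40 million images.
print(average_image_size_mb(650, 40_000_000))
```

On these assumptions, each uncompressed TIFF averages a little over 16 MB.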

The project considered whether it would be possible to distinguish between master image files which must be stored for all ‘eternity’ (because the originals decay rapidly and/or digitisation costs are so high that repeating the digitisation process is not a viable solution) and objects which are stored for access. The distinction would allow for a more pragmatic and economic storage policy, whereby projected usage would determine the storage strategy.

The draft of the report was reviewed by a group of selected specialists on digitisation, digital preservation and image science. Their feedback was incorporated in the final version of the report, which is available at: http://www.kb.nl/hrd/dd/dd_links_en_publicaties/links_en_publicaties_intro.html


JPEG2000 Ready for Use

From a message sent by the Digital Preservation Coalition

The Digital Preservation Coalition has examined JPEG 2000 in a report published today. The report concludes that JPEG 2000 represents a great stride forward for the archival community. The format now allows for greater compression rates and a recompression rate that is visually lossless.

The findings come as the Digital Preservation Coalition launches its latest ‘Technology Watch Report’, ‘JPEG 2000 – a practical digital preservation standard?’, written by Dr. Robert Buckley, a Research Fellow with Xerox. The report looks in depth at the new format and the challenges it has to cope with. JPEG 2000 is widely used to collect and distribute a variety of images, from geospatial, medical imaging, digital cinema and image repositories to networked images. Interest in JPEG 2000 is now growing in the archival and library sectors, as institutions look for more efficient formats to store the results of major digitisation programmes.

The report is aimed at organisations involved in the management and storage of digital information. The in-depth report will help archives, libraries and other institutions make informed decisions about the JPEG 2000 format and their future storage needs.

JPEG 2000 can reduce storage requirements by an order of magnitude compared to an uncompressed TIFF file. Dr. Buckley says, “This new format has come at a time of heightened awareness about the access to digital documents. Any format that can assist archives and libraries to do this is welcome.”
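As a rough illustration of what an order of magnitude would mean in practice, the sketch below applies a factor of ten to the 650 TB that the Koninklijke Bibliotheek item above projects for uncompressed TIFFs. Combining the two news items is only an illustration: the factor of ten is a reading of "order of magnitude", the real ratio depends on image content and compression settings, and the function name is hypothetical.

```python
def compressed_storage_tb(uncompressed_tb: float, reduction_factor: float = 10) -> float:
    """Estimated storage after compression, for an assumed reduction factor."""
    # reduction_factor=10 is one reading of "an order of magnitude"; it is not a measured value.
    return uncompressed_tb / reduction_factor


# 650 TB is the KB's projected requirement for uncompressed TIFFs.
print(compressed_storage_tb(650))
```

Under that assumption, the projected archive would need roughly 65 TB rather than 650 TB.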

The format will also enable users to open only as much of the file as they need at that time. This means a viewer, for example, could open a gigapixel image almost instantly. This is achieved by retrieving a decompressed low-resolution, display-sized image from the JPEG 2000 codestream. Coupled with this, the users’ ability to zoom, pan and rotate an image has been enhanced.
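The idea of opening only part of the file can be sketched with some simple arithmetic. JPEG 2000 stores an image at several resolution levels, each roughly half the width and height of the one above, so a viewer can decode the first level that fits the screen. The sketch below only computes sizes and does not read a real file; the image and display dimensions are invented, the function names are hypothetical, and a real codestream offers only as many levels as the encoder stored.

```python
def reduced_size(width: int, height: int, discarded_levels: int) -> tuple[int, int]:
    """Image size after discarding resolution levels (each halves both sides, rounded up)."""
    scale = 2 ** discarded_levels
    # -(-a // b) is integer ceiling division.
    return -(-width // scale), -(-height // scale)


def levels_to_fit(width: int, height: int, display_w: int, display_h: int) -> tuple[int, int, int]:
    """Smallest number of halvings that makes the image fit the display, with the resulting size."""
    levels = 0
    while True:
        w, h = reduced_size(width, height, levels)
        if w <= display_w and h <= display_h:
            return levels, w, h
        levels += 1


# Hypothetical 40,000 x 25,000 pixel (one gigapixel) image on a 1920 x 1080 display.
print(levels_to_fit(40_000, 25_000, 1920, 1080))
```

In this example the viewer would decode an image of 1250 x 782 pixels, about a thousandth of the full pixel count, which is why the first view can appear so quickly.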

Adrian Brown, head of digital preservation, The National Archives said: “This is a very timely addition to the DPC’s Technology Watch Report series as many organisations are themselves reviewing the JPEG2000 format. This concise, comprehensive and clear guide will be of interest to practitioners across the digital preservation community.”

The report concludes that JPEG 2000 offers much more flexibility and many more features than JPEG, but at the cost of greater complexity. It is, however, a great stride forward, and of major significance for the information management community.

To download a PDF of the report please go to: http://www.dpconline.org/graphics/reports/index.html#twr0801


Different flavours of Office XML

JISC has just published a report comparing the different XML-based standards for office documents (i.e. word processing files and the like). It compares ODF with the Microsoft-supported OOXML.

To the layman, the differences seem rather arcane. The real danger seems to be, according to the report, the simple fact that there will be two standards, with different capabilities, in place. This doesn’t make life on the computer as straightforward as it should be.


Flash as de facto delivery standard

Despite years of everyone emphasising the importance of open standards, the attractions of proprietary standards like Flash are hard to resist for some projects.

There are two main reasons. 1. Most (but not all) people have Flash on their desktops. 2. Flash allows you to do funky things that would be impossible otherwise.

YouTube is the most obvious example. Despite its poor quality, the YouTube platform has now established Flash as the way of delivering video.

In the cultural heritage business, two other recent projects have unveiled Flash-based resources. One is the Rome Reborn reconstruction of classical Rome. The other is the 9.9 gigabyte image of Andrea Pozzo’s ceiling fresco from 17th-century Rome.

Impressive now, of course. But will these resources be around in five years’ time?

