The dream of a single record keeping profession
It is roughly twenty years since Frank Upward began popularising the records continuum as a paradigm shift away from the previously prevalent records lifecycle model. It was the early 1990s, the digital revolution was about to hit organisations and Upward did not believe that a body of professional thought based on the lifecycle paradigm would cope with it.
Upward had both philosophical and practical concerns about the lifecycle model.
His philosophical concerns stemmed from the fact that the records lifecycle model depicted records as moving in a straight line through time: from creation of the record, through an active phase (where it is being added to and used) and a non-active phase (where it is kept for administrative and/or other reasons), until final disposition (destruction, or transfer to an archive for long-term preservation). Upward pointed out that whereas Isaac Newton believed time moved in a straight line like an arrow, Einstein had shown that time and space were inseparable, and that both were relative to the motion of the observer.
Upward compared records to light. Light carries information about an event through time and space. So do records. Upward based his records continuum diagram on a space-time diagram, which depicts the way that light travels away from an event in every direction, through space and through time. No one can know about an event unless and until the light has reached them. The records continuum diagram showed that records need to be managed along several different dimensions in order to function as evidence of that event across time and space, and in order to reach different audiences interested in that event for different reasons, at different times and in different places.
Upward’s practical concern related to the fact that the lifecycle model had been used to underpin a distinction between the role of the records manager and of the archivist. The records manager looked after records whilst they were needed administratively by a creating organisation, the archivist looked after records once they were no longer needed administratively but still retained value to wider society.
The interest of wider society in the records of any particular event does not suddenly materialise 20 or 30 years after the event. The interest of society is present before the event even happens. A records system of some sort needs to be in place before the event happens in order for the participants/observers of the event to be able to capture a record of it. That system needs to take into account the interest of wider society in the event if the records are to have a fighting chance of reaching interested parties from wider society if and when they have the right to access them. This concern is particularly pertinent to digital records. Whereas paper records could be left in benign neglect, digital records are at risk of loss if they, and the applications that they are held within, are not actively maintained.
Upward didn’t use the word archivist or the word records manager. To him we are all record keepers. What we do is record keeping, and we belong to the record keeping profession.
One of the big impacts of the digital revolution, and of the paradigm shift from the lifecycle model, was the shift of attention away from the form and content of records themselves, and towards the configuration of records systems. In this post I will compare the way records managers have gone about the business of specifying records systems with the way archivists have gone about defining digital archives.
The continuing divide between records managers and archivists
Upward’s plea for a united recordkeeping profession has gone largely unheeded in the English speaking world. Twenty years into the digital age we still see a profound cleavage between not just the roles of archivists and records managers inside organisations, but also their ambitions and strategies with regard to electronic records.
The DLM Forum is a European group that brings together archivists (mainly from the various national archives around Europe) and records managers. When you are listening to a talk at a DLM Forum event you can always tell the records managers and the archivists apart:
- The records managers refer to MoReq2010 (developed by the DLM Forum itself), a specification of the functionality that an application needs in order to manage the records it holds
- The archivists talk about OAIS (the Open Archival Information System reference model), a standard for ensuring that a digital repository can ingest, preserve, and provide renditions of electronic records that have been transferred to the archive
This reflects a difference in the initiatives that the two branches of the profession have adopted towards electronic records.
Records management initiatives
The records management profession has attempted to design records management functionality into the systems used by the end-users who create and capture records. Over the period 2000 to around 2008 this strategy mainly centred on specifying and implementing large-scale, corporate-wide electronic document and records management systems (EDRMS). Unfortunately, a relatively small percentage of organisations succeeded in deploying such systems, and even those that did found that many records were kept in other business applications and never found their way into the EDRMS.
We are now in the early days of the development of alternative records management models. MoReq2010 is the most recent attempt to influence the market and specify the functionality of electronic records management systems. MoReq2010 is framed to support several different models. It continues to support the EDRM model, but it also supports the following alternative models:
- building records management functionality into business applications, and storing records in those applications (in-place records management)
- storing records in the business applications into which they were first captured, but managing and governing them from a central place (federated records management)
- integrating the many and various business applications in the organisation into a central repository which stores, manages and governs records.
The key thing that all three of these newer approaches have in common is that they each involve records passing from one system to another during their lifetime. For example, in the in-place model, even if an organisation succeeded in installing records management functionality into every one of its applications (a distant hope!), it would still need to make provision for what would happen to the records held by an application that it wished to replace. For that reason MoReq2010 pays particular attention to ensuring that applications can export records with their metadata in a way that other applications can understand and use.
Digital archiving initiatives

The strategy of archivists has been to design digital archiving systems which can capture records from whichever system(s) a record-creating organisation deploys. In theory it would not matter what applications an organisation (government department/agency etc.) used to conduct its business, nor whether that application had records management functionality. The archives would still be able to accession records from them provided that the archives succeeded in:
- building a digital archive that can receive accessions of electronic records and their associated metadata
- defining a standard export schema/metadata schema dictating exactly what metadata needs to be provided, and in what form, about records transferred to the archives
- enforcing that export schema/metadata schema so that all new transfers of electronic records come in a standard form that is relatively straightforward for the archive to accession into their digital repository
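In code, the enforcement step in the last bullet amounts to checking each incoming record against the published schema before accessioning it. A minimal sketch, assuming a hypothetical transfer format in which every record must carry a fixed set of metadata fields (the field names here are invented for illustration, not taken from any real archival standard):

```python
# Hypothetical check of a submission against a standard transfer schema.
# The required fields are illustrative; a real archive would publish its own.
REQUIRED_FIELDS = {"identifier", "title", "created", "creator", "format"}

def validate_submission(records):
    """Return a list of (record index, missing fields) problems."""
    problems = []
    for i, record in enumerate(records):
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            problems.append((i, sorted(missing)))
    return problems

submission = [
    {"identifier": "2012/001", "title": "Minutes", "created": "2012-05-31",
     "creator": "Cabinet Office", "format": "application/pdf"},
    {"identifier": "2012/002", "title": "Report"},  # incomplete metadata
]
print(validate_submission(submission))  # -> [(1, ['created', 'creator', 'format'])]
```

An archive that runs a check like this at the door can reject non-conforming transfers cheaply, instead of discovering the gaps during ingest.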
Unfortunately only a small number of national archives have succeeded in making electronic accessions of records anything remotely resembling routine. Some archives have succeeded in building digital archive repositories: the UK National Archives has one, as do Denmark, the US, Japan and others. But the process of accepting transfers of electronic records into these archives is problematic. Every vendor sets up its document management systems to keep metadata about the content its applications hold in a different way. The first time an archive accepts a deposit of records from each different system there is a lot of work to do translating the metadata output from that system into a format acceptable to the digital archive repository. The resources required for this work have to be provided either by the archives or by the contributing body.
Jon Garde summed the situation up when he said, in a talk to the May 2012 DLM Forum members’ meeting, that ‘most records never leave their system of origin’. This comment serves as a sorry testament to the limited success of the initiatives undertaken hitherto by both sides of our profession.
Lack of a join between records management and archival initiatives
It is rare to see examples of a joined-up strategy between records management and archival initiatives. In the United Kingdom the National Archives (then the Public Record Office) started out in the early years of the digital age by taking a great interest in the way government departments managed their electronic records, in the hope that this would make it easier for the archives to accept electronic records from those departments. The records management arm of TNA defined the UK’s electronic records management system specification. Between 2002 and around 2008 they supported government departments in implementing EDRM systems that complied with those specifications. But the archivists at TNA derived little benefit from this.
TNA issued both versions of its electronic records management specification without a metadata standard and without an XML export schema. This meant that each compliant EDRM system from each vendor kept metadata about records in a different way, and hence the transfer of records from those EDRM systems to the National Archives would need to be thought through afresh for each product. By the time the National Archives did get round to issuing a metadata standard it had already decided to stop testing and certifying systems (in favour of the Europe-wide MoReq standard). The absence of a testing regime meant that vendors had no incentive to implement the standard in their products. But even if vendors had implemented the metadata standard, TNA would have benefited little from it on the archive side, because TNA decided not to use that metadata standard for its own digital archive repository.
The OAIS model
The OAIS model is a conceptual model of what attributes a digital archive system should possess. It makes clear one of the key differences between a traditional archive of hard copy/analogue objects, and the digital archive.
A hard copy archive, in the main, produces for the requestor the very same object that it has been storing, which in turn is the very same object that was transferred to it by the depositing organisation.
In a digital archive this does not hold true. The object originally transferred to the archive may need to be changed or migrated to new software or hardware, so the digital object actually stored in the archive will differ from the digital object originally submitted to it. When a requestor asks to see the record, the digital archive will usually make a presentation copy for them to view, rather than providing the object stored in the repository. The presentation copy may differ in some respects from the stored object, for example where the archive wishes to present a version better adapted to the browser/software/hardware available to the requestor.
The OAIS model came up with a vocabulary to describe these three separate versions of the record:
- The object originally transferred to the archive is a submission information package (SIP)
- The object stored in the archive is an archival information package (AIP)
- The object provided to the requestor is a dissemination information package (DIP)
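The flow between the three packages can be pictured in code. This is only a sketch of the concepts – OAIS itself is deliberately abstract, and the class and function names here are my own invention, with a pretend format migration standing in for real preservation work:

```python
from dataclasses import dataclass

@dataclass
class Package:
    content: bytes
    format: str
    metadata: dict

def ingest(sip: Package) -> Package:
    """SIP -> AIP: normalise the submission to the archive's preservation format.
    (A pretend migration from Word to PDF; real format migration is far harder.)"""
    if sip.format == "application/msword":
        return Package(sip.content, "application/pdf",
                       {**sip.metadata, "migrated_from": sip.format})
    return sip

def disseminate(aip: Package) -> Package:
    """AIP -> DIP: render a presentation copy suited to the requestor's software."""
    return Package(aip.content, "text/html",
                   {**aip.metadata, "rendition": "presentation"})

sip = Package(b"...", "application/msword", {"title": "Minutes"})
aip = ingest(sip)       # what the archive stores
dip = disseminate(aip)  # what the requestor sees
print(aip.format, dip.format)  # application/pdf text/html
```

The point the sketch makes is the one in the paragraph above: the stored object (AIP) and the viewed object (DIP) are both derived from, but not identical to, the submitted object (SIP).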
OAIS has no certification regime, so there is no way for proprietary products, open source products or actual implementations to be certified as compliant with the model. At various times the archives/digital preservation community has debated whether or not it should have one (see this report of an OAIS workshop run by the Digital Preservation Coalition). Some archivists have felt that it is an advantage that OAIS does not have a certification regime, because it allows vendors and organisations the flexibility to implement the model in different ways. Others have felt that the lack of a certification regime hinders interoperability between archives.
An example of the OAIS model working well – the Danish National Archives
I had a tour of the Danish National Archives on 31 May 2012 (the first day of the members’ meeting of the DLM Forum). The Danish National Archives has a very well functioning process based on the OAIS model. They have laid down a clear standard for the format in which Danish government bodies transfer records plus their metadata (submission information packages) to the archives. Government bodies send records on optical disk or hard drives. The archives gives each accession a unique reference, then tests it to ensure it conforms to the standard. The testing is performed on a stand-alone testing computer. Each accession is called ‘a database’, because accessions always come in the form of a relational database. Such relational databases typically hold metadata together with the documents/content that the metadata refers to.
I asked whether a government department could deposit a shared drive with the archives. They replied that the department would have to import the shared drive into a relational database first in order to format the metadata needed for the accession. This brought home to me the fact that when an archive imposes a standard import model it does not reduce the cost of transferring records from the many and various systems used by organisations into one digital archive. It merely places a greater proportion of the cost of the migration on the shoulders of the transferring bodies.
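To make the Danish requirement concrete: restructuring a shared drive for transfer means turning a listing of files into a relational database that holds each document alongside its metadata. A minimal sketch, with invented table and column names, using an in-memory SQLite database:

```python
import sqlite3

# Hypothetical illustration of the Danish-style requirement: before transfer,
# a shared drive's contents must be restructured as a relational database
# holding the documents alongside their metadata. Table and column names
# are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,      -- original location on the shared drive
    title TEXT,
    modified TEXT,           -- ISO 8601 date
    content BLOB             -- the document itself
)""")
shared_drive = [
    ("/finance/budget-2011.xls", "Budget 2011", "2011-12-01", b"..."),
    ("/finance/budget-2012.xls", "Budget 2012", "2012-05-30", b"..."),
]
conn.executemany(
    "INSERT INTO documents (path, title, modified, content) VALUES (?, ?, ?, ?)",
    shared_drive,
)
print(conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0])  # 2
```

Even this toy version shows where the cost lands: someone at the transferring body has to decide, file by file, what the title and date of each document actually are.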
It is not necessarily easy for other national archives to replicate the success of the Danish National Archives. An archivist from the Republic of Ireland, who is in charge of electronic records at the Irish national archives, accompanied me on my tour. The Irish archives have not been able to get a standard format agreed for government departments to send accessions of electronic records to them. From time to time government bodies send accessions of electronic records, principally when a government body is wound down. The archives can do nothing more than store the accessions on servers and make duplicate copies; they have no digital archive repository to import them into. Even if they did have a repository, the accessions are in such different formats that ingesting them would be an extremely time-consuming and lossy process. The chances of the archives persuading the rest of the Irish government to accept a standard format and process for transferring electronic records are slim, because in times of austerity it would be seen as an extra administrative burden.
(For more details on the approach of the Danish National Archive watch this 25 minute presentation by Jan Dalsten Sorensen)
An example of the records management approach working well – the European Commission
The European Commission has taken a records management approach to managing records from their creation until their disposal or permanent preservation.
They started off with a fairly standard electronic document and records management system (EDRMS) implementation with a corporate file plan, and later with linked retention rules. But then they expanded on this model. They are currently in the process of integrating their line-of-business document management systems, one by one, into the EDRM repository. The ultimate aim is that a member of staff could choose to upload a record into any one of the Commission’s document management tools and still have the record captured in a file governed by the Commission’s filing plan and retention rules. They are also developing a preservation module for the historical archives. This module will enable records to pass from the control of the Directorates-General (DGs) of the Commission that created them into the control of the Historical Archives without leaving the EDRM repository itself.
The model is not perfect (like every other organisation they find it difficult to persuade colleagues to contribute e-mail to the EDRMS), and it is not finished (not all the different document management systems have been integrated yet, and not all the functionality needed to manage the process of sending records to the control of the Historical Archives has been added yet), but it is a very well thought-through and solid approach that has successfully scaled up to cover nearly 30,000 people.
As with the Danish National Archives, it would not be easy for other organisations to replicate the success of the European Commission’s approach. The Commission’s success has come as the result of a records management programme started in 2002; it has taken a considerable amount of time (ten years) and a considerable amount of political will to draft the policies, build the filing plan, draft the retention schedule, establish the EDRM, and commence the integration of other document management systems into it. The integration of each document management system into the EDRM is a new project each time, requiring developers to work on the document management system in question so that it can use the EDRM system’s object model to deposit records into the repository.
In these turbulent times of economic austerity it is hard to envisage many organisations embarking on a records management programme that would take 6 to 8 years to deliver benefits.
How do we make it more feasible to manage records over their whole lifecycle?
The fact that these two excellent examples, from the Danish National Archives and the European Commission, are so difficult to replicate is a concern for both the records management and archives professions.
In an ideal world every records management service would operate a records repository, every archive would run a digital archive. In an ideal world the records managers would not need to get developers to do any coding to enable business applications to export their records into the records repository – the applications would be configured so that they could export records and all accompanying metadata in a way that the repository understood.
In an ideal world an Archive running a digital archive would not have to specify to their contributing bodies that they need to tailor and adjust the exports of their application. In an ideal world those bodies could run a standard export from any of its applications, that the Archive could import, understand and use.
The key enabler for both of these things is a widely accepted standard for the way metadata on things like users, groups, permissions, roles, identifiers, retention rules, containers and classifications is kept within applications, coupled with a standard export schema for the export of such metadata. If such a standard schema existed then a records repository owner or digital archive owner could specify to the owners of applications that needed to contribute records to the repository/digital archive that they either:
- implement applications that keep records and associated metadata in that standard format, OR
- implement applications that can export metadata in the standard export format, even if the metadata within the application had been kept in a different way, OR
- develop the capability to transform exports from any of their applications into the standard export schema. This last point should be helped by the fact that any widely accepted export schema would lead to the growth of an ecosystem of suppliers with expertise in converting exports of records and metadata into that format. Indeed such a format could become a ‘lingua franca’ between different applications.
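The ‘lingua franca’ idea in the last bullet can be sketched as a per-vendor mapping table that translates each application’s native metadata fields into one standard export schema. All the field and vendor names below are invented for illustration:

```python
# Sketch of the 'lingua franca' idea: a per-vendor mapping table translates
# each application's native metadata fields into one standard export schema.
# All field names here are invented for illustration.
VENDOR_MAPPINGS = {
    "vendor_a": {"DocTitle": "title", "CreatedOn": "created", "Owner": "creator"},
    "vendor_b": {"name": "title", "date_added": "created", "author": "creator"},
}

def to_standard_schema(vendor: str, record: dict) -> dict:
    """Translate one record's native metadata into the standard field names."""
    mapping = VENDOR_MAPPINGS[vendor]
    return {standard: record[native]
            for native, standard in mapping.items() if native in record}

a = to_standard_schema("vendor_a",
                       {"DocTitle": "Minutes", "CreatedOn": "2012-06-01", "Owner": "J Smith"})
b = to_standard_schema("vendor_b",
                       {"name": "Minutes", "date_added": "2012-06-01", "author": "J Smith"})
print(a == b)  # True: two different vendor exports, one common form
```

This is exactly the kind of translation work that an ecosystem of conversion suppliers could standardise: each new mapping is written once and then reused for every transfer from that product.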
The opportunity for a link between MoReq2010 and the OAIS model
The only candidate for such a standard export format at the moment is the MoReq2010 export format, published by the DLM Forum. The DLM Forum comprises both archivists and records managers, but most of the archivists have hitherto taken relatively little interest in MoReq2010. On June 1 this year (the day after our visit to the Danish National Archives) I gave a presentation to the DLM Forum members’ meeting suggesting that the archival community should develop an extension module for MoReq2010, such that any system compliant with that module would also have the functionality necessary to operate in accordance with the OAIS model.
This would have a number of beneficial effects. For the first time in the digital age we would have a co-ordinated specification of the functionality required to manage records at all stages of their lifecycle including managing archival records.
It would also be a huge boost for MoReq2010. The first two products to be tested against MoReq2010 will be SharePoint plug-ins – one produced by GimmalSoft, one by Automated Intelligence. Let us assume that both products pass and are certified as compliant. Both products will be able to manage records within the SharePoint implementation that they are linked to. Both will be able to export records in a MoReq2010 compliant format. But there still won’t exist a system capable of routinely importing the records that they export. This is because the import features of the MoReq2010 specification are not part of the compulsory core modules of MoReq2010 – instead they are shortly to be published as a voluntary extension module.
Let us imagine that a National Archive somewhere in the world deploys a digital archive, that complies with the OAIS model, and that can import records exported from any MoReq2010 compliant system. All of a sudden there is a real incentive for that archive to influence the organisations that supply records to it to deploy MoReq compliant applications (or applications that can export in the MoReq2010 export schema, or MoReq2010 compliant records repositories). It works the other way around too. Let us imagine there is a country somewhere whose various government departments deploy MoReq2010 compliant applications. All of a sudden there is an incentive for their National Archives to deploy a digital archive that is compliant with the import module of MoReq2010 and can therefore routinely import records and metadata exported from those MoReq2010 compliant applications.
Debate at the DLM members forum on an OAIS compliant extension module for MoReq2010
The suggestion of an OAIS compliant extension module for MoReq2010 sparked off an interesting debate at the May DLM Forum members’ meeting. Tim Callister from The National Archives (TNA) in the UK and Lucia Stefan both criticised the OAIS model. They said it was designed for the needs of a very specialised sector (the space industry, with its unique formats and data types) and was not tailored to the needs of national archives, who are largely tasked with importing documents in a small range of very well understood file formats (.doc, .pdf etc.). Jan Dalsten Sorensen from the Danish National Archives defended OAIS, saying that it had given archivists a common language and common set of concepts with which to design and discuss digital archives.
I said that any digital archives extension module for MoReq2010 should be compatible with OAIS – if only because otherwise it would lose those archives (like the Danish National Archives) that had invested in that model. It would also lose the connection with all the thinking and writing about digital archives that has utilised the concepts of the OAIS model.
After the debate I spoke to an archivist from the Estonian national archives. He said that his archive didn’t want lots of metadata with the records that they accession. I said that was because the more metadata fields they specified in their transfer format, the greater the amount of work that either they or the contributing government department would have to do to get the metadata into the format needed for accessions. If their contributing government departments had systems that could export MoReq2010 compliant metadata, and if the digital archive could import from the MoReq2010 export schema, then they wouldn’t need to pick and choose the metadata – they could take the lot.