The mechanics of manage-in-place records management tools

The idea of a manage-in-place records management tool is that it holds your records classification scheme and retention rules, and applies them to content held in a variety of different content repositories/applications (SharePoint, line of business systems, e-mail archives etc.).

At the IRMS conference in Brighton in May I had conversations with several vendors of manage-in-place records management tools about how they ensure that their products can connect with the applications in day-to-day use within organisations.

The importance of APIs (application programming interfaces)

For the manage-in-place tool to work, it needs a ‘connector’ to each content repository that it is to govern.

The connectors are typically built to use the API (application programming interface) of the content repository.  The API exposes a subset of the content repository’s functionality.   It specifies how any authorised external application (in this case  a manage-in-place tool) can issue commands to the content repository.

Some of the things that a records manager might want their manage-in-place tool to do inside the various content repositories of their organisation include:

  • adding metadata to a document or aggregation of documents
  • linking an aggregation of documents to a node in a records classification
  • preventing editing or deletion of  a document or aggregations of documents
  • linking a retention rule to a piece of content or an aggregation of content
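
The four operations listed above can be pictured as a small connector interface. The following is a minimal sketch in Python, not any vendor's actual API: the class names, the method names and the in-memory stand-in repository are all hypothetical, invented purely for illustration. A real connector would translate each call into whatever commands the repository's own API offers.

```python
from abc import ABC, abstractmethod


class RepositoryConnector(ABC):
    """Hypothetical interface a manage-in-place tool might expect of every connector."""

    @abstractmethod
    def add_metadata(self, item_id, key, value): ...

    @abstractmethod
    def link_to_classification(self, item_id, node): ...

    @abstractmethod
    def lock(self, item_id): ...

    @abstractmethod
    def apply_retention_rule(self, item_id, rule_id): ...


class InMemoryRepositoryConnector(RepositoryConnector):
    """Stand-in for a real repository (assumption: items are held in a dict)."""

    def __init__(self):
        self.items = {}

    def _item(self, item_id):
        # Create the item on first use so the sketch stays short.
        return self.items.setdefault(
            item_id, {"metadata": {}, "node": None, "locked": False, "rule": None}
        )

    def add_metadata(self, item_id, key, value):
        self._item(item_id)["metadata"][key] = value

    def link_to_classification(self, item_id, node):
        self._item(item_id)["node"] = node

    def lock(self, item_id):
        self._item(item_id)["locked"] = True

    def apply_retention_rule(self, item_id, rule_id):
        self._item(item_id)["rule"] = rule_id

    def delete_item(self, item_id):
        # What an end user's delete would run into once the item is locked.
        if self._item(item_id)["locked"]:
            raise PermissionError(f"{item_id} is locked as a record")
        del self.items[item_id]
```

The manage-in-place tool would hold one such connector per repository and issue the same four calls to each, leaving the connector to deal with the differences between repositories.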

The beauty of the concept of an API is that the two applications can interact with each other without you having to customise either application.  It does not matter if the two applications are written in entirely different programming languages.  Nor does it matter if one or both of the applications are based in the cloud.

In theory:

  • you could replace your manage-in-place tool with a new manage-in-place tool from a different vendor, and none of the content repositories need notice any difference (provided that the new manage-in-place tool carried on issuing the same commands to their API)
  • you could replace a content repository with a successor repository from a different vendor without the manage-in-place tool noticing any difference (provided that the new content repository offered a similar API that enabled them to make the same commands)

In practice each vendor constructs the API for their content repository differently, and this creates two challenges for the makers of manage-in-place tools:

1) they have to construct a different connector for each different vendor’s content repository.  Two of the  manage-in-place providers I spoke to at the conference (RSD and IBM) both provided connectors to over 50 different commonly used content repositories.

2)  some APIs are better than others.  Some applications expose more functionality through their API than other applications, and hence let the manage-in-place tool do more things to their content.  One example cited was that the manage-in-place tool can get some document management systems to display the organisation’s records classification (fileplan), so that users of the document management system can link or drag and drop content to the appropriate node in the classification.  Other document management systems do not have that functionality exposed in their API.

CMIS (Content management interoperability services)

CMIS  is a specification that aims to overcome the first of these two problems.  The specification was drawn up by a coalition of vendors in the ECM space under the auspices of the OASIS Technical committee.

The idea is that vendors  add a CMIS layer to their applications.  Just like an API, the CMIS layer exposes a subset of the functionality of the native application, so that an external application can make use of that functionality.  The difference is that whereas each vendor’s API is constructed and expressed in a different way, a CMIS layer is standardised. This means that a similar function (for example ‘add a document’) would be expressed in the same way in the CMIS layer of each vendor’s products.

A manage-in-place tool vendor could choose to build connectors to the CMIS layers of content repositories, rather than to their APIs.  In theory this saves a manage-in-place vendor from building separate connectors for every different type of content repository they want their product to be able to govern.
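
To make the contrast concrete, here is a small sketch in Python of two repositories whose native APIs differ, each wrapped in a layer that offers the same call. Everything in it is hypothetical: the vendor classes and their methods are invented, and `create_document` only loosely echoes the ‘add a document’ example above. It is not the real CMIS binding, which is defined in terms of web service and HTTP requests rather than Python classes.

```python
class VendorAStore:
    """Hypothetical native API of one repository."""

    def __init__(self):
        self.docs = {}

    def upload_file(self, path, data):
        self.docs[path] = data


class VendorBStore:
    """Hypothetical native API of another repository, shaped differently."""

    def __init__(self):
        self.objects = []

    def insert_object(self, record):
        self.objects.append(record)


class VendorAStandardLayer:
    """Standardised layer over vendor A's native API."""

    def __init__(self, store):
        self.store = store

    def create_document(self, folder, name, content):
        self.store.upload_file(f"{folder}/{name}", content)


class VendorBStandardLayer:
    """Standardised layer over vendor B: same call, same arguments."""

    def __init__(self, store):
        self.store = store

    def create_document(self, folder, name, content):
        self.store.insert_object({"folder": folder, "name": name, "content": content})


def file_document(layer, folder, name, content):
    """A single connector routine that works with any repository offering the layer."""
    layer.create_document(folder, name, content)
```

With the standard layer in place, `file_document` is written once. Without it, the tool vendor would need one routine calling `upload_file` and another calling `insert_object`.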

In practice the vendors of the manage-in-place tools that I spoke to told me that they prefer to write connectors that use the API of each application, rather than the CMIS layer.   This is simply because most repositories expose more functionality through their API than through their CMIS layer.

CMIS and records management

The disadvantage of CMIS being written by vendors is that a coalition of vendors has to agree before functionality can be put into the specification. They have tried to capture concepts and functions that are common to all or most existing repositories. Functionality such as records management, which some repositories have and some don’t, has not received prominent treatment in CMIS.  The first version of CMIS had concepts such as a document, and a folder, but it did not support retention rules, nor a records classification/fileplan (although it did have the concept of a folder structure).

The latest version of CMIS (1.1) includes retention functionality for the first time.  But that has not pleased all of the vendors.  Jeff Potts of Alfresco wrote this in his blogpost announcing the approval of CMIS 1.1:

This new feature allows you to set retention periods for a piece of content or place a legal hold on content through the CMIS 1.1 API. This is useful in compliance solutions like Records Management (RM). Honestly, I am not a big fan of this feature. It seems too specific to a particular domain (RM) and I think CMIS should be more general. If you are going to start adding RM features into the spec, why not add Web Content Management (WCM) features as well? And Digital Asset Management (DAM) and so on? I’m sure it is useful, I just don’t think it belongs in the spec.

This is the dilemma for CMIS:

  • if they do not give full coverage of sets of functionality such as records management then manage-in-place tools will bypass the layer and just use the  APIs of the content repositories.
  • the more detailed and precise their definition of records management functionality is, the harder it is to get the coalition of vendors to agree on it

From a records management point of view what we want out of CMIS (or any other standard in the API space) is for it to set out a minimum set of records management functionality that the API of every business system should have.

In theory, if CMIS specified a set of API commands that would expose the functionality needed by one or more of the current electronic records management specifications, then vendors would never have to re-architect their product to meet that electronic records management specification.  All they would need to do is expose the relevant functionality in their CMIS layers and let the manage-in-place tools use that functionality to govern the content they hold.

Of course this would not solve all of our problems – some of the biggest content repositories in most organisations are simple shared network drives that don’t have an API (never mind a CMIS layer!).

Electronic files that tell the whole story of pieces of work – will we ever get there?

Here is a quote from one of the respondents to the recent survey by State Records New South Wales of e-mail usage in public authorities:

Even when emails are captured in our EDRMS (electronic documents and records management system) users focus on capturing emails from their inbox (i.e. email received) and forget about the need to capture sent emails. While it is easy to set up automated links between email folders and the EDRMS, a set and forget method, users fail to save their sent emails to the linked folder. I have failed to find an elegant, non-intrusive method to achieve the capture of the whole ‘story’.

The vision of having colleagues co-operate together to maintain a file that tells the whole story of a piece of work remains  tantalisingly out of reach, even in the case of the organisation quoted above, who seem to have done all the right things.

They have integrated their electronic records management system with the e-mail client so that folders in staff e-mail accounts can be linked to folders in the records system.  What more can they do?

The market will bring a solution to the specific ‘sent items’ problem that the respondent mentioned, through some sort of conversation threading so that sent e-mails are treated together with the e-mails that they responded to/received in response.

But at the same time it will bring different disruptive technologies – for example e-mail access on smart phones that are too small to support drag and drop to folders, or cloud e-mail that might prompt an organisation to dispense with the e-mail client that had been integrated with the electronic records system.

Technology gives with one hand and takes with the other.  And the sheer fact of constant change means that colleagues/end-users do not have enough time in any one technological configuration to develop the shared routines and habits that would lead to them keeping a complete electronic file for each piece of work that their team undertakes.

How long should an e-mail account be kept after a member of staff leaves?

On 30 May 2013 two postings appeared that between them shed light on how organisations are currently managing the archived e-mail accounts of staff who have left:

    • The first was a post by Rebecca Florence to the IRMS Records-Management-UK listserv that kicked off a debate on e-mail account retention and deletion
    • The second was a blogpost by Emma Harris of State Records New South Wales reporting the findings of a survey they had conducted into how public offices in NSW are managing their e-mail

Rebecca Florence posted a description of the situation in her organisation:

The current arrangement is that for a period of time post-leaving, access to the mailbox and email archive (in our case we use the Symantec Enterprise Vault) can be passed to a designated member of staff.

After that period of time has elapsed the mailbox/archive is deleted by IT, with the contents being exported to a separate restricted access area. Access is granted to the exported contents on a case by case basis. Currently the exported content is held indefinitely.

I should add that as you would imagine there are policies and guidance in place which advises staff to save emails where necessary outside Outlook for longer term retention and also assigning responsibility post-leaving allows for a review of any remaining emails for ongoing business use. I’m sure as most of you will have experienced, there is disparity across departments in regards to how well this is managed.

Phil Bradshaw replied that keeping records indefinitely is not the same as keeping records permanently:

  • keeping records permanently means we have assessed the records and found them to have enduring long term value
  • keeping records indefinitely means we cannot find a basis to set a retention rule on them

Is it possible to deal with e-mail by reviewing e-mail accounts when members of staff leave?

Lawrence Serewicz responded to Rebecca’s post by pointing out the legal costs and risks of maintaining all e-mail accounts indefinitely:

  • e-mail accounts generally contain personal data and the indefinite retention of entire e-mail accounts may  breach several of the EU data protection principles.
  • information held in an e-mail archive may be subject to discovery in the event of a legal case, and to disclosure in the event of an access to information request

Lawrence recommended that e-mail accounts get deleted three months after a member of staff leaves, but only after:

  • a pre-exit process in which the line manager and the employee go through the e-mail account together and decide how to deal with the mails OR
  • a post-exit process (in cases where the pre-exit process was not carried out), in which the specific service the employee worked for, Legal, HR and internal audit would all review the account.  The specific service would look for e-mails the service needed to carry on with the employee’s work; Legal would look for e-mails needed for possible legal claims, contracts or agreements; HR would look for e-mails needed for possible grievance or disciplinary issues; Internal audit would look for any illegality

The approaches described by Rebecca and Lawrence are similar in two respects:

  • both approaches reflect a belief that colleagues cannot be relied upon to comprehensively and routinely deal with individual e-mails as they go along by filing and deleting
  • both approaches  rely on a big effort just before or after  the member of staff leaves to deal with what is left in the e-mail account.  This is problematic.   All of our experience as records managers tells us that it is very hard to deal with backlogs.   E-mail communications are exchanged with such frequency that backlogs quickly scale up to a size that makes patient sifting and sorting impossible.  An e-mail account at the end of a person’s employment is in effect a filing backlog.

The only difference between the two approaches is that:

  • Rebecca’s organisation cannot guarantee that  the line manager /designated person of the departed staff member will review the e-mail content thoroughly, and move important mails to a more appropriate, more accessible place.  As a result they keep all the e-mail accounts as a back up, just in case there is an overriding need (legal or investigative) to find an e-mail from an ex member of staff.
  • Lawrence’s approach requires organisations to ‘feel the fear and do it anyway’.   There is still no guarantee that reviews have been carried out/carried out properly,  but this time the organisation presses the delete button after three months regardless.

Is it possible to deal with e-mail by asking staff to move important e-mails into an electronic or paper file as they go along?

Simon McCauley responded to Rebecca’s posting by saying that in his organisation  staff are expected to save important e-mails into the electronic document and records management system (Livelink) as they go along.

Simon’s organisation are planning to implement a policy of moving e-mails from people’s e-mail accounts to an e-mail archive six months after the date of the e-mail, then deleting them from the archive after a further twelve months.
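
The arithmetic of such a policy is simple enough to sketch. The Python below is an illustration only, not a description of how any archiving product schedules its work. The function names are invented, and the choice to clamp to the last day of a shorter month (so that 31 August plus six months gives 28 February) is an assumption.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def email_lifecycle(sent_on):
    """Return (archive_on, delete_on) for an e-mail sent or received on sent_on."""
    archive_on = add_months(sent_on, 6)      # moved to the archive after six months
    delete_on = add_months(archive_on, 12)   # deleted after a further twelve months
    return archive_on, delete_on
```

Under these assumptions an e-mail dated 30 May 2013 would move to the archive on 30 November 2013 and be deleted on 30 November 2014, eighteen months after it was sent.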

I assume that the thinking behind such a policy is that:

  • they have confidence in the capacity of their colleagues to file important e-mails as they go along
  • they know that colleagues are much less likely to file as they go along if they  have the comfort of knowing that the e-mails are kept for them in their e-mail account anyway

The  State Records Authority of New South Wales (NSW) has given similar advice to NSW public offices.   They summarise their policy as follows:

State Records advises NSW public offices to capture email messages that are sent or received in the course of official business into a corporate recordkeeping system. State Records suggests two principle methods for capturing messages:

– capturing messages into an EDRMS (electronic document and records management system)

– printing messages and capturing them on paper files

In her blogpost reporting the findings of their recent survey of e-mail management in NSW public offices, Emma Harris of State Records reported that:

– 81% of public offices agreed with the statement that in their offices ‘e-mail messages with corporate value are stored only in personal email accounts and are therefore at risk of loss or premature destruction’

– 33% of respondents advised that employees in their organisation neither capture messages to an EDRMS nor print and file them.

– few organisations have investigated alternative approaches to managing e-mails [as opposed to asking colleagues to move e-mails into EDRMS/print to file].

The blogpost went on to report:

– half of the responding organisations have implemented an archiving solution, with two products (Symantec Enterprise Vault and Quest Archives Manager) being the most commonly implemented.

– A number of email archiving solutions have retention and disposal functionality (e.g. the ability to set retention periods and disposal actions on messages and to destroy messages when retention periods have expired). However the results of the survey suggest that organisations with email archiving solutions are not actively managing the retention and disposal of messages using this functionality.

The findings betray a lack of confidence on the part of the NSW public offices in the adherence of their staff to the policy of moving e-mails to electronic or paper files. This lack of confidence is presumably what lies behind the fact that NSW public offices are, like Rebecca’s organisation, keeping e-mail accounts indefinitely.

Can we still set a blanket retention rule on e-mail accounts if we know they contain important messages that we need as records?

There is a similarity between all four approaches – Lawrence’s, Rebecca’s, Simon’s and the New South Wales approach.  All four are based on moving e-mails out of e-mail accounts.

If, like Lawrence and Simon, we are confident that we can move important e-mails out of e-mail accounts, then setting a blanket retention period on those accounts is not a problem.  We set a blanket retention period covering all accounts, and we make it as short as we possibly can to concentrate people’s minds.

But what if, like Rebecca’s organisation, like New South Wales public offices, and like most of the organisations I have worked with and spoken to over the last decade, you are not confident that important e-mails are being moved out of e-mail accounts?   Then setting a retention period is a different type of exercise.  All of a sudden we are having to recognise that the e-mail account is a record – a record of the work correspondence of that member of staff.

A blanket retention period, however short or however long, is not appropriate for organisations whose e-mail accounts contain important correspondence that is not available elsewhere.   This is because the roles people play in organisations vary greatly in their significance and impact – you are unlikely to need a record of the correspondence of an accounts clerk in your finance department for the same length of time as the correspondence of your chief executive (with all due respect to both parties).

We need to find a rationale on which to base a retention rule on e-mail accounts.   This is something we as a profession have not hitherto thought through for the simple reason that we have been battling for over a decade to avoid having to treat e-mail accounts as records.  Even starting to think through the consequences of treating e-mail accounts as records feels like an admission of defeat.  In reality this is not an admission of defeat.  Defeat would come if we gave up trying to keep manageable records of people’s work correspondence.

Getting people to move individual e-mails one-by-one to electronic files is a tactic not an end in itself.   Most organisations have not been able to make that tactic work – at the very least we need an alternative.

Establishing a defensible rationale for retention rules on e-mail accounts that we treat as records

We can set a retention period for a record of a particular type of work by considering all the different reasons why we need a record of the work in question, and then keeping  the record for the longest period that any of those needs is likely to stay valid.
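
As a worked example of that rule, the short Python sketch below picks the longest-lived need from a list. The needs and the number of years against each are invented for illustration; they are not taken from any real retention schedule.

```python
def longest_need(needs):
    """Return (reason, years) for the need that is likely to stay valid longest."""
    if not needs:
        raise ValueError("at least one recordkeeping need is required")
    reason = max(needs, key=needs.get)
    return reason, needs[reason]


# Illustrative figures only (assumptions, in years).
example_needs = {
    "continuing the work": 2,
    "audit": 7,
    "defending legal claims": 6,
}
```

With these figures the record would be kept for seven years, because the audit need outlasts the others. Returning the reason alongside the period also gives us the justification to write into the retention schedule.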

The  e-mail account of an ex member of staff is simply a record of the correspondence exchanged by a particular individual in the course of their work, minus any e-mails that have been deleted/moved.

There are multiple legitimate reasons why someone might need to look at the work correspondence of a colleague or predecessor who has left:

  • They might need to see what correspondence their colleague/predecessor had exchanged with a particular external stakeholder/partner/customer/supplier/citizen in order to inform their continuation of that relationship.
  • They might need to see what correspondence the colleague/predecessor had exchanged in the course of a piece of work because they need to continue with the piece of work, restart it, learn from it, evaluate it, copy from it etc.
  • They might need to account for their colleague/predecessor’s work, in response to audit, investigation, criticism, access to information request or legal discovery
  • Depending on the nature of the role of that individual, they might need to transfer the correspondence to a historical archive on account of the enduring public interest in the work of that individual

In most parts of most organisations we cannot adequately meet those record keeping needs without retaining the e-mail account of the member of staff concerned.   The challenge of setting a retention value on e-mail accounts is that such accounts will typically contain correspondence arising from many different pieces of work, and those pieces of work may have very different retention values.

A nice, neat approach is simply to keep the e-mails of an individual for as long as you keep the records of the main type of work that they carried out.

  • If they were an accounts clerk in a finance department, and your organisation’s retention rule on accounting work is to delete the records after seven years, then apply that rule to their e-mail account also
  • If they were a senior civil servant working on policy issues and on new legislation, and your retention rule for work on the development of legislation, and on the development of national policy, states that records should be kept for 20 years and then reviewed for permanent preservation and transfer to a historical archive, then apply that rule to their e-mail account also
  • If they worked on staff recruitment, and the retention rule for recruitment work is to delete records three years after the recruitment exercise, then retain their e-mails for three years too.

One choice to make is whether to have the retention rule:

  • applied to the entire e-mail account – so the retention rule is triggered from the moment of the individual’s departure from the organisation (this has the disadvantage that some staff may have had long and varied careers in the organisation)
  • applied to e-mails by date (month or year)  –  so the retention rule is triggered by the end of the month or year that the e-mail was sent/received in (a better option)
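
The two trigger options can be compared in a short Python sketch. The retention periods are the three examples given above (seven, twenty and three years); the table name, the function names and the decision to use the end of the calendar year as the per-e-mail trigger are assumptions made for illustration. The recruitment rule is also simplified here, since in the example above it runs from the end of the recruitment exercise.

```python
from datetime import date

# Work type -> (years to keep, action when the period ends), from the examples above.
RETENTION_BY_WORK_TYPE = {
    "accounting": (7, "delete"),
    "policy and legislation": (20, "review for permanent preservation"),
    "recruitment": (3, "delete"),
}


def add_years(d, years):
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def due_date_for_account(work_type, left_on):
    """Option 1: one rule for the whole account, triggered by the departure date."""
    years, action = RETENTION_BY_WORK_TYPE[work_type]
    return add_years(left_on, years), action


def due_date_for_email(work_type, sent_on):
    """Option 2: rule applied per e-mail, triggered at the end of the year it was sent."""
    years, action = RETENTION_BY_WORK_TYPE[work_type]
    return date(sent_on.year + years, 12, 31), action
```

For an accounts clerk who joined in 1995 and left in 2013, option 1 keeps the 1995 e-mails until 2020, twenty-five years after they were sent, whereas option 2 would have made them due for deletion at the end of 2002.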

The problem of personal data of a sensitive nature in e-mail accounts

So far so good – we have a defensible logic on which to base our retention rules on e-mail accounts, to meet the full range of records management needs.  But there is a problem.  The problem is the widespread presence of personal data of a sensitive nature in e-mail accounts.  By ‘sensitive nature’ I mean:

  • information about the e-mail account holder that they would not want even their closest colleagues or their successor to access; and
  • information about a third party that the e-mail account holder corresponded with, or had discussed in e-mails, where that person could be disadvantaged if the information were to be made available even just to the account holder’s successor and closest colleagues

Even if an individual never used their work e-mail account for non-work correspondence with friends and family, their account is still likely to contain personal information of a sensitive nature, exchanged with colleagues.  Think of an e-mail exchange between a line manager and a member of their team who had to take time off work for personal or family reasons.

The fact that most e-mail accounts have not had such e-mails filtered out means that most organisations in my experience (centred around the UK and Europe) cannot currently allow colleagues routine access to the e-mail accounts of their predecessor, or their former colleagues.

Most organisations struggle to set access rules on e-mail accounts

Most electronic document management systems work on the principle that access permissions can be set for objects or aggregations of objects (file/folder/site/library/document etc.).   A person or group of people is either permitted or forbidden to access that object/aggregation.   There are no grey areas in between.  If I  am authorised to see a document then the system merely asks me to authenticate myself (so the system knows it is indeed me who is asking) .   It does not ask me why I want to see it.

Rebecca’s organisation allows access to archived e-mail on ‘a case by case’ basis.  In other words they are unable to tell their e-mail archiving tool in advance who is authorised to access each e-mail account.

With e-mail archives the information contained in the archive is so sensitive that organisations are imposing an extra control – people are having to say why they need to access the e-mail account, and that request is either permitted or denied, not by the e-mail archive itself, but by people in the department responsible for overseeing the archive.

I worked with one organisation where any application to see e-mail accounts of former staff had to be approved by their human resources (HR) department, who would only allow consultation in exceptional circumstances where there was no other way of getting the information.   One individual told me that they had wanted to access the correspondence that a former colleague had exchanged with a supplier about a particular contract, but HR had refused.

That HR department had no option but to be restrictive.  Imagine this scenario:  I work with a colleague, and develop malicious intent, or an unhealthy curiosity, towards them.  They leave.  I think of a project that they worked on and say to the IT department that I need to look through their e-mail account to find records relating to that project.  What else might I look for/find?  That is why governance of e-mail archives is vital, including keeping non-deletable records of who searched for what terms under what authority, and what e-mails they opened and looked at.  This must include any searches made by any staff, whether end users or IT system administrators.
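
One way to picture such an access record is as an append-only log in which each entry carries a hash of the entry before it. The Python sketch below is an assumption about how this might be done, not a description of what any e-mail archiving product does; the class and field names are invented. Note that hash chaining makes a deleted or altered entry detectable rather than making deletion impossible, and that timestamps are left out to keep the example short.

```python
import hashlib
import json


def _digest(body):
    """SHA-256 over a canonical JSON rendering of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AccessLog:
    """Append-only log of searches made in an e-mail archive (hypothetical)."""

    def __init__(self):
        self._entries = []

    def record(self, user, authority, search_terms, opened_ids):
        # Each entry stores the hash of its predecessor, forming a chain.
        previous = self._entries[-1]["hash"] if self._entries else ""
        body = {
            "user": user,
            "authority": authority,
            "search_terms": list(search_terms),
            "opened": list(opened_ids),
            "prev": previous,
        }
        self._entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Return True only if no entry has been altered, removed or reordered."""
        previous = ""
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev"] != previous or _digest(body) != entry["hash"]:
                return False
            previous = entry["hash"]
        return True
```

The same `record` call would be made whether the search came from an end user or from an IT system administrator, so that both leave the same kind of trace.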

Is there any point in setting a retention rule that covers all the record keeping needs arising from an e-mail account if we cannot allow colleagues to access the e-mail accounts for those purposes?

The retention rule that we arrived at above was based on the full range of recordkeeping needs that we have in relation to the correspondence of an individual who is a close colleague or predecessor.  We now find that we cannot allow access to the e-mail accounts, even to close colleagues, for most of these purposes, because of the presence of personal information of a sensitive nature that is unmarked, unflagged, and undifferentiated from the rest of the mails in the e-mail account.

If we can only access e-mail accounts in response to overriding imperatives such as access to information requests, e-discovery requests and the need to defend or prosecute any legal case we might be involved in,   then should that be the only consideration we take into account in setting our retention rule? Should we only retain e-mail accounts for the period in which it is useful for us to have them in case of legal dispute?

If we only take into account the overriding imperatives of legal disputes and access to information requests then the logic for setting a retention rule becomes much more arbitrary:

  • if we adjudge the cost/risk of the e-mail accounts being subject to an access to information/e-discovery request to be greater than the benefit of being able to use the e-mail accounts to support any case we would need to make in court, then we would impose a short retention period – perhaps the three months that Lawrence suggested
  • if we adjudge the benefit of being able to use the e-mail accounts of former members of staff to support any legal case we might want or need to make to be greater than the cost/risk of servicing access to information and e-discovery requests then we are likely to set a retention rule equivalent to a standard limitation period of seven years as Simon suggested (though you need to be careful with limitation periods – in some cases the clock of a limitation period may not start ticking until well after a member of staff leaves – for example if the person was working on designing a bridge, or a drug, or with children etc.)

The problem with this very pragmatic approach is that we will continue to fail to meet the day-to-day record keeping needs of our colleagues when they start a new job, and when they need to look back at the work of former colleagues.   And we will not be able to make the record of the work correspondence of people playing important roles in society available to future generations of policy makers, researchers and historians.

In his excellent Digital Preservation Coalition Technology Watch Report   on e-mail  Christopher Prom reported:

Winton Solberg, an eminent historian of American higher education, remarked … ‘historical research will be absolutely impossible in the future unless your profession finds a way to save email’ (Technology Watch Report 11-01: Preserving Email [PDF 916KB] by Christopher J Prom 2011,  page 5)

I will go one further and say that if we can solve the challenge of how we provide an individual with routine access to the e-mail account of their predecessor, then we will be able to solve the challenge of how we provide access to that e-mail account to historians or other researchers further down the line.  The two challenges are inextricably linked.

Many of our organisations have e-mail archiving tools, but these archives function as a murky subconscious of the organisation, full of toxic secrets, inaccessible to the organisation in its normal day-to-day functioning, and they pose a huge, ongoing information governance risk.

What we need is an approach to e-mail that results in staff leaving behind an e-mail account that their colleagues and successor can routinely access and use, without unduly harming either the account holder or people mentioned in their correspondence; and that we as an organisation can apply defensible access rules and retention rules to.

It is beyond the ability of a single organisation to develop such an approach (because it involves changes to available tools, changes to the way we think of an e-mail account, and changes to how we ask our colleagues to treat e-mail).  But it is well within the capability of the records management/archives professions to articulate such an approach, and then incentivise and cajole vendors (particularly the ecosystem around the big on-premise and cloud e-mail products/services) to create offerings that match it.

As a starting point I would like to see us as records managers and archivists getting this issue on the agenda of our organisations and of society more widely.

Two quick suggestions to get the ball rolling:

  • For records managers – if you are concerned that important e-mails are not being moved out of e-mail accounts, consider broaching the emotive subject of e-mail accounts when building or revising your organisation’s records retention schedule.  Include in the retention schedule a list of those post holders in your organisation whose e-mail account contents you require to be retained for a minimum of 20 years
  • For archivists working for the national archives of our nations – if you are concerned that important e-mails in government departments/ministries in your country are not being moved out of e-mail accounts, then when you draw up or revise your selection policies, include a list of posts in the various government bodies from which you require e-mail account contents to be appraised for permanent preservation in your archives