The core components of the new generation of records management/information governance tools

In my last post I drew a distinction between two generations of records management tools:

  • The first generation comprises tools that hit the market between 1997 and 2009; we called them electronic document and records management (EDRM) systems
  • The second generation comprises tools that hit the market after 2009; we seem to be calling them information governance tools

In this post I will look again at this distinction – this time comparing the components and capabilities of the old EDRM systems with the components and capabilities of the newer information governance tools.

The core components of the first generation of records management tools (EDRM systems)

The first generation of tools consisted of six core components/capabilities:

  • an end-user interface:  to allow end-users to upload documents directly to the system
  • e-mail client integration (usually with Outlook):  to allow end-users to drag and drop e-mails into folders within the system
  • document management features:  such as version control, check-in and check-out, and generic and configurable workflow capabilities
  • a repository:  to store any type of documentation that the organisation might produce
  • classification and retention rules:  the capability to hold, link together and apply a records classification (business classification scheme) and a set of retention rules
  • records protection:  the capability to protect records from amendment and deletion and to maintain an audit trail of events in the life of each record

When implementing such EDRM systems the records managers drew a ‘line in the sand’.  They aimed to implement a system that would manage records going forward in time, and did not attempt to deal with legacy content that had already accumulated on shared drives and in e-mail.

The weakness of EDRM systems was that end users failed to move all, or even most, significant content into the records system.  Shared drives and e-mail accounts continued to grow, and continued to contain important content that was never captured into the records system.

Added to this a range of disruptions happened:

  • Microsoft’s entry into the content management space with SharePoint 2007 took the collaboration space away from the EDRM systems.   Unless they had complex requirements, organisations with SharePoint no longer needed the version control, check-in/check-out or workflow capabilities of the EDRM tools.
  • eDiscovery, freedom of information and subject access requests caused more and more pain to organisations, and tended to focus on material in e-mail and shared drives rather than content in the EDRM system
  • The move to smartphones and tablets made the user interface problematic – smartphone screens are too small for the full functionality of an EDRM end-user interface.
  • The move to the cloud made e-mail integration problematic – cloud e-mail services do not allow customisation of their user interface.

The seven core components of the new generation of records management/information governance tools

The second generation of records management tools, which we are calling information governance tools, consists of seven key capabilities:

  • Indexing engine:  the ability to crawl and index content in many different applications and repositories (shared drives, SharePoint, e-mail servers, line-of-business systems etc.)
  • Connectors:  a set of connectors to the most common applications and repositories in use in organisations today (SharePoint, Exchange, ECM/EDRM systems etc.).   The connectors enable the records system to take action on content in a target repository – for example to delete it, move it or place a legal hold on it.  They also enable the crawler to extract content to index.
  • Metadata enhancement and auto-classification:  the ability to add, through the connectors, extra metadata fields to content, and the ability to assign content to a classification either by setting rules based on parameters or by using auto-classification algorithms
  • Analytics dashboard:  to surface patterns in content repositories – for example to identify duplication, redundancy, trivia and high-risk content
  • Classification and retention:  the capability to hold and apply a records classification and a set of retention rules – this is the main point of continuity between the first and second generation of records management tools
  • In-place records management:  the capability to protect records from amendment and deletion, maintain an audit trail of events in the life of each record, and assign a retention and classification rule to the record, even where the record is held in a different application from the records system itself.  From the end-user’s point of view this has the advantage that they can stay in the applications they are used to working in – they do not have to learn how to use the records system.
  • Repository:  a repository to store any type of documentation that the organisation might produce.   The in-place records management features reduce, but do not eliminate, the need for a records repository.  A records repository is necessary when an organisation wants to decommission an application but still wants to retain the content from that application.  In cloud scenarios the repository is useful when the organisation wants content to be available via a cloud application but not stored by the cloud provider.
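The classification and retention component can be pictured as a simple data structure linking nodes of the business classification scheme to retention rules.  The sketch below is purely illustrative – the classification labels, field names and retention periods are invented for the example, and real products hold far richer schedules:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class RetentionRule:
    classification: str   # node in the business classification scheme
    retain_years: int     # keep for this long after the trigger event
    trigger: str          # event that starts the clock, e.g. "date_closed"

# A hypothetical fragment of a retention schedule
schedule = [
    RetentionRule("Finance/Invoices", retain_years=7, trigger="date_closed"),
    RetentionRule("HR/Recruitment", retain_years=2, trigger="date_closed"),
]

def disposal_date(rule: RetentionRule, trigger_date: date) -> date:
    """Earliest date on which a record governed by this rule may be
    disposed of (leap-day edge cases ignored for brevity)."""
    return trigger_date.replace(year=trigger_date.year + rule.retain_years)

# An invoice file closed on 31 March 2014 becomes disposable in 2021
print(disposal_date(schedule[0], date(2014, 3, 31)))  # → 2021-03-31
```

The point of holding the rules as data, rather than burying them in each application, is that a single schedule can then be applied – through the connectors – to content wherever it lives.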

Notice what has been taken away and what has been added:

  • The components that an end-user interacted with – the end-user interface and the document management functionality – have either disappeared entirely or become an optional extra.
  • In their place come the connectors, indexing engine, analytics and in-place records management capability needed for a central administrator to understand and act on content held outside the records system itself.


The importance of the analytics dashboard

The key difference between the new generation of information governance tools and the old generation of EDRM systems is that the information governance tools pay as much (often more) attention to existing content as they do to shaping the way future content will accumulate.

The most stark illustration of the change is this:

  • Ten years ago, if you saw a system demonstration by a vendor at a records management event, they would start by showing you their end-user interface for an individual to upload a document.
  • In 2014 a vendor will start by showing you their analytics dashboard.

The analytics dashboard is the key to the new generation of records management/information governance tools.

Without the dashboard, having an indexing engine crawling across shared drives, e-mail and SharePoint would be useless to the records manager.

The dashboard enables the records manager to actively interrogate the index and home in on targets for action – information that should be deleted, moved, protected, classified, assigned to a retention rule etc.


A typical dashboard shows the records manager how much content is held, where it is held, what file types there are, which departments it belongs to, what is redundant/outdated/trivial etc.   The dashboard also enables the records manager to use these different dimensions in combination with each other – for example to home in on the content of a particular department in a particular time period.
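The way a dashboard combines dimensions can be illustrated in a few lines of code.  This is a toy sketch rather than any vendor’s API – the index records, field names and filter function are all invented for the example:

```python
from datetime import date

# A hypothetical slice of the index: one metadata record per crawled file
index = [
    {"path": "S:/finance/budget-2011.xlsx", "dept": "Finance",
     "modified": date(2011, 3, 1), "size_mb": 2.4},
    {"path": "S:/finance/budget-2011-copy.xlsx", "dept": "Finance",
     "modified": date(2011, 3, 1), "size_mb": 2.4},
    {"path": "S:/hr/newsletter.docx", "dept": "HR",
     "modified": date(2013, 6, 12), "size_mb": 0.1},
]

def home_in(index, dept=None, before=None):
    """Combine dashboard dimensions (department, last-modified date)
    to narrow the index down to a target set for action."""
    hits = index
    if dept is not None:
        hits = [r for r in hits if r["dept"] == dept]
    if before is not None:
        hits = [r for r in hits if r["modified"] < before]
    return hits

# e.g. Finance content last modified before 2012 – candidates for review
targets = home_in(index, dept="Finance", before=date(2012, 1, 1))
print(len(targets))  # → 2
```

Each extra dimension (file type, owner, duplication flag) is just another filter over the same index, which is why the dimensions can be used in any combination.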

These are powerful tools in the hands of a central administrator, and it is important that they include workflows and audit trails so that:

  • the records manager can get the approval of content owners before making disposal decisions on content
  • the system can record that approval, and record the actioning of the decision

Note however that these tools are more effective at helping records managers make decisions on content that has built up in the shared drive and SharePoint environment than they are at dealing with content that has built up in e-mail accounts.

One of the challenges with EDRM systems was that it was very hard to measure benefit and demonstrate a tangible ROI.   The business case for the new information governance tools often arises from savings produced by dealing with legacy data – something that the EDRM systems were not set up to do.  The ROI might come from:

  • savings from storage optimisation (moving less active content to second or third tier storage)
  • savings from reduction of content that has to be reviewed for eDiscovery/access to information requests

Other benefits might include:

  • capability to move content from legacy applications
  • capability to process the shared drives of functions acquired or divested in mergers and acquisitions

At the ARMA Europe conference last month Richard Hale from Active Navigation and Lee Meyrick from Nuix both gave presentations urging records professionals to be pragmatic and to concentrate on targeting particular improvements one at a time.  The dashboard suits that approach – gone is the utopian wish to create a perfect records system; instead we have an incremental approach whereby a central administrator homes in on particular areas of content for protection/enhancement/migration.

The strengths and weaknesses of the information governance approach to records management

It is clear that a new records management approach is emerging.   We can see the signs:

  • a new set of tools has emerged that seek to enable organisations to mitigate and adapt to the fact that their records are largely held in places – such as e-mail accounts, shared drives and SharePoint team sites – that are difficult for the organisation to manage.    In contrast, the previous records management approach (the electronic records management approach) sought to move records out of these repositories into an environment that was easy for the organisation to manage – a standards-compliant electronic records management system
  • a new set of beliefs is forming about records: the belief that the burden of records management on end-users should be minimised; the belief that end-users cannot be relied upon to make consistent decisions on whether or not particular documents or e-mails need to be captured as records; and the belief that organisations should no longer try to distinguish between records and non-records, or records systems and non-records systems, but should instead recognise that all the content they hold needs to be managed and accounted for
  • new guidance has been issued by influential bodies – the US National Archives has told federal agencies to designate the e-mail accounts of key staff for permanent preservation if they have not been able to find an alternative way of routinely capturing important e-mails as records.   It has asked agencies to innovate and look for new ways of automating records management to reduce the burden on end users, and it has asked vendors for their ideas and help in automating records management

The best way of understanding the new approach is to compare it with what has gone before.   In the history of records management  we have had periods where the profession has had a coherent approach to offer organisations, interspersed with periods of disruption during which technological and communications developments have made an existing approach untenable:

  • The registry approach (1950s to early 1990s).  In the paper age the registry approach involved employing teams of records clerks (grouped into ‘registries’) to maintain files and place correspondence and documentation onto those files.     In effect this meant that an organisation of 1,000 people that wanted good records management across all its activities would deploy around 40 people to capture all the records of the organisation.  Post would arrive in a post room and then be sent to a records registry for filing before being delivered to the individual addressee for action.
  • The disruption of the arrival of e-mail (1993 to 1999).   The arrival of e-mail meant it was no longer possible to route the flow of incoming and outgoing correspondence through registries – instead correspondence went directly from sender to recipient, with no space for intermediaries.
  • The electronic records management system approach (2000 to 2007).  With this approach an organisation asked every individual within the organisation to declare every important e-mail/document that they created or received to the electronic records management system.   In effect this meant that an organisation of 1,000 people was asking all 1,000 people to take decisions on what got captured as a record and where it was placed within the records classification/filing structure.  The benefit was that every document declared into those systems was well described and well governed.  The problem was that records capture was no longer routine and no longer integrated into the process by which people exchanged written communications.  Instead records capture was an afterthought, dependent on the motivation, awareness and workload of individual staff.
  • The disruption of the rise of SharePoint (2008 to 2011).  The rise of SharePoint destroyed the market position of electronic records management systems by capturing the collaborative space, without in itself offering a workable records management alternative.
  • The information governance approach (2012 – ).  In this emerging approach an organisation gives a central information governance or records management unit tools to index, clean up, and apply classification and retention rules to the various applications/repositories that the organisation is using.  This means an organisation of 1,000 people is asking 3 or 4 members of staff to make records organisation and disposition decisions for the whole organisation.

Strengths of the information governance model

Ability to deliver quick wins

The main strength of the information governance approach is that it enables organisations to take pragmatic measures in the short term to tackle key pain points or cost points:

  • If an organisation is paying a fortune to keep its entire shared drive on expensive first-tier storage, it can use analytics tools to identify redundant, outdated, trivial or rarely accessed content and either dispose of it or move it to cheaper storage.  Organisations often lease such tools on a short-term basis for particular projects/cases, rather than taking out perpetual licences.
  • If an organisation is facing problems responding to eDiscovery requests, it (or its eDiscovery service provider) can deploy an indexing engine to index shared drives, e-mail accounts and SharePoint sites; apply legal holds; and support the review process
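The storage-optimisation savings in the first bullet often start with exact-duplicate detection, which is straightforward to sketch: hash each file’s content and group files whose hashes collide.  A minimal illustration follows – real clean-up tools hash large files in chunks and add near-duplicate and last-accessed analysis on top:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root):
    """Group files under `root` by the SHA-256 of their content.
    Any group containing more than one path is a set of exact duplicates."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]

def reclaimable_bytes(duplicate_groups):
    """Storage freed if every copy beyond the first in each group is removed."""
    return sum((len(g) - 1) * g[0].stat().st_size for g in duplicate_groups)
```

A report of duplicate groups and reclaimable bytes is exactly the kind of figure that turns a records clean-up into a business case with a measurable ROI.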

This contrasts with the electronic records management system approach, which typically took several years to begin delivering benefits.

Less need for change management

The second strength of the approach is that its success does not depend on end-users changing their behaviour.   In-place records management tools, SharePoint records management plug-ins and, to an extent, e-mail archive systems allow the application of records classification and retention rules without end users leaving the applications that they work in.   It is not that these tools have no impact on end-users (they might require an end-user to act when creating a new folder/document library, for example), but they have far less impact than the electronic records management systems approach.

The problem with the electronic records management system approach being so dependent on end-users changing their behaviour was not so much the resource implications of the necessary training.    It was the fact that there was no certainty that the change management would succeed.  A significant number of electronic records management system projects were abandoned for lack of user buy-in.

Possibility of extending, rather than abandoning, electronic records management systems

Another strength of the information governance model is that it enables those organisations that did manage to establish electronic records management systems to extend the reach of those systems to shared drives, e-mail accounts and SharePoint sites.   This could be done by using an analytics/indexing/clean-up/in-place records management tool to move selected content from repositories such as shared drives into the electronic records management system.

Weaknesses of the information governance model

Lack of a fully worked through theory

The information governance model is an emerging model; it is not yet fully worked through.  In particular there is no coherent body of theory and guidance yet.   As an illustration of this, in 2013 we saw the US National Archives appealing to vendors for ideas on how to automate records management.  This is in stark contrast to the situation at the end of the 1990s, when various national archives around the world were specifying to vendors exactly what functionality an electronic records management system should have.

Focus on the compliance requirements of external stakeholders at the expense of the day-to-day needs of internal end-users

If one of the main strengths of the information governance model is the reduction in burden on the end-user, the main weakness of the model is a lack of clear benefit for end-users.

For example:

  • indexing engines (such as those provided by Nuix and Zylab) act in effect like enterprise search engines, albeit with the additional ability to take action on content rather than simply find it.  They can be shone across e-mail servers, shared drives, SharePoint, line-of-business systems etc.   But unlike enterprise search engines these indexing engines are not intended for end-users.  They are intended for central administrators and those charged with dealing with eDiscovery and access-to-information requests.   Most organisations buying indexing engines such as Nuix do not provide end-users with an interface to these products.  Providing such an interface is not feasible, because the whole point of such tools is that they can be used to search ‘dark data’ – material in e-mail accounts that contains a mixture of harmless, useful, harmful, useless and private content.  This means that indexing engines can be used by administrators/legal counsel to service the needs of external requestors (often hostile to the organisation), but cannot be used to service the day-to-day information needs of internal users who wish, for example, to know what a predecessor had said to a particular stakeholder/customer/client/citizen/lobbyist/regulator.
  • Electronic records management systems aimed to create an electronic ‘file’ that told the whole story of a piece of work (much as good paper files used to), and that functioned as a single point of reference for that piece of work.  In other words they were trying to ‘shape’ the way records accumulated in a way that was useful (or was thought to be useful) both to the individuals carrying out the work and to any future stakeholders.  In contrast, in-place records management tools attempt to apply policies (classifications with linked retention, and perhaps access, rules) to content held in different repositories.  They are not looking to shape the way records accumulate.   They are not looking to create a single point of reference for a particular activity.  They are instead looking to make sure that the organisation can apply an appropriate classification and retention rule to all content.

The lack of focus on the information needs of internal end-users is not surprising.  This is an information governance approach being adopted as a records management approach.   Information governance and records management are different professional perspectives, with different histories and aims.  Neither can be reduced to the other without giving up some of its core aspirations.  The aspiration that records management gives up if it takes the information governance approach as it stands, without adding to it or reshaping it, is the aspiration to design records systems that are as useful for the day-to-day needs of internal end users as they are for compliance with the requirements of external stakeholders.   The information governance model, as it currently stands, places a strong emphasis on the latter to the neglect of the former.

The Ontario e-mail deletion scandal – part 10 – Why had no-one mentioned the e-mail archive?

The story so far….
In September 2011, just before a general election, Ontario’s Minister of Energy announced the cancellation and relocation of a controversial gas plant.

In May 2012 the Estimates Committee of the Parliament of Ontario requested to see the correspondence relating to the decision.  They received no correspondence from any of the political staff working in the Office of the Minister of Energy, nor from those working in the Office of the Premier of Ontario.

Craig MacLennan was Chief of Staff to the Minister of Energy when the gas plant decision was taken.  He left his post and Ontario’s public service in August 2012.

In April 2013 MacLennan was questioned as to why he had not returned any records responsive to the Estimates committee’s request.  He said that he had been unable to return any responsive records because he kept ‘a clean in-box’ and routinely deleted his e-mails.

MacLennan’s statement was reported to Ann Cavoukian, Ontario’s Information and Privacy Commissioner.  She investigated, and reported in June 2013 that MacLennan’s e-mails were not recoverable (Ontario’s policy is to delete e-mail accounts when members of staff leave the service).

In July 2013 Cavoukian was contacted by the Ministry of Government Services who told her that a portion of MacLennan’s e-mail account had been found in their ‘Enterprise Vault’ e-mail archive.   This portion of his account comprised 39,000 e-mails of which 1,800 related to the gas plant issue.







Next episode

The Commissioner issues an addendum to her report

The Ontario e-mail deletion scandal (part 9) – a portion of the e-mail account is found

The story so far:

Craig MacLennan was Chief of Staff to Ontario’s Minister of Energy when controversial decisions were made to cancel and relocate two gas plants.   In April 2013 he told a committee of the Ontario Parliament that the reason he had not provided any documents in response to a committee request for all correspondence relating to the gas plant decisions was on account of his habit of ‘keeping a clean inbox’.

MacLennan’s comment sparked an investigation by Ann Cavoukian,  Ontario’s  Information and Privacy Commissioner, as to whether or not this practice of e-mail deletion constituted a breach of Ontario’s Archives and Recordkeeping Act.

In June 2013 Cavoukian concluded that MacLennan’s e-mail account was not recoverable (Ontario’s policy was to delete e-mail accounts when staff left), and that a practice of routinely and indiscriminately deleting e-mails was indeed contrary to the Archives and Recordkeeping Act.






Next episode:

Why had nobody mentioned the e-mail archive to the Information and Privacy Commissioner when she conducted her original investigation?

The new wave of information governance tools – what do they mean for records management?

A new wave of tools has hit the records management space over the past two or three years:

  • eDiscovery indexing engines that aim to index all of an organisation’s content across however many repositories/applications it uses
  • in-place records management tools that aim to apply classification and retention rules to content regardless of the repository/application in which it is kept
  • e-mail archive tools that are more ambitious than the previous generation of e-mail archives, and which, for example, offer features supporting the auto-classification of e-mail
  • plug-ins for SharePoint to bridge the gaps in records management functionality
  • clean-up tools for shared drives

All of these tools can be placed under the collective description of ‘information governance tools’.  But they are very different from each other.

One way in which we can categorise these new tools is by the ambitions of the organisations deploying them:

  • An organisation with a big eDiscovery bill but little records management ambition will turn to indexing engines for reassurance that it can identify material responsive to litigation cases even if individual staff continue to leave correspondence in their e-mail accounts and do nothing more with documents than stick them in a folder on a shared drive.
  • Organisations wanting to reduce the burden of maintaining (and reviewing when an eDiscovery/access to information request comes in) vast volumes of documentation on shared drives will deploy a shared drive clean-up tool to protect and classify material of value, whilst identifying and getting rid of ROT (redundant, outdated and trivial) documentation.
  • Organisations concerned about e-mail volumes, and about the potential existence of toxic comments and information within e-mails, will deploy e-mail archiving tools.  They will either use auto-classification features to filter e-mails into categories and apply disposition rules, or they will use analytics features aimed at identifying high-risk/trivial/private communications.  Auto-classification tools are indeed getting more and more sophisticated, but their application is still crude.  You need to train an auto-classification tool to recognise material relevant to every single category in whatever classification you are using.  The more granular the classification, the more training you have to give the tool.  For the time being at least, you can only realistically auto-classify into ‘big buckets’.
  • Organisations with records management ambitions and big investments in SharePoint will deploy SharePoint plug-ins.  The plug-ins will enable them to:  link retention rules to their records classification;  apply the classification and retention rules to different types of SharePoint objects (folders/content types/libraries/sites);  and better import and export content into and out of SharePoint.  The plug-ins will also give them e-mail integration so that staff can drag and drop e-mail into SharePoint libraries.  The challenge here is that even a drag-and-drop facility does not on its own motivate end-users to consistently move significant material out of their e-mail accounts.
  • Organisations wanting to manage records across several different environments (typically shared drives, e-mail and SharePoint), without asking staff to move records into a separate electronic records management system, will deploy in-place records management tools.  In effect these tools will give you much of what a SharePoint plug-in and a shared drive clean-up/governance tool do, and some of what an e-mail archive does.  They will also offer connectors to the major enterprise content management/document management products on the market.   So long as the vendor of the tool remains viable and keeps coming up with connectors for new repositories, you in theory have an approach to managing records across your whole IT estate.   The success of in-place vendors will depend on the extent to which they can convince organisations that the synergy of having one tool to manage records across many applications outweighs the option of picking best-of-breed solutions for each particular application/repository (e-mail/shared drive/SharePoint).     The challenge for organisations is that each repository/application that they are trying to govern has its own unique features, structure and functionality.   This means that the in-place tool has to work in a different way to govern content in each separate application, and the deployment of an in-place tool to each different application/repository is a separate project in its own right.
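The ‘big buckets’ point can be made concrete with a toy classifier.  The sketch below is a bare-bones multinomial Naive Bayes in pure Python – the buckets and training snippets are invented, and real auto-classification engines use far richer features and models – but it shows why granularity is expensive: every category needs its own pile of training examples, so the finer the classification, the more training it demands:

```python
from collections import Counter, defaultdict
import math

class BigBucketClassifier:
    """Minimal multinomial Naive Bayes for 'big bucket' auto-classification."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word frequencies
        self.doc_counts = Counter()              # bucket -> training doc count
        self.vocab = set()

    def train(self, bucket, text):
        words = text.lower().split()
        self.word_counts[bucket].update(words)
        self.doc_counts[bucket] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_bucket, best_score = None, float("-inf")
        for bucket in self.doc_counts:
            # log prior + log likelihood with add-one smoothing
            score = math.log(self.doc_counts[bucket] / total_docs)
            bucket_total = sum(self.word_counts[bucket].values())
            for w in words:
                score += math.log(
                    (self.word_counts[bucket][w] + 1)
                    / (bucket_total + len(self.vocab)))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket

clf = BigBucketClassifier()
clf.train("Finance", "invoice payment quarterly budget forecast")
clf.train("Finance", "purchase order invoice supplier payment")
clf.train("HR", "annual leave appraisal recruitment contract")
clf.train("HR", "appraisal training recruitment interview")
print(clf.classify("supplier invoice awaiting payment"))  # → Finance
```

Two broad buckets can be trained with a handful of examples; a 200-node business classification scheme would need representative training material for all 200 nodes, which is exactly why current deployments stay coarse.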

In practice we are already seeing convergence between these products, as vendors either move into each other’s territory or ally with each other.  We are also seeing convergence between these products and the previous generation of electronic records management systems:

  • We are seeing vendors of indexing engines such as Zylab and Nuix moving more deeply into information governance by adding functionality to clean up and apply rules to the repositories that they have indexed.
  • We are seeing the vendors of traditional electronic records management systems such as IBM and HP Autonomy use their electronic records management systems as repositories  behind their in-place records management  offerings.  Even when an organisation adopts an in-place records management approach they are still going to want to decommission applications at some point and hence need to be able to move content out of those applications into a repository.
  • We are seeing alliances between vendors of these different products – for example that between RSD and Nuix to offer both in-place records management and indexing/eDiscovery capabilities.

There are numerous questions for us to explore concerning what the implications of the rise of these tools are for records management.

  • Is records management being subsumed into information governance – or is it a separate discipline that will help to shape information governance but will retain its own distinct identity and purpose?
  • What are the fundamental differences between this wave of information governance tools, and the wave of ‘electronic document and records management systems’ that dominated the records management market between 1999 and 2009?
  • Does this new wave of information governance form part of a wider change in records management paradigm? Are we seeing a new model of how records management should be tackled?    If so to what extent is it a fully worked through paradigm?  What are the underpinning set of beliefs behind it?   Is there a body of theory behind it?
  • To what extent will such a new records management paradigm meet the aspirations of the profession?   Will it work in practice?  What would it need from organisations, from records managers, from end-users and from vendors in order to make it work?

The Ontario e-mail deletion scandal (part 8) – the Commissioner makes her report

The story so far:  In April and May 2013 Ontario’s Information and Privacy Commissioner carried out an investigation into allegations that Craig MacLennan, former Chief of Staff to the former Minister of Energy, had deleted e-mails relating to controversial gas plant cancellations in 2010 and 2011.   She was now ready to report.



Next episode:   Debates in Committee

The Ontario e-mail deletion scandal (part 7) – reaction to the retention schedule

The story so far:    Ontario’s Information and Privacy Commissioner was interviewing Craig MacLennan (former Chief of Staff to Ontario’s Minister of Energy) as part of her investigation into whether his practice of routinely deleting his e-mails contravened both Ontario’s Archives and Recordkeeping legislation and the retention schedules drawn up by Ontario’s archives.    The Commissioner has shown Craig the retention schedule and asked for his reaction.


Next episode:   the Commissioner reports