The SharePoint records retention model in Office 365

At an IRMS public sector group meeting in London yesterday I heard Rob Bath compare the records retention model offered within the on-premise SharePoint with that offered in Office 365.

Rob described the two models available in on-premise SharePoint:

  • the record centre model in which important items are moved to a records centre – the disadvantage of this model is that it rips content out of context by taking it from the SharePoint sites in which users had interacted with it;
  • the in-place record management model in which end users can click a button to identify individual items as records – the disadvantage of this model is that SharePoint gives no reporting capability for information managers to see and manage the items scattered across their SharePoint implementation that have been declared as records.

Both the record centre model and the in-place model are still available within SharePoint Online in Office 365.  However, Office 365 offers another way of managing the retention of SharePoint Online content.  This new method does not sit within SharePoint Online itself.  It sits in the Office 365 Security and Compliance centre, which exists to provide a means of managing content across the Office 365 family of applications.

The Security and Compliance centre provides a facility to set up retention labels and retention policies:

  • retention labels can be applied to containers within SharePoint sites such as libraries or folders;
  • retention policies can be applied at SharePoint site level.

Rob’s conclusion was that the retention labels/retention policies model offered in the Office 365 Security and Compliance centre was both a simpler and a more effective way of managing SharePoint content than the two models available within SharePoint itself.  One member of the audience asked him whether there were any circumstances in which he would recommend the use of either the record centre model or the in-place model in place of (or in conjunction with) the retention labels/retention policies model within Office 365.  Rob thought for a second and then said ‘no’.

In the rest of this post I will offer some thoughts on why it is that after so many years of coming up with such unwieldy retention offerings in SharePoint, Microsoft have come up with something so much better for Office 365.

The need for Office 365 to have a retention model that went beyond SharePoint

Microsoft wanted a records retention model for Office 365 that was not unique to any one particular Office 365 application, but which could be applied to all of the major applications within the Office 365 family.  This forced them to come up with a model that was not based on any features specific to SharePoint itself.  In particular it led them to move away from the linkage between records retention and SharePoint content types.

The need to come up with a model that could be applied to applications as diverse as SharePoint Online, the Exchange Online email system, and the OneDrive filesharing application meant that Microsoft had to look for a common denominator between the different applications.  The one common denominator between the applications is that they all aggregate content.

The records retention model in Office 365

The retention model in Office 365 is very simple.   For each application a fundamental aggregation is identified.   In Exchange it is the email account.  In SharePoint it is the site.  In OneDrive it is the OneDrive account, etc. The Office 365 Security and Compliance centre allows you to use the fact that all content in SharePoint is held within sites, all content in Exchange is held within email accounts etc. to apply your retention rules.

The Security and Compliance Centre offers essentially two different strategies for linking retention rules to content:

  • The most basic approach is that you apply retention at the level of the fundamental aggregation.   You set up retention policies in the Security and Compliance Centre and you identify which SharePoint sites (and/or which Exchange email accounts, which OneDrive accounts etc.) you wish to apply each policy to.
  • A more sophisticated approach is that you manage retention at a level below the fundamental aggregation.  In this model you set up your retention rules as retention labels (rather than retention policies) in the Security and Compliance centre, and then you target the rules at the SharePoint sites/Exchange email accounts etc. in which you want them to be available for users to apply to content.
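
As a rough illustration, the two strategies can be modelled as a toy in-memory data structure. This is a minimal sketch, not the Security and Compliance Centre's own API: the names RetentionRule, Aggregation, apply_policy and publish_label, and the retention periods used, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RetentionRule:
    """A retention rule; it can be used either as a policy or as a label."""
    name: str
    retain_years: int


@dataclass
class Aggregation:
    """A fundamental aggregation: a SharePoint site, email account, OneDrive account etc."""
    kind: str
    name: str
    # Strategy 1: one policy covering everything held in the aggregation.
    policy: Optional[RetentionRule] = None
    # Strategy 2: labels made available for use on content inside the aggregation.
    published_labels: List[RetentionRule] = field(default_factory=list)


def apply_policy(rule: RetentionRule, targets: List[Aggregation]) -> None:
    """Apply a retention policy at the level of each target aggregation."""
    for target in targets:
        target.policy = rule


def publish_label(rule: RetentionRule, targets: List[Aggregation]) -> None:
    """Make a retention label available within each target aggregation."""
    for target in targets:
        if rule not in target.published_labels:
            target.published_labels.append(rule)


# Hypothetical locations and periods, for illustration only.
hr_site = Aggregation("sharepoint_site", "HR")
ceo_mailbox = Aggregation("exchange_mailbox", "ceo@example.org")

apply_policy(RetentionRule("Seven year default", 7), [hr_site, ceo_mailbox])
publish_label(RetentionRule("Personnel files", 25), [hr_site])
```

In this sketch a policy covers a whole site or account, whereas a label is only made available there and still has to be applied to content (by a user, a default setting or automation).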

Applying retention policies and/or retention labels to content in Exchange Online and in SharePoint Online 

It is possible to apply retention labels and retention policies in different ways in different Office 365 applications.

My preference for email is to set a retention policy on email accounts based on the business value of the correspondence (which will vary according to the role played by the individual email account holder), and in addition to allow users to use a retention label to flag up personal correspondence so that it can be subject to a shorter retention period.

After his talk I asked Rob Bath whether he preferred to apply retention policies or retention labels in the SharePoint environment.  He said that he thought that a SharePoint site was typically too big an aggregation to apply one retention policy to.  He prefers to apply retention labels to libraries and folders within sites.  The way to do this is to set up your retention rules as retention labels and then identify for each label the SharePoint sites that it is relevant to.  The result will be that most SharePoint sites will only have a small number of retention labels available to manage content within them. In the process of creating any new library or new folder the library/folder can be configured to apply one of those retention labels to all content stored within it (see here).

It is important to ignore Microsoft’s pushing of retention labels as a tool for end-users to tag individual items in SharePoint.  There is no incentive for an end user to choose a retention period for an individual document.  In general it is not good practice to ask end users to do something you know they will not do.   In the SharePoint environment you should either:

  • Set retention labels as default on libraries or folders; OR
  • set retention policies as default on SharePoint sites (do this if setting retention labels on libraries/folders is not possible, perhaps because your SharePoint installation is too big, your number of information governance staff is too low, your roll-out schedule is too short, or you are applying retention labels/policies in a legacy environment)
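
The first option, default labels on libraries and folders, amounts to a nearest-container lookup: an item picks up the default label of the closest folder or library above it. The sketch below illustrates that idea only. It assumes that a folder default takes precedence over a library default, which should be checked against the behaviour of your own tenancy. The function name, paths and label names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def default_label_for(item_path: str, container_labels: Dict[str, str]) -> Optional[str]:
    """Return the default label of the nearest labelled container above an item.

    container_labels maps a library or folder path to its default retention label.
    Returns None when no container above the item has a default label.
    """
    # .parents yields the immediate folder first, then each container above it.
    for container in PurePosixPath(item_path).parents:
        label = container_labels.get(str(container))
        if label is not None:
            return label
    return None


# Hypothetical library/folder defaults, for illustration only.
defaults = {
    "/Finance/Invoices": "Financial records",
    "/Finance/Invoices/2019/Drafts": "Working papers",
}

invoice_label = default_label_for("/Finance/Invoices/2019/inv-001.pdf", defaults)
draft_label = default_label_for("/Finance/Invoices/2019/Drafts/note.docx", defaults)
unlabelled = default_label_for("/HR/Policies/leave.docx", defaults)
```

The point of the design is that no end user is asked to choose a retention period: the label follows from where the document is stored.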

 

 

Predictions for the application of machine learning to the management of email

Last year I gave two presentations, one to the DLM Forum Triennial and one to the IRMS conference, in which I developed a fictional case study of an organisation that decides to apply machine learning and analytics to email.

In my case study a public sector organisation:

  • is concerned about the low capture of email into its record system (SharePoint) and embarks on a programme to apply machine learning to remedy the shortfall;
  • uses machine learning to apply its existing policy of moving important email into a records system;
  • seeks to apply the machine learning capability on all email accounts on a corporate wide basis.

In reality I think that the attitude of public sector organisations to the application of  analytics and machine learning to email will be rather different to the attitude taken by the organisation in my case study.   My predictions are that public sector organisations in the UK:

  • will be reluctant to apply machine learning to email accounts because of the risks involved;
  • will be just as concerned about the prospect that the application of machine learning might result in very large volumes of email being captured into their record system as they would be about the existing under-capture of emails as records;
  • would use machine learning (or an analytics capability) to look for certain specific types of correspondence that are valuable to the organisation in certain specific accounts rather than applying machine learning/analytics to all accounts across the business;
  • would not move emails identified as important or valuable into their corporate records system, but would instead leave them within email accounts and either place them under a hold to prevent deletion, or move them to an email archive.

Here is the video of my monologue explaining how the organisation applying machine learning to all of its email accounts got on:

 

Managing email in Office 365

What is an email account in Office 365?   It is a special type of document library, that doesn’t need version control, doesn’t need extra metadata fields and doesn’t live in SharePoint.

It now seems a little incongruous for organisations to ask their staff to move important emails out of email accounts and into a ‘corporate record system’.  If Office 365 is your corporate record system then email accounts are within it already!

One of the things that Microsoft had to do in order to make Office 365 work as a service offering was to get their SharePoint team working with their Exchange team – something that famously never happened whilst both products were predominantly on-premise offerings.  Microsoft customers implementing on-premise SharePoint alongside their on-premise Exchange email system  had to deploy third-party plug-ins if they wanted staff to be able to drag and drop an email into a SharePoint document library without leaving their Outlook email client.

There are two routes Microsoft could have gone with the relationship between Exchange and SharePoint within Office 365:

  • the integration route – building in features that make it easier to move emails from Exchange to SharePoint;
  • the governance route – making common governance features available so that emails in an email account could be governed using the same policies as documents in a document library in SharePoint.

Microsoft’s choice of direction for Office 365 has implications for the policy decisions that organisations need to take on email:

  • If Microsoft were to go down the integration route then it would fit in with the records management belief that an email system is not a ‘record system’ but is instead a ‘communications tool’.  Many organisations over the course of the past decade have designated SharePoint as their corporate records system and asked staff to move important emails into SharePoint.
  • If Microsoft were to go down the governance route then it would fit with the information governance belief that distinctions between record systems and non-records systems are meaningless and unhelpful because organisations are under legal, regulatory and ethical obligations to manage all their business information systems in accordance with information governance principles.

From a marketing point of view, there are clear advantages to Microsoft from going down the governance route rather than the integration route:

  • If Microsoft went down the integration route it would imply that they viewed a SharePoint document library as a better place to store business email than an Exchange email account. This is despite the fact that Exchange was built for and designed around the storage of email messages, whereas SharePoint document libraries were not designed with email in mind.
  • By going down the governance route Microsoft can stay neutral on the question of whether an email is better stored in SharePoint or in Exchange, and can gradually remove any necessity for organisations to move emails out of Exchange and into SharePoint.

It is therefore no surprise to see Microsoft putting their emphasis on the governance route rather than the integration route.

Office 365 comes with a ‘Security & Compliance Centre’ that sits separately from SharePoint or Exchange or any of the other component parts of Office 365.   The Security & Compliance Centre gives you two different means of applying retention rules to content:

  • retention policies which are applied to the containers within which content sits (SharePoint sites, email accounts etc.);
  • retention labels which are applied to individual items of content (emails/documents etc.).

This effectively gives you three alternative options for applying retention to email:

  • apply retention policies to email accounts without applying retention labels; OR
  • ask end users to apply retention labels to emails (or automate the application of labels if and when you develop automation capability), without applying retention policies; OR
  • use a combination approach by applying a default retention policy to email accounts whilst allowing staff (or machines!) to apply a retention label to particular emails that deserve a retention rule that differs from the default.

Note that in applying retention from the Security & Compliance Centre to content in OneDrive, Office 365 groups or SharePoint you will be faced with variations of the three options listed above.   The variation relates to the type of container that you would be applying retention policies to, and the type of content that you would be applying retention labels to.

The fact that Microsoft allows an email account to be treated in the same way as a document library for retention purposes will not stop organisations wanting to apply different retention periods to email accounts than to document libraries even when they arise from a similar business function.  The cost and risk profile of an email account differs significantly from that of a document library.

However Office 365 is a game changer in two ways:

  • it brings the application of retention rules to email in email accounts firmly into the information governance domain, rather than the IT domain.  The retention policy and retention label menus in the Office 365 Security & Compliance centre can be used to apply retention policies and/or retention labels to Exchange email accounts and SharePoint sites (as well as other parts of Office 365 including Office 365 groups and OneDrive accounts);
  • it creates the possibility of applying different types of policy towards email. For example if you wanted to apply a Capstone policy towards email you could do so out of the box in Office 365 by simply:
    • setting two retention policies on email; a Capstone retention policy for application to the relatively small number of email accounts that you wish to retain permanently, and a non Capstone retention policy for application to email accounts that you do not wish to retain permanently;
    • deploying  retention labels to enable staff with Capstone email accounts to identify trivial and personal emails so that those emails are exempt from the permanent retention applied to the rest of the correspondence in their email account.
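
The Capstone arrangement described above can be sketched as a simple decision rule. This is an illustration of the logic as described in this post, not of how Office 365 itself resolves overlapping policies and labels (the service has its own precedence rules). The seven-year and one-year periods, the label names and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Sentinel meaning "retain permanently" (no disposal date).
PERMANENT = None

# Hypothetical labels that Capstone account holders can apply to exempt an email.
EXEMPT_LABELS = {"personal", "trivial"}


@dataclass
class Email:
    subject: str
    label: Optional[str] = None  # retention label applied by the account holder, if any


def retention_years(
    email: Email,
    account_is_capstone: bool,
    non_capstone_years: int = 7,  # hypothetical non Capstone retention period
    exempt_years: int = 1,        # hypothetical period for labelled personal/trivial email
) -> Optional[int]:
    """Return the retention period in years, or PERMANENT for permanent retention."""
    if not account_is_capstone:
        # The non Capstone retention policy applies to the whole account.
        return non_capstone_years
    if email.label in EXEMPT_LABELS:
        # The label exempts this email from the account's permanent retention.
        return exempt_years
    # The Capstone retention policy: keep permanently.
    return PERMANENT


board_paper = Email("Board paper for approval")
lunch = Email("Lunch on Friday?", label="personal")
```

For example, retention_years(lunch, account_is_capstone=True) gives the short exempt period, while the same call for board_paper gives PERMANENT.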

Is it possible to solve the email problem?

I will be giving a presentation in London on Friday March 15 2019 for the IRMS Public Sector Group.

Below is a summary of the presentation:

James Lappin has recently had an article published by the Records Management Journal, ‘The defensible deletion of government email’, in which he reports on the evaluation he has carried out with Loughborough University of the policy of The National Archives (TNA) towards UK government email.

In this presentation he will attempt to answer three questions:

  • what proportion of an official’s email correspondence is likely to be needed as a record?
  • what proportion of an official’s email correspondence are public sector bodies likely to be able to capture into their record systems?
  • what proportion of an official’s email correspondence are public sector bodies likely to want to capture into their record systems?

On the basis of the answers to the questions above,  James will propose an answer to two further questions:

  • why has no solution to the email problem yet been found, after nearly a quarter century of its dominance of business correspondence?
  • is it likely that a solution to the email problem will be found that will be acceptable to both public sector bodies and to wider society?

The basic premise of the talk is as follows:

  • if the proportion of email correspondence that is needed as a record exceeds the proportion of correspondence that public sector bodies are capable of capturing into their records systems, then we as a records management/information governance profession have a problem that we can and will solve, because technical solutions that we identify will be welcome to our organisations;
  • if the proportion of email correspondence that is needed as a record exceeds the proportion of correspondence that a public sector body perceives as being in their interests to treat as records, then we have a problem that we cannot solve, because any technical solution we could come up with would be unwelcome to our organisations, and against their interests to adopt.

Solutions to the email problem

When I started telling people, back in 2015,  that I had started a doctoral research project on email, my records management colleagues would typically smile, pat me on the shoulder and say  ‘great! let us know when you have found the solution’.

Two years later, there is still no solution.  Solutions do not drop out of the sky.  Solutions emerge from people trying to fix things.  If there is a broken situation and no one tries to fix it, it stays broken.

You do not have to undertake a doctoral research project to work out that organisations are not consistently capturing business email into systems that they designate as ‘records systems’.

Yes there are potential approaches.  Not just one,  many different approaches, suitable in different circumstances, using different technologies or different ways of harnessing human input:

  • you could do what some CRM systems do and match the email addresses of external contacts with locations in your record system (when a colleague sends an email to person/organisation b it is about business matter x, when they email person/organisation c it is about business matter y etc.);
  • you could set up one container in your record system to mirror each email account that you provision.  You could give each email account owner a simple means of indicating for every email they send, whether or not they want that email and the thread it is part of to be saved as a record to the mirror container for their account in the record system;
  • you could use analytics just before you run your routine deletion on an email account (after someone has left, or after two or three years) to rescue material which is obviously business related (and obviously not personal) from deletion.
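
To make the first of these approaches concrete, here is a minimal sketch of matching the external addresses on an email against locations in a record system. It assumes a simple lookup table maintained by the organisation. The function name, addresses and location names are hypothetical, and a real CRM system would do considerably more than this.

```python
from typing import Dict, Iterable, List


def record_locations_for(recipients: Iterable[str], contact_locations: Dict[str, str]) -> List[str]:
    """Return the record system locations matched by an email's recipients.

    contact_locations maps a lower-case external email address to the location
    (business matter) in the record system that correspondence with it belongs to.
    Addresses with no entry (internal colleagues, unknown contacts) are ignored.
    """
    matches: List[str] = []
    for address in recipients:
        location = contact_locations.get(address.strip().lower())
        if location is not None and location not in matches:
            matches.append(location)
    return matches


# Hypothetical lookup table, for illustration only.
contact_locations = {
    "b@supplier.example": "Business matter X",
    "c@regulator.example": "Business matter Y",
}

single_match = record_locations_for(
    ["B@Supplier.example", "colleague@ourorg.example"], contact_locations
)
no_match = record_locations_for(["colleague@ourorg.example"], contact_locations)
```

An email that matches no location would simply stay in the email account and be subject to whatever routine deletion applies there.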

We could go on and on with this list, and you will have ideas of your own. We have not even mentioned machine learning yet.

But a solution for a problem is like a hammer waiting for a nail unless the problem owner wants the problem to be solved.

The problem owner for the email ‘problem’ is the organisation that owns the email accounts and the emails.

The one thing all of the above solutions have in common is that they all result in more emails being captured into record systems.   All of these solutions therefore add to storage costs,  they also add to the cost of servicing access to information requests, and, in highly politicised environments such as many central government departments, they add to the perceived risk of  being forced to disclose potentially embarrassing content.

Email is therefore a problem that the problem owner perceives a benefit from leaving unsolved.  Every conceivable solution to the problem of treating business emails as records would be more expensive, and would be perceived by the problem owner as being more risky,  than the current approach.   The current approach, if we are honest, is for organisations to make a token effort to persuade staff to save important emails as records, and for staff in turn to make a token effort to occasionally place a few emails into a record system, whilst all the rest of the emails are routinely deleted after a designated but essentially arbitrary time period.

Where does this leave archivists and records managers?  Where does it leave records management and where does it leave record keeping?

We are also in a manner of speaking the problem owner.  The problem falls into our professional domain.

We are employed by organisations.   It is not our job to try to persuade our organisations to do anything that is against their perceived interests (and we would not get very far if we did try).

On one hand we are in a good situation.  Organisations put us under no pressure to change our records management policy telling staff to move emails into the record system, and under no pressure to change the relationship between the email environment and our record system (which is typically a product primarily designed and configured for the capture and management of documents rather than emails).

On the other hand it puts us in a very difficult position. We go to all that trouble to develop corporate wide record systems and corporate wide records retention schedules (both labours of Hercules, demanding all of our professional skills, experience and energy) and yet we are regarded as failures because those systems and those retention schedules fail to embrace the bulk of an organisation’s business correspondence.

It is a problem that in effect we can do little about, because our organisations do not want to do anything about it.  Take any of the ideas listed above to your organisation and you will see what I mean.  Our vendor community cannot help us because our organisations do not want solutions that would result in them keeping significant volumes of email for significantly longer periods of time.

In the short term we can carry on like this.  Our organisations are happy for us to continue to try and continue to fail.  Because (and this is the paradox) our failure to capture business correspondence consistently into systems treated as record systems succeeds in reducing the cost and perceived risk of recordkeeping to our organisations.

But what about the long term?  I fear in the long term that this success-through-failure damages our professional reputation and damages the clarity and moral force of our theory and of our practice.