
Using Continuous Active Learning (CAL)

Written by Altlaw

As remote working has become essential during these unprecedented times, we are all relying on technology more than ever before. At Altlaw, our project managers have been working from home and leveraging technology to run projects as they would normally be run. Below is an overview case study of a recent project that had pressing deadlines but lacked access to the resources we would normally utilise.

Custodians and devices had already been identified and marked for collection prior to the Covid-19 shutdown. The data from these devices was collected remotely by the Altlaw forensic team and transferred electronically via SFTP.

The resulting data were processed as per normal and a series of filters were applied to the data.  The resulting dataset was then ready for review. 

The remaining dataset consisted of around 220,000 documents.   

With a large team of reviewers not a suitable option, technology-assisted review was deemed to be the best available solution. In this instance, the solution used was RelativityOne’s CAL (Continuous Active Learning).

The client assigned a team of three lawyers to begin designating documents as Relevant/Not Relevant.

Documents were served up to the reviewers based on a ranking system – where the system learnt from the decisions previously made to serve up what it felt were the most relevant documents.  

The system doesn’t just apply a binary Relevant / Not Relevant decision; it also gives a ranking of how confident it is in its decision. These levels of confidence will initially be quite mixed – some with a high level of certainty and some low.

This is illustrated in the distribution model below.

Prioritized Review

Every 20 minutes the system re-evaluates its decision making, incorporating the previous 20 minutes’ human decisions, applies a new ranking to each document and effectively serves up what it feels are “the most relevant” documents next.

This is a change from the traditional chronologically ordered review, but the key is for the most relevant documents to be viewed first and the less relevant or not relevant to follow.   
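The re-ranking loop described above can be sketched in a few lines of Python. This is only a toy illustration, not RelativityOne's actual model: the word-count scoring, the `prioritised_review` function, the `reviewer` callback and the `batch_size` parameter are all hypothetical names chosen for the example, and a real review would stop well before every document has been served.

```python
from collections import Counter


def score(words, relevant_counts, not_relevant_counts):
    """Toy relevance score: words seen in Relevant documents add to the
    score, words seen in Not Relevant documents subtract from it."""
    return sum(relevant_counts[w] - not_relevant_counts[w] for w in words)


def prioritised_review(documents, reviewer, batch_size=1):
    """Serve documents in ranked order, re-ranking after every batch.

    documents: dict mapping document id -> set of words (hypothetical format)
    reviewer:  callable taking a document id and returning True if Relevant
    Returns the order in which documents were served to the reviewer.
    """
    relevant_counts, not_relevant_counts = Counter(), Counter()
    unreviewed = dict(documents)
    served = []
    while unreviewed:
        # Re-rank everything still unreviewed using all decisions made so far.
        ranked = sorted(
            unreviewed,
            key=lambda d: score(unreviewed[d], relevant_counts, not_relevant_counts),
            reverse=True,
        )
        for doc_id in ranked[:batch_size]:
            words = unreviewed.pop(doc_id)
            # The human coding decision feeds straight back into the model.
            if reviewer(doc_id):
                relevant_counts.update(words)
            else:
                not_relevant_counts.update(words)
            served.append(doc_id)
    return served
```

In this sketch, once a reviewer codes a document as Relevant, other documents sharing its vocabulary rise to the top of the queue, which is the behaviour that distinguishes a prioritised review from a chronological one.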

If the system can rank documents accurately, then the relevant documents being provided to the lawyers will eventually fall away, reaching the point where the legal team can confidently stop the review, satisfied that they have seen a statistically large portion of the relevant documents and that the system is unlikely to serve up further relevant documents. The potentially relevant documents left in the unreviewed pile are called Eluded documents.

At various stages, a test for these potentially Eluded documents can be undertaken. This is known as an “Elusion test”, and it is a statistical sample of all of the documents that will be discarded by the system upon project completion – not relevant documents, skipped documents or documents that were not ranked high enough by the system to earn a Relevant designation. This process gives the lawyers a chance to correct any mistakes the system is making, and gives them statistical insight into the potentially relevant documents left unreviewed. If the Elusion rate is acceptable, the CAL project can complete.
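As a rough illustration of the arithmetic behind an Elusion test, the sketch below draws a random sample from the discard pile, has a reviewer code it, and reports the share found to be relevant. The `elusion_test` function, its parameters and the fixed random seed are assumptions made for this example; they do not describe how RelativityOne draws or scores its sample.

```python
import random


def elusion_test(discard_pile, reviewer, sample_size, seed=0):
    """Estimate the Elusion rate from a random sample of the discard pile.

    discard_pile: list of document ids below the relevancy cut-off
    reviewer:     callable taking a document id and returning True if Relevant
    Returns (elusion_rate, estimated_eluded_documents).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(discard_pile, sample_size)
    relevant_found = sum(1 for doc_id in sample if reviewer(doc_id))
    rate = relevant_found / sample_size
    # Project the sample rate onto the whole discard pile.
    return rate, round(rate * len(discard_pile))
```

A low rate suggests few relevant documents remain below the cut-off; whether it is low enough is a judgment for the legal team and, where applicable, the agreement between the parties.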

Over time the distribution model shown above will shift so that there is a clear separation between the Relevant, Not Relevant and unreviewed documents as the system becomes more accurate.

In our case, the three human reviewers coded approximately 15,000 documents, at which point an acceptable Elusion rate was achieved, meaning that the system was accurately designating relevancy.

As an interesting side note, at this stage a decision was taken to add some additional documents to the data set. This resulted in 26,000 new documents, from an existing custodian, being added to the review pile. The system was able to integrate these documents and assign relevancy designations in a matter of minutes.

Once a final Elusion test completed and reported an acceptably low Elusion rate, the documents above the relevancy cut-off (those the system designated as Relevant) were moved into the Second Pass Review (SPR) workflow.

The discard pile was then subjected to a structured analytics workflow to sample these left-over documents. The sampling workflow further confirms the Elusion rate and may also catch any relevant outliers below the relevancy cut-off.

From the review of approximately 15,000 documents, the system categorised the remaining 205,000 documents for relevance to a 95% confidence level with a margin of error of ±1.5%. Both sides had agreed to utilise CAL with results expected to fall within a 95% confidence level and a ±2.5% margin of error, so the achieved ±1.5% sat comfortably inside the agreed tolerance. More information on the statistics behind Elusion Testing can be found in this linked blog post.
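To give a feel for how confidence level and margin of error relate to sample size, the sketch below uses the standard textbook sample-size formula for a proportion, with a finite population correction. This is an assumption for illustration only: the post does not state which formula RelativityOne applies, and `required_sample_size` and its parameters are hypothetical names.

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, population=None,
                         confidence=0.95, proportion=0.5):
    """Textbook sample size for estimating a proportion.

    margin_of_error: e.g. 0.025 for +/-2.5%
    population:      size of the pile being sampled (None = treat as infinite)
    proportion:      assumed rate; 0.5 is the most conservative choice
    """
    # Two-sided z-score for the requested confidence level (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Finite population correction: smaller piles need fewer samples.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)
```

Under this formula a tighter margin of error always requires a larger sample, which is why a ±1.5% result is a stronger statement than the ±2.5% the parties had agreed to accept.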

Overall, this project allowed our client to work quickly through a significant number of documents, safe in the knowledge that the results came within the agreed error rate.

If you are interested in any of our services, don’t hesitate to contact us today and a member of our team will be in touch with you shortly.