Slashing Document Review Costs with Sampling and The Cloud

By Eric Kirschner & Jorge Sirgo

Litigation frequently requires reviewing a large number of claims (or other relevant data) to establish critical evidence (e.g., damages, years of product exposure). In many cases, however, reviewing each individual claim would be prohibitively expensive.

In these instances, constructing a sample of relevant claims and reviewing the sampled claims in a Cloud-based application may dramatically cut document review and data management costs.

Sampling: Sampling is a process in which a small subset of data is reviewed and the results of that review are extrapolated to the larger population. For example, we are often asked to evaluate the accuracy of a client’s underlying claims database. The database may contain tens (or hundreds) of thousands of records, each summarizing a separate claim. Reviewing every one of these claim files would be prohibitive in both cost and time and, as described below, is unnecessary.

Rather than undertake such an inefficient review, a more prudent and widely accepted approach is to select and review a representative sample of claim files. Relevant information is obtained from these files, interpreted, and statistically evaluated. If the sampled information produces an estimate with the required level of precision, a review of the entire population is unnecessary. Typically, precise estimates can be obtained by reviewing less than 10% of a total population, thereby eliminating the need to review the vast majority of the individual files.
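As a rough illustration of the select, review, and evaluate steps, the following Python sketch draws a simple random sample of claim IDs and then estimates a population average from the reviewed values. The function names, the fixed seed, the 95% confidence level, and the normal approximation are our assumptions for illustration only; an actual sample design would depend on the facts of the matter.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def draw_sample(claim_ids, sample_size, seed=2024):
    """Simple random sample without replacement.
    The fixed seed (an arbitrary, hypothetical value) makes the draw reproducible."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(claim_ids), sample_size))

def estimate_mean(reviewed_values, population_size, confidence=0.95):
    """Return (estimate, margin of error) for the population mean,
    using a normal approximation and the finite population correction."""
    n = len(reviewed_values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    margin = z * stdev(reviewed_values) / math.sqrt(n) * fpc
    return mean(reviewed_values), margin

# Hypothetical usage: pick 300 of 5,000 claim files for review.
selected = draw_sample(range(1, 5001), 300)
print(selected[:5])
```

Once the selected files have been reviewed, the coded values would be passed to estimate_mean, and the resulting margin compared against the precision the litigation requires.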

Additionally, the sample can be structured to address the specific needs of the litigation. For example, if the litigation requires an evaluation of the accuracy of a plaintiff’s damages, and those damages consist of numerous individual elements, the sample can be designed so that only a small subset of those elements is actually reviewed.

Let’s look at a more specific example of a situation in which sampling can be a useful tool. The task is to evaluate a database summarizing 5,000 separate claims that allegedly settled for an average of $1,000. Instead of reviewing the underlying documentation for all 5,000 claims to verify their settlement values, it would be more efficient to review a small sample of files – say 300 or 400 – for accuracy. This sample review might very well yield a result that confirms the $1,000 average settlement amount with a very high degree of accuracy at a fraction of the cost of a full review (see the graph below for an illustration of precision as sample size increases).


A modest sample size (the x axis) can yield a tight confidence interval at a modest cost.
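The relationship between sample size and precision can be sketched with a short calculation. The example below uses the 5,000-claim population from the text and assumes a hypothetical standard deviation of $500 in settlement values (the figure is ours, chosen only for illustration) and a 95% confidence level.

```python
import math
from statistics import NormalDist

def margin_of_error(n, population=5000, std_dev=500.0, confidence=0.95):
    """Half-width of the confidence interval for a sample mean,
    with the finite population correction for sampling without replacement.
    std_dev is an assumed value, not taken from any actual claims data."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * std_dev / math.sqrt(n) * fpc

# Margin of error around the estimated average settlement at several sample sizes.
for n in (50, 100, 200, 300, 400, 1000):
    print(f"n={n:>5}: +/- ${margin_of_error(n):,.2f}")
```

Under these assumptions, the margin of error is roughly $55 at 300 files and $47 at 400 files, and each additional file reviewed buys less precision than the one before it.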

Furthermore, the U.S. government and courts are becoming increasingly comfortable employing statistical sampling to cut the time and costs of complex, multi-claim litigation. Recent applications of sampling by the U.S. government include:

  • Determining over or underpayments related to Medicare — United States v. Fadul, Civil Action No. DKC 11-0385, 2013 WL 781614 (D. Md. Feb. 28, 2013); United States v. Rogan, 517 F.3d 449, 453 (7th Cir. 2008);
  • Determining the Basis in Property Acquired in Transferred Basis Transaction – Rev. Proc. 81-70, 1981-2 C.B. 729; and
  • False Claims Act cases – U.S. ex rel. Martin v. Life Care Centers, No. 08-cv-251, Dkt. 184 (E.D. Tenn. Sept. 29, 2014).


The Cloud: Of course, all this information is useless if it is not stored and analyzed properly. This is an area where costs can vary wildly depending on the effectiveness of the software employed. Fortunately, just as sampling can drastically reduce document review costs, a well-designed, secure Cloud-based data solution can also drastically reduce review and analysis costs.

Cloud-based systems provide a number of advantages over more traditional legal data systems. First, they are easy to set up. The system is installed on a single server and all users access the data from that server. There is no need for multiple, complex site and machine installations.

Second, data and documents are available to all team members from any location (subject to the privileges granted). Different people in different offices with different roles are able to seamlessly view and analyze data and documents. Similarly, counsel, client, witnesses and the Court can be granted rights to review specific data and documents as needed. Witness testimony is better focused and flows more smoothly while judges are similarly able to more easily follow the proofs being proffered.

Third, all this information is immediately available. If, for example, two staff members, Betty and Dan, are working on the same matter, and Betty in New York City finds a useful document, she can code the relevant information into the database and call Dan in Washington. Dan can instantly summon the document to his screen and review both the document and Betty’s comments.

Fourth, a well-designed Cloud-based system is exceptionally cost effective. A basic review and coding system can be quickly set up and customized so that reviewers are focused only on critical and/or relevant information (see the sample screen capture below).

And fifth, Cloud-based systems are immensely flexible. Documents, fields, queries, etc. can all be added to an existing system with only minimal programming. Similarly, additional reviewers or parties (attorneys, witnesses, etc.) can be looped into the system with minimal cost and zero downtime.


A Cloud-based document review and analysis system can be easy to use
while providing instantaneous access to data and documents.
(The above data is fictitious; the document is public.)


Summary: The above discussion covers two techniques that can be used to drastically reduce the costs of large claim file reviews. Sampling can be used to limit the scope of the review, and a Cloud-based data system can be used to reduce the cost of coding and data management. Taken together, these two techniques can dramatically reduce the expense involved in litigating cases that involve thousands, or hundreds of thousands, of individual claims.

