The Reality of Native Format Production and Redaction - from IGC and Atidan

An IGC White Paper
By Christine Musil, Director of Marketing, Informative Graphics Corporation

The issue of production format in eDiscovery has long been discussed, argued and downright misunderstood. Historically, attorneys produced documents in paper form or electronically in TIFF or Adobe® PDF format. Even documents that originated electronically were often either printed and re-scanned or batch-converted to TIFF or PDF. The December 1, 2006 amendments to the Federal Rules of Civil Procedure (FRCP), specifically Rule 34(b), made the default obligation to produce a document “in a form or forms in which it is ordinarily maintained or in a form or forms that are reasonably usable” unless the requesting party (or, failing that, the producing party) specifies a different format.[i] Does this demand that the producing party must deliver all documents in their original, native format (e.g., Microsoft Word or Excel)?
The knee-jerk reaction of some has been to demand native production without really understanding why the native format is or is not necessary for that case, and without knowing whether or not they have the software necessary to actually access all of the data they are demanding.

For example, in Armor Screen Corporation v. Storm Catcher, Inc., 2008 WL 5262707 (S.D. Fla. Dec. 17, 2008), defendants requested native file production, but were then unable to open files with a .SAV extension. The defendants demanded the plaintiffs then produce hard-copy printouts of the SAV files. Magistrate Judge Ann E. Vitunac refused to compel the plaintiff to grant the defendant’s request because SAV files, openable by a number of “statistical computer packages,” would in fact qualify as a “reasonably accessible format.” Too often one or both sides do not understand the software technology involved. This makes agreeing on a production format difficult and frustrating and often requires the court to make the ultimate determination.

As long as both parties agree at the outset, TIFF, PDF, native or a combination of those formats may all be acceptable; a default applies only if the parties fail to specify a format. According to Principle 12 of The Sedona Principles:[ii]

“Absent party agreement or court order specifying the form or forms of production, production should be made in the form or forms in which the information is ordinarily maintained or in a reasonably usable form, taking into account the need to produce reasonably accessible metadata that will enable the receiving party to have the same ability to access, search, and display the information as the producing party where appropriate or necessary in light of the nature of the information and the needs of the case.”

What Does “Native” Really Mean?

According to eDiscovery expert George Socha of Socha Consulting and the Socha-Gelbmann Electronic Discovery survey, confusion abounds about what native actually means. He notes that there are actually four categories of electronic discovery formats in terms of production, review and processing: true native, near native, near paper and actual paper.

True native files are copies of the original documents in the format created by the authoring application, like DOC or XLS. Metadata should be intact, if preserved properly. This is what most parties have in mind when asking for native production.

Near native formats can include many different types of files, depending on the perspective you take. Attorney and eDiscovery expert Tom O’Connor defines near native as various ways to render native files so the content and metadata are electronically accessible. Socha adds that relational databases and email sometimes fall in the near native category. Some experts would include electronically converted, searchable PDFs in this category, like a Word document that has been converted and retains the searchability and some metadata of the original file (metadata is discussed in more detail in the next section). Socha disagrees, including these instead in the next category, near paper.

Near paper formats are TIFF or PDF files that cannot be searched or indexed, and sometimes those that can. Lack of searchability has increasingly made TIFF production objectionable and is perhaps the most cited reason for requestors arguing that TIFF does not comply with FRCP Rule 34(b). These formats can be rendered text-searchable by undergoing Optical Character Recognition (OCR), but OCR can yield imperfect results in terms of search accuracy, and the results are generally inferior to electronically originated documents converted to PDF with text intact. Also, the text must be sent separately, usually as a TXT file, requiring that both the TXT file and TIFF image be reviewed and redacted separately. (See “Electronic Redaction” later in this document.)

Paper includes documents that originated in paper form or digital files that have been printed to paper. Clearly, paper offers no searchability or other time-saving electronic review methods.

O’Connor takes a slightly different view, preferring a more inclusive category of “reasonably usable” that is used in the rules and under the Sedona comments noted above. It includes documents that are not in their native format, but are searchable and have the metadata intact so they are highly functional for discovery purposes. Socha has a different take on “reasonably usable,” seeing it as an attribute of one of the four forms of production summarized above rather than as a separate form. Says O’Connor, “the category ‘reasonably usable’ is the one used in the rules so I prefer that
phrase, but I think that George and I are on the same page with regards to the handling of documents, he just drills down a little deeper with his definitions.”

Metadata Preservation/Production

Preservation and production of metadata in a usable form is at the heart of many arguments against converting native documents to TIFF or PDF. O’Connor notes, “Metadata was originally a term used to refer to computer data such as date created/modified, author, etc. Now, the definition of metadata has expanded to include hidden material that does not appear when a document is printed out. Examples of this would include hidden rows, cells and formulas in Excel, and Track Changes and comments and markups in Word.” Just to show that this, too, is an unsettled area, according to Socha metadata was originally defined as “data about data,” a definition he says continues to apply today. (For those who want to dig deeper into the breadth and depth of metadata, Socha suggests searching on “metadata” as well as terms such as “Dublin Core.”)

Converting documents to PDF (even text-searchable ones) or TIFF may alter or fail to include the original creation date for the document or strip out all or most unseen content, again causing some requestors to demand native production with metadata intact. However, the created and modified dates are not always relevant, so this does not necessarily preclude TIFF or PDF being used as the production format. The unseen data, such as text revisions or comments, are also not brought through to the new TIFF or PDF file. Since these hidden elements may contain privileged information, this could be helpful to the producing party. It should be noted, however, that the mere suspicion of privileged content is not a sufficient basis for a blanket withholding of metadata. Spreadsheets are a notable exception, where eliminating formulas and hiding cells may amount to spoliation because they are intrinsic to the integrity of the document.

There is clear case law supporting the production of financial spreadsheets in their native format, and spreadsheets warrant particular consideration when determining production formats. In Williams v. Sprint/United Management Co., 230 F.R.D. 640 (D. Kan. 2005), Magistrate Judge Waxse from the Kansas Federal District Court decided that the defendant, when specifically instructed to produce digital spreadsheets “in the manner in which they were kept in the ordinary course of business,” should be subject to sanctions if they had scrubbed metadata and locked certain spreadsheet cells prior to production. In this case, Judge Waxse ordered a reproduction of the documents at the defendant’s expense, but decided not to impose sanctions because the laws on metadata production at the time were still new and fairly ambiguous.

Even if you aren’t sure if metadata will be relevant, you always should retain an untainted set of the files, and only process from copies. In his 2005 paper “Beyond Data about Data: The Litigator’s Guide to Metadata,” Craig Ball, trial lawyer and special master of ESI for numerous Federal and State courts, states, “Fail to preserve metadata at the earliest opportunity and you may never be able to replicate what was lost.”[iii]

Figure: Metadata can show the history of places a document has been stored. This example is from a British dossier on Iraq’s security infrastructure and reveals that the document was compiled by copying content from outside documents, including the work of a post-graduate student.[iv]

Proper preservation is important because metadata includes more than the data within the file itself. System data, such as a file’s name, location and size, and records of its creation, modification and usage, is also important, for example in assessing tampering.

It is important that both sides understand the potential impact of metadata, if any, on the case. Without sufficient evidence that metadata is relevant, the court may not grant requests for it to be produced.

Take Dahl v. Bain Capital Partners, LLC, 2009 U.S. Dist. LEXIS 52551 (D. Mass. Jun. 22, 2009), where the requestor sought all metadata associated with emails and Word documents produced by the producers. Bain Capital responded by producing just 12 fields of metadata, which the court supported, stating that “many courts have expressed reservations about the utility of metadata” and ultimately finding that:

“Rather than a sweeping request for metadata, [requestors] should tailor their requests to specific word documents, specific emails or specific sets of email, an arrangement that, according to their memorandum, suits [producers]. This more focused approach will, the court hopes, reduce the parties’ costs and work. Furthermore, it reflects the general uneasiness that courts hold over metadata’s contribution in assuring prudent and efficient litigation.”
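The system data mentioned above (a file’s name, location, size and modification date) and the advice to process only from copies can be illustrated with a short sketch. It is not taken from the white paper: it uses only the Python standard library, and the file names, the backdated timestamp and the helper names `system_metadata` and `demo` are hypothetical. It assumes a file system that stores modification times to at least one-second precision. Creation time is left out on purpose, because what the operating system reports for it differs between platforms.

```python
import os
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def system_metadata(path):
    """Collect a few file-system metadata fields for a single file."""
    p = Path(path)
    st = p.stat()
    return {
        "name": p.name,
        "location": str(p.parent),
        "size_bytes": st.st_size,
        # Last-modified time as an ISO 8601 string in UTC.
        "modified_utc": datetime.fromtimestamp(
            st.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


def demo():
    """Compare how two ways of copying a file treat its modified date."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "memo.txt"  # hypothetical file name
        original.write_text("Quarterly figures attached.")
        # Backdate the original so a changed timestamp is easy to spot.
        backdated = 1_000_000_000  # seconds since the epoch, a date in 2001
        os.utime(original, (backdated, backdated))

        plain = Path(tmp) / "plain_copy.txt"
        preserved = Path(tmp) / "preserved_copy.txt"
        shutil.copy(original, plain)       # content copied, timestamp is new
        shutil.copy2(original, preserved)  # also tries to keep timestamps

        return {
            "original": system_metadata(original),
            "plain_copy": system_metadata(plain),
            "preserved_copy": system_metadata(preserved),
        }


if __name__ == "__main__":
    for label, fields in demo().items():
        print(label, fields)
```

In this sketch the plain copy carries the same content and size as the original but a new modification date, while the copy made with `shutil.copy2` keeps the original date. That is the kind of silent change that makes it worth agreeing early on how files will be collected and which metadata fields matter.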
The issue of metadata should be discussed up front or the opposing side may have grounds to claim spoliation or request re-production.

In Bray & Gillespie Mgmt. LLC v. Lexington Ins. Co., 2009 WL 546429 (M.D. Fla. Mar. 4, 2009), Lexington requested that Bray & Gillespie (B&G) produce data in native format, but B&G did not comply. While B&G converted their native files to TIFF and stored the files’ metadata separately, they gave Lexington only the TIFFs and held back the metadata. When it was discovered that the metadata had been preserved and that they had violated Lexington’s native format request, B&G was subject to sanctions and the court ordered them to produce the metadata to Lexington anyway.

Importance of the Meet and Confer

Discussions about production format, metadata and redaction should occur at the meet and confer session. FRCP Rule 26(f), the “meet and confer” rule, requires parties to meet at an early stage in the litigation process to discuss what information they have and how they will share it. Unfortunately, according to O’Connor, the meet and confer process often gets short-changed or skipped entirely. This leaves the producing party exposed to potentially costly and unexpected demands for native formats later in the process, perhaps after already producing in PDF or TIFF.

In Covad Communications Co. v. Revonet, Inc., 2008 U.S. Dist. LEXIS 104204 (D.D.C. Dec. 24, 2008), the court ordered the producer to re-produce data in electronic format after having produced it in hard copy, but it ordered the two parties to share the $4,000 cost of privilege review, concluding:

“This whole controversy could have been eliminated had [requestor] asked for the data in native format in the first place or had [producer] asked [requestor] in what format it wanted the data before it presumed that it was not native. Two thousand dollars is not a bad price for the lesson that the courts have reached the limits of their patience with having to resolve electronic discovery controversies that are expensive, time consuming and so easily avoided by the lawyers’ conferring with each other on such a fundamental question as the format of their productions of electronically stored information.”

In a perfect world, the meet and confer would be thorough, civil and productive, and parties would clearly understand what they were requesting or expected to produce, but this is not often the case.

Browning Marean, Senior Counsel at DLA Piper, notes that “At these meet and confer sessions, it is crucial to have knowledgeable technical people present who are familiar with the discovery data and who know about file formats, production and redaction. Without an IT representative who understands issues around all document types that may be discoverable, such as the feasibility of producing the entire volume of data, and the possible concerns around privacy and proprietary information contained in the documents, any decision made could be completely unrealistic.”

The meet and confer is not only important for determining production formats, but it is also an opportunity to be up front about what data will be held back for reasons of privilege or privacy. Marean suggests that parties agree on what data the requestor is entitled to and be clear about how it will be made available. “If a large H.R. database contains social security numbers which clearly constitute privacy information, explain to the other side and tell them which tables you’ll make available to them and in what format, and explain what has been redacted or omitted and why. Give them what they’re entitled to: no more, no less.”

Even judges are pleading for parties to conduct the meet and confer, since the lack of it can cause an unfortunate domino effect that wastes a lot of time and money. In Aguilar v. Immigration and Customs Enforcement Div. of the U.S. Dept. of Homeland Sec., 2008 WL 5062700 (S.D.N.Y. Nov. 21, 2008), Magistrate Judge Frank Maas emphasized the importance of the meet and confer meeting, saying “This lawsuit demonstrates why it is so important that parties fully discuss their ESI early in the evolution of the case. Had that been done, the defendants might not have opposed the plaintiffs’ requests for certain metadata. Moreover, the parties might have been able to work out many, if not all, of their differences without court involvement or additional expense, thereby furthering the ‘just, speedy and inexpensive’ determination of this case.”

Electronic Redaction

Redaction, the removal of privileged or privacy data from documents, represents another problem that arises in producing native documents. The typical method of redaction has been to print the documents, use a black marker to mask the information, then photocopy the marked-up pages several times to ensure complete obscuration before re-scanning back into the system. As the volume of ESI has escalated, this method of redaction has become cumbersome, expensive and even unrealistic given deadlines to produce.
Electronic redaction involves redacting documents using a computer application like Adobe® Acrobat® or Informative Graphics® Redact-It®. Such tools can save significant amounts of time by allowing users to search for privileged phrases or automatically find privacy information, and they generally create a new, redacted rendition of the original document in TIFF or PDF format. Clearly, the biggest advantage is for text-searchable formats like PDF.

But how do you perform electronic redaction when native format is required? Redaction, by its nature, changes the document and must be saved to a new version, regardless of format. According to Ball, “The very nature of redaction is alteration of the document. No mechanism that you could use maintains an identical hash.” (A hash value is an identifying number computed from a file’s contents; it is unique for practical purposes and is often used for forensic purposes to validate a document’s authenticity.) You start with the original document, select areas for redaction, and output the final, redacted version to a new document with necessarily different metadata (e.g., time stamps and author) and a new hash. No matter which format you choose for the redacted document, the redacted version must be tracked and managed in addition to the unredacted original.

Spreadsheets pose unique challenges to redaction. Ball notes, “Native redaction may be best realized in spreadsheets because you have the ability to remove whole categories of fielded information by row or column. But because spreadsheet data often entails dependencies (values that change based on other values), redaction can have unforeseen consequences if implemented incautiously. Disclosure is essential, so lawyers understand they are dealing with an altered document. There are no commercial tools to redact spreadsheets in their native format at this time, so it’s typically done using the native applications. Spreadsheets require careful attention during the meet and confer.”

Socha cautions that with true native redaction, unless you’re careful and savvy, you’ll likely change things you don’t really realize you’re changing. He notes, “This can break things. If there are formulas in a spreadsheet, you can ruin the document. Key numbers can disappear or change as part of a chain reaction.” He continues, “There are a series of challenges that differ from application to application; the industry hasn’t yet determined how to address native redaction. I have yet to see a proprietary approach or a broadly accepted method. True native review can be done with XML versions of Word, but needs to be done by someone who knows how.”

Redacting to a near-native or near-paper format is still the most prevalent method. Converting native files to TIFF or PDF and redacting them is safe, convenient and inexpensive, and allows you to use reliable tools that are proven. Should a particular document be called into question, you can always produce the original source file with metadata completely intact.

What is clear, however, is that the meet and confer is the best, and in fact the required, setting to discuss issues of production formats, redaction and metadata so they do not end up being decided by the court. Lawyers need to understand more about the inner workings of the technology or need to bring along IT personnel who can help them navigate successfully through the preliminary stages. Attorneys need to take eDiscovery technology training seriously and educate themselves to protect their own interests and those of their clients. It is irresponsible, perhaps even malpractice, for them to think and do otherwise.

Summary

Electronic discovery often presents complex problems with only partial solutions. Parties need to understand that dictating the same format across the board may not suit their ultimate goals. Each data set is different, and what must be revealed and concealed is unique for each case. By gaining a greater understanding of file formats, electronic redaction tools and metadata, lawyers can do right by their clients and can avoid regrettable situations in the courtroom, which can range from embarrassment to sanctions to permanent damage to the reputation of the firm and the client. Don’t take these risks unnecessarily. Take the time and bring the IT talent to the table to have a quality meet and confer. After all, an ounce of prevention is worth a pound of cure.
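As an illustration of the hash discussion in the Electronic Redaction section, the following minimal sketch shows why a redacted rendition cannot share a hash with its original. It is an assumption-laden example rather than a description of any redaction product: the white paper does not name a hash algorithm, so SHA-256 from the Python standard library is used here, and the sample text, the `[REDACTED]` mask and the helper names `sha256_hex` and `redact` are all hypothetical. Real tools redact rendered documents, not plain strings.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def redact(text: str, phrase: str, mask: str = "[REDACTED]") -> str:
    """Replace every occurrence of a sensitive phrase with a mask."""
    return text.replace(phrase, mask)


# Hypothetical document content with a made-up identifier.
original = "Employee: J. Doe, SSN 123-45-6789, salary approved."
redacted = redact(original, "123-45-6789")

original_hash = sha256_hex(original.encode("utf-8"))
# An untouched copy of the same bytes always hashes to the same value.
copy_hash = sha256_hex(original.encode("utf-8"))
# Any alteration, including redaction, produces a different value.
redacted_hash = sha256_hex(redacted.encode("utf-8"))

if __name__ == "__main__":
    print("original:", original_hash)
    print("copy    :", copy_hash)
    print("redacted:", redacted_hash)
    print("copy matches original:", copy_hash == original_hash)
    print("redacted matches original:", redacted_hash == original_hash)
```

The matching hash of the untouched copy is what lets a party show that a preserved original is authentic, and the differing hash of the redacted text is why the redacted version has to be tracked as a separate document alongside it.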
[i] Committee on the Judiciary. Federal Rules of Civil Procedure. 111th Cong., 1st sess., 2009.
[ii] A Project of The Sedona Conference Working Group on Electronic Document Retention & Production (WG1). The Sedona Principles: Second Edition, Best Practices Recommendations & Principles for Addressing Electronic Document Production. June 2007.
[iii] Craig Ball. Beyond Data about Data: The Litigator’s Guide to Metadata. 2005.
[iv] Nerino Petro and Bryan Sims, “Avoiding ethical pitfalls with electronic documents: Part 1 – metadata,” The State Bar of Wisconsin Inside Track. June 16, 2010. CustomSource/InsideTrack/contentDisplay.cfm&ContentID=93679

About the Author

Christine Musil is Director of Marketing at Informative Graphics, a leading developer of commercial software products for secure content viewing, collaboration and redaction. Founded in 1990, Informative Graphics products are deployed by thousands of corporations worldwide.

For more information, please contact:
Informative Graphics Corp.
4835 E. Cactus Road, Suite 445
Scottsdale, AZ 85254
Phone: 800.398.7005 (intl +1.602.971.6061)

© Copyright 2011 Informative Graphics Corporation

