A wise trial lawyer of the greyish set said to me, not without exasperation: “This e-discovery stuff is a bother. When I started practice, I would ask the client for the hot documents, and they would be sent to me.” How times have changed.
Generally, clients aren’t actively trying to keep hot documents out of their lawyer’s hands. Clients appreciate that the sooner the lawyer understands the facts, the sooner that lawyer can give them good advice. And few clients are thrilled by the prospect of paying for lawyers to pore over trivial documents, which can happen when e-discovery is poorly planned or implemented.
Lawyers rarely just get the hot documents from their clients anymore, because clients often cannot readily produce them. The documents are now buried and intermingled with other documents in an e-mail box or on a computer with one or more disorganized drives, or they may sit somewhere else, such as in the cloud or on a thumb drive. Lawyers know that few clients are really capable of organizing their information anymore. Even when a client is organized, it may not be holding the information in the way required for litigation. Very few clients store data in file folders conveniently labelled “duty of care” or “special damages - supporting calculations.”
Which is why any substantial e-discovery effort turns to search, including search strategy and technology. Let’s spend some time dissecting search, because there is a whole world of search that most lawyers leave uncharted.
Lawyers are most familiar with keyword searches and Boolean operators. That is because almost every lawyer has learned to do legal research using an online case law system designed and structured to be used with keyword search. Additionally, lawyers who have used many document review tools have also relied heavily on keyword searches.
The good news is keyword searching can be very effective in online case law systems and, in some circumstances, can be effective to find potentially responsive or privileged documents in document review tools.
That is because the data within those tools has already been curated — specially prepared and organized for keyword searches. Moreover, those tools are also designed to ensure keyword and Boolean searches will work.
When you pull up the search function, you usually find a space to enter keywords and a space for Boolean operators. In previous articles discussing indexation and noise words and their role in search, I have explained why and how these tools have been designed this way. The important takeaway is that keywords work in these systems because the systems have been designed for keyword search.
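To make indexation and noise words concrete, here is a minimal sketch of how a keyword index might be built and queried. It is an illustration under stated assumptions, not how any particular review tool or case law system works: the noise-word list, the function names, and the sample documents are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical noise-word list; real tools ship (and often let you edit) their own.
NOISE_WORDS = {"a", "an", "and", "of", "or", "the", "to"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop noise words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in NOISE_WORDS]


def build_index(documents):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_and(index, *terms):
    """Boolean AND: documents that contain every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Boolean OR: documents that contain at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


# Hypothetical sample documents.
documents = {
    1: "The duty of care owed to the plaintiff",
    2: "Special damages and supporting calculations",
    3: "Lunch order for the team",
}
index = build_index(documents)

print(search_and(index, "duty", "care"))     # documents containing both terms
print(search_or(index, "damages", "lunch"))  # documents containing either term
print(search_and(index, "the"))              # empty: noise words are never indexed
```

The last query shows why noise words matter: a term that was never indexed cannot be found, however many documents contain it.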
The bad news is lawyers cannot assume other electronic systems that hold information potentially required for litigation have been designed the same way. They haven’t. For example, searching in Google works very differently from searching in Westlaw. Choose a Boolean search, try it in both, and compare the results.
Moreover, most other applications, including computer operating systems, e-mail systems, and other common productivity applications, are not really designed for keyword search, or may have very limited keyword search capabilities. Test this yourself, for example, on Facebook. You’ll see search works differently there than in Google or Outlook.
It follows that it is risky to assume you can perform keyword searches in a client’s particular computer system; collection or e-discovery within that system that relies on keywords may yield inaccurate results. I am frequently surprised that lawyers agree to let opposing counsel have their clients run keyword searches in on-site computer systems to locate potentially responsive documents, without understanding what those systems are, how they search, and the limitations of their search capabilities.
Lawyers need to ask a series of questions before accepting collection by keywords as a sound strategy. The search functionality within enterprise systems is improving, and some systems are quite good, but understand what you are dealing with before you implement any collection or culling strategy based only on keywords.
The best news? Lawyers should understand that search need no longer be limited to keywords within a curated population. Various forms of search are now available within e-discovery applications once the data is processed (treated and prepared to be used in e-discovery platforms). There are great tools available, all bundled under the term “analytics.”
For example:
E-mail threading takes e-mails and pulls the messages from a related conversation together into a single string. Threading is very helpful in seeing how an e-mail conversation unfolds, who was copied and blind copied on e-mails, whether there was any splintering in the conversation, and who might have been forwarded all or part of the string. E-mail threading will work even when people are dropped off chains, or where the subject line was changed.
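One way threading can survive a changed subject line is by relying on message headers rather than subjects. The sketch below groups messages using the standard `Message-ID` and `In-Reply-To` headers. It is a simplified illustration only: commercial tools may also analyze quoted text and other signals, the function names and sample messages are hypothetical, and the code assumes every message carries a `Message-ID`.

```python
from collections import defaultdict
from email import message_from_string


def thread_root(msg_id, parent_of):
    """Follow In-Reply-To links upward until a message has no known parent."""
    seen = set()
    while msg_id in parent_of and msg_id not in seen:
        seen.add(msg_id)  # guards against malformed, circular reply chains
        msg_id = parent_of[msg_id]
    return msg_id


def group_threads(raw_messages):
    """Group message subjects by the id of the message that started the thread.

    Assumes every message has a Message-ID header.
    """
    parsed = [message_from_string(raw) for raw in raw_messages]
    parent_of = {}
    for msg in parsed:
        if msg["In-Reply-To"]:
            parent_of[msg["Message-ID"].strip()] = msg["In-Reply-To"].strip()
    threads = defaultdict(list)
    for msg in parsed:
        root = thread_root(msg["Message-ID"].strip(), parent_of)
        threads[root].append(msg["Subject"])
    return dict(threads)


# Hypothetical messages; the third one changes the subject line mid-thread.
raw_messages = [
    "Message-ID: <1@example.com>\nSubject: Contract draft\n\nFirst message.",
    "Message-ID: <2@example.com>\nIn-Reply-To: <1@example.com>\n"
    "Subject: RE: Contract draft\n\nReply.",
    "Message-ID: <3@example.com>\nIn-Reply-To: <2@example.com>\n"
    "Subject: Call me\n\nSubject changed.",
    "Message-ID: <4@example.com>\nSubject: Lunch\n\nUnrelated.",
]
threads = group_threads(raw_messages)
for root, subjects in threads.items():
    print(root, subjects)
```

Here the message with the subject “Call me” still lands in the contract thread, because the link is carried by the reply header and not by the subject.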
Domain searching or alias searching quickly pulls out every e-mail address and every domain, so you can quickly see whose e-mails you have. This can be very helpful in reconciling the identity of a person to a single e-mail address. Alternatively: want to search for any person or e-mail from a certain domain, like @yahoo.com or a particular law firm? Easy: just search on the domain field for the domain you are looking for. Domain search features can really help on privilege review.
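As a rough illustration of what a domain field contains, the following sketch pulls every address out of a set of header values and counts the domains. The function name and the sample addresses are hypothetical, and a real e-discovery platform would populate this field during processing instead of at search time.

```python
from collections import Counter
from email.utils import getaddresses


def domains_in(header_values):
    """Count the domain of every address found in the given header strings."""
    counts = Counter()
    for _name, address in getaddresses(header_values):
        if "@" in address:
            # Domains are case-insensitive, so normalize before counting.
            counts[address.rsplit("@", 1)[1].lower()] += 1
    return counts


# Hypothetical From/To/Cc header values.
headers = [
    "Jane Doe <jane.doe@yahoo.com>, counsel@examplelaw.com",
    "Jay Doe <JDoe@Yahoo.com>",
]
counts = domains_in(headers)
print(counts.most_common())
```

On a privilege review, a count like this gives a quick list of the law firm domains present in the collection, which can then be searched one by one.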
Concept searching is a search feature that pulls up concepts related to a word or phrase. It is more powerful than a synonym detector. For example, suppose your case was about dogs. Searching for dog or its variants will get you some good results, but a concept search will pull together dog-themed things, such as walks, ball, and kibble. Those additional concepts could prove helpful. Concept searching does not work exactly like keywords though, so make sure you get properly and fully trained on its use, or you might get frustrated the first time you use it.
Finally, near-duplicate detection can pull together related documents. This feature can help to quickly locate similar documents, for example drafts or versions of documents. Many applications also have a redline function that can show you the difference between near duplicates. I have also used this feature to find standard forms for mass or bulk tagging. Some technologies have variants of near duplication where you can highlight a paragraph or word snippet and find related items in other documents; when combined with concept searching, these search features can be very effective at locating hot documents.
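For readers curious how near-duplicate detection and redlining can work, here is a minimal sketch. It compares overlapping three-word sequences (often called shingles) and scores the overlap with Jaccard similarity, then lists word-level differences. This is one common textbook approach, not necessarily what any given product uses; the function names, shingle size, and sample drafts are assumptions made for illustration.

```python
import difflib


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Similarity between 0.0 and 1.0: shared shingles over all shingles."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def redline(a, b):
    """List the word-level changes that turn text a into text b."""
    wa, wb = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, wa, wb, autojunk=False)
    return [
        (tag, " ".join(wa[i1:i2]), " ".join(wb[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical drafts of the same clause, plus an unrelated document.
draft_1 = "The supplier shall deliver the goods by March 1"
draft_2 = "The supplier shall deliver the goods by April 15"
other = "Lunch order for the team on Friday"

print(jaccard(draft_1, draft_2))  # well above zero: likely near duplicates
print(jaccard(draft_1, other))    # zero: no shared three-word sequences
print(redline(draft_1, draft_2))
```

In practice a tool would apply a similarity threshold to decide which documents to group, and the right threshold depends on the document set.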
My advice to lawyers embarking on e-discovery is this: don’t needlessly narrow your search tools. Educate yourself on options and be smarter about how you find information related to your case. Search does not need to be limited to keywords and the brute force of reviewing all the hits generated by a keyword search.
Understanding search can help you recover some of the elegance of yesteryear, but with the computer returning the hot documents, so you can get on with the job of building your case.
Dera J. Nevin is the director of e-discovery services at Proskauer Rose LLP. The opinions in this article are entirely her own.