Digital Information Glossary

There are numerous terms and concepts related to eDiscovery and the management of electronically stored data that industry professionals must know and understand. Below is a sampling of some of the most common terms used in the eDiscovery world.


Bayesian Search: An advanced search based on Bayes' theorem, named for the 18th-century mathematician and clergyman Thomas Bayes, that uses statistical probability rules to compute the likelihood that a document is relevant to a query

 Clustering: Unsupervised machine learning in which thematically similar files are grouped together based on the text of the individual files

 Container File: A compressed file containing multiple files; used to minimize the size of the original files for storage and/or transporting

 Early Data Assessment: The process, performed at the beginning of a case, of separating possibly relevant electronically stored information from non-relevant electronically stored information using both computer techniques, such as date filtering or advanced analytics, and human-assisted logical determinations

 Ephemeral Data: Data that exists for a very brief, temporary period and is transitory in nature, such as data stored in random access memory (RAM)

 ISO 27001: An ISO standard that formally specifies an Information Security Management System (ISMS), a suite of activities concerning the management of information security risks. The ISMS is an overarching management framework through which an organization identifies, analyzes, and addresses its information risks.

 Stop Words: Common words (e.g., all, the, of, but, not) that are purposefully excluded from a search index when it is created in order to make the index more efficient

 Elusion: The percentage of documents in a search’s null set (the documents the search did not retrieve) that are in fact relevant and were therefore missed by the search, usually estimated by reviewing a random sample of the null set

 Supervised Learning: Use of machine learning to analyze data, using training examples that have been coded by humans

 Hash Coding: A mathematical algorithm that calculates a value that is, for practical purposes, unique to a given set of data, similar to a digital fingerprint, representing the binary content of the data to assist in subsequently verifying that the data has not been modified
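
The Hash Coding entry above can be illustrated with a short sketch. It assumes SHA-256 as the hash algorithm (the glossary does not name one), and the function name and sample text are hypothetical, chosen only for illustration.

```python
import hashlib


def file_fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string.

    SHA-256 is an assumption for this sketch; other hash algorithms
    are used in the same way.
    """
    return hashlib.sha256(data).hexdigest()


# Hypothetical document contents, used only for illustration.
original = b"Quarterly report, final version"
copy = b"Quarterly report, final version"
altered = b"Quarterly report, final version."  # one extra character

# Identical content always produces the same hash value.
same_as_copy = file_fingerprint(original) == file_fingerprint(copy)

# Changing even a single character produces a different hash value.
same_as_altered = file_fingerprint(original) == file_fingerprint(altered)

print(same_as_copy, same_as_altered)
```

Comparing the hash recorded at collection with one computed later is a common way to show that a file has not changed in between.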

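
The Elusion entry above lends itself to a small worked calculation. The sketch below is only an illustration: the function name, the sample size, and the counts are hypothetical figures, not taken from any real review.

```python
def elusion_rate(sample_size: int, relevant_in_sample: int) -> float:
    """Estimate elusion as the share of a random null-set sample
    that reviewers found to be relevant."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= relevant_in_sample <= sample_size:
        raise ValueError("relevant_in_sample must be between 0 and sample_size")
    return relevant_in_sample / sample_size


# Hypothetical figures: 100,000 documents were not retrieved by the search,
# 1,500 of them were sampled at random, and 12 of those proved relevant.
null_set_size = 100_000
rate = elusion_rate(sample_size=1_500, relevant_in_sample=12)

# Projecting the sample rate onto the whole null set gives a rough
# estimate of how many relevant documents the search missed.
estimated_missed = round(rate * null_set_size)

print(f"{rate:.2%}", estimated_missed)
```

Because the rate comes from a sample, it is an estimate, and a larger random sample generally gives a more reliable one.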

For a more comprehensive glossary, see The Sedona Conference Glossary, eDiscovery & Digital Information Management, Fifth Edition.
