Using its enhanced visual search technology, ZyLAB, a leading provider of eDiscovery and information management solutions, has located an additional set of more than 60 previously undetected items in the EDRM Enron PST Data Set that contain explicit content as well as private, health, and financial information. ZyLAB has shared this information with EDRM, the leading standards organization for the eDiscovery and information governance market, in response to EDRM's invitation to assist with an ongoing effort to cleanse the Enron data set.
The Enron Data Set was previously hosted by EDRM and in 2012 became an Amazon Web Services Public Data Set. It has served for many years as an industry-standard collection of email data for electronic discovery training and is a valuable public resource for researchers from all disciplines.
It has never been a secret that the data set, originally made available by the Federal Energy Regulatory Commission (FERC), contained a high level of personally identifiable information (PII) about the company’s former employees.
By combining the brand-new ZyLAB Visual Classification technology with its existing deep processing, content analytics, and search capabilities, ZyLAB identified several hard-to-find items, such as documents containing social security and credit card numbers, protected health information, 1040 tax forms, and even indecent pictures.
For over 30 years, ZyLAB’s unique search and content analytics technology has been developed to help customers find more relevant information than any other product on the market, regardless of spelling errors, OCR errors, deliberately hidden data, aliases, code words, digital format, location, or language, and regardless of whether the data contains explicit text, as with images, video, or audio recordings. The results of this effort show the power of the unique ZyLAB search technology.