Wednesday, April 11, 2012

“P” is for Processing: Part 2



http://ow.ly/adeto

An article by Chuck Rothman posted on the eDiscovery Journal website.

This article, part 2 of a series on processing, looks at what takes place during the de-duplication process.  It covers emails, attachments, and Microsoft Office documents.

The author states, "...when conducting a native review, what format should the native email, without attachment, take? In many cases, it ends up being the email extracted from its container, with all the attachments still embedded! This leads to some issues:

1. From a cost perspective, the volume is nearly doubled – you pay for the email file including the size of all attachments, plus you pay for each extracted attachment.

2. If the review environment is web based (as many of the more modern ones are), the amount of time a reviewer waits for a record to appear on their screen is directly related to the size of the file being downloaded. If a very short email contains many attachments, a reviewer could wait 30 seconds or more before being able to review one or two lines of text. If a review database contains 100,000 such emails, that’s an extra 50,000 minutes, or 833 hours of review time that has been added solely because of the way the email was processed.

3. If producing native files, it is impossible to redact attachments to an email if the email is produced.

A more sensible way to process emails is to think of it as like a zip file – the email container contains an email body and zero or more attachments. Each attachment, as well as the email body, is extracted from the email container file, and the whole group is linked together."
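
To make the zip-file analogy concrete, here is a minimal Python sketch using the standard library's email package. It is not taken from the article: the function names split_email and build_sample, the record layout, and the use of a hash of the container as the group ID are all assumptions made for illustration. It extracts the body and each attachment as separate records and links them with a shared group ID, so the body record no longer carries the attachment bytes.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw: bytes) -> list[dict]:
    """Split one raw email into a body record plus one record per attachment.

    The record layout and group_id scheme are hypothetical, for illustration only.
    """
    msg = message_from_bytes(raw, policy=policy.default)

    # Assumed linking key: a short hash of the whole container file.
    group_id = hashlib.sha256(raw).hexdigest()[:16]

    # Pull out the body on its own, preferring plain text over HTML.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body_text = body_part.get_content() if body_part is not None else ""
    records = [{
        "group_id": group_id,
        "role": "body",
        "name": str(msg["Subject"]),
        "data": body_text.encode("utf-8"),
    }]

    # Each attachment becomes its own record, linked by the same group_id.
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        records.append({
            "group_id": group_id,
            "role": "attachment",
            "name": part.get_filename(),
            "data": payload,
        })
    return records


def build_sample() -> bytes:
    """Build a short email with one 1,000-byte attachment (made-up test data)."""
    msg = EmailMessage()
    msg["Subject"] = "Q1 figures"
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg.set_content("See attached.")
    msg.add_attachment(b"x" * 1000, maintype="application",
                       subtype="octet-stream", filename="figures.bin")
    return msg.as_bytes()


if __name__ == "__main__":
    for record in split_email(build_sample()):
        print(record["group_id"], record["role"], record["name"], len(record["data"]))
```

In this sketch the body record holds only the email text, so a reviewer opening it would not download the attachment as well, and each attachment could be redacted or withheld on its own while the group ID keeps the family together.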
