Wednesday, February 29, 2012

A Stream of Auto-Classification Consciousness by Randolph Kahn, Esq.

“But how will a court judge our use of auto-classification technology to do the heavy lifting regarding what information was a record and what was junk?”


“I want to be comfortable with our decision that using algorithmic classification software technology to apply our records retention rules and clean up the contents of our shared drive won’t get us flogged by a regulator or a court.”


“I am concerned that if we get rid of this data without having our employees review it manually, that we are open to attack in a court.”


We have empirical data to support the proposition that employees classify and code information way worse than computers, by a long shot. Yet most companies continue to rely on their employees to manage information. “[T]echnology-assisted process, in which only a small fraction of the document collection is ever examined by humans, can yield higher recall and/or precision than an exhaustive manual review process, in which the entire document collection is examined and coded by humans.” (Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Maura R. Grossman, J.D., Ph.D. and Gordon V. Cormack, Ph.D.)
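For readers who have not met the terms "recall" and "precision" in the quote above, here is a minimal sketch of how they are usually computed. The document IDs and the two sets are made up purely for illustration; they are not data from the Grossman and Cormack study.

```python
def recall_precision(flagged, actual):
    """Compare the documents a reviewer (human or software) flagged as
    records against the documents that really are records.

    Both arguments are sets of document IDs (hypothetical names).
    """
    true_positives = len(flagged & actual)
    # Recall: what share of the real records did the reviewer find?
    recall = true_positives / len(actual) if actual else 0.0
    # Precision: what share of the flagged documents were really records?
    precision = true_positives / len(flagged) if flagged else 0.0
    return recall, precision


# Illustrative data only: four true records, four documents flagged.
actual_records = {"doc1", "doc2", "doc3", "doc4"}
flagged_by_tool = {"doc1", "doc2", "doc3", "doc9"}

recall, precision = recall_precision(flagged_by_tool, actual_records)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

A reviewer who misses many real records has low recall; one who flags a lot of junk has low precision. The quote says technology-assisted review can beat exhaustive manual review on one or both.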


Most big companies have petabytes of structured and unstructured content, which amounts to billions of files. Do you think a judge would say it was "reasonable" to expect employees to classify and review billions of files before they could be purged? According to the Council of Information Auto-Classification’s “The Information Explosion Survey”, 98% of organizations reported rapid information growth that they predict will extend into the future, and that growth is creating a variety of challenges and consequences. Half of the respondents indicated they are forced to recreate information previously created because they cannot find it. 74% of the organizations stated valuable information is being lost (i.e. can’t find, disposed of, misplaced) due to the lack of proper technology solutions. 73% of respondents reported their organization misses business opportunities because they can’t efficiently access information.


Technology is amazingly powerful at uncovering value from information and connecting dots. At the same time, people are powerless to make sense of the mountain of data in front of them. People bet their lives on the Genome Project, made possible by technology unearthing the connections in data, yet you are not sure whether you should use auto-classification technology to determine if an email is a record.


In an article entitled “Search, Forward: Will manual document review and keyword searches be replaced by computer-assisted coding?”, US Federal Magistrate Judge Andrew Peck wrote, “[p]erhaps they are looking for an opinion concluding that: ‘It is the opinion of the court that the use of predictive coding is a proper and acceptable means of conducting searches under the Federal Rules of Civil Procedure, and furthermore that the software provided for this purpose by [Insert name of your favorite vendor] is the software of choice in this court.’ If so, it will be a long wait… Until there is a judicial opinion approving (or even critiquing) the use of predictive coding, counsel will just have to rely on this article as a sign of judicial approval. In my opinion, computer-assisted coding should be used in those cases where it will help ‘secure the just, speedy, and inexpensive’ (Fed. R. Civ. P. 1) determination of cases in our e-discovery world.”


And Judge Peck strikes again in Moore v. Publicis Groupe. In his February 22, 2012 order in the case, he wrote: “Computer-assisted review appears to be better than the available alternatives, and thus should be used in appropriate cases. While this Court recognizes that computer-assisted review is not perfect, the Federal Rules of Civil Procedure do not require perfection…Counsel no longer have to worry about being the “first” or “guinea pig” for judicial acceptance of computer assisted review.”


If computer-assisted review is OK for finding relevant information for a lawsuit, why shouldn’t you be comfortable with using these types of technologies to classify records? Clearly, applying records management rules is a far less risky proposition than responding to discovery. Auto-classification is a way to better manage and, if appropriate, defensibly dispose of huge volumes of data when people can’t. The courts are now making that decision easier.

Monday, December 12, 2011

Is It Potentially Relevant?

Remember, no matter what Daisies you WANT to chuck, you can’t take ANY action unless you make sure the information is not even potentially relevant and/or needed for threatened, imminent or active lawsuits, investigations or audits. Take, for example, the Hackergate investigation. That’s the case of News Corp. journalists hacking into the voice mails of people in the news to gain insights into their lives so they could report on the information. Lots of folks are under scrutiny (and they have already shut down the offending newspaper in England) because of the scandal. However, as reported in the Wall Street Journal on December 12, 2011 in an article entitled ‘Hacking Investigation Questions Who Erased Voice Mails’, the investigation is focusing on the deletion of certain voice mail messages, as such action might point to the guilty party.

Thursday, October 20, 2011

Randy Kahn says . . .

Press Release: Autonomy Unveils Meaning-Based Policy Control Solution for Governance and Compliance

"Businesses need a comprehensive tool to automatically and consistently administer policies and determine risk," said Randolph Kahn, founder of Kahn Consulting and author of "Information Nation". "Autonomy is taking a unique approach to implementing and managing policies to govern the lifecycle of information."

Read the full Press Release here.

Thursday, September 8, 2011

Records Memorialize

NY Times reports that an “independent investigative committee found that the governor of Saga prefecture told the operator, Kyushu Electric Power, to send e-mails supporting the restart of two reactors at the company’s Genkai Nuclear Power Station. The company has already admitted to ordering employees to pose as regular citizens by sending e-mails during an online town hall-style meeting in June over whether to allow the restart of the reactors.”

Records memorialize and can hurt. Time for some business ethics, risk management and email training. Nice job, Japan: allow a disaster to happen and add insult to injury until you have no credibility. Brilliant.

Tuesday, July 26, 2011

Your Business Needs to Rightsize

A Dozen Really Good Reasons Why Your Business Needs to Rightsize its Information Footprint

“Rightsizing Your Information Footprint” is my made-up term for turning your Information Parking Lots into a Goldilocks and the Three Bears amount of information: not too much, not too little, but just the right amount. There is too much digital content, with more created continuously. We need to clean up the past in a defensible way. While the daisies are beautiful at the beginning of their life, they lose their appeal as they decay. The same is generally true for information. Businesses also need a better path forward so that content comes into being because the business needs it, and all records are better managed.

Too much stuff, you fail to be business efficient and you get your clock cleaned when litigation strikes.
Too little information, you can’t run your business and you fail to comply with record keeping requirements, among other things.

So here are 12 remarkably compelling reasons to Rightsize, right now:

1. Information is growing at such a rapid rate that costs related to storing, finding, using, migrating, extracting and preserving information are too high
2. Knowing what information exists and where it is parked to be able to efficiently run your business is too complex
3. Technology has failed to find a good way to manage content with little impact to employee productivity (but Kahn is working on auto-classification to help)
4. Employees get too much content to be able to properly manage it
5. Content has sat for years in old Information Parking Lots and it is a decaying asset (Working on my new book called Chucking Daisies to help companies deal with this precise issue)
6. Companies spend too much time looking through way too much irrelevant stuff to respond to litigation, audits and investigations
7. Companies have out-of-date records used against them in litigation, which could have been disposed of earlier
8. Systems are breaking down or no longer work as efficiently as they should, due to information volume burden
9. Data parking lots are being ill-managed and that failure is causing other failures, not the least of which is failing to harness needed information to be “faster, better and cheaper.”
10. Going Green. No list is complete until it has a bit of Green. Technology is using all kinds of energy and by cutting your energy, emission and every other relevant footprint, you are greener, you look better to the outside world and maybe the marketers have something Green to say about the effort
11. Information finds itself on unsanctioned data Parking Lots when sanctioned ones fill up, making life more challenging
12. Along with volume, growth has created many new Information Parking Lots (smartphones, Cloud, Twitter, blogs, etc.), which makes management that much more challenging

Rightsizing will never be as easy as it is right now as information Parking Lots grow and grow. Clean house of digital data junk. Develop a thoughtful plan for future information retention. Rightsize now because it’s good business.

Monday, July 11, 2011

Too much data is bad

The June 30, 2011 issue of The Economist covers a story about “Too Much Information” and “How to cope with data overload.” At a minimum, that means the folks across the pond are also realizing that at some point too much data is a bad thing. The business world is at the place where we are overrun with digital stuff, and it is now taking away a competitive advantage, negatively impacting customer response times and impacting our ability to be the nimble business machine honed to win.

I have been writing about this topic for years, but now it is at a point that business executives need to act. We have more technologies making more content, with or without our involvement, 24-7. Data volume nearly doubles every year and we couldn’t manage last year’s stuff efficiently. It only gets harder and something has to give. The real answer is not building bigger clouds of storage stacks. We can’t keep everything forever and there must be a prudent way to make wheat/chaff decisions about what should exist and what can be disposed of.

Three things you need to think about doing right now:
1. Develop a team to start to clean up the past. Existing data needs to go away according to law and policy now.
2. Better decisions need to be made about what comes into existence. Not everything needs to be retained.
3. Directives that stop the wheels of progress due to FUD (Fear, Uncertainty and Doubt) should not rule the day. Fight the lawyers’ shotgun approach to preservation. For example, if backup tapes are recycled regularly, don’t stop that process if a lawsuit is filed, unless required to.

Get your information house in order. Your business depends on it.

Thursday, May 19, 2011

The Three Bears Solution

Can I get rid of all that “old” information tomorrow?

“My company is so full of info debris that we are no longer efficient.” “Hey Randy, why can’t we just get rid of everything right now and have clean servers and start fresh tomorrow?” Seems like a wonderful idea. After all, it’s springtime, the perfect time for spring cleaning.

Don’t hit the delete button so fast, bucko. You can’t just blow everything away tomorrow and here is why.

Four compelling reasons why jail would be so not fun.
1. I don’t want a forced roommate.
2. I like going OUT for Asian food.
3. I don’t do well when told when to eat, sleep, relax, exercise etc. I like freedom.
4. I like to travel and jail would severely limit my freedom of movement.

OK, so the law requires that records are retained. Every business, big and small, is required to retain records of its business.

Four compelling business reasons why destroying everything immediately is stupid.
1. How can you manage your day-to-day and long range business activities without records?
2. How do you know what your business rights and obligations are if you don’t have documentation?
3. How will you manage employees and customer relationships without something to rely upon?
4. How will you keep managers, board members, and executives apprised of what’s going on?

Ok, so there are business reasons to manage records and have a way to access and retrieve content to run your business.

Over-retention of information is a major issue for most businesses today. Way too many companies are storing too much stuff, way too long. That equates to real money which could be better used for other business activities. So, more is not necessarily better. All is not tenable. Too little is a business impediment and a legal headache waiting to happen. So, I need a Three Bears Solution: “This pile of information is too big, this pile of information is too small. Oh, this pile of information is just right.” Easy in the porridge business. Not so easy in the information management business.

So, let me help you start to think about getting your business to the place that says we have just the right amount—not too much, not too little.

In order to retain the right amount of information, you first have to know what information you have, what business value it provides and the many legal, regulatory and compliance needs for the information. Then, by considering all those inputs you can determine how long to retain the information. As with anything, there is always an end to the value. This explains why you shouldn’t keep everything forever.

Now, I’m sure there is a whole bunch in the email system, on shared drives, on old servers, etc. just screaming to go to the info graveyard right now. But how can you get rid of the data that has been stored and ill-managed over time? First, you need to do due diligence around what information exists. Second, you need to determine what information is subject to any audit, investigation or litigation preservation obligations. In that case, the information has to continue to exist until the matter is over and the lawyers say it’s OK to destroy. Finally, you need to assess what record retention rules apply. It gets rather complicated pretty quickly, so if you have questions, please don’t hesitate to ask. Send your questions to RKahn@kahnconsultinginc.com. Better safe than sorry.
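To make the three steps above concrete, here is a rough sketch of how the order of the checks might look if you wrote it down as a rule. Everything in it is an assumption for illustration: the `Item` fields, the record types and the retention periods in `RETENTION_YEARS` are hypothetical, and a real schedule comes from your own legal and business requirements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    """One inventoried piece of information (step 1: know what exists)."""
    name: str
    record_type: str
    created: date
    on_legal_hold: bool


# Hypothetical retention schedule, in years. Not real legal guidance.
RETENTION_YEARS = {"invoice": 7, "email": 2}


def add_years(d, years):
    """Return d shifted forward by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def can_dispose(item, today):
    # Step 2: anything under an audit, investigation or litigation hold stays.
    if item.on_legal_hold:
        return False
    # Step 3: apply the retention rule for this type of record.
    years = RETENTION_YEARS.get(item.record_type)
    if years is None:
        # No rule on file: leave it for a person to look at.
        return False
    return today >= add_years(item.created, years)


old_email = Item("status update", "email", date(2008, 1, 1), on_legal_hold=False)
held_email = Item("contract dispute", "email", date(2008, 1, 1), on_legal_hold=True)
today = date(2011, 5, 19)
print(can_dispose(old_email, today), can_dispose(held_email, today))
```

The point of the ordering is that the preservation check comes before the retention rule, so an expired retention period never overrides a hold, and anything without a rule defaults to being kept.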

Finally, I strongly believe in cleaning house but in today’s litigation environment you need to do it in a defensible way. No doubt leaner running is better business. But, innocent house cleaning can be considered “destruction of evidence” so clean with a documented plan that is followed and blessed by the business folks and the lawyers.