Archive for the ‘Competing on Analytics’ Category

Lately, I have been thinking about the entire big data trend. Fundamentally, it makes sense to me, and I believe it is useful for some enterprise-class problems, but something about it had been troubling me, so I decided to take some time and jot down my thoughts. As I thought more about it, I realized that my core issue is with some of the oversimplified rhetoric I hear about what big data can do for businesses. A lot of it is propagated by speakers and companies at big-name conferences and subsequently echoed by many blogs and articles. Here are the 3 main myths that I regularly hear:
1. More data = More insights
An argument I have heard a lot is that with enough data, you are more likely to discover patterns, facts and insights. Moreover, with enough data, simple counting can uncover patterns and facts that sophisticated statistical methods cannot find in small data.

My take:
It is true, but as a research concept. For businesses, the key barrier is not the ability to draw insights from large volumes of data; it is asking the right questions for which they need an insight. It is never wise to generalize about the usefulness of large datasets, since the ability to provide answers depends on the question being asked and the relevance of the data to that question.

2. Insights = Actionability = Decisions
It is almost an implicit assumption that insights will be actionable and that, because they are actionable, business decisions will be made based on them.

My take:
There is a huge gap between insights and actionability. Analysts always find very interesting insights, but only a tiny fraction of them will be actionable, especially if one has not started with a very strong business hypothesis to test.

Even more dangerous is the assumption that because an insight is actionable, an executive will make the decision to implement it. Ask any analyst who has worked in a large company and he or she will tell you that the realities of business context and the failure of rational choice theory stand in the way of a lot of good actionable insights turning into decisions.

3. Storing all data forever is a good thing
This is the Gmail pitch. Enterprises do not have to decide which data to store and which to purge. They can and should store everything because of Myth 1: more data means more insights and competitive advantage. Moreover, storage is cheap, so why would you not store all data forever?

My take:
Remember the backlash against Gmail, which did not have a delete button when it started? The fact is that there is a lot of useless data, and it increases the noise-to-signal ratio. Enterprises struggle with data quality issues, and storing everything without any thought as to which data is useful for which kinds of questions does more harm than good. Business-centric approaches to data quality and data architecture have a significant payoff for downstream analytics, and we should give them their due credit when we talk about big data.
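
The noise point can be illustrated with a minimal Python sketch. Everything in it is an assumption made up for illustration: the data is purely random, the function names are hypothetical, and the row and column counts are arbitrary. It generates a random "target" metric and then checks how strongly the best of many unrelated random columns appears to correlate with it. Because the columns are generated in the same order from the same seed, a larger collection always contains the smaller one, so the strongest chance correlation can only stay the same or grow as more irrelevant columns are kept.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def strongest_spurious_correlation(n_columns, n_rows=50, seed=0):
    """Largest |correlation| between a random target and unrelated random columns."""
    rng = random.Random(seed)
    # Hypothetical business metric: pure noise, so nothing truly predicts it.
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_columns):
        # Each stored column is also pure noise, unrelated to the target.
        column = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(column, target)))
    return best


# Compare how convincing the "best" irrelevant column looks as more are stored.
for n in (5, 50, 500):
    print(n, round(strongest_spurious_correlation(n), 2))
```

None of these columns carries any information about the target, yet the more of them are kept, the more convincing the best accidental match can look. This is one way in which storing everything without a question in mind can add noise rather than insight.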

In summary,

1. There is a lot of headroom left for small data insights that enterprises fail to profit from.
2. There are indeed some very interesting use cases for big data which are useful for enterprises (even the non-web related ones).
3. But the hype and the oversimplification of the benefits, without thoughtful consideration of issues and barriers, will eventually lead to disappointment and disillusionment in the short run.

Some interesting perspectives on the topic: James Kobielus, Rama Ramkrishnan



I have written earlier that insight at the point of decision making or action is critical. I came across a great example of it from the good folks at Sunlight Foundation, who are trying to bring transparency to political influence.

Inbox Influence is a browser extension that adds political influence data to your Gmail messages. With Inbox Influence installed, you’ll see information on the sender of each email, the company from which it’s sent, and any politician, company, union or political action committee mentioned in the body of the email. The information is added unobtrusively and nearly instantaneously, and includes campaign contributions, fundraisers and lobbying activity. You can use it to add context to news alerts, political mailers and corporate emails, or just to see who your friends donated to in the last election.

By focusing on email, they have built a tool that delivers insights where the action (solicitation, support, contribution commitment) is most likely to happen, and makes them part of the normal workflow.

I played around with the tool a bit, and it was interesting to see the campaign contributions and lobbying activity of financial institutions, cable companies and cell phone companies, surfaced from the statement notifications they sent to my Gmail account.

This blog entry explains the technical challenges that the developer had to overcome to build this nifty tool and describes the back-end databases it searches. The key takeaway for organizations thinking about their BI architecture is not to underestimate the effort it takes to overcome last-mile infrastructure issues. It is normally the difference between the success and failure of a project from a business perspective.


Think about the large, successful organizations that are known for harnessing information for competitive advantage (P&G, Goldman Sachs, Capital One, Harrah’s, Progressive Insurance) and you will find one thing in common: their C-level executives drive data-driven decision making from the top down. And the more organizations I see, the more convinced I am that this is one of the most important factors for a company that wants to ‘compete on analytics’.

Here is my hypothesis of why it is so:

There is a fundamental Catch 22 situation in most large companies. Organizations do not have consistently good quality data (mainly due to process issues during intake), and unless the data is used to make real business decisions, it is hard to improve its quality.

This Catch 22 can only be resolved by a very senior executive (read C level) who commits to making decisions and measuring performance based on analysis done with imperfect data (but data good enough for many types of decisions and relative measurements). Once middle management understands how the data is being used, it spurs process changes to fix the quality issues, which in turn increases the accuracy and reliability of the analysis. This virtuous cycle is key for large companies which ‘compete on analytics’.

In contrast, middle management never wants to be in the position of justifying decisions knowingly made using imperfect data. It is easier to justify subjective gut feel than objective decisions made with data that has known quality issues.

In summary – the culture of analytics is a top-down phenomenon.

What do you think? Do you agree with this observation?
