<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: 3 Big Data Myths for Enterprises</title>
	<atom:link href="http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/feed/" rel="self" type="application/rss+xml" />
	<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/</link>
	<description>Data&#124;Analytics&#124;Visualization&#124;Integration</description>
	<lastBuildDate>Thu, 24 Jan 2013 14:06:31 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Krishnan Sakotai</title>
		<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/#comment-169</link>
		<dc:creator><![CDATA[Krishnan Sakotai]]></dc:creator>
		<pubDate>Tue, 15 Nov 2011 10:21:09 +0000</pubDate>
		<guid isPermaLink="false">http://amareshtripathy.com/?p=194#comment-169</guid>
		<description><![CDATA[One more point to add to this discussion is that despite all the analytical insights that may be produced out of data, someone may neglect to perform a &quot;sanity check&quot; on whether this is the right insight for us. We may end up creating an answer in search of a problem. Therefore I think it is really important for business to ask the right questions all the way through the process of generating analytical insights.]]></description>
		<content:encoded><![CDATA[<p>One more point to add to this discussion is that despite all the analytical insights that may be produced out of data, someone may neglect to perform a &#8220;sanity check&#8221; on whether this is the right insight for us. We may end up creating an answer in search of a problem. Therefore I think it is really important for business to ask the right questions all the way through the process of generating analytical insights.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Amaresh Tripathy</title>
		<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/#comment-155</link>
		<dc:creator><![CDATA[Amaresh Tripathy]]></dc:creator>
		<pubDate>Mon, 20 Jun 2011 13:28:56 +0000</pubDate>
		<guid isPermaLink="false">http://amareshtripathy.com/?p=194#comment-155</guid>
		<description><![CDATA[@deepanalytics 

I like the way you talk about real-time decision making and knowledge discovery.

@Rama, really liked the way you put it.]]></description>
		<content:encoded><![CDATA[<p>@deepanalytics </p>
<p>I like the way you talk about real-time decision making and knowledge discovery.</p>
<p>@Rama, really liked the way you put it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rama Ramakrishnan</title>
		<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/#comment-154</link>
		<dc:creator><![CDATA[Rama Ramakrishnan]]></dc:creator>
		<pubDate>Mon, 20 Jun 2011 11:23:23 +0000</pubDate>
		<guid isPermaLink="false">http://amareshtripathy.com/?p=194#comment-154</guid>
		<description><![CDATA[Excellent observations, Amaresh and @deepanalytics.

An implicit assumption behind a lot of the breathless commentary in the blogosphere seems to be this: if you can unearth an insight from data and present it to the client/customer/end-user, they will immediately embrace it, act on it, get value from it, and thank you profusely for it.

In reality, the response from a customer is more like: &quot;oh great. one more so-called insight. now i have to worry about what the heck to do with this darn thing, on top of all the stuff that&#039;s part of my day job. thanks a lot!&quot;

What customers need are decision recommendations with supporting evidence, not &quot;insights&quot;.]]></description>
		<content:encoded><![CDATA[<p>Excellent observations, Amaresh and @deepanalytics.</p>
<p>An implicit assumption behind a lot of the breathless commentary in the blogosphere seems to be this: if you can unearth an insight from data and present it to the client/customer/end-user, they will immediately embrace it, act on it, get value from it, and thank you profusely for it.</p>
<p>In reality, the response from a customer is more like: &#8220;oh great. one more so-called insight. now i have to worry about what the heck to do with this darn thing, on top of all the stuff that&#8217;s part of my day job. thanks a lot!&#8221;</p>
<p>What customers need are decision recommendations with supporting evidence, not &#8220;insights&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: deepanalytics</title>
		<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/#comment-152</link>
		<dc:creator><![CDATA[deepanalytics]]></dc:creator>
		<pubDate>Fri, 17 Jun 2011 22:26:23 +0000</pubDate>
		<guid isPermaLink="false">http://amareshtripathy.com/?p=194#comment-152</guid>
		<description><![CDATA[Amaresh:

  Totally agree with your assessment: starting with raw data is asking for trouble.

The way we have set up our analytics pipelines is that they always work towards a decision point and then we have a feedback mechanism to test in the future if our decision was the right one. If it turns out to have been the wrong decision, we can go back and figure out what we did wrong in our analysis. For real-time decision making it is typically a bit easier to be disciplined as the analytics coding is all done to trigger some test. For knowledge discovery I believe you need two ingredients: 

1- critical mass of statistical thinking, and 
2- deep statistical skills with a solid historical perspective

Critical mass is needed so that you don&#039;t have one person in the corner generating reports that the rest of the organization simply ignores. Deep skills and historical perspective are needed to direct the right algorithms and sentiment. I truly believe in the value of historical perspective as it grounds the analytics in a richer context.

By focusing on a decision it is easier to properly allocate the right analytical resources, although that is still a very difficult process if you don&#039;t have infinite skills and resources.]]></description>
		<content:encoded><![CDATA[<p>Amaresh:</p>
<p>  Totally agree with your assessment: starting with raw data is asking for trouble.</p>
<p>The way we have set up our analytics pipelines is that they always work towards a decision point and then we have a feedback mechanism to test in the future if our decision was the right one. If it turns out to have been the wrong decision, we can go back and figure out what we did wrong in our analysis. For real-time decision making it is typically a bit easier to be disciplined as the analytics coding is all done to trigger some test. For knowledge discovery I believe you need two ingredients: </p>
<p>1- critical mass of statistical thinking, and<br />
2- deep statistical skills with a solid historical perspective</p>
<p>Critical mass is needed so that you don&#8217;t have one person in the corner generating reports that the rest of the organization simply ignores. Deep skills and historical perspective are needed to direct the right algorithms and sentiment. I truly believe in the value of historical perspective as it grounds the analytics in a richer context.</p>
<p>By focusing on a decision it is easier to properly allocate the right analytical resources, although that is still a very difficult process if you don&#8217;t have infinite skills and resources.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Amaresh Tripathy</title>
		<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/#comment-151</link>
		<dc:creator><![CDATA[Amaresh Tripathy]]></dc:creator>
		<pubDate>Fri, 17 Jun 2011 21:16:45 +0000</pubDate>
		<guid isPermaLink="false">http://amareshtripathy.com/?p=194#comment-151</guid>
		<description><![CDATA[Totally agree with you and your use case.  

In your example you have a business problem that you are trying to solve, a defined set of data, a process to systematically test on a continuous basis and measurable metrics you want to influence.  You are certainly using more data to your advantage. 

However, when I hear about big data, in most cases the starting point is the data and not the problem/metric, which I contend is not the best place to start.]]></description>
		<content:encoded><![CDATA[<p>Totally agree with you and your use case.  </p>
<p>In your example you have a business problem that you are trying to solve, a defined set of data, a process to systematically test on a continuous basis and measurable metrics you want to influence.  You are certainly using more data to your advantage. </p>
<p>However, when I hear about big data, in most cases the starting point is the data and not the problem/metric, which I contend is not the best place to start.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: deepanalytics</title>
		<link>http://amareshtripathy.com/2011/06/17/3-big-data-myths-for-enterprises/#comment-150</link>
		<dc:creator><![CDATA[deepanalytics]]></dc:creator>
		<pubDate>Fri, 17 Jun 2011 20:34:51 +0000</pubDate>
		<guid isPermaLink="false">http://amareshtripathy.com/?p=194#comment-150</guid>
		<description><![CDATA[I would like to submit the following refinement of the big data concept.

Assume you are a business and your daily processes generate data to support whatever data stream is actionable at that point. Say you operate like that for two years, at which point you realize that there has been no correlation between your actionable data stream and your top three key performance metrics. Now what do you do? If you threw away data that didn&#039;t pertain to the BKMs of that time, you might have thrown away the real actionable information that would have allowed you to discover which business processes would really help the business. That is how I interpret the &quot;more data generates more insight&quot; concept.

Twenty years ago, when a GByte on a filer was a big deal, we constantly fought with this problem. We would spend 6 months generating data and calibrating models, and then, once we got the round trip done where we tested our predictions, we would learn that we should be tracking different metrics. Having to throw away data because you can&#039;t store it made learning much more difficult because you couldn&#039;t go back and test if your new insights would have produced better results. This led to constant churn where we had to redo experiments or simply would not be able to relate new findings to the past.

What we do now with our data sets is define the decision universe as broadly as we understand the problem. Our data generators will err on the side of creating too much information, for the simple reason that now we can store a couple hundred TBs without trouble. The actionable decision processes will grab from this data set, and data mining algorithms continue to analyze the raw data to see if there are better metrics hidden in the data. Once you really understand your decision universe and the business processes that it can affect, you can scrub the data. We still prefer to compress/dedup/archive it so that we can generate a decade or more of operational data. The ability to back test is so valuable for insight.]]></description>
		<content:encoded><![CDATA[<p>I would like to submit the following refinement of the big data concept.</p>
<p>Assume you are a business and your daily processes generate data to support whatever data stream is actionable at that point. Say you operate like that for two years, at which point you realize that there has been no correlation between your actionable data stream and your top three key performance metrics. Now what do you do? If you threw away data that didn&#8217;t pertain to the BKMs of that time, you might have thrown away the real actionable information that would have allowed you to discover which business processes would really help the business. That is how I interpret the &#8220;more data generates more insight&#8221; concept.</p>
<p>Twenty years ago, when a GByte on a filer was a big deal, we constantly fought with this problem. We would spend 6 months generating data and calibrating models, and then, once we got the round trip done where we tested our predictions, we would learn that we should be tracking different metrics. Having to throw away data because you can&#8217;t store it made learning much more difficult because you couldn&#8217;t go back and test if your new insights would have produced better results. This led to constant churn where we had to redo experiments or simply would not be able to relate new findings to the past.</p>
<p>What we do now with our data sets is define the decision universe as broadly as we understand the problem. Our data generators will err on the side of creating too much information, for the simple reason that now we can store a couple hundred TBs without trouble. The actionable decision processes will grab from this data set, and data mining algorithms continue to analyze the raw data to see if there are better metrics hidden in the data. Once you really understand your decision universe and the business processes that it can affect, you can scrub the data. We still prefer to compress/dedup/archive it so that we can generate a decade or more of operational data. The ability to back test is so valuable for insight.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
