
Archive for February, 2010

Earlier this week, I suggested a potential business application of the IRS's internal migration data for a moving and relocation company.

Folks at Nielsen Claritas just found a far more interesting correlation, one that should have driven a lot of business decisions. They note:

Today’s presence of underwater mortgages, or homes with negative equity, seem to be correlated to two common regional U.S. population trends: 1) domestic immigration from the Northeastern region to the South and Southwestern regions of the U.S., and 2) migration from coastal California inland

While such retrospective analysis is interesting for reports and blogs, it is not particularly useful for businesses, except perhaps as a means to generate interesting hypotheses for the future. It would have been far more useful had the chart been available to the strategic planning or risk groups of the businesses signing people up for these housing loans in 2006 and 2007.

Data is valuable only when it is used to drive decisions. Most companies have a huge opportunity to do a better job of bringing together data, analytics, and visualization and delivering them to the point of decision.


One subject that has not received much coverage in analytics blogging circles is the current administration's data.gov project. While still in its infancy, data.gov is an outcome of the government's transparency initiative, the Open Government Directive. In December, all government agencies were asked to produce and publish three new 'high value' public data feeds on the data.gov website.

The data.gov site still has to work through some kinks, but it will eventually become a wonderful resource for the data analytics industry, probably as critical as the US Census data and its American FactFinder tool, which has spawned multiple companies and supports all kinds of interesting analysis across a wide range of industries.

The Sunlight Foundation tracks the new datasets as they are released. For example, one of the Labor Department's datasets is the "weekly reports of fatalities, catastrophes and other events." The data, compiled by the Occupational Safety and Health Administration, briefly describes each workplace accident and identifies the company involved and the date it occurred. I think a lot of insurance companies with workers' compensation products will be interested in analyzing this data to better price their products. Or take, for instance, the IRS internal migration data by state and county, based on tax returns. Can it be used by moving companies to better understand the shift in demand for their services?

There are thousands of such datasets available, and many of them will potentially be valuable to businesses. The value of a dataset to a business, like beauty, is in the eye of the beholder. This makes categorization challenging, but it also makes the data interesting for businesses as a potential source of competitive advantage. If you can figure out how to interpret the IRS migration data to better align the marketing campaigns of your moving and relocation business, you can get a better return on your marketing spend than your competition.
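As a sketch of what "interpreting the migration data" could look like for a moving company, here is a minimal example that ranks counties by net migration inflow. The county names, figures, and field layout are all invented for illustration; the real IRS files have their own format that would need to be parsed first.

```python
# Hypothetical slice of the IRS county-to-county migration data;
# the counties and counts below are illustrative, not real figures.
migration = [
    # (county, returns filed by in-migrants, returns filed by out-migrants)
    ("Maricopa, AZ", 52000, 30000),
    ("Clark, NV",    31000, 21000),
    ("Kings, NY",    24000, 41000),
]

def rank_by_net_inflow(rows):
    """Rank counties by net migration inflow -- a crude proxy for
    shifting demand for moving and relocation services."""
    scored = [(county, inflow - outflow) for county, inflow, outflow in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

for county, net in rank_by_net_inflow(migration):
    print(f"{county}: {net:+d}")
```

A marketing team could weight campaign spend toward the top of this ranking, on the theory that counties gaining filers are also gaining demand for relocation services.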

It is time for organizations to look outside their firewalls and build a strategy for collecting, incorporating, and analyzing external data in their analytics and strategic planning efforts. Companies like Infochimps, a private clearinghouse and marketplace for third-party data, are betting on this trend: they already collect, cleanse, and format the data.gov datasets so that they are analysis-ready.

Take the time to check out the datasets that are available. You never know what you may find.


Theme 2: Modeling Strategic vs. Operational Decisions


In the first post of this eight-part series, I wrote about the importance of understanding the cost of a wrong decision prior to making an investment in a predictive modeling project.

Once we determine the need for the investment, we need to focus on the type of modeling approach. The approach depends on the type of decision we want the predictive model to drive. Decisions can be broadly categorized as either operational or strategic.

I define operational decisions as those that have a specific and unambiguous 'correct answer', whereas for strategic decisions an unambiguous 'correct answer' is not available. Moreover, strategic decisions have a cascading effect on adjacent and related decisions in the system.

Think about a health plan predicting a fraudulent insurance claim versus predicting the impact of lowering reimbursement rates for patient re-admissions to hospitals.

An insurance claim is either fraudulent or it is not. The problem is specific, and there is an unambiguous correct answer for each claim. Most transaction-level decisions fall into this category.

Now consider the second problem. Lowering the reimbursement rate for patient readmissions will certainly incent physicians and hospitals to focus on good patient education and follow-up outpatient care, and to ensure complete episodes of care during the patient's time in the hospital. This should result in lower costs for the health plan. However, it can also lead to hospitals delaying the discharge of patients during the first admission, or to physicians treating patients in an outpatient setting when they should be in the hospital, which ends in emergency room visits. This is the cascading effect of our first decision, and it will increase the cost of care. Strategic decisions have multiple causal and feedback loops which are not apparent, and an unambiguous right answer is hard to identify. Most policy decisions fall into this category.

The former is an operational decision and calls for established statistical techniques (regression, decision tree analysis, etc.) and artificial intelligence techniques (e.g., neural networks, genetic algorithms). The key focus is to predict whether a claim is fraudulent based on historical data; understanding the intricacies of the causal linkages is desirable but not necessary (neural networks, for example, offer little insight into causality). The latter needs predictive modeling approaches that are more explanatory in nature, where it is critical to understand the causal relationships and feedback loops of the system as a whole. The idea is to develop a model that accurately captures the nature and extent of the relationships between the various entities in the system, based on historical data, to facilitate testing of multiple scenarios. In the readmission policy example, such a model would help determine the cost impact under various scenarios of provider adoption and behavior change (the percentage of providers and hospitals that will improve care vs. those that will not adapt to the new policy). Simulation techniques like system dynamics, agent-based modeling, Monte Carlo simulation, and scenario modeling are more appropriate for such problems.
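To make the scenario-testing idea concrete, here is a minimal Monte Carlo sketch of the readmission-policy decision. Every rate and dollar figure below is invented for illustration; a real model would estimate them from historical claims data and would capture the feedback loops far more carefully.

```python
import random

random.seed(7)  # fixed seed so the sketch is reproducible

def simulate_policy_cost(n_trials=10000):
    """Monte Carlo sketch: average cost per episode under the new
    reimbursement policy, given uncertainty about how many providers
    genuinely adapt. All parameters are hypothetical."""
    baseline_cost = 100.0     # cost per episode before the policy (made up)
    savings_if_adapts = 15.0  # saved when a provider improves care (made up)
    penalty_if_games = 25.0   # extra cost from delayed discharges / ER visits (made up)
    costs = []
    for _ in range(n_trials):
        # Uncertain share of providers that genuinely adapt to the policy
        adapt_rate = random.uniform(0.4, 0.8)
        cost = (baseline_cost
                - adapt_rate * savings_if_adapts
                + (1 - adapt_rate) * penalty_if_games)
        costs.append(cost)
    return sum(costs) / len(costs)

print(round(simulate_policy_cost(), 2))
```

Under these invented numbers, the expected cost per episode actually rises slightly, because the penalty from non-adapting providers outweighs the savings from adapting ones. That is exactly the kind of counterintuitive result scenario modeling is meant to surface before a policy goes live.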

The bottom line: strategic and operational decisions need different predictive modeling approaches, and there are two questions you have to ask yourself:

  1. Is the decision you want to drive operational or strategic in nature?
  2. Are you using the appropriate modeling approach and tools?

Cross-posted on TheInformationAdvantage blog


The theme of this blog is how actionable information, in the form of decision support tools, will lead to the next wave of efficiencies and competitive advantage. The reverse, however, is probably more stark: not investing in table-stakes data aggregation and reporting capabilities can also hurt, and it can hurt big time.

The financial crisis in Greece is a case study in how easy money and uncontrolled government spending during boom times can come back to hurt in a weak economy. One of the confounding factors, however, has been the Greek government's repeated revisions of its budget deficit data. In April 2008, it reported the deficit at 5.0% of GDP; later that year it revised the figure up to 7.7%. Similarly, in April 2009 the official forecast for the deficit was 3.7% of GDP, later revised to 12.5%. It is this last revision that started the full-blown crisis.

Digging a little deeper, it is easy to discover one of the key reasons for the revisions: the lack of a modern budgetary process and financial reporting system.

Past budgets have rested on some 14,000 separate expenditure lines. This year’s has brought the figure down to about 1,000. In this system, the evaluation of public spending in any particular area is almost impossible. The amount spent on education, for example, is defined as the total sum of money allocated to the Ministry of Education and it is very difficult to monitor where it goes. Currently, most of Greece’s 15 ministries and dozens of other government bodies handle their own payroll accounts, making it difficult to gain a complete overview of government spending.

No wonder they could not reliably trace how much money was being spent!

Last year, the Greek government also approached the OECD to conduct a study and recommend improvements to its budgetary processes; one of the recommendations was around managing the deployment of a new accounting and financial information system.

Ill-defined processes and weak information management systems tend to exist in certain quarters of most organizations. The key question to ask yourself is whether this under-investment in information systems:
1) exposes you to a big risk,
2) makes you inefficient, or
3) prevents you from gaining some potential competitive advantage?


I, along with two of my colleagues (Anand Rao & Dick Findlay), recently conducted a workshop at the World Research Group's Predictive Modeling conference in Orlando. As part of the workshop, I spoke about a list of 8 things that organizations should keep in mind as they consider investing in predictive analytics.

In this post, I will list the 8 points and discuss the first one. Subsequent posts will explore the rest of the themes.

  1. Understand the cost of a wrong decision
  2. Strategic and operational decisions need different predictive modeling tools and analysis approaches
  3. Integration of multiple data sources, especially third-party data, provides better predictions
  4. Statistical techniques and tools are mature and by themselves not likely to provide significant competitive advantage
  5. Good data visualization leads to smarter decisions
  6. Delivering the prediction at the point of decision is critical
  7. Prototype, Pilot, Scale
  8. Create a predictive modeling process & architecture

Theme 1: Understand the Cost of a Wrong Decision

Is it even worth investing resources in developing a predictive analytics solution for a problem? That is the first question to answer, and the best way to answer it is to understand the cost of a wrong decision. I define a decision as 'wrong' if the outcome is not the desired event. For example, if a direct mail piece sent to a customer does not lead to the desired call to the 800 number listed, then it was a 'wrong' decision to send the mail to that customer.

A few months ago my colleague Bill told a story which illustrates the point.

Each year Bill takes his family to Cleveland to visit his mom. They stay in an old downtown Cleveland hotel, a pretty nice place with all the trappings you would expect of an old and reputable establishment. Last time, they decided to have breakfast across the street at the Ritz. After breakfast, when Bill and his family were in the lobby, the property manager spotted him and the kids and walked over to talk. He chatted for a few minutes, probably surmised that Bill was a reasonably seasoned traveler, and told the kids to wait for him. He walked away and came back with a wagon full of toys, and let each kid pick one out of the wagon. Think about it: they were not even guests at the Ritz; all they did was have breakfast there! The kids loved the manager, and Bill remembered the gesture. Fast forward to this holiday season, and sure enough, Bill and his family booked a suite at the Ritz for six days. For the price of a few nice toys, the manager converted a breakfast visit into a stay that generated a few thousand dollars in room charges, meals, and parking.

Now suppose Bill had not come back to the hotel, i.e., the manager's desired outcome had not materialized. What would the cost of the manager's 'wrong' decision have been? The cost of a few toys, which is negligible compared to the potential upside. Does it make sense for the hotel to build a predictive model to decide which restaurant diners to offer toys so that they come back and stay? I don't think so.
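The arithmetic behind that judgment can be made explicit. All figures below are invented, since the post gives no actual numbers: a toy cost, a stay's revenue, and an assumed conversion rate.

```python
# Back-of-the-envelope expected-value check on the toy giveaway.
# Every number here is hypothetical -- the story gives no actual figures.
cost_of_toys = 30.0      # cost of a 'wrong' decision: a few toys
stay_revenue = 3000.0    # upside if a diner later books a stay
conversion_rate = 0.05   # assumed share of diners who come back

expected_upside = conversion_rate * stay_revenue  # 0.05 * 3000 = 150.0
print(expected_upside > cost_of_toys)  # prints: True
```

When the downside is this small relative to the expected upside, giving toys to everyone dominates, and a predictive model to select recipients adds nothing.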

Understanding the cost of a wrong decision upfront saves one from making low-value investments in predictive analytics.

PS: My colleague Paul D'Alessandro has also used this story to illustrate experience design (XD) principles.

Photo credit: GJones
