Wednesday, May 9, 2007

Caveat emptor

Allo Comrades of Yore,
With the Sales Teams of the BizIntelligence Companies doing a good job and getting good support from the Ivy Tower (pdf: Competing on Analytics), there has been a groundswell in using data to prove anything one wants to.

But the Data Analysts have a lot to learn from Economists, especially the Micro variety, who have made empirical analysis a science, or at least as close to one as it can possibly be.
Inspired by Gödel's incompleteness theorems on the limits of mathematical proof, I have always been intrigued by the limits of any scientific endeavor: what the science or methodology can't tell us.

A few of the major caveats one needs to keep in mind while crunching the numbers are listed below.

Monday, April 30, 2007

Random Musing

Marriage,
is it a case of being Fooled by Randomness?
One of the assertions of the author, Taleb, is that any information observed at too fine a time grain is indistinguishable from random noise. That's the thought for the day for the data analysts among us.
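
To make the time-grain point concrete, here is a minimal sketch. It assumes returns are normally distributed and uses purely illustrative numbers (15% expected annual return, 10% annual volatility); the function name and the list of horizons are my own choices, not anything prescribed by the book. It computes the chance of seeing a gain when the same process is observed at coarser or finer grains.

```python
import math


def prob_positive(mu: float, sigma: float, years: float) -> float:
    """P(return over `years` is positive) if it is Normal(mu*years, sigma^2*years)."""
    # The drift grows with time, the noise only with sqrt(time),
    # so the signal-to-noise ratio shrinks as the grain gets finer.
    z = (mu / sigma) * math.sqrt(years)
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Assumed, illustrative parameters: 15% expected annual return, 10% annual volatility.
MU, SIGMA = 0.15, 0.10

# Observation grains expressed as fractions of a year (hypothetical choices).
HORIZONS = {
    "1 year": 1.0,
    "1 month": 1.0 / 12,
    "1 day": 1.0 / 365,
    "1 hour": 1.0 / (365 * 24),
    "1 minute": 1.0 / (365 * 24 * 60),
}

if __name__ == "__main__":
    for label, years in HORIZONS.items():
        print(f"{label:>8}: P(gain) = {prob_positive(MU, SIGMA, years):.4f}")
```

Under these assumptions the yearly observation shows a gain roughly 93% of the time, while the minute-by-minute observation is barely better than a coin flip, even though the underlying process is identical.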

Other interesting notes are
  • The behavioral lapse of confusing Correlation with Causality.
  • The fallacy of misconstruing Power Law Distributions (where most values are below average and a few are far above) as Gaussian Distributions (where most data values center around the average).
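
The second note can be checked with a quick simulation. This is only a sketch: the sample size, the seed, the Pareto shape parameter (alpha = 1.5) and the Gaussian mean and spread are all assumptions picked for illustration, and the helper names are hypothetical.

```python
import random
import statistics


def share_below_mean(values):
    """Fraction of the values that fall strictly below their own average."""
    mean = statistics.fmean(values)
    return sum(v < mean for v in values) / len(values)


def simulate(n=100_000, alpha=1.5, seed=42):
    """Return (gaussian_share, power_law_share) of values below the sample mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    gaussian = [rng.gauss(100.0, 15.0) for _ in range(n)]
    # Pareto draws stand in for a power-law distribution (assumed shape alpha).
    power_law = [rng.paretovariate(alpha) for _ in range(n)]
    return share_below_mean(gaussian), share_below_mean(power_law)


if __name__ == "__main__":
    g, p = simulate()
    print(f"Gaussian : {g:.1%} of values below the average")
    print(f"Power law: {p:.1%} of values below the average")
```

For the Gaussian sample about half the values sit below the average, as expected. For the power-law sample a clear majority do, because a handful of huge values drag the average upward, so "the average" says very little about the typical row.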
There is a good review in the WSJ of the author's latest book, about Popperian Black Swans, which extends the argument central to the earlier book mentioned above.

Sunday, April 22, 2007

Second Time is the Charm?

Here I am staring at the monolith monitors for inspiration to blog,
Life for sure is tedious!

Coming back to technical trivia,
One of the main concerns of any self-respecting Data Analyst is the ability to deal with huge amounts of data, both as a consumer and as a producer.
This calls for a varied skillset: performance tuning of SQL queries, compaction of the data and presentation of the data.

In this age of TiVo and YouTube, with ever-shrinking attention spans (Tata, 5-day Cricket Test Matches), it's not reasonable to expect any multi-tasking human to make sense of a bunch of numbers. The standard option seems to be the ready-made Excel Charts.


Though this may look like a point-and-click operation, there is far more intellectual concern and nuance in showing the accurate picture hidden inside the millions of rows your chart is supposed to harvest.

But don't worry, help is at hand with The Visual Display of Quantitative Information.
There is a good review to be found at http://www.typebooks.org/botw-096139210x.htm

For readers (hypothetical?) too impatient to wait for my future arguments about the books they should be reading, below is my ReadingList, courtesy of the cool Google Docs.
http://spreadsheets.google.com/pub?key=px3EpcL_HPrBFuwKSptNkYg

Coolio,
A---

Thursday, April 19, 2007

Let the Fun begin...

As a self-respecting CompSci Professional, let me define the requirements, i.e. the objective, of the blog.
Though the events of my life would make for an animated commentary on the don'ts of desi, I will resist the temptation. The Intention is to speculate on Intellectual pursuits in the field of Data Analysis. As requirements go, that is a sufficiently vague one to make a Business Analyst proud. (No bar on taking digs at the BizAnalysts.)

I take Data Analysis to be a multi-disciplinary thingy, ranging from mundane SQL trivia to exotic micro-economic lemmas.

I will post my original thoughts (no, it's not an oxymoron) on the concerns of the regular Joe, the Data Analyst, as well as links to various articles of interest to Joe, the upcoming Data Analyst.

Tighten your neurons, here comes the Intellectual ride from left field...