Thursday, October 10, 2013

Big Data vs Better Data vs NO Data

As mentioned in previous posts, the majority of my experience is from the “business” world.  I’ve done analytic models for companies on four continents and in a wide variety of industries.  This has provided me with a diverse perspective, especially when looking at professional sports and their attempts at “analytics”.

I’ve told the folks that work with me at Real Sports Analytics several times that the book (and movie) “Moneyball” was great for sports analytics.  No, not because there was any significant analytical work done by Billy Beane in the story (trust me on that, his approach is what I call a “frog in the blender”), but because the movie did raise awareness of a potential use of analytics in professional sports.  Some say I’m the “Moneyball of football”.  Well, maybe I am, but what I’ve invented at Real Sports Analytics is much closer to the performance management work I’ve done in the business world and nothing at all like what Mr. Beane did at the Oakland A’s.  Mine is a frog-safe approach.

(side note: being played by Brad Pitt in a movie is just not me, George Clooney would be a much better choice)

With all of that said, let’s get to our topic of the day.  Big data is typically defined as a large amount of “raw” data, meaning data that doesn’t mean anything at all by itself and has no intrinsic value apart from its volume, BUT given proper processing algorithms, certain observational deductions can be made when these sets are jammed together and “mined” for information.  (see the Wikipedia article on data mining)  With this large data set, patterns can emerge or “nuggets” of important information (thus the “mining” analogy) can be discovered, but typically only after huge databases have been created, large amounts of computing power have been thrown at the problem, and “experts” have analyzed the results.  Again, think of the mining analogy: the 49ers of old panning for gold and hoping to find a nugget hidden within tons of mushy dirt.

Now, again, with my experience in the industry I have some pretty strong opinions.  Read previous blog posts about the question I love to ask (SO WHAT?) when setting up any analytic model.  I think “Big Data” and “No Data” in many cases are equivalent.  How is that?  Because most people in the sports world are looking at the WRONG data without even knowing it (think the old 49er panning for gold in someone’s swimming pool).  For example, you may download from NOAA the entire meteorological record of New Orleans and look for the weather patterns that correlate best with the Saints winning football games, BUT unless someone makes you aware that the Saints’ home games are played in an INDOOR facility, you’d really be crunching all of those numbers just for the sake of crunching numbers.  (Which a lot of people do.)
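To see why crunching the wrong data can still look productive, here is a minimal illustrative sketch in Python.  Everything in it is an assumption made up for the illustration: the function names are hypothetical, the “weather” variables are pure random noise rather than real NOAA records, and the win/loss record is a shuffled list rather than any team’s actual results.  It simply checks 200 meaningless variables against a 16-game season and reports the one that correlates best with winning.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def strongest_noise_correlation(n_games=16, n_variables=200, seed=2013):
    """Return (index, r) of the random variable most correlated with wins.

    Hypothetical setup: the season is half wins and half losses in random
    order, and every "weather" variable is Gaussian noise with no real
    connection to the outcomes.
    """
    rng = random.Random(seed)
    wins = [1] * (n_games // 2) + [0] * (n_games - n_games // 2)
    rng.shuffle(wins)

    best_index, best_r = -1, 0.0
    for i in range(n_variables):
        variable = [rng.gauss(0, 1) for _ in range(n_games)]
        r = pearson(variable, wins)
        if abs(r) > abs(best_r):
            best_index, best_r = i, r
    return best_index, best_r


best_index, best_r = strongest_noise_correlation()
print(f"Strongest 'signal' is made-up variable #{best_index}, r = {best_r:+.2f}")
```

With that many variables and that few games, the winner will usually show a correlation that looks worth a slide in a presentation, even though it is noise by construction.  That is the “SO WHAT?” problem in miniature: the number-crunching works fine, and the answer still means nothing.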

Now, this is an extreme example, but so many times I’ve seen in business (and now in sports) people trying to count the “oranges” by looking at the rows in the apple orchard.

Case in point.  The Jacksonville Jaguars.  Any NFL fan is aware of the struggles of this organization over the past few years.  They have repeatedly failed to find a quarterback (denial), have failed to keep their best players (through poor management and bad trade evaluations), have not drafted well (again poor player evaluations), and now have a guy at the head of their analytics department whose only real qualification is that he was born to the right father (i.e. the current owner of the Jacksonville franchise).  Read the fourth paragraph of this article:

Jacksonville Jaguars seek “the truth” 

Ok, so maybe I’m being picky, but if I’m presenting someone with information concerning a multimillion-dollar decision, I’m not going to present that information on “three handwritten sheets of paper” that can hardly be deciphered.  It’s unprofessional and (in my mind) throws serious doubt on the contents.

This guy graduated college in 2007 and in the interim (before joining the Jaguars in 2012), wait for it…  helped to build a biodiesel fuel facility.  I’m not making this up - http://www.jaguars.com/team/football-staff/tony-khan.html.

If you Google Jacksonville Jaguars and analytics you’ll get two pages of hits on stories about how the Jaguars are “maximizing their returns” using analytics.  Yeah, not too many teams give their owner’s kid a cushy job that he’s obviously not qualified for, go 0-5 to start the year and even resort to giving away coupons for free beer in an attempt to fill the seats.  I’ll give them this, their press management is top notch.

The Jacksonville Jaguars are a great example of where “Big Data” really equates to “No Data”.  If you’re not asking the right questions, or you don’t have the right people with the right experience finding the answers (really, if ANY of those is the case), HOW THE HECK can you expect to deliver the right answers?

Ok, now for “Better Data”.  “Better Data” doesn’t always involve big server farms, huge spreadsheets or bunches of guys that look like this.  In fact, rarely does “Better Data” involve more than a few megabytes of information.  “Better Data” involves professional people with EXPERIENCE that come in and ask the right questions.  How do you currently measure performance?  What would make that simpler or more useful in what you are trying to accomplish?  Do you think your current performance metrics are correct for your organization?  What (in your opinion) works when your team is successful?  What are your team’s current weaknesses?

There are hundreds of questions and thousands of directions these conversations can take, and I certainly am not going to give away the “ancient Chinese secret” in a blog post.  But my point is this: always ask the question “SO WHAT?”  I’m sure those guys at Jacksonville have a huge budget and are now asking themselves, “Where is our return?  How long will it take to get that return?”  Just because someone says they are experienced with “big data” doesn’t mean they are right for your organization; find someone that comes in and asks questions first.  (hopefully)  Also find someone with some experience in the game, or at least someone who can bring those valuable people to the table.  I know at Real Sports Analytics we have a team of former NFL players advising us every step of the way, and their experience and knowledge are INVALUABLE to how we build our analytic models.

Some teams are doing this effectively; it’s pretty obvious who they are in the NFL and NCAA ranks.  What matters most is knowing the difference between those three types of data and becoming an engaged partner with an experienced, knowledgeable analytics company.

Thinking ahead, one of my future blog posts will be about the ways analytics are used in the business world and how that relates (directly) to the world of professional sports.  Until then, keep this quote in mind.

“Appear weak when you are strong, and strong when you are weak.”
Sun Tzu, The Art of War
