The Geography of Big Data: How to Get Started
December 19, 2012
If you’re a business executive, you’ve no doubt heard a lot about “Big Data” and the promise of analytics. You may have even read a recent post in the Harvard Business Review about how to Get Started with Big Data. It’s a great article and offers good advice, but it probably should be retitled “How to Get Started with Big Data if You Can Afford to Pay McKinsey & Co Consulting Rates”. It skips past the Big Data 101 issues that I see as first steps and moves directly to what I would consider more advanced uses of Big Data.
Instead, let’s assume that your company isn’t a Fortune 500 company and maybe you’ve struggled a bit with technology strategy and operations. Maybe you’re still struggling, but you can’t wait another year for IT to complete the decade-long SAP/Oracle/Cognos/Any ERP implementation that has cost millions and has yet to show any benefit. Perhaps you’re not a math genius, not a finance person and not even really what some might consider tech-savvy. But you know your business, you know your customers and you know your products. You also know you have to keep up with changes in your industry, and you don’t want to be left behind if Big Data is the next big thing. [And I definitely think it will be, or at least one of them.] You may be asking: where should I start?
My advice: make a map.
Huh? Why would I start by making a map? Our company manufactures sophisticated engine components. I need performance metrics, fancy algorithms and cutting-edge insights to drive strategy and profit. How is a simple map going to help me improve the bottom line? Sounds like a silly kindergarten activity with no possible ROI.
Well, give me a chance to explain. Before you can turn some Nate Silver-like econometrics modeling guru or Physics PhD genius loose, you need good data and some ideas about the specific problems you want to address with analytics. Producing a map can be an excellent process for moving toward a more sophisticated Big Data program. So, how can making a map help start this process?
Here are 6 benefits of making a map:
1. Your company will be forced to take inventory of key data elements.
2. Your IT team will be required to deliver data in a usable format.
3. Any problems with customer data will become readily apparent in the geocoding process.
4. Geographic representations of company data will reveal new patterns that spreadsheets may be disguising.
5. Producing a map will allow everyone to get involved, not just the same old digit heads.
6. Seeing your company’s data on a map will generate new ideas.
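To make point 3 a little more concrete, here is a minimal sketch of the kind of check that tends to surface customer data problems before any records reach a geocoder. It is only an illustration: the field names (`id`, `street`, `city`, `state`, `zip`), the sample records and the US-style ZIP code rule are all assumptions, not anything drawn from a real system.

```python
import re

# Hypothetical field names; substitute whatever your customer export uses.
REQUIRED_FIELDS = ("street", "city", "state", "zip")

# Assumes US-style ZIP codes (5 digits, optional +4); adjust for other countries.
ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")


def find_address_problems(records):
    """Return a dict mapping customer id -> list of address problems found."""
    problems = {}
    for record in records:
        issues = []
        # Flag any required address field that is absent, None or blank.
        for field in REQUIRED_FIELDS:
            if not str(record.get(field) or "").strip():
                issues.append(f"missing {field}")
        # Flag ZIP codes that are present but do not match the assumed format.
        zip_code = str(record.get("zip") or "").strip()
        if zip_code and not ZIP_PATTERN.match(zip_code):
            issues.append(f"malformed zip: {zip_code!r}")
        if issues:
            problems[record.get("id")] = issues
    return problems


# Made-up sample records, for illustration only.
customers = [
    {"id": 1, "street": "12 Main St", "city": "Dayton", "state": "OH", "zip": "45402"},
    {"id": 2, "street": "", "city": "Akron", "state": "OH", "zip": "4430"},
    {"id": 3, "street": "9 Elm Ave", "city": None, "state": "OH", "zip": "44113-1234"},
]

for customer_id, issues in find_address_problems(customers).items():
    print(customer_id, issues)
```

Records that fail a check like this are the ones most likely to land in the wrong place on the map, or not appear at all, so the list of flagged customers doubles as a first data-quality report for your IT team.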
In the coming days and weeks I will elaborate on each of these points. Stay tuned!
4 Comments
Hi Justin.
First of all, I want to thank you for your effort.
The idea of presenting data on a map seems very interesting and could reduce the cost of data management, but there are some points that I hope you can explain to me.
The first point is whether software exists that can deal with hundreds of terabytes of data.
The other point is how we can deal with data that doesn’t have a geographic dimension.
Thank you
Thanks for the comment! My short answers to your questions are (1) start smaller and (2) find the geographic dimension. Longer answers: (1) If you want to approach big data, start somewhere but not with everything all at once. If you try to wrangle “hundreds of terabytes” right out of the gate you’ll be overwhelmed and not get anywhere fast. I prefer an iterative approach whereby you start small, make progress quickly and build from there gradually. You don’t need specialized software if you start small. You can find the right software for managing huge volumes later and you’ll learn something about your requirements along the way. (2) If you’re in business and have customers, you have a geographic dimension. If you’re an organization of any type that involves people, you have a geographic dimension because each person lives, works, shops and travels somewhere. You just need to gather the data. If you want to focus on non-geographic data then you should find another domain expert…I can’t help too much. Not sure this is really what you were looking for but hopefully it helps a bit. Cheers, J.
[…] wrote an interesting blog post suggesting that a great place to start a Big Data journey is to build a map. He argued that most firms and managers don’t have the resources or technical capability […]
[…] Perhaps for the same reasons, Python is becoming a leading programming environment for statistical computing. It may usurp the R language/environment because of greater flexibility and simplicity. The jury is still out but my money is on Python. I am also willing to bet SAS and SPSS will move toward endangered species status in the coming years. Since I teach statistics I want to be able to provide guidance for students who have the interest and inclination to pursue emerging careers in Big Data. […]