Big data is one of those inescapable things. Even if we don’t think we need to worry about it, it’s going to be forced on us by vendors selling directly to the business.
Yes, a new wave of “this is good; I read about it on my last flight” is about to hit.
That says to me that we should be preparing the way now: you might not rush to implement huge data collections and analysis tools today, but you should at least think through how you’ll handle them.
Some open source methods will help on the technical side. What about the policy framework?
After all, a lot of data warehouse and business intelligence tool implementations got a bad rap: “We built it, but they didn’t come.” Or, at the very least, they’re not used in the ways that were envisaged.
The challenges of big data are threefold:
How Much is Too Much? is a question that needs to be asked and answered before someone’s favourite project gets pushed on you. Having more data available doesn’t always help. We’re all used to situations where multiple metrics for a function are defined, and they end up telling us simultaneously that all is well and that there are serious problems. Collecting more data for its own sake produces the same kind of contradictory signal.
This means setting some ground rules about analysis — because while infrastructure’s a lot cheaper than it used to be, it’s not free. It may also require that you have a competency centre ready to help people accomplish their goals, rather than stumble around on their own.
How Prepared are we to Trust This? is another question of principle that has to be decided before investing a lot of time and money into big data solutions. Some organizations simply don’t trust any external data: if they didn’t create it themselves, it’s automatically suspect. (Some departments don’t trust other departments in the same way: you should know in advance if that’s a problem.)
Likewise, handling masses of data often requires new presentation methods. If you’re one of those enterprises where spreadsheets rule, a three-dimensional visualization of billions of data points, where you look for signals and clusters, simply won’t be believed when it’s presented. Yet that’s often the most effective way to present results. (If you’d like to test some reactions, see some of the outputs from SenseMaker, the toolset used by Cognitive Edge practitioners to handle masses of textual data.)
What Counts as Success? is the third key question. There’s always a temptation to do another analysis — and to require that every analysis traverse all the available data. Neither is necessary: there’s a point where you should have enough confidence in the results to stop the process. How do you figure that out? Who says “that’s it: deal with it”? What risks, in other words, is the enterprise willing to bear?
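One way to make "enough confidence to stop" concrete is to write the stopping rule down before the analysis starts. The sketch below is only an illustration, not a recommended method: the function name `estimate_until_confident`, the `tolerance` parameter and the batch size are all hypothetical, and it assumes you are estimating a simple average from values that are roughly independent. It draws random batches and stops once an approximate 95% interval around the average is narrower than the agreed tolerance, rather than traversing all the data. Checking the interval repeatedly like this makes it somewhat optimistic, so treat the result as a rough guide.

```python
import math
import random
import statistics


def estimate_until_confident(values, tolerance, z=1.96, batch_size=1000, seed=0):
    """Estimate the mean of `values` from growing random samples.

    Stops as soon as the approximate confidence interval (normal
    approximation, z=1.96 for roughly 95%) is within +/- `tolerance`,
    or when the data runs out. All names here are illustrative.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to estimate spread")

    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)  # random order, so every prefix is a random sample

    sample = []
    mean = half_width = float("nan")
    for start in range(0, len(order), batch_size):
        sample.extend(values[i] for i in order[start:start + batch_size])
        if len(sample) < 2:
            continue  # cannot compute a standard deviation yet
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
        if half_width <= tolerance:
            break  # the agreed level of confidence has been reached
    return mean, half_width, len(sample)


if __name__ == "__main__":
    # Synthetic data, purely for demonstration.
    data_rng = random.Random(1)
    readings = [data_rng.gauss(100, 15) for _ in range(200_000)]
    mean, half_width, used = estimate_until_confident(readings, tolerance=0.5)
    print(f"mean ~ {mean:.2f} +/- {half_width:.2f} using {used} of {len(readings)} values")
```

The code only enforces the rule. Choosing the tolerance, and deciding who is allowed to change it, is the policy question.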
These are business governance decisions that impinge heavily on IT governance. The process of building and running an IT Governance Board is well suited to answering these questions, and to setting business policy for the use of big data before you’re in the throes of working with it.






