Data Preparation and Blending: The Prerequisite for Analytics [new e-book]

“Herding cats isn’t exactly in my job description,” the data analyst told me. “Why isn’t all the data I need sitting in one place — or even two places — and waiting for me to analyze it?”

Photo credit: Kathleen Murtagh. Licensed under CC BY 2.0.


“You’re right,” I replied. “Your job description doesn’t exactly include herding cats. But it does include analyzing data, and before you can do that, you have to collect, blend and prepare it.”

This data analyst was dismayed at the prospect of having to pull data in from multiple sources. That does indeed feel like herding cats, especially when the most useful data, where the real business opportunities hide, is unstructured and constantly moving through the cloud, as I mentioned in my post on the changing face of data analytics.

But what do you end up with if you don’t herd those cats? Silos, overlooked opportunities and money left on the table.

Before analysis, prepare and blend your data

We’ve released a new e-book, Break Down the Barriers to Better Analytics, to clarify the data preparation and data blending step that can feel like herding cats. The e-book explores three requirements for aggregating heterogeneous data:

  • All data must be accessible. That means accessing data both inside and outside the firewall. If you omit one source of data simply because of the difficulty in blending it with other sources of data, you’ll close the door on the very business opportunities advanced analytics can help you find.
  • Data must be standardized so that you’re not mixing miles and kilometers, dollars and euros, or quarters and years. You shouldn’t have to set this up repeatedly, but should be able to automate query, aggregation, data quality and transformation tasks.
  • Finally, to keep privileged data in front of only the users entitled to see it, you’ll have to enforce user validation. Instead of nagging IT for every change in user status or access privileges, use self-service tools to keep that process at the business level where it belongs.
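To make the standardization requirement concrete, here is a minimal sketch in Python of what an automated "convert once, reuse everywhere" step might look like. It is not taken from the e-book or any particular product: the field names (`distance_unit`, `currency` and so on), the two sample sources and the exchange rate are all hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: field names, sample sources and the exchange
# rate below are assumptions, not part of any real tool or data set.

KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def standardize(record, eur_per_usd):
    """Return a copy of the record with distance in km and amount in EUR."""
    out = dict(record)  # copy so the source record is left untouched
    if out.get("distance_unit") == "mi":
        out["distance"] = out["distance"] * KM_PER_MILE
        out["distance_unit"] = "km"
    if out.get("currency") == "USD":
        out["amount"] = out["amount"] * eur_per_usd
        out["currency"] = "EUR"
    return out


def blend(sources, eur_per_usd):
    """Combine records from several sources into one standardized list."""
    rows = []
    for source in sources:
        for record in source:
            rows.append(standardize(record, eur_per_usd))
    return rows


# Two hypothetical sources that disagree on units and currency.
crm = [{"distance": 10.0, "distance_unit": "mi", "amount": 100.0, "currency": "USD"}]
erp = [{"distance": 5.0, "distance_unit": "km", "amount": 50.0, "currency": "EUR"}]

# 0.5 is a placeholder rate for the example, not a real quote.
blended = blend([crm, erp], eur_per_usd=0.5)
```

Once the conversion rules live in one function, every new source passes through the same step instead of being cleaned up by hand each time, which is the kind of automation the second requirement describes.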

Those concepts are simple, but they are not easy to implement, because data comes from and resides in a zillion different places.

New e-book: Break Down the Barriers to Better Analytics

Blending data from many different sources can be daunting, but it is not impossible. Our biopharmaceutical customer Shire reaped data analysis efficiencies that helped them improve process control, monitor processes and identify areas for improvement. You can too.

Have a look at our new e-book, Break Down the Barriers to Better Analytics, for more insights into the changing face of analytics; data preparation and data blending; and the corporate- and IT-centered barriers to using analytics efficiently in your organization.

You’ll see that there’s nothing really wrong with herding cats. As long as you can automate it.