When you’re looking for a definition of data democratization, you find a few common threads:
- making digital information accessible to the average user
- giving people the tools to understand data and make quick decisions
- opening up information systems to business users without requiring the involvement of IT personnel
- enabling the organization to be “data first” and more competitive
Sounds pretty liberating, doesn’t it? Kind of a “Power to the People” meets “digital transformation” theme.
It’s true that the goal of data democratization is to help business users get their hands on data quickly so they can act on it just as quickly. That kind of power is an important competitive advantage, and every smart organization wants it.
But power brings responsibility. That’s why every smart organization wants to give its users access to data AND ensure that it’s the data to which they should have access. So here’s one more definition of data democratization for you — this time with guardrails.
What is data democratization?
Data democratization means giving business users access to data so they can quickly make business decisions with it. In data democratization, the role of IT is not to provide the data to those users, but to ensure that they access only the data they need, with organizational control.
The control is necessary to prevent wild-west scenarios. For example, the organization doesn’t want to lose control of the data and have it end up in unpredictable places, like USB drives and users’ personal devices. It wants to ensure the data is used in compliance with regulations like HIPAA and privacy laws like GDPR. And it wants to avoid garbage in, garbage out (GIGO), in which users make bad decisions because they’ve analyzed the wrong data.
The biggest reason to include organizational control in data democratization, though, is efficiency. Most users can’t work efficiently with data because they don’t understand it. Why not? Because they didn’t create the databases, structures, schemas, tables and column names in the data sources. And even if they were in the room when the data sources were being created for their department, what about the data they need from other departments in the organization? A business manager trying to pull together a bird’s-eye view of customer preferences could easily need data from Sales, Operations, Finance, Marketing and E-commerce. That can be like needing five different languages to buy a cup of coffee.
So, if you’re going to give business users access to data sources, keep in mind that they’re experts in business, not in IT or database programming. You don’t want them to waste their time looking for useful data in a lot of useless places.
Why data democratization?
Data democratization is related to data empowerment, a three-pillared IT approach for giving users what they need to make the best decisions for your organization.
The first pillar is data governance. To get a 360-degree view of data, you must understand what data you have, what it means and how it relates to the business. For example, to answer the question, “Who is buying our products in each region?” you would look for data sources for Customers and Sales, then determine the fields to query.
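As a minimal sketch of what "determine the fields to query" could look like, the example below joins a Customers source to a Sales source and counts buyers per region. The table names, column names (customer_id, region, amount) and sample rows are assumptions made up for illustration; a real organization's sources will be named and structured differently.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical Customers and Sales tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
        CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ana', 'East'), (2, 'Ben', 'West'), (3, 'Cy', 'East');
        INSERT INTO sales VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0), (13, 3, 25.0);
        """
    )
    return conn


def buyers_by_region(conn):
    """Return (region, distinct buyers, total sales) rows, sorted by region."""
    query = """
        SELECT c.region,
               COUNT(DISTINCT c.customer_id) AS buyers,
               SUM(s.amount) AS total_sales
        FROM customers AS c
        JOIN sales AS s ON s.customer_id = c.customer_id
        GROUP BY c.region
        ORDER BY c.region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for region, buyers, total in buyers_by_region(build_demo_db()):
        print(region, buyers, total)
```

Even in this toy case, the query only works if someone knows that the two sources are linked by a customer identifier and that region lives on the customer record. That knowledge is what governance is meant to capture.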
As described above, governance also involves understanding the guardrails — the rules, policies and regulations that are associated with the data. Does the data on Customers and Sales contain personally identifiable information (PII)? Must we comply with regulations or standards on the use of that data? You don’t have your customers’ permission to use their personal data for whatever you want; there are restrictions on its use. Information privacy is a hot topic everywhere, and in the U.S., the regulatory map is becoming a minefield as individual states establish their own privacy mandates.
In short, data governance is about walking the fine line between getting the greatest business use out of the data and reducing the risks that come with that data. The risks include fines, penalties and damage to your reputation if you leave sensitive data unsecured and unencrypted.
The next pillar is data operations, which involves preparing data for use and ensuring it’s available to your business users. Giving users access to all the data in the organization doesn’t matter if a simple query takes five minutes to run. The systems that deliver the data have to perform well enough to meet the needs of the business.
Finally, data protection covers the mechanics of ensuring your data is backed up properly and your myriad endpoints are secured. It includes archiving your data, retaining it for compliance and being prepared in case of audits. And it extends to setting policies on sensitive data so that it cannot be used improperly. If privacy laws prohibit the free examination and use of customer information, for example, data protection technologies can mask PII like name, address and age while leaving sales data usable.
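To make the masking idea concrete, here is a small sketch that replaces PII fields with a placeholder while leaving the sales figures intact. The field names, the placeholder and the sample records are assumptions for illustration only; commercial data protection tools use more sophisticated techniques than simple redaction.

```python
MASK = "***"

# Assumption: these are the fields that policy classifies as PII.
PII_FIELDS = frozenset({"name", "address", "age"})


def mask_pii(record, pii_fields=PII_FIELDS):
    """Return a copy of the record with PII values replaced by a placeholder."""
    return {key: (MASK if key in pii_fields else value) for key, value in record.items()}


def total_sales(records):
    """Sum the hypothetical 'amount' field across records."""
    return sum(record["amount"] for record in records)


if __name__ == "__main__":
    orders = [
        {"name": "Ana Lopez", "address": "1 Main St", "age": 41, "region": "East", "amount": 120.0},
        {"name": "Ben Ito", "address": "9 Oak Ave", "age": 35, "region": "West", "amount": 80.5},
    ]
    masked = [mask_pii(order) for order in orders]
    print(masked[0])
    print(total_sales(masked))
```

The masked records can still be grouped by region and totaled, so an analyst gets the sales picture without seeing who the customers are.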
You want the IT resources in your organization to focus on those three pillars — data governance, data operations and data protection — instead of fulfilling users’ requests for query results.
That’s why data democratization.
It isn’t that IT is in the way. It’s that there is so much data and so many different tools for using it that data democratization was inevitable.
Consider these business realities:
- The shortage of IT talent is real, in positions ranging from programmers to system administrators. In smart organizations, the move to groom a generation of “citizen analysts” has the potential to become part of business strategy instead of a stop-gap measure.
- Every opportunity has a shelf life, and you can miss it if you can’t get to the data you need in time. Recent history has shown the role that data analysis can play in slowing the spread of a coronavirus and devising a vaccine against it. While not every company faces the same array of life-and-death decisions, almost every company faces formidable competitors. Decisions made quickly and based on the right data are the key to coming out ahead.
- Data scientists spend as much as 45 percent of their time on data preparation tasks, including loading and cleaning data. While that is an improvement over the 75-85 percent they were spending a few years ago, it still represents a big chunk of time massaging data instead of analyzing it.
It turns out that, when you drill into that data preparation time, it usually starts with answering a question like “What data do I have that could help me solve this problem?” Most people don’t know what data is available in the organization. And, if they do know, they don’t know where it is or how to get access to it. So the next questions are “Who owns system X, system Y, etc.?” and “How can I get access to the data in those systems?”
When they access the data source, they’re likely to see an arcane description of the data in it, like non-intuitive table and field names. So they ask, “Is this the right Net Sales field?” and “Which one shows me sales before taxes? After taxes?” Then, to get to their target number, they may need to combine multiple data points. They wonder, “Are the fields in this table related to the fields in that table? How? How do I have to combine them?”
That obstacle course of preparation slows the process of data democratization. The process goes even more slowly if you have to phone the right person in IT to walk you through the data.
What does data democratization look like? Self-service shopping.
Ideally, it would be as easy for business users to find and use the right data as it is for them to shop online or find a movie to watch. Data democratization is about guiding the right data between the guardrails and putting it at users’ fingertips.
That means users would have a self-service shopping experience that includes features like these:
- Browsing with sensible parameters until they find data of interest
- Getting more information on the data — what it does and does not contain, and how it is derived
- Seeing related data — “People who used this data also used this other data”
- Using a shopping cart that shows the data they want and when they can expect to receive it
- Joining a community of people who have used the data and can tell you more about it
- Taking part in an entire ecosystem of business users instead of IT professionals
- Seeing whether the data has been encrypted, in case they want to transport it
- Determining whether the data has been anonymized so they don’t run afoul of privacy laws
That’s the ideal state.
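A few of those shopping features can be sketched in code. The example below is a hypothetical, minimal data catalog: each entry carries a plain-language description plus encryption and anonymization flags, a keyword search stands in for browsing, and a usage log drives the "people who used this data also used" suggestion. Every name here (CatalogEntry, search, also_used, the datasets) is invented for illustration and does not describe any particular product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    description: str  # what the data does and does not contain
    encrypted: bool
    anonymized: bool


def search(catalog, keyword):
    """Return entries whose name or description mentions the keyword."""
    needle = keyword.lower()
    return [
        entry for entry in catalog
        if needle in entry.name.lower() or needle in entry.description.lower()
    ]


def also_used(usage_log, dataset):
    """Rank other datasets by how many users of `dataset` also used them.

    usage_log maps a user name to the set of dataset names that user accessed.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for datasets in usage_log.values():
        if dataset in datasets:
            counts.update(name for name in datasets if name != dataset)
    return sorted(counts, key=lambda name: (-counts[name], name))


if __name__ == "__main__":
    catalog = [
        CatalogEntry("net_sales", "Sales after returns, before taxes", True, True),
        CatalogEntry("customers", "Customer master with region", True, False),
    ]
    usage_log = {
        "ana": {"net_sales", "customers"},
        "ben": {"net_sales", "customers", "returns"},
        "cy": {"returns"},
    }
    print([entry.name for entry in search(catalog, "sales")])
    print(also_used(usage_log, "net_sales"))
```

The point of the sketch is that the user searches by business meaning and sees the guardrail flags up front, without ever opening the underlying tables.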
Now, let’s be honest: It’s a lot less fun to shop for data than it is for power tools, hair care products and red slingback pumps. So, until we reach that ideal state, here’s another take on what data democratization looks like.
You start with this:
It’s the hodgepodge of modeling tools, report generators, cloud providers, ERPs and relational databases that business users in any enterprise have to sort through. Layered on top of them are hoops that users must jump through — the rules, regulations, standards, codes and auditing requirements associated with using the data.
Data democratization, on the other hand, looks more like this:
On the left, database professionals can find the physical data systems and structures that make sense to them, and on the right, business analysts can find the entities they need, with the guardrails of policies and governance.
The combination enables data democratization — seeing how the data flows through the organization, where you can pick it up and which enterprise usage policies apply to it.
Through data democratization you can deal with the abundance of data by giving more people what they need to make business decisions. With tools that demystify the structure and relationships among data points, users can analyze and make decisions as close to the data as possible.