Hypercompetition, globalization, economic uncertainties: all of it is converging to drive a C-suite impetus for the business to become more data-driven. Organizations invest in more data science and analytical staff as they demand faster access to more data. At the same time, they’re forced to deal with more regulations and privacy mandates such as GDPR, CCPA, HIPAA, and numerous others. The outcome? The current methods meant to serve them, usually an overburdened IT team, end up failing, resulting in an alarming amount of friction across the entire organization.

The heart of the friction

Friction across the enterprise ecosystem impacts every part of the value chain. It’s driven by three primary dynamics:

An increasing number of analysts and data scientists asking for data.
More regulations and policies to enforce.
A tectonic shift of data processing and storage to the cloud.

Analytical demand

Over the last two to three decades, analytics have gone from the domain of IT to business self-service analytics. For the traditional financial and summary type reports, this is easy since data comes from curated and structured data warehouses. The newer self-service demand is for non-curated data for purposes of AI and machine learning.

Regulatory demand

More regulations result in more policies, but the bigger impact is going from passive enforcement to active enforcement. Passive enforcement relies on training people and hoping they’ll follow proper protocol. Active enforcement establishes a posture where systems proactively stop people from hurting themselves or the company. For example, a zero trust framework would assume you should only have access to the data you need and nothing more.
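As a minimal sketch of active enforcement, assume a hypothetical in-memory grants table. The names `GRANTS`, `can_access`, and `read_dataset` are illustrative and not taken from any product. The system refuses any access that has not been explicitly granted instead of trusting people to follow protocol:

```python
# Hypothetical grants table: user -> datasets that were explicitly granted.
GRANTS = {
    "analyst_a": {"sales_summary"},
    "analyst_b": {"sales_summary", "customer_profiles"},
}


def can_access(user: str, dataset: str) -> bool:
    """Deny by default: allow only what has been explicitly granted."""
    return dataset in GRANTS.get(user, set())


def read_dataset(user: str, dataset: str) -> str:
    """Active enforcement: the system itself blocks the request."""
    if not can_access(user, dataset):
        raise PermissionError(f"{user} has no grant for {dataset}")
    return f"rows from {dataset}"
```

An unknown user, or a known user asking for a dataset outside their grants, gets a `PermissionError` rather than the data.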

Moving to the cloud

With the move to the cloud, we don’t just move data outside our traditional perimeter defenses. The platforms separate storage from processing or compute with different styles of compute to serve different analytical use cases. The result is an exploding number of policies applied across dozens of data technologies — each with its own mechanism for securing data.

A use case for balanced data democratization

Privacera worked with a major sports apparel manufacturer and retailer on its data-driven journey to the cloud. The client’s on-prem data warehouse and Hadoop environment turned into a massive set of diverse technologies: S3 for storage and a host of compute and processing services like EMR, Amazon Web Services (AWS), Starburst, Snowflake, Kafka, and Databricks. GDPR and CCPA emerged as critical mandates that had to be enforced actively. Hundreds of analysts excitedly tried to get access to the new data platform, outnumbering the IT support staff. The result was more than 1 million policies, and they only managed to get around 15 percent of their data into the business’s hands.

The solution: Centralized policy management and enforcement for their entire data estate. Here are the elements of their centralized data security governance:

Real-time sensitive data discovery, classification, and tagging to identify sensitive data in newly onboarded data sets from trading partners.
Build once, enforce everywhere. Policies are built centrally in an easy to use, intuitive manner. Those policies are then synchronized to each underlying data service where the policy is natively enforced.
Built-in advanced attribute, role, resource or tag-based policies, masking and encryption to define fine-grained controls versus the previous coarse-grained model.
Real-time auditing of access events, monitoring, and alerting on suspicious events.
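The "build once, enforce everywhere" idea can be sketched as follows. This is an illustration under stated assumptions, not Privacera's implementation. The `Policy` class, the `CATALOG` contents, and both rendered rule formats are hypothetical, and real data services each have their own native syntax:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One centrally defined rule: a role may read data carrying a tag."""
    role: str
    tag: str


# Hypothetical catalog: which resources in each service carry which tags.
CATALOG = {
    "warehouse": {"sales.customers": {"pii"}, "sales.orders": set()},
    "object_store": {"raw/customers/": {"pii"}, "raw/clickstream/": set()},
}


def render_warehouse(policy: Policy, resource: str) -> str:
    # Illustrative SQL-style grant, not the exact syntax of any product.
    return f"GRANT SELECT ON {resource} TO ROLE {policy.role}"


def render_object_store(policy: Policy, resource: str) -> dict:
    # Illustrative rule document, not the format of any particular cloud.
    return {"effect": "allow", "action": "read",
            "prefix": resource, "role": policy.role}


RENDERERS = {"warehouse": render_warehouse, "object_store": render_object_store}


def synchronize(policy: Policy) -> dict:
    """Translate one central policy into native rules for every service."""
    native = {}
    for service, resources in CATALOG.items():
        render = RENDERERS[service]
        native[service] = [
            render(policy, name)
            for name, tags in sorted(resources.items())
            if policy.tag in tags
        ]
    return native
```

Calling `synchronize(Policy(role="privacy_team", tag="pii"))` produces one rule per tagged resource in each service, so a single tag-based policy replaces many hand-written per-resource rules.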

The result: The client reduced the number of policies 1,000-fold, onboarded new data 95 percent faster, and got 100 percent of the data into the business’s hands.

The new way forward

Gartner’s State of Data and Analytics Governance suggests that by 2025, 80 percent of analytical initiatives will be unsuccessful because they fail to modernize their data governance processes. The challenge for CIOs and data and privacy leaders is these mandates are often not owned by a single person. CISOs often feel they own the security posture but not the enforcement. The data leader focuses on the analytical output and insights. The CIO is often left holding the bag and needs to pull it all together. In its recent Hype Cycle for Data Security 2022, Gartner suggests 70 percent of the investment in the data security category will be toward broad-based data security platforms that can help organizations centralize data access and policy enforcement across their diverse data estate.



Truly data-driven companies see significantly better business outcomes than those that aren’t. According to a recent IDC whitepaper, leaders saw on average two and a half times better results than other organizations in many business metrics. In particular, companies that were leaders at using data and analytics had three times higher improvement in revenues, were nearly three times more likely to report shorter times to market for new products and services, and were over twice as likely to report improvement in customer satisfaction, profits, and operational efficiency.

But to get maximum value out of data and analytics, companies need to have a data-driven culture permeating the entire organization, one in which every business unit gets full access to the data it needs in the way it needs it.

This is called data democratization. Doing it right requires thoughtful data collection, careful selection of a data platform that allows holistic and secure access to the data, and training and empowering employees to have a data-first mindset. Security and compliance risks also loom.

Starting on a solid data foundation

Before choosing a platform for sharing data, an organization needs to understand what data it already has and strip it of errors and duplicates.

A big part of preparing data to be shared is an exercise in data normalization, says Juan Orlandini, chief architect and distinguished engineer at Insight Enterprises.

Data formats and data architectures are often inconsistent, and data might even be incomplete. “All of a sudden, you’re trying to give this data to somebody who’s not a data person,” he says, “and it’s really easy for them to draw erroneous or misleading insights from that data.”

Organizations often turn to outside help with data normalization because, if it’s done incorrectly, a business might still be left with data quality issues and be unable to get as much use out of its data as intended.

As more companies use the cloud and cloud-native development, normalizing data has become more complicated.

“It might be in a NoSQL database, a graph database, or in all these other types of databases now available, and making those consistent becomes really challenging,” Orlandini says.
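To make the normalization and de-duplication work concrete, here is a minimal sketch. It assumes records arrive from several systems with inconsistent field names. The `ALIASES` table and the choice of email as the matching key are hypothetical, picked only for illustration:

```python
# Hypothetical field aliases seen across different source systems.
ALIASES = {
    "email": "email", "e_mail": "email", "EmailAddress": "email",
    "name": "name", "full_name": "name", "FullName": "name",
}


def normalize(record: dict) -> dict:
    """Map source-specific field names to one schema and tidy the values."""
    clean = {}
    for field, value in record.items():
        target = ALIASES.get(field)
        if target is None or value is None:
            continue  # drop unknown fields and missing values
        clean[target] = " ".join(str(value).split())  # collapse whitespace
    if "email" in clean:
        clean["email"] = clean["email"].lower()
    return clean


def deduplicate(records: list) -> list:
    """Keep the first normalized record seen for each email address.

    Records without an email are skipped in this sketch.
    """
    seen = {}
    for record in map(normalize, records):
        key = record.get("email")
        if key and key not in seen:
            seen[key] = record
    return list(seen.values())
```

Real pipelines need fuzzier matching and per-source rules, which is part of why the work is often handed to specialists.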

Exercising tactful platform selection

In many cases, only IT has access to data and data intelligence tools in organizations that don’t practice data democratization. So in order to make data accessible to all, new tools and technologies are required.

Of course, cost is a big consideration, says Orlandini, as well as deciding where to host the data, and having it available in a fiscally responsible way. An organization might also question if the data should be maintained on-premises due to security concerns in the public cloud. But Kevin Young, senior data and analytics consultant at consulting firm SPR, says organizations can first share data by creating a data lake on a storage service like Amazon S3 or Google Cloud Storage. “Members across the organization can add their data to the lake for all departments to consume,” says Young. But without proper care, a data lake can end up disorganized and cluttered with unusable data. Most organizations don’t end up with data lakes, says Orlandini. “They have data swamps,” he says.

But data lakes aren’t the only option for creating a centralized data repository.

Another is through a data fabric, an architecture and set of data services that provide a unified view of an organization’s data, and enable integration from various sources on-premises, in the cloud and on edge devices.

A data fabric allows datasets to be combined, without the need to make copies, and can make silos less likely.

There are many data fabric software vendors, like IBM Cloud Pak for Data and SAP Data Intelligence, which were both named leaders in Forrester’s Enterprise Data Fabric Q2 2022 report. But with many available options, it can be difficult to know which to choose.

The most important thing is to analyze and monitor data, says Amaresh Tripathy, global analytics leader at professional services firm Genpact.

“Many platforms are out there,” he says. “Choose any platform that works for you, but it should be automated and visible.” Also, the data should be easily accessible from a self-service platform that makes data analysis and reporting easy, even for people with no technical experience: “Like a portal where people can see all the data, what it means, what the metrics are, and where it’s coming from,” says Tripathy.

There’s no perfect tool, and there’s often a trade-off between how well a tool handles data lineage, data cataloging, and data quality. “Most organizations are trying to solve all three problems together,” Tripathy adds. “Sometimes you over-index on one and don’t get a very good value on another.” So an organization should decide what’s most important, he says. “They should know why they’re doing it, which tool gives them the best bang for their buck on those three dimensions, and then make the appropriate decision.”

When thinking about how to share data, an organization can also consider implementing a data mesh, which takes the opposite approach to data fabric. While data fabric manages multiple data sources from a single virtual centralized system, a data mesh is a form of enterprise data architecture that takes a decentralized approach and creates multiple domain-specific systems.

With a data mesh, organizations can help ensure data is properly handled by putting it in the hands of those who best understand it, says Chris McLellan, director of operations at Data Collaboration Alliance, a global nonprofit that helps people and organizations get full control of their data. It could be a person, such as the head of finance, or a group of people that are acting as data stewards.

“At its core, it’s got this concept of data as a product,” he says. “And a data product is something that can be owned and curated by someone with domain expertise.”

Implementing a data mesh architecture allows an organization to put specific data sets in the hands of subject matter experts. “These people are closer to the regulations, the customer, and the end users,” McLellan says. “They’re closer to everything about that specific domain of information.”

Data mesh isn’t linked to any specific tools, so individual teams can choose whichever ones best fit their needs, and there isn’t the bottleneck of everything having to go through a central data team.

“You’re seeing a decentralization not just of IT or app delivery, but also of data management and data governance,” says McLellan, “which are good things because marketers know the laws around consumer protection better than the IT team, and finance knows finance regulations better than IT.”

While there are many vendors selling data mesh, it’s still a shiny new object, Forrester warns, and it has its challenges, including conflicts in how it’s defined, the technologies it uses, and its value.

Training and change management

Once an architecture for data democratization is established, employees need to understand how to work with the new data processes. People can be given the right data, but even if they’re trained as administrators or accountants, they’re not necessarily going to understand what to do with it, says Insight’s Orlandini. Data access is not sufficient in itself to make an organization data-driven. “You have to do some training,” he says. “If you don’t do it properly, you’re going to have mixed success at best, or it might be a failure.”

Some organizations have started their own in-house training programs to ensure employees understand how to interpret and properly handle data.

Genpact, for instance, introduced what it calls its DataBridge initiative last year to increase data literacy across the organization.

“Our intention was not to make 100,000 people citizen data scientists,” says Tripathy. “We provide the awareness in the context of how they do their work.” For example, an employee doing claims analysis doesn’t need to learn all about anomaly detection — what they need to understand is what anomaly detection means for them. “You may or may not have all the skill sets to look at the data yourself, but you should be able to raise a question and seek help — and being able to ask that question in the right manner is the data-aware aspect of it,” he adds.

Laying the security and compliance groundwork

Proper data governance needs to be implemented from the start to maintain the integrity of data and avoid costly penalties.

Along with IT leaders, security and compliance teams need to be part of the initial conversation, says Insight’s Orlandini. “It’s a big challenge, and a lot of organizations struggle with this,” he says, adding that it’s a prerequisite that a company’s leadership understands exactly what it’s offering to share, and makes sure it’s being offered to the right people.

“We live in a highly regulated world where we have to be super careful,” he says, “especially in industries like healthcare and finance where there are laws that have severe consequences if you let the wrong person have access to the wrong data.”

There are also tools that help organizations with data masking and data obfuscation to avoid revealing personally identifiable information. “You can start getting insights without revealing PII data, HIPAA records, or any of those regulatory requirements that are out there,” he continues. “There are also tools with attribute-based access controls where you actually tag data with very specific kinds of attributes — this has PII or HIPAA, whatever your attributes are — and then you only have access to the data with the right kind of attributes associated with it.”

In this way, the data controls itself automatically, and it’s available in a public cloud or hybrid environment with data in multiple locations, or even in private environments with strict compliance controls that can be put in place.
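The tag-and-attribute approach described above can be sketched in a few lines. This is an illustration under stated assumptions and not any vendor's API. The `COLUMN_TAGS` table, the `MASK` value, and `read_row` are all hypothetical:

```python
# Hypothetical column tags, as a discovery or tagging step might assign them.
COLUMN_TAGS = {
    "name": {"pii"},
    "diagnosis": {"hipaa"},
    "region": set(),
    "order_total": set(),
}

MASK = "***"


def read_row(row: dict, user_attributes: set) -> dict:
    """Return the row with a column masked unless the user holds every
    attribute the column is tagged with.

    Columns with no catalog entry are masked, so untagged data is not
    exposed by accident.
    """
    visible = {}
    for column, value in row.items():
        tags = COLUMN_TAGS.get(column)
        allowed = tags is not None and tags <= user_attributes
        visible[column] = value if allowed else MASK
    return visible
```

A user holding only the `pii` attribute would see names but still get masked diagnoses, while untagged business columns such as `region` stay readable for everyone.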

Long-term benefits

Not only can data democratization help an enterprise speed up its data pipelines, it can empower people to find new ways to solve problems through a better awareness of how to analyze and work with data.

Gartner says that by adopting data democratization, organizations can solve resource shortages, decrease bottlenecks, and enable business units to handle their own data requests more easily. By democratizing data, organizations can improve their decision-making by allowing more people to contribute to the analysis and interpretation of data; increase collaboration across teams within an organization; and enhance transparency, since more people have access to information, and can see how data-driven decisions are made.


Data-driven brick house, or house of modern tech cards?

The origin of winning insights? Sometimes, it’s an individual at your organization. Other times, insights come from cross-team collaboration. No matter the human catalyst for the breakthrough, it all starts with your data. You need to empower your people, across technical and non-technical users, with equal access to data. In other words, data democratization. And democratization begets the data-driven culture your organization needs for enduring success.

There’s good news and not-so-good news about organizations’ data-driven mission. The good? Most executives agree data-driven operations across lines of business are key to a winning strategy. Illustrating that point is the 85% increased investment in digital capabilities and 77% increased investment in IT, as reported in the 2022 Gartner CEO and Senior Business Executive Survey.

The not-so-good news? In NewVantage Partners’ Data and AI Leadership Executive Survey 2022, two alarming numbers highlight the struggle: 1) Only 26.5% report having achieved the data-driven goal, and 2) only 19.3% report having established a data culture. Those two numbers contribute to one particularly disturbing number: Through 2025, 80% of organizations seeking to scale digital business will fail because they don’t take a modern approach to data and analytics governance, according to Gartner’s State of Data and Analytics Governance. Modernizing tech stacks and migrating to the cloud are not enough on their own. Organizations must modernize their governance practices to fully uphold their modernization efforts. With no buttressing data-driven or data-democratization pillars, their efforts are built upon a risk-ridden house of cards.

Building a democratic, data-driven brick house

Moving to the cloud promises massive potential. But the benefits are equally matched by complexity. Each cloud service has its own unique method of security and access policy enforcement. Now, add the growing spectrum of regulations and compliance mandates that must be enforced and kept up to date across this increasingly complex and technically diverse data estate. Plus the increasing number of data consumption and policy stakeholders.

All this complexity creates massive friction between data consumers, policy drivers, and especially IT, which is tasked with physical policy implementation and enforcement. 

So begins the journey to evolve data governance. And with no single size fitting all, organizations are struggling. How do we build a sustainable, modern data governance program?

Six practical steps toward better governance

Create and document your data guidelines as to who can access specific types of data.
Document policies to formalize roles, access, and capabilities. For example, what can each type of user do with different types of data?
Perform data discovery to determine what’s in each table, column, row, or cell. Classify and tag the sensitive data based on the rules and requirements laid out in your guidelines and policies.
Create fine-grained access policies that will be enforced in your data layer, predicated on your defined guidelines. This is the basis for permitting or accessing data for discovery, moving beyond describing and documenting to establish where sensitive data resides.
Expand base policies by defining and creating specific controls for enforcing encryption, masking, and tokenization across each data service accessed.
Provide deep analysis and monitoring to provide full visibility, monitoring, and auditing into data access and usage, ideally via a single pane of glass for administrators and data stewards to monitor and control data at rest and in motion.
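The discovery and tagging step in the list above can be sketched as a simple pattern scan over column values. The patterns, the 80 percent threshold, and the function names are assumptions made for illustration. Production discovery tools use far richer detection than two regular expressions:

```python
import re

# Illustrative patterns only; real discovery relies on much broader detection.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(values: list, threshold: float = 0.8) -> set:
    """Tag a column when at least `threshold` of its non-empty values
    match a sensitive-data pattern."""
    samples = [str(v) for v in values if v not in (None, "")]
    if not samples:
        return set()
    tags = set()
    for tag, pattern in PATTERNS.items():
        hits = sum(1 for s in samples if pattern.match(s))
        if hits / len(samples) >= threshold:
            tags.add(tag)
    return tags


def discover(table: dict) -> dict:
    """Map each column name to the set of tags found in its values."""
    return {column: classify_column(values) for column, values in table.items()}
```

The resulting tags are what the fine-grained access, masking, and encryption policies in the later steps would key on.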

It’s possible to achieve centralized data access and security governance across your entire multi-cloud estate. All your stakeholders – such as business teams, IT, and compliance – can leverage a unified, holistic way to manage, define, and enforce data access policies across your storage, compute engines, and consumption methods. That’s what you need to power your data democratization and achieve a data-driven enterprise.

By using Privacera, enterprises get a single, centralized data security and access policy management and enforcement layer for efficient access governance across hybrid and multi-cloud data sources. Its unified platform simplifies data security and privacy enforcement by automating and managing the entire lifecycle, including sensitive data discovery and classification. Classify and tag data based on built-in support for PII, GDPR, HIPAA, and various other regulations. Build fine-grained security policies, automatically synced to your underlying data services for localized policy enforcement and execution. And reduce time to insight through orchestration and automation of manual data governance processes to fuel your data-driven mission. Privacera’s built-in data access request workflows and automation securely get the right data to the right people faster to power your data democratization.

