Gina Harris, Director of Data Governance
If making the most of your data matters (and really, why wouldn’t it?), knowing which tools to use, recognizing how they differ, and using them in the right order can be highly beneficial.
Informatica, an Enterprise Cloud Data Management leader, has a suite of integrated tools for data governance; they include Axon and Enterprise Data Catalog (EDC).
Data Governance can sound very heavy – and a little scary. But really, as one of my colleagues says, data governance is about:
- What data your company has
- Where it’s located
- Who owns it
- What it means, and
- Its trustworthiness
The goal of data governance is to improve the data environment through shared accountability and communication. Simply put, Axon and EDC help make data governance a lighter effort.
Axon is Informatica’s business-facing data governance product that encourages collaboration between team members in both Business and IT. It promotes data stewardship and a common business language. It also provides insight into data quality and compliance. Policies, such as the company’s response to HIPAA or GDPR, can be added to Axon and linked to assets so everyone knows how to appropriately use and protect the data. Metadata about data sets, systems, and attributes can be imported into Axon through a link with EDC. Axon holds no actual data, only metadata.
EDC is exactly what its name suggests – an enterprise data catalog. EDC scans the company’s resources to determine what data assets (systems, data sets, and data elements) exist and whether they contain PII or PHI. EDC can also profile the data and provide technical data lineage with impact analysis. Informatica’s machine learning offering, Claire, helps provide context to the data by classifying it based on the field name and the data within the field. Unlike Axon, EDC can display actual data, for a single column. So, EDC focuses on the technical side, while Axon focuses on the business side. When used together, they’re a very strong combination.
Although Axon and EDC both use the term “Domain,” each tool uses it slightly differently – and that can result in confusion.
In Axon, Domain is the top level of a glossary hierarchy that includes Term, Entity, and Domain.
- Term is the lowest level, and is an idea or concept. Social Security Number, Last Name, Birth Date, and City are all examples of Term.
- Entity is the middle level. It’s sometimes referred to as “sub-domain” and is used to group Terms into a logical category. Party Identifiers or Party Demographics are examples of Entity.
- Domain is the highest level of the glossary hierarchy. Examples of Domain include Party, Address, or Product.
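To make the three levels concrete, here is a minimal sketch of the hierarchy as plain Python data classes. It is only a mental model, not Axon's API or storage format: the class names, the `term_names` helper, and the choice of which Terms sit under which Entity are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical in-memory model of the glossary hierarchy (not Axon's API).

@dataclass
class Term:
    """Lowest level: a single idea or concept."""
    name: str

@dataclass
class Entity:
    """Middle level ("sub-domain"): groups Terms into a logical category."""
    name: str
    terms: List[Term] = field(default_factory=list)

@dataclass
class Domain:
    """Highest level: groups Entities."""
    name: str
    entities: List[Entity] = field(default_factory=list)

    def term_names(self) -> List[str]:
        # Walk Domain -> Entity -> Term and collect the Term names in order.
        return [t.name for e in self.entities for t in e.terms]

# Illustrative grouping only; the placement of each Term is an assumption.
party = Domain("Party", [
    Entity("Party Identifiers", [Term("Social Security Number")]),
    Entity("Party Demographics", [Term("Last Name"), Term("Birth Date")]),
])

print(party.term_names())
```

Reading the structure top-down (Domain, then Entity, then Term) mirrors how the glossary narrows from a broad subject area to a single concept.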
By contrast, “Domain” in EDC is a classification of data. It is used to identify the functional meaning of a data element. The Data Domain is applied by Claire, data stewards, or subject matter experts. Examples of a Data Domain in EDC include Social Security Number, Last Name, Birth Date, and City.
Hmmm. Don’t those sound familiar? They’re the same examples that I used for Term in Axon. A Data Domain in EDC relates more closely to a Term in Axon than to a Domain in Axon.
Fortunately, Axon can link an Axon Term to an EDC Data Domain. This is a bi-directional link with several benefits.
- All of the data elements (fields) in EDC that are tagged with the linked Data Domain can automatically be brought into Axon and associated with the Glossary Term. So, if we link Axon’s Social Security Number Term with EDC’s SSN Data Domain, then the metadata for all of the data elements in the Catalog that have been tagged as SSN will be available in Axon.
- Authorized EDC users will have the ability to jump from Axon into EDC for even more information about the attribute, including profile results.
- The business glossary definition from Axon for Social Security Number will be applied to all of the data elements in EDC that have the SSN Data Domain.
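The two directions of the link can be sketched in a few lines of Python. This is a conceptual model only and does not call any Informatica API: `DataElement`, `GlossaryTerm`, `link_term_to_data_domain`, the field names, and the sample definition text are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical stand-ins for catalog and glossary objects (not Informatica's API).

@dataclass
class DataElement:
    """A field in the catalog, optionally tagged with a Data Domain."""
    name: str
    data_domain: Optional[str] = None
    definition: Optional[str] = None

@dataclass
class GlossaryTerm:
    """A business glossary Term with its definition."""
    name: str
    definition: str
    linked_elements: List[str] = field(default_factory=list)

def link_term_to_data_domain(term: GlossaryTerm, data_domain: str,
                             catalog: List[DataElement]) -> None:
    """Model both directions of the link for one Term / Data Domain pair."""
    for element in catalog:
        if element.data_domain == data_domain:
            # Catalog -> glossary: the tagged element's metadata is
            # associated with the Term.
            term.linked_elements.append(element.name)
            # Glossary -> catalog: the business definition is applied
            # to the tagged element.
            element.definition = term.definition

catalog = [
    DataElement("customer.ssn", data_domain="SSN"),
    DataElement("employee.social_sec_no", data_domain="SSN"),
    DataElement("customer.last_name", data_domain="Last Name"),
]
ssn_term = GlossaryTerm("Social Security Number",
                        "Government-issued identifier for an individual.")

link_term_to_data_domain(ssn_term, "SSN", catalog)
print(ssn_term.linked_elements)
```

Only the elements tagged SSN are touched; `customer.last_name` keeps its empty definition because its Data Domain is not part of this link.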
It is important to curate the Data Domain in EDC before linking it to an Axon Term. Curating is the process of accepting or rejecting Claire’s Data Domain assignments, and it also helps Claire make better assignments in the future. Linking Axon and EDC before curating, on the other hand, will bring incorrectly tagged data elements into Axon and associate them with the glossary term. For example, suppose Claire thought a zip_plus_4 field was a Social Security number. If that data element was not curated in EDC before the SSN Data Domain was linked to the Social Security Number Term in Axon, then the zip_plus_4 data element will appear in Axon associated with Social Security Number. The mismatch can be corrected afterward, but it is easier to avoid it in the first place.
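The zip_plus_4 scenario can be walked through with a small Python sketch. It is a simplified model, not EDC's behavior in detail: the `Assignment` class, the status values, and the rule that every non-rejected assignment flows across the link are assumptions chosen to match the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical model of Data Domain assignments (not EDC's API).

@dataclass
class Assignment:
    element: str
    data_domain: str
    status: str = "inferred"  # assumed states: "inferred", "accepted", "rejected"

def curate(assignments: List[Assignment], element: str, accept: bool) -> None:
    """Accept or reject the assignment(s) made for one data element."""
    for a in assignments:
        if a.element == element:
            a.status = "accepted" if accept else "rejected"

def elements_for_domain(assignments: List[Assignment], data_domain: str) -> List[str]:
    """Elements that would flow across the link: everything not rejected."""
    return [a.element for a in assignments
            if a.data_domain == data_domain and a.status != "rejected"]

assignments = [
    Assignment("customer_ssn", "SSN"),
    Assignment("zip_plus_4", "SSN"),  # the mistaken inference from the example
]

# Linking before curation: the mistaken element comes along.
before = elements_for_domain(assignments, "SSN")

# Curate first, then link: the mistaken element is filtered out.
curate(assignments, "customer_ssn", accept=True)
curate(assignments, "zip_plus_4", accept=False)
after = elements_for_domain(assignments, "SSN")

print(before)
print(after)
```

The only difference between the two results is the order of operations, which is the point of curating before linking.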
By curating in EDC first, where it’s substantially easier, and then linking Axon Terms with EDC Data Domains, you can make the most of your data.