Community Comment: Part 30 - With respect to Azure, Databricks Unity Catalog & Microsoft Purview is not an either-or proposition
- Databricks Unity Catalog & Microsoft Purview
- Not an either-or proposition for Azure
- Likely need to integrate these products
- Integration dependent on several factors
These are the comments I provided in reaction to a community discussion thread.
Cloud Solutions Architect at Consulting Firm:
Read my latest post if you want to know what I think.
Best data governance tool: Databricks Unity Catalog or Microsoft Purview?
https://www.linkedin.com/pulse/best-data-governance-tool-databricks-unity-catalog-matias-samblancat/?trackingId=3yW1GpYsQby6qEc5ZzJkbQ%3D%3D
Gfesser:
I'm currently working with my team on implementing both Unity Catalog *and* Microsoft Purview for our data platform. This article provides a decent summary, but since I agree that these products are moving fast, pay particular attention to the concluding "key considerations" section. Also, keep in mind that this article focuses more on the differences between these two products than on integrating them. The closing paragraphs mention that "we know [MS] is working on a new connector to streamline the process and hope [it] is ready to use soon", but a connector is already available, albeit still in preview. Because my team cannot make use of non-GA products in production, I'm working with MS to get a better sense of the ETA. Note also that MS implemented an OSS accelerator to integrate these, but both Databricks *and* MS have advised me not to use it due to tight coupling with specific library versions of Spark etc. Also keep in mind that (1) regarding data lineage etc., UC scope is constrained to data registered with UC and operations against this data, whereas Purview has greater purview (pun intended), and (2) you may want to consider using Purview if your architecture needs to consider the longer-term Fabric roadmap.
Cloud Solutions Architect at Consulting Firm:
Very good insight, thanks for sharing your experience. I agree that including Fabric in the technology stack might change things slightly, especially for companies planning to move workloads from dbr to Fabric (if any?). So far, Databricks has not had strong competition in the data engineering space, but will Fabric change that? That remains to be seen. If that happens, it might make sense for those companies to avoid being Databricks-centric organizations, but I believe it will have to be decided case by case, as every context is different.
Cloud Solutions Architect at Toyota:
Thanks for sharing, [Cloud Solutions Architect at Consulting Firm].
I agree with the distinction that Unity Catalog is more technical-facing than Purview. But the Apache Atlas API capability of Purview is still tough to beat.
Gfesser:
Completely agree. Microsoft was wise not only to make use of OSS under the covers, but also to expose its APIs.
IT Manager & Data Governance Product Owner at German Real Estate Firm:
Currently, there are NO data quality features for Purview which are in private preview or generally available.
Gfesser:
That's correct. I see us continuing to make use of custom code and Great Expectations for the data quality components of our data platform.