Databricks has open sourced Unity Catalog: What that means for the ecosystem

Our point of view on why we need unified governance for Data and AI and why we are excited about Databricks releasing Unity Catalog as open source.

June 26, 2024
-
Uday Srinivasan, CTO, Acante

Our team is still absorbing the customer conversations from the Databricks Data + AI Summit, held the week of June 10th in San Francisco. Governance was a hot topic, as evidenced by the numerous sessions and announcements from Databricks and its partners. During the keynote conversation between Ali Ghodsi and Jensen Huang, Huang remarked that Databricks has pivoted from data processing to data governance. That statement accurately reflects what Databricks has accomplished with Unity Catalog over the past few years. Ghodsi also highlighted in the keynote that quality and security are the primary reasons why 85% of AI workloads have not transitioned to production. At Acante, we simplify access governance for Data and AI, so I will focus on access governance for the rest of this post.

A modern data stack consists of the following components, with governance ensuring quality, security, and compliance across the stack.

Diagram: Modern Data Stack

We love the lakehouse model that Databricks supports, as it separates data from compute and gives customers the flexibility to build out the rest of the stack. With the acquisition of Tabular, Databricks appears committed to the UniForm data format, which lets customers store a single copy of data for use by different data platforms. This is a significant win for customers, as it allows them to use different data processing tools for different use cases. From our conversations with customers and our observations of the industry, what is still lacking is a unifying governance layer across these systems. Without one, customers are left to manage complex access governance across their data stacks, because each platform provides its own access control layer and its own telemetry for tracking access and lineage. Given the fluidity with which data moves between systems, a unified access governance layer would give customers choice and drive industry innovation.
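To make the fragmentation concrete, here is a minimal Python sketch, under our own simplifying assumptions, of the basic job a unifying layer has to do: take grants expressed in each platform's own vocabulary and normalize them so that one question ("who can read this table?") has one answer. The record shapes, the privilege mapping, the sample principals, and the function names are all hypothetical illustrations and not any vendor's API; real privilege models are far richer than this.

```python
from dataclasses import dataclass

# Hypothetical, heavily simplified mapping from each platform's privilege
# names to a shared vocabulary. Illustrative only; not a complete model.
PRIVILEGE_MAP = {
    "databricks": {"SELECT": "read", "MODIFY": "write"},
    "snowflake": {"SELECT": "read", "INSERT": "write", "UPDATE": "write"},
}


@dataclass(frozen=True)
class Grant:
    platform: str
    principal: str
    table: str
    action: str  # normalized action: "read" or "write"


def normalize(platform, raw_grants):
    """Translate (principal, table, privilege) tuples into normalized grants."""
    mapping = PRIVILEGE_MAP[platform]
    return [
        Grant(platform, principal, table, mapping[privilege])
        for principal, table, privilege in raw_grants
        if privilege in mapping  # unmapped privileges are skipped in this sketch
    ]


def who_can(grants, table, action):
    """Map each principal to the sorted platforms where they hold `action` on `table`."""
    result = {}
    for grant in grants:
        if grant.table == table and grant.action == action:
            result.setdefault(grant.principal, set()).add(grant.platform)
    return {principal: sorted(platforms) for principal, platforms in result.items()}


# Made-up sample grants, as they might be exported from two platforms.
unified = (
    normalize("databricks", [("alice", "sales.orders", "SELECT"),
                             ("bob", "sales.orders", "MODIFY")])
    + normalize("snowflake", [("alice", "sales.orders", "SELECT"),
                              ("carol", "sales.orders", "INSERT")])
)

readers = who_can(unified, "sales.orders", "read")
writers = who_can(unified, "sales.orders", "write")
```

Without the normalization step, answering the same question means querying each platform separately and reconciling the results by hand, which is the complexity customers describe to us.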

Each vendor is attempting to make its governance layer the unifying one across data platforms. For example, Databricks supports Lakehouse Federation, which includes both database federation (e.g., Redshift, Snowflake) and catalog federation (e.g., Hive, AWS Glue), allowing Unity Catalog to serve as a governance layer for some customers. However, there is concern about vendor lock-in, as each data platform will inevitably focus on advancing its own data processing stack. This is why we were very excited to learn about the open-sourcing of Unity Catalog! The announcement was received with great interest at the conference and at the community meetings that started the week of June 24th. You can read more about Databricks' intentions with open sourcing.

Note that Databricks didn't open source all of Unity Catalog; rather, it planted a seed with a 0.1 version that has a very limited set of features. For UC OSS to succeed, it will require significant investment from Databricks and the ecosystem. We have already started identifying areas where we can innovate. As a validated Databricks partner, Acante welcomes the open-sourcing of Unity Catalog, as it could unleash a new wave of innovation that benefits customers!

Unveiling the Challenge

As our digital footprint expands, so do the challenges of securing our data assets. Acante.ai recognizes the exponential proliferation and constant change in data access patterns, creating blind spots for traditional security approaches.

The Acante.ai Difference

At Acante.ai, our approach to data security marks a paradigm shift in the industry. Unlike traditional security models that often succumb to the static nature of data threats, Acante.ai thrives on dynamism. We believe that true security evolves with the challenges, and that's precisely what sets us apart. The Acante.ai difference lies in our commitment to providing security teams with more than just a shield; we offer a strategic ally that anticipates, adapts, and fortifies against the unpredictable proliferation of data access patterns. Our solution doesn't just keep pace with the digital transformation journey; it propels it forward. But what truly defines the Acante.ai difference goes beyond technology; it's ingrained in our culture. We are a collective of thoughtful, compassionate, and collaborative individuals on a shared mission to disrupt the security industry. With deep expertise from major brands and startups, we've collectively built over 10 startups, resulting in category-creating businesses, acquisitions, and IPOs. Our success is a testament to the collaborative spirit within our team, where every member contributes to shaping our culture and the future of data security. Join Acante.ai, and experience the difference that drives us to redefine the limits of protection in the digital age.

Dynamic Data Security

Explore the cutting-edge realm of dynamic data security with Acante.ai. In an era where the digital landscape is in a perpetual state of flux, Acante.ai's comprehensive approach to data security becomes not just a solution but a strategic imperative. Imagine a security system that not only reacts to the ever-changing data access patterns but anticipates and adapts in real-time. This level of sophistication is what sets Acante.ai apart. Our solution not only seamlessly integrates with the native controls of your data lakes and warehouse ecosystems but also evolves with them. It's not just about protecting your data; it's about empowering it. Acante.ai's dynamic data security solution is not confined by static parameters; it's a living, breathing shield that moves in harmony with the pulse of your data. As businesses navigate the complexities of the modern data landscape, Acante.ai provides not just a safeguard but a strategic ally, ensuring that security is not a hindrance but an enabler of progress.

Conclusion

In a world where data is both a valuable asset and a potential liability, Acante.ai emerges as a beacon of innovation. Join us on this exploration of the future of data security and discover how Acante.ai is empowering organizations to navigate the evolving landscape with confidence.