Delta Lake is an open-source project that enables building a lakehouse architecture on top of existing data lakes such as S3, ADLS, GCS, and HDFS. It brings ACID transactions and scalable metadata handling to these stores and unifies streaming and batch data processing. Its key features include:

- ACID transactions
- Scalable metadata handling
- Time travel (data versioning)
- Open format
- Unified batch and streaming source and sink
- Schema enforcement and schema evolution
- Audit history
- Updates and deletes
- 100% compatibility with the Apache Spark APIs
- Delta Sharing

Many companies use Delta Lake to process exabytes of data every month, including Databricks, Viacom, Alibaba Group, McAfee, Upwork, eBay, Informatica, and many more.
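To make a few of these features concrete, here is a minimal PySpark sketch of a transactional write, an in-place update, and a time-travel read. It assumes the `delta-spark` pip package is installed and uses an illustrative local path; it is a sketch, not a production setup.

```python
from pyspark.sql import SparkSession
from delta import configure_spark_with_delta_pip
from delta.tables import DeltaTable

# Configure a local Spark session with the Delta Lake extensions.
builder = (
    SparkSession.builder.appName("delta-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

path = "/tmp/delta/users"  # illustrative local path

# ACID-transactional write: the commit is all-or-nothing.
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
df.write.format("delta").mode("overwrite").save(path)

# Updates (and deletes) applied directly on the data lake.
users = DeltaTable.forPath(spark, path)
users.update(condition="id = 1", set={"name": "'carol'"})

# Time travel: read the table as of an earlier version.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
v0.show()
```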

To contribute to the project, visit the repository: https://github.com/delta-io/delta
