Announcing Databricks Runtime 5.3

We are proud to announce the release of Databricks Runtime 5.3, which includes several new features and improvements, including:

GA of Delta Time Travel

Public Preview of MySQL table replication to Delta

Optimized DBFS FUSE folder for deep learning workloads (Azure-only)

Databricks Delta Time Travel (Generally Available)

Delta Time Travel has graduated to general availability. It adds the ability to query a snapshot of a table using a timestamp string or a version, using SQL syntax as well as DataFrameReader options for timestamp expressions.

SELECT count(*) FROM events TIMESTAMP AS OF timestamp_expression

SELECT count(*) FROM events VERSION AS OF version

Time Travel has many use cases, including:

Re-creating analyses, reports, or outputs (for example, the output of a machine learning model), which is useful for debugging or auditing, especially in regulated industries.

Writing complex temporal queries.

Fixing mistakes in your data.

Providing snapshot isolation for a set of queries on fast-changing tables.

For more details, see Query an older snapshot of a table (time travel) and Merge Into (Databricks Delta).
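As a minimal sketch of the DataFrameReader route, the helper below builds the reader options that select a snapshot and passes them to a Delta read. The option names `timestampAsOf` and `versionAsOf` are the ones Delta documents for time travel; the function names and the `/delta/events` path in the usage note are hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional


def time_travel_options(timestamp: Optional[str] = None,
                        version: Optional[int] = None) -> Dict[str, str]:
    """Build the DataFrameReader options that select one Delta snapshot."""
    # Exactly one selector must be given: a timestamp string or a version number.
    if (timestamp is None) == (version is None):
        raise ValueError("pass exactly one of timestamp or version")
    if timestamp is not None:
        return {"timestampAsOf": timestamp}
    return {"versionAsOf": str(version)}


def read_snapshot(spark, path: str, **selector):
    """Load the Delta table at `path` as of a timestamp or version.

    `spark` is an existing SparkSession (hypothetical caller-supplied object);
    no Spark code runs until this function is called.
    """
    options = time_travel_options(**selector)
    return spark.read.format("delta").options(**options).load(path)
```

In a notebook this might be called as `read_snapshot(spark, "/delta/events", version=3)` or with `timestamp="2019-01-01"`, which corresponds to the `VERSION AS OF` and `TIMESTAMP AS OF` queries shown earlier.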

MySQL table replication to Delta (Public Preview)

Databricks Runtime 5.3 lets you stream data from a MySQL table directly into Delta for downstream consumption in Spark analytics or data science workflows. Using the same strategy that MySQL uses for replication to other instances, the feature reads the binlog to identify updates, which are then processed and streamed to Databricks as follows:

Reads change events from the database log.

Streams the events to Databricks.

Writes in the same order to a Delta table.

Maintains state in case of disconnects from the source.

For more details, see MySQL Table Replication to Databricks Delta.
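The four steps above follow a general change-data-capture pattern, which can be illustrated with a small in-memory sketch. This is not the Databricks implementation: `ChangeEvent`, `Replica`, and the integer log positions are hypothetical stand-ins for binlog events, the target Delta table, and the saved replication state.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    position: int               # offset of the event in the source log
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents; None for deletes


@dataclass
class Replica:
    rows: Dict[int, dict] = field(default_factory=dict)
    last_position: int = -1     # state kept so a reconnect can resume safely

    def apply(self, events: Iterable[ChangeEvent]) -> None:
        """Apply events in log order, skipping any already applied."""
        for event in events:
            if event.position <= self.last_position:
                continue        # replayed after a disconnect; already applied
            if event.op == "delete":
                self.rows.pop(event.key, None)
            else:
                self.rows[event.key] = event.row
            self.last_position = event.position


log = [
    ChangeEvent(0, "insert", 1, {"name": "a"}),
    ChangeEvent(1, "update", 1, {"name": "b"}),
    ChangeEvent(2, "insert", 2, {"name": "c"}),
    ChangeEvent(3, "delete", 2),
]

replica = Replica()
replica.apply(log[:2])  # the connection drops after two events
replica.apply(log)      # on reconnect the log is replayed; duplicates are skipped
```

Tracking the last applied position is what makes the replay harmless: events are applied once and in source order, so the replica ends in the same state as the source table.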

Optimized DBFS FUSE folder for deep learning workloads (Azure only)

A new FUSE mount, file:/dbfs/ml, is optimized for data loading, model checkpointing, and logging from each worker to a shared storage location. It provides high-performance I/O for deep learning workloads.

For details, see Prepare Storage for Data Loading and Model Checkpointing.
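Because the folder is exposed through a FUSE mount, ordinary local file APIs can read and write it. The sketch below saves and reloads a training checkpoint as JSON. The `checkpoints` subfolder, the file naming scheme, and the function names are assumptions made for illustration; only the `/dbfs/ml` root comes from the release.

```python
import json
import os


def save_checkpoint(state: dict, step: int,
                    base_dir: str = "/dbfs/ml/checkpoints") -> str:
    """Write `state` for a training step under `base_dir` and return the file path."""
    os.makedirs(base_dir, exist_ok=True)
    path = os.path.join(base_dir, f"step-{step:06d}.json")
    with open(path, "w") as handle:
        json.dump(state, handle)
    return path


def load_checkpoint(path: str) -> dict:
    """Read back a checkpoint written by save_checkpoint."""
    with open(path) as handle:
        return json.load(handle)
```

Keeping `base_dir` as a parameter lets the same code run against a local directory during development and against the shared folder on a cluster.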

Additional improvements

Apart from the above, Databricks Runtime 5.3 also includes:

Notebook-scoped library improvements

New Databricks Advisor hints

Delta performance improvements

And more …

To learn more about the release, please see the Databricks Runtime 5.3 Release Notes.

Try Databricks for free.

Get started today.