Friday, March 10, 2023

DEFAULT value in CREATE TABLE when creating delta table in Azure Databricks

 

Issue description:

 

This was a thread posted in a forum about issue creating DELTA lake table with DEFAULT value specifier in create table SQL statement.

 

 INSERT INTO.. with DEFAULT keyword / value for a column value is failing in Databricks.

 

Following statement creates a table using DELTA format:

 

CREATE TABLE students (name VARCHAR(64), address STRING DEFAULT 'unknown', student_id INT)USING DELTA

 

Inserting into this table with 'DEFAULT' keyword throws this error:

 



 

Though Databricks link below indicates that DEFAULT value works with following version criteria, this member in the forum reported that it does not work when doing INSERT into.

 

Applies to

 Databricks SQL SQL warehouse version 2022.35 or higher 


Databricks Runtime 11.2 and above


Reference : https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html

 

This is the MS link the forum member posted as reference and questioning why INSERT with DEFAULT keyword is failing .

 

https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/sql-ref-syntax-dml-insert-into#insert-using-values

 



 

 

Side story:

 

To this post, one of the other forum member created table and inserted with DEFAULT and confirmed working fine posting as a reply.

 

Turned it was a classis misunderstanding, this member ran the query creating a typical SQL table in SQL Server using SQL studio instead of Databricks 😅

 

 



 

Resolution:

 

 

Trick is to include this tblproperties setter to make DEFAULT work

 

TBLPROPERTIES(  'delta.feature.allowColumnDefaults' = 'supported');

 

When i tried running this command, it kept failing for me and later realized that i was still using 11.x DBR and my attempt to upgrade to 12.2 was not successful.

 

CREATE TABLE students (name VARCHAR(64), address STRING DEFAULT 'unknown', student_id INT)USING DELTA TBLPROPERTIES(  'delta.feature.allowColumnDefaults' = 'supported');

 

After upgrading cluster and did a detach & attach of notebook, it did work!!

 

In my INSERT query, I am passing DEFAULT for 'address' column and it took 'unknown' string as specified in CREATE table query.

 

You can also use leave out the 'address' column in INSERT and specify other columns to insert default value:

 

INSERT INTO STUDENT  (name, student_id) VALUES ('bharani', '1233')

 

to get 'unknown' inserted in place of address.




 

No comments: