Introducing SQL Scripting in Databricks, Part 2

In Part 2 of our SQL Scripting blog series, we explore the administrative task discussed in Part 1: how to apply a case-insensitive collation rule to every STRING column in a table. This example goes step by step, explains the features used, and expands beyond a single table to cover an entire schema.

You can also follow along in this notebook.

Changing the collation of all text fields in all tables in a schema

Databricks supports a wide range of language-aware, case-insensitive, and accent-insensitive collations. This feature is easy to use for new tables and columns. But what if you have an existing system that uses upper() or lower() in predicates everywhere, and you want to pick up the performance improvements associated with a native case-insensitive collation while simplifying your queries? This requires some programming, and now you can do it all in SQL.

Let’s use the following test schema:
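A minimal sketch of such a test schema follows. The table name `employees`, its columns, and the sample rows are assumptions made for illustration, not the data from the original example.

```sql
-- Hypothetical test table; names and rows are assumptions for illustration.
CREATE OR REPLACE TABLE employees (name STRING, location STRING, salary INT);

INSERT INTO employees VALUES
  ('alice', 'Berlin', 100),
  ('Bob',   'berlin', 110),
  ('carol', 'Austin', 120),
  ('Dave',  'austin', 130);

-- With the default binary collation, names are compared by codepoint.
SELECT name FROM employees ORDER BY name;
```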

The order is based on ASCII codepoints, where all uppercase letters precede all lowercase letters. Can you fix this without adding upper() or lower()?

Dynamic SQL statements and setting variables

Our first step is to tell the table to change its default collation for newly added columns. You can feed the local variables using parameter markers, which the notebook automatically detects and adds widgets for. You can also use EXECUTE IMMEDIATE to run a dynamically composed ALTER TABLE statement.

Every SQL script consists of a BEGIN ... END (compound) statement. Local variables are defined first within the compound statement, followed by the logic.
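As a rough sketch of this step, the script below declares local variables in a compound statement and runs a dynamically composed `ALTER TABLE` statement. The table name `employees`, the collation `UTF8_LCASE`, and the `v`-prefixed variable names are assumptions for illustration.

```sql
BEGIN
  -- Local variables are declared first in the compound statement.
  DECLARE vTablename STRING DEFAULT 'employees';   -- assumed table name
  DECLARE vCollation STRING DEFAULT 'UTF8_LCASE';  -- assumed target collation
  DECLARE vStmtStr   STRING;

  -- In a notebook, a parameter marker could feed the variable instead:
  --   SET vTablename = :tablename;

  -- Compose the statement text, quoting the table name as an identifier.
  SET vStmtStr = 'ALTER TABLE `' || vTablename ||
                 '` DEFAULT COLLATION ' || vCollation;

  -- Run the dynamically composed statement.
  EXECUTE IMMEDIATE vStmtStr;
END;
```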

All this is just a linear set of statements. So far, you could have written it all using SQL session variables without a compound statement. You have also not achieved much. After all, you wanted to change the collation for existing columns. To do that, you have to:

  • Discover all existing STRING columns in the table
  • Change the collation for each column

In short, you must loop over the INFORMATION_SCHEMA.COLUMNS table.

Loops

SQL scripting offers four ways of looping, along with ways to control loop iterations.

  1. LOOP … END LOOP;
    This is a “forever” loop.
    This loop continues until an exception or an explicit ITERATE or LEAVE command breaks out of the loop.
    We will discuss exception handling later and point to the ITERATE and LEAVE documentation for how to control loops.
  2. WHILE predicate DO … END WHILE;
    This loop is entered and re-entered as long as the predicate expression evaluates to TRUE, unless the loop is broken out of by an exception, ITERATE, or LEAVE.
  3. REPEAT … UNTIL predicate END REPEAT;
    Unlike WHILE, this loop is entered at least once and re-executed until the predicate expression evaluates to TRUE, unless the loop is broken out of by an exception, LEAVE, or ITERATE.
  4. FOR query DO … END FOR;
    This loop executes once per row returned by the query, unless it is left early by an exception, LEAVE, or ITERATE command.
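To make the loop forms concrete, here is a small sketch that sums the odd numbers from 1 to 9 with a labeled `LOOP`, then counts up with `WHILE` and back down with `REPEAT`. The label and variable names are made up for illustration.

```sql
BEGIN
  DECLARE total INT DEFAULT 0;
  DECLARE num   INT DEFAULT 0;

  -- LOOP runs until LEAVE (or an exception) breaks out of it.
  sumOdd: LOOP
    SET num = num + 1;
    IF num > 10 THEN
      LEAVE sumOdd;          -- exit the loop
    END IF;
    IF num % 2 = 0 THEN
      ITERATE sumOdd;        -- skip even numbers, start the next iteration
    END IF;
    SET total = total + num;
  END LOOP sumOdd;

  -- WHILE checks the predicate before each iteration.
  SET num = 0;
  WHILE num < 3 DO
    SET num = num + 1;
  END WHILE;

  -- REPEAT runs the body at least once and checks the predicate afterwards.
  REPEAT
    SET num = num - 1;
  UNTIL num <= 0
  END REPEAT;

  VALUES (total, num);
END;
```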

Now apply the FOR loop to our collation script. The query gets the column names of all STRING columns of the table. The loop body alters the collation of each column in turn:
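A sketch of that loop, assuming the hypothetical `employees` table lives in the current schema and that the information schema reports string columns with a `data_type` of `'STRING'`:

```sql
BEGIN
  DECLARE vTablename STRING DEFAULT 'employees';   -- assumed table name
  DECLARE vCollation STRING DEFAULT 'UTF8_LCASE';  -- assumed target collation
  DECLARE vStmtStr   STRING;

  -- One iteration per STRING column of the table in the current schema.
  FOR cols AS
    SELECT column_name
      FROM information_schema.columns
     WHERE table_schema = current_schema()
       AND table_name   = vTablename
       AND data_type    = 'STRING'   -- assumption: how STRING columns are reported
  DO
    -- Build and run one ALTER TABLE ... ALTER COLUMN statement per column.
    SET vStmtStr = 'ALTER TABLE `' || vTablename ||
                   '` ALTER COLUMN `' || cols.column_name ||
                   '` TYPE STRING COLLATE ' || vCollation;
    EXECUTE IMMEDIATE vStmtStr;
  END FOR;
END;
```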

Let’s verify that the table has been properly updated:
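One way to check, again assuming the hypothetical `employees` table with a `name` column, is to ask for the collation of the column and to re-run the ordering query:

```sql
-- Report the collation now attached to the column (one row is enough).
SELECT collation(name) AS name_collation FROM employees LIMIT 1;

-- Re-run the ordering query; the sort should no longer depend on letter case.
SELECT name FROM employees ORDER BY name;
```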

So far, so good. Our code is functionally complete, but you should tell Delta to analyze the columns you have modified so that they benefit from file skipping. You don’t want to do this per column, though. Instead, you want to gather them all and do the work only if there actually was a STRING column whose collation was changed. Decisions, decisions…

Conditional logic

SQL scripting offers three ways to perform conditional execution of SQL statements.

  1. IF-THEN-ELSE logic. The syntax for this is simple:
    IF predicate THEN … ELSEIF predicate THEN … ELSE … END IF;
    Naturally, you can have any number of optional ELSEIF blocks, and the final ELSE is also optional.
  2. Simple CASE
    This statement is the SQL scripting version of the simple SQL CASE expression.
    CASE expression WHEN option THEN … ELSE … END CASE;
    A single evaluation of the expression is compared with several options, and the first match decides which set of SQL statements is executed. If none match, the optional ELSE block is executed.
  3. Searched CASE
    This statement is the SQL scripting version of the searched SQL CASE expression.
    CASE WHEN predicate THEN … ELSE … END CASE;
    The THEN block is executed for the first of any number of predicates that evaluates to TRUE. If none match, the optional ELSE block is executed.
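The three forms can be sketched side by side as follows. The variable `vChanged` and the messages are hypothetical and exist only to give each branch something to do.

```sql
BEGIN
  DECLARE vChanged INT DEFAULT 2;   -- hypothetical count of changed columns
  DECLARE vMessage STRING;

  -- IF / ELSEIF / ELSE: predicates are tested in order.
  IF vChanged = 0 THEN
    SET vMessage = 'nothing to do';
  ELSEIF vChanged = 1 THEN
    SET vMessage = 'one column changed';
  ELSE
    SET vMessage = 'several columns changed';
  END IF;

  -- Simple CASE: one expression compared against several options.
  CASE vChanged
    WHEN 0 THEN SET vMessage = 'none';
    WHEN 1 THEN SET vMessage = 'one';
    ELSE SET vMessage = 'many';
  END CASE;

  -- Searched CASE: the first predicate that is true wins.
  CASE
    WHEN vChanged > 10 THEN SET vMessage = 'large';
    WHEN vChanged > 0  THEN SET vMessage = 'small';
    ELSE SET vMessage = 'empty';
  END CASE;

  VALUES (vMessage);
END;
```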

For our collation script, a simple IF THEN END IF will suffice. You also need to collect the set of columns to apply ANALYZE to, and use some higher-order function magic to create the column list:
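A sketch of the extended script, under the same assumptions as before (`employees`, `UTF8_LCASE`, current schema). The choice of `transform` with `array_join` to build the list, and the `COMPUTE DELTA STATISTICS FOR COLUMNS` form of ANALYZE, are assumptions; check the ANALYZE TABLE documentation for the variant your runtime supports.

```sql
BEGIN
  DECLARE vTablename STRING DEFAULT 'employees';   -- assumed table name
  DECLARE vCollation STRING DEFAULT 'UTF8_LCASE';  -- assumed target collation
  DECLARE vStmtStr   STRING;
  DECLARE vColumns   ARRAY<STRING> DEFAULT array();

  FOR cols AS
    SELECT column_name
      FROM information_schema.columns
     WHERE table_schema = current_schema()
       AND table_name   = vTablename
       AND data_type    = 'STRING'
  DO
    SET vStmtStr = 'ALTER TABLE `' || vTablename ||
                   '` ALTER COLUMN `' || cols.column_name ||
                   '` TYPE STRING COLLATE ' || vCollation;
    EXECUTE IMMEDIATE vStmtStr;
    -- Remember every column that was altered.
    SET vColumns = array_append(vColumns, cols.column_name);
  END FOR;

  -- Only analyze if at least one column was changed.
  IF array_size(vColumns) > 0 THEN
    -- transform() quotes each name; array_join() builds the comma list.
    SET vStmtStr = 'ANALYZE TABLE `' || vTablename ||
                   '` COMPUTE DELTA STATISTICS FOR COLUMNS ' ||
                   array_join(transform(vColumns, c -> '`' || c || '`'), ', ');
    EXECUTE IMMEDIATE vStmtStr;
  END IF;
END;
```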

Nesting

What you have written so far works for an individual table. What if you want to operate on all tables in a schema? SQL scripting is fully composable. You can nest compound statements, conditional statements, and loops within other SQL scripting statements.

So what you do here is twofold:

  1. Add an outer FOR loop to find all tables within the schema using INFORMATION_SCHEMA.TABLES. As part of this, you must replace the references to the table name variable with references to the results of the FOR loop query.
  2. Add a nested compound statement and move the column list variable down into the outer FOR loop. You cannot declare a variable directly in the FOR loop body; it does not add a new scope. This is mainly a decision related to coding style, but you will have a more serious reason for the new scope later.
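The two changes can be sketched as follows. The loop variable names `tabs` and `cols` are made up, and the script assumes that a FOR loop variable can be referenced inside the query of a nested FOR loop.

```sql
BEGIN
  DECLARE vCollation STRING DEFAULT 'UTF8_LCASE';  -- assumed target collation
  DECLARE vStmtStr   STRING;

  -- Outer loop: one iteration per object listed for the current schema.
  FOR tabs AS
    SELECT table_name
      FROM information_schema.tables
     WHERE table_schema = current_schema()
  DO
    -- The nested compound statement opens a new scope for vColumns,
    -- so the list starts out empty for every table.
    BEGIN
      DECLARE vColumns ARRAY<STRING> DEFAULT array();

      SET vStmtStr = 'ALTER TABLE `' || tabs.table_name ||
                     '` DEFAULT COLLATION ' || vCollation;
      EXECUTE IMMEDIATE vStmtStr;

      -- Inner loop: one iteration per STRING column of the current table.
      FOR cols AS
        SELECT column_name
          FROM information_schema.columns
         WHERE table_schema = current_schema()
           AND table_name   = tabs.table_name
           AND data_type    = 'STRING'
      DO
        SET vStmtStr = 'ALTER TABLE `' || tabs.table_name ||
                       '` ALTER COLUMN `' || cols.column_name ||
                       '` TYPE STRING COLLATE ' || vCollation;
        EXECUTE IMMEDIATE vStmtStr;
        SET vColumns = array_append(vColumns, cols.column_name);
      END FOR;

      IF array_size(vColumns) > 0 THEN
        SET vStmtStr = 'ANALYZE TABLE `' || tabs.table_name ||
                       '` COMPUTE DELTA STATISTICS FOR COLUMNS ' ||
                       array_join(transform(vColumns, c -> '`' || c || '`'), ', ');
        EXECUTE IMMEDIATE vStmtStr;
      END IF;
    END;
  END FOR;
END;
```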

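Running the script against a schema that also contains objects other than regular tables, such as views, fails with an error. This error makes sense. You have several ways to proceed: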

  1. Filter out unsupported table types, such as views, in the information schema query. The problem is that there are numerous table types, and new ones are occasionally added.
  2. Handle views. That’s a great idea. Let’s call that your homework.
  3. Tolerate the error condition.

Exception handling

A key capability of SQL scripting is the ability to intercept and handle exceptions. Condition handlers are defined in the declaration section of a compound statement and apply to any statement in that compound, including nested statements. You can handle specific error conditions by name, specific SQLSTATEs that cover several error conditions, or all error conditions. Within the body of the condition handler, you can use the GET DIAGNOSTICS statement to retrieve information about the exception being handled and execute any SQL scripting you consider appropriate, such as recording the error in a log or running alternative logic to the one that failed. You can then SIGNAL a new error condition, RESIGNAL the original condition, or simply exit the compound statement where the handler is defined and continue with the following statement.

In our script, you want to skip any object for which the ALTER TABLE DEFAULT COLLATION statement does not apply, and log the name of the object.
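A minimal sketch of such a handler is shown below. The log table `collation_skip_log` and its columns are hypothetical, and the catch-all `SQLEXCEPTION` condition is an assumption chosen for brevity; a specific condition name or SQLSTATE would be the narrower choice. The column loop and ANALYZE step are left out to keep the focus on the handler.

```sql
-- Hypothetical log table for skipped objects.
CREATE TABLE IF NOT EXISTS collation_skip_log
  (object_name STRING, error_state STRING, message STRING);

BEGIN
  DECLARE vCollation STRING DEFAULT 'UTF8_LCASE';  -- assumed target collation
  DECLARE vStmtStr   STRING;

  FOR tabs AS
    SELECT table_name
      FROM information_schema.tables
     WHERE table_schema = current_schema()
  DO
    BEGIN
      -- The handler applies to every statement in this nested compound.
      -- When it fires, it logs the object, exits the compound, and the
      -- outer FOR loop continues with the next object.
      DECLARE EXIT HANDLER FOR SQLEXCEPTION
        BEGIN
          DECLARE vState   STRING;
          DECLARE vMessage STRING;
          GET DIAGNOSTICS CONDITION 1
            vState   = RETURNED_SQLSTATE,
            vMessage = MESSAGE_TEXT;
          INSERT INTO collation_skip_log
            VALUES (tabs.table_name, vState, vMessage);
        END;

      SET vStmtStr = 'ALTER TABLE `' || tabs.table_name ||
                     '` DEFAULT COLLATION ' || vCollation;
      EXECUTE IMMEDIATE vStmtStr;

      -- The per-column loop and the ANALYZE step would follow here.
    END;
  END FOR;
END;
```

Because the handler is declared in the nested compound rather than in the outermost one, a failure on one object ends only that object's processing.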

You have developed an administrative script purely in SQL. You can also write ELT scripts and turn them into jobs. SQL scripting is a truly powerful tool that you should use.

What to do next

Whether you are an existing Databricks user or migrating from another product, SQL scripting is a capability you should use. SQL scripting follows the ANSI standard and is fully compatible with OSS Apache Spark™. SQL scripting is described in detail in the Databricks SQL scripting documentation.

You can also use this notebook to see for yourself.
