Handling Transact-SQL
If you are already using T-SQL on MS SQL, you might be interested in using the same type of functionality in the Babelfish environment. PostgreSQL provides a language called PL/pgSQL which is often used to write functions as well as procedures. The MS SQL equivalent to PL/pgSQL is called T-SQL (Transact-SQL).
To make Babelfish as compatible with MS SQL as possible, it provides a T-SQL implementation capable of understanding MS SQL Server code. As with every software which models some other software’s behavior, there are some differences and limitations.
Deploying T-SQL code in Babelfish
In this section, you will get to know T-SQL and learn how to implement basic functions. Let’s get started with some basic examples. The first example will show how to convert kilometers into statute miles using the TDS interface (= port 1433).
Let’s quickly dive into the code:
CREATE FUNCTION km_to_miles (@km float)
RETURNS float
AS
BEGIN
DECLARE @miles float
SELECT @miles = @km*0.6214
RETURN(@miles)
END
GO
SELECT km_to_miles (234)
GO
km_to_miles
145,4076
(1 row affected)
The T-SQL code can be executed directly. What you can instantly see is that there is no language clause anymore, and we have a slightly different syntax than PostgreSQL users are used to in PL/pgSQL.
Note that this is the way the function is created when you are operating using the TDS protocol. In other words: This is how it works when you connect as a MS SQL user. However, there is a second way to deploy T-SQL code: You can always deploy the function PostgreSQL-style via port 5432 (= standard PostgreSQL port). Here’s how it works:
CREATE FUNCTION public.km_to_miles("@km" double precision)
RETURNS double precision
LANGUAGE pltsql
AS '{"version_num": "1", "typmod_array": ["-1", "-1"], "original_probin": ""}',
'BEGIN
DECLARE @miles float
SELECT @miles = @km*0.6214
RETURN(@miles)
END';
Now this may look a bit strange, but it does the same thing.
Running simple loops
The next example shows how to run a simple loop in T-SQL. Just like in other languages, you can use WHILE loops. FOR loops are usually simulated. To demonstrate how T-SQL in Babelfish works, we have implemented a simple WHILE
loop:
CREATE TABLE counter (
number int
)
GO
DELETE FROM counter
GO
-- simple loop function
CREATE FUNCTION simple_loop(@num int)
RETURNS varchar
AS
BEGIN
DECLARE @Counter int
SET @Counter = 1
WHILE (@Counter <= @num)
BEGIN
INSERT INTO counter(number) VALUES (@Counter)
SET @Counter = @Counter + 1
END
RETURN (@Counter)
END
GO
SELECT simple_loop(5)
GO
SELECT * from counter
GO
This example contains two things that are of importance. First of all, you can see a loop in action. That’s pretty standard and straightforward. What you need to be aware of here is the type mismatch. We create an integer variable which is returned. However, what is returned is a varchar. In standard PostgreSQL, this would instantly create an error. Babelfish (and MS SQL) are way more tolerant in this case.
What is also important: If you want to ensure that the data is safe and visible: Don’t forget to COMMIT. Otherwise, data will be lost. Babelfish is not playing to the same transactional rules as standard PostgreSQL. You need to keep that in mind.
But let’s look at one more example. In the following case, we want to prevent a division by zero. The way to fix that in Babelfish is to use the NULLIF function:
CREATE FUNCTION simple_division (@num1 float, @num2 float)
RETURNS float
AS
BEGIN
DECLARE @result varchar
SELECT @result = @num1/NULLIF(@num2,0)
RETURN(@result)
END
GO
The NULLIF function is the best way to prevent division by zero.
Using CASE ... WHEN ...
in T-SQL
In T-SQL you can use CASE ... WHEN ...
similar to the way you can use it in PostgreSQL. The following listing contains an example of how to use CASE ... WHEN ...
correctly. We used the “PostgreSQL” method to connect to the database:
CREATE FUNCTION case_in_assignment()
RETURNS void AS
$$
DECLARE @a text,
@b int = 1,
@c text = 'one';
BEGIN
SET @a = CASE @b WHEN 1 THEN @c ELSE 'other value' END
PRINT @a
SET @b = 3
SET @a = CASE @b WHEN 1 THEN @c ELSE 'other value' END
PRINT @a
END
$$ LANGUAGE pltsql;
What is noteworthy here is the PRINT
command. It is basically the T-SQL equivalent of RAISE INFO
. When we run the code (in PostgreSQL mode) we can see what happens:
postgres=> SELECT case_in_assignment();
INFO: one
INFO: other value
case_in_assignment
--------------------
(1 row)
PRINT
simply issues a message.
Managing nested blocks
Just like PL/pgSQL, T-SQL is a classical block-oriented language. The concept is pretty similar to what we see in PostgreSQL. Of course, the syntax is slightly different, but it should be simple and straightforward. The following example shows how a simple IF
and a nested block can be used in T-SQL:
CREATE OR REPLACE FUNCTION if_samples()
RETURNS int AS
$$
DECLARE @a int
BEGIN
SET @a = 0
IF 2 > @a
IF @a = 1
PRINT '@a = 1'
ELSE
PRINT '@a < 1'
ELSE
BEGIN
SET @a = NULL
PRINT @a
END
RETURN 0
END
$$ LANGUAGE pltsql;
The following listing shows the output of the script (executed in PostgreSQL mode):
postgres=> SELECT if_samples();
INFO: @a < 1
if_samples
------------
1
(1 row)
Let’s now run the same thing using the tsql
command line tool so that we can compare the outputs:
2> SELECT if_samples()
3> GO
if_samples
@a < 1
(0 rows affected)
What might seem strange for PostgreSQL users is the way Babelfish (and therefore MS SQL) handles procedures vs. functions. In PostgreSQL, these two things are totally different, and using a procedure in the content of a function will certainly error out. In Babelfish the situation is as follows:
1> CALL if_samples()
2> GO
Msg 102 (severity 16, state 0) from BABEL Line 1:
"syntax error near ')' at line 1 and character position 16"
1>
2> CALL if_samples
3> GO
As you can see, the resultant behavior is slightly different.
Return values and data types
MS SQL and PostgreSQL differ in one important point. Often, the return data type of a function in PostgreSQL is void
. In other words: A function can return nothing.
That is not possible in Babelfish and MS SQL. Let’s modify the function we just used, and see what happens:
postgres=> CREATE OR REPLACE FUNCTION if_samples()
RETURNS void AS
$$
DECLARE @a int
BEGIN
SET @a = 0
IF 2 > @a
IF @a = 1
PRINT '@a = 1'
ELSE
PRINT '@a < 1'
ELSE
BEGIN
SET @a = NULL
PRINT @a
END
END
$$ LANGUAGE pltsql;
The PostgreSQL side is no problem. We can make the T-SQL function return void (= nothing). In PostgreSQL mode one can call the function normally:
postgres=> SELECT if_samples();
INFO: @a < 1
if_samples
------------
(1 row)
Calling the same function via TDS will error out because void
is not supported. You need to be aware of this fact.
bash# tsql -H PUT_HOSTNAME_HERE -p 1433 -U PUT_USER_HERE \
-P PUT_PASSWORD_HERE -D postgres
...
1> SELECT if_samples()
2> GO
Msg 50856066 (severity 16, state 0) from BABEL Line 1:
"data type void is not supported yet"
1>
Calling non-MS SQL functions
In real life, Babelfish users will use T-SQL functions and utilize MS SQL compatible functions (which is the whole idea of Babelfish in the first place). However, it can happen that an end user wants to call PostgreSQL code which does not exist in MS SQL.
The important point is that it is possible. Here is an example:
1> SELECT pg_database_size('postgres')
2> GO
pg_database_size
11128571
(1 row affected)
The pg_database_size
function is definitely PostgreSQL-specific. However, that doesn’t mean we cannot call the function. Quite to the contrary, we can make full use of all the procedures and functions PostgreSQL has to offer, unless there is a name conflict.
Temporary tables in action
PostgreSQL as well as Babelfish / MS SQL support the concept of a temporary table. However, there are some major differences. To show how this works, we first create a standard table:
1> CREATE TABLE a (id int)
2> GO
In Babelfish (and thus in MS SQL) a temporary table is prefixed using a hash as shown in the next listing:
1> CREATE TABLE #a (x int, y int)
2> GO
1> SELECT * FROM #a
2> GO
x y
1> SELECT * FROM a
2> GO
id
1> SELECT * FROM #a
2> GO
Msg 208 (severity 16, state 0) from BABEL Line 1:
"relation "#a" does not exist"
Fetching the row count
When you write a stored procedure, it often happens that you want to know the number of rows affected by a change. The way to do that in T-SQL is by using the @@ROWCOUNT
variable.
The following code snippet shows an example of how this works:
CREATE PROCEDURE test_changes @id int, @newid int
AS
BEGIN
UPDATE sample
SET persoid = @newID
WHERE persoid = @ID
IF @@ROWCOUNT = 0
PRINT 'Warning: No rows were updated';
END
GO
EXEC test_changes @id = 1200, @newid = 100000
GO
Warning: No rows were updated
(return status = 0)
The variable is set automatically and carries the number of rows touched by the UPDATE
statement we executed before. It offers a simple way to control the behavior of your procedure.
Running multi-transactional procedures
In T-SQL, a procedure can contain more than just one transaction. This allows us to commit directly inside a procedure. PostgreSQL has the same feature, which is why it was relatively easy to model Babelfish’s behavior on top of PostgreSQL.
The following listing shows some code demonstrating how to run multiple transactions inside the sample procedure.
CREATE PROCEDURE transactions_sample
AS
BEGIN
BEGIN TRANSACTION
INSERT INTO sample
VALUES ('Test', 'User', 1010)
COMMIT TRANSACTION
PRINT ('INSERT: ')
SELECT * FROM sample
BEGIN TRANSACTION
DELETE FROM sample
WHERE persoid = 1010
COMMIT TRANSACTION
PRINT ('DELETE: ')
SELECT * FROM sample
END
GO
EXEC transactions_sample
GO
The behavior is similar to what one would expect in PostgreSQL.
How T-SQL code is stored internally
In this section, you will learn how Babelfish stores stored procedure code internally. To understand how this works, we first need to connect using a normal PostgreSQL connection (so psql
on the PostgreSQL port and not tsql
on the MS SQL port.
What we see here is that PostgreSQL stores T-SQL code in a way not quite like other code:
postgres=> SELECT proname, prosrc, probin
FROM pg_proc
WHERE proname = 'km_to_miles' ;
proname | prosrc | probin
-------------+------------------------------------+---------------------------------------------------------------------------
km_to_miles | BEGIN +| {"version_num": "1", "typmod_array": ["-1", "-1"], "original_probin": ""}
| DECLARE @miles float +|
| SELECT @miles = @km*0.6214+|
| RETURN(@miles) +|
| END |
(1 row)
In the case of other languages, it works like this: If you are writing plain SQL or PL/pgSQL, the code is normally only stored in prosrc (plain text). If you happen to use a C function, you will find the function name of the C level in prosrc and the link to the shared library in the probin column. To show how this works, we have deployed the dblink extensions and run a simple query:
postgres=> CREATE EXTENSION dblink;
CREATE EXTENSION
postgres=> SELECT proname, prosrc, probin
FROM pg_proc
WHERE proname = 'dblink'
LIMIT 1;
proname | prosrc | probin
---------+---------------+----------------
dblink | dblink_record | $libdir/dblink
(1 row)
If you look at T-SQL code, you’ll see that there is a bit more magic involved. The code is in prosrc just like in other cases. However, the probin column is used to store a JSON document dealing with typmod-related information. This behavior is specific to T-SQL and is handled by Babelfish specific hooks.
The MS SQL equivalent to this query is the sp_helptext
function which is currently not supported by Babelfish.
Let’s take a look at one more example using the “PostgreSQL style interface” to write stored procedures. The following code shows some more complex behavior:
CREATE PROCEDURE [HumanResources].[uspUpdateEmployeePersonalInfo]
@BusinessEntityID [int],
@NationalIDNumber [nvarchar](15),
@BirthDate [datetime],
@MaritalStatus [nchar](1),
@Gender [nchar](1)
WITH EXECUTE AS CALLER
AS $$
BEGIN
-- SET NOCOUNT ON;
BEGIN TRY
UPDATE [HumanResources].[Employee]
SET [NationalIDNumber] = @NationalIDNumber
,[BirthDate] = @BirthDate
,[MaritalStatus] = @MaritalStatus
,[Gender] = @Gender
WHERE [BusinessEntityID] = @BusinessEntityID;
END TRY
BEGIN CATCH
EXECUTE [dbo].[uspLogError];
END CATCH;
END; $$ LANGUAGE 'pltsql';
As you can see, the syntax of CREATE PROCEDURE is not the way it works in PostgreSQL. The question is: How does it work internally? The TDS handler takes this code and processes it completely. pltsql is therefore an “additional stored procedure language”. The handlers take care of these syntax elements before the body of the real function starts.
Nevertheless, there are differences: We still see PostgreSQL-style dollar quoting and so on, which is different from a “real” MS SQL Server stored procedure.