# Declarations, not Functions, not Procedures

In a recent posting on www.swug.org, a regular contributor posted a T-SQL function that calculates the checksum digit of a standard, 13-digit barcode. The algorithm is a simple weighted sum method (see *Data & Databases*, section 15.3.1. if you do not know what that means). Given a string of 13 digits, you take the first 12 digits of the string of the barcode, use a formula on them, and see if the result is the 13th digit. The rules are simple:

- Sum each digit in an odd position to get S1.
- Sum each digit in an odd position to get S2.

Subtract S2 from S1, do a modulo 10 on the sum, and then compute the absolute positive value. The formula is ABS(MOD(S1-S2), 10) for the barcode checksum digit.

Here is the author’s suggested function code translated from T-SQL in Standard SQL/PSM:

CREATE FUNCTION Barcode_CheckSum(IN my_barcode CHAR(12))

RETURNS INTEGER

BEGIN

DECLARE barcode_checkres INTEGER;

DECLARE idx INTEGER;

DECLARE sgn INTEGER;

SET barcode_checkres = 0;

-- check if given barcode is numeric

IF IsNumeric(my_barcode) = 0

THEN RETURN -1;

END IF;

-- check barcode length

IF CHAR_LENGTH(TRIM(BOTH ' ' FROM my_barcode))<> 12

THEN RETURN -2;

END IF;

-- compute barcode checksum algorithm

SET idx = 1;

WHILE idx <= 12

DO -- Calculate sign of digit

IF MOD(idx, 2) = 0

THEN SET sgn = -1;

ELSE SET sgn = +1;

END IF;

SET barcode_checkres = barcode_checkres +

CAST(SUBSTRING(my_barcode FROM idx FOR 1) AS INTEGER)

* sgn;

SET idx = idx + 1;

END WHILE;

-- check digit

RETURN ABS(MOD(barcode_checkres, 10));

END;

Let’s see how it works:

barcode_checkSum('283723281122')

= ABS (MOD(2-8 + 3-7 + 2-3 + 2-8 + 1-1 + 2-2), 10))

= ABS (MOD(-6 -4 -1 -6 + 0 + 0), 10)

= ABS (MOD(-17, 10))

= ABS(-7) = 7

Okay, where to begin? Notice the creation of unneeded local variables, the assumption of an IsNumeric() function taken from T-SQL dialect, and the fact that the check digit is supposed to be a character in the barcode and not an integer separated from the barcode. We have three IF statements and a WHILE loop in the code. This is about as procedural as you can get.

In fairness, SQL/PSM does not handle errors by returning negative numbers, but I don’t want to get into a lesson on the mechanism used, which is quite different from the one used in T-SQL.

Why use all that procedural code? Most of it can be replaced by declarative expressions. Let’s start with the usual Sequence auxiliary table in place of the loop, nest function calls, and use CASE expressions to remove IF statements.

The rough pseudo-formula for conversion is:

- A procedural loop becomes a sequence set:

FOR seq FROM 1 TO n DO f(x);

=> SELECT seq FROM Sequence WHERE seq <= n;

- A procedural selection becomes a CASE expression:

IF.. THEN .. ELSE

=> CASE WHEN.. THEN .. ELSE.. END;

- A series of assignments and function calls become a nested set of function calls:

DECLARE x <type>;

SET x = f(..);

SET y = g(x);

..;

=> f(g(x))

Here is a translation of those guide lines into a first shot at a rewrite:

CREATE FUNCTION Barcode_CheckSum(IN my_barcode CHAR(12))

RETURNS INTEGER

BEGIN

IF barcode NOT SIMILAR TO '%[^0-9]%'

THEN RETURN -1;

ELSE RETURN

(SELECT ABS(SUM((CAST (SUBSTRING(barcode

FROM S.seq FOR 1) AS INTEGER)

* CASE MOD(S.seq)WHEN 0 THEN 1 ELSE -1 END)))

FROM Sequence AS S

WHERE S.seq <= 12);

END IF;

END;

The SIMILAR TO regular expression predicate is a cute trick worth mentioning. It is a double-negative that assures the input string is all digits in all 12 positions. Remember that an oversized string will not fit into the parameter and will give you an overflow error, while a short string will be padded with blanks.

But wait! We can do better:

CREATE FUNCTION Barcode_CheckSum(IN my_barcode CHAR(12))

RETURNS INTEGER

RETURN

(SELECT ABS(SUM((CAST (SUBSTRING(barcode

FROM S.seq FOR 1) AS INTEGER)

* CASE MOD(S.seq)WHEN 0 THEN 1 ELSE -1 END)))

FROM Sequence AS S

WHERE S.seq <= 12)

AND barcode NOT SIMILAR TO '%[^0-9]%';

This will return a NULL if there is an improper barcode. It is only one SQL statement, so we are doing pretty well. There are some minor tweaks, like this:

CREATE FUNCTION Barcode_CheckSum(IN my_barcode CHAR(12))

RETURNS INTEGER

RETURN

(SELECT ABS(SUM(CAST(SUBSTRING(barcode

FROM Weights.seq FOR 1) AS INTEGER)

* Weights.wgt))

FROM (VALUES (CAST(1 AS INTEGER), CAST(-1 AS INTEGER)),

(2, +1), (3, -1), (4, +1), (5, -1),

(6, +1), (7, -1), (8, +1), (9, -1), (10, +1), (11,

-1), (12, +1)) AS weights(seq, wgt)

WHERE barcode NOT SIMILAR TO '%[^0-9]%');

Another cute trick in Standard SQL is to construct a table constant with a VALUES() expression. The first row in the table expression establishes the data types of the columns by explicit casting.

What is the best solution? The real answer is none of the above. The point of this exercise was to come up with a set-oriented, declarative answer. We have been writing functions to check a condition. What we want is a CHECK() constraint for the barcode. Try this instead.

CREATE TABLE Products

(..

barcode CHAR(13) NOT NULL

CONSTRAINT all_numeric_checkdigit

CHECK (barcode NOT SIMILAR TO '%[^0-9]%')

CONSTRAINT valid_checkdigit

CHECK (

(SELECT ABS(SUM(CAST(SUBSTRING(barcode

FROM Weights.seq FOR 1) AS INTEGER)

* Weights.wgt))

FROM (VALUES (CAST(1 AS INTEGER), CAST(-1 AS INTEGER)),

(2, +1), (3, -1), (4, +1), (5, -1),

(6, +1), (7, -1), (8, +1), (9, -1), (10, +1), (11,

-1), (12, +1)) AS weights(seq, wgt)

= CAST(SUBSTRING(barcode FROM 13 FOR 1) AS INTEGER)),

..

);

This will keep bad data out of the schema. The reason for splitting the code into two constraints is to provide better error messages. That is how we think in SQL.

--

**Joe Celko** was a member of the ANSI X3H2 Database Standards Committee and helped write the SQL-92 standards. He is the author of over 750 magazine columns and seven books, the best known of which is *SQL for* *Smarties*, now in its third edition. For more articles, puzzles, resources, and more, go to www.celko.com.

Contributors : Joe Celko

Last modified 2006-03-10 01:27 PM