Scalar and aggregate functions allow you to perform operations on a set of values from a database without using external functions to handle the data. Though SQLite has some useful predefined functions, they sometimes don’t meet our needs. In this tutorial we’ll explain the differences between a scalar and aggregate function, and we’ll learn how to define our own scalar and aggregate functions and register them to a Python SQLite connection using create_function
and create_aggregate
. The source code for this tutorial can be downloaded from this link.
- Scalar Functions
- Aggregate Functions
- Other Remarks About Scalar and Aggregate Functions
- Closing Thoughts
Scalar Functions
Scalar functions are called once per record in the selected set and return a value for each call. In this section we will describe how to define new scalar functions, and how to register them in a database connection.
Defining New Scalar Functions
Unlike adapters and converters, scalar functions are registered for each connection using the create_function(name, nargs, func)
method of an
- A scalar function is invoked in a SQL statement using the name provided by the
name parameter. The names of scalar functions are case insensitive. - The
nargs parameter specifies the number of arguments of the scalar function. We will discuss this parameter in depth throughout this section. - The
func parameter is a callable object, i.e. a class with a__call__()
method. Every Python function is a callable object.func is called each time we use itsname in a SQL statement.
import sqlite3
import datetime
def test_age():
""" The `age()` function computes the age from the birth date. """
with sqlite3.connect(':memory:') as conn:
conn.create_function('age', 1,
lambda date: datetime.date.today().year \
- int(date.split('-')[0]))
conn.execute('CREATE TABLE artists(' \
'name TEXT NOT NULL, ' \
'birth_date TEXT NOT NULL, '\
'country CHAR(3));')
conn.executemany(
'INSERT INTO artists VALUES(?, ?, ?);',
( ('Lionel Richie', '1949-06-20', 'USA'),
('Alison Moyet', '1961-06-18', 'GBR'),
('Róisín Murphy', '1973-07-05', 'IRL'),
('Alissa White-Gluz', '1985-07-31', 'CAN'),
('Ralf Hütter', '1946-08-20', 'DEU')
))
for name, age in conn.execute(
'SELECT name, age(birth_date) FROM artists;'):
print('%s is %d years old.' % (name, age))
test_age()
The test_age()
function creates a new scalar function named datetime.date.today().year
, then splits the birth date, and, finally, subtracts the year of birth from the current year. This function has just one argument, create_function()
to 1. Then, we create the test_age()
function will print the age of each artist:
Lionel Richie is 70 years old. Alison Moyet is 58 years old. Róisín Murphy is 46 years old. Alissa White-Gluz is 34 years old. Ralf Hütter is 73 years old.
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
We can also use a class with a __call__()
method as a scalar function. Let’s take a look at a sample class mapping color names to RGB colors:
import sqlite3
def test_color_map():
""" Using a Python class as scalar function. """
class ColorMap:
""" A class mapping color names into RGB colors. """
cmap = {
'black' : (0 , 0 , 0 ),
'white' : (255, 255, 255),
'red' : (255, 0 , 0 ),
'blue' : (0 , 255, 0 ),
'green' : (0 , 0 , 255)
}
def __call__(self, name):
try:
return str(self.cmap[name.lower()])
except KeyError:
return None
with sqlite3.connect(':memory:') as conn:
conn.create_function('rgb', 1, ColorMap())
conn.execute('CREATE TABLE drawing(' \
'item TEXT NOT NULL, ' \
'x INTEGER NOT NULL, ' \
'y INTEGER NOT NULL, ' \
'stroke TEXT NOT NULL, ' \
'fill TEXT NOT NULL);')
conn.executemany(
'INSERT INTO drawing VALUES(?, ?, ?, ?, ?);',
( ('line-01' , 12 , 23, 'red' , 'red' ),
('circle-03', 34 , 5 , 'black', 'blue' ),
('line-02' , 129, 23, 'black', 'black'),
('text-01' , 12 , 3 , 'black', 'white'),
('rect-33' , 80 , 54, 'black', 'green')
) )
for item, stroke, fill in conn.execute(
'SELECT item, rgb(stroke), rgb(fill) FROM drawing;'):
print(';'.join((item, stroke, fill)))
test_color_map()
The __calls__()
method tries to retrieve a name from the color map in a case-insensitive way. It returns the RGB value, or None if __call__()
method has just one argument, so we will set the create_function()
to 1. The test_color_map()
function creates a SELECT
statement uses the rgb()
scalar function to retrieve all colors from the database in RGB format:
line-01;(255, 0, 0);(255, 0, 0) circle-03;(0, 0, 0);(0, 255, 0) line-02;(0, 0, 0);(0, 0, 0) text-01;(0, 0, 0);(255, 255, 255) rect-33;(0, 0, 0);(0, 0, 255)
Scalar Functions with the Same Name
You can register multiple scalar functions to the same name, as long as they have a different number of arguments. SQLite will match the function call to the right implementation according to the number of parameters.
import sqlite3
import random
def test_scalar_signature():
""" Registering multiple functions to the same name. """
with sqlite3.connect(':memory:') as conn:
conn.create_function(
'randint', 1, lambda u: random.randint(0, u))
conn.create_function(
'randint', 2, lambda l, u: random.randint(l, u))
for i in range(5):
for n in conn.execute('SELECT randint(1, 100);'):
print('Pick a random number: %d' % n)
for i in range(5):
for n in conn.execute('SELECT randint(50);'):
print('Pick a random number [0-50]: %d' % n)
test_scalar_signature()
The test_scalar_signature()
registers two
- The first function picks a random integer between 0 and
u . Therandint(50)
function call matches this function, since it has just one parameter. - The second function chooses a random integer between
l andu . Therandint(1, 100)
function call matches this function, since it has two parameters.
The test_scalar_signature()
has this typical output:
Pick a random number: 40 Pick a random number: 93 Pick a random number: 87 Pick a random number: 6 Pick a random number: 28 Pick a random number [0-50]: 26 Pick a random number [0-50]: 18 Pick a random number [0-50]: 0 Pick a random number [0-50]: 50 Pick a random number [0-50]: 30
Scalar Functions with Default Parameters
Sometimes we wish to provide default values to parameters of a scalar function. But, as we have seen above, create_function()
requires the number of arguments of the registered function. How many arguments have functions like func(a, b=1, c=2)
? A function like this may have 1, 2, or 3 arguments, depending on how we call it. To register such a function, we must set
conn.create_function('f', -1, func)
Setting the
import sqlite3
def test_default_parameters():
""" Scalar functions with default parameters. """
with sqlite3.connect(':memory:') as conn:
conn.create_function('concat', -1,
lambda x, y, sep=',' : "%s%s%s" % (x, sep, y))
conn.execute('CREATE TABLE songs(' \
'artist TEXT NOT NULL, ' \
'title TEXT NOT NULL);')
conn.executemany('INSERT INTO songs VALUES (?, ?);',
( ('Muse', 'Sunburn'),
('Epica', 'Storm the Sorrow'),
('Leo Sayer', 'Living in a Fantasy'),
('Spandau Ballet', 'Gold'),
('Roy Orbison', 'Uptown')
))
print('Using the `,` default separator:')
for word in conn.execute(
'SELECT concat(artist, title) FROM songs;'):
print(word[0])
print('\nSpecifying `;` as separator:')
for word in conn.execute(
"SELECT concat(artist, title, ';') FROM songs;"):
print(word[0])
test_default_parameters()
The test_default_parameters()
registers a concat(x, y, sep=',')
function, where SELECT
statements, one using the default separator, the other using ;
as separator. The output of this functions is:
Using the `,` default separator: Muse,Sunburn Epica,Storm the Sorrow Leo Sayer,Living in a Fantasy Spandau Ballet,Gold Roy Orbison,Uptown Specifying `;` as separator: Muse;Sunburn Epica;Storm the Sorrow Leo Sayer;Living in a Fantasy Spandau Ballet;Gold Roy Orbison;Uptown
Scalar Functions with Variable Number of Arguments
Occasionally we don’t know the number of parameters of a function in advance, so we want to define functions capable of handling a variable number of parameters. Functions like this are sometimes referred to as variadic functions. Just like the case of functions with default parameters, variadic functions are registered with create_function
method set to -1.
import sqlite3
def test_variadic_scalar():
""" Scalar function with variable number of arguments. """
with sqlite3.connect(':memory:') as conn:
conn.create_function('csv', -1,
lambda *args : ','.join(map(repr, args)) )
conn.execute('CREATE TABLE songs(' \
'artist TEXT NOT NULL, ' \
'song TEXT NOT NULL, ' \
'album TEXT NOT NULL, ' \
'year INTEGER NOT NULL);')
conn.executemany(
'INSERT INTO songs VALUES (?, ?, ?, ?);',
( ('Muse', 'Cave', 'Showbiz', 1999),
('America', 'Sandman', 'America', 1971),
('Pet Shop Boys', 'Happiness', 'Super', 2016),
('Sleeping Romance', 'Alba', 'Alba', 2017),
('Depeche Mode', 'Home', 'Ultra', 1997)
))
print('artist,song,album,year')
for record in conn.execute(
'SELECT csv(artist, song, album, year) FROM songs;'):
print(record[0])
print('\nartist,song,year')
for record in conn.execute(
'SELECT csv(artist, song, year) FROM songs;'):
print(record[0])
test_variadic_scalar()
The test_variadic_scalar()
function registers a
The output of the test_variadic_scalar()
function will be:
artist,song,album,year 'Muse','Cave','Showbiz',1999 'America','Sandman','America',1971 'Pet Shop Boys','Happiness','Super',2016 'Sleeping Romance','Alba','Alba',2017 'Depeche Mode','Home','Ultra',1997 artist,song,year 'Muse','Cave',1999 'America','Sandman',1971 'Pet Shop Boys','Happiness',2016 'Sleeping Romance','Alba',2017 'Depeche Mode','Home',1997
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
Aggregate Functions
Aggregate functions are called for each record in the table but return just once, after all records have been processed. In this section we will learn how to define and register our own aggregate functions.
Defining New Aggregate Functions
In sqlite3
an aggregate function is a class providing three methods:
- You initialize a new instance of the class with the
__init__()
method. It only has aself parameter, that is, the instance of the class. You can use it to initialize some variable, like a counter. - The
step()
method is called once for any record of the table. You can define it with as many arguments as you want. Handling NULL values is your responsibility. - After the last row has been processed, the
finalize()
method is called to compute the result of the aggregate function. It also only has theself parameter.
Let’s take a look at an example. We’ll define a sum()
function, which tries to convert any value to a numeric type. You define an aggregate function using the create_aggregate(name, nargs, aggregate_class)
method of an create_function
method was used to define scalar functions. This is best illustrated with an example:
import sqlite3
def test_reduce():
""" Defining a new aggregate function. """
class Reduce:
""" Adds all values. """
def __init__(self):
self.total = None
def step(self, item):
if self.total is None:
self.total = item
else:
self.total += item
def finalize(self):
return self.total
with sqlite3.connect(':memory:') as conn:
conn.create_aggregate('reduce', 1, Reduce)
conn.execute('CREATE TABLE data(' \
'word TEXT, num INTEGER, bin BLOB);')
conn.executemany(
'INSERT INTO data VALUES(?, ?, ?);',
( ('Guido' , 1, b'\x00\x11'),
('Van' , 2, b'\x22\x33'),
('Rossum', 3, b'\x44\x55') ))
for record in conn.execute(
'SELECT word, num, quote(bin) FROM data;'):
print(','.join(map(str, record)))
for word, num, bin in conn.execute(
'SELECT reduce(word), reduce(num), ' \
'quote(reduce(bin)) FROM data;'):
print('reduce(word) = %s' % word)
print('reduce(num) = %d' % num)
print('reduce(bin) = %s' % bin)
test_reduce()
Let’s break down this example:
- The
Reduce class represents our aggregate function. Its__init__()
method setstotal toNone
. We will use it to hold the partial “sum” of all values. Itsstep()
method handles a single value from the table. Iftotal hasn’t been set yet, then it sets it toitem . Otherwise, it “adds” it tototal . Addition behaves according to the data type oftotal :- If it is an
int
, then the function performs integer addition. - If it is a
float
, then it performs float addition. - If it is a
str
, then it concatenatestotal toitem . - If it is a
blob
, then it concatenatestotal toitem .
- If it is an
test_reduce()
creates aconn connection to an in-memory database. Then, it registers theReduce aggregate function as'reduce'
. It creates adata table, holding three distinct data types, and fills it with 3 records. We print out all records using aSELECT
statement. Thequote(bin)
function will return the hexadecimal representation of the BLOB, in order to make it more readable. Finally, we apply thereduce()
function on each column, and we get the following output:
Guido,1,X'0011' Van,2,X'2233' Rossum,3,X'4455' reduce(word) = GuidoVanRossum reduce(num) = 6 reduce(bin) = X'001122334455'
The DISTINCT Filter
Every aggregate function with a single parameter can prefix its argument with ALL
or DISTINCT
. The ALL
keyword selects all rows in the results set (which is the default behavior), while DISTINCT
selects just one instance for each value in the column. In the following example the DISTINCT
keyword helps the choice()
function pick a value from distinct samples.
import sqlite3
import random
def test_choice():
""" Test the `Chooser` aggregate function. """
class Chooser:
""" Picks an item from a set of choices. """
def __init__(self):
self.choices = []
def step(self, item):
""" Adds `values` to the choices. """
print('inserting %d' % item)
self.choices.append(item)
def finalize(self):
""" Picks an item from the choices. """
if self.choices:
return random.choice(self.choices)
with sqlite3.connect(':memory:') as conn:
conn.create_aggregate('choice', 1, Chooser)
conn.execute(
'CREATE TABLE samples(sample INTEGER);')
conn.executemany(
'INSERT INTO samples VALUES(?);',
( (1, ), (3, ), (7, ), (1, ), (3, ),
(4, ), (5, ), (1, ) ))
print('Available samples:')
for n in conn.execute('SELECT * FROM samples;'):
print(n[0])
print('\nChoosing from all samples:')
for n in conn.execute(
'SELECT choice(sample) FROM samples;'):
print('Pick a sample: %d' % n)
print('\nOnly one instance per sample:')
for n in conn.execute(
'SELECT choice(DISTINCT sample) FROM samples;'):
print('Pick a sample: %d' % n)
test_choice()
Let’s have a closer look at this example:
- The
Chooser class is an aggregate function which collects all values from a column, and returns a random value from it, orNone
if there is no value available. Its__init__()
method creates an empty list. Itsstep()
method printsitem , then adds it to the list. It doesn’t check if that value is already on the list. Finally, itsfinalize()
method chooses a value from the list usingrandom.choice()
. - The
test_choice()
function opens aconn connection to an in-memory database, and registers theChooser
function as'choice'
. Then, it creates asamples table, and inserts some integer values in it. As you can see, samples are not unique, e.g. there are more than one instance of1
. Finally, we try to use thechoice()
function twice, with and without theDISTINCT
keyword, getting the following results:
Available samples: 1 3 7 1 3 4 5 1 Choosing from all samples: inserting 1 inserting 3 inserting 7 inserting 1 inserting 3 inserting 4 inserting 5 inserting 1 Pick a sample: 1 Only one instance per sample: inserting 1 inserting 3 inserting 7 inserting 4 inserting 5 Pick a sample: 3
As we can see, in the first case all values from the DISTINCT
keyword prevents duplicate values from being inserted into the list. Using DISTINCT
can be more efficient than testing for duplicates in the step()
method, using something like if item not in self.choices
.
Aggregate Functions with Variable Number of Arguments
Everything we said earlier about the number of arguments of scalar functions still apply to the step()
method of classes representing aggregate functions. That means we can have:
- Multiple aggregate functions with the same name, but different number of parameters.
- An aggregate function with default parameters.
- An aggregate function with variable number of parameters.
The only difference is you call the create_aggregate
method instead of the create_function
method. Let’s see an example of variadic aggregate function:
import sqlite3
def test_variadic_aggregate():
""" Defining a variadic aggregate function. """
class RangeMax:
""" Maximum value in a range of cells. """
def __init__(self):
self.maxval = None
def step(self, *values):
if values:
if self.maxval is None:
self.maxval = max(values)
else:
self.maxval = max(
self.maxval, max(values))
def finalize(self):
return self.maxval
with sqlite3.connect(':memory:') as conn:
conn.create_aggregate('rangemax', -1, RangeMax)
conn.execute('CREATE TABLE review(' \
'item TEXT NOT NULL, ' \
'USA INTEGER NOT NULL, ' \
'CAN INTEGER NOT NULL, ' \
'MEX INTEGER NOT NULL);')
conn.executemany(
'INSERT INTO review VALUES(?, ?, ?, ?);',
( ('belt' , 723, 134, 89),
('tie' , 567, 378, 78),
('necklace', 893, 456, 46),
('earrings', 744, 393, 122),
('bracelet', 612, 334, 103)
) )
for m in conn.execute(
'SELECT rangemax(USA, CAN, MEX) FROM review;'):
print('Maximum number of items sold: %d' % m)
test_variadic_aggregate()
The None
if the range is empty. Its step()
method uses the Python built-in max()
function to calculate the maximum value within a row of data, then it compares it to the current maximum value. This aggregate function is registered with
The test_variadic_aggregate()
function creates a SELECT
statement uses the rangemax()
function on a rectangular range, including all items and countries, to determine the maximum number of items sold, which is:
Maximum number of items sold: 893
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
Remarks About Scalar and Aggregate Functions
In this section we have collected a series of noteworthy remarks which apply both to scalar and aggregate functions. Read them carefully before trying to define your own functions.
Beware of NULL Values
When defining a new scalar or aggregate function, remember that the values which you are performing operations on can be NULL. If that’s the case, then check for None
values and handle them appropriately; e.g. an aggregate function counting or summing values should skip NULL values.
import sqlite3
def test_checked_mean():
""" Calculating the mean value with `CheckedMean`. """
class CheckedMean:
""" Computes the mean of a set of values. """
def __init__(self):
self.values = []
def step(self, item):
if item is not None:
self.values.append(item)
def finalize(self):
return sum(self.values) / len(self.values) \
if self.values else 0
with sqlite3.connect(':memory:') as conn:
conn.create_aggregate('mean', 1, CheckedMean)
conn.execute(
'CREATE TABLE samples(i INTEGER);')
conn.executemany(
'INSERT INTO samples VALUES(?);',
( (23, ), (None, ), (12, ), (50, ),
(76, ), (None, ), (32, ), (63, ) ))
for tot in conn.execute(
'SELECT mean(i) FROM samples;'):
print('The mean value is: %d' % tot)
test_checked_mean()
In this example we define a step()
methods prevents None
values from being inserted in the test_checked_mean()
function creates a NULL
values. Finally, the SELECT
statement uses the mean()
aggregate function to calculate the arithmetic mean on the
The mean value is: 42
When possible, rely on the WHERE
or HAVING
clauses of the SELECT
statement for filtering NULL
values:
import sqlite3
def test_unchecked_mean():
""" Calculating the mean value with `UncheckedMean`. """
class UncheckedMean:
""" Computes the mean of a set of values. """
def __init__(self):
self.values = []
def step(self, item):
self.values.append(item)
def finalize(self):
return sum(self.values) / len(self.values) \
if self.values else 0
with sqlite3.connect(':memory:') as conn:
conn.create_aggregate('mean', 1, UncheckedMean)
conn.execute(
'CREATE TABLE samples(i INTEGER);')
conn.executemany(
'INSERT INTO samples VALUES(?);',
( (23, ), (None, ), (12, ), (50, ),
(76, ), (None, ), (32, ), (63, ) ))
for tot in conn.execute(
'SELECT mean(i) FROM samples ' \
'WHERE i IS NOT NULL;'):
print('The mean value is: %d' % tot)
test_unchecked_mean()
The step()
method won’t check for NULL
values. The test_unchecked_mean()
function adds a WHERE
clause to the SELECT
statement to make sure that no NULL
value will be passed to the test_unchecked_mean()
is the same as its checked version.
New Functions Can Silently Overwrite Others
If you try to register a scalar or aggregate function with the same name and the same number of arguments as another function, the latter will replace the former, and the SQLite core won’t complain (e.g. by raising an exception or a warning):
import sqlite3
def test_overwrite():
""" Silently overwriting a scalar function. """
def sloc(file):
try:
with open(file, 'rt', encoding='utf-8') as src:
n = 0
for line in src:
line = line.strip()
if line:
n += 1
except Exception:
return 0
return n
def sloc_comment(file):
try:
with open(file, 'rt', encoding='utf-8') as src:
n = 0
for line in src:
line = line.rstrip()
if line and not any(
map(line.startswith, ('/*', ' *'))):
n += 1
except Exception:
return 0
return n
with sqlite3.connect(':memory:') as conn:
conn.create_function('sloc', 1, sloc)
conn.execute('CREATE TABLE files(' \
'name TEXT NOT NULL, ' \
'lang TEXT NOT NULL);')
conn.executemany(
'INSERT INTO files VALUES(?, ?);',
( ('SQLiteGroupConcat.sql', 'sql'),
('SQLiteTrim.sql', 'sql')
))
print('Using the `sloc()` function:')
for name, sloc in conn.execute(
'SELECT name, sloc(name) FROM files;'):
print('File `%s` is %d SLOC.' % (name, sloc))
conn.create_function('sloc', 1, sloc_comment)
print('\nUsing the `sloc_comment()` function:')
for name, sloc in conn.execute(
'SELECT name, sloc(name) FROM files;'):
print('File `%s` is %d SLOC.' % (name, sloc))
test_overwrite()
In this example, we define two functions for computing the number of source lines of code (SLOC): the sloc()
function just skips the empty lines, the sloc_comment()
function also skips SQL comments. Both functions will return 0 if an exception occurs while reading the file (e.g. the file is not available).
The test_overwrite()
function creates a SELECT
statement will call sloc()
, while the second will use sloc_comment()
. Even though they use different functions, notice how both functions have their name
argument set to
Using the `sloc()` function: File `SQLiteGroupConcat.sql` is 15 SLOC. File `SQLiteTrim.sql` is 9 SLOC. Using the `sloc_comment()` function: File `SQLiteGroupConcat.sql` is 8 SLOC. File `SQLiteTrim.sql` is 2 SLOC.
Beware that this holds for predefined functions, too:
import sqlite3
import random
def test_overwrite_predefined_function():
""" Overwriting the `total()` function. """
class new_total:
""" New version of the total function. """
def __init__(self):
self.tot = 0
def step(self, i):
print('adding %d' % i)
self.tot += i
def finalize(self):
return self.tot
with sqlite3.connect(':memory:') as conn:
conn.execute('CREATE TABLE test(c);')
conn.executemany('INSERT INTO test VALUES(?);',
( (1, ), (2, ), (3, ), (4, ) ))
# Using predefined `total()` function
for i in conn.execute(
'SELECT total(c) FROM test;'):
print('The total of column `c` is %d' % i[0])
# Using `new_total()` function
conn.create_aggregate('total', 1, new_total)
for i in conn.execute(
'SELECT total(c) FROM test;'):
print('The total of column `c` is %d' % i[0])
test_overwrite_predefined_function()
The total of column `c` is 10 adding 1 adding 2 adding 3 adding 4 The total of column `c` is 10
In other words, you must be careful when choosing the name for your new functions in order to avoid overwriting existing default scalar and aggregate functions.
Aggregate and Scalar Functions Share the Same Namespace
Earlier we explained that we can define multiple scalar functions with the same name but with a different number of arguments. SQLite will choose the correct alternative according to the number of arguments in the function’s call. Despite the fact that they are registered with different functions, both scalar and aggregate functions share the same namespace. In other words, if we register a scalar function with the same name and number of arguments as an aggregate function, the function registered last will overwrite the other.
import sqlite3
import random
def test_namespace():
""" Aggregate and scalar functions share the same namespace. """
class RandomPair:
""" Selects a random pair of items. """
def __init__(self):
self.choices = []
def step(self, x, y):
self.choices.append("(%s;%s)" % (x, y))
def finalize(self):
return random.choice(self.choices) \
if self.choices else None
with sqlite3.connect(':memory:') as conn:
conn.execute('CREATE TABLE pairs(' \
'x INTEGER NOT NULL, y INTEGER NOT NULL);')
conn.executemany('INSERT INTO pairs VALUES(?, ?);',
( (12, 35), (97, 72), (12, 34),
(54, 23), (11, 90), (23, 44) ))
print('Available samples:')
for x, y in conn.execute(
'SELECT * FROM pairs;'):
print('(%d, %d)' % (x, y))
conn.create_function('random', 2,
lambda l, u: random.randint(
min(l, u), max(l, u)))
print('\nUsing the scalar function:')
for i in conn.execute(
'SELECT random(x, y) FROM pairs;'):
print(i[0])
conn.create_aggregate('random', 2, RandomPair)
print('\nUsing the aggregate function:')
for i in conn.execute(
'SELECT random(x, y) FROM pairs;'):
print(i[0])
test_namespace()
The test_namespace()
function creates a connection to an in-memory database, holding a single SELECT
statement twice:
- The first time, we register the scalar function
random(l, u)
, which picks a random number betweenu andl . TheSELECT
statement returns 6 records. - The second time, we register the aggregate function
random(x, y)
, which choose a random pair between all records in thepairs table. TheSELECT
statement returns just one record.
The typical output of the test_namespace()
function will be:
Available samples: (12, 35) (97, 72) (12, 34) (54, 23) (11, 90) (23, 44) Using the scalar function: 13 93 21 46 25 25 Using the aggregate function: (12;35)
SQLite features a max(column)
aggregate function, which picks the maximum value in a max(x, y, *others)
scalar function, which selects the maximum value between
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
Ambiguous Function Calls
Be really careful when defining functions with default parameters and variadic functions. Let’s consider three functions with the following signatures: func(x)
, func(x, y=0)
, and func(x, *args)
. We can associate either the second or the third to the same identifier, since they both require the create_function()
to be set to -1. If we execute a SELECT func(a) FROM tab;
statement, which function will
import sqlite3
import random
import string
def test_ambiguous_call():
""" An ambiguous function call. """
def generate_words(n=10, minlen=5, maxlen=10):
for i in range(n):
yield ''.join(
( random.choice(string.ascii_letters) \
for i in range(minlen, maxlen) ))
with sqlite3.connect(':memory:') as conn:
conn.create_function('concat', 2,
lambda x, y : '%s,%s' % (x, y))
conn.create_function('concat', -1,
lambda x, y, sep=',' : '%s%s%s' % (x, sep, y))
conn.execute('CREATE TABLE words(' \
'prefix TEXT, suffix TEXT);')
conn.executemany('INSERT INTO words VALUES (?, ?);',
( zip(generate_words(5), generate_words(5)) ))
print('`concat(x, y, sep)` matches the second function:')
for word in conn.execute(
"SELECT concat(prefix, suffix, '@') FROM words;"):
print(word[0])
print('\nAmbiguous call:')
for word in conn.execute(
'SELECT concat(prefix, suffix) FROM words;'):
print(word[0])
test_ambiguous_call()
The test_ambiguous_call()
registers two 'concat'
scalar function, one with two arguments, another one with two arguments and one optional argument. That’s perfectly fine, since they have different number of parameters. Both functions concatenate the first value to the second value. The concat()
function twice:
- The first call uses three parameters, so only the second scalar function can match it.
- The second call uses two parameters, so both functions are suitable candidates, but the first function will be applied.
Let’s take a look at a typical output of the test_ambiguous_call()
function:
`concat(x, y, sep)` matches the second function: bRGIB@TrVbN pBxeU@omJsa cnhCw@kUpcW QyJAM@ENDcj ClnjO@KDWAC Ambiguous call: bRGIB,TrVbN pBxeU,omJsa cnhCw,kUpcW QyJAM,ENDcj ClnjO,KDWAC
As you can see from this example, the function with fixed number of argument is matched first, though the function with variable number of arguments was a good candidate, too. This behavior is hardly predictable, so be really careful when defining functions with the same name.
Error Handling
Scalar and aggregate functions can raise exceptions; e.g. a NULL value is passed to a function that can’t handle NULL values. In this case the exception propagates to the SQLite core and will raise another exception, which is a subclass of the sqlite3.Error
exception. This feature makes error handling easier since functions can raise any kind of exception, but we need to be able to handle those from within the sqlite3
module.
import sqlite3
def test_exceptions():
""" Handling exceptions from Scalar and Aggregate Functions. """
def faulty_scalar(a):
""" A scalar function raising `ValueError`. """
raise ValueError('Faulty scalar function.')
class faulty_aggregate:
""" An aggregate function raising `ValueError`. """
def step(self, a):
raise ValueError('Faulty aggregate function.')
def finalize(self):
pass
with sqlite3.connect(':memory:') as conn:
conn.create_function(
'faulty_scalar', 1, faulty_scalar)
conn.create_aggregate(
'faulty_aggregate', 1, faulty_aggregate)
try:
conn.execute('SELECT faulty_scalar(1);')
except sqlite3.Error as error:
print(error)
try:
conn.execute('SELECT faulty_aggregate(1);')
except sqlite3.Error as error:
print(error)
test_exceptions()
In our example we define a ValueError
exception. To protect our code from propagating exceptions, we need an except
clause to catch sqlite3.Error
exceptions. This is the base class for all exceptions raised by the sqlite3
module, as we can see from test_exceptions()
’s output:
user-defined function raised exception user-defined aggregate's 'step' method raised error
Closing Thoughts
In this tutorial we learned how to define custom scalar and aggregate functions. They are both registered on a per-connection basis. Scalar functions are represented by callable objects with a fixed (or variable) number of arguments. On the other hand, aggregate functions are represented by classes with step()
and finalize()
methods.
Functions are matched by name and number of arguments, so we can register more than one function to the same name, but we must always keep in mind that aggregate and scalar functions share the same namespace, and that registering both fixed-number and variable-number functions can lead to ambiguous function calls, which are hard to detect.
Did you find this free tutorial helpful? Share this article with your friends, classmates, and coworkers on Facebook and Twitter! When you spread the word on social media, you’re helping us grow so we can continue to provide free tutorials like this one for years to come. After that, please subscribe to our email list for more Python tutorials.
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
List of All Examples in This Tutorial
For convenience, here’s a list of all the Python SQLite examples we presented in this tutorial. Hopefully you’ll find this inspirational when creating your own SQLite functions in Python.
- Defining a New Scalar Function
age()
- Using a Python Class as Scalar Function
- Multiple Functions with the Same Name
- Scalar Function with a Default Parameter
- A Scalar Function with a Variable Number of Parameters
- Defining the
Reduce
Aggregate Function - Using the
DISTINCT
Keyword - A Variadic Aggregate Function
- Checked Mean
- Unchecked Mean, Relying on the
WHERE
Clause - Overwriting a Custom Function
- Overwriting a Predefined Function
- Aggregate and Scalar Functions Share the Same Namespace
- Ambiguous Function Calls
- Handling Errors from Scalar and Aggregate Functions