Data integrity is the backbone of reliable software, and data validation is what protects it. This guide covers data validation in Python using Cerberus, a lightweight, extensible library for checking data against rules and schemas. We’ll start with the basics: installation, schemas, and error handling. Then we’ll work through its rule set with examples. Finally, we’ll look at customization and show how to extend Cerberus with your own rules and types. By the end, you’ll be able to validate data in your own Python projects.
Cerberus Basics
Cerberus is a powerful Python library designed for data validation and sanitization. It offers a straightforward way to define and enforce validation rules on complex data structures like dictionaries or JSON documents. In this section, we’ll explore the basics of Cerberus, including installation and how to get started.
Installation
You can install Cerberus using pip, Python’s package manager. Open your terminal or command prompt and run the following command:
pip install cerberus
Getting Started with Cerberus
Now that you have Cerberus installed, let’s dive into how to use it. There are three steps to validating your data with Cerberus:
- Create a data validator schema.
- Validate your data against the schema.
- Handle errors to correct your data.
Creating a Validator and Schema
To begin validating data with Cerberus, you’ll need to create a validator object and define a validation schema. The schema defines the rules for validating your data. Here’s a quick example:
from cerberus import Validator

# Define a validation schema
schema = {
    'name': {'type': 'string', 'minlength': 3, 'maxlength': 20},
    'age': {'type': 'integer', 'min': 18},
    'email': {'type': 'string', 'regex': r'^\S+@\S+\.\S+$'},
}

# Create a Validator instance with the schema
validator = Validator(schema)
The above code defines a validation schema as a dictionary with rules for three fields: name, age, and email. It specifies that name should be a string between 3 and 20 characters long, age should be an integer greater than or equal to 18, and email should be a string that matches a simple regular expression for email addresses.
After defining our schema, we create a validator instance called validator by passing the schema to the Validator class.
Validating Data
With the validator and schema in place, you can now use the validate() method to check whether a data dictionary conforms to the schema:
data = {
    'name': 'Usman Malik',
    'age': 33,
    'email': 'johndoe@example.com'
}

# Validate the data
if validator.validate(data):
    print("Data is valid!")
else:
    print("Validation errors:")
    print(validator.errors)
Output:
Data is valid!
The output shows that the dictionary is valid.
Handling Validation Errors
If validation fails, Cerberus reports the problems in the validator’s errors attribute, a dictionary that maps each failing field to a list of messages. You can handle these errors programmatically to give users meaningful feedback or to take other action based on the result. For example, the email value in the following dictionary is missing the @ sign, so validation fails and the error is printed to the console.
data = {
    'name': 'Usman Malik',
    'age': 33,
    'email': 'johndoeexample.com'
}

# Validate the data
if validator.validate(data):
    print("Data is valid!")
else:
    print("Validation errors:")
    print(validator.errors)
Output:
Validation errors:
{'email': ["value does not match regex '^\\S+@\\S+\\.\\S+$'"]}
Cerberus Rules
In the previous section, we got acquainted with the basics of Cerberus, setting up a validator and defining a validation schema. Now, let’s dive deeper into some of the rules that Cerberus supports for validating and shaping your data.
Types of Rules
Cerberus offers a rich set of rules that you can apply to your validation schema. Here are some of the key rules you can use:
Type Rule
The type rule specifies the expected data type for a field. For example:
'score': {'type': 'integer'}
You have already seen the type rule in action: our first schema required age to be an integer of at least 18.
Required Rule
The required rule ensures that a field must be present in the data. For example:
'email': {'type': 'string', 'required': True}
Empty Rule
The empty rule allows you to specify whether a field can be empty or not. You can set it to True or False. For example:
'comments': {'type': 'string', 'empty': False}
Min and Max Rules
The min and max rules set minimum and maximum constraints on numeric values. For example:
'age': {'type': 'integer', 'min': 18, 'max': 99}
Regex Rule
The regex rule lets you validate a field using a regular expression pattern. For example:
'zipcode': {'type': 'string', 'regex': r'^\d{5}$'}
Examples Using Cerberus Rules
Let’s see an example that creates a schema with the above rules.
schema = {
    'name': {'type': 'string', 'minlength': 3, 'maxlength': 20, 'required': True},
    'age': {'type': 'integer', 'min': 18},
    'email': {'type': 'string', 'regex': r'^\S+@\S+\.\S+$', 'required': True},
    'score': {'type': 'integer', 'min': 0, 'max': 100},
    'comments': {'type': 'string', 'empty': False},
    'zipcode': {'type': 'string', 'regex': r'^\d{5}$'}
}

validator = Validator(schema)
The following script creates a dictionary and validates it using the schema we just defined:
data = {
    'age': 25,
    'score': 85,
    'comments': '',
    'zipcode': '12345'
}

# Validate the data
if validator.validate(data):
    print("Data is valid!")
else:
    print("Validation errors:")
    print(validator.errors)
Output:
Validation errors:
{'comments': ['empty values not allowed'], 'email': ['required field'], 'name': ['required field']}
Let’s address the above problems and modify our data dictionary.
data = {
    'name': 'Alice Johnson',
    'age': 25,
    'email': 'alice@example.com',
    'score': 85,
    'comments': 'Great job!',
    'zipcode': '12345'
}

# Validate the data
if validator.validate(data):
    print("Data is valid!")
else:
    print("Validation errors:")
    print(validator.errors)
Output:
Data is valid!
By now, you can probably see how this is useful when accepting input from users in an application. Next, let’s look at how to customize Cerberus.
Cerberus Customization
Cerberus allows you to extend its capabilities with custom rules, types, methods, default setters, and more. Let’s see how to define our custom rules and data types.
Custom Validation Rules
One of the most powerful features of Cerberus is the ability to define custom validation rules. Here’s an example of how to define and use a custom rule:
from cerberus import Validator

class MyValidator(Validator):
    def _validate_is_even(self, is_even, field, value):
        """ {'type': 'boolean'} """
        # The docstring above is the schema Cerberus uses to check the rule's argument
        if is_even and value % 2 != 0:
            self._error(field, "Value must be even")

schema = {'number': {'type': 'integer', 'is_even': True}}
v = MyValidator(schema)
document = {'number': 3}
print(v.validate(document))
print(v.errors)
Output:
False
{'number': ['Value must be even']}
In the above code, we define a custom validator class MyValidator that extends the Validator class. We then define a custom validation rule _validate_is_even that checks if a given integer value is even. If the is_even flag is set to True in the schema for a field, the custom rule will be applied to that field during validation.
To test the custom validator, we create an instance of MyValidator and pass in the schema. We then validate a document that contains the field number with a value of 3. Since 3 is not even, the validation fails and the error message is returned.
Custom Types
Just like rules, Cerberus allows you to create custom types. Let’s see an example:
from decimal import Decimal
import cerberus

# TypeDefinition takes a name, the accepted types, and the excluded types
decimal_type = cerberus.TypeDefinition('decimal', (Decimal,), ())
cerberus.Validator.types_mapping['decimal'] = decimal_type

schema = {
    'price': {
        'type': 'decimal'
    }
}
data = {
    # Build the Decimal from a string so the value is exactly 10.99
    'price': Decimal('10.99')
}
v = cerberus.Validator(schema)
print(v.validate(data))
Output:
True
The script above begins by importing the Decimal class from the decimal module. It then imports the Cerberus module and defines a custom data type named decimal using the cerberus.TypeDefinition class. This custom type is registered with Cerberus by adding it to the cerberus.Validator.types_mapping dictionary. To test the custom type, we define a simple validation schema for a price field of type decimal and validate a dictionary whose price is a Decimal.
If you do not pass a Decimal to the price field, validation fails. For instance, the following script passes the float 10.99 instead, which the custom type rejects.
data = {
    'price': 10.99
}
v = cerberus.Validator(schema)
print(v.validate(data))
Output:
False
Conclusion
Cerberus is a useful Python package for data validation. In this tutorial, you’ve learned how to define schemas, apply built-in rules, handle validation errors, and extend Cerberus with custom rules and types. Whether you’re validating user input, checking financial records, or keeping data consistent across systems, Cerberus offers a reliable and extensible way to enforce data quality in Python applications.
If you want to explore Cerberus further, see the official documentation.