In this topic, we discuss SQL Server Constraints. We first discuss the theory behind them, i.e. the concepts of Data Integrity and Referential Integrity, and Database Normalization. Then we will learn about the types of constraints in SQL with examples, its rules, and syntax. In particular, we will check table constraints. You can learn SQL queries in another article.
SQL constraints are programmatic constructs that alter the Server’s behavior while it applies INSERT/DELETE/UPDATE operations on data. SQL Constraints are SQL Server’s way of implementing the concepts of Data Integrity, Referential Integrity, and Database Normalization in general.
Data Integrity is the process of checking the data against real-world rules. For example, a person’s National ID info is unique to that person, and an employee’s Hiring Date must be less than their Last Working Date, etc.
Referential Integrity is a set of measurements against a database to ensure the relations between its entities (tables) remain valid after any change to the data. As we will see below, SQL Server provides the Primary Key and Foreign Key constraints features to ensure that the referential integrity state of the database remains valid at all times.
Referential Integrity is a part of a general process called Database Normalization.
Database Normalization is the process by which we make sure the Database is in good shape in terms of the various integrity measurements and Business Logic rules.
Normalization is important because it eliminates Data redundancy. If a database is poorly designed it will contain the same data duplicated over many places of it. This adds a burden to the application accessing the database to be responsible for maintaining the Data & Referential Integrity for the database.
From a lower-level perspective, a normalized database means a smaller storage space required per row in each table since the row will never contain columns that logically belong to another table. This improves the speed of fetching rows from the disk and the number of rows fetched from each disk page read, hence more performance.
1NF stands for “1st Normal Form”. It is the most basic step towards making a database normalized.
The first normal form states that:
2NF states that there must be a mechanism in the Database engine that reflects any dependency between a referencing column and a referenced column. This mechanism must enforce this dependency such that changes to the data in either the referenced column or the referencing column do not corrupt the dependency.
3NF states that all column references in reference data that are not dependent on the primary key should be removed. Another way of putting this is that only foreign key columns should be used to reference another table, and no other columns from the parent table should exist in the referenced table. 3NF is important in reducing redundancy which improves performance as discussed earlier, but it must be noted that by definition it cannot be enforced by SQL Server using any mechanism as it depends on the Database designer to be aware of it and avoid repeating columns unnecessarily in many places.
The Primary Key constraint (PK) implements 1NF by making sure each table has a single column or a small set of columns that uniquely identify each row. Because primary key constraints in SQL are required to be unique, they are usually defined over an identity column, i.e. an integer column that is incremented automatically by SQL Server while inserting a new row.
To declare a Primary Key constraint in SQL Server you can do this in two ways:
CREATE TABLE [dbo].[Regions]( [RegionID] [int] NOT NULL PRIMARY KEY, [RegionDescription] [nchar](50) NOT NULL )
(Note that the example above places the constraint over a regular int column rather than an identity column)
ALTER TABLE [dbo]. [Regions] ADD PRIMARY KEY CLUSTERED ( [RegionID] ASC )
Foreign Key constraints (FK) is a restriction over a column or set of columns that define a relationship between the row they are part of and a unique row in another table. This is achieved by the FK matching exactly the same number and type of columns of the Primary Key (PK) of the table being referenced.
For example, the Regions table below can be used as a reference table for another table that needs to add region information to its data. We usually use the names “parent table” and “child table” to refer to the referenced table and the referencing table respectively. For example, the data in the two tables would look like this:
RegionID | RegionDescription |
1 | Eastern |
2 | Western |
3 | Northern |
4 | Southern |
ID | Name | RegionID | StartDate | Code | Budget |
1 | South Carolina | 1 | 2016-05-12 | SC | NULL |
2 | Virginia | 1 | 2016-05-14 | VG | 20000 |
3 | New York | 2 | 2017-04-15 | NY | NULL |
The column RegionID in SalesRegions defines the link between a Sales Region info and Region info. By separating the data in this way, the FK constraint satisfies the 1NF rule that each set of related data must have its own table. For example, we have a third table to store info about Marketing Regions, etc.
To define an FK Constraint on a table you should follow this syntax.
ALTER TABLE [dbo].[SalesRegions] ADD CONSTRAINT [FK_SalesRegions_Regions] FOREIGN KEY([RegionID]) REFERENCES [dbo].[Regions] ([RegionID]) GO
We will use the following table to demonstrate the concepts below:
CREATE TABLE [dbo].[SalesRegions]( [ID] [int] IDENTITY(1,1) NOT NULL, [Name] [varchar](200) NOT NULL, [RegionID] [int] NOT NULL, [StartDate] [date] NOT NULL, [Code] [nvarchar](50) NOT NULL, [Budget] [decimal](18, 0) NULL ) ON [PRIMARY] GO ALTER TABLE [dbo].[SalesRegions] WITH CHECK ADD CONSTRAINT [FK_SalesRegions_Regions] FOREIGN KEY ([RegionID]) REFERENCES [dbo].[Regions] ([RegionID]) GO ALTER TABLE [dbo].[SalesRegions] CHECK CONSTRAINT [FK_SalesRegions_Regions] GO
Note that some developers argue against Data Integrity constraints in SQL saying that the business logic of an application should be retained within the business layer instead of the Database layer. However, this approach makes it possible to have invalid data in the database. This can happen because there are bugs in the business logic or because data is updated via a different route. For example, when a database is used by multiple applications with differing logic. Data Integrity SQL constraints prevent this problem.
A Unique Constraint as the name implies is used to ensure that the data inside a single column or a set of columns remain unique across the whole table at all times.
A unique constraint can be defined using the following syntax:
ALTER TABLE dbo.SalesRegions ADD CONSTRAINT UK_SalesRegions_Code UNIQUE NONCLUSTERED ( Code ) GO
This example adds a unique constraint over the column Code in table SalesRegions, since from the application logic perspective a sales region code should be used for only one sales region.
Both Unique Keys and Primary Keys enforce uniqueness, however, note the following differences between them:
Primary Key:
Unique Key:
Also note that although it is a “Constraint”, SQL Server Management Studio will place Unique Key objects under the “Indexes” node in Object Explorer rather than the “Constraints” node under the respective table.
The difference between the two is non-existent as both have the same effect. You can reach the same effect for the constraint above by defining an index instead as follows:
CREATE UNIQUE NONCLUSTERED INDEX UK_SalesRegions_Code ON dbo.SalesRegions -( Code ) GO
A Default Constraint adds more context to the database by making sure that a default value is given to a column whenever it makes sense to use one. For example, if you are adding a new row to a “Files” table which contains a DateCreated column, it makes sense to add a default constraint to this column to be equal to the current time.
For example in the table SalesRegions if you want to set a default constraint on the Budget column such that it is initially 1000 for a new sales region, you can do it inline in the table definition as follows:
CREATE TABLE [dbo].[SalesRegions]( [ID] [int] IDENTITY(1,1) NOT NULL, [Name] [varchar](200) NOT NULL, [RegionID] [int] NOT NULL, [StartDate] [date] NOT NULL, [Code] [nvarchar](50) NOT NULL, [Budget] [decimal](18, 0) NULL DEFAULT(1000) ) ON [PRIMARY] GO
You can also do it after the table has been declared as follows:
ALTER TABLE [dbo].[SalesRegions] ADD CONSTRAINT [DF_SalesRegions_Budget] DEFAULT ((1000)) FOR [Budget] GO
The advantage of the second approach – which can also be used while creating the table as a second statement following the CREATE statement – is that you can control the name of the default constraint. This is useful if you follow a naming standard for your database schema.
A Check Constraint is a logical rule that makes sure the data entered in one or more columns is consistent with business rules. For example in our SalesRegion table we can limit the budget to be between $15,000 and $100,000 while creating the table as follows:
CREATE TABLE [dbo].[SalesRegions]( [ID] [int] IDENTITY(1,1) NOT NULL, [Name] [varchar](200) NOT NULL, [RegionID] [int] NOT NULL, [StartDate] [date] NOT NULL, [Code] [nvarchar](50) NOT NULL, [Budget] [decimal](18, 0) NULL, Check ([Budget]) between 15000 and 100000), ) ON [PRIMARY] GO
Similarly, you can add it to the table after it has been created to control the name assigned to the constraint as follows:
ALTER TABLE [dbo].[SalesRegions] ADD CONSTRAINT CC_SalesRegions_Budget CHECK (([Budget]>=(15000) AND [Budget]<=(100000))) GO
Note that check SQL constraints that require more complex logic, for instance like looking for value across another table will typically be implemented as a user-defined function or as a trigger.
A Not-Null Constraint validates that a certain column can never be left with an unknown value. This can be used to enforce the application logic whenever the application assumes a column must always have a value, for example, a Spare Part must always have a Serial Number.
The way to declare this constraint is to simply add the text “NOT NULL” at the end of the column definition, as you can see in most of the columns in the definition of SalesRegions.
Note that you may decide to make a column not-nullable later in the application development. To apply this you need to update the column’s data first to replace all its null values with a default value and then apply the NOT NULL constraint in a command similar to this:
Alter TABLE [dbo].[SalesRegions] ALTER COLUMN Budget [decimal](18,0) NOT NULL GO
In this article, we discussed SQL Server constraints as SQL Server’s way of applying Data Integrity and Referential Integrity rules. We also presented some background theory on Database Normalization as this theory is directly related to the Constraints concept in SQL Server.