What is Data Integrity?

In this era where more and more data is being stored and produced than ever, it is important that the integrity of our data is preserved. Imagine we have some employees working for an MNC. These employees work for two departments say, sales and finance. Both of these departments have their own database tables. Then do we need to store the record of these employees twice? No, data integrity says that there should be no duplicate data in our database. There should be no redundancy and our data should be consistent throughout the database. But, how can this be achieved? There are various data integrity constraints provided to us by the DBMS. So, let's see them in detail.

Data Integrity

Data Integrity is having correct and accurate data in your database. When we are storing data in the database we don't want repeating values, we don't want incorrect values or broken relationships between tables. So, let's understand through an example that how broken relationships can cause data inconsistency.

Example: Let us imagine we have a customer database where we have two tables i.e ' customer_table '(customer_id, customer_name, purchase_id) and ' purchase_table '(purchase_id, purchhased_item). These two tables are related such that if any purchase is made by the customer then that data of the purchased item will be stored in the purchase_table. So, if we have a record of a purchased item in the purchase_table then that purchase must be made by some customer present in the customer _table . Now, let's consider a situation where the purchase table says that an item is purchased by a customer_id say 10 but that customer_id doesn't exist in the customer_table. It is not possible that a purchase is done without a customer. Such a situation might have arisen because the data of the customer was removed from customer_table. So, if the data of the customer is removed from the customer table then the corresponding data should also be removed from the purchase table. Therefore, the changes in the database have to be updated consistently for maintaining data integrity.

Data Integrity can be maintained using constraints. These constraints define the rules according to which the operations like updation, deletion, insertions etc. have to be performed to maintain the data integrity. There are mainly four types of Data Integrity:

  1. Domain Integrity
  2. Entity Integrity
  3. Referential Integrity
  4. User-Defined Integrity

Domain Integrity

Domain refers to the range of acceptable values. It refers to the range of values that we are going to accept and store in a particular column within a database. The data types available are mainly integer, text, date etc. Any entry which we make for a column should be available in the domain of the data type.

Example: If we have to store the salary of the employees in the ' employee_table ' then we can put constraints that it should only be an INTEGER. Any entry other than integer like characters would not be acceptable and when we try to give input like this, the DBMS will produce errors.

Entity Integrity

Each row for an entity in a table should be uniquely identified i.e. idf some record is saved in the database then that record should be uniquely identified from others. This is done with the help of primary keys. The entity constraint says that the value of the primary key should not be NULL. If the value of the primary key is NULL then we can't uniquely identify the rows if all other fields are the same. Also, with the help of primary key, we can uniquely identify each record.

Example: If we have a customer database and customer_table is present there with attributes like age and name. Then each customer should be uniquely identified. There might be two customers with the same name and same age, so there might be confusion while retrieving the data. If we retrieve the data of customer named 'A ngel' then two rows are having this name and there would be confusion. So, to resolve this issues primary keys are assigned in each table and it uniquely identifies each entry of the table.

Referential Integrity

Referential Integrity is used to maintain the data consistency between two tables. Rules are made in the database structure about how foreign keys should be used to ensure that changes, addition and deletion in the database maintain the data integrity. The referential integrity constraints state that if a foreign key in the first table refers to the primary key of the second table then every value of foreign key in the first table should either be null or present in the second table.

Example: Let us suppose we have two tables of the student(student_id, name, age, course_id) and course(course_id, course_name, duration). Now, if any course_id is present in the student table which is not there in the course table then this is not allowed. The course_id in the student table should either be null or if any course_id is present in the student table then it should also be present in the course table. This is how referential integrity is maintained.

User-Defined Integrity

Sometimes these three integrity i.e domain, referential and entity integrity are not enough to maintain the data integrity. Such integrity is typically implemented through triggers and stored procedures. Triggers are a block of statements which executes automatically if any predefined events occur.

Example1: Whenever any new row is inserted into a student_table which has marks of different subjects of students then automatically new average is also calculated and stored.

We can also define some user-defined constraints and specific business rules here.

Example2: If we are creating a table for registration and we want the user's age should be greater than 21. Such type of constraint is set by the user.

Example3: Let us suppose we have an employee_table with attributes like emp_id, emp_name, job_name, salary, mobile_no. Here we have constraints that the id of an employee should always start with some specific characters like 'AfterAcademy' and then followed by digits. So, any entry which does not follow this constraint would not be acceptable.

So, today we learnt about data integrity and how it helps in maintaining the data integrity. Hope you learnt something new today.

Do share this blog with your friends to spread the knowledge. Visit our YouTube channel for more content. You can read more blogs from here .

Keep Learning :)

Team AfterAcademy!