Thursday, February 23, 2012

difference between soft referential integrity and hard referential integrity in teradata


One of my colleagues had come across the issue of  soft RI  vs hard RI .  Even i was curious to find that this concept is widely used in Teradata , we are having lack of knowledge on how it works  including me :-)

So i did some analysis on this concept with example and wanted to share with everyone over here.



Hard Referential Integrity -

It involves manual creation of Referential links like "foreign key ... references"....  Please find an example below between department & employee table .

-- create department table & employee table and have referential integrity set explicity  between them as follows

Create table department
(
Dept_no integer not null,
Dept_name varchar(30),
Dept_loc varchar(50)
)
Unique primary index(dept_no);

--- insert some sample records into  Department table

insert into department values (101,'Sales','mumbai);
insert into department values (102,'Accounts','mumbai);
insert into department values (103,'Human Resources','mumbai);


Create  table Employee
(
Emp_num integer not null,
Emp_name carchar(30) not null,
Dept_no integer not null,
Constraint foreign_emp_dept foreign key ( Dept_no)  references department(dept_no)
)
unique primary index(emp_num);

Now , try to insert following record into employee table as below

insert into Employee values (123456,'swapnil',104);
-- following statement fails because  employee table refers to dep_no column in department table to check for value 104.  Since it wont be able to find the value it will fail with referential integrity error.

This type of referential integrity is called Hard referential integrity.



SOFT referential integrity.
( thanks to James Lee  for correcting me :) )

This is handled at system level by optimizer  and we need to specify "with no check option"
Create table department
(
Dept_no integer not null,
Dept_name varchar(30),
Dept_loc varchar(50)
)
Unique primary index(dept_no);

--- insert some sample records into  Department table

insert into department values (101,'Sales','mumbai);
insert into department values (102,'Accounts','mumbai);
insert into department values (103,'Human Resources','mumbai);


Create  table Employee
(
Emp_num integer not null,
Emp_name carchar(30) not null,
Dept_no integer not null
,Constraint foreign_emp_dept foreign key ( Dept_no)  references WITH NO CHECK OPTION department(dept_no)  
)
unique primary index(emp_num);

insert into Employee values (123456,'vinay',101);
insert into Employee values (123456,'Sachin',104);


We can achieve soft  referential integrity by using  inner join between department & Employee tables

sel
dept.dept_no,
emp.emp_num,
emp.emp_name
from
employee emp
inner join   ----> here inner join does soft referential integrity & picks only matching columns
department dept
on
dept.dept_no = emp.dept_no;

advantages of Soft RI over Hard RI is that there is  that there is no overhead of maintaining Referential integrity tables in case of Soft RI & it would help in better performance in case of tables with large values and there is no constraints put on referenced columns when their values change.

Soft referential integrity is ideal in case of large datawarehhouses becuase of ETL mechanisms which cleanse unnecessary records .

Well, its upto you to decide which is better option considering the restrictions and performance concerns when these tables tend to grow