Delete Duplicate Records from Table in SQL Server

Wednesday, August 13, 2008

Here is the approach I have found works best for deleting duplicate records from a table that has an IDENTITY column. For example, suppose we have an employee table with duplicate values in the EmployeeName and Salary fields.

Table name: tbl_Employee

Field Name       Field Type
----------------------------------
EmployeeID       int (IDENTITY)
EmployeeName     varchar(50)
Salary           int


Table Records

EmployeeID   EmployeeName   Salary
------------------------------------
1            AAA            15000
2            BBB            10000
3            CCC            20000
4            BBB            10000
5            CCC            20000
6            AAA            15000
7            BBB            10000
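
To try the examples on a scratch database, the sample table can be recreated with a script along these lines. This is only a sketch: IDENTITY(1,1) as seed and increment is an assumption (the table definition above only says IDENTITY), and the final query is an extra check, not part of the original steps, that lists the combinations occurring more than once.

```sql
-- Sketch: recreate the sample table (IDENTITY(1,1) is an assumption).
CREATE TABLE tbl_Employee
(
    EmployeeID   int IDENTITY(1,1),
    EmployeeName varchar(50),
    Salary       int
);

-- EmployeeID is generated automatically, so only the other two columns are supplied.
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('AAA', 15000);
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('BBB', 10000);
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('CCC', 20000);
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('BBB', 10000);
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('CCC', 20000);
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('AAA', 15000);
INSERT INTO tbl_Employee (EmployeeName, Salary) VALUES ('BBB', 10000);

-- List each EmployeeName/Salary combination that appears more than once.
SELECT EmployeeName, Salary, COUNT(*) AS Copies
FROM tbl_Employee
GROUP BY EmployeeName, Salary
HAVING COUNT(*) > 1;
```

On the sample data this should report AAA twice, BBB three times and CCC twice.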


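The following statement keeps the row with the highest EmployeeID in each EmployeeName and Salary group and deletes all the others:

DELETE
FROM tbl_Employee
WHERE EmployeeID NOT IN
(
    SELECT MAX(EmployeeID)
    FROM tbl_Employee
    GROUP BY EmployeeName, Salary
)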


Output (after executing this query)

EmployeeID   EmployeeName   Salary
------------------------------------
5            CCC            20000
6            AAA            15000
7            BBB            10000
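
Since a DELETE cannot be undone once it is committed, it may be worth checking first which rows it would remove. A minimal sketch, reusing the same subquery with SELECT in place of DELETE (the ORDER BY is only there for readability):

```sql
-- Preview: the rows that the DELETE statement would remove.
SELECT EmployeeID, EmployeeName, Salary
FROM tbl_Employee
WHERE EmployeeID NOT IN
(
    -- The highest EmployeeID of each EmployeeName/Salary group is the row that is kept.
    SELECT MAX(EmployeeID)
    FROM tbl_Employee
    GROUP BY EmployeeName, Salary
)
ORDER BY EmployeeID;
```

On the sample data this should list EmployeeID 1, 2, 3 and 4, which are exactly the rows missing from the output above.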

Another way

We can use another approach when the table does not have an IDENTITY column and the duplicate rows are identical in every column. First, we copy the unique records into a new table by using the DISTINCT keyword.

SELECT DISTINCT * INTO tempEmployee FROM tbl_Employee

Now delete all records from tbl_Employee.

TRUNCATE TABLE tbl_Employee

Now insert the unique records into tbl_Employee from tempEmployee.

INSERT INTO tbl_Employee
SELECT * FROM tempEmployee

After that, we can drop the tempEmployee table.

DROP TABLE tempEmployee
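
The four statements above leave tbl_Employee empty for a moment, so a failure halfway through could lose data. One possible safeguard, sketched here as an assumption rather than as part of the original method, is to run them inside a single transaction; in SQL Server, TRUNCATE TABLE is rolled back together with the rest of the transaction. The TRY...CATCH syntax assumes SQL Server 2005 or later.

```sql
-- Sketch: the same four steps, wrapped in one transaction.
BEGIN TRY
    BEGIN TRANSACTION;

    -- 1. Copy the distinct rows into a working table.
    SELECT DISTINCT * INTO tempEmployee FROM tbl_Employee;

    -- 2. Empty the original table.
    TRUNCATE TABLE tbl_Employee;

    -- 3. Copy the distinct rows back.
    INSERT INTO tbl_Employee
    SELECT * FROM tempEmployee;

    -- 4. Remove the working table.
    DROP TABLE tempEmployee;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo every step if any of them failed, then report the error.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    DECLARE @msg nvarchar(2048);
    SET @msg = ERROR_MESSAGE();
    RAISERROR('%s', 16, 1, @msg);
END CATCH
```

Note that TRUNCATE TABLE is refused when tbl_Employee is referenced by a foreign key; in that case DELETE FROM tbl_Employee could be used instead.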

These are the only two ways I know. If anyone knows another way, please share it.
