Tryst with Technology: SQL - Find Possible Duplicate Records in Multiple Columns

Monday, December 01, 2008

SQL - Find Possible Duplicate Records in Multiple Columns

Below is the code used to find possible duplicate records across multiple columns. Below I am trying to find duplicate rows in col1, col2 combined eg.

Consider the following ERD :-

Table data:-

Here row 2 and 3 are duplicate and the following SQL detects and finds such rows:-

SELECT a.* FROM test a INNER JOIN 
( 
    SELECT col1, col2 FROM test GROUP BY col1, col2 HAVING Count(*) >1 
) b 
ON (a.col1 = b.col1 
AND a.col2 = b.col2)

3 comments:

cloud9ine9/18/2009 10:57:00 PM
How about:

select col1, col2, count(col2) from a group by col1,col2 having count(col2) > 1
ReplyDelete
Replies
cloud9ine9/19/2009 12:54:00 AM
try it out, this will too. that's why the group by is there :)
ReplyDelete
Replies
Vyas9/19/2009 01:26:00 AM
Anil, you are right...but the point I am trying to get out here is : Your query does "NOT" show the rows containing the dupes. It tells me what is duplicated. I want to print out all rows that are duplicated. Like in the above case - my query prints out A - B 2 times.

Perhaps I have stripped off the example to a bare minimum and hence I am not able to explain the complexity involved. My work related query had sub selects - you really don't want to see PSoft queries do you ;)
ReplyDelete
Replies

Thank you for your feedback

Tryst with Technology

Monday, December 01, 2008

SQL - Find Possible Duplicate Records in Multiple Columns

3 comments:

Archive

Labels

Creative Commons License

FatWallet