So I have a table which has duplicate rows for everything, but there is a catch: the duplicate elements are not consecutive and they have different primary keys. Is there a code snippet that can work for this case?
You'll need to define your table more clearly for us. What makes something a duplicate? What indexes are on that table.
A solution should be simple but it won't be quick to run if the table is large.
Get Outlook for Android
How can you have duplicates if the pk is not the same? It's part of the table as well as the rest of the data. If - and I am starting to assume here - you have a synthetic key (a single integer) and all the rest is the same, then something along this lines should do the trick:
for each <yourtable>
break by <field-1>
by <field-2>
by <field-3>
...
by <field-n>:
if not first-of(<field-n>) then delete <yourtable>.
end.
The idea is to do a break-by on all fields except the primary index field. Then, if you find a duplicate, delete it. This may not work if your table is large though.
Some times, what it makes duplicate some data, it's the iterpretation that some actor on the system gives to the data, dont you think?
Assuming data is at least 4NF, we can consider its economical and practically efficient for database storing, but perhaps some department gives the same value to datum with, let's say, KEY(CustNum, DateOperation) and they consider this as duplicate record (if customer showed up two times in a date), and Rus Paul was asked to create a data table with the concentrated data about this scenario...
Just a thought .
And [mention:e666fffb14004b29b4bad87b731999a8:e9ed411860ed4f2ba0265705b8793d05] showed what most of us would think as solution to the task at hand..
There can be many ways to find a ‘duplicate’ once one is able to define what a duplicate means, Patrick’s break-by approach can work - probably audit fields need to be excluded as well (create, last update, user) and I imagine it might have issues with ‘large objects’ but the idea is there… however, I would not simply delete the ‘duplicate’ like that as it’s PK might be used in child tables (FK) and that could make a mess out of the database :(
before deleting duplicates, consider whether you should.
if the data appear to be duplicates but maybe they aren’t, then maybe you shouldn’t be deleting business records just because they don’t conform to 4NF or some other “normal” form.
if they’re real duplicates then fine=