creating cluster IDs
alice.jaques
Newbie
Joined: 01 Mar 2010, Online Status: Offline, Posts: 1
Topic: creating cluster IDs
Posted: 03 Mar 2010 at 2:38am
Hi, I have a large file (more than 5 million records) that includes several duplicates of the same person for different events. Currently, the duplicate people do not have a unique ID that identifies them as duplicates, and I need to create one.
Does anyone know how to create a unique ID for duplicate records? I intend to use several variables to identify which people are duplicates: name, DOB, postcode, country of birth etc. I have tried the Identify Duplicate Cases tool, but it only gives the first or last primary case a 1 and all other duplicates for that person a 0. What I need is a shared ID per person: this group of 20 duplicates has ID number 1, this group of 14 duplicates has ID number 2, etc.
Cheers,
Alice
Jeena
Newbie
Joined: 23 Dec 2009, Location: India, Online Status: Offline, Posts: 25
Posted: 03 Mar 2010 at 5:03pm
I can make some broad suggestions which should point you toward the right course of action.
I would first use the Identify Duplicate Cases tool so that the first case in each group gets a 1 flag and the duplicates below it get zeros. Then I would use the LAG function along with IF and COMPUTE statements to assign each group the unique value that you require. Something like this:

IF ($CASENUM = 1) UniqueID = 1.
IF ($CASENUM <> 1 AND PrimaryFirst = 1) UniqueID = LAG(UniqueID)+1.
IF ($CASENUM <> 1 AND PrimaryFirst = 0) UniqueID = LAG(UniqueID).
EXE.

Thank you
Regards,
Jeena
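
For anyone who wants to check the numbering logic on a small sample before running it on the full file, here is a rough sketch of the same idea in Python. It is not SPSS syntax. It assumes the records are sorted by the matching variables, which is what keeps each person's duplicates on adjacent rows so the running counter works. The function name `assign_unique_ids` and the sample field names are made up for illustration. `PrimaryFirst` and `UniqueID` mirror the variables in the syntax above.

```python
def assign_unique_ids(records, key_fields):
    """Sort by the matching fields, flag the first case of each group,
    and carry a running group number forward (the LAG idea)."""

    def key_of(rec):
        return tuple(rec[field] for field in key_fields)

    # Sorting puts all duplicates of one person on adjacent rows.
    ordered = sorted(records, key=key_of)

    result = []
    previous_key = None
    unique_id = 0
    for rec in ordered:
        current_key = key_of(rec)
        # 1 for the first case in a group, 0 for the duplicates below it.
        primary_first = 1 if current_key != previous_key else 0
        # Same as LAG(UniqueID)+1 on a new group, LAG(UniqueID) otherwise.
        unique_id += primary_first
        result.append({**rec, "PrimaryFirst": primary_first, "UniqueID": unique_id})
        previous_key = current_key
    return result


# Hypothetical sample data; the field names are assumptions.
people = [
    {"name": "Ann Lee", "dob": "1980-01-02", "postcode": "AB1 2CD", "event": "A"},
    {"name": "Bob Ray", "dob": "1975-05-06", "postcode": "EF3 4GH", "event": "B"},
    {"name": "Ann Lee", "dob": "1980-01-02", "postcode": "AB1 2CD", "event": "C"},
]
tagged = assign_unique_ids(people, ["name", "dob", "postcode"])
for row in tagged:
    print(row["name"], row["event"], row["PrimaryFirst"], row["UniqueID"])
```

Note that this only groups exact matches on the chosen fields. Differences in spelling or spacing would put the same person into separate groups, so the matching variables may need cleaning first.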