Thursday, March 31, 2011

SQL Remove mutiple values but keep distinct duplicates

We have interfaces from a legacy financial application where records need to flow into the new financial application until all transactions are moved over to the new application.  To do this we have a mapping between the legacy and the new application for the GL accounts.  The new application has a consolidated GL so there are multiple accounts in the legacy system mapping to a single account in the new system.  We came across a situation where we needed to do a reverse mapping.  Since we have this one to many situation we can't always map back to a single GL account string in the legacy system.  For these situations we setup an exception report so that they could be reported back to the user community and the items processed manually but for the rest we needed to go ahead and interface the records. The other wrench in the works is that for this particular process we were only looking at particular segments of the account string.  For these segments we could have multiple mappings as well as multiple mappings that are duplicates as in the sample data below:


Entity     Function    Cost Center
---         ---             ---
111       111           A3B
111       111           B4Z
111       333          A3B
111       333          A3B
111       444          A2S

So what we want is to remove the first two rows above since they are multiple values for the one string 111.111.  We want only one of the next two rows, that is the duplicate but we want to keep it since it's a one to one map and we want the last row.

To get the list of values where we have only one mapping we can use:

--Get a unique list of combinations (this removes the duplicates so we can remove only the multiple ones next)
SELECT DISTINCT entity, [function]  INTO #tmp_1 FROM map

--Get list of new account strings that map to a single legacy value
SELECT t.entity, t.[function]  INTO #tmp_2 FROM map
   INNER JOIN #tmp_1 t
      ON t.entity = map.entity
      AND t.[function] = map.[function]
   GROUP BY map.entity, map.[function]
   HAVING count(*) = 1

  --Now get the final list of valid mappings that we can use in our joins (We have to add the DISTINCT back in to remove the duplicates again)
SELECT DISTINCT map.entity, map.[function], map.costcenter INTO #tmp_final_map FROM map
   INNER JOIN #tmp_2 t
      ON t.entity = map.entity
      AND t.[function] = map.[function]

SELECT * FROM #tmp_final_Map


DROP TABLE #tmp_1
DROP TABLE #tmp_2
DROP TABLE #tmp_final_map

No comments:

Post a Comment