The Open Code Project

SQL: Synchronizing Data Sources with Joins - Finding Changed Records

March 18th,2010 by Allan Bogh

While working with Active Directory automation projects, one of the many things I run into is the need to sync multiple data sources. Usually we would perform a join using User Management Resource Administrator (UMRA). While this type of join produces 3 types of results, one being that the record is located within both data sources, it doesn't provide a method to determine which records have been modified.

I will provide a method to take 2 data sources and determine which records have been modified. If the record is within the resulting table then you will need to find out which information has changed, otherwise all of the information matches and you do not need to process the record. This method will work for any type of data; from Oracle, to SQL, or AD and CSV. 

If you have a source record from the authoritative resource with information like

 
ID
FirstName
LastName
1
Allan
Bogh
 
And a corresponding record in another system (doesn’t have to be SQL) like
 
ID
FirstName
LastName
1
Al
Bogh
 
You can use SQL to do a quick check to see if anything has changed. If a record is returned then you would use UMRA (or another scripting language) to determine what information changed.
 
This first method uses two dynamic tables within SQL. You would run this statement for every record that you have. After the AND statement you would put all of the columns you want to check, make sure to put ORs between each.
 
SELECT * 
FROM (SELECT '1' AS ID,'Allan' AS FN, 'Bogh' AS LN) AS T1
JOIN (SELECT '1' AS ID, 'Al' AS FN, 'Bogh' AS LN) AS T2
ON T1.ID=T2.ID 
    AND (
         T1.FN<>T2.FN
         OR T1.LN<>T2.LN
)
 
If you have a SQL source database and another SQL table with similar information, then you can perform the same query like this:
 
SELECT * 
FROM SourceDatabase.Users AS T1
JOIN UMRADatabase.Users AS T2
ON T1.ID=T2.ID 
    AND (
         T1.FN<>T2.FN
         OR T1.LN<>T2.LN
)
 
Or if you want to be specific with the columns you pull back from your SQL datasources you could use
 
SELECT T1.ID,
      T1.FN,
      T1.LN,
      T2.ID,
      T2.FN,
      T2.LN 
FROM SourceDatabase.Users AS T1
JOIN UMRADatabase.Users AS T2
ON T1.ID=T2.ID 
    AND (
         T1.FN<>T2.FN
         OR T1.LN<>T2.LN
)
 
Or another way:
 
SELECT *
FROM (SELECT ID, FN, LN FROM SourceDatabase.Users) AS T1
JOIN (SELECT ID, FN, LN FROM UMRADatabase.Users) AS T2
ON T1.ID=T2.ID 
    AND (
         T1.FN<>T2.FN
         OR T1.LN<>T2.LN
)
 
The result will be any records which have been changed:
 
ID
FN
LN
ID
FN
LN
1
Allan
Bogh
1
Al
Bogh

 

Comments (0)


:

:

:


: formatting help
Close

Formatting instructions:

You can use <a> tags but everything else will be stripped and your comment will look funny.

I swear, don't use html except the <a> tag or else some random star will supernova. Remember, we have a star right next to us, so don't try it.

This isn't bbcode either so don't use it. That is all.