Fairly often, a customer will ask me to refresh their Test database with Production. Sometimes, if the customer is performing certain tests using data stored only in the Test database, I will need to manually input data from the Test application to the Test database, after it has been refreshed with Production. It can be difficult to know for certain, all the data which needs to be contained within the newly refreshed Test database. Does anyone know a method that I might use to quickly compare the data in a certain subset of tables? I am thinking about a procedure which will output the data of approximately a dozen tables to one test file. Prior to refreshing Test, I would run this procedure against both Test and Production and use a "vdiff" type utility to compare text in the two output files. Currently, I have unload statements which output data from the target tables to individual files. Does anyone know a similar process which would result in a single file for each database? Thanks. asked 28 Apr '12, 09:01 dejstone |
We have to do this kind of compare on a regular basis to test the results of complex algorithms in our application software. We also have recovery situations where we selectively restore data. There are two methodologies that we use.
answered 29 Apr '12, 13:00 Glenn Barber |
You can use the UNLOAD SELECT statement to output the result of a query which (presumably) will join all of the necessary tables so that the output is a single file. This approach will work well only if you have a star-schema (or something close) that will limit the size of the output file and, more importantly, make it easier to determine where the missing/additional data is coming from. My point is that if you're joining ten tables, and then compare the result from two databases, it can be considerably harder to pinpoint the missing/extra data if the output file differs by tens or hundreds of thousands of rows - all because 5 rows were missing from a single table. Another way of doing the comparison is to set up proxy tables to the other database, and then compose a FULL OUTER JOIN between the like tables in each database. This avoids the need to export the database contents to a flat file, but will not perform as well for large datasets. answered 28 Apr '12, 09:45 Glenn Paulley |
Just a link to a somewhat question (and discussion):
How can I compare data from proxy tables and local tables?.
And I would surely agree with Glenn's warning in case you are about to compare the results of a heavy join - I would recommend a table-by-table comparison...