Test on subset with SQL-queried values

Aug 3, 2014 at 6:30 PM

how would you set up a test where referential integrity of a dwh can be tested?
I have a set of IDs in a fact-table which must be a subset of IDs of a dimension-table. The SQL DISTINCT-queries are not always equal, therefore the test on equality fails.

The test makes sense as soon as key constraints are not set for a dwh due to performance-reasons.

Best regards,
Aug 4, 2014 at 4:59 PM
After quick rethinking I found that similarly performing is another approach:

Count all rows of the fact table.
Count all rows of the fact table inner joined with the dimension table.
Compare both results.

So there might be no need for such set-comparisons with sql-queries values.
Aug 5, 2014 at 9:28 AM
Personally, I enforce this kind of integrity rules in the architecture of my ETL ... so I usually don't test this.

If I would have to execute the test that you're describing, I'd go for a comparison of an inner join and a left outer join (fact to dimension).

Anyway, I think it's an interesting case for a new kind of tests where subset/contains could be used to compare two resultSets. Somthing that I should investigate further.
Marked as answer by Xnyl on 9/25/2014 at 6:38 AM