There is no better performance test data than production data, BUT sometimes production data must be "anonymized" before being used by developers.
Does anyone have any suggestions about how to do this so that the SQL Anywhere query optimizer will pick more-or-less the same plans and deliver more-or-less the same performance?
Some possible rules come to mind: any two equal input values must result in two equal output values; any two different input values must result in two different output values; and the distribution of output values must "look" more-or-less the same as the distribution of input values.
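
To make those rules concrete, here is a minimal Python sketch of one way they might be satisfied for a string column. It is only an illustration under assumptions, not a SQL Anywhere feature: the `anonymize` and `frequency_profile` helpers, the `SECRET_KEY` value and the sample names are all hypothetical. A keyed hash gives equal outputs for equal inputs and (barring an extremely unlikely collision) different outputs for different inputs, so the number of distinct values and the per-value frequencies carry over. It does not preserve sort order or value length, so range predicates and row sizes may still behave differently.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be kept away from the developers.
SECRET_KEY = b"replace-with-a-private-key"


def anonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a pseudonym for value: same input always gives same output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep 64 bits of the digest; collisions are possible in principle
    # but extremely unlikely for realistic column cardinalities.
    return digest[:16]


def frequency_profile(values):
    """Per-value row counts, largest first, ignoring the values themselves."""
    return sorted(Counter(values).values(), reverse=True)


# Made-up sample column with a skewed distribution.
original = ["Smith", "Jones", "Smith", "Lee", "Smith", "Jones"]
masked = [anonymize(v) for v in original]

# Rule 1: equal inputs map to equal outputs.
same_in_same_out = masked[0] == masked[2] == masked[4]
# Rule 2: different inputs map to different outputs.
distinct_preserved = len(set(masked)) == len(set(original))
# Rule 3 (partly): the frequency skew is unchanged.
skew_preserved = frequency_profile(masked) == frequency_profile(original)
```

Comparing `frequency_profile` before and after is a cheap check that the skew the optimizer sees in equality predicates has survived; it says nothing about ordering, which is the part this approach gives up.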
Does this only matter for columns in primary keys, foreign keys and indexes?
Should statistics be recreated after anonymizing, or is there a danger that recreating them will throw everything off?
asked 17 Nov '14, 14:02