[This is a follow-up form the question "Find random free alphanumeric value" in the general newsgroup]: If one wants to generate a sequence of 4 random digits/upper letters in SQL, how can this be done? My intention is to use
However, as the range of the ASCII values for digits (48-57) and upper letters (65-90) is not adjacent, a simple usage of *CHAR(cast(truncnum(rand() * 36, 0) as int) + 48)* won't do. Instead, based on the generated integer value, one would have to add 48 (for char indexes < 10) or 55. As a consequence, it's necessary to call CHAR() with an if expression as argument. In order to evaluate the expression, it must be used twice, i.e. for comparison and then to add the according constant. Therefore, I have used a derived table to build the rand() values, as in the sample below:
But my attempts to use an IF EXPRESSION for that seem to fail. The syntax seems o.k. but I have the impression that the RAND values (as used in the aliased columns) are re-evaluated for each usage of the alias, and as such, are not deterministic during the evaluation of the expression. That's wrong, IMHO. An exemplary output is: 34,53,'>',1,67,'5',20,72,'L',19,67,':' Obviously, the 2nd column is not based on the value 34 that is used in the 1st column (it should be 34 + 55 = 89), and so on. As a consequence, the columns chr1 to chr4 with the desired character may return characters that are - in contrast to the requirement - neither digits nor upper letters, e.g. a colon. I can bring this down to the simple query:
IMHO, this should return a row with two identical values, with different results for each call. But with SA 12.0.0.2566, it returns different columns, such as
Only when the second expression is simplified to a repetition of the same expression, then two identical values get returned:
Resume: Basically this seems a wrong treatment of rand()-based expressions in derived tables. |
Maybe you want RAND() to be deterministic, but you will be lonely :)
http://dcx.sybase.com/index.html#1200en/dbreference/rand-function.html Once upon a time (I think) RAND was deterministic, but that bug got fixed. FWIW here's the code I use... --------------------------------------------------------------------- BEGIN DROP FUNCTION generate_random_string; EXCEPTION WHEN OTHERS THEN END; CREATE FUNCTION generate_random_string ( IN @return_length INTEGER, IN @character_set VARCHAR ( 100 ) DEFAULT '' ) RETURNS LONG BINARY NOT DETERMINISTIC BEGIN DECLARE @characters_to_use LONG BINARY; DECLARE @character_count DOUBLE; DECLARE @random_number DOUBLE; DECLARE @return_position INTEGER; DECLARE @character_position INTEGER; DECLARE @return_value LONG BINARY; CASE @character_set WHEN 'caps' THEN SET @characters_to_use = STRING ( 'QWERTYUIOPASDFGHJKLZXCVBNM' ); WHEN 'letters' THEN SET @characters_to_use = STRING ( 'QWERTYUIOPASDFGHJKLZXCVBNM', 'qwertyuiopasdfghjklzxcvbnm' ); WHEN 'numbers' THEN SET @characters_to_use = STRING ( '1234567890' ); WHEN 'alphanumeric' THEN SET @characters_to_use = STRING ( 'QWERTYUIOPASDFGHJKLZXCVBNM', 'qwertyuiopasdfghjklzxcvbnm', '1234567890' ); WHEN 'loweralphanumeric' THEN SET @characters_to_use = STRING ( 'qwertyuiopasdfghjklzxcvbnm', '1234567890' ); WHEN 'upperalphanumeric' THEN SET @characters_to_use = STRING ( 'QWERTYUIOPASDFGHJKLZXCVBNM', '1234567890' ); ELSE SET @characters_to_use = STRING ( -- omitted: single quote, back slash 'QWERTYUIOPASDFGHJKLZXCVBNM', 'qwertyuiopasdfghjklzxcvbnm', '1234567890', '`~-=!@#$%^&*()_+{}|[]:"<>?;,./' ); END CASE; SET @character_count = LENGTH ( @characters_to_use ); SET @random_number = RAND(); SET @return_position = 1; WHILE @return_position <= @return_length LOOP SET @character_position = CAST ( ROUND ( @random_number * @character_count, 0 ) AS INTEGER ); SET @return_value = STRING ( @return_value, SUBSTR ( @characters_to_use, @character_position, 1 ) ); -- MESSAGE STRING ( @return_position, ', ', @random_number, ', ', @character_position, ', ', @return_value ) TO CLIENT; SET @random_number = RAND(); SET @return_position = @return_position + 1; END LOOP; RETURN @return_value; END; -- generate_random_string --------------------------------------------------------------------- -- Testing... /* BEGIN DECLARE @seed INTEGER; DECLARE @dummy DOUBLE; -- Note: DATEDIFF returns INTEGER so the range of @seed is limited. -- However, this logic will work well past the year 2055 since -- a negative seed is OK. -- Note: The extra "seeding" crap may no longer be necessary in V12. SET @seed = DATEDIFF ( SECOND, CURRENT TIMESTAMP, '2055 10 27 00:00:00' ); SET @dummy = RAND ( @seed ); SET @dummy = RAND(); SET @dummy = RAND(); SELECT CAST ( generate_random_string ( 5 ) AS VARCHAR ) AS "1", CAST ( generate_random_string ( 5, 'caps' ) AS VARCHAR ) AS "2", CAST ( generate_random_string ( 16, 'alphanumeric' ) AS VARCHAR ) AS "3", CAST ( generate_random_string ( 15, 'loweralphanumeric' ) AS VARCHAR ) AS "4", CAST ( generate_random_string ( 30, 'upperalphanumeric' ) AS VARCHAR ) AS "5"; END; 1,2,3,4,5 'yk2{j','XGBCL','SKJhKeOmxhAhGJWF','dxw4oef9rkocygh','FDQGCWZ6RIZQMJE447Y7BP84N0QOK8' */ No, that's a misunderstanding. I do want RAND() to be not deterministic, of course. Everything else would be nonsense. But I don't think that an expression based on RAND() should be re-evaluated each time is it referenced. And I remember that Glenn has pointed out in the general newsgroup a while ago that this kind of re-evaluation is not desired. (Haven't got the actual news thread, however). Sidenote: As stated in my comment, v8 treats that kind of non-determinism correctly: Different calls of RAND() within the same query are not deterministic, computations based on their results are. General newsgroup reference: "rand() vs order by" from 2010-09-04 - here there are other problems with RAND() but possibly the same reason applies. Too deep for me... I wouldn't intentionally code multiple calls to RAND() inside a query to save my life, I have absolutely no idea how the query optimizer rearranges things for execution. I leave that for you and Glenn to discuss :) |
Just to add: The same behaviour appears with SA 11.0.1.2427, whereas older versions (e.g. 8.0.3.5574) return correct results. Some kind of over-optimization in newer versions?
I just stumbled over that again with 12.0.1.3484 - the behaviour is still the same.
Is this behaviour of rand() by design? - Anymore insights are still appreciated:)
It seems to be more by design of how the "as" keyword works: rndInd1 ist obviously just a placeholder for the casted value and so probably the select statement is rewritten by SQL Anywhere to:
select cast(truncnum(rand() * 36, 0) as int), cast(truncnum(rand() * 36, 0) as int)+0
For this statement the observed behavior is correct. The "as" defines just an alias not a variable.
If this is true, then
would be rewritten as well, and would give different results for both (same-named) columns as well - but it does not in that case. So I'm not sure if the aliasing is the source of the problem here.