I have a requirement to format the results of any SELECT statement so all columns and all rows are returned as a single field. Due to how this is integrating with vendor software, I cannot pre or post-process the query outside of this function. As an example, say we have a query like this: SELECT first_name, member_since, last_seen FROM Person The result set would normally look something like this: =============================================== | first_name | member_since | last_seen | =============================================== | Glenn | 2009-11-10 | 2012-02-01 | | Breck | 2009-11-08 | 2012-01-31 | | Volker | 2009-11-11 | 2012-01-30 | The data format standard dictates each row should be enclosed in square brackets and separated by commas. Each column should be separated by a '^' character. Therefore, if I call my function like this: SELECT query_to_dataformat( 'SELECT first_name, member_since, last_seen FROM Person' ); The result should be: '[Glenn^2009-11-10^2012-02-01],[Breck^2009-11-08^2012-01-31],[Volker^2009-11-11^2012-01-30]'
This same thing needs to work, regardless of the original query's complexity, number of columns, or column names. My initial thought on this task was to use two techniques:
Unfortunately, I am in a position where I have a variable holding a variable name and no direct way to reference the actual value. Here is my existing code (Note: this will produce extra commas and carets, but I removed the additional code that deals with those issues to make it more brief): ALTER FUNCTION "DBA"."query_to_dataformat"( IN input_sql LONG VARCHAR ) RETURNS LONG VARCHAR BEGIN DECLARE @dataformat_string LONG VARCHAR; DECLARE @column_number INTEGER; DECLARE @column_name VARCHAR(128); DECLARE @column_list DYNAMIC SCROLL CURSOR FOR SELECT column_number, name FROM sa_describe_query( input_sql ); -- Execute input SQL and loop over results FOR results_loop AS results CURSOR USING input_sql DO -- Every row begins with a '[' character SET @dataformat_string = STRING( @dataformat_string, '[' ); -- Loop over each column in the result set OPEN @column_list; column_loop: LOOP FETCH NEXT @column_list INTO @column_number, @column_name; -- Leave loop if we processed all columns IF ( SQLCODE <> 0 ) THEN LEAVE column_loop; END IF; -- Append column data to string; Columns are separated by '^' SET @dataformat_string = STRING( @dataformat_string, '^', @column_name ); END LOOP; CLOSE @column_list; -- Every row ends with a ']' character SET @dataformat_string = STRING( @dataformat_string, '],' ); END FOR; -- Return final string required data format RETURN @dataformat_string; END In the above example, @column_name would be holding "first_name", "member_since", or "last_seen". That is the variable name I need to reference from the FOR statement, but it doesn't seem possible to resolve it to the actual local FOR variable, which is holding "Breck" or "Glenn". This results in the output string of the following: '[first_name^member_since^last_seen],[first_name^member_since^last_seen],[first_name^member_since^last_seen]'
Am I going about this the wrong way? Is there a simpler solution I am missing? Environment: OS: Windows Server 2008 64bit Database: SQL Anywhere 11.0.1.2724 CPU: 2x Intel Xeon 10-Core RAM: 64GB Disk: 320GB FusionIO (database), 640GB FusionIO (log/temp) |
You did ask for simple, right? ...just so I have that right :) CREATE TABLE person ( first_name VARCHAR ( 100 ), member_since DATE, last_seen DATE ); INSERT person VALUES ( 'Glenn', '2009-11-10', '2012-02-01' ); INSERT person VALUES ( 'Breck', '2009-11-08', '2012-01-31' ); INSERT person VALUES ( 'Volker', '2009-11-11', '2012-01-30' ); COMMIT; CREATE FUNCTION query_to_dataformat ( IN @select LONG VARCHAR ) RETURNS LONG VARCHAR BEGIN DECLARE @unload LONG VARCHAR; DECLARE @result LONG VARCHAR; SET @unload = STRING ( 'UNLOAD ', @select, ' INTO VARIABLE @result ', ' DELIMITED BY ''^'' ', ' ROW DELIMITED BY ''],['' ', ' ESCAPES OFF HEXADECIMAL OFF QUOTES OFF' ); EXECUTE IMMEDIATE @unload; RETURN STRING ( '[', LEFT ( @result, LENGTH ( @result ) - 2 ) ); END; SELECT query_to_dataformat ( 'SELECT first_name, member_since, last_seen FROM Person' ); query_to_dataformat('SELECT first_name, member_since, last_seen FROM Person') [Glenn^2009-11-10^2012-02-01],[Breck^2009-11-08^2012-01-31],[Volker^2009-11-11^2012-01-30] ...and surely Breck has taken the time to blog about that solution in detail here ... and may not have taken the time to let the forum know:)
(08 Feb '12, 03:31)
Volker Barth
In case there's a need to output the result set description (say, as "first row") as well (though that's not specified by @Ralph), I guess one could use the UNLOAD with the APPEND option to concatenate the description and the result set into one string...
(08 Feb '12, 03:34)
Volker Barth
This is fantastic, great solution! I think I stumbled upon the UNLOAD statement, but must have missed the fact it could be redirected to a variable instead of a file on disk. I am marking this solution as the "accepted" one because it appears to work for all the weird edge cases.
(13 Feb '12, 11:05)
Ralph Wissing
|
You are on the right track but what you need to build from the sa_describe_query output is a statement that you will then use on an EXECUTE IMMEDIATE call. example (not tested): declare @query long varchar; declare @result long varchar; -- compose list of columns -- generates string like "c1",'^',"c2",'^',...,'^',"cn" select list( string( '"', name, '"' ), ',''^'',' order by column_number ) into @query from sa_describe_query( input_sql ); -- build rest of the statement set @query = 'select list( string( ''['',' || @query || ','']'' ), '','' ) into @result from ( ' || input_sql || ') dt'; -- generate @result from the rows selected in the input_sql -- assumption is that input_sql is a select statement execute immediate @query; return @result; Update #1: I spent a few minutes to write up a more general solution (which I have tested :-): create or replace function f_wrap_query_results ( in @select long varchar, -- must be a valid SELECT query in @orderby long varchar default '', -- optional list of columns in query in @col_sep long varchar default ',', in @col_beg long varchar default '"', in @col_end long varchar default '"', in @row_sep long varchar default char(10), in @row_beg long varchar default '{', in @row_end long varchar default '}' ) returns long varchar begin declare @query long varchar; declare @result long varchar; -- prepare delimiters for use in generated statement if isnull( @col_sep, '' ) != '' then set @col_sep = ',''' || replace( @col_sep, '''', '''''' ) || ''','; else set @col_sep = ','; end if; if isnull( @col_beg, '' ) != '' then set @col_beg = '''' || replace( @col_beg, '''', '''''' ) || ''','; end if; if isnull( @col_end, '' ) != '' then set @col_end = ',''' || replace( @col_end, '''', '''''' ) || ''''; end if; if isnull( @orderby, '' ) != '' then set @orderby = ' order by ' || @orderby; end if; -- compose list of columns -- generates string like "c1",'^',"c2",'^',...,'^',"cn" (where '^' is @col_sep) select list( string( @col_beg, '"', name, '"', -- see Encoding note below @col_end ), @col_sep order by column_number ) into @query from sa_describe_query( @select ); -- build rest of the statement set @query = 'select list( string( @row_beg,' || @query || ',@row_end ), @row_sep' || @orderby || ' ) into @result from ( ' || @select || ') dt'; -- generate @result from the rows selected in the input_sql -- assumption is that input_sql is a select statement execute immediate @query; return @result; end; Using this solution, to get your output use: select f_wrap_query_results( input_sql, null, '^', '', '', ',', '[', ']' ); Functions to wrap the query results as other data types can easily be built on top of this function. Examples: create or replace function f_wrap_query_results_as_html ( in @select long varchar, -- must be a valid SELECT query in @orderby long varchar default '', -- optional list of columns in query in @row_sep long varchar default char(10) ) returns long varchar begin return f_wrap_query_results( @select, @orderby, '', '<td>', '</td>', @row_sep, '<tr>', '</tr>' ); end; create or replace function f_wrap_query_results_as_json ( in @select long varchar, -- must be a valid SELECT query in @orderby long varchar default '' -- optional list of columns in query ) returns long varchar begin return f_wrap_query_results( @select, @orderby, ',', '''', '''', ',', '[', ']' ); end; create or replace function f_wrap_query_results_as_csv ( in @select long varchar, -- must be a valid SELECT query in @orderby long varchar default '' -- optional list of columns in query ) returns long varchar begin return f_wrap_query_results( @select, @orderby, ',', '''', '''', char(10), '', '' ); end; Encoding: Note that formatting (e.g. date types) and quoting / escaping of special characters in the column values is left as an exercise to the reader! :-) Hint: you could use the domain_name of the column to decide how to change the expression string('"', name, '"') to do the appropriate encoding of the column value. For example, if encoding as HTML then the expression could be changed to string('html_encode("', name, '")'). Example usage: select f_wrap_query_results( 'select top 3 table_id, file_id, count from systable order by table_id' ); outputs: {"1","0","274"} {"2","0","2011"} {"3","0","303"} select f_wrap_query_results_as_html( 'select top 3 table_id, file_id, count from systable order by table_id' ); outputs: <tr><td>1</td><td>0</td><td>274</td></tr> <tr><td>2</td><td>0</td><td>2011</td></tr> <tr><td>3</td><td>0</td><td>303</td></tr> select f_wrap_query_results_as_json( 'select top 3 table_id, file_id, count from systable order by table_id' ); outputs: ['1','0','274'],['2','0','2011'],['3','0','303'] select f_wrap_query_results_as_csv( 'select top 3 table_id, file_id, count from systable order by table_id', 'count desc' ); outputs: '2','0','2011' '3','0','303' '1','0','274' Note that this last example uses the @orderby parameter to change the order in which the output is listed. Update #2: A warning: any time that you are using EXECUTE IMMEDIATE on a string that is generated from user input you should be careful not to open your application to SQL Injection attacks - always sanitize your inputs! 2
Looks like a valid suggestion for a new set of builin functions, right? - Developped with a kind of public version control via the edit history:) Great solution, anyway!
(03 Feb '12, 05:05)
Volker Barth
Mark, brilliant solution! I really like the generalized function, too. I suspect it would be useful in a quite a few scenarios. One issue I'm having now is that sa_describe_query() truncates the "name" field to VARCHAR(128). Some queries have large string concatenations and functions without an alias. This screws up the SELECT list of columns from being built correctly, since some parenthesis (and possibly quotes) are not closed, causing them to become unbalanced and the EXECUTE IMMEDIATE call fails with syntax, or other errors. For example: select f_wrap_query_results( 'select top 3 string(table_id, table_id, table_id, table_id, table_id, table_id, table_id, table_id, table_id, table_id, table_id, table_id ), file_id, count from systable order by table_id'); Although the underlying issue, even if there was a way to fix that, may be the lack of an alias, in general. For example: select f_wrap_query_results( 'select top 3 table_id || table_id, table_id, file_id, count from systable order by table_id'); Results in "Derived table 'dt' has no name for column 1"
(03 Feb '12, 13:50)
Ralph Wissing
Replies hidden
The root issue in both of your examples is that there is a column in the derived table that does not have a name.... hence the "Derived table 'dt' has no name for column 1" error that you are getting in your second example. Your first example has also hit a idiosyncrasy in the server's unparser - when a column is not given an explicit name the server makes up a name for it from the text of the column expression and in this case it has chosen to add a newline (n) after "string(" ... and since identifiers are not allowed special characters the server generates a syntax error when trying to parse the generated statement. I currently can't think of a solution other than to require that all column expressions in the input query have explicitly named (and unique) column names. Perhaps someone else can think of a solution to this issue?
(03 Feb '12, 14:37)
Mark Culp
One could check whether the column_name returned by sa_describe_query is an identifier or not, and add an alias automatically in the latter case, something like: select name, case when name not regexp '\w+' then 'MyColumn_' || column_number else name end case as name_or_alias from sa_describe_query( 'select top 3 table_id || table_id, "table_id", file_id as [alias with space] from systable order by table_id') returns
When building the query, one could then use the name_or_alias to build the query and use the original name to build the column header... Note: The check with 'w+' is a simplification of the real rules for identifiers.
(03 Feb '12, 17:16)
Volker Barth
|
Just a hint:
The LIST() aggregate may be helpful to "flatten" the multiple columns into one string, and you could use "^" as the according delimiter...