It's common for the first line of a comma-separated or tab-separated text file to be the column names for the data that follows, so it's natural to want to take advantage of this in OPENSTRING or LOAD TABLE.

I'm sure that this would have already been considered by the SQLA development folks, so I'm wondering:
- Have you found a reason why this isn't a good idea or not feasible, so don't hold my breath?
OR
- Is this on a long list of "good ideas", but you've been adding more important stuff and you can't do everything all at once, so maybe someday down the road, no promises?

Thanks!

Loyal fan,
Dan K.

asked 15 Sep '10, 23:45

Dan%20Konigsbach's gravatar image

Dan Konigsbach
455101220
accept rate: 0%

edited 08 Oct '10, 10:05

Volker%20Barth's gravatar image

Volker Barth
40.1k361549819

I don't know about everyone else, but I'm not quite sure what you're asking for. Are you saying there should be a function to create said table 'on-the-fly' based on the column names?

(16 Sep '10, 00:08) Calvin Allen

No, if you look at the OPENSTRING WITH clause or the LOAD TABLE column-name list, you will see that these let you identify the columns in the input data, so that you can correlate them with the destination. (Technically, the LOAD TABLE column list refers to destination columns, but it is still identifying the columns in the input data.)

This means that the SQL statement needs to know the format of the incoming data, which is inherently less robust than having self-defining data.

When a CSV or TSV files is capable of being self-defining, I'm suggesting it should do so.

Understand now?

(16 Sep '10, 01:34) Dan Konigsbach

Without the row schema for OPENSTRING, we wouldn't be able to infer the datatype of the incoming data. Are you suggesting an option to return all columns as CHAR or perhaps NCHAR? We could use CHAR if the source encoding matches the CHAR encoding and NCHAR otherwise. For LOAD TABLE, I suppose we could match the column names from the data file with those in the target table and infer the datatype that way. Is that it?

(16 Sep '10, 04:07) John Smirnios

I guess it would be nice for the LOAD TABLE to use the first line of a CSV as the field names, especially if for whatever reason you had exported them in a different order than you were importing them.

(16 Sep '10, 14:34) Daz Liquid

Obviously the unload table would need an option for it too :-)

(16 Sep '10, 14:36) Daz Liquid
3

@John: IMO this is a job for proxy tables (see Karim's reply), rather than adding a whole bunch of intelligence to the other techniques.

(16 Sep '10, 15:19) Breck Carter
1

Karim's reply is pretty smart, I think it's even better than using the field names in the load table because it creates the fields too. Ta

(17 Sep '10, 08:29) Daz Liquid
More comments hidden
showing 5 of 7 show all flat view

Maybe I am completely off the mark here, but assuming your server is on Windows, could you not create a remote data access server using the Microsoft Access Text Driver as the underlying ODBC driver. I believe the driver will then interpret the first line as the column names and make the appropriate guess as to what datatype the data is. You can then create a proxy table to the csv file and use select * into ... to create the base table with the proper column names, datatypes and data.

Try the following:

1) create a file called foo.csv with the following contents

col1,col2,col3,col4,col5
1,abc,3,4,5
6,def,8,9,10

2) create a dsn called mycsvdsn using the Mictosoft Access Text Driver and set the directory to point to the directory that foo.csv resides in.

3) connect to SA and create a remote server

CREATE SERVER rem CLASS 'odbc' USING 'mycsvdsn'

4) create the proxy table

CREATE EXISTING TABLE foo AT 'rem;;;foo.csv;

5) query the proxy table to make sure the column names and data look correct

SELECT * FROM foo

6) create and load the base table

SELECT * INTO mybasetab FROM foo

7) verify that mybasetab has the correct schema and data

HTH

permanent link

answered 16 Sep '10, 13:26

Karim%20Khamis's gravatar image

Karim Khamis
5.7k53870
accept rate: 40%

edited 16 Sep '10, 16:04

OPENSTRING WITH and LOAD TABLE (...) serve multiple roles:

  1. identify the columns in the input,
  2. map the columns to the destination, and
  3. provide the schema for the input (implicitly or explicitly).

I'm suggesting that you want to sever the first role from them when input can self-identify its columns.

Case 1:

The input's first row says it has columns InA, InB, InC, InD, not necessarily in that order, and where you don't need InC. You might have

OPENSTRING WITH ( InA CHAR(30), InB CHAR(30), InD CHAR(30)) 
           OPTION(COLUMNNAMES 1 SKIP 1) 

and not care that there's an extra column or if the file actually has the columns in a different order.

Or, your destination table has columns DestP, DestQ, DestR, where InA -> DestQ, InB -> DestP, InD -> DestR. Making up a silly syntax just as an example, you might have

LOAD TABLE DestTbl (DestP COMEFROM InB, 
                    DestQ COMEFROM InA, 
                    DestR COMEFROM InD) COLUMNNAMES 1 SKIP 1

Case 2:

The input file's first row has the same column names as your destination table, but not necessarily in the same order, and it might have extra columns or missing columns. Then, you might want something like a WITH AUTO NAME option, e.g.

LOAD TABLE DestTbl (DestP, DestQ, DestR) COLUMNNAMES 1 SKIP 1 
           WITH AUTO NAME

which won't care about extra columns in the input and will set missing columns to their default.

A capability like this would make external unload/reload much more robust and flexible. The old I-SQL INPUT had some self-defining formats, and though those formats are obsolete, the concept was good. A column name header row is so common in CSVs and TSVs that (humbly, and with great respect) I'd rather not have to go through Microsoft Desktop Driver ODBC support to use them.

[All syntax examples are only to illustrate the point - not actual syntax proposals.]

permanent link

answered 16 Sep '10, 19:46

Dan%20Konigsbach's gravatar image

Dan Konigsbach
455101220
accept rate: 0%

Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×113
×18
×7
×2
×1

question asked: 15 Sep '10, 23:45

question was seen: 3,656 times

last updated: 08 Oct '10, 10:05