My company has developed software that needs to load data transferred from a server to clients using XMPP. The client uses SQL Anywhere for its database; the server uses a different database product entirely. Our process works by keeping track of which rows have been transferred from one machine to the other for the various data types a client can subscribe to. When a client is added to the system for the first time, all of the data on the server to date gets downloaded. From then on, only the rows that change get sent.

Our code is written in C# and we transmit batches of data over the wire from the server to the client. On the client, we iterate over the data in the batch one row at a time and run a stored procedure which performs an "upsert" operation on a table.

We've built the application and it's running, but we're not happy with the performance of loading data of one particular type. There are millions of rows of that type that have to be downloaded to the client and inserted into the SQL Anywhere database on the first run. Our code is running as fast as it can loading data one row at a time. We need to find a way to speed it up, preferably by sending all of the rows of this type in the batch to the database in one I/O.

I've been kicking around the idea of writing a DLL in C to speed this process up. The C# code would convert the objects in a batch into an array that the DLL could bulk-load ("BCP") into a temporary table, after which the rows could be upserted into the target table.

I can do all of this, except that I don't know how to do the BCP into the temporary table. I've read the documentation on Embedded SQL and nothing in there pops out at me as a way to do this. And when I search the documentation for "BCP", all I get back are references to the LOAD TABLE statement and the INPUT statement. The former requires the data to be in a file on the file system and the latter is for Interactive SQL only. Not what I'm looking for.

Can anyone point me in the right direction?

P.S.
It's important that the actions of the C DLL participate in the same transaction used by the C# code. Can that be done?
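For context, the stage-then-upsert plan described in the question could look roughly like this on the SQL side. This is a hypothetical sketch, not code from the question: the table and column names are invented, and MERGE (available in SQL Anywhere 11 and later) is just one way to express the upsert.

```sql
-- Stage the batch in a temporary table, then upsert in one statement.
CREATE LOCAL TEMPORARY TABLE staging (
    pk INTEGER PRIMARY KEY,
    c1 INTEGER
) ON COMMIT PRESERVE ROWS;

-- ... bulk-load the batch into "staging" here ...

MERGE INTO target t
USING staging s ON ( t.pk = s.pk )
WHEN MATCHED THEN UPDATE SET t.c1 = s.c1
WHEN NOT MATCHED THEN INSERT ( pk, c1 ) VALUES ( s.pk, s.c1 );
```

Because the staging table and the MERGE run on the same connection, the whole operation stays inside one transaction.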
LOAD TABLE does have a "USING CLIENT FILE" clause that will allow it to read from a file on the client machine: http://dcx.sybase.com/index.html#sa160/en/dbreference/load-table-statement.html*d5e60003 Take note of some of the requirements in the 'Privileges' section, though.
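A minimal sketch of that clause, with an invented table and file path (the privilege requirements from the linked 'Privileges' section still apply):

```sql
-- Hypothetical example: load a comma-delimited file that lives on
-- the client machine into table t1 ( pk, c1 ).
-- Requires the READ CLIENT FILE privilege and client-side file
-- access to be allowed.
LOAD TABLE t1 ( pk, c1 )
USING CLIENT FILE 'c:\\data\\batch.csv'
FORMAT TEXT
DELIMITED BY ',';
```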
From ESQL, you can also do wide "puts" to insert multiple rows in one request. I'm not sure if the .NET driver has a way of exposing it natively -- it might. See: http://dcx.sybase.com/index.html#1201/en/dbreference/put-statement.html I added an answer that shows a bulk insert sample using our .NET provider.
(19 Apr '13, 10:13)
Reg Domaratzki
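The wide put mentioned above might look roughly like the following Embedded SQL sketch. This is heavily abbreviated and hypothetical: real code must allocate a SQLDA with one descriptor entry per column per row, bind the host arrays, and check SQLCODE after every statement; see the linked PUT statement documentation for the exact syntax.

```c
/* Sketch of an ESQL "wide put": several rows inserted per request. */
/* "sqlda" is assumed to be a SQLDA already populated with          */
/* nrows * ncols bound host variables.                              */
EXEC SQL BEGIN DECLARE SECTION;
short nrows;
EXEC SQL END DECLARE SECTION;

EXEC SQL DECLARE ins_curs CURSOR FOR SELECT pk, c1 FROM t1;
EXEC SQL OPEN ins_curs;

nrows = 100;  /* rows carried in the SQLDA for this PUT */
EXEC SQL PUT ins_curs USING DESCRIPTOR sqlda ARRAY :nrows;

EXEC SQL CLOSE ins_curs;
```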
John: Thanks! We'll be looking at that.
(19 Apr '13, 10:55)
TonyV
Have you tried using the SABulkCopy class in the SQL Anywhere .NET provider? The following sample code (lacking any error checking for simplicity) cranks in 1000 rows at once.

using System;
using System.Data;
using iAnywhere.Data.SQLAnywhere;

namespace bulkInsert
{
    class Program
    {
        static void Main(string[] args)
        {
            SAConnection conn = new SAConnection( "Data Source=dba_sql_12" );
            conn.Open();

            // Assume the following table definition:
            // CREATE TABLE t1 ( pk integer primary key, c1 integer )
            DataTable table = new DataTable("t1");
            table.Columns.Add(new DataColumn("pk", typeof(int)));
            table.Columns.Add(new DataColumn("c1", typeof(int)));
            for( int i = 0; i < 1000; i++ )
            {
                table.Rows.Add(i, i);
            }

            using( SABulkCopy bulkCopy = new SABulkCopy(conn) )
            {
                bulkCopy.BulkCopyTimeout = 600; // in seconds
                bulkCopy.DestinationTableName = "t1";
                bulkCopy.WriteToServer(table);
            }
            conn.Close();
        }
    }
}

Here's another sample that uses a DataReader instead of a DataTable as the input to bulkCopy.WriteToServer(). In your case, the DataReader can point to a DataReader that you opened on the connection to the source database.

using System;
using System.Data;
using iAnywhere.Data.SQLAnywhere;

namespace bulkInsert
{
    class Program
    {
        static void Main(string[] args)
        {
            SAConnection conn = new SAConnection( "Data Source=dba_sql_12" );
            conn.Open();

            SACommand cmd = new SACommand( "select row_num, row_num+1 from sa_rowgenerator(1,1000)", conn );
            SADataReader reader = cmd.ExecuteReader();

            using( SABulkCopy bulkCopy = new SABulkCopy(conn) )
            {
                bulkCopy.BulkCopyTimeout = 600; // in seconds
                bulkCopy.DestinationTableName = "t1";
                bulkCopy.WriteToServer(reader);
            }
            conn.Close();
        }
    }
}

Reg: Actually, we looked at that when we first started using SQL Anywhere last summer, and the performance was abysmal.
(19 Apr '13, 10:22)
TonyV
There was a recent bug fix for SABulkCopy which describes an 'OutOfMemoryException', but the underlying problem was described as "SABulkCopy was loading the whole source table before copying to destination table" and it was fixed by making all WriteToServer methods respect batch size. Versions fixed: 12.0.1.3790, 11.0.1.2871, 10.0.1.4346. It sounds plausible that "loading the whole source table" could be a performance issue, but I'm not familiar enough with the code to comment with any certainty.
(19 Apr '13, 10:33)
John Smirnios
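With one of the fixed provider versions listed above, the batch size can be capped explicitly via the provider's BatchSize property. A minimal, hypothetical sketch (reusing the conn and reader variables from the answer's second sample; the value 5000 is an arbitrary choice):

```csharp
// Hypothetical sketch: cap how many rows SABulkCopy sends per batch
// so the whole source result set is not buffered in memory at once.
// "conn" is an open SAConnection; "reader" is an open SADataReader.
using( SABulkCopy bulkCopy = new SABulkCopy(conn) )
{
    bulkCopy.DestinationTableName = "t1";
    bulkCopy.BatchSize = 5000;        // rows per round trip
    bulkCopy.BulkCopyTimeout = 600;   // in seconds
    bulkCopy.WriteToServer(reader);
}
```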
I have revisited the performance of SABulkCopy and it's still slower than the code we've written to insert rows into a table one row at a time. And in the test case, the table has no indexes. To be clear, our code connects to the database and inserts the rows one at a time. If I'm not mistaken, this is the fastest way to perform the operation short of a bulk load.
(19 Apr '13, 14:10)
TonyV
Have you tried using SELECT FROM OPENSTRING(...)? That should avoid any per-parameter handling, as the data can be supplied via a file or a blob variable...
(19 Apr '13, 18:44)
Volker Barth
You may find some ideas in some of the articles about LOAD TABLE and BCP on my blog; here's a sampling... UNLOAD and LOAD Via Named Pipes
Doesn't that sound like a solution for some sync technology like MobiLink (which would batch inserts/updates by default, AFAIK)?
Just to ask: Are you aware that as of v12, the INSERT ... VALUES statement can handle multiple rows in one step by specifying several row value lists in the VALUES clause?
That might be a way to batch inserts on the SQL side, as well, without having to use temporary tables.
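For illustration, the multi-row form looks like this (table t1 and its two columns are invented for the example):

```sql
-- Multi-row INSERT (v12 and later): several rows in one statement.
INSERT INTO t1 ( pk, c1 ) VALUES
    ( 1, 10 ),
    ( 2, 20 ),
    ( 3, 30 );
```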
Using the OPENSTRING function comes to my mind, too.
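The OPENSTRING idea could be sketched like this, assuming a comma-delimited blob variable and an invented target table t1; the client would send the whole batch as one string or blob parameter.

```sql
-- Hypothetical sketch: parse a delimited blob variable as a row set
-- and insert it in a single statement.
CREATE VARIABLE @batch LONG VARCHAR;
SET @batch = '1,10\n2,20\n3,30';

INSERT INTO t1 ( pk, c1 )
SELECT *
FROM OPENSTRING ( VALUE @batch )
     WITH ( pk INTEGER, c1 INTEGER ) AS src;
```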
Yes, I know about INSERT being able to insert multiple rows of data, but the problem is how do you get that to work from C#? You'd need a parameter for every column of every row to be inserted, which would probably limit the number of rows in a batch to roughly MaxNumberOfParameters / NumberOfColumns. I don't know what the first number is, but we're talking about operating on at least 500 rows in a batch, maybe more.

Regarding MobiLink, I don't think we have a license, and the word "Mobi" in the name made me think of cell phones. This is a PC application, so it never occurred to me to look at it.

Finally, thanks for the information on the OPENSTRING function. I didn't know that existed.