SA-12 vs SA-16 performance on long varchar columns and extension tables

I am trying to keep this short but I have to explain some details of my/our issues. Below there is a TLDR and the Questions!

I am currently analysing a performance issue on a client system with Sybase SQL Anywhere 12/16. In short this client upgraded from sa-12 to sa-16. Reads and writes have slowed down significantly. Best case 25% slower, worst (some complex view) 1000% or no result is found within a reasonable timeframe.
I have been given access to a testing environment: One dbeng12, (old backup) one dbeng16 (newer backup).
The only key difference I have been able to find is the number of extension pages. Some table(s) have a long varchar column. This is a “Note” Field which most of the times is NULL but can contain longer texts.

I looked at the fragmentation tab in Sybase central (and the corresponding statement):

Table A in sa-12 (Pagesize 16k):
Rows: 49.800.000
Number of Pages 700.000
Number of extension Pages: 125 (not thousand)

Table A in sa-16 (Pagesize 16k):
Rows: 53.400.000
Number of Pages 751.000
Number of extension Pages: 246.000

Difference in rows and (normal) pages is due to the 6 months that passed between the two backups.
I am being assured that the long-varchar column in question has not been updated or alterd in any significant way. Which means all of the data which was present 6 months ago, is still part of the data today and therefore within the table today.

I have been reading up on blobs and extension pages. As far as I now understand the long varchar column is considered a blob. This means that as soon as this column is about to contain more than 254 bytes of data, this data will (by default) be placed on an extension page. (In short: If an Update extends above PCTFREE, the page is full or an INSERT trys to insert more than 254 bytes -> row split and the data of that rows column will be placed on an extension page.

I don’t understand how it is possible that I have almost no extension pages for this table in sa-12 and a quarter of a million in sa-16.

I have tried to reorganize the table in sa-16 and after a checkpoint and refresh the numbers almost stayed the same. The extension pages even went up by about 10.000.

Even if there was a big change within the long varchar colums (i.e. the client has added a lot more data) I have to assume that the data within the long varchar columns is here to stay.

TLDR: How can I increase read performance for tables which have a long varchar column? For most rows this column is NULL. For rows that contain data, the average length of this column is about 10.000 characters. I can’t change the table’s structure due to external circumstances.

My current Ideas:

INLINE
I have experimented with the INLINE Column attribute but if I set this to about 11.000 bytes this would mean, with 16k pagesize, one row per page in a worst case scenario.

BLOB INDEX
I have only read about this but an Index on the long varchar columns “could” increase performance. My worry here is that I would add another huge block of data containing the same information the extended pages contain, only to find data within the extended pages. I don’t know if this would do anything but my gut feeling says that this will not help.

Realigning the Columns
I have almost scrubbed this idea. I have read that a row split not only moves the “problem column” data to an extension page, but also every column that comes after that column within that row. This would mean that I could create a new long varchar column, which would now be located at the end, copy the data from the old column, drop the old one and rename the new column. This would move my “problem column” to the end of the row and in theory mean that even if a row split happens, all columns remain on the page except for the long varchar columns which would be located on the extension pages. This idea might be based on false information as I have read that in case of a row split only the “culprit” column gets ostracized. But I might be mixing row splits and row continuations which is why I wanted to present this solution.

I would be very thankful for any input or ideas!

Best regards
Daniel

### UPDATE I’ll add information here that I forgot and was asked:

The schema of both databases stayed mostly the same
I compared the execution plans, at first there were differences, after updating the statistics, querys in both systems used the same route/indexes
Regarding parallelism: I have yet to try SET OPTION PUBLIC.MAX_QUERY_TASKS = '1' but did notice a difference in behaviour regarding the number of concurrent tasks according to the mpl statistcs
Trimming the data within the long varchar columns resulted in better performance. SA-16 was almost as fast as SA-12.
INLINE and PREFIX are both NULL for all long varchar columns

asked 16 Jul '18, 07:28

Daniel AsaDev
41●1●1●6
accept rate: 0%

edited 16 Jul '18, 09:17

Just two questions to help others with further details:

Is the schema of both databases for the according table identical ? Do you use the INLINE and PREFIX clauses for the according long varchar column, or are the defaults used, i.e. PREFIX 8 INLINE 256 for character data?
Have you compared the plans of some of the queries with large differences in v12 vs. v16? This is to make sure the worse performance is really due to blob storage and not to other issues - IMHO, v16 sometimes performes much worse because of "more intra-query parallelism" but I can't tell whether that applies here.

(16 Jul '18, 08:16) Volker Barth

Replies hidden

Just to re-enforce Volker's point - do a quick test on v16 with

SET OPTION PUBLIC.MAX_QUERY_TASKS = '1';

before anything else.

(16 Jul '18, 08:25) Justin Willey

Thanks for the quick reply!

The schema of both databases has not changed much. INLINE and PREFIX are both NULL, so it should be the 254/8 Default.

About the parallelism I noticed one thing. The servers are started with the -gns Option to output the multiprogramming level statistics. I did notice that on sa16, when running a longer query, the number of tasks plummets almost right away to the minimum. Also that the concurrency was often at 2 with algorithm=0 on sa-16. On sa-12 I always saw concurrency[1] or 0 and algorithm=2.

I will try to SET OPTION PUBLIC.MAX_QUERY_TASKS = '1'; once I get access to their system again.

BUT I don’t think that the Parallelism is the culprit here. I did one test. I ran an UPDATE Statement to trim the long varchar columns in both sa-12 and sa16. I reduced every long varchar column to 20 characters. (I changed the data not the type of the column). After that statement ran I checked performance again. Almost every query I tried in sa16 was close to the performance in sa12.

PS: There was one more oddity I forgot to mention: Every execution plan I checked I noticed that in SA-16 I get twice as many cache hits and reads in the subtree statistics.

(16 Jul '18, 09:16) Daniel AsaDev

It might be worth posting your server start-up parameters. We used to do lots of messing with threads etc in v11 (never used 12) but things went much better in v16 when we stripped all that out and left everything on auto.

(16 Jul '18, 10:40) Justin Willey

Testinvironment SA12
-n placeholder -c 9G -ca 0 -cc -cr -cv -ti 1440 -gnl 50 -gnh 1610 -gns -os 10M c:\placeholder\placeholder.db
Testinvironment SA16
-n placeholder -c 9G -ca 0 -cc -cr -cv -ti 1440 -gnl 50 -gnh 1610 -gns -os 10M -x tcpip(ServerPort=2638) c:\placeholder\placeholder.db
Production Invironment SA16
-n placeholder -c 50G -ca 0 -cc -cr -cv+ -ti 1440 "c:\placeholder\placeholder.db" -gnl 40 -gnh 140 -gns -x TCPIP(ServerPort=2638) -o "c:\placeholder\placeholder.txt"

(16 Jul '18, 11:55) Daniel AsaDev

It's hard to diagnose dynamic performance problems by looking at

static statistics like number of pages and
manual snapshots of a few server-level dynamic performance statistics like thread count . . .

. . . better to use tools that continuously capture data at the server, connection and/or query level.

The best tools are the

third-party Foxhound database performance monitor for SQL Anywhere (runtime statistics at the server and connection level)
SQL Anywhere procedure profiler (time spent executing each procedure statement)

Some folks (not me) have good luck with the

SQL Anywhere Application Profiling / Database Tracing (time spent and plans used for each query)
SQL Anywhere Index Consultant (suggests indexes that might have helped speed up the workload you run)

Then there's this . . .

SQL Anywhere health and availability Monitor

FWIW, when helping clients diagnose their performance problems, I always start with Foxhound... and more often than not, it's all that's needed :)

(18 Jul '18, 15:04) Breck Carter

Replies hidden

I never tried Foxhound, I have seen Screenshots and Blogs about it but I would have to requisition it, and that might take some doing.

About the "SQL Anywhere procedure profiler" With this I am familiar. I have successfully used this on multiple occasions. I did not try this this time because something as simple as "SELECT * FROM TableA JOIN TableB ON (...)" has a longer execution time on SA12 then on SA16. So I didn’t see the point in checking which procedures gets executed how many times and what lines “waste” the most execution time.

I did check the execution plans and they were virtually identical. Except for one thing: The CacheHits and Cache reads were always twice as high on SA16 then on the SA12.

The Index Consultant is something I did not try yet. This is definitely worth a try. I will also try the Monitor, though I have been using dbconsole a lot to check certain metrics but did not find anything unusual when comparing the two test environments.

(23 Jul '18, 10:14) Daniel AsaDev

Is the char-collation for both databases the same?

(24 Jul '18, 09:42) Martin

Replies hidden

Which settings do I need to check? I have found NcharCharSet which is set to UTF-8 and NcharCollation is UCA

(26 Jul '18, 03:54) Daniel AsaDev

Well, you are relating to "long varchar" data, not "long nvarchar", so the relevant charset and collation are those for non-wide char...

What do the following properties reveal on both databases:

-- For CHAR data:
select db_property('CharSet'),
  db_property('Collation'),
  db_extended_property('Collation', 'AccentSensitivity'),
  db_extended_property('Collation', 'CaseSensitivity'),
  db_extended_property('Collation', 'PunctuationSensitivity'),
  db_extended_property('Collation', 'Properties'),
  db_extended_property('Collation', 'Specification');

-- For the NCHAR collation:
select db_property('NcharCharSet'),
  db_property('NCharCollation'),
  db_extended_property('NCharCollation', 'AccentSensitivity'),
  db_extended_property('NCharCollation', 'CaseSensitivity'),
  db_extended_property('NCharCollation', 'PunctuationSensitivity'),
  db_extended_property('NCharCollation', 'Properties'),
  db_extended_property('NCharCollation', 'Specification');

The most important property is 'CharSet': Is that the same on both databases? (I.e. changing from a single byte code page like Windows-1252 to a multi-byte encoding like UTF-8 will increase your storage requirements - although usually modestly for common Western languages).

(26 Jul '18, 05:42) Volker Barth

showing 3 of 10 show all flat view

Be the first one to answer this question!

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

sa-12 ×438
performance ×275
sa-16 ×261
long-running-query ×18
fragmentation ×9

question asked: 16 Jul '18, 07:28

question was seen: 1,641 times

last updated: 26 Jul '18, 05:43

SAP SQL Anywhere

SA-12 vs SA-16 performance on long varchar columns and extension tables

Follow this question

Related questions