According to this reply, a LONG VARCHAR COMPRESSED column value shorter than 256 bytes will not be compressed because the length is shorter than the default INLINE 256: http://sqlanywhere-forum.sap.com/questions/1038/column-compression/1039#1039

Suppose I have a LONG VARCHAR column with the following characteristics (in particular, avg_length = 152):

```sql
SELECT *
  FROM sa_column_stats (
          tab_name  = 'ttt',
          col_name  = 'ccc',
          tab_owner = 'DBA',
          max_rows  = 1000000000 );
```

```text
table_owner                 'DBA'
table_name                  'ttt'
column_name                 'ccc'
num_rows_processed          9757647
num_values_compressed       0
avg_compression_ratio       (NULL)
avg_length                  151.89775032807609
stddev_length               2348.2860711857497
min_length                  0
max_length                  170
avg_uncompressed_length     151.89775032807609
stddev_uncompressed_length  2348.2860711857497
min_uncompressed_length     0
max_uncompressed_length     170
```

What are the pros and cons of forcing compression with COMPRESSED INLINE 0 PREFIX 0? Will that even work? ( see, I'm not afraid to look stoopid :)

Edit: Yes, it seems to work ( not completely stoopid, at least not this time :)

Here is a small test, first of an uncompressed table fff:

```sql
CREATE TABLE fff (
   pkey  INTEGER NOT NULL DEFAULT AUTOINCREMENT PRIMARY KEY,
   ccc   LONG VARCHAR NULL );

INSERT fff ( ccc )
SELECT REPEAT ( 'ABCDEFGHIJ', 15 )
  FROM sa_rowgenerator ( 1, 10000 );

COMMIT;
```

Foxhound shows it uses up 1.7M of disk space (181 bytes per row) in an 11.0.1.2276 database with a 4K page size:

```sql
CREATE TABLE DBA.fff ( -- 10,000 rows, 1.7M total = 1.7M table + 0 ext + 16k index, 181 bytes per row
   pkey /* PK */ INTEGER NOT NULL DEFAULT autoincrement,
   ccc           LONG VARCHAR NULL,
   CONSTRAINT ASA85 PRIMARY KEY ( -- 16k
      pkey ) );
```

Here is what Foxhound says for table ttt, which is identical except for COMPRESSED INLINE 0 PREFIX 0:

```sql
CREATE TABLE DBA.ttt ( -- 10,000 rows, 868k total = 344k table + 452k ext + 72k index, 89 bytes per row
   pkey /* PK */ INTEGER NOT NULL DEFAULT autoincrement,
   ccc           LONG VARCHAR COMPRESSED INLINE 0 PREFIX 0 NULL,
   CONSTRAINT ASA84 PRIMARY KEY ( -- 72k
      pkey ) );
```

Here are the compression statistics (in particular, avg_compression_ratio = 77%):

```sql
SELECT @@VERSION, *
  FROM sa_column_stats (
          tab_name  = 'ttt',
          col_name  = 'ccc',
          tab_owner = 'DBA',
          max_rows  = 1000000000 );
```

```text
@@VERSION                   '11.0.1.2276'
table_owner                 'DBA'
table_name                  'ttt'
column_name                 'ccc'
num_rows_processed          10000
num_values_compressed       10000
avg_compression_ratio       76.66666666666667
avg_length                  150.0
stddev_length               0.0
min_length                  150
max_length                  150
avg_uncompressed_length     (NULL)
stddev_uncompressed_length  (NULL)
min_uncompressed_length     (NULL)
max_uncompressed_length     (NULL)
```

So... it looks like COMPRESSED INLINE 0 PREFIX 0 can reduce disk usage quite a bit, even for a relatively short column. What are the cons?
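That 77% ratio is plausible because the test value REPEAT ( 'ABCDEFGHIJ', 15 ) is a highly repetitive 150-byte string. A quick sketch outside the database makes the same point, using Python's zlib (which is not necessarily the algorithm SQL Anywhere uses internally, so the exact ratio will differ):

```python
import zlib

# The same 150-byte value the test inserts: 'ABCDEFGHIJ' repeated 15 times.
value = b"ABCDEFGHIJ" * 15

compressed = zlib.compress(value)
ratio = 100.0 * (1 - len(compressed) / len(value))

print(f"uncompressed: {len(value)} bytes")   # 150 bytes
print(f"compressed:   {len(compressed)} bytes")
print(f"ratio:        {ratio:.1f}%")
```

A general-purpose compressor easily collapses the repeated 10-byte pattern into a handful of back-references, so even a "short" value compresses well *when it is this redundant* — which real-world data usually is not.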
Here are the results of two tests of the INSERT side of a high-throughput application, before and after COMPRESSED INLINE 0 PREFIX 0 was specified for one single LONG VARCHAR column:

```text
                                         Average          Compression  Core2 Quad   Average
                  Disk Space  Disk Space Uncompressed     Ratio        CPU Usage    Transaction Time
         Rows     for Table   per Row    Column Length    for Column   for INSERT   for INSERT
                                                                       Process      Process
------  ---------  ---------  ---------  -------------    ----------   ----------   ----------------
Before  9,703,524  5.9G       656 bytes  152 bytes        0            35%          1 sec
After      28,776  18M        657 bytes  158 bytes        12.6%        80%          30 sec
```

Some comments and observations...
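The per-row figures in the table are the striking part. A back-of-the-envelope check (assuming the 5.9G and 18M figures are binary units, 2**30 and 2**20, which matches the table's per-row numbers to within rounding) shows that compression bought essentially nothing in disk space while the rate of rows inserted collapsed:

```python
# Figures taken from the before/after table above; G and M treated as 2**30 and 2**20.
before_rows, before_disk = 9_703_524, 5.9 * 2**30
after_rows,  after_disk  = 28_776,    18 * 2**20

before_bytes_per_row = before_disk / before_rows
after_bytes_per_row  = after_disk / after_rows

print(f"before: {before_bytes_per_row:.0f} bytes/row")
print(f"after:  {after_bytes_per_row:.0f} bytes/row")

# Rows inserted in the test run dropped by roughly this factor:
slowdown = before_rows / after_rows
print(f"throughput dropped ~{slowdown:.0f}x")
```

In other words: roughly the same bytes per row before and after, a 12.6% compression ratio on the column that is swamped by blob/extension page overhead, and two orders of magnitude fewer rows inserted.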
Clearly, at least in this case, COMPRESSED INLINE 0 PREFIX 0 was a really bad idea... and perhaps it is a bad idea in all cases. Perhaps there is a good reason the defaults are INLINE 256 PREFIX 8. And perhaps the suggestion "don't use COMPRESSED on short columns" is a good one.
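One concrete reason short columns are poor compression candidates: general-purpose compressors carry fixed per-value overhead (headers, checksums, block framing), so a short value with little redundancy can actually come out *larger*. A sketch with Python's zlib (again, only an analogy — not SQL Anywhere's internal algorithm) on 150 low-redundancy bytes:

```python
import os
import zlib

# 150 random bytes stand in for a short, low-redundancy column value.
value = os.urandom(150)

compressed = zlib.compress(value)

print(f"uncompressed: {len(value)} bytes")
print(f"'compressed': {len(compressed)} bytes")  # overhead typically makes this LARGER
```

With real data averaging ~152 bytes per value, any savings on the value itself has to beat that fixed overhead *plus* the cost of moving the value off-row, which is consistent with the 657-bytes-per-row result above.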