SA10 to SA16, UTF8 encoding, "right truncation of string data" error

Our DB contains Chinese characters (UTF-8). During migration, some truncated character at the end of a vchar field caused "right truncation of string data" error.

By "Truncated Character" I mean, say, a 3-byte character was truncated when it was inserted in the old DB due to field size limit, leaving 2 meaningless bytes at the end of the string.

Seems that SA16, when loading the data in, would treat the meaningless 2 bytes as a 3-byte character, hence exceeded the vchar field size.

Tried these migration methods, all failed:

migration wizard in SC (Sybase Central) 16, read old DB by ODBC
open the SA10 DB files directly by SC16 and let it convert
unload the SA10 DB by SC10 into *.dat files, then LOAD into a new DB in SA16

Except turning off the "string_rtruncation" option, is there any setting to tell SA16 just take the exact bytes that were in the SA10 old DB, "garbage out garbage in"?

asked 04 Aug '13, 00:32

wcl
46●1●1●4
accept rate: 0%

edited 04 Aug '13, 07:22

flat view

3 Answers:

active answers oldest answers newest answers popular answers

"Right truncation of string data" doesn't mean that a character has been truncated. It means that the string you are inserting doesn't fit within the column's defined width and therefore some characters had to be truncated from the string (and, btw, SA never truncates in the middle of a multibyte character). By default, CHAR columns have byte-length semantics so, for example, a column might need to be declared as CHAR(3) to fit a single Chinese character.

If you are unloading from a UTF8 database and reloading the same data into a new UTF8 database with the very same schema, it's not obvious how you are getting this error.

To provide more assistance, I would need to know the exact data for the row that fails to be inserted, the definition of the table involved, the statements used to unload/load the data and the collations of the old & new databases.

-john.

permanent link

answered 04 Aug '13, 00:44

John Smirnios
12.0k●3●96●166
accept rate: 37%

If you are unloading from a UTF8 database and reloading the same data into a new UTF8 database with the very same schema

Yes, this is what I am doing.

I know "Right truncation of string data" doesn't mean that a character has been truncated.

My case is, I encountered "Right truncation of string data" during migration, then found the exact row and column in the old DB that caused this error, and found by observation that it has (before migration) a truncated character at the end. Removing the truncated character made the migration error go away.

I don't know why the character was truncated in the old DB in the first place. Perhaps by the Java program that uses the DB.

Not sure how much I can provide (table definition, exact data). But I will see what I can do when back to office on Monday.

Some basic settings:

SA 10, Win2k3 Ent Ed. R2 SP2, 32bit

SA 16, Win2k8 St Ed. SP2, 32 bit

(below are true for both old and new DB)

DB page size 2K

CHAR collation sequence UTF8

CHAR character set encoding UTF-8

NCHAR collation sequence UCA

NCHAR character set encoding UTF-8

(04 Aug '13, 01:20) wcl

Replies hidden

What does happen when you try to unload/reload the database on SA 10? Or when you update the rows with "wrong data" to the same value in the old database? Does that fail, too?

(04 Aug '13, 06:01) Volker Barth

No error occurred when unloading the old DB from SA10 to *.dat files. Haven't tried to reload to SA10 though.

I did tried to update the truncated string to another row of the same table in SA10. No error occurred. This was what I did:

In Interactive SQL GUI, select * from the table, find out the row from the displayed result, right-click on the truncated string, choose "copy cell".
click on same column of another row, paste the value.
run "commit"

(04 Aug '13, 07:33) wcl

When you found the row with the truncated character, was the character written out using hex escape notation (eg 'xE6x80')? Whenever UNLOAD TABLE encounters a truncated character at the end of a string, it uses escapes otherwise the file may not be parseable when it is being loaded. The next byte which would likely be a closing quote or column delimiter would be interpreted as a follow byte for the truncated character. Data from the next column could then be interpreted as part of the current column.

I experimented with a v10 database containing a column with a truncated character at the end of a string and it did write it out with escapes.

(04 Aug '13, 08:00) John Smirnios

Yes, after unloading to *.dat files, the truncated character was written out like that: \xE6\x80

(04 Aug '13, 21:38) wcl

Just tested, unloading and reloading from/to SA10 also failed.

So, this is not a migration issue between SA10 and SA16. It's an issue (bug?) of how UNLOAD and LOAD functions handle truncated multi-byte character?

My LOAD statement, which was generated by the UNLOAD function of the SC10 GUI.:

LOAD TABLE "owner"."table1" ("col1", "col2") FROM 'c:/temp/unload/400.dat' FORMAT 'ASCII' QUOTES ON ORDER OFF ESCAPES ON CHECK CONSTRAINTS OFF COMPUTES OFF STRIP OFF DELIMITED BY ',' ENCODING 'UTF-8' go

(05 Aug '13, 00:41) wcl

showing 1 of 6 show all flat view

If the column is defined as varchar the length is default defined as bytes not characters, so the UTF-8 string might require more bytes to store the same characters in the migrated DB. One workaround is to define the varchar with an explicit character term like

varchar(3 characters)

permanent link

answered 08 Jan '14, 10:41

Martin
9.0k●130●169●257
accept rate: 14%

flat view

Hi, I have the same problem upgrading from ASA 9.0.X to ASA 11 or above. The problem is that default for string_rtruncation option is changed from Off to On. So you have to set this option before reloading data using:

set option public.string_rtruncation = 'Off' go

Good luck

Giorgio Papagno

permanent link

answered 05 Aug '13, 02:19

Giorgio Papagno
306●2●4●11
accept rate: 20%

IMHO this won't fix the data, but only suppress the warning message. So there'll be no hint that some truncation has happened.

(05 Aug '13, 03:00) Reimer Pods

Thanks for the above suggestion. But as explained by the other commentator, I don't quite prefer this workaround.

I don't mind losing those truncated character. They are now meaningless anyway.

But I fear I would miss some other truncation error during the migration.

(05 Aug '13, 06:58) wcl

flat view

Your answer

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

sa-16 ×260
sa-10 ×119
migration ×55
utf8 ×14

question asked: 04 Aug '13, 00:32

question was seen: 9,503 times

last updated: 08 Jan '14, 10:41

SAP SQL Anywhere

SA10 to SA16, UTF8 encoding, "right truncation of string data" error

LOAD TABLE "owner"."table1" ("col1", "col2") FROM 'c:/temp/unload/400.dat' FORMAT 'ASCII' QUOTES ON ORDER OFF ESCAPES ON CHECK CONSTRAINTS OFF COMPUTES OFF STRIP OFF DELIMITED BY ',' ENCODING 'UTF-8' go

Follow this question

Related questions