Our DB contains Chinese characters (UTF-8). During migration, some truncated character at the end of a vchar field caused "right truncation of string data" error.

By "Truncated Character" I mean, say, a 3-byte character was truncated when it was inserted in the old DB due to field size limit, leaving 2 meaningless bytes at the end of the string.

Seems that SA16, when loading the data in, would treat the meaningless 2 bytes as a 3-byte character, hence exceeded the vchar field size.

Tried these migration methods, all failed:

  • migration wizard in SC (Sybase Central) 16, read old DB by ODBC
  • open the SA10 DB files directly by SC16 and let it convert
  • unload the SA10 DB by SC10 into *.dat files, then LOAD into a new DB in SA16

Except turning off the "string_rtruncation" option, is there any setting to tell SA16 just take the exact bytes that were in the SA10 old DB, "garbage out garbage in"?

asked 04 Aug '13, 00:32

wcl's gravatar image

wcl
46114
accept rate: 0%

edited 04 Aug '13, 07:22


"Right truncation of string data" doesn't mean that a character has been truncated. It means that the string you are inserting doesn't fit within the column's defined width and therefore some characters had to be truncated from the string (and, btw, SA never truncates in the middle of a multibyte character). By default, CHAR columns have byte-length semantics so, for example, a column might need to be declared as CHAR(3) to fit a single Chinese character.

If you are unloading from a UTF8 database and reloading the same data into a new UTF8 database with the very same schema, it's not obvious how you are getting this error.

To provide more assistance, I would need to know the exact data for the row that fails to be inserted, the definition of the table involved, the statements used to unload/load the data and the collations of the old & new databases.

-john.

permanent link

answered 04 Aug '13, 00:44

John%20Smirnios's gravatar image

John Smirnios
9.0k377114
accept rate: 39%

1

If you are unloading from a UTF8 database and reloading the same data into a new UTF8 database with the very same schema

Yes, this is what I am doing.

I know "Right truncation of string data" doesn't mean that a character has been truncated.

My case is, I encountered "Right truncation of string data" during migration, then found the exact row and column in the old DB that caused this error, and found by observation that it has (before migration) a truncated character at the end. Removing the truncated character made the migration error go away.

I don't know why the character was truncated in the old DB in the first place. Perhaps by the Java program that uses the DB.

Not sure how much I can provide (table definition, exact data). But I will see what I can do when back to office on Monday.

Some basic settings:

SA 10, Win2k3 Ent Ed. R2 SP2, 32bit

SA 16, Win2k8 St Ed. SP2, 32 bit

(below are true for both old and new DB)

DB page size 2K

CHAR collation sequence UTF8

CHAR character set encoding UTF-8

NCHAR collation sequence UCA

NCHAR character set encoding UTF-8

(04 Aug '13, 01:20) wcl
Replies hidden

What does happen when you try to unload/reload the database on SA 10? Or when you update the rows with "wrong data" to the same value in the old database? Does that fail, too?

(04 Aug '13, 06:01) Volker Barth

No error occurred when unloading the old DB from SA10 to *.dat files. Haven't tried to reload to SA10 though.

I did tried to update the truncated string to another row of the same table in SA10. No error occurred. This was what I did:

  1. In Interactive SQL GUI, select * from the table, find out the row from the displayed result, right-click on the truncated string, choose "copy cell".

  2. click on same column of another row, paste the value.

  3. run "commit"

(04 Aug '13, 07:33) wcl

When you found the row with the truncated character, was the character written out using hex escape notation (eg 'xE6x80')? Whenever UNLOAD TABLE encounters a truncated character at the end of a string, it uses escapes otherwise the file may not be parseable when it is being loaded. The next byte which would likely be a closing quote or column delimiter would be interpreted as a follow byte for the truncated character. Data from the next column could then be interpreted as part of the current column.

I experimented with a v10 database containing a column with a truncated character at the end of a string and it did write it out with escapes.

(04 Aug '13, 08:00) John Smirnios

Yes, after unloading to *.dat files, the truncated character was written out like that: \xE6\x80

(04 Aug '13, 21:38) wcl

Just tested, unloading and reloading from/to SA10 also failed.

So, this is not a migration issue between SA10 and SA16. It's an issue (bug?) of how UNLOAD and LOAD functions handle truncated multi-byte character?

My LOAD statement, which was generated by the UNLOAD function of the SC10 GUI.:

LOAD TABLE "owner"."table1" ("col1", "col2") FROM 'c:/temp/unload/400.dat' FORMAT 'ASCII' QUOTES ON ORDER OFF ESCAPES ON CHECK CONSTRAINTS OFF COMPUTES OFF STRIP OFF DELIMITED BY ',' ENCODING 'UTF-8' go

(05 Aug '13, 00:41) wcl
showing 1 of 6 show all flat view

If the column is defined as varchar the length is default defined as bytes not characters, so the UTF-8 string might require more bytes to store the same characters in the migrated DB. One workaround is to define the varchar with an explicit character term like

varchar(3 characters)
permanent link

answered 08 Jan '14, 10:41

Martin's gravatar image

Martin
8.6k119151237
accept rate: 14%

Hi, I have the same problem upgrading from ASA 9.0.X to ASA 11 or above. The problem is that default for string_rtruncation option is changed from Off to On. So you have to set this option before reloading data using:

set option public.string_rtruncation = 'Off' go

Good luck

Giorgio Papagno

permanent link

answered 05 Aug '13, 02:19

Giorgio%20Papagno's gravatar image

Giorgio Papagno
3062410
accept rate: 20%

1

IMHO this won't fix the data, but only suppress the warning message. So there'll be no hint that some truncation has happened.

(05 Aug '13, 03:00) Reimer Pods

Thanks for the above suggestion. But as explained by the other commentator, I don't quite prefer this workaround.

I don't mind losing those truncated character. They are now meaningless anyway.

But I fear I would miss some other truncation error during the migration.

(05 Aug '13, 06:58) wcl
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×209
×114
×45
×10

question asked: 04 Aug '13, 00:32

question was seen: 4,076 times

last updated: 08 Jan '14, 10:41