Character set issue after upgrade from SQL Anywhere version 9 to 17

We have a loading address in an application: "Dialničná cesta 5, Hala D". This address has been in the application for years.

This weekend the database was upgraded from version 9 to the latest build of version 17, and now this address is empty in our application. It seems the address contains unexpected characters that were never noticed in version 9 because they caused no issues.

If I export the address to a text file using '># c:\address.txt' the text file contains the following data:

\x00D\x00i\x00a\x00l\x00n\x00i\x01\x0d\x00n\x00á\x00 \x00c\x00e\x00s\x00t\x00a\x00 \x005\x00,\x00 \x00H\x00a\x00l\x00a\x00 \x00D

So it looks like nearly every character is preceded by '\x00' (the 'č' shows up as '\x01\x0d').

The character set in the version 9 database is cp1252. The character set in the version 17 database is windows-1252. I would expect these to be the same character set, but it still seems to be handled differently.

Does anybody know why this has changed, or have a solution for fixing it?

The data is coming from a SAP ERP system.

character-set

asked 04 Sep '20, 07:32

Frank Vestjens


One Answer:


IIRC, v9 didn't do much character set conversion, and the collation was more about sorting things correctly, assuming that the application could be trusted to encode the strings correctly. Looking at the data above, it looks like your application was inserting UCS-2 or UTF-16 strings into a cp1252 database. Surprisingly (since you are using a 1252 database), your data looks like it might be UTF-16BE (big endian). However, it's hard to tell for sure because you are using a client app and translations can occur. If you use UNLOAD TABLE, you can see what the server is actually storing.
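
One quick way to test the UTF-16BE guess is to decode the exported bytes outside the database. The Python sketch below is only an illustration: it assumes the export shows the stored bytes unchanged and that the "á" in the text file is the single cp1252 byte 0xE1. The variable names are made up for the example.

```python
# Bytes as they appear in the exported text file from the question.
# Assumption: the "á" in the export is the single cp1252 byte 0xE1.
raw = (
    b"\x00D\x00i\x00a\x00l\x00n\x00i\x01\x0d\x00n\x00\xe1\x00 "
    b"\x00c\x00e\x00s\x00t\x00a\x00 \x005\x00,\x00 "
    b"\x00H\x00a\x00l\x00a\x00 \x00D"
)

# Interpreting each byte pair as one big-endian UTF-16 code unit
# recovers the readable address.
decoded = raw.decode("utf-16-be")
print(decoded)

# Read as cp1252 instead, the very first character is NUL.
as_cp1252 = raw.decode("cp1252")
print(repr(as_cp1252[:4]))
```

The byte pair 0x01 0x0D decodes to U+010D ("č"), which does not exist in cp1252, so the pairs only make sense as UTF-16BE. The leading NUL in the cp1252 reading might be why a client shows the value as empty, but that part is a guess.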

After v9, character sets were handled much more accurately and conversions are performed where necessary. I would suggest that you do a dbunload of your v9 database and take a good hard look at the data to see how it is actually encoded (I'm happy to help you figure that out if it is not obvious -- just post a few samples that don't violate any privacy). Hopefully a consistent encoding is used throughout. Then change the reload.sql script and add the correct ENCODING "xxxx" clause to each of the LOAD TABLE statements and execute the reload.sql in a new empty v17 database.


answered 04 Sep '20, 08:00

John Smirnios

Thanks a lot. I'll check this.

(05 Sep '20, 02:12) Frank Vestjens


I unloaded one table from the database and the unload shows ENCODING 'windows-1252', but the table still contains two records in the other encoding:

\x00V\x00A\x00S\x00C\x00O\x00 \x00T\x00E\x00C\x00H\x00 \x00S\x00P\x00 \x00Z\x00 \x00O\x00
\x00F\x01P\x00 \x00U\x00T\x00.\x00 \x001\x007\x008
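
Since only some rows seem to be affected, one option is to check each value on its own rather than assume one encoding per table. The Python sketch below is a rough illustration, not a tested migration tool. It assumes the unloaded text uses only `\xNN` and `\\` escapes (other escapes are not handled), that everything else is cp1252, and that a value is UTF-16BE when every high byte is 0x00 or 0x01. All function names are hypothetical.

```python
import re

# Matches a \xNN hex escape or an escaped backslash (\\).
_ESCAPE = re.compile(r"\\(?:x([0-9A-Fa-f]{2})|(\\))")


def unescape_to_bytes(text: str) -> bytes:
    """Turn \\xNN escapes into raw bytes; other characters are encoded as cp1252."""
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(text):
        out += text[pos:m.start()].encode("cp1252")
        out.append(int(m.group(1), 16) if m.group(1) else 0x5C)
        pos = m.end()
    out += text[pos:].encode("cp1252")
    return bytes(out)


def looks_like_utf16be(data: bytes) -> bool:
    """Heuristic: even length and every byte at an even offset is 0x00 or 0x01."""
    return len(data) >= 2 and len(data) % 2 == 0 and all(b <= 0x01 for b in data[0::2])


def decode_value(text: str) -> str:
    """Decode one unloaded value as UTF-16BE if it looks like it, else as cp1252."""
    data = unescape_to_bytes(text)
    if looks_like_utf16be(data):
        return data.decode("utf-16-be")
    return data.decode("cp1252")


# Second record from the comment above, plus an ordinary cp1252 value.
print(decode_value(r"\x00F\x01P\x00 \x00U\x00T\x00.\x00 \x001\x007\x008"))
print(decode_value("Plain cp1252 text"))
```

The heuristic is deliberately narrow: it only recognises characters below U+0200 and could misfire on unusual binary data, so any rows it flags would still need a manual look before being rewritten.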

(22 Oct '20, 04:55) Frank Vestjens
