We have a loading address in an application where the address is "Dialničná cesta 5, Hala D". This address has been in the application for years.

This weekend the database was upgraded from version 9 to the latest build of version 17, and now this address is empty in our application. It seems that the address consists of odd characters that were never noticed in version 9 because they caused no issues.

If I export the address to a text file using '># c:\address.txt', the text file contains the following data:

\x00D\x00i\x00a\x00l\x00n\x00i\x01\x0d\x00n\x00á\x00 \x00c\x00e\x00s\x00t\x00a\x00 \x005\x00,\x00 \x00H\x00a\x00l\x00a\x00 \x00D

So it looks like every character is preceded by '\x00'.

The character set in the version 9 database is cp1252. The character set in the version 17 database is windows-1252. So I would suspect it is the same character set, but it still seems to be handled differently.

Does anybody know why this has changed, or have a solution for it?

The data is coming from a SAP ERP system.

asked 04 Sep, 07:32

Frank

IIRC, v9 didn't do much character set conversion and the collation was more about sorting things correctly assuming that the application could be trusted to encode the strings correctly. Looking at the data above, it looks like your application was inserting UCS2 or UTF16 strings into a cp1252 database. Surprisingly (since you are using a 1252 database), your data looks like it might be UTF16BE (big endian). However, it's hard to tell for sure because you are using a client app and translations can occur. If you use UNLOAD TABLE, you can see what the server is actually storing.
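To check the UTF16BE reading against the sample from the question, here is a minimal Python sketch. It assumes the "á" shown in the export is the single byte 0xE1 (which is how cp1252 would display it); everything else is copied from the exported text. It also shows that "č" has no cp1252 representation at all, which would explain why the application could not have stored this address as plain cp1252.

```python
# Bytes as shown in the exported file. Assumption: the "á" is the byte 0xE1.
raw = (
    b"\x00D\x00i\x00a\x00l\x00n\x00i\x01\x0d\x00n\x00\xe1\x00 "
    b"\x00c\x00e\x00s\x00t\x00a\x00 \x005\x00,\x00 "
    b"\x00H\x00a\x00l\x00a\x00 \x00D"
)

# Each pair of bytes is one big-endian UTF-16 code unit.
address = raw.decode("utf-16-be")
print(address)  # Dialničná cesta 5, Hala D


def fits_cp1252(text: str) -> bool:
    """Return True if every character of text can be encoded in cp1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


print(fits_cp1252(address))  # False: U+010D (č) is not in cp1252
```

The pair `\x01\x0d` is the code unit U+010D, which is "č". That is the one character in the address that the export could not display as a normal letter.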

After v9, character sets were handled much more accurately and conversions are performed where necessary. I would suggest that you do a dbunload of your v9 database and take a good hard look at the data to see how it is actually encoded (I'm happy to help you figure that out if it is not obvious -- just post a few samples that don't violate any privacy). Hopefully a consistent encoding is used throughout. Then change the reload.sql script and add the correct ENCODING "xxxx" clause to each of the LOAD TABLE statements and execute the reload.sql in a new empty v17 database.
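For the "take a good hard look at the data" step, a rough Python sketch like the following could flag rows in an unloaded data file that look like escaped UTF-16BE. It assumes the file shows NUL bytes as the literal text `\x00`, as in the samples posted in this thread. The sample rows and the function name `suspicious_lines` are made up for illustration.

```python
import re

# An escaped NUL followed by one literal character, at least three times in a
# row: the signature of ASCII-range text stored as big-endian UTF-16.
NUL_PAIR = re.compile(r"(?:\\x00[^\\]){3,}")


def suspicious_lines(lines):
    """Yield (line_number, line) for lines that look like escaped UTF-16BE."""
    for number, line in enumerate(lines, start=1):
        if NUL_PAIR.search(line):
            yield number, line.rstrip("\n")


# Hypothetical rows, only to show the heuristic at work.
sample = [
    "1,'Plain cp1252 text'\n",
    "2,'\\x00H\\x00a\\x00l\\x00a\\x00 \\x00D'\n",
]

for number, line in suspicious_lines(sample):
    print(number, line)
```

This is only a heuristic. It will miss values made up mostly of non-ASCII characters, so it is a way to find candidates for a closer look, not proof that a table uses a single encoding.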

answered 04 Sep, 08:00

John Smirnios

Thanks a lot. I'll check this.

(05 Sep, 02:12) Frank

I unloaded one table from the database and it shows ENCODING 'windows-1252'.

However, the table still contains two records in the other encoding:

\x00V\x00A\x00S\x00C\x00O\x00 \x00T\x00E\x00C\x00H\x00 \x00S\x00P\x00 \x00Z\x00 \x00O\x00
\x00F\x01P\x00 \x00U\x00T\x00.\x00 \x001\x007\x008

(22 Oct, 04:55) Frank
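The two records above can be read the same way. The Python sketch below assumes they are escaped UTF-16BE: it turns the `\xHH` escapes back into bytes and decodes them. The helper names `unescape` and `decode_utf16be` are illustrative. The first record has an odd number of bytes (it ends in a lone `\x00`), so the sketch drops the incomplete last code unit. That value may have been cut off at the column length, but this is a guess.

```python
import re

ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def unescape(text: str) -> bytes:
    """Turn escaped text (literal \\xHH sequences) back into raw bytes."""
    out = bytearray()
    pos = 0
    for match in ESCAPE.finditer(text):
        out += text[pos:match.start()].encode("cp1252")  # literal characters
        out.append(int(match.group(1), 16))              # escaped byte
        pos = match.end()
    out += text[pos:].encode("cp1252")
    return bytes(out)


def decode_utf16be(text: str) -> str:
    """Decode escaped text as UTF-16BE, ignoring an incomplete final byte."""
    raw = unescape(text)
    if len(raw) % 2:
        raw = raw[:-1]
    return raw.decode("utf-16-be")


records = [
    r"\x00V\x00A\x00S\x00C\x00O\x00 \x00T\x00E\x00C\x00H\x00 \x00S\x00P\x00 \x00Z\x00 \x00O\x00",
    r"\x00F\x01P\x00 \x00U\x00T\x00.\x00 \x001\x007\x008",
]

for record in records:
    print(decode_utf16be(record))
# VASCO TECH SP Z O
# FŐ UT. 178
```

As with "č" in the original address, the "Ő" (U+0150) in the second record does not exist in cp1252. Both flagged rows contain a character that cp1252 cannot represent, which may be why only these rows were written as UTF-16.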

question asked: 04 Sep, 07:32

question was seen: 170 times

last updated: 22 Oct, 04:55