We have a loading address in an application where the address is "Dialničná cesta 5, Hala D". This address has been in the application for years. This weekend the database was upgraded from version 9 to the latest build of version 17, and now this address is empty in our application. It seems that the address consists of weird characters, but this was never noticed in version 9 because it caused no issues.

If I export the address to a text file using '># c:\address.txt', the text file contains the following data:

\x00D\x00i\x00a\x00l\x00n\x00i\x01\x0d\x00n\x00á\x00 \x00c\x00e\x00s\x00t\x00a\x00 \x005\x00,\x00 \x00H\x00a\x00l\x00a\x00 \x00D

So it looks like every character is preceded by '\x00'.

The character set in the version 9 database is cp1252. The character set in the version 17 database is windows-1252. So I would suspect it is the same character set, but it still seems to be handled differently.

Does anybody know why this has changed, or have a solution for how to change this? The data is coming from a SAP ERP system.

asked 04 Sep '20, 07:32 Frank Vestjens
IIRC, v9 didn't do much character set conversion, and the collation was more about sorting things correctly, assuming that the application could be trusted to encode the strings correctly. Looking at the data above, it looks like your application was inserting UCS-2 or UTF-16 strings into a cp1252 database. Surprisingly (since you are using a 1252 database), your data looks like it might be UTF-16BE (big endian). However, it's hard to tell for sure because you are using a client app and translations can occur. If you use UNLOAD TABLE, you can see what the server is actually storing.

After v9, character sets were handled much more accurately, and conversions are performed where necessary.

I would suggest that you do a dbunload of your v9 database and take a good hard look at the data to see how it is actually encoded (I'm happy to help you figure that out if it is not obvious; just post a few samples that don't violate any privacy). Hopefully a consistent encoding is used throughout. Then change the reload.sql script, add the correct ENCODING "xxxx" clause to each of the LOAD TABLE statements, and execute reload.sql in a new, empty v17 database.

answered 04 Sep '20, 08:00 John Smirnios

Thanks a lot. I'll check this.
(05 Sep '20, 02:12)
Frank Vestjens
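The UTF-16BE reading suggested in the answer can be checked against the sample from the question. The following is a minimal Python sketch, not part of the original thread. It assumes that each `\xNN` in the exported text stands for one raw byte and that every other character is a single cp1252 byte; the helper name `escaped_dump_to_bytes` is hypothetical.

```python
import re

def escaped_dump_to_bytes(dump: str, charset: str = "cp1252") -> bytes:
    """Turn an escaped dump back into raw bytes.

    Assumption: '\\xNN' is one raw byte; any other character is a
    single byte in `charset`.
    """
    out = bytearray()
    pos = 0
    for m in re.finditer(r"\\x([0-9A-Fa-f]{2})", dump):
        out += dump[pos:m.start()].encode(charset)  # literal characters
        out.append(int(m.group(1), 16))             # escaped byte
        pos = m.end()
    out += dump[pos:].encode(charset)
    return bytes(out)

# Sample copied from the question.
sample = (
    r"\x00D\x00i\x00a\x00l\x00n\x00i\x01\x0d\x00n\x00á\x00 "
    r"\x00c\x00e\x00s\x00t\x00a\x00 \x005\x00,\x00 "
    r"\x00H\x00a\x00l\x00a\x00 \x00D"
)

raw = escaped_dump_to_bytes(sample)
print(raw.decode("utf-16-be"))  # prints: Dialničná cesta 5, Hala D
```

Under these assumptions the bytes decode cleanly as big-endian UTF-16. Note that the pair `\x01\x0d` is U+010D (č), a character that cp1252 cannot represent, which would be consistent with the application storing wide strings in the first place.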
I unloaded 1 table from the database and it is showing ENCODING 'windows-1252', but the table still contains 2 records with the different character set:

\x00V\x00A\x00S\x00C\x00O\x00 \x00T\x00E\x00C\x00H\x00 \x00S\x00P\x00 \x00Z\x00 \x00O\x00 \x00F\x01P\x00 \x00U\x00T\x00.\x00 \x001\x007\x008
(22 Oct '20, 04:55)
Frank Vestjens