dbremote on windows + linux (Character encoding problem)

Hello,

we use SQL Anywhere 10.0.1.4103.

The consolidated server runs on Windows, and most of the remote sites run Windows too. We have recently added a few Linux remotes, and there we see a replication problem with some characters.

When a Windows remote enters text in the database, replication sometimes fails on the Linux systems with this message:

I. 2013-09-03 09:03:00. INSERT INTO DBA.CardEntries(CardEntries,ClientCard,CreationDate,Amount,
                              REMOTENAME,Salesperson,EntryProcessed,BatchID)
                        VALUES ('O000CB','J001UW','12:26:15.446918 2013/08/30',169,'SiteTest1','Sil Sch.r',0,NULL)
E. 2013-09-03 09:03:00. SQL-Anweisung fehlgeschlagen: (-131) Syntaxfehler bei 'Sil Sch.,0,NULL)' in Zeile 3
E. 2013-09-03 09:03:00. Wird übersprungen:
E. 2013-09-03 09:03:00. INSERT INTO DBA.CardEntries(CardEntries,ClientCard,CreationDate,Amount,
                              REMOTENAME,Salesperson,EntryProcessed,BatchID)
                        VALUES ('O000CB','J001UW','12:26:15.446918 2013/08/30',169,'SiteTest1','Sil Sch.r',0,NULL)

The "offending" text is the salesperson entry, which should be "Sil Schär"; apparently it stumbles over the ä character.

All databases have the same encodings:

CHAR Collation: 1251LATIN1
CHAR Encoding: windows-1252
NCHAR Collation: UCA
NCHAR Encoding: UTF-8

Replication is done via email. In the dbremote connection string we specify nothing special, only uid, pwd, eng and dbn.

I think the Linux dbremote uses the wrong character set when decoding the emails it receives and then tries to apply the SQL operation in the wrong encoding.

Strangely, messages with the ü character are replicated correctly, but the ä seems to cause problems.

Any ideas how to solve it?

asked 16 Sep '13, 09:50

ASchild

"ß" won't be a problem in CH, right?

(16 Sep '13, 10:01) Volker Barth

Yes, we don't use the "ß", but then, we have é, è, à and ç for our French customers as well... :)

(16 Sep '13, 10:17) ASchild

3 Answers:

Try adding "charset=none" to the dbremote connection string on all nodes. John's comment that dbremote assumes the OS charset when reading a message is correct, and Volker points to a very relevant section of the readme file.
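
As a rough illustration of where the setting goes, here is a minimal Python sketch that assembles the string passed to `dbremote -c`. The parameter names (uid, pwd, eng, dbn, charset) are the ones mentioned in this thread; the helper name and all values are hypothetical placeholders, and the sketch does not handle values that themselves contain semicolons.

```python
def dbremote_connection_string(uid, pwd, eng, dbn, charset="none"):
    """Join connection parameters into a 'key=value;...' string.

    charset defaults to "none", i.e. the setting suggested in this answer.
    """
    params = [
        ("uid", uid),
        ("pwd", pwd),
        ("eng", eng),
        ("dbn", dbn),
        ("charset", charset),
    ]
    return ";".join(f"{key}={value}" for key, value in params)


# Hypothetical values, for illustration only.
conn = dbremote_connection_string("remote_user", "secret", "remote_eng", "remote_db")
print(f'dbremote -c "{conn}"')
```

The same string, with the same charset entry, would be used on every node, Windows and Linux alike.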

answered 16 Sep '13, 10:14

Reg Domaratzki

Thanks, this solved the problem. I assume a "CharSet=windows-1252" would also work?

(16 Sep '13, 10:20) ASchild

Assuming that is the proper character set, yes. If all the databases use the same encoding, I prefer using charset=none everywhere: it tells dbremote never to do character set translation, and it works on every computer, regardless of the locale configured there.

(16 Sep '13, 10:33) Reg Domaratzki

Here's a note on SQL Remote and charsets from the newest v12 EBF (3942).

In my understanding, it doesn't describe a bugfix but a "how to" - and possibly that might work for you, too:

    ================(Build #3850 - Engineering Case #730270)================

    SQL Remote always assumes that all databases involved in replication share
    the same character set. By default, SQL Remote will always apply source CHAR
    data to a target database using the default character set for the operating
    system it is running on, ignoring the source data character set.

    When using a database character set that is different than the default character
    set for the operating system, dbremote must be instructed to perform explicit
    data conversion to that character set on its connection string:

    e.g. dbremote -c "CHARSET=utf8;…"

    or instruct dbremote to always use the CHAR character set
    of the target database to apply the remote CHAR data:

    e.g. dbremote -c "CHARSET=none;…"
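
To make the difference concrete, here is a small Python sketch (not dbremote itself, only a model of the idea) that takes the name from the question as windows-1252 bytes and reads it back two ways: once assuming a UTF-8 OS charset, and once with the database's own charset. It assumes the Linux default charset is UTF-8, which the thread suggests but does not confirm.

```python
# The salesperson name as stored in a windows-1252 (cp1252) database.
name = "Sil Schär"
raw = name.encode("cp1252")  # the bytes as they would travel in the message

# Case 1: the receiver assumes the OS charset (here: UTF-8) and the bytes
# are misread. Python marks the undecodable byte with U+FFFD.
misread = raw.decode("utf-8", errors="replace")

# Case 2: no translation; the bytes are read with the database charset.
passthrough = raw.decode("cp1252")

print(raw)          # the single byte 0xE4 stands for 'ä'
print(misread)      # the 'ä' is lost
print(passthrough)  # the original name survives
```

Python's decoder is strict and damages only the one byte; a more lenient decoder may consume the bytes that follow as well, which is consistent with the syntax error in the log.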

answered 16 Sep '13, 10:00

Volker Barth

I think you are right. Something is likely interpreting the data as UTF-8 on Linux. In cp1252, 'ä' is encoded as 0xE4, which in UTF-8 is the lead byte of a 3-byte character, so the decoder ends up gobbling the 'r' and the closing quote as part of that character. That's why you see the syntax error. In cp1252, 'ü' is encoded as 0xFC, which is not a valid UTF-8 lead byte (or follow byte, for that matter), so it just gets passed through as a single byte and doesn't cause problems.
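
The byte-level argument can be checked with a short Python sketch. The lead-byte ranges follow RFC 3629; `naive_split` is a hypothetical model of a lenient decoder that trusts the lead byte and does not validate follow bytes. It is an assumption about how the data might be consumed, not a description of dbremote's actual code.

```python
def utf8_expected_length(byte):
    """Return the sequence length a UTF-8 lead byte announces,
    or None if the byte is not a valid lead byte (per RFC 3629)."""
    if byte <= 0x7F:
        return 1  # ASCII
    if 0xC2 <= byte <= 0xDF:
        return 2
    if 0xE0 <= byte <= 0xEF:
        return 3
    if 0xF0 <= byte <= 0xF4:
        return 4
    return None  # follow byte (0x80-0xBF) or invalid (0xC0, 0xC1, 0xF5-0xFF)


def naive_split(data):
    """Split bytes into 'characters' by trusting each lead byte's length.

    Bytes that are not valid lead bytes are passed through one at a time.
    """
    chunks, i = [], 0
    while i < len(data):
        length = utf8_expected_length(data[i]) or 1
        chunks.append(data[i:i + length])
        i += length
    return chunks


for text in ("'Sil Schär',0", "'Sil Schür',0"):
    print(naive_split(text.encode("cp1252")))
```

In the first case the chunk starting at 0xE4 also contains the 'r' and the closing quote, so the string literal never terminates where the SQL parser expects it; in the second case 0xFC stays a single byte and the quote is left intact.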

I don't know anything about the dbremote or email side of things though. I expect that the emails being sent don't have the encoding specified in the header and therefore the other side is assuming OS charset? Perhaps you can add such a header yourself?

answered 16 Sep '13, 10:02

John Smirnios
