Cyrillic alphabet not available in initial replication, but available for later insert and update

Hello, I have a central database (UCA, UTF-8, Sybase 11.0.1.2960) and remote databases with the same set-up. In this database I store brands:

    CREATE TABLE "icat"."Brand" (
        "BrandID" NUMERIC(16,0) NOT NULL,
        "Name" NVARCHAR(50) NOT NULL,
        PRIMARY KEY ( "BrandID" ASC )
    )

When I add a new brand whose name contains Cyrillic (“трудно”), it replicates to one of my remote databases without any problem. The statement looks like this:

    INSERT INTO icat.Brand(BrandID, Name) VALUES (1000003637, TO_NCHAR(0xD0A2D0A0D0A3D094D09DD09E,'UTF-8'))

Note the TO_NCHAR call and the fact that the Name column is NVARCHAR.

However, when I create a new remote database, run SQL Remote to generate the replication files, and read these at the remote database, “трудно” turns into other, even weirder characters (squares and the like). If I then update the name in the central database and run the replication again, it comes out correctly, with TO_NCHAR used in the update statement.

Does anyone have a clue why the initial load of data is handled differently from later inserts and updates?

Kind regards, Dimitri

sql-remote

asked 21 Aug '15, 10:16

dimitridepro...

edited 24 Aug '15, 07:59

Maybe I am not understanding the nature of your problem, but the sample string (shown) being passed to TO_NCHAR() does not seem to include any Cyrillic characters. Did I miss something?

In UTF-8, 0x313233C387C389C389C387 is

'1', '2', '3' [0x31, 0x32, 0x33] followed by 'Ç', 'É', 'É', 'Ç' [0xC387, 0xC389, 0xC389, 0xC387]

http://www.decodeunicode.org/en/u+00c7/properties http://www.decodeunicode.org/en/u+00c9/properties

Since it works after you change the data, I suspect the input side of this is somehow failing you.

HTH

(21 Aug '15, 12:53) Nick Elson S...


The insert statement that I copied must have been from another test; I'll run another test on Monday and post that. But the problem is not the insert statements that I can get from the log. It is the initial load of data, which does not produce these kinds of statements, just a select and a count of how many records were inserted.

(21 Aug '15, 15:38) dimitridepro...

TO_NCHAR(0xD0A2D0A0D0A3D094D09DD09E,'UTF-8') is what I get when I pass трудно

(24 Aug '15, 02:30) dimitridepro...
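The two hex literals quoted in this thread can be checked outside the database. The following is a minimal sketch in Python, which is not part of the original discussion and is used here only as a neutral way to decode bytes; the variable names are illustrative. It decodes each literal as UTF-8 and, for comparison, prints the UTF-8 bytes of the lower-case word as typed in the question.

```python
# Decode the two hex literals quoted in the thread as UTF-8.
# Variable names are illustrative, not taken from the thread.
first_literal = bytes.fromhex("313233C387C389C389C387")
second_literal = bytes.fromhex("D0A2D0A0D0A3D094D09DD09E")

first_text = first_literal.decode("utf-8")
second_text = second_literal.decode("utf-8")

print(first_text)   # 123ÇÉÉÇ
print(second_text)  # ТРУДНО

# The second literal is the upper-case spelling of the word; the
# lower-case spelling encodes to a different byte sequence.
lowercase_hex = "трудно".encode("utf-8").hex().upper()
print(lowercase_hex)  # D182D180D183D0B4D0BDD0BE
```

Both literals are valid UTF-8. The second one decodes to the upper-case form "ТРУДНО", so it is the same word as in the question, just in a different case.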

"I run the sql remote to generate the replication files and read these at the remote db"

What exactly do you mean by that? Do you use the DBXTRACT utility to create the remote database? As to the "read these": do you mean a reload.sql and the corresponding unloaded .dat files that are referenced by LOAD TABLE statements? Or are these message files sent by DBREMOTE?

(24 Aug '15, 08:31) Volker Barth


All messages are generated and read by using the dbremote.exe

During the initial replication, the log on the receiving (remote) database says:

select name , .... from brand

5000 rows synchronized.

Later, when I insert or update a record at the central database, the log at the remote database shows insert and update statements that use TO_NCHAR.

(24 Aug '15, 08:41) dimitridepro...

So you are issuing SYNCHRONIZE SUBSCRIPTION statements at the consolidated to "fill" the remote database? (I'm asking since we have always extracted data from the consolidated and re-loaded those into the remotes locally before we have shipped them to remote users...)

Do you use the CharSet (CS) connection parameter when using SQL Remote?

When you compare the contents of the message file of an initial replication with those of a "normal" run (say, by using DBREMOTE -v -o MySrConsole.log), do the data for the Unicode column differ for the same row (say, when altered to the same value)?

(24 Aug '15, 09:31) Volker Barth

Adding the CharSet connection parameter to the dbremote command seems to fix everything.

However, I was under the impression that I would need to convert all my VARCHAR fields to NVARCHAR to hold the Cyrillic texts. But during yesterday's tests with CharSet I noticed that even the VARCHAR fields had no problem with the Cyrillic characters. Is this normal behavior?

(25 Aug '15, 02:08) dimitridepro...

If you only (or primarily) use Cyrillic, I guess the single-byte code page 1251 and the corresponding collation 1251CYR should be sufficient to store these values in CHAR fields (and historically, for SQL Anywhere databases, that was the way to do it before Unicode/NCHAR support was introduced with v10...). However, you will not be able to store characters from other languages/scripts there, say accented Latin characters such as 'É'.

(25 Aug '15, 03:36) Volker Barth
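To make the single-byte trade-off concrete, here is a small Python sketch. It relies on the assumption that Python's `cp1251` codec matches the Windows code page 1251 behind the 1251CYR collation; the helper name `fits_cp1251` and the sample strings are made up for illustration. It tests which strings can be represented in that code page at all.

```python
def fits_cp1251(text: str) -> bool:
    """Return True if every character of text is representable in cp1251."""
    try:
        text.encode("cp1251")
        return True
    except UnicodeEncodeError:
        return False


# Sample strings chosen for illustration only.
samples = ["трудно", "Brand", "ÇÉ", "日本"]
for sample in samples:
    print(sample, fits_cp1251(sample))

# Cyrillic letters take one byte each in cp1251 and two bytes each in UTF-8.
cp1251_length = len("трудно".encode("cp1251"))
utf8_length = len("трудно".encode("utf-8"))
print(cp1251_length, utf8_length)  # 6 12
```

Plain ASCII letters and Cyrillic both fit, while accented Latin letters and CJK characters do not. That is the limitation a single-byte Cyrillic code page brings compared with a UTF-8 or NCHAR column.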


One Answer:


Adding the CharSet=utf-8 connection parameter to the dbremote command seems to fix everything:

"C:\Program Files (x86)\SQL Anywhere 11\BIN32\dbremote.exe" -c "eng=icat9636;dbn=icat9636;CharSet=utf-8" -b -qc -r -os 50M -o "d:\applicationdata\ICAT\Sync\SqlRemoteLogs\icatcentral\dbremote_messages.log" -l 100000 -t -v "D:\Databases\Sybase\icat\icatlocal"
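The thread does not establish exactly what dbremote did with the data when CharSet was not given, so the following Python sketch only illustrates the general class of problem: UTF-8 bytes read under a different character set. Latin-1 is an arbitrary stand-in chosen for the demonstration, not a statement about what SQL Remote actually used.

```python
original = "трудно"
utf8_bytes = original.encode("utf-8")

# Misread: treat every UTF-8 byte as one Latin-1 character.
garbled = utf8_bytes.decode("latin-1")
print(len(original), len(garbled))  # 6 12
print(repr(garbled))

# The bytes themselves are intact; reading them as UTF-8 restores the text.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored)  # трудно
```

Each Cyrillic letter becomes two unrelated characters when the bytes are misread, which would be consistent with the "squares and what not" described in the question. Naming the character set explicitly on the connection avoids having it guessed from the environment.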


answered 25 Aug '15, 03:54

dimitridepro...

edited 25 Aug '15, 03:59

