I've inherited a database with some invalid XML data in one table.

I know for a fact that the offending character is a control character, and I can find it in the column (which is of type XML):

SELECT CHARINDEX(char(26), BLA_COLUMN) FROM BLA_TABLE where ID_RECORD = 1234

This returns a valid index. Now, when I try to replace that char with something else:

UPDATE BLA_TABLE SET BLA_COLUMN = REPLACE(BLA_COLUMN, char(26), char(32)) WHERE ID_RECORD = 1234

The statement fails with the message: "XML parser error: character: 603, line:1, column:603 Illegal control character", SQLCODE=-888, ODBC 3 State="HY000"

Any ideas how to get rid of that character? (Running SQLA 12)

Thank you!

EDIT

It looks like Volker's suggestion provides a workaround: first update BLA_COLUMN to NULL, then update it to valid XML with char(26) replaced by char(32). Thanks Volker!

asked 05 Jun '14, 06:05

tzup

edited 05 Jun '14, 09:47

Volker Barth

Also untested...

UPDATE BLA_TABLE 
   SET BLA_COLUMN = REPLACE ( CAST ( BLA_COLUMN AS LONG VARCHAR ), 
                              char(26), char(32)) 
 WHERE ID_RECORD = 1234
(05 Jun '14, 07:46) Breck Carter

Same error as above unfortunately

(05 Jun '14, 08:42) tzup

Do you have a support contract with SAP? It's beginning to look like you may need to open a case.

But in the meantime, the docs say: "You can cast between the XML data type and any other data type that can be cast to or from a string. Note that there is no checking that the string is well-formed when it is cast to XML."

So it's looking a bit like you can write in anything, but can only read out well-formed XML.

To check this, can you try just selecting the XML into a string without trying to write anything back? I'm just wondering whether there is more than one illegal character, which would explain why Breck's method fails.

i.e. something like:

create variable mytext long varchar;
set mytext = (SELECT (BLA_COLUMN) FROM BLA_TABLE where ID_RECORD = 1234);

OR

create variable mytext long varchar;
set mytext = (SELECT cast(BLA_COLUMN as long varchar) FROM BLA_TABLE where ID_RECORD = 1234);

and see if you can then read what is in mytext.

(05 Jun '14, 08:48) Justin Willey
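
A rough way to check for more than one kind of illegal character, offered only as a sketch: XML 1.0 allows just tab (9), line feed (10) and carriage return (13) below code 32, so the query probes `mytext` for every other code from 1 to 31. It assumes `mytext` has already been filled by one of the SELECTs above and that the `sa_rowgenerator` system procedure is available; the column aliases are made up for illustration.

```sql
-- Assumption: mytext is the connection variable populated above.
-- Lists each control character (other than tab, LF, CR) found in the text,
-- with the position of its first occurrence.
SELECT r.row_num                              AS code_point,
       CHARINDEX( CHAR( r.row_num ), mytext ) AS first_position
  FROM sa_rowgenerator( 1, 31 ) AS r
 WHERE r.row_num NOT IN ( 9, 10, 13 )
   AND CHARINDEX( CHAR( r.row_num ), mytext ) > 0
 ORDER BY r.row_num;
```

An empty result would suggest that char(26) is not accompanied by other control characters in that record.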

Reading works fine in both cases. Updating with valid XML proved to be tricky. (Please see my Edit)

(05 Jun '14, 09:17) tzup

Is the same true if you try to set the column to NULL (which would suggest that the previous contents are parsed, too)?


EDIT: As this seems to have done the trick, it shows that

  • the parser does not validate the old contents (preventing some kind of "correction deadlock")
  • and NULL is not that bad now and then (Breck, do you hear me? :)

answered 05 Jun '14, 07:48

Volker Barth

edited 05 Jun '14, 09:24

If that is happening, it's hard to see how you would ever be able to fix it. It's odd that it could get in there in the first place, unless the behaviour has been changed at some time and it is now stricter. If this is the case, it's beginning to look more like a bug than a feature :)

(05 Jun '14, 08:33) Justin Willey

Setting the column to NULL first, then updating it with valid XML (i.e. char 26 replaced with char 32) seems to do the trick.

(05 Jun '14, 08:45) tzup
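
The two-step workaround described here might look roughly like this. It is a sketch only: `fixed_text` is a hypothetical variable name, and the corrected text is captured before the column is set to NULL, because nulling it first would discard the data.

```sql
-- Assumption: fixed_text is an illustrative name, not from the original thread.
CREATE VARIABLE fixed_text LONG VARCHAR;

-- 1. Read the stored value as plain text and replace char(26) with a space.
SET fixed_text = ( SELECT REPLACE( CAST( BLA_COLUMN AS LONG VARCHAR ),
                                   CHAR(26), CHAR(32) )
                     FROM BLA_TABLE
                    WHERE ID_RECORD = 1234 );

-- 2. Clear the column, then write the corrected text back.
UPDATE BLA_TABLE SET BLA_COLUMN = NULL       WHERE ID_RECORD = 1234;
UPDATE BLA_TABLE SET BLA_COLUMN = fixed_text WHERE ID_RECORD = 1234;
COMMIT;

DROP VARIABLE fixed_text;
```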

I had a go at reproducing the problem (on 10.0.1 and 16.0) but couldn't: I could select and update the bad XML without an issue, so I was wondering what was causing the problem. Anyway, you have a workaround, so that's great.

create table TestXML(PK int default autoincrement, XMLStuff xml, primary key(PK));

insert into TestXML(XMLStuff) values ('<Bad XML>');
insert into  TestXML(XMLStuff) values ('<GoodXML/>');
insert into  TestXML(XMLStuff) values (string('<ReallyBadXML',char(26),'>'));
insert into  TestXML(XMLStuff) values (string('<EvenWorseXML',char(26)));

select * from TestXML where PK = 1;
-- works OK

update TestXML set XMLStuff = '<BetterXML/>' where  PK = 1;
-- works OK

select * from TestXML where PK = 3;
-- works OK

update TestXML set XMLStuff = '<BetterXML/>' where  PK = 3;
-- works OK

select * from TestXML where PK = 4;
-- works OK

update TestXML set XMLStuff = '<BetterXML/>' where  PK = 4;
-- works OK

drop table TestXML;
(05 Jun '14, 10:00) Justin Willey

12.0.1.4085 shows the same success.

Also trying to turn a valid XML into an invalid one doesn't raise an error, such as:

update TestXML set XMLStuff =
  replace(XMLStuff, char(47), char(26)) where  PK = 2;
-- works OK
(05 Jun '14, 10:39) Volker Barth

SQL Anywhere only validates XML when parsing it (for example, using OPENXML). It is not validated when inserting or updating a value, or casting to the XML type. I wonder if the original database has a trigger, computed column, or check constraint that ends up parsing the XML.

(05 Jun '14, 14:31) Ivan T. Bowman
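
To follow up on the trigger theory, one could list the triggers defined on the table from the system catalog. This is a sketch that assumes the `SYS.SYSTRIGGER` and `SYS.SYSTAB` system views with the column names shown; verify them against the catalog documentation for your version. Computed columns and check constraints would need to be inspected separately.

```sql
-- Assumption: catalog view and column names as in SQL Anywhere 12.
-- Lists triggers on BLA_TABLE with the event and timing that fire them.
SELECT t.trigger_name,
       t.event,
       t.trigger_time
  FROM SYS.SYSTRIGGER AS t
  JOIN SYS.SYSTAB     AS tab ON tab.table_id = t.table_id
 WHERE tab.table_name = 'BLA_TABLE';
```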

You guys are right. Having looked more carefully at the table, I've noticed a trigger that does indeed do some parsing on update (disabling the trigger allowed the updates that fix the XML to run just fine). Can't believe that didn't cross my mind! So there you go, case closed. Thank you!

(06 Jun '14, 00:00) tzup
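
The thread does not say how the trigger was disabled. One possibility, sketched here as an assumption, is to switch off the `fire_triggers` option for the current connection while the repair runs; this typically requires DBA-level privileges and suppresses all ordinary triggers for that connection, so it should be turned back on straight away.

```sql
-- Assumption: using the fire_triggers option is one way to bypass the trigger;
-- the original poster may have done it differently.
SET TEMPORARY OPTION fire_triggers = 'Off';

UPDATE BLA_TABLE
   SET BLA_COLUMN = REPLACE( CAST( BLA_COLUMN AS LONG VARCHAR ),
                             CHAR(26), CHAR(32) )
 WHERE ID_RECORD = 1234;
COMMIT;

-- Restore normal trigger behaviour for this connection.
SET TEMPORARY OPTION fire_triggers = 'On';
```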

I think you need to convert the XML data to something that SQLA doesn't try to interpret.

e.g. (untested):

create variable mytext long varchar;
set mytext = (SELECT (BLA_COLUMN) FROM BLA_TABLE where ID_RECORD = 1234);
set mytext= REPLACE(mytext, char(26), char(32));
update BLA_TABLE set BLA_COLUMN = mytext where ID_RECORD = 1234;
commit;

If you have to do this a lot, you could write a user-defined function. The same approach also helps when a plain REPLACE misbehaves because of the collation, e.g.

update person set notes=replace(notes,'ú','£')

Here every u in the field is changed as well, because the collation sequence helpfully reckons that 'ú' is the same as 'u'. In that case you can use this approach:

create function BinaryReplace(in x long varchar, in targetascii smallint,
                              in replacementascii smallint)
returns long varchar deterministic
begin
  declare rv long varchar;  -- result, built up one character at a time
  declare l integer;        -- length of the input
  declare i integer;        -- current position (1-based)
  declare c char(1);
  set l = length(x);
  set i = 0;
  set rv = '';
  while i < l loop
    set i = i + 1;
    set c = substr(x, i, 1);
    -- compare by character code, so the collation cannot equate 'ú' with 'u'
    if ascii(c) = targetascii then
      set c = char(replacementascii);
    end if;
    set rv = rv || c;
  end loop;
  return rv;
end;

Not fast, but quicker than typing!
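
As a usage sketch for the function above: the first statement applies it to the control character from the question, and the second to the collation example. The numeric codes in the second statement are an assumption that holds only for single-byte Windows-1252 data, where 'ú' is 250 and '£' is 163; other character sets need other values.

```sql
-- Replace the control character from the question (code 26) with a space.
UPDATE BLA_TABLE
   SET BLA_COLUMN = BinaryReplace( CAST( BLA_COLUMN AS LONG VARCHAR ), 26, 32 )
 WHERE ID_RECORD = 1234;

-- Assumption: Windows-1252 data, so 'ú' = 250 and '£' = 163.
-- Plain 'u' characters are left alone because the match is by code.
UPDATE person
   SET notes = BinaryReplace( notes, 250, 163 );
```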


answered 05 Jun '14, 06:47

Justin Willey

edited 05 Jun '14, 08:24

Weird! I can't even run a statement like UPDATE BLA_TABLE set BLA_COLUMN = '<bla/>' where ID_RECORD=1234; without the XML parser complaining as above.

(05 Jun '14, 07:43) tzup

Is Breck's method allowed to work without the parser complaining?

(05 Jun '14, 08:25) Justin Willey
question asked: 05 Jun '14, 06:05

question was seen: 2,728 times

last updated: 06 Jun '14, 00:00