Introduction

At the recent excellent Technical Summit held in Waterloo, I was challenged to make the case for a BinaryReplace() built-in function.

This would be a function that replaced a character string with another character string without reference to collation semantics etc (unlike the existing Replace() function - see this question for details of how Replace() interprets data. ByteReplace() might be a better name or it could perhaps be implemented as an extension to Replace().

What is is the problem?

Imagine you have a database containing data imported from another system, which uses a different character set or is from a different locale. You find that a particular character is incorrectly represented - a frequent example in the UK is the Pound Sterling sign - £. This often appears as ú. No problem you think - write a little routine that runs through all the affected tables and columns doing something like

update person set notes=replace(notes,'ú','£')

This is q£ite fast, b£t oh dear, yo£ soon find yo£ have q£ite a problem on yo£r hands. ALL yo£r £s (letter after t) have been changed to £s.

So what you (I've rolled back) have to do is write a letter by letter comparison user defined function (not very elegant example below) which now takes hours and hours to run. The same issue arises with trying to replace capital letters with lower case or vice versa (assuming a case insensitive database) and any operation involving accented characters that are treated alike by a collation.

There may be a very obvious solution I'm missing - possibly involving casting everything to binary beforehand (which can work for making case sensitive comparisons) - but it has so far evaded me. When I have had to deal with very large scale problems it's been faster to dump the data out, use a grep tool and read it all in again.

create function BinaryReplace(in x long varchar,in targetascii smallint,
in replacementascii smallint) returns long varchar deterministic begin
  declare rv long varchar;
  declare l integer;
  declare i integer;
  declare c char(1);
  set l=length(x);
  set i=0;
  set rv='';
  while i < l loop
    set i=i+1;
    set c=substr(x,i,1);
    if ascii(c) = targetascii then
      set c=char(replacementascii)
    end if;
    set rv=rv+c
  end loop;
  return rv
end;

asked 09 Dec '14, 13:30

Justin%20Willey's gravatar image

Justin Willey
7.6k137179249
accept rate: 20%

1

I'm quite sure the enhancement request will already take this into account:

In contrast to Justin's sample, the replacement should not work on a single character/byte but should work - like REPLACE() - with a "search-string" and a "replace-string", allowing both strings to have different length.

E.g. with German locale, it's often necessary to replace umlauts with their "extended" form (and vice versa, though that's more difficult as it's not a bijective function...), say like

select byte_replace('Müller', 'ü', 'ue');
(10 Dec '14, 04:43) Volker Barth
Replies hidden

Just another thought:

For replacing (in contrast to finding), I think it is usually helpful to distinguish case and/or accent, even in a case-insensitive database. In the sample above, I would certainly not want to replace upper-case letters with lower-case and vice versa, even with character semantics. So instead of a separate "binary replace", it might be more comfortable to enhance REPLACE() with a "collation-name/collation-tailoring-option", just like COMPARE().

(10 Dec '14, 14:08) Volker Barth

Your request was noted at the summit and added into our set of enhancement requests. I can't make promises but it would be nice if it made it into Zermatt. I also noticed that there is no binary version of LOCATE either.

permanent link

answered 09 Dec '14, 13:40

John%20Smirnios's gravatar image

John Smirnios
11.9k396164
accept rate: 38%

I hadn't realised I was that persuasive! LOCATE would be great too. Many thanks.

(09 Dec '14, 13:42) Justin Willey
Replies hidden
2

Did you think the conference was for you to learn cool things about SQLAnywhere? It's real purpose is so that we can find out what you need. ;)

(09 Dec '14, 13:44) John Smirnios
1

v17? Zermatt? Guys, what are you talking about? Where can us poor non-participants learn about those cool things?


OK, for Zermatt, I just have to make it a some hundred miles southward:)

(09 Dec '14, 15:24) Volker Barth
Replies hidden
(22 Jul '15, 08:46) Mark Culp

...and the according changes/enhancements can be found here (under the "Improved support for BYTE strings" section):

Changes to SQL statements, functions, procedures, and data types

(22 Jul '15, 10:02) Volker Barth
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×113
×13

question asked: 09 Dec '14, 13:30

question was seen: 2,085 times

last updated: 22 Jul '15, 10:02