We had a database created like this: CREATE DATABASE 'norwegian.db' COLLATION '819NOR' NCHAR COLLATION 'UCA';

But we ran into problems when trying to store characters outside that collation's character set.

So we reloaded the database to: CREATE DATABASE 'norwegian.db' COLLATION 'UTF8BIN' NCHAR COLLATION 'UTF8BIN';

But now we have problems with sorting and comparison. Sorting we solved with: set option sort_collation='48' or set option sort_collation='UCA(locale=nob)' (not sure whether they are equivalent, or which to prefer).

But comparisons still fail, for example Name like 'Øyv%', or select if 'æ' = lower('Æ') then 'true' else 'false' end if, which returns 'false'.

How do we create a database that stores data in UTF-8 but sorts and compares in Norwegian?

asked 27 May '21, 14:26

Ove Halseth2

You may find someone with the answer...

If not, consider what a collation-confused person such as myself might do: Perform experiments in which exactly one single variable/parameter/option is changed at a time, followed by all the tests that must be passed to certify "Success!".

Experimental design may take longer than the experiments, but care is necessary to avoid backtracking, wandering, frustration and the inevitable production bugs.

Some parameters to consider are ENCODING and Collation Tailoring Options.

There is a lot of material in the V17 Help here https://help.sap.com/viewer/61ecb3d4d8be4baaa07cc4db0ddb5d0a/17.0/en-US/813826126ce210149074e3a77d2e1dce.html

There have been related discussions in this forum and elsewhere, so asking Auntie Google may yield fruit (but be careful of obsolete information).

...and people who know the answer do exist, it's just a matter of finding them :)

(27 May '21, 16:03) Breck Carter

Hi.

Try to use 1252NOR instead. We always use that in our applications, and it has never created any issues for us.

br,

Bjarne Anker Maritech Systems AS

answered 31 May '21, 07:41

Bjarne Anker

converted 02 Jun '21, 08:00

Breck Carter

Just two questions/remarks:

  • Do you use CHAR or NCHAR data types for those international texts? By default, NCHAR data uses the UCA collation, which should work fine in your original database. However, if you are using CHAR data types and want to store international data, the default single-byte character sets cannot represent more than 256 distinct characters. If that limit is critical, you need to use UCA there as well.

  • UTF8BIN is a multi-byte character set with binary sorting, so by design it treats characters that differ only in accent or case as different. That is apparently not your goal.

If you want multi-byte characters both with CHAR and NCHAR, I guess the following should do:

CREATE DATABASE 'norwegian.db' COLLATION 'UCA(locale=no;case=ignore;accent=ignore)' NCHAR COLLATION 'UCA(locale=no;case=ignore;accent=ignore)';

I can't tell whether these collation tailoring options fit, or whether Norwegian has particular sorting variants as German or Swedish do (via sorttype=phonebook), so you might need to adjust them.

Note that a UCA collation uses UTF-8 as its encoding by default.
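To confirm which encoding and collation a database actually ended up with after such a CREATE DATABASE, the database properties can be queried. This is only a minimal sketch: it assumes the property names CharSet, Collation, NcharCharSet and NcharCollation as I remember them from the SQL Anywhere 17 documentation, so check them against your version.

```sql
-- Show the encoding and collation in effect for CHAR and NCHAR data
-- in the database the current connection is using.
-- Assumption: these four property names exist in your server version.
SELECT DB_PROPERTY( 'CharSet' )        AS char_encoding,
       DB_PROPERTY( 'Collation' )      AS char_collation,
       DB_PROPERTY( 'NcharCharSet' )   AS nchar_encoding,
       DB_PROPERTY( 'NcharCollation' ) AS nchar_collation;
```

Running this against both the old and the reloaded database makes it easy to see which of the two settings changed.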

Just to add: Whether you need to distinguish between Bokmål and Nynorsk is fully out of my scope... :)

answered 28 May '21, 03:03

Volker Barth

edited 28 May '21, 03:42

Looks promising! Tested: set temporary option sort_collation='UCA(locale=nob;case=ignore;accent=ignore)';

And got correct sorting, so will schedule a conversion of the database and see if it helps :)

Oh, and I did not get locale=no to work, but locale=nob did. I guess it stands for Norwegian Bokmål. I would then expect locale=non for Nynorsk to work too, but it did not.

But it does not matter as both sort the same.
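For anyone who wants to repeat the sort test without touching a real table, something like the following should do. It is a sketch under assumptions: the sample names are made up, and 'Internal' is assumed to be the default value of sort_collation, so check your own setting before resetting it.

```sql
-- Apply Norwegian (Bokmål) sorting to ORDER BY for this connection only.
SET TEMPORARY OPTION sort_collation = 'UCA(locale=nob;case=ignore;accent=ignore)';

-- Hypothetical sample data; in the Norwegian alphabet Æ, Ø and Å follow Z.
SELECT t.name
  FROM ( SELECT 'Zakarias' AS name
         UNION ALL SELECT 'Øyvind'
         UNION ALL SELECT 'Åse'
         UNION ALL SELECT 'Ægir'
         UNION ALL SELECT 'Anders' ) AS t
 ORDER BY t.name;

-- Switch back afterwards (assumed default value).
SET TEMPORARY OPTION sort_collation = 'Internal';
```

Note that this option only affects ORDER BY; it does not change how = or LIKE compare strings, which is why the comparison problem needs the database collation itself to change.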

(01 Jun '21, 08:10) Ove Halseth2

Note that you can also test collation behaviour without having to create a new database, by using the COMPARE function and specifying the desired collation.

That also helps to compare the results of different collation tailoring options side by side.
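A sketch of such a side-by-side comparison, assuming the three-argument form of COMPARE that takes a collation name with a tailoring string. The tailoring strings are the ones discussed in this thread; the column aliases are just illustrative.

```sql
-- COMPARE is documented to return -1, 0 or 1, where 0 means the two
-- strings are considered equal under the given collation.
SELECT COMPARE( 'æ', 'Æ',
                'UCA(locale=nob;case=ignore;accent=ignore)' )   AS case_ignored,
       COMPARE( 'æ', 'Æ',
                'UCA(locale=nob;case=respect;accent=respect)' ) AS case_respected;
```

If the first column is 0 and the second is not, the case=ignore tailoring is doing what the original question asks for.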

(01 Jun '21, 08:24) Volker Barth
