Which utf-8 collation matches iso8859-1 sorting ? Is there a

Posted by cverbiest on 18-May-2017 09:42

After migration to utf-8 our customer noted that the item table is sorted differently.

We need utf-8 because we have to be able to store different languages but we'd like to have the same sorting as in iso8859-1.

I know this is done by specifying the collation, but I can't find documentation explaining which collation table is which.

E.g. I expected ICU-be.df to be for Belgium, but then I noticed ICU_48-be.df is for Belarus, so I suspect ICU-be.df is for Belarus as well.

Multiple questions:

Is there an overview table explaining the different collations and their uses?

What is the difference between the 48 collation tables and the others? Which are the more up-to-date versions?

Is there a table that matches iso8859-1, or a table that would sort [] (square brackets) in the same position as iso8859-1?

/* iso-8859-1 sorting */
EBRUBARROMA HT60
EBRUBARROMA HT65
EBRUBARROMA HT80
EBRUBARROMA [HT]


/* utf-8 sorting */
EBRUBARROMA [HT]
EBRUBARROMA HT60
EBRUBARROMA HT65
EBRUBARROMA HT80
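
The two listings above can be reproduced outside OpenEdge. This is a minimal Python sketch, not OpenEdge code: it assumes the iso8859-1 order seen here behaves like a plain byte-value comparison for these strings. In both iso8859-1 and utf-8, '[' is 0x5B and 'H' is 0x48, so a byte-wise sort puts "[HT]" last. The Unicode Collation Algorithm used by ICU gives punctuation a lower weight than digits and letters by default, which would explain why "[HT]" moves to the front.

```python
# Item names taken from the listings above.
items = [
    "EBRUBARROMA [HT]",
    "EBRUBARROMA HT60",
    "EBRUBARROMA HT65",
    "EBRUBARROMA HT80",
]


def byte_order(values, encoding):
    """Sort by the raw encoded bytes, i.e. a plain binary comparison."""
    return sorted(values, key=lambda s: s.encode(encoding))


latin1_sorted = byte_order(items, "latin-1")
utf8_sorted = byte_order(items, "utf-8")

# '[' (0x5B) has a higher byte value than 'H' (0x48) in both encodings,
# so "[HT]" ends up after the "HT.." entries either way.
for name in latin1_sorted:
    print(name)
```

So the change in order comes from the collation rules, not from the utf-8 encoding itself.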

All Replies

Posted by Libor Laubacher on 18-May-2017 10:10

ICU-be is indeed Belarusian. Collation is per language, not per country, I believe. You can check whether ICU-en_BE makes any difference, or ICU-UCA if you are currently using Basic.

ICU_48 is the more recent; the ICU tables without a version listed are version 24.

For the complete list you can check DLC/prolang/readme.

Posted by cverbiest on 18-May-2017 10:30

Reply from Libor

I have replied on the forum, but for whatever reason it rejected the post, saying I need moderator approval.

ICU-be is indeed for Belarus/Belarusian; it's per language, not per country. You can check whether ICU-en_BE makes a difference, or ICU-UCA if you are using the default Basic.

ICU 48 is the more recent version; the non-versioned ICU files are version 24.

You can also check DLC/prolang/README.

Posted by cverbiest on 20-May-2017 02:31

I tried all tables in $DLC/prolang/utf8.

They all sort [] brackets the same way.

Is it possible to create our own collation table?

The documentation (https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvint/modifying-openedge-collation-tables.html) states "If you need to modify an International Components for Unicode (ICU) collation, contact Progress Software Technical Support for assistance."

Posted by kirchner on 22-May-2017 07:08

I never tried it with any Unicode codepage, but I once needed to create a custom codepage that would compare characters exactly the same way as some other system, and it was not very difficult.

This helped me a lot:

documentation.progress.com/.../index.html

In my case I didn't have to change the database collation or codepage, just the session collation with the -cpcoll parameter.

I also worked with Progress TS, and they gave me some good directions on what to do.

But yeah, no docs on how to do it for Unicode that I'm aware of.
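
To illustrate the idea of a custom table for a single-byte codepage, here is a minimal Python sketch. It is not the OpenEdge table format: the `weights` list and `sort_key` function are hypothetical names, and the only assumption is that each of the 256 byte values gets one sort weight. Lowercase letters are folded onto their uppercase weights so the comparison ignores case.

```python
# Hypothetical weight table: start from byte-value order for 0..255.
weights = list(range(256))

# Fold lowercase ASCII letters onto the weights of their uppercase forms,
# so that comparisons ignore case.
for code in range(ord("a"), ord("z") + 1):
    weights[code] = weights[code - 32]


def sort_key(text):
    """Map each byte of the latin-1 encoded text to its weight."""
    return [weights[b] for b in text.encode("latin-1")]


items = ["ebrubarroma ht65", "EBRUBARROMA [HT]", "EBRUBARROMA HT60"]
custom_sorted = sorted(items, key=sort_key)

for name in custom_sorted:
    print(name)
```

With these weights '[' still sorts after 'H', so "[HT]" stays last, and changing a single entry in `weights` moves a character to a different position.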

This thread is closed