Encoding mishmash with myspell-bg

Project:Trisquel
Component:Programs
Category:Bug report
Priority:normal
Assigned:Unassigned
Status:closed
Description

Another issue concerning myspell-bg, one that leaves the package almost completely broken. This is a fairly old bug that affects many distributions, but for some reason no one wants to fix it, although it should be quite easy to do.

The myspell-bg package installs these two files:
/usr/share/myspell/dicts/bg_BG.aff
/usr/share/myspell/dicts/bg_BG.dic

Both files are encoded in "cp1251", also known as "windows-1251" and "microsoft-cp1251". The first line of bg_BG.aff is "SET microsoft-cp1251". This line apparently tells applications how to decode the dictionaries.
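To check which encoding name a given dictionary declares, something like the following could be used. It is only a sketch: `aff_encoding` is a hypothetical helper name of mine, and it assumes the declaration is the first line beginning with "SET", as described above.

```shell
#!/bin/sh
# Sketch: print the encoding named on the SET line of a MySpell .aff file.
# aff_encoding is a hypothetical helper name, not part of myspell-bg.
aff_encoding() {
    # Take the second field of the first line that starts with "SET",
    # dropping a trailing carriage return in case of CRLF line endings.
    awk '$1 == "SET" { sub(/\r$/, "", $2); print $2; exit }' "$1"
}
```

Called as `aff_encoding /usr/share/myspell/dicts/bg_BG.aff`, it should print "microsoft-cp1251" for the package as shipped.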

The problem is that many programs do not recognise "microsoft-cp1251" as an encoding name. Of all the programs I have tried, only OpenOffice recognises it. All the others either mark every word as incorrect or crash. When the encoding line is changed to either "cp1251" or "windows-1251", all the other programs begin to work as expected, but OpenOffice then starts marking every word as incorrect. Here are my results from testing the different encodings:

microsoft-cp1251:
* OpenOffice: OK
* ABrowser: CRASHES
* Pidgin: ALL WORDS ARE INCORRECT
* gedit: ALL WORDS ARE INCORRECT
* Claws Mail: ALL WORDS ARE INCORRECT

windows-1251 or cp1251:
* OpenOffice: ALL WORDS ARE INCORRECT
* ABrowser: OK
* Pidgin: OK
* gedit: OK
* Claws Mail: OK
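
The difference between the names can be illustrated by asking the local iconv whether it accepts each one. This is only an analogy and a sketch: `iconv_knows` is a hypothetical helper, and the applications listed above may use their own lookup tables rather than iconv, so the output need not match the results above.

```shell
#!/bin/sh
# Sketch: report whether the local iconv accepts an encoding name.
# iconv_knows is a hypothetical helper; the programs tested above may
# resolve encoding names differently, so this is only an analogy.
iconv_knows() {
    # Converting empty input succeeds only if iconv recognises the name.
    iconv -f "$1" -t UTF-8 < /dev/null > /dev/null 2>&1
}

for name in microsoft-cp1251 windows-1251 cp1251; do
    if iconv_knows "$name"; then
        echo "$name: recognised"
    else
        echo "$name: not recognised"
    fi
done
```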

I work around the issue manually by transcoding both files to UTF-8 and changing the SET line to "SET UTF-8". With that change, all programs work as expected. The files can be transcoded with the following command:
iconv -f cp1251 -t utf-8 -o output.file input.file
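
Both steps of this workaround, transcoding and updating the SET line, can be sketched as one shell function. The assumptions are mine: `convert_dict` is a hypothetical name, the dictionary is taken to be the usual `.aff`/`.dic` pair, the SET line is taken to be the first line of the `.aff` file, and the result is written to a separate directory so that the installed files are left untouched.

```shell
#!/bin/sh
# Sketch: convert a cp1251 MySpell dictionary pair to UTF-8.
# convert_dict is a hypothetical helper name, not part of myspell-bg.
# Usage: convert_dict SRC_DIR DEST_DIR [BASENAME]
convert_dict() {
    src=$1
    dest=$2
    name=${3:-bg_BG}
    mkdir -p "$dest" || return 1
    for ext in aff dic; do
        # Transcode from cp1251 to UTF-8; stop if iconv reports an error.
        iconv -f cp1251 -t utf-8 "$src/$name.$ext" > "$dest/$name.$ext" || return 1
    done
    # Rewrite the encoding declaration on the first line of the .aff file.
    sed '1s/^SET .*/SET UTF-8/' "$dest/$name.aff" > "$dest/$name.aff.tmp" &&
        mv "$dest/$name.aff.tmp" "$dest/$name.aff"
}
```

For example, `convert_dict /usr/share/myspell/dicts /tmp/bg-utf8` would leave the converted pair in /tmp/bg-utf8 for inspection before anything is copied back.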

I should also point out that I have never seen this encoding called "microsoft-cp1251" anywhere except in the myspell-bg package.

I propose to either distribute the dictionaries encoded in UTF-8 with the SET line altered to "SET UTF-8", or fix the SET line to "SET cp1251" and alter OpenOffice to recognize it.

There is more discussion of the problem here: https://bugs.launchpad.net/ubuntu/+source/bgoffice/+bug/346856

Fri, 04/01/2011 - 05:47
Version:»

The issue also affects 4.5.

Sun, 09/18/2011 - 22:02
Status:active» closed

Fixed upstream.