Some languages are harder to translate than others because of character-encoding or text-direction problems. If you have successfully overcome such a problem, please post your experience to one of the Koha mailing lists or send it to <st.hedges AT gmail DOT com>. It will be added to this section.
Posted by Parthasarathi Mukhopadhyay
I have finally localized Koha into Bengali (an Indian regional language) on both Windows and Linux. The procedure I followed is:
I copied the npl directory under opac-template to a new folder, npl_bengali.
Then I changed charset=ISO-8859-1 to charset=UTF-8 everywhere it appeared in the .INC files under the /include directory.
In the system parameter setup I then changed the OPAC theme to npl_bengali (it appeared as an option automatically).
Finally, I used a Unicode-compliant virtual keyboard (Avro) to enter queries and cataloguing data and to change the interface language. [ed. note: www.omicronlab.com/avrokeyboard/]
It works nicely. I have also tested import and export of Koha data (bibliographic, members, etc.) in MySQL directly and found that it works correctly.
I'm entering data on the Linux server through a Windows XP client because the virtual keyboard is available only for Windows. We may overcome this limitation in the future.
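The copy-and-edit steps of this procedure could be scripted roughly as follows. This is only a minimal Python sketch, not part of Koha: the function name and the paths in the usage comment are hypothetical, and it assumes the charset declaration appears exactly as charset=ISO-8859-1 in files whose extension is .inc (in any letter case).

```python
import shutil
from pathlib import Path

OLD = b"charset=ISO-8859-1"
NEW = b"charset=UTF-8"

def clone_theme_with_utf8_charset(src: Path, dst: Path) -> int:
    """Copy a theme directory, then rewrite the charset declaration
    in every .inc file of the copy. Returns the number of files changed."""
    shutil.copytree(src, dst)  # fails if dst already exists, so nothing is overwritten
    changed = 0
    for path in dst.rglob("*"):
        # Match .inc and .INC alike; other template files are left alone.
        if path.is_file() and path.suffix.lower() == ".inc":
            data = path.read_bytes()  # work on bytes so nothing else in the file is altered
            if OLD in data:
                path.write_bytes(data.replace(OLD, NEW))
                changed += 1
    return changed

# Hypothetical usage; adjust the paths to your own installation:
# clone_theme_with_utf8_charset(Path("/path/to/opac-template/npl"),
#                               Path("/path/to/opac-template/npl_bengali"))
```

The sketch only changes the declared charset. Any non-ASCII text already present in the copied files stays in its original encoding.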
For more information, see Five Laws and Ten Commandments: The Open Road of Library Automation in India.
Posted by Tumer Garip
Here is what we had to do to use Koha in utf-8, in the hope that it helps some of your discussions:
We have been using Koha since 2.2.0 and are now at 2.2.2b.
We use English for the intranet and English-Turkish for the OPAC.
The platform is Windows.
We changed the character set of the database to utf-8 with the iso-xxxx data still in it. This was no problem for MySQL, as you are moving up the ladder, and there was no need to reload the data (10 min).
We changed every charset=iso-xxxx in the templates to read utf-8 and saved the files as utf-8 in a simple text editor (15 min).
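The template step (change the declared charset, then save the file as utf-8) can also be done in a script rather than a text editor. The Python sketch below is an illustration only. The post does not say which iso-xxxx encoding was in use, so iso-8859-9 (the Turkish Latin alphabet) is an assumption, and the function name is hypothetical.

```python
import re

# Matches declarations such as charset=iso-8859-1 or charset=ISO-8859-9.
CHARSET_RE = re.compile(r"charset=iso-8859-\d+", re.IGNORECASE)

def template_to_utf8(raw: bytes, legacy_encoding: str = "iso-8859-9") -> bytes:
    """Return the template content re-encoded as UTF-8 with an updated charset."""
    try:
        # Content that is already valid UTF-8 (including plain ASCII) is kept as-is.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Otherwise assume the legacy single-byte encoding (an assumption).
        text = raw.decode(legacy_encoding)
    text = CHARSET_RE.sub("charset=utf-8", text)
    return text.encode("utf-8")
```

To convert a file, read it with `Path.read_bytes()`, pass the result through `template_to_utf8`, and write it back with `Path.write_bytes()`. Keep a backup of the original templates, because a wrong guess for the legacy encoding produces garbled characters.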
Character decoding in Biblio for MARC21 is ambiguous to us because it is not clear which character encoding it converts from. All the MARC records we bulk-import are MARC-8, ISO 2709, ANSEL, or whatever you want to call them. So we simply wrote a one-to-one character mapping from MARC-8 to utf-8 for our Turkish accented characters. Here it is:
# Additional Turkish characters
s/(\xf0)s/ş/gm;
s/(\xf0)S/Ş/gm;
s/(\xf0)c/ç/gm;
s/(\xf0)C/Ç/gm;
s/\xe7\x49/İ/gm;
s/(\xe6)G/Ğ/gm;
s/(\xe6)g/ğ/gm;
s/\xB8/ı/gm;
s/\xB9/£/gm;
s/(\xe8|\xc8)o/ö/gm;
s/(\xe8|\xc8)O/Ö/gm;
s/(\xe8|\xc8)u/ü/gm;
s/(\xe8|\xc8)U/Ü/gm;
s/\xc2\xb8/ı/gm;
All the character codes are taken directly from LC's website about MARC21. Since we provided the actual characters rather than their codes, we saved Biblio.pm as utf8 to save time. (Half a day, including research.)
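For readers who want to try the mapping outside Koha, here is a rough Python equivalent of the substitutions above. It is a sketch under assumptions, not the code in Biblio.pm: the function and table names are hypothetical, it covers only the byte sequences listed in the post, and it does all replacements in a single pass so that a character produced by one rule cannot be matched again by another.

```python
import re

# MARC-8 byte sequences (as Latin-1 code points) -> Unicode characters.
# In MARC-8 the combining mark comes before the base letter.
TURKISH_MARC8 = {
    "\xf0s": "ş", "\xf0S": "Ş", "\xf0c": "ç", "\xf0C": "Ç",  # cedilla
    "\xe7I": "İ",                                            # dot above
    "\xe6G": "Ğ", "\xe6g": "ğ",                              # breve
    "\xc2\xb8": "ı", "\xb8": "ı",                            # dotless i
    "\xb9": "£",                                             # pound sign
    "\xe8o": "ö", "\xc8o": "ö", "\xe8O": "Ö", "\xc8O": "Ö",  # umlaut
    "\xe8u": "ü", "\xc8u": "ü", "\xe8U": "Ü", "\xc8U": "Ü",
}

# Longest keys first, so a two-byte sequence wins over a one-byte one.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(TURKISH_MARC8, key=len, reverse=True))
)

def marc8_turkish_to_str(raw: bytes) -> str:
    """Decode MARC-8 bytes to text, converting only the sequences in the table."""
    text = raw.decode("latin-1")  # maps every byte to the code point of the same value
    return _PATTERN.sub(lambda m: TURKISH_MARC8[m.group(0)], text)

print(marc8_turkish_to_str(b"T\xe8urk\xf0ce"))  # prints Türkçe
```

Bytes above 0x7F that are not in the table pass through as their Latin-1 characters, so this is only suitable for records whose accented characters are limited to the Turkish ones listed.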
We have a fully working Koha in utf8 supporting all characters, and we repeat the same steps every time we get an update.
Translation of the OPAC files through .po files does not work for us. As we see it, the po translator is simply a search-and-replace text engine. So it converts the string 'English English <somevariable> English.' to 'Turkish Turkish <somevariable> Turkish', which is useless, as it should be 'Turkish <somevariable> Turkish Turkish'.
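The word-order problem can be shown with a toy example. The Python sketch below is purely illustrative and does not reproduce how Koha's translation tool works: the strings, the glossary, and both function names are made up. It contrasts translating the text fragments around a variable one by one with translating the whole sentence, where the translator decides where the variable goes.

```python
def translate_by_segments(segments, glossary):
    """Translate each fragment on its own; the variable keeps its English position."""
    return " ".join(glossary.get(seg, seg) for seg in segments)

def translate_whole(message, catalog):
    """Look up the full sentence, so the translation can move the placeholder."""
    return catalog.get(message, message)

# Hypothetical strings for illustration only.
segments = ["Found", "{count}", "results"]
glossary = {"Found": "bulundu", "results": "sonuç"}
catalog = {"Found {count} results": "{count} sonuç bulundu"}

print(translate_by_segments(segments, glossary).format(count=5))  # bulundu 5 sonuç
print(translate_whole("Found {count} results", catalog).format(count=5))  # 5 sonuç bulundu
```

The first result keeps the English word order, which is the kind of output the post calls useless. The second has the order a Turkish reader expects.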
So we sat down and translated the OPAC templates into proper Turkish. It is now easier for our people to follow the changes in CVS and apply them to the templates than to redo complete translations every time.
The whole update now takes less than half a day with one person doing it.
We, as Windows people, do not have much experience with this po editor. But as far as I know it supports utf-8, so what's the hassle about these translations? As far as we understand it, the official language of Koha is English, and if someone is translating it into another language, it is their responsibility to find the resources to translate it in time for it to be implemented as an additional language, even if this requires a complete rewrite of some templates.
Finally, we believe that Koha should start using utf-8 ASAP, before the move to Zebra, to gain experience. If Zebra is implemented with all this iso stuff, we will have more problems, with each translation requiring a different character set, sort order, character mapping, etc.
Koha becomes more powerful with more features, stability, and performance, and I believe people will be happier to see improvements in these even if they have to spend a little more of their own resources on translations.