xoops forums

Dona_Brasil

Not too shy to talk
Posted on: 2008/8/8 15:30
Posts: 153
Since: 2005/10/28
#11

Re: How do I change charset=UTF-8

How about the compatibility for people who upgrade? The data in my database is not necessarily UTF-8, is it?

ghia

Community Support Member
Posted on: 2008/8/8 16:04
Posts: 4954
Since: 2008/7/3 1
#12

Re: How do I change charset=UTF-8

Quote:
anderssk wrote:
If you change to latin1 in global.php and in MySQL, you run with ISO encoding.
That's a long way around compared with re-encoding Formulaire's language files.

No, it isn't. If you do it right from the start, you have only two entries in the XOOPS install setup to change. Otherwise you end up converting a lot of language and SQL files. English sites won't have many problems, but when accented characters are heavily used, as in French and German, it becomes a totally different story.

Quote:
If you run the installation with UTF-8 and use Formulaire with the English language, you don't have any problems?

I don't know. I only tried the combinations described in my post.

Quote:
Dona_Brasil wrote:
How about the compatibility for people who upgrade? The data in my database is not necessarily UTF-8, is it?

You can look up your current encoding in phpMyAdmin; most of the time it is the default MySQL setting of latin1 with the latin1_swedish_ci collation. I recommend using these settings in the install setup of XOOPS 2.3.0.

anderssk

Quite a regular
Posted on: 2008/8/8 21:18
Posts: 335
Since: 2006/3/21
#13

Re: How do I change charset=UTF-8

You are right about the database. It will cause some problems when upgrading from 2.0.18 to 2.3.0.

But the 4 language files in ../install/language/yourlanguage have to be encoded in UTF-8.

trabis

Core Developer
Posted on: 2008/8/8 21:31
Posts: 2268
Since: 2006/9/1 1
#14

Re: How do I change charset=UTF-8

And then I ask: what sense is there in having lang files in UTF-8? I would need to buy a special editor just to make small changes in language files. UTF-8 sucks big time for me. Why can't I just use my old and pretty simple Notepad? This will soon become the most-read FAQ.

ghia

Community Support Member
Posted on: 2008/8/8 22:52
Posts: 4954
Since: 2008/7/3 1
#15

Re: How do I change charset=UTF-8

Quote:
anderssk wrote:
But the 4 language files in ../install/language/yourlanguage have to be encoded in UTF-8.

In my release, no language files were present other than English. I don't think that would be a problem, since these files are only used temporarily during the installation, which generates UTF-8 HTML. Or does setting the database encoding to something else also change the encoding of the HTML?

There is also an SQL file for the standard XOOPS smilies and ranks. Will that work out all right for any chosen database encoding?

Maybe the language files of the install should contain a definition of the default character set encoding for the database.
I myself would opt for latin1 for our national languages Dutch, French and German.

Quote:
trabis wrote:
And then I ask: what sense is there in having lang files in UTF-8? I would need to buy a special editor just to make small changes in language files. UTF-8 sucks big time for me. Why can't I just use my old and pretty simple Notepad? This will soon become the most-read FAQ.

Are you talking about Windows Notepad? There you can select the encoding when doing File - Save As. It offers ANSI, UTF-8, and Unicode little and big endian.
At the moment, most modules have ISO-encoded language and SQL files. There will be a lot of problems when going to UTF-8.

And if a Unicode character set had to be chosen, I would prefer UTF-16. This makes it easier to reserve data space for strings: it's simply double the required string length.
Unfortunately, MySQL isn't yet complete on that.
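
To make the storage argument concrete, here is a minimal Python sketch (not XOOPS or MySQL code) that counts how many bytes the same text takes in latin1, UTF-8 and UTF-16. The helper name `byte_lengths` and the sample strings are made up for illustration. It suggests that "double the string length" holds for UTF-16 only while every character is in the Basic Multilingual Plane; characters outside it take four bytes.

```python
def byte_lengths(text):
    """Return the encoded size of `text` in bytes for a few encodings.

    A value of None means the encoding cannot represent the text at all.
    """
    sizes = {}
    for name in ("latin-1", "utf-8", "utf-16-le"):
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None
    return sizes


# Hypothetical sample strings, written with escapes so the source stays ASCII.
samples = [
    "hello",              # plain ASCII
    "Gr\u00f6\u00dfe",    # German word with o-umlaut and sharp s
    "\U0001F600",         # one character outside the Basic Multilingual Plane
]

for sample in samples:
    print(len(sample), "characters:", byte_lengths(sample))
```

For Western European text, latin1 stays at one byte per character, while UTF-8 needs two bytes only for the accented characters.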

trabis

Core Developer
Posted on: 2008/8/8 23:07
Posts: 2268
Since: 2006/9/1 1
#16

Re: How do I change charset=UTF-8

Quote:

ghia_ wrote:
Are you talking about Windows Notepad? There you can select the encoding when doing File - Save As. It offers ANSI, UTF-8, and Unicode little and big endian.

Yes, but if you have a UTF-8 file and you edit it in Notepad, it will show weird characters; it will not convert them. You cannot use Notepad to convert UTF-8 to ISO. Well, I have not figured out how to do it yet. :(

anderssk

Quite a regular
Posted on: 2008/8/9 7:48
Posts: 335
Since: 2006/3/21
#17

Re: How do I change charset=UTF-8

Notepad++ can also convert from UTF-8 to ANSI as well as from ANSI to UTF-8.

Open your language files and select Format -> Convert to UTF-8 (without BOM) or
Format -> Convert to ANSI.
Save the file again.
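
For anyone who would rather script the same conversion than click through an editor, here is a minimal Python sketch. It is only an illustration: the function names `reencode` and `convert_file` and the file names are hypothetical, and it assumes that "ANSI" means Windows code page 1252, which is the usual case on Western European Windows systems but not everywhere.

```python
import tempfile
from pathlib import Path


def reencode(data, from_enc="utf-8-sig", to_enc="cp1252"):
    """Re-encode raw bytes from one character encoding to another.

    "utf-8-sig" reads UTF-8 and silently drops a leading BOM if present.
    Raises UnicodeError if the input is not valid `from_enc` or contains
    a character that `to_enc` cannot represent.
    """
    text = data.decode(from_enc)
    return text.encode(to_enc)


def convert_file(src, dst, from_enc="utf-8-sig", to_enc="cp1252"):
    """Read `src`, re-encode it, and write the result to `dst`."""
    # Bytes are read and written directly so line endings are left untouched.
    Path(dst).write_bytes(reencode(Path(src).read_bytes(), from_enc, to_enc))


if __name__ == "__main__":
    # Demo on a throwaway file; the file names are made up.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "main_utf8.php"
        dst = Path(tmp) / "main_ansi.php"
        src.write_bytes("<?php // Fran\u00e7ais".encode("utf-8"))
        convert_file(src, dst)
        print(dst.read_bytes())
```

For the other direction (ANSI to UTF-8 without BOM), call it with `from_enc="cp1252"` and `to_enc="utf-8"`. Writing to a separate output file keeps the original intact in case the guess about the source encoding was wrong.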

ghia

Community Support Member
Posted on: 2008/8/10 0:24
Posts: 4954
Since: 2008/7/3 1
#18

Re: How do I change charset=UTF-8

Quote:
trabis wrote:
Yes, but if you have a UTF-8 file and you edit it in Notepad, it will show weird characters; it will not convert them. You cannot use Notepad to convert UTF-8 to ISO. Well, I have not figured out how to do it yet. :(

Normally Notepad (V5.0) saves a BOM at the beginning of a UTF-8 text, but it is also good at recognising UTF-8 without a BOM, like this one. The only way to see how the file was recognised is to use File - Save As, where the encoding will be preset to the current format.
If a UTF-8 file is incorrectly recognised as an ANSI text file, you see weird characters, mostly a kind of A (such as Ã), with more than one character in the place of each accented character. In that case copy ï»¿ or type Alt+0239 Alt+0187 Alt+0191 at the beginning of the file. Save, close Notepad and then reopen the file. The inserted characters are no longer displayed and are now used as the BOM, which forces the file to be read as UTF-8.
If you see identical weird characters such as blocks while your file is correctly interpreted as UTF-8, your font's character set is insufficient. Select all and use Format - Font to set the font to Lucida Sans Unicode or DejaVu Sans Mono, or download FreeFont and set it to FreeMono. You should now be able to view most of the characters.
Then Save As with encoding ANSI.
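
The two effects described above can be reproduced with a short Python sketch. It is an illustration only, not a description of how Notepad works internally: the helper names are hypothetical, and "ANSI" is again assumed to be Windows code page 1252. It shows UTF-8 text read as ANSI (two odd characters per accented letter) and checks that the three characters typed with Alt+0239, Alt+0187 and Alt+0191 are the same bytes as the UTF-8 BOM.

```python
import codecs


def has_utf8_bom(data):
    """True if the raw bytes start with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def add_utf8_bom(data):
    """Prepend the UTF-8 BOM unless the bytes already start with one."""
    return data if has_utf8_bom(data) else codecs.BOM_UTF8 + data


def looks_like_utf8(data):
    """Heuristic: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# UTF-8 bytes for a French word with two accented letters.
utf8_bytes = "\u00e9l\u00e8ve".encode("utf-8")

# Reading those bytes as ANSI gives two characters per accented letter.
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# The BOM bytes, seen as ANSI characters, have codes 239, 187 and 191.
bom_as_ansi = codecs.BOM_UTF8.decode("cp1252")
print([ord(ch) for ch in bom_as_ansi])

print(has_utf8_bom(utf8_bytes), has_utf8_bom(add_utf8_bom(utf8_bytes)))
```

Note that `looks_like_utf8` is only a heuristic: pure ASCII text passes it too, and a latin1 file with accented letters will almost always fail it, which is usually enough to tell the two apart.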

anderssk

Quite a regular
Posted on: 2008/8/10 7:55
Posts: 335
Since: 2006/3/21
#19

Re: How do I change charset=UTF-8

#18
Give Notepad++ a try.
It also shows which encoding the file has, and you can change it with one click.