Hey, alpha.
In the HTML for the forums, there is the following code:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
The charset is defined as iso-8859-1. It's typically bad form to explicitly define the character set of a page this way, for internationalisation reasons. Most browsers will force the page into whatever charset is defined in the meta tags and ignore the browser settings. This makes the forums inconvenient to use with Unicode and other languages. Could you possibly remove the definition of the character set?
So, if there's not any reason you need to force a Latin-1 encoding, it'd be handy if that declaration weren't there.
Also, this may sound nasty, but the forum is predominantly English. Whilst we do love our international neighbours with all our hearts, it is appreciated when everyone speaks the same language, since it makes discussion a lot easier and breaks down those international barriers.
Yes, and I'm sure the moment Alpha changes this code, the forums are going to go into a total breakdown, with everyone starting their own language-based threads; all the Italians will be in the Italian thread, all the Germans in the German thread, all the French people in the French thread... Oh, wait, those languages all use the same character set as English. And we haven't had a problem with it. At all.
I do believe I have already seen use of Kanji on the forums here. The current setup isn't preventing its use, and changing the setup has other benefits.
I think your worries are a tad overblown, but... *shrug*
chair to Creation and then suplex the Void.
There are people on this forum right now with sigs in different charsets; it's really not the end of civilization.
Edit: And while we may be an English-speaking forum, that doesn't mean that we never discuss anything else. This thread was originally created because we were discussing Greek in D&D, and having to manually tell their browsers that, despite the charset declaration in the HTML, phpBB wasn't really just outputting Latin-1 encoded content got pretty old for people.
Would it, though? I thought ISO-8859-1 corresponded to the first couple sets of UTF? Anyway, if they're using XHTML, aren't you supposed to make a charset declaration?
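For what it's worth, here is a rough Python sketch of how Latin-1 and Unicode line up. As far as I know, the 256 Latin-1 byte values match the first 256 Unicode code points, but UTF-8 only stores the first 128 (plain ASCII) as the same single bytes. The sample strings and the helper name `fits_latin1` are just made up for illustration.

```python
# Every Latin-1 byte value decodes to the Unicode code point with the same number.
for byte in range(256):
    assert ord(bytes([byte]).decode("iso-8859-1")) == byte

text = "café"
latin1_bytes = text.encode("iso-8859-1")  # "é" is the single byte 0xE9
utf8_bytes = text.encode("utf-8")         # "é" becomes the two bytes 0xC3 0xA9

# Pure ASCII text is byte-for-byte identical in both encodings.
assert "cafe".encode("iso-8859-1") == "cafe".encode("utf-8")


def fits_latin1(s):
    """Return True if every character of s can be stored as Latin-1."""
    try:
        s.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(latin1_bytes, utf8_bytes)
print(fits_latin1("café"), fits_latin1("λόγος"))
```

So the code points agree, but the bytes on the wire don't once you leave ASCII, and Greek doesn't fit in Latin-1 at all.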
That was well thought out and written, and I confess to not having given the whole thing as much thought.
This is correct. But to really make the change to UTF-8, you also need to set the HTTP Content-Type entity header to "text/html; charset=UTF-8". Otherwise some browsers behave goofy when they get conflicting information (HTTP header says 8859-1 & meta tag says UTF-8). Setting the HTTP header also instructs the UA to encode data (like in a POST) in the same charset that the server sent.
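To make the "conflicting information" case concrete, here is a small Python sketch that compares the charset in a Content-Type header value with the one in the page's meta tag. It is only an illustration: the function names are hypothetical, the regex is a crude stand-in for a real HTML parser, and it says nothing about how phpBB or any particular browser actually resolves the conflict.

```python
import re
from email.message import Message

# Crude pattern for the content="..." attribute of a <meta> tag (illustration only).
META_RE = re.compile(r'<meta[^>]+content=["\']([^"\']+)["\']', re.IGNORECASE)


def charset_from_content_type(value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


def charset_from_meta(html):
    """Return the charset declared in the first matching <meta> tag, or None."""
    match = META_RE.search(html)
    return charset_from_content_type(match.group(1)) if match else None


def charsets_conflict(header_value, html):
    """True when both header and meta tag declare a charset and they differ."""
    header = charset_from_content_type(header_value)
    meta = charset_from_meta(html)
    return header is not None and meta is not None and header != meta


page = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
print(charsets_conflict("text/html; charset=ISO-8859-1", page))
print(charsets_conflict("text/html; charset=UTF-8", page))
```

The first call is the goofy case described above (header says 8859-1, meta says UTF-8); the second is what you want after the change.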
EDIT: I should also mention that there would have to be server-side changes to the posting system to notify PHP that the content is coming in as UTF-8 rather than 8859-1. Not sure how this works in PHP; in Java you have to make a call to request.setCharacterEncoding("UTF-8") before you read any of the parameters from the request object. I've implemented multi-language websites including languages like Chinese & Arabic and it's non-trivial.
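Here is a rough Python sketch of why the server has to be told the incoming charset. It is not the PHP or Java code, just the same idea using the standard library: the form field name `msg` and the sample text are made up, and `parse_qs` stands in for whatever the posting system uses to read parameters.

```python
from urllib.parse import parse_qs, urlencode

original = "λόγος"

# What a browser would send in a POST body if it encodes the form as UTF-8.
body = urlencode({"msg": original}, encoding="utf-8")

# Server decodes with the charset the browser actually used: text survives.
right = parse_qs(body, encoding="utf-8")["msg"][0]

# Server still assumes Latin-1: each UTF-8 byte turns into its own character.
wrong = parse_qs(body, encoding="iso-8859-1")["msg"][0]

print(body)
print(right == original)
print(len(original), len(wrong))
```

The bytes on the wire are identical in both cases; only the server's assumption changes, which is why the decoding charset has to be set before the parameters are read.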