The charset is defined as ISO-8859-1. It's typically bad form to explicitly define the language of a page for internationalisation reasons. Most browsers will force the page into whatever charset is defined in the meta tags and ignore the browser settings. This makes the forums inconvenient to use with Unicode and other languages. Could you possibly remove the definition of the character set?
I too have run into this. Every time someone asks for Japanese help in H/A, I have to manually change the encoding in my browser to be able to see what I've just typed, and that only persists 'til the end of the session in Safari.
So, if there's not any reason you need to force a Latin-1 encoding, it'd be handy if that declaration weren't there.
Also, this may sound nasty, but the forum is predominantly English. Whilst we do love our international neighbours with all our hearts, it is appreciated when everyone speaks the same language, since it makes discussion a lot easier and breaks down those international barriers.
Yes, and I'm sure the moment Alpha changes this code, the forums are going to go into a total breakdown, with everyone starting their own language-based threads; all the Italians will be in the Italian thread, all the Germans in the German thread, all the French people in the French thread... Oh, wait, those languages all use the same character set as English. And we haven't had a problem with it. At all.
My point is that we're an English speaking forum, so why enable the use of other character sets? It seems rather pointless.
I'm worried about spread and abuse by japanophiles.
I do believe I have already seen use of Kanji on the forums here. The current setup isn't preventing its use, and changing the setup has other benefits.
I think your worries are a tad overblown, but... *shrug*
Just_Bri_Thanks on
...and when you are done with that; take a folding
chair to Creation and then suplex the Void.
The posting of other character sets already works, insofar as I can type them in and phpBB accepts them just fine. The only thing having the Latin-1 charset declaration does is cause some browsers to display things wrong.
There are people on this forum right now with sigs in different charsets; it's really not the end of civilization.
Edit: And while we may be an English-speaking forum, that doesn't mean that we never discuss anything else. This thread was originally created because we were discussing Greek in D&D, and having to manually tell their browsers that, despite the charset declaration in the HTML, phpBB wasn't really just outputting Latin-1 encoded content got pretty old for people.
If they're going to do anything, they should change it to UTF-8 (although that'd mess up all the posts, topics, user names, etc. already in the database that use the last 128 characters in ISO-8859-1).
Leaving it undefined is dumb, though.
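To make the database concern concrete, here is a minimal Python sketch. It assumes a hypothetical stored value ("café") and hypothetical helper names; it says nothing about how phpBB actually stores or migrates its data. It only shows that bytes written as ISO-8859-1 are generally not valid UTF-8 once they leave the ASCII range, and that a one-off re-encoding would be needed.

```python
# Hypothetical stored value: "café" as it would sit in an ISO-8859-1 column.
stored = "caf\u00e9".encode("iso-8859-1")  # b'caf\xe9'


def reads_as_utf8(raw):
    """Return True if raw is a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(raw):
    """Re-encode ISO-8859-1 bytes as UTF-8 without changing the text."""
    return raw.decode("iso-8859-1").encode("utf-8")


converted = latin1_to_utf8(stored)
print(reads_as_utf8(stored))     # False: the lone 0xE9 byte is not valid UTF-8
print(reads_as_utf8(converted))  # True: it is now the two bytes 0xC3 0xA9
```

Text that only uses ASCII would pass through unchanged, which is why only posts using the upper half of ISO-8859-1 would be affected.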
Would it, though? I thought ISO-8859-1 corresponded to the first couple sets of UTF? Anyway, if they're using XHTML, aren't you supposed to make a charset declaration?
The only characters ISO-8859-1 and UTF-8 encode with the same bytes are the 7-bit ASCII set. If you want to represent the other half of ISO-8859-1, it takes two bytes per character in UTF-8 (characters beyond that range take three or four). UTF-8 can't use single bytes with the high bit set for individual characters because it has to ensure that, as the Wikipedia page says, "no byte sequence of one character is contained within a longer byte sequence of another character."
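The byte-level difference can be checked with a short Python sketch. The sample characters and the helper name `encoded_forms` are chosen here purely for illustration.

```python
def encoded_forms(text):
    """Return (char, ISO-8859-1 bytes, UTF-8 bytes) for each character."""
    return [(ch, ch.encode("iso-8859-1"), ch.encode("utf-8")) for ch in text]


# "A" is ASCII; U+00E9 (e-acute) and U+00FF sit in the upper half of Latin-1.
for ch, latin1, utf8 in encoded_forms("A\u00e9\u00ff"):
    print(f"U+{ord(ch):04X}  latin-1={latin1.hex()}  utf-8={utf8.hex()}")
# U+0041  latin-1=41  utf-8=41
# U+00E9  latin-1=e9  utf-8=c3a9
# U+00FF  latin-1=ff  utf-8=c3bf

# Every code point in the upper half of ISO-8859-1 (0x80-0xFF) needs two
# bytes in UTF-8, while the ASCII half is byte-for-byte identical.
upper_half = bytes(range(0x80, 0x100)).decode("iso-8859-1")
ascii_half = bytes(range(0x00, 0x80)).decode("iso-8859-1")
print(all(len(ch.encode("utf-8")) == 2 for ch in upper_half))       # True
print(ascii_half.encode("utf-8") == ascii_half.encode("iso-8859-1"))  # True
```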
As it stands now, the forum overrides my encoding by specifying its own, giving us some common ground. This is hardly "bad form." Taking that away is pretty stupid. Not defining a character set is basically the worst thing you can do for "internationalisation reasons." The only change here that wouldn't be a regression is changing the default encoding to a more universal one, like a Unicode encoding.
That was well thought out and written, and I confess to not having given the whole thing as much thought.
Just_Bri_Thanks on
This is correct. But to really make the change to UTF-8, you also need to set the HTTP Content-Type header to "text/html; charset=UTF-8". Otherwise some browsers behave unpredictably when they get conflicting information (the HTTP header says 8859-1 and the meta tag says UTF-8). Setting the HTTP header also tells the UA to encode the data it sends back (like in a POST) in the same character encoding the server used for the page.
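As a rough illustration of keeping the two declarations in agreement, the Python sketch below builds a response whose HTTP header, meta tag, and actual bytes all say UTF-8. The function name `build_page` and the page content are hypothetical; a real forum would do this in its own templating layer.

```python
from html import escape


def build_page(body_text):
    """Return (headers, payload) with one consistent charset declaration."""
    markup = (
        "<!DOCTYPE html><html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
        "<title>demo</title></head>"
        f"<body><p>{escape(body_text)}</p></body></html>"
    )
    payload = markup.encode("utf-8")  # the bytes must match what is declared
    headers = {
        "Content-Type": "text/html; charset=UTF-8",
        "Content-Length": str(len(payload)),
    }
    return headers, payload


# Greek and Japanese sample text, written as escapes to keep the source ASCII.
headers, payload = build_page("\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac \u65e5\u672c\u8a9e")
print(headers["Content-Type"])
```

Note that `Content-Length` counts bytes, not characters, so it has to be computed after encoding.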
EDIT: I should also mention that there would have to be server-side changes to the posting system to notify PHP that the content is coming in as UTF-8 rather than 8859-1. I'm not sure how this works in PHP; in Java you have to make a call to request.setCharacterEncoding("UTF-8") before you read any of the parameters from the request object. I've implemented multi-language websites including languages like Chinese & Arabic, and it's non-trivial.
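The same requirement can be sketched in Python with the standard library's `urllib.parse.parse_qs`, which takes the expected encoding as an argument. The form field name `msg` and the helper `read_message` are made up for this example, and this is only an analogy to the Java call above, not a description of what PHP does.

```python
from urllib.parse import parse_qs

# The same text, "café", as a browser would percent-encode it in a form body.
from_utf8_page = "msg=caf%C3%A9"   # submitted from a page served as UTF-8
from_latin1_page = "msg=caf%E9"    # submitted from a page served as ISO-8859-1


def read_message(form_body, encoding):
    """Decode the hypothetical 'msg' field using the given encoding."""
    return parse_qs(form_body, encoding=encoding, errors="replace")["msg"][0]


right = read_message(from_utf8_page, "utf-8")
garbled = read_message(from_utf8_page, "iso-8859-1")
lossy = read_message(from_latin1_page, "utf-8")

print(right == "caf\u00e9")          # True: encodings agree
print(garbled == "caf\u00c3\u00a9")  # True: two wrong characters instead of one
print(lossy == "caf\ufffd")          # True: invalid byte replaced with U+FFFD
```

The server has no reliable way to tell the two request bodies apart after the fact, which is why the expected encoding has to be set before the parameters are read.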
ask_lesko on
Get free money from the government to open up a coffee shop!