
I think there may be a bug with this tool, guys. I am trying to figure out whether the username is being sent correctly for a user who has a character like ™ in their name.

What encoding is used? Is the HTML returned from a request encoded?

For example, say the user logging in is UserA™

If I use utf-8, then no character is returned and I just get UserA

If I use utf-7, then I get this and I get UserA

If I use ASCII, then I get a ? and I get UserA?
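For anyone trying to reproduce this, here is a rough Python sketch of what each codec does to a name like that. It assumes the special character is the TM symbol (U+2122) mentioned later in the thread, and `encode_lossy` is just a helper name made up for illustration. It says nothing about what the tool itself does internally, only how the codecs behave.

```python
# Hypothetical username; the TM symbol (U+2122) is an assumption.
username = "UserA\u2122"


def encode_lossy(text: str, codec: str) -> bytes:
    """Encode text, substituting '?' for anything the codec cannot represent."""
    return text.encode(codec, errors="replace")


utf8_bytes = encode_lossy(username, "utf-8")      # b'UserA\xe2\x84\xa2'
ascii_bytes = encode_lossy(username, "ascii")     # b'UserA?'
latin1_bytes = encode_lossy(username, "latin-1")  # b'UserA?'
cp1252_bytes = encode_lossy(username, "cp1252")   # b'UserA\x99'
utf7_bytes = encode_lossy(username, "utf-7")      # ASCII-only escape sequence

for name, data in [
    ("utf-8", utf8_bytes),
    ("ascii", ascii_bytes),
    ("latin-1", latin1_bytes),
    ("cp1252", cp1252_bytes),
    ("utf-7", utf7_bytes),
]:
    print(f"{name:8} {data!r}")
```

The ASCII result matches the `UserA?` behaviour described above: the codec has no slot for the symbol, so it is replaced with a question mark.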


Any Neowin coders know about this??

There's no encoding used, I don't believe. It's whatever is in the database. Look at how IPB handles it.


Well, if I don't use any encoding, I get names without the special characters. If I do use an encoding, I get some value in the position of the special character, but I can't decode it.

Is anyone else using this tool? Can anyone else confirm that users who have special characters in their name are coming across ok?

I may be wrong, but you might have to encode it so it comes across ok.

Anytime you have whitespace or a weird character in the URL in your browser, it is converted (encoded) so it can be sent across the net OK. We might need to do the same for this...
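A minimal sketch of that kind of URL (percent) encoding, using Python's standard `urllib.parse`. The username is hypothetical, and which character set the server actually expects is an assumption here, so both a UTF-8 and a Windows-1252 variant are shown.

```python
from urllib.parse import quote, unquote

# Hypothetical username containing a space and the TM symbol (U+2122).
username = "User A\u2122"

# Percent-encode using UTF-8 (the default for quote).
encoded_utf8 = quote(username)                        # 'User%20A%E2%84%A2'

# Percent-encode using Windows-1252, where the TM symbol is a single byte.
encoded_cp1252 = quote(username, encoding="cp1252")   # 'User%20A%99'

# Decoding only round-trips if the receiver uses the same character set.
decoded = unquote(encoded_cp1252, encoding="cp1252")

print(encoded_utf8)
print(encoded_cp1252)
print(decoded == username)
```

The point is that both ends have to agree on the character set: the same name produces different escape sequences depending on the encoding chosen.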

Yeah, Neowin does use Latin-1 (ISO-8859-1). View the page source; it's in the content-type header tag.

The problem is I don't know how the script handles internationalised characters (I'm guessing the TM symbol was entered as Unicode). I also have no idea how it translates between Unicode and Latin-1 (e.g. how it handles wide characters), but it does look like it doesn't support wide characters, because it's handling the two bytes separately, not as one wide character.

Edit: OK, reading the wiki more, if the symbol isn't Unicode, it's a problem with codepages. ISO-8859-1 doesn't have the TM symbol, but Windows-1252 does. The problem is that the two are basically the same, so web browsers (and I'm guessing other apps) treat them as such. ISO-8859-1 has a range of control characters (which are invalid in HTML) where Windows-1252 has printable characters, including the TM symbol.

Edited by The_Decryptor
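To illustrate the codepage difference described above, here is a small Python sketch. It only demonstrates how the two codecs treat the TM symbol and its byte; the variable names are made up, and it is an assumption that the tool's problem comes down to this.

```python
tm = "\u2122"  # the TM symbol

# Windows-1252 has the TM symbol at byte 0x99.
cp1252_byte = tm.encode("cp1252")  # b'\x99'

# ISO-8859-1 (Latin-1) has no TM symbol, so strict encoding fails.
try:
    tm.encode("latin-1")
    latin1_can_encode = True
except UnicodeEncodeError:
    latin1_can_encode = False

# The same byte read back under each codepage:
as_latin1 = cp1252_byte.decode("latin-1")  # '\x99', an invisible C1 control character
as_cp1252 = cp1252_byte.decode("cp1252")   # the TM symbol again

# If the symbol was sent as UTF-8 and each byte is decoded separately
# as Windows-1252, it turns into several unrelated characters.
mojibake = tm.encode("utf-8").decode("cp1252")

print(cp1252_byte, latin1_can_encode)
print(repr(as_latin1), as_cp1252)
print(mojibake)
```

So a reader that treats the data as strict Latin-1 would see a control character (or nothing at all), while one that treats it as Windows-1252 would see the symbol.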

Well just for S&G, let me try using that Windows-1252 instead... :)
