This chat, apparently, is a gigantic coding horror

24 comments, last by Sik_the_hedgehog 9 years, 6 months ago
So lots of little quirks about the chat.
Some people here discovered some time ago that typing the message "=A=" gives "__E&E__". Well, isn't that just the cutest little thing?

Today, I found out why.
While inspecting network requests to and from the chat, I noticed that "=" gets transformed into "__E__".
Hey, that's curious... looks a bit like an HTML entity would, doesn't it?

Sure enough, it seems a small handful of these entity-like tokens are translated:
token  decodes to
__A__  &
__PS__ +
__C__  ,
__E__  =
So the original string, "=A=", expands to "__E__A__E__", but for some reason, "__A__" is replaced first, thus giving the strange "__E&E__". However, "__A__E__A__" is correctly contracted to "&E&".
This gives me the terrible feeling that someone somewhere is chaining string replace on the message like so:

message = message
    .replace("__PS__","+")
    .replace("__A__","&")
    .replace("__E__","=")
    .replace("__C__",",");
... which doesn't make sense, since both PHP and javascript have a callback for regular expression replace:

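$replacements = [
    "__A__" => "&", # or html equivalent
    "__PS__" => "+",
    "__C__" => ",",
    "__E__" => "="
];

# preg_replace_callback takes (pattern, callback, subject); PCRE has no /g flag
$message = preg_replace_callback('/__[A-Z]{1,2}__/', function ($match) use ($replacements) {
    # $match[0] is the full matched token; unknown tokens pass through unchanged
    return $replacements[$match[0]] ?? $match[0];
}, $message);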
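For comparison, here is a minimal JavaScript sketch of the same single-pass idea, using String.prototype.replace with a callback. The names TOKENS and decodeTokens are made up for illustration and are not taken from the chat's source; the token table only covers the four tokens listed above.

```javascript
// Hypothetical token table; the real chat may define more tokens.
const TOKENS = new Map([
  ['__A__', '&'],
  ['__PS__', '+'],
  ['__C__', ','],
  ['__E__', '='],
]);

// Scan the message once, left to right. Each match is decoded at most once
// and the replacement text is never rescanned, so the order of the table
// entries no longer matters.
function decodeTokens(message) {
  return message.replace(/__[A-Z]{1,2}__/g, (token) =>
    TOKENS.has(token) ? TOKENS.get(token) : token
  );
}

console.log(decodeTokens('__E__A__E__')); // "=A="
console.log(decodeTokens('__A__E__A__')); // "&E&"
```

This only removes the ordering problem. A user who literally types "__E__" would still see "=", so the scheme stays ambiguous unless the underscores themselves are escaped somehow.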

Sooo... wtf, IPS.
Yea, when me+bact first discovered them (which was a complete accident, as bact was showing some code iirc, and it had ,A, in it), we surmised it's some type of custom encoding scheme, rather than using base64.

Also if you went through the source code did you try out the bad words?
Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.

Also if you went through the source code did you try out the bad words?

I did not go through the source code, but I'm pretty positive this is what's going on. I know there are other quirks, but what do you mean by "trying out the bad words"?
ipb.chat.badwords.set( 'yeahmobi', [ 1, "[A REALLY ANNOYING AND DISREPUTABLE COMPANY THAT SPAMS FORUMS]" ] );
ipb.chat.badwords.set( 'YeahMobi', [ 1, "[A REALLY ANNOYING AND DISREPUTABLE COMPANY THAT SPAMS FORUMS]" ] );
These, however, do not actually do anything when you type them into chat, unfortunately =-(

edit: also when i was reading the title, i expected code snippets of people in chat, not the chat itself =-P
What, you mean like this thing that happened yesterday?


Stormynature+> L. Spiro I would never think of you as a whiner...just a very deep navel gazer confused by the mysteries of belly button lint
L. Spiro+> I did do some naval grazing back in my time.
riuthamus has entered the room
riuthamus> naval gazing....
riuthamus> i had to read the context to understand
riuthamus> now that i do, im not sure i wanted to...
NightCreature83+> so you go navel gazing instead?
fastcall22> suddenly, i have a strong feeling to clean my navel
fastcall22> my naval ship, that is
fastcall22> all aboard the u.s.s. yourmom, now taking off to the issofat island
riuthamus kicked fastcall22


EDIT:
Or do you mean like anything WiredCat posts? His code gives me nightmares...
ah, that was a classic one fast =-P

Yes, I actually found the code that caused that issue. It comes from some "dirtyMessage" and "undirtyMessage" functions (can't check the exact names now, try and grep the JS code for "dirty") in the javascript code that do exactly what you are saying: replacing special symbols by encoded values to send them over the network. But the encoding and decoding functions are not inverses of each other, and in some circumstances the encoding is ambiguous, which causes the decoding function to decode the wrong thing, resulting in the bug observed. Basically, whoever implemented it tried to be smart, and failed.

Should've just used base64.
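A rough sketch of what the base64 route could look like in JavaScript, assuming an environment that provides btoa, atob, TextEncoder and TextDecoder (browsers and recent Node versions do). The helper names encodeMessage and decodeMessage are hypothetical, not from the chat's code.

```javascript
// Hypothetical helpers, not from the chat's source.
function encodeMessage(message) {
  // UTF-8 encode first, because btoa only accepts code points 0-255.
  const bytes = new TextEncoder().encode(message);
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

function decodeMessage(encoded) {
  // Reverse the steps: base64 -> byte string -> UTF-8 decode.
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(encodeMessage('=A='));                // "PUE9"
console.log(decodeMessage(encodeMessage('=A='))); // "=A="
```

One catch: the standard base64 alphabet itself contains "+", "/" and "=", so the output would still need URL-encoding (or a URL-safe base64 variant) if those characters are what the chat was trying to avoid in the first place.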

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

It's especially funny since javascript already has a built in function called encodeURIComponent which replaces certain characters with their respective safe counterparts.
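A quick illustration of the built-in pair, encodeURIComponent and decodeURIComponent, on the kind of input that trips up the chat. The sample string is made up; what the chat does with a message after decoding it is not something this sketch covers.

```javascript
// Every character the chat special-cases (, = + & %) gets percent-encoded.
const original = '=A=, 50% & more + stuff';
const encoded = encodeURIComponent(original);
const decoded = decodeURIComponent(encoded);

console.log(encodeURIComponent('=A=')); // "%3DA%3D"
console.log(decoded === original);      // true
```

Because "%" itself is escaped to "%25", a literal "%3D" typed by a user survives the round trip instead of turning into "=".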

Yeah that was my first thought, but I'm not sure if some symbols aren't still parsed after the decoding (can somebody who knows better confirm if this is the case or not?).

Also, why does + get encoded to __PS__ and not __P__? *OCD mode kicks in*

Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

Okay, now that I've checked it's not "dirty" but "clean", close enough.. behold the cleanMessage() function:


function (message) {
  message=message.replace(/\r/g,'');
  message=message.replace(/\n/g,"__N__");
  message=message.replace(/,/g,"__C__");
  message=message.replace(/=/g,"__E__");
  message=message.replace(/\+/g,"__PS__");
  message=message.replace(/&/g,"__A__");
  message=message.replace(/%/g,"__P__");
  return message;
}

And the unCleanMessage() function:


function (message) {
  message=message.replace(/__PS__/g,"+");
  message=message.replace(/__P__/g,"%");
  message=message.replace(/__A__/g,"&");
  message=message.replace(/__E__/g,"=");
  message=message.replace(/__C__/g,",");
  message=message.replace(/__N__/g,"<br />");
  return message;
}

Now take a look at what happens if you type in "=A=" for instance.. the two equal signs get replaced and so it gets encoded to "__E__A__E__". Which then promptly gets decoded to "__E&E__" as the middle part "__A__" happens to get replaced first. Oops.
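To see it for yourself, here is a small runnable reproduction. The function bodies are copied from the two functions quoted above; only the names are attached so they can be called.

```javascript
function cleanMessage(message) {
  message = message.replace(/\r/g, '');
  message = message.replace(/\n/g, '__N__');
  message = message.replace(/,/g, '__C__');
  message = message.replace(/=/g, '__E__');
  message = message.replace(/\+/g, '__PS__');
  message = message.replace(/&/g, '__A__');
  message = message.replace(/%/g, '__P__');
  return message;
}

function unCleanMessage(message) {
  message = message.replace(/__PS__/g, '+');
  message = message.replace(/__P__/g, '%');
  message = message.replace(/__A__/g, '&'); // runs before __E__, eating its underscores
  message = message.replace(/__E__/g, '=');
  message = message.replace(/__C__/g, ',');
  message = message.replace(/__N__/g, '<br />');
  return message;
}

const sent = cleanMessage('=A=');
console.log(sent);                                // "__E__A__E__"
console.log(unCleanMessage(sent));                // "__E&E__"
console.log(unCleanMessage(cleanMessage(',A,'))); // "__C&C__"
```

The same thing happens with ",A,", which matches the accidental discovery mentioned earlier in the thread.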


This topic is closed to new replies.
