IRC messages are all messed up :(

Started by
6 comments, last by hplus0603 19 years, 1 month ago
Hey, I've been trying to parse the IRC messages I'm getting from the server for the past day and a half, and I can't get it right. This is what I'm doing to try to separate the messages:

I know that IRC messages end with a carriage return/line feed pair, char(13) char(10), so I tried to split my messages whenever I saw those characters. Basically, what my code is doing is:

- It first receives the message and stores it in a char array called recvbuf that is 512 characters long.
- Then I convert that char array into a string object (s1) so I can manipulate it slightly better.
- I then append s1 to s, which is a string that holds all the messages received so far.
- I then find the position of the line feed character (char 10) in my s string.
- Now I'm inside my while loop, while(i != -1){
- I create another string called subs that is the substring of s starting at position 0 and containing all the characters up to, but not including, the line feed character (char 10).
- Now I have to remove the carriage return character from the very end of subs, so I erase the last character.
- I then push that string onto a queue I have created to hold all the server messages I got.
- I now clear the subs string, and erase all the characters in s from 0 up to and including the line feed character (char 10).
- I then find the next line feed character (char 10) and start the loop over again until there are no more line feed characters.

Here's the code:

string s;
while(1){
    bytesRecv = recv( m_socket, recvbuf, 512, 0 );
    string s1(recvbuf);
    s = s + s1;
    int i = s.find( char(10) );
    while(i != -1){
        subs = s.substr(0 , i);
        subs.erase( subs.length()-1, 1);
        msgQueue.push( subs );
        subs.clear();
        s.erase(0,i+1);
        i = s.find( char(10) );
    }
}

I figure this should work, but all the messages I get from the server after I pop them from the msgQueue are garbled somehow, like this. I can't figure out why I'm getting these weird symbols in front of the PING message and such.

NOTICE AUTH :*** Looking up your hostname └ÿ∩&åNOTICE AUTH :*** Checking Ident NOTICE AUTH :*** Found your hostname └Ö░NÇαj╨☻Çj╨≈9«NÇ☻NOTICE AUTH :*** No ident response ICE AUTH :*** Found your hostname └Ö░NÇαj╨☻Çj╨≈9«NÇ☻PING :1198091985 o ident response ICE AUTH :*** Found your hostname

These garbled characters are different each time I run my program. And sometimes I even get the PING message attached to the end of another message, not separated by a char(13) char(10) pair. Does anyone have experience in parsing IRC messages who could give me some pointers on how to fix this? Any help would be greatly appreciated. Thanks
-------------------------------------
Physics Lab http://www.physics-lab.net
C++ Lab http://cpp.physics-lab.net
Quote:
string s1(recvbuf);



recv() does not zero-terminate the receiving buffer. Thus, this construction of the string may gobble up whatever gunk is in the buffer after the actual received data, until it happens to hit a nul character.

You should zero-terminate the buffer yourself, or use the string constructor that takes pointer and length. You should also check the return value from recv(); if it's less than 0 you have an error and need to deal with it.
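To make the pointer-and-length suggestion concrete, here is a minimal sketch. The helper name append_received is hypothetical (it is not from the original post); it assumes the caller passes in the buffer and the value that recv() returned. It also treats a return value of 0 as a stop condition, since for a TCP socket that means the peer closed the connection.

```cpp
#include <string>

// Hypothetical helper (not from the original post): appends exactly
// `bytes_received` bytes from `buf` to `acc`, so nothing past the received
// data is ever read. Returns false when recv() reported an error (< 0)
// or a closed connection (== 0), so the caller can stop reading.
bool append_received(std::string& acc, const char* buf, int bytes_received)
{
    if (bytes_received <= 0)
        return false;
    // append(pointer, length) does not look for a terminating nul.
    acc.append(buf, static_cast<std::string::size_type>(bytes_received));
    return true;
}

// Intended use inside the receive loop from the question (sketch only):
//     bytesRecv = recv(m_socket, recvbuf, 512, 0);
//     if (!append_received(s, recvbuf, bytesRecv)) break;
```

With this in place the temporary s1 string is no longer needed, and stale bytes left in recvbuf from an earlier, longer read can't leak into s.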
enum Bool { True, False, FileNotFound };
I tried doing that, but it didn't help much.
I'm starting to think it's the Undernet server that I'm joining. I think they're not using the standard protocol. This is the one I'm basing it on:

ftp://ftp.rfc-editor.org/in-notes/rfc1459.txt

I'm surprised you're having so much trouble. I personally used these specs, along with this as a guide, and I didn't have too many problems. Might I suggest that rather than trying to parse your messages right away, you first just get your application to the point where it can receive messages, *then* work on parsing them.

My implementation may have had something to do with it. I made 2 threads, 1 for receiving and printing data (which quite often wrote over whatever I was writing at the time, because it was only a *very* basic console application), and another to send whatever it was I chose to write (followed by '\r\n'), both terminating once the connection was broken (usually after I typed in 'QUIT' and hit enter). Another thing I did was set all of the memory of my buffers to zero, that way the only way it would ever print garbage was if it managed to read all 512 bytes, which is a highly unlikely event.
While programming an IRC client, I noticed that a lot of server implementations do not follow the linefeed standard. In fact, some even mix and match. Sometimes it's 13-10, sometimes 10, sometimes 13. It's something that has to be taken into consideration. Here is my implementation of a message retriever.

#include <cctype>
#include <cstring>
#include <iostream>
#include <new>
#include <string>
#include <vector>
using namespace std;

// Takes a token (a space character, for instance) and a source string,
// chops the string into chunks at each occurrence of the token,
// and returns the non-empty chunks in a vector.
vector<string> STLStrTok(string token, const string &SrcString)
{
    vector<string> DestVector;
    size_t PreviousPosition = 0;

    // set the current position to wherever we find the first token at
    // 0 says start at the beginning of the string
    size_t CurrentPosition = SrcString.find(token, 0);

    // if we've reached the end of the string, and no token was found,
    // add the entire string as one giant chunk and return
    if(CurrentPosition==string::npos)
    {
        DestVector.push_back(SrcString);
        return DestVector;
    }

    // else, if the string isn't blank, push the substring into the vector
    if(SrcString.substr(PreviousPosition, CurrentPosition-PreviousPosition)!="")
        DestVector.push_back(SrcString.substr(PreviousPosition, CurrentPosition-PreviousPosition));

    // keep doing it until we hit beyond the end of the source string
    while(CurrentPosition!=string::npos)
    {
        PreviousPosition = CurrentPosition+token.length();
        CurrentPosition = SrcString.find(token, PreviousPosition);

        if(SrcString.substr(PreviousPosition, CurrentPosition-PreviousPosition)!="")
            DestVector.push_back(SrcString.substr(PreviousPosition, CurrentPosition-PreviousPosition));
    }

    return DestVector;
}

class irc_message
{
public:
    irc_message(void)
    {
        default_prefix = "";
        prefix = "";
        cmd = "";
        params_vector.clear();
    }

    irc_message(string srcDefaultPrefix)
    {
        default_prefix = srcDefaultPrefix;
        prefix = "";
        cmd = "";
        params_vector.clear();
    }

    bool SetMessage(string srcMessageString)
    {
        // is it completely blank? don't bother resetting the message, nothing has been changed yet
        if(srcMessageString == "")
            return false;

        // convert all whitespace to space chars, there shouldn't be any 0xd or 0xa's at this point, mind ya...
        for(string::size_type i = 0; i < srcMessageString.length(); i++)
            if(isspace(static_cast<unsigned char>(srcMessageString[i])))
                srcMessageString[i] = ' ';

        // rip it up using the spaces as separators
        vector<string> MessageParts = STLStrTok(" ", srcMessageString);

        // was it all spaces? :) don't bother resetting the message, nothing has been changed yet
        if(MessageParts.size() == 0)
            return false;

        vector<string>::size_type curr_pos = 0;

        // if there's a prefix, store it
        if(MessageParts[0][0] == ':')
        {
            // first, is that all there was?
            if(MessageParts.size() == 1)
                return false;

            // remove the leading colon
            prefix = MessageParts[0].substr(1, MessageParts[0].length() - 1);

            // store cmd
            cmd = MessageParts[1];
            curr_pos = 2;
        }
        else
        {
            prefix = default_prefix;
            cmd = MessageParts[0];
            curr_pos = 1;
        }

        // everything after the command is a parameter
        for(vector<string>::size_type i = curr_pos; i < MessageParts.size(); i++)
            params_vector.push_back(MessageParts[i]);

        return true;
    }

    string GetPrefix(void)
    {
        return prefix;
    }

    string GetCommand(void)
    {
        return cmd;
    }

    vector<string> GetParamsAsVector(void)
    {
        return params_vector;
    }

    string GetParamsAsString(void)
    {
        string ret = "";
        vector<string>::size_type ParamsSize = params_vector.size();

        if(ParamsSize == 0)
            return ret;

        // join the parameters with single spaces
        for(vector<string>::size_type i = 0; i < ParamsSize; i++)
        {
            ret += params_vector[i];

            if(i != ParamsSize - 1)
                ret += " ";
        }

        return ret;
    }

private:
    string default_prefix;
    string prefix;
    string cmd;
    vector<string> params_vector;
};

vector<irc_message> GetIRCMessages(string input_buffer)
{
    // used to store strings that didn't have a terminator between calls to this function
    static string hold_buffer = "";

    // vector of irc_messages
    vector<irc_message> IRCMessages;

    // add anything in the hold_buffer to the beginning of the input_buffer
    input_buffer = hold_buffer + input_buffer;

    // find the last LF
    string::size_type LF_POS = input_buffer.find_last_of('\n');

    // if no LF was found
    if(string::npos == LF_POS)
    {
        // put the current input_buffer into the hold buffer (see input_buffer = hold_buffer + input_buffer; above)
        hold_buffer = input_buffer;

        // return empty vector
        return IRCMessages;
    }
    else if(LF_POS != input_buffer.length() - 1) // if the last LF found is not the last char in the input_buffer
    {
        // place the remainder of input_buffer in the hold buffer
        hold_buffer = input_buffer.substr(LF_POS + 1, input_buffer.length() - LF_POS - 1);

        // trim input_buffer by removing the remainder (keeping the last LF found in input_buffer)
        input_buffer = input_buffer.substr(0, LF_POS + 1);
    }
    else
    {
        // input_buffer ended with LF (how convenient), so no need to put anything in the hold buffer
        hold_buffer = "";
    }

    // vector of message strings
    vector<string> MessageStrings;

    // break up the input_buffer by CRLF
    MessageStrings = STLStrTok("\r\n", input_buffer);

    // now break up all the new strings by LF
    // start fresh
    vector<string> TempMessageStrings = MessageStrings;
    MessageStrings.clear();

    // for each string in TempMessages
    for(vector<string>::const_iterator msg_iter = TempMessageStrings.begin(); msg_iter != TempMessageStrings.end(); msg_iter++)
    {
        // break up by LF
        vector<string> tmp = STLStrTok("\n", *msg_iter);

        // insert any strings found (there will be minimum 1 if no LF was found) into MessageStrings
        MessageStrings.insert(MessageStrings.end(), tmp.begin(), tmp.end());
    }

    // convert message strings to irc_message objects
    for(vector<string>::const_iterator msg_iter = MessageStrings.begin(); msg_iter != MessageStrings.end(); msg_iter++)
    {
        irc_message msg;

        if(msg.SetMessage(*msg_iter))
            IRCMessages.push_back(msg);
        // else insert error into error message queue
    }

    // final final result
    return IRCMessages;
}

...

// entry into main loop
char *RXBuffer = new(nothrow) char[8193];

if(RXBuffer == 0)
{
    cout << "  " << "Fatal memory allocation error when creating network receive buffer." << endl;
    return -1;
}
else
{
    // do initial zero'ing out of receive buffer
    memset(RXBuffer, '\0', sizeof(char) * 8193);
}

// recv / response loop
while(!APP_TERMINATE)
{
    int bytes_received = TCPClient.Receive(RXBuffer, 8192);

    if(bytes_received < 0)
    {
        cout << "  " << TCPClient.LastNetworkError() << endl;
        return -1;
    }
    else
    {
        if(bytes_received == 0)
        {
            Sleep(1);
        }
        else
        {
            // 0 terminate the buffer so we can turn it into a string
            RXBuffer[bytes_received] = '\0';

            // get messages
            vector<irc_message> IRCMessages = GetIRCMessages(string(RXBuffer));

            if(IRCMessages.size() > 0)
            {
                cout << endl << (long unsigned int)IRCMessages.size() << " msgs" << endl;

                for(vector<irc_message>::iterator msg_iter = IRCMessages.begin(); msg_iter != IRCMessages.end(); msg_iter++)
                {
                    cout << "MESSAGE" << endl;
                    cout << "FROM  : " << msg_iter->GetPrefix() << endl;
                    cout << "CMD   : " << msg_iter->GetCommand() << endl;
                    cout << "PARAMS: " << msg_iter->GetParamsAsString() << endl;
                    cout << endl;
                }
            }

            // process irc_messages here
        }
    }
}

if(RXBuffer != 0)
    delete [] RXBuffer;
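The code above splits on CRLF and on LF, so a message terminated by a lone CR (13) would stay glued to the next one. As a rough sketch of one way to cover all three cases mentioned earlier, the function below treats every CR or LF as a terminator and drops the empty lines that result. The name extract_lines and the caller-owned pending string are hypothetical, not part of the code above; pending plays the role of hold_buffer. Dropping empty lines should be safe for IRC, since RFC 1459 says empty messages are silently ignored.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends `chunk` to `pending`, then removes and
// returns every complete line. A line ends at any CR or LF, so CRLF,
// lone LF and lone CR are all accepted. Empty lines (for example the gap
// between the CR and the LF of a CRLF pair) are dropped. Whatever follows
// the last terminator stays in `pending` until more data arrives.
std::vector<std::string> extract_lines(std::string& pending, const std::string& chunk)
{
    pending += chunk;

    std::vector<std::string> lines;
    std::string::size_type start = 0;

    for (;;) {
        std::string::size_type end = pending.find_first_of("\r\n", start);
        if (end == std::string::npos)
            break;                      // no terminator left: keep the tail
        if (end > start)
            lines.push_back(pending.substr(start, end - start));
        start = end + 1;                // skip the terminator itself
    }

    pending.erase(0, start);            // discard everything already consumed
    return lines;
}
```

Because a CR already ends a line on its own, a CRLF pair that happens to be split across two recv() calls needs no special handling: the LF arriving at the start of the next chunk just produces an empty line, which is skipped.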
Cool, thanks for the info. I'll take a look at the code in a little bit; I've got a Discreet Transformations exam in an hour.
Hey Anonymous poster!

Thanks so much, your code really helped me figure out how to extract the messages.

My client is working great.


But I just have one more question.

Does anyone know of any good thread classes that are easy to use, one that my IRC class can just inherit from? The one that I'm using right now isn't very good.

Thanks
Gavinl: yeah, be glad you're taking Discreet Transformations. They're so much better than those Conspicuous Transformations.

:-)

This topic is closed to new replies.
