Client sends data, server never receives it, no error. [SOLVED]

9 comments, last by Fil 19 years, 2 months ago
In short, I'd like to know what can be done to be sure data has been acknowledged. I've been debugging my thesis program for more than 3 months, but I still can't understand why the sockets don't work. I have a client/server program that performs a huge elaboration. The server has a thread (derived from CWinThread) for each client. I'm using classes derived from MFC CSocket to receive data. The server sends the input for a piece of the elaboration to a client, which sends the output back to the server. At the end the server collects the solutions of all the subproblems and works out the solution of the main problem.

enum {OP_UNKNOWN, OP_SUM, OP_MUL};

Server calls:

int op=OP_SUM;
BOOL bOK=(pClientSocket->Send(&op,sizeof(op))!=SOCKET_ERROR);
ASSERT(bOK);
bOK=(pClientSocket->Send(&x,sizeof(x))!=SOCKET_ERROR);
ASSERT(bOK);
bOK=(pClientSocket->Send(&y,sizeof(y))!=SOCKET_ERROR);
ASSERT(bOK);





The client receives it this way (OnReceive method of the CSocket-derived class):

int op,x,y,z;

// receive input
BOOL bOK=(Receive(&op,sizeof(op))!=SOCKET_ERROR);
ASSERT(bOK);
bOK=(Receive(&x,sizeof(x))!=SOCKET_ERROR);
ASSERT(bOK);
bOK=(Receive(&y,sizeof(y))!=SOCKET_ERROR);
ASSERT(bOK);

// perform elaboration (actually it is more complex ;))
switch (op)
{
    case OP_SUM: z=x+y; break;
    case OP_MUL: z=x*y; break;
    default: ASSERT(0); break;
}

// return output to server
int code=MSG_RESULT;
bOK=(pServerSocket->Send(&code,sizeof(code))!=SOCKET_ERROR);
ASSERT(bOK);
bOK=(pServerSocket->Send(&z,sizeof(z))!=SOCKET_ERROR);
ASSERT(bOK);





The server receives the output in its OnReceive method:

int code;
BOOL bOK=(Receive(&code,sizeof(code))!=SOCKET_ERROR);
ASSERT(bOK);
switch (code)
{
    case MSG_RESULT: 
        bOK=(Receive(&z,sizeof(z))!=SOCKET_ERROR);
        ASSERT(bOK);
    break;

    /* ...other cases... */

    default: ASSERT(0); break;
}




Code similar to what I've shown you (my thesis is of course more complex ;)) is executed hundreds of times without errors, but at times the server or a client goes idle. I added a log file, and I see that the client sends the output to the server and at a certain point the server never receives it or, in other runs, the server sends the input for another piece of elaboration to the client but the client never receives it. I've heard that send()/receive() return no error as soon as the data is successfully copied to or from the socket's internal buffer, not when the data actually arrives at the other end. Can anyone suggest how to solve my problem, please? Thank you very much in advance! [Edited by - Fil on February 6, 2005 7:42:55 AM]
Fil (il genio)
Are you using TCP or UDP? If you're using TCP, then the data will eventually get to the other end -- it's guaranteed by the protocol. If you're using UDP, then packets may get lost and not re-transmitted. Are your sockets blocking or non-blocking? You want blocking sockets, because otherwise you could get a few bytes (for the operation, say), without yet having received the rest of the data. I suggest making sure that you are using TCP, and blocking sockets, before looking at other causes.
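
One thing worth checking in either mode: TCP is a byte stream, so a single receive call can return fewer bytes than were asked for, and a return value of 0 usually means the peer closed the connection. Below is a minimal sketch of a "read exactly n bytes" loop. `ReceiveAll` and `recvFn` are hypothetical names, not part of MFC or WinSock; `recvFn` stands in for whatever call actually reads from the socket.

```cpp
#include <cassert>
#include <cstddef>

// Reads exactly `len` bytes into `buf` by calling `recvFn` repeatedly.
// `recvFn(char* dst, int max)` is a hypothetical stand-in for a call such as
// CSocket::Receive or recv(): it returns the number of bytes read,
// 0 if the peer closed the connection, or a negative value on error.
// Returns true only if all `len` bytes arrived.
template <typename RecvFn>
bool ReceiveAll(RecvFn recvFn, void* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    char* p = static_cast<char*>(buf);
    std::size_t got = 0;
    while (got < len)
    {
        int n = recvFn(p + got, static_cast<int>(len - got));
        if (n <= 0)
            return false;   // closed (0) or error (<0): the caller decides what to do
        got += static_cast<std::size_t>(n);
    }
    return true;
}
```

With something like this, each `Receive(&op,sizeof(op))` in the posted code would become one `ReceiveAll` call, and a short read would show up as a loop iteration instead of a silently half-filled variable.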

Assuming you're using TCP, with blocking sockets, there could of course be other problems, too, such as race conditions or deadlocks in your design. Proving that a distributed system is both deadlock free and free of race conditions is quite hard (google for "temporal logic" and the protocol proof framework "spin" from Bell Labs, for example). As a first step: if you use the code as you have shown us here, does it still go idle?

PS: When you say "elaboration," I think you really mean "computation" ;-)
enum Bool { True, False, FileNotFound };
I use CSocket with TCP; only the receive is blocking.
I always start the server and one of the clients inside the VC++ 6.0 debugger, so when at least one client goes idle, I break the execution of the server; all its threads are in PumpMessage and (I hope) there is no deadlock (no thread is blocked in the CSingleLock::Lock() method, for example; I thought about using a Petri net to be (quite) sure there is no deadlock, but I don't think it's very feasible). I recently tried with only one client (on the same PC) and after 38 minutes the processing stopped: the client never received the 156th input! I started all the processing again and now the client is still working (3+ hours and 1395 work items processed). It's very strange: this kind of behaviour suggests to me that it has something to do with thread synchronization... I don't know what to do! [depressed]

I'm going to follow your suggestion: I will try with the code I posted previously, then I'll post the results. With an easier problem I'll have a better chance of solving the one in my thesis.

Thanks for your help so far [smile].

PS- Yes, I mean computation. [grin]
Fil (il genio)
You can deadlock the protocol without deadlocking the machine. If the protocol is run like a state machine, and both ends go into a state where they wait for the other end to send something, then the protocol is deadlocked, even if the threads are sitting waiting for messages.
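
As a rough illustration of that condition (a sketch only, with made-up state names that are not taken from the program under discussion): if each end records which protocol state it is in, the stuck case is "both ends waiting, nothing in flight".

```cpp
#include <cassert>

// Hypothetical per-endpoint protocol states, for illustration only.
enum class PeerState { Sending, Computing, WaitingForPeer };

// The protocol is stuck when both ends are waiting to receive and
// no bytes are still in flight between them. No thread needs to be
// blocked on a lock for this to happen.
inline bool IsProtocolDeadlocked(PeerState client, PeerState server, int bytesInFlight)
{
    assert(bytesInFlight >= 0);
    return client == PeerState::WaitingForPeer
        && server == PeerState::WaitingForPeer
        && bytesInFlight == 0;
}
```

In practice neither end can evaluate this directly, because each one sees only its own state; that is why comparing the logs from both sides afterwards is the usual way to spot it.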

One way to debug this is to create detailed log files of what you receive, when, and what you send, when, on each client and server; then analyze the 5-10 transactions before it falls quiet to see what's going on.
enum Bool { True, False, FileNotFound };
In the last weeks I've tried to enhance the log info and I studied all the results, as hplus0603 suggested. The only thing missing from the log file is the lock/unlock of critical sections. According to the log files, the client that goes idle says it has successfully sent the output to the server (and its log file ends there). The server says it has received it, but also says it has sent another input to that client without errors.

Now I don't know what else to do.
I've tried attaching VC++ to the process of the client that went idle. All its threads are in the PumpMessage function (so there is no thread deadlock).
As for deadlocking the protocol, the answer is no again: I do this for each send/receive:

- write to log "I'm starting a send" (with "this", socket, buffer pointer and buffer length)
- do the send (writing to log how many bytes I want to send and how many have actually been sent (they're always the same, with no errors))
- write to log "I've finished the send" (same data as before)
(the same for receive).

If the protocol were deadlocked, I would see an "I'm starting a send/receive" message without the matching "I've finished the send/receive" message in the log file. And that doesn't happen.
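
The start/finish bracketing described above could be sketched as a small scope guard, so the "finished" line cannot be forgotten on any return path. This is an illustration only: `ScopedOpLog` and the message wording are hypothetical, and any `std::ostream` serves as the log.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Writes a "starting" line on construction and a "finished" line on
// destruction, so an operation that never returns leaves an unmatched
// "starting" entry in the log. Class and message names are hypothetical.
class ScopedOpLog
{
public:
    ScopedOpLog(std::ostream& log, const std::string& op, const void* sock, int len)
        : m_log(log), m_op(op), m_sock(sock), m_len(len), m_result(-1)
    {
        m_log << "starting " << m_op << " sock=" << m_sock
              << " len=" << m_len << '\n';
        m_log.flush();   // flush so the line survives a hang or crash
    }

    // Record how many bytes the send/receive call actually reported.
    void SetResult(int bytes) { m_result = bytes; }

    ~ScopedOpLog()
    {
        m_log << "finished " << m_op << " sock=" << m_sock
              << " requested=" << m_len << " actual=" << m_result << '\n';
        m_log.flush();
    }

private:
    std::ostream& m_log;
    std::string   m_op;
    const void*   m_sock;
    int           m_len;
    int           m_result;   // stays -1 if SetResult was never called
};
```

A call site would create one `ScopedOpLog` just before the send or receive and pass the call's return value to `SetResult`. Note that this only shows whether a call returned; it says nothing about a notification such as OnReceive that is never delivered in the first place.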

It seems the error is in the client, since the server always keeps sending further inputs to the clients... what can I do? It seems the socket is unable to deliver the "Server has sent you something" notification to its thread.

Please help me with whatever advice you have: I *must* finish my thesis before March! [depressed]

Thank you very much for your replies so far!
Fil (il genio)
Always, always, always when debugging protocol problems:

Use Ethereal!

(free, open-source, packet sniffer)

It kicks ass, and is good at automatically reconstructing protocols. i.e. you click on a packet, and it shows you the full, colourised, text of the exchange that that packet was part of. (assuming it's a known protocol, or you've added a filter for that protocol).

One of the most essential parts of the network developer's toolkit...

redmilamber
Studying the streams on the wire may in the end lead to understanding what the bug is (writing to the wrong socket? writing incomplete packets?). I second the recommendation for using Ethereal.

Your logging probably isn't detailed enough, though. You should call getpeername() on the socket each time you send data to it, and log that address information. You should log the return values of the send and receive functions. You should log the actual data sent and received by each call. Look for things like re-using file descriptors, memory overwrites, half-formed packets, if() branches not taken, ...
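
For logging the actual data, a hex dump of each buffer is usually enough to compare against what the sniffer shows. A minimal sketch follows; `HexDump` is a hypothetical helper, not an MFC or WinSock function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Formats a buffer as space-separated hex bytes, e.g. "01 ab 00",
// so the exact payload of each send/receive can be written to the log.
inline std::string HexDump(const void* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    static const char digits[] = "0123456789abcdef";
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::string out;
    out.reserve(len * 3);
    for (std::size_t i = 0; i < len; ++i)
    {
        if (i != 0)
            out += ' ';
        out += digits[p[i] >> 4];     // high nibble
        out += digits[p[i] & 0x0F];   // low nibble
    }
    return out;
}
```

Logging `HexDump(&op, sizeof(op))` next to the return value of each call makes a short or shifted read visible as soon as the bytes stop lining up with the expected message codes.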
enum Bool { True, False, FileNotFound };
I used Ethereal... but I was not too lucky :( All the packets the server and client exchange are OK. It seems the client simply stops working, but Ethereal says all the packets arrive at the client (I send a NOP message once a minute, and once the client goes idle it never writes in the log that it has received them).

So I'm pretty sure there's something wrong in the client... even if its threads are not deadlocked. I know my log may not be detailed enough; it now contains the following information:

- if a thread is waiting for an event
- if an event/semaphore is set/released
- how many bytes are sent/received from which socket (adding getpeername() result, as hplus0603 suggested), and what is returned by each function (Ethereal tells me what is sent/received each time)

Each time I write something to the log I add the time at which it happens (I synchronized my computers with an external server, but anyway Ethereal gives me the right times).

Quote:
Look for things like re-using file descriptors, memory overwrites, half-formed packets, if() branches not taken, ...


Memory overwrites are checked by a memory manager I added at the beginning... I think the check is done when memory is freed, so maybe this can be the problem (since it could be as random as the moment when the client goes idle).
I'd exclude half-formed packets, since Ethereal assures me that all packets are OK (sent/acknowledged/checksum OK, etc).
I will profile my client program to look for if() branches not taken, and I will check (using a tool I wrote some time ago) whether I re-use file descriptors.

I think memory overwrites could be the most likely problem: the client always goes idle for no apparent reason. Maybe the memory block that holds the socket is overwritten by something; in that case, the server can send data to the client, but the client is no longer able to receive it. Don't you agree? Unfortunately I cannot check things like the socket window handle, because they are private members of the CSocket/CAsyncSocket classes; I will try calling the CObject::AssertValid() method each time I use the socket (and each time I use any object of a CObject-derived class).

I thank you very much for your help and suggestions [smile]. Reply if you have other ideas... meanwhile I'll keep on debugging [depressed]
Again, thank you very much!
Fil (il genio)
I've done everything I can to debug my thesis program, but the bug is still there. I used Ethereal, log files, a memory manager, etc. with no success. I thought of adding some TRACE/log calls inside the CSocket code, so I copied the header and implementation files into a new class, CMFCSocketObject, that is identical to the original CSocket but with TRACE/log added. While I was searching Google for WM_SOCKET_NOTIFY to find out which header file I was missing, I found some bad news at this link.

So the problem is not in my program but in Microsoft's code.[evil] That's the reason why I couldn't find it (even with your precious suggestions)!!

Now I'm going to replace the MFC CSocket class with something that uses WinSock directly and, since I've lost too much time, I'm searching for something already available on the internet, but I cannot find a class that does what it is supposed to.
I found many classes on codeproject, codeguru, flipcode and, of course, gamedev. I read all the results Google gave me, but I always find classes that are buggy, or non-blocking, or derived from CAsyncSocket (inheriting its bugs too), and sometimes classes with so many low-level functions that they take more time than using WinSock directly, with no guarantee of correctness. Have you ever found a better (working) alternative?

I'm counting on your suggestions; thanks for your patience, but I don't know what to do. I've wasted 4 months, and I need some quick magic to avoid the MFC CAsyncSocket class (and to finish my thesis by March[help]).

Thank you very much in advance.
Fil (il genio)
It's not THAT hard to just re-implement the parts of the CSocket API that you need, on top of WinSock 2.2. Or, maybe better, just rip out the CSocket part and implement your own wrapper. That's the best advice I can give you at this point, sorry.
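
One piece such a wrapper might include, offered as a sketch and not as a description of anyone's actual code, is explicit message framing: pack the message code and payload length into a fixed header so the receiver always knows how many bytes to wait for. `MsgHeader`, `PackMessage` and `UnpackHeader` are hypothetical names, and host byte order is assumed on both ends.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size header: message code followed by payload length.
// Host byte order is used, which assumes both ends share the same endianness.
struct MsgHeader
{
    std::int32_t code;
    std::int32_t length;
};

// Packs header and payload into one contiguous buffer so that a whole
// message can be handed to the send path in one piece.
inline std::vector<char> PackMessage(std::int32_t code, const void* payload,
                                     std::int32_t length)
{
    assert(length >= 0 && (payload != nullptr || length == 0));
    MsgHeader h = { code, length };
    std::vector<char> buf(sizeof(h) + static_cast<std::size_t>(length));
    std::memcpy(buf.data(), &h, sizeof(h));
    if (length > 0)
        std::memcpy(buf.data() + sizeof(h), payload,
                    static_cast<std::size_t>(length));
    return buf;
}

// Reads the header back from the front of a received buffer.
inline MsgHeader UnpackHeader(const std::vector<char>& buf)
{
    assert(buf.size() >= sizeof(MsgHeader));
    MsgHeader h;
    std::memcpy(&h, buf.data(), sizeof(h));
    return h;
}
```

The receiving side would first read exactly `sizeof(MsgHeader)` bytes, then exactly `length` more, looping on WinSock's `recv()` until each count is satisfied.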
enum Bool { True, False, FileNotFound };

This topic is closed to new replies.
