Graceful and Ungraceful Socket Disconnects


Started by PyroBoy February 09, 2002 12:27 AM

19 comments, last by PyroBoy 22 years, 1 month ago

122

Author

February 09, 2002 12:27 AM

I'm having a problem recognising graceful and ungraceful disconnects on the client side. It's a bit involved, so bear with me...

What I'm mainly dealing with is the return value of recv() and its relationship to socket disconnection. For recv(), the docs say that a positive return value means that number of bytes of data came through. A return value of 0 indicates the socket was disconnected gracefully on the other end, and a negative return value indicates an error (as would happen in an ungraceful disconnect).

When one side wants to disconnect, it calls closesocket() on the connected socket, and the other end's recv() (in theory) returns 0, so the disconnected app can be notified that the other side has closed the connection gracefully. When the connection goes down for any other reason, recv() (again, in theory) returns a negative value on both sides of the connection and both sides are notified of a crash-n-burn style disconnect.

Now, this setup works great on the server side: when the client closes its socket, recv() returns 0 on the server side, and when the client is closed down hard (ctrl-alt-del it out of existence without calling closesocket()), recv() returns a negative value indicating the client went down unexpectedly.

But the exact same process (run by the exact same code - it's encapsulated in a connection class) initiated from the *server* side malfunctions. When the server closes its sockets gracefully, the recv() call on the client side returns -1, indicating an error (ungraceful disconnect). When the server is shut down ungracefully, recv() still returns -1 on the client side.

Why would graceful disconnects via the 0 return value of recv() work one way and not the other? Does the meaning of the return value of recv() depend on whether it's the client or server side calling it? The docs don't seem to think so... Should I be doing more to initiate a disconnection than simply calling closesocket()?

PyroBoy

122

Author

February 11, 2002 05:22 PM

Anyone even know what I'm talking about here? :-)

JonStelly

127

February 11, 2002 05:39 PM

Actually, just calling closesocket() is _NOT_ a graceful disconnect. You need to call shutdown() with SD_SEND or SD_BOTH to initiate the session teardown.

Sequence should be something like:

  bool socket::closeconnection()
  {
      if (shutdown(m_hSocket, SD_BOTH)) {
          //ERROR
          return false;
      }
      if (closesocket(m_hSocket)) {
          //ERROR
          return false;
      }
      return true;
  }

PyroBoy

122

Author

February 11, 2002 08:42 PM

I added a shutdown to my sockets, and that didn't solve anything.

Weird thing... The compiler finds the shutdown() function just fine, but doesn't recognise SD_SEND, SD_RECEIVE or SD_BOTH.
I found them in winsock2.h, so I tried calling the function with the 0x01(SD_SEND) and 0x02(SD_BOTH) values. Neither produced a graceful disconnect initiated from the server side, but both still worked fine initiated from the client side.

This is really starting to puzzle me...

JonStelly

127

February 11, 2002 11:01 PM

Not sure what it could be then. Shutdown behavior should be the same on both sides of the connection. You're going to have to start posting code now.

PyroBoy

122

Author

February 11, 2002 11:40 PM

ok. Here ya go:

  //Set up structs for socket checking
  fd_set sockSet;
  timeval time;
  time.tv_sec = 0;
  time.tv_usec = 0;

  //Receive loop
  while (checkForData)
  {
      //Check if there's data to retrieve
      sockSet.fd_count = 1;
      sockSet.fd_array[0] = sock;
      if (select(0, &sockSet, NULL, NULL, &time) == 1)
      {
          //Clear receive buffer
          memset(recieveBuffer, 0, bufferSize);

          //Receive
          int dataRead = recv(sock, recieveBuffer, bufferSize, 0);

          //Data came in
          if (dataRead > 0)
          {
              //Do stuff with the data that came in
          }
          //Data didn't come in, meaning there's been a disconnect or error, since to
          //get here in the first place the socket select()ed as readable
          else
          {
              //Decide if the disconnect was planned
              if (dataRead == 0)
              {
                  //Graceful disconnect
              }
              else
              {
                  //Ungraceful disconnect
              }
          }
      }
  }

So as you can see, the app loops, calling select until the socket is readable, and then it calls recv. It then interprets the return value as either a data transmission, graceful disconnect or some sort of error.

Here's the shutdown code that's supposed to trigger a graceful disconnect:

  shutdown(sock, 0x02); //"0x02" resolves to SD_BOTH. The function seems content with
                        // this direct value and is not returning errors
  closesocket(sock);

Simple enough. It just doesn't work when the server initiates the disconnection. Everything works perfectly when the client decides to do the disconnection. That's the really prickly part - if this is wrong, it should fuck up in *both* directions! :-)

JonStelly

127

February 12, 2002 12:12 AM

Ok, I need to retract my earlier statement. closesocket does initiate a graceful teardown in the following situations:

1)SO_DONTLINGER is set
2)SO_LINGER is set to a nonzero value

SO_DONTLINGER is the default. So assuming you're not doing anything fancy, you should be ok (your code looked fine to me). One caveat: if there is data waiting to be sent when you call closesocket, depending on the socket's linger setting, you may get a hard close.

So what I'm guessing is that you've got outbound data queued on your server and it's causing problems somewhere. This is a document from the Platform SDK docs that comes with VS .NET (hope the formatting looks ok):

Graceful Shutdown, Linger Options, and Socket Closure
The following material is provided as clarification for the subject of shutting down socket connections and closing the sockets. It is important to distinguish the difference between shutting down a socket connection and closing a socket.

Shutting down a socket connection involves an exchange of protocol messages between the two endpoints, hereafter referred to as a shutdown sequence. Two general classes of shutdown sequences are defined: graceful and abortive (also called hard). In a graceful shutdown sequence, any data that has been queued but not yet transmitted can be sent prior to the connection being closed. In an abortive shutdown, any unsent data is lost. The occurrence of a shutdown sequence (graceful or abortive) can also be used to provide an FD_CLOSE indication to the associated applications signifying that a shutdown is in progress.

Closing a socket, on the other hand, causes the socket handle to become deallocated so that the application can no longer reference or use the socket in any manner.

In Windows Sockets, both the shutdown function, and the WSASendDisconnect function can be used to initiate a shutdown sequence, while the closesocket function is used to deallocate socket handles and free up any associated resources. Some amount of confusion arises, however, from the fact that the closesocket function implicitly causes a shutdown sequence to occur if it has not already happened. In fact, it has become a rather common programming practice to rely on this feature and to use closesocket to both initiate the shutdown sequence and deallocate the socket handle.

To facilitate this usage, the sockets interface provides for controls by way of the socket option mechanism that allow the programmer to indicate whether the implicit shutdown sequence should be graceful or abortive, and also whether the closesocket function should linger (that is not complete immediately) to allow time for a graceful shutdown sequence to complete. These important distinctions and the ramifications of using closesocket in this manner are still not widely understood.

By establishing appropriate values for the socket options SO_LINGER and SO_DONTLINGER, the following types of behavior can be obtained with the closesocket function:

Abortive shutdown sequence, immediate return from closesocket.
Graceful shutdown, delaying return until either shutdown sequence completes or a specified time interval elapses. If the time interval expires before the graceful shutdown sequence completes, an abortive shutdown sequence occurs, and closesocket returns.
Graceful shutdown, immediate return—allowing the shutdown sequence to complete in the background. Although this is the default behavior, the application has no way of knowing when (or whether) the graceful shutdown sequence actually completes.
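As a rough illustration of the three cases listed above, the sketch below maps the two LINGER fields to the closesocket behavior the docs describe. The enum and the function classify_linger are hypothetical names made up for this example; only l_onoff and l_linger correspond to fields of the real LINGER struct, and the code never touches an actual socket.

```c
/* Hypothetical names, for illustration only. */
typedef enum {
    CLOSE_GRACEFUL_BACKGROUND, /* graceful, closesocket returns at once (the default) */
    CLOSE_ABORTIVE,            /* hard close, unsent data is discarded */
    CLOSE_GRACEFUL_BLOCKING    /* graceful, closesocket waits up to l_linger seconds */
} close_behavior;

/* Mirrors the three behaviors listed in the docs quoted above.
 * l_onoff == 0 corresponds to SO_DONTLINGER; l_linger is then ignored. */
close_behavior classify_linger(unsigned short l_onoff, unsigned short l_linger)
{
    if (l_onoff == 0) {
        return CLOSE_GRACEFUL_BACKGROUND;
    }
    if (l_linger == 0) {
        return CLOSE_ABORTIVE;
    }
    return CLOSE_GRACEFUL_BLOCKING;
}
```

For example, classify_linger(1, 5) yields CLOSE_GRACEFUL_BLOCKING, while classify_linger(1, 0) yields CLOSE_ABORTIVE.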
One technique that can be used to minimize the chance of problems occurring during connection teardown is to avoid relying on an implicit shutdown being initiated by closesocket. Instead, use one of the two explicit shutdown functions, shutdown or WSASendDisconnect. This in turn causes an FD_CLOSE indication to be received by the peer application indicating that all pending data has been received. To illustrate this, the following table shows the functions that would be invoked by the client and server components of an application, where the client is responsible for initiating a graceful shutdown.
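
A minimal sketch of the explicit approach described above, assuming a blocking stream socket: stop sending with shutdown, keep reading until recv returns 0 (the peer has finished too), then close the handle. graceful_close, sock_t, SOCK_SHUT_SEND and sock_close are hypothetical names introduced here, and the #ifdef only exists so the same sketch builds against either Winsock or BSD sockets. It is not the table from the SDK docs, and a real version would probably want a timeout on the drain loop so a silent peer cannot block it forever.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#define SOCK_SHUT_SEND SD_SEND
#define sock_close closesocket
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int sock_t;
#define SOCK_SHUT_SEND SHUT_WR
#define sock_close close
#endif

/* Hypothetical helper. Returns 0 if the peer also closed cleanly, -1 otherwise.
 * On Windows the caller is assumed to have called WSAStartup already. */
int graceful_close(sock_t s)
{
    char scratch[512];
    int n;
    int result = 0;

    /* 1. Tell the peer we are done sending; already queued data still goes out. */
    if (shutdown(s, SOCK_SHUT_SEND) != 0) {
        sock_close(s);
        return -1;
    }

    /* 2. Drain incoming data until recv() returns 0 (peer finished gracefully)
     *    or a negative value (error or hard close). */
    do {
        n = (int)recv(s, scratch, (int)sizeof(scratch), 0);
    } while (n > 0);
    if (n < 0) {
        result = -1;
    }

    /* 3. Only now release the socket handle. */
    if (sock_close(s) != 0) {
        result = -1;
    }
    return result;
}
```

The point of the ordering is that the handle is released only after both directions have reported end-of-stream, so the close itself has nothing left to tear down.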

PyroBoy

122

Author

February 12, 2002 04:47 PM

I tried altering the program so that there isn't any data transfer near the disconnection, to eliminate the possibility that queued data is what's screwing things up. It worked fine. So it's definitely something to do with there being data left to send on the socket. That at least explains why it worked one way and not the other.

The docs on closesocket state:

"If SO_LINGER is set with a nonzero time-out interval on a blocking socket, the closesocket call blocks on a blocking socket until the remaining data has been sent or until the time-out expires. This is called a graceful disconnect."

Alright, so sockets are created as blocking by default (again, according to the docs), and I don't set them to nonblocking mode at any point. I also set the linger option like so:

    LINGER ling;
    ling.l_onoff = 1;     //Before you say it, I also tried these 2 values inside a htonl(). No change in behavior.
    ling.l_linger = 5;
    setsockopt(sock, SOL_SOCKET, SO_LINGER, (const char *)&ling, sizeof(LINGER));

setsockopt does not return an error.

So according to the documentation, my blocking, 5-second-definitely-nonzero-lingering socket ought to be gracefully shut down by closesocket, and closesocket should block until either all the data is sent or the 5 seconds elapses. If the 5 secs elapses before it can send all the data, then it should shut down hard. Not before.

But it doesn't. closesocket returns way before 5 seconds is up, and a hard disconnect is received on the other side.

AAAARRRGGHHHH!!!! :-)

Edited by - PyroBoy on February 12, 2002 5:48:46 PM

JonStelly

127

February 12, 2002 05:58 PM

What is closesocket's return value when called on the server? Check that and check WSAGetLastError(). It might give a little more information about what's going on. If the error is WSAEWOULDBLOCK, you've put your socket into nonblocking mode somewhere.

You could try looping a few times on closesocket() if it's failing, maybe sleeping for 1 second between calls, but that's a really ugly fix for something that should work.

PyroBoy

122

Author

February 12, 2002 06:22 PM

closesocket returns 0 when called from the server side. According to that, the disconnect should have happened smoothly. But I'm still detecting a hard close on the other side.
