Overlapped I/O

This topic is 4599 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I've managed to set up a client/server in Windows with Winsock using asynchronous sockets in the past, and it worked fine. However, I was looking through the Winsock FAQ, and I see that for thousands of connections I should use "Overlapped I/O". Though I don't think it's something I need to worry about for quite a while, I can't figure out what this nebulous term means.

I copied this from a Network Programming for Microsoft Windows ebook I have; I hope it is of use. For high-performance servers, though, you should consider the IOCP (I/O completion ports) model instead.

Note: You might find the text in this post hard to read, so I suggest you copy it and paste it into a word processor.


The Overlapped Model
The overlapped I/O model in Winsock offers applications better system performance than any of the I/O models explained so far. The basic design of the overlapped model allows your application to post one or more Winsock I/O requests at a time using an overlapped data structure. At a later point, the application can service the submitted requests after they have completed. This model is available on all Windows platforms except Windows CE. The overall design of the model is based on the Win32 overlapped I/O mechanisms available for performing I/O operations on devices using the ReadFile and WriteFile functions.

Originally, the Winsock overlapped I/O model was available only to Winsock 1.1 applications running on Windows NT. Applications could take advantage of the model by calling ReadFile and WriteFile on a socket handle and specifying an overlapped structure that we will describe later. Since the release of Winsock 2, overlapped I/O has been incorporated into new Winsock functions, such as WSASend and WSARecv. As a result, the overlapped I/O model is now available on all Windows platforms that feature Winsock 2.


NOTE
--------------------------------------------------------------------------------
With the release of Winsock 2, overlapped I/O can still be used with the functions ReadFile and WriteFile under Windows NT and Windows 2000. However, this functionality was not added to Windows 95 and Windows 98. For compatibility across platforms and for performance reasons, you should always consider using the WSARecv and WSASend functions instead of the Win32 ReadFile and WriteFile functions. This section will only describe how to use overlapped I/O through the new Winsock 2 functions.

To use the overlapped I/O model on a socket, you must first create a socket by using the flag WSA_FLAG_OVERLAPPED, as follows:

s = WSASocket(AF_INET, SOCK_STREAM, 0, NULL, 0,
    WSA_FLAG_OVERLAPPED);




If you create a socket using the socket function instead of the WSASocket function, WSA_FLAG_OVERLAPPED is implied. After you successfully create a socket and bind it to a local interface, overlapped I/O operations can commence by calling the Winsock functions listed below and specifying an optional WSAOVERLAPPED structure.


WSASend


WSASendTo


WSARecv


WSARecvFrom


WSAIoctl


AcceptEx


TransmitFile

As you probably already know, each one of these functions is associated with sending data, receiving data, or accepting connections on a socket, and that activity can potentially take a long time to complete. This is why each function can accept a WSAOVERLAPPED structure as a parameter. When these functions are called with a WSAOVERLAPPED structure, they return immediately, regardless of whether the socket is set to blocking mode (described at the beginning of this chapter). They rely on the WSAOVERLAPPED structure to manage the return of an I/O request. There are essentially two methods for managing the completion of an overlapped I/O request: your application can wait for event object notification, or it can process completed requests through completion routines. The functions listed above (except AcceptEx and TransmitFile) have another parameter in common: lpCompletionRoutine. This parameter is an optional pointer to a completion routine function that gets called when an overlapped request completes. We will explore the event notification method next. Later in this chapter, you will learn how to use optional completion routines instead of events to process completed overlapped requests.

Event notification
The event notification method of overlapped I/O requires associating Win32 event objects with WSAOVERLAPPED structures. When I/O calls such as WSASend and WSARecv are made using a WSAOVERLAPPED structure, they return immediately. Typically you will find that these I/O calls fail with the return value SOCKET_ERROR. The WSAGetLastError function reports a WSA_IO_PENDING error status. This error status simply means that the I/O operation is in progress. At some later time, your application will need to determine when an overlapped I/O request completes by waiting on the event object associated with the WSAOVERLAPPED structure. The WSAOVERLAPPED structure provides the communication medium between the initiation of an overlapped I/O request and its subsequent completion, and is defined as

typedef struct WSAOVERLAPPED
{
    DWORD    Internal;
    DWORD    InternalHigh;
    DWORD    Offset;
    DWORD    OffsetHigh;
    WSAEVENT hEvent;
} WSAOVERLAPPED, FAR * LPWSAOVERLAPPED;




The Internal, InternalHigh, Offset, and OffsetHigh fields are all used internally by the system and should not be manipulated or used directly by an application. The hEvent field, on the other hand, is a special field that allows an application to associate an event object handle with an overlapped operation. You might be wondering how to get an event object handle to assign to this field. As we described in the WSAEventSelect model, you can use the WSACreateEvent function to create an event object handle. Once an event handle is created, simply assign it to the overlapped structure's hEvent field and begin calling a Winsock function, such as WSASend or WSARecv, using the overlapped structure.

When an overlapped I/O request finally completes, your application is responsible for retrieving the overlapped results. In the event notification method, Winsock will change the event-signaling state of an event object that is associated with a WSAOVERLAPPED structure from nonsignaled to signaled when an overlapped request finally completes. Because an event object is assigned to the WSAOVERLAPPED structure, you can easily determine when an overlapped I/O call completes by calling the WSAWaitForMultipleEvents function, which we also described in the WSAEventSelect I/O model. WSAWaitForMultipleEvents waits a specified amount of time for one or more event objects to become signaled. We can't stress this point enough: remember that WSAWaitForMultipleEvents is capable of waiting on only 64 event objects at a time. Once you determine which overlapped request has completed, you need to determine the success or failure of the overlapped call by calling WSAGetOverlappedResult, which is defined as

BOOL WSAGetOverlappedResult(
    SOCKET          s,
    LPWSAOVERLAPPED lpOverlapped,
    LPDWORD         lpcbTransfer,
    BOOL            fWait,
    LPDWORD         lpdwFlags
);




The s parameter identifies the socket that was specified when the overlapped operation was started. The lpOverlapped parameter is a pointer to the WSAOVERLAPPED structure that was specified when the overlapped operation was started. The lpcbTransfer parameter is a pointer to a DWORD variable that receives the number of bytes that were actually transferred by an overlapped send or receive operation. The fWait parameter determines whether the function should wait for a pending overlapped operation to complete. If fWait is TRUE, the function does not return until the operation has been completed. If fWait is FALSE and the operation is still pending, WSAGetOverlappedResult returns FALSE with the error WSA_IO_INCOMPLETE. Since in our case we waited on a signaled event for overlapped completion, this parameter has no effect. The final parameter, lpdwFlags, is a pointer to a DWORD that will receive resulting flags if the originating overlapped call was made with the WSARecv or the WSARecvFrom function.

If the WSAGetOverlappedResult function succeeds, the return value is TRUE. This means that your overlapped operation has completed successfully and that the value pointed to by lpcbTransfer has been updated. If the return value is FALSE, one of the following statements is true:


The overlapped I/O operation is still pending (as described above).


The overlapped operation completed, but with errors.


The overlapped operation's completion status could not be determined because of errors in one or more of the parameters supplied to WSAGetOverlappedResult.

Upon failure, the value pointed to by lpcbTransfer will not be updated, and your application should call the WSAGetLastError function to determine the cause of the failure.
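The three outcomes above (plus the zero-byte case that the sample further down treats as a closed connection) can be folded into one small decision function. This is only a sketch: the enum, the function name classify_overlapped_result, and the IO_INCOMPLETE_CODE macro are hypothetical names introduced here, and the non-Windows branch uses a stand-in value so the logic can be compiled without the Winsock headers.

```c
#ifdef _WIN32
#include <winsock2.h>
#define IO_INCOMPLETE_CODE WSA_IO_INCOMPLETE
#else
/* Assumption: stand-in for WSA_IO_INCOMPLETE (ERROR_IO_INCOMPLETE, 996)
   so the logic below compiles without the Windows headers. */
#define IO_INCOMPLETE_CODE 996L
#endif

/* Hypothetical result categories for one overlapped receive. */
typedef enum {
    OV_DONE,          /* call returned TRUE and transferred data        */
    OV_PEER_CLOSED,   /* call returned TRUE with zero bytes transferred */
    OV_STILL_PENDING, /* call returned FALSE with WSA_IO_INCOMPLETE     */
    OV_FAILED         /* call returned FALSE with any other error       */
} ov_status;

/* succeeded:  return value of WSAGetOverlappedResult (nonzero = TRUE)
   last_error: value of WSAGetLastError, read right after a FALSE return
   bytes:      value written through lpcbTransfer (valid only on TRUE) */
ov_status classify_overlapped_result(int succeeded, long last_error,
                                     unsigned long bytes)
{
    if (succeeded)
        return bytes == 0 ? OV_PEER_CLOSED : OV_DONE;
    if (last_error == IO_INCOMPLETE_CODE)
        return OV_STILL_PENDING;
    return OV_FAILED;
}
```

When calling it, store the result of WSAGetOverlappedResult in a variable first and read WSAGetLastError in a separate statement afterwards; C does not define the order in which function arguments are evaluated, so nesting both calls in one argument list could read the error code too early.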

Figure 8-7 demonstrates how to structure a simple server application that is capable of managing overlapped I/O on one socket, using the event notification described above. The application outlines the following programming steps:


1. Create a socket, and begin listening for a connection on a specified port.

2. Accept an inbound connection.

3. Create a WSAOVERLAPPED structure for the accepted socket, and assign an event object handle to the structure. Also assign the event object handle to an event array to be used later by the WSAWaitForMultipleEvents function.

4. Post an asynchronous WSARecv request on the socket by specifying the WSAOVERLAPPED structure as a parameter.

NOTE
--------------------------------------------------------------------------------
This function will normally fail with SOCKET_ERROR error status WSA_IO_PENDING.

5. Call WSAWaitForMultipleEvents using the event array, and wait for the event associated with the overlapped call to become signaled.

6. After WSAWaitForMultipleEvents completes, reset the event object by using WSAResetEvent with the event array, and process the completed overlapped request.

7. Determine the return status of the overlapped call by using WSAGetOverlappedResult.

8. Post another overlapped WSARecv request on the socket.

9. Repeat steps 5-8.

This example can easily be expanded to handle more than one socket by moving the overlapped I/O processing portion of the code to a separate thread and allowing the main application thread to service additional connection requests.
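Because WSAWaitForMultipleEvents can wait on at most 64 event objects per call, expanding this design to many sockets means splitting the events across several wait calls or waiter threads. The arithmetic can be sketched as follows. The function names are hypothetical, and MAX_WAIT_EVENTS is an assumed local copy of WSA_MAXIMUM_WAIT_EVENTS so that the snippet compiles without the Windows headers.

```c
#include <stddef.h>

/* Assumption: mirrors WSA_MAXIMUM_WAIT_EVENTS (64) from winsock2.h. */
#define MAX_WAIT_EVENTS 64

/* Number of separate wait calls (or waiter threads) needed to cover
   event_count event objects, 64 per group, rounding up. */
size_t wait_groups_needed(size_t event_count)
{
    return (event_count + MAX_WAIT_EVENTS - 1) / MAX_WAIT_EVENTS;
}

/* Number of events that fall into group `group` (0-based).
   Returns 0 when the group index is past the last group. */
size_t events_in_group(size_t event_count, size_t group)
{
    size_t start = group * MAX_WAIT_EVENTS;
    size_t remaining;

    if (start >= event_count)
        return 0;
    remaining = event_count - start;
    return remaining < MAX_WAIT_EVENTS ? remaining : MAX_WAIT_EVENTS;
}
```

For 1000 pending receives with one event each, this gives 16 groups, the last of which holds 40 events. Counts like that are why the event notification method gets awkward for thousands of connections.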

Figure 8-7. Simple overlapped example using events

#include <winsock2.h>
#include <stdio.h>

#define DATA_BUFSIZE 4096   // assumed size; the excerpt does not define it

int main(void)
{
    WSABUF DataBuf;
    char Buffer[DATA_BUFSIZE];
    DWORD EventTotal = 0;
    DWORD RecvBytes = 0, BytesTransferred = 0;
    DWORD Flags = 0, Index;
    WSAEVENT EventArray[WSA_MAXIMUM_WAIT_EVENTS];
    WSAOVERLAPPED AcceptOverlapped;
    SOCKET ListenSocket, AcceptSocket;

    // Step 1:
    // Start Winsock and set up a listening socket
    ...

    // Step 2:
    // Accept an inbound connection
    AcceptSocket = accept(ListenSocket, NULL, NULL);

    // Step 3:
    // Set up an overlapped structure

    EventArray[EventTotal] = WSACreateEvent();

    ZeroMemory(&AcceptOverlapped,
        sizeof(WSAOVERLAPPED));

    AcceptOverlapped.hEvent = EventArray[EventTotal];

    DataBuf.len = DATA_BUFSIZE;
    DataBuf.buf = Buffer;

    EventTotal++;

    // Step 4:
    // Post a WSARecv request to begin receiving data
    // on the socket. WSA_IO_PENDING is the normal case
    // and is not treated as a failure.

    if (WSARecv(AcceptSocket, &DataBuf, 1, &RecvBytes,
        &Flags, &AcceptOverlapped, NULL) == SOCKET_ERROR
        && WSAGetLastError() != WSA_IO_PENDING)
    {
        printf("WSARecv() failed with error %d\n",
            WSAGetLastError());
        return 1;
    }

    // Process overlapped receives on the socket.

    while (TRUE)
    {
        // Step 5:
        // Wait for the overlapped I/O call to complete
        Index = WSAWaitForMultipleEvents(EventTotal,
            EventArray, FALSE, WSA_INFINITE, FALSE);

        if (Index == WSA_WAIT_FAILED)
            return 1;

        // Convert the return value to an array index. It
        // should be 0 because we have only one event
        // handle in EventArray
        Index -= WSA_WAIT_EVENT_0;

        // Step 6:
        // Reset the signaled event
        WSAResetEvent(EventArray[Index]);

        // Step 7:
        // Determine the status of the overlapped
        // request. Close the socket if the request
        // failed or if the peer has closed the
        // connection (zero bytes transferred).

        if (!WSAGetOverlappedResult(AcceptSocket,
            &AcceptOverlapped, &BytesTransferred,
            FALSE, &Flags) || BytesTransferred == 0)
        {
            printf("Closing socket %llu\n",
                (unsigned long long)AcceptSocket);

            closesocket(AcceptSocket);

            WSACloseEvent(EventArray[Index]);
            return 0;
        }

        // Do something with the received data.
        // DataBuf contains the received data.
        ...

        // Step 8:
        // Post another WSARecv() request on the socket

        Flags = 0;
        ZeroMemory(&AcceptOverlapped,
            sizeof(WSAOVERLAPPED));

        AcceptOverlapped.hEvent = EventArray[Index];

        DataBuf.len = DATA_BUFSIZE;
        DataBuf.buf = Buffer;

        if (WSARecv(AcceptSocket, &DataBuf, 1,
            &RecvBytes, &Flags, &AcceptOverlapped,
            NULL) == SOCKET_ERROR
            && WSAGetLastError() != WSA_IO_PENDING)
        {
            printf("WSARecv() failed with error %d\n",
                WSAGetLastError());
            return 1;
        }
    }
}




On Windows NT and Windows 2000, the overlapped I/O model also allows applications to accept connections in an overlapped fashion by calling the AcceptEx function on a listening socket. AcceptEx is a special Winsock 1.1 extension function that is available in the Mswsock.h header file and the Mswsock.lib library file. This function was originally intended to work with Win32 overlapped I/O on Windows NT and Windows 2000, but it also works with overlapped I/O in Winsock 2. AcceptEx is defined as

BOOL AcceptEx (
    SOCKET       sListenSocket,
    SOCKET       sAcceptSocket,
    PVOID        lpOutputBuffer,
    DWORD        dwReceiveDataLength,
    DWORD        dwLocalAddressLength,
    DWORD        dwRemoteAddressLength,
    LPDWORD      lpdwBytesReceived,
    LPOVERLAPPED lpOverlapped
);




The sListenSocket parameter represents a listening socket. The sAcceptSocket parameter is a socket to accept an incoming connection. The AcceptEx function is different from the accept function in that you have to supply the accepted socket instead of having the function create it for you. Supplying the socket requires you to call the socket or WSASocket function to create a socket that you can pass to AcceptEx via the sAcceptSocket parameter.

The lpOutputBuffer parameter is a special buffer because it receives three pieces of data: the local address of the server, the remote address of the client, and the first block of data sent on a new connection. The dwReceiveDataLength parameter specifies the number of bytes in lpOutputBuffer used for receiving data. If this parameter is specified as 0, no data will be received in conjunction with accepting the connection. The dwLocalAddressLength and dwRemoteAddressLength parameters represent how many bytes in lpOutputBuffer are reserved for storing local and remote address information when a socket is accepted. These buffer sizes must be at least 16 bytes more than the maximum address length for the transport protocol in use. For example, if you are using the TCP/IP protocol, the size should be set to the size of a SOCKADDR_IN structure + 16 bytes.

The lpdwBytesReceived parameter returns the number of data bytes received. This parameter is set only if the operation completes synchronously. If the AcceptEx function returns FALSE and WSAGetLastError reports ERROR_IO_PENDING, this parameter is never set and you must obtain the number of bytes read from the completion notification mechanism. The final parameter, lpOverlapped, is a pointer to an OVERLAPPED structure that allows AcceptEx to be used in an asynchronous fashion. As we mentioned earlier, in an overlapped application this function works only with event object notification because it does not feature a completion routine parameter.

A Winsock extension function named GetAcceptExSockaddrs parses out the local and remote address elements from lpOutputBuffer. GetAcceptExSockaddrs is defined as

VOID GetAcceptExSockaddrs(
    PVOID      lpOutputBuffer,
    DWORD      dwReceiveDataLength,
    DWORD      dwLocalAddressLength,
    DWORD      dwRemoteAddressLength,
    LPSOCKADDR *LocalSockaddr,
    LPINT      LocalSockaddrLength,
    LPSOCKADDR *RemoteSockaddr,
    LPINT      RemoteSockaddrLength
);




The lpOutputBuffer parameter should be set to the lpOutputBuffer returned from AcceptEx. The dwReceiveDataLength, dwLocalAddressLength, and dwRemoteAddressLength parameters should be set to the same values as the dwReceiveDataLength, dwLocalAddressLength, and dwRemoteAddressLength parameters that were passed to AcceptEx. The LocalSockaddr and RemoteSockaddr parameters, which are pointers to SOCKADDR structures with the local and remote address information, receive a pointer offset from the originating lpOutputBuffer parameter. This makes it easy to reference the elements of a SOCKADDR structure from the address information contained in lpOutputBuffer. The LocalSockaddrLength and RemoteSockaddrLength parameters receive the size of the local and remote addresses.
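The sizing rule for lpOutputBuffer described above (receive data plus two address areas, each at least 16 bytes larger than the largest address) is easy to get wrong, so here is the arithmetic as a minimal sketch. The helper names are hypothetical, and the address size is passed in as a parameter so the snippet does not depend on the Windows headers. In a real call you would pass sizeof(SOCKADDR_IN) for TCP/IP over IPv4, which is 16 bytes.

```c
#include <stddef.h>

/* Extra bytes AcceptEx requires on top of the largest address size. */
#define ACCEPTEX_ADDR_PADDING 16

/* Value to pass as dwLocalAddressLength and dwRemoteAddressLength
   for an address structure of sockaddr_size bytes. */
size_t acceptex_addr_len(size_t sockaddr_size)
{
    return sockaddr_size + ACCEPTEX_ADDR_PADDING;
}

/* Minimum size of lpOutputBuffer: the receive area
   (dwReceiveDataLength, may be 0) plus one address area each
   for the local and the remote address. */
size_t acceptex_buffer_size(size_t recv_len, size_t sockaddr_size)
{
    return recv_len + 2 * acceptex_addr_len(sockaddr_size);
}
```

With a 16-byte address structure, each address length comes out as 32, and a buffer that also receives up to 1024 bytes of initial data needs at least 1088 bytes. The same three lengths then have to be passed unchanged to GetAcceptExSockaddrs.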

Completion routines
Completion routines are the other method your application can use to manage completed overlapped I/O requests. Completion routines are simply functions that you optionally pass to an overlapped I/O request and that the system invokes when an overlapped I/O request completes. Their primary role is to service a completed I/O request using the caller's thread. Additionally, applications can continue overlapped I/O processing through the completion routine.

To use completion routines for overlapped I/O requests, your application must specify a completion routine, along with a WSAOVERLAPPED structure, to an I/O bound Winsock function (described earlier). A completion routine must have the following function prototype:

void CALLBACK CompletionRoutine(
    DWORD           dwError,
    DWORD           cbTransferred,
    LPWSAOVERLAPPED lpOverlapped,
    DWORD           dwFlags
);




When an overlapped I/O request completes using a completion routine, the parameters contain the following information:


The parameter dwError specifies the completion status for the overlapped operation as indicated by lpOverlapped.


The cbTransferred parameter specifies the number of bytes that were transferred during the overlapped operation.


The lpOverlapped parameter is the WSAOVERLAPPED structure passed into the originating I/O call.


The dwFlags parameter is not used and will be set to 0.

There is a major difference between overlapped requests submitted with a completion routine and overlapped requests submitted with an event object. The WSAOVERLAPPED structure's event field, hEvent, is not used, which means you cannot associate an event object with the overlapped request. Once you make an overlapped I/O call with a completion routine, your calling thread must eventually service the completion routine once it has completed. This requires you to place your calling thread in an alertable wait state and process the completion routine later, after the I/O operation has completed. The WSAWaitForMultipleEvents function can be used to put your thread in an alertable wait state. The catch is that you must also have at least one event object available for the WSAWaitForMultipleEvents function. If your application handles only overlapped requests with completion routines, you are not likely to have any event objects around for processing. As an alternative, your application can use the Win32 SleepEx function to set your thread in an alertable wait state. Of course, you can also create a dummy event object that is not associated with anything. If your calling thread is always busy and not in an alertable wait state, no posted completion routine will ever get called.

As you saw earlier, WSAWaitForMultipleEvents normally waits for event objects associated with WSAOVERLAPPED structures. This function is also designed to place your thread in an alertable wait state and to process completion routines for completed overlapped I/O requests if you set the parameter fAlertable to TRUE. When overlapped I/O requests complete with a completion routine, the return value is WSA_WAIT_IO_COMPLETION instead of an event object index in the event array. The SleepEx function provides the same behavior as WSAWaitForMultipleEvents except that it does not need any event objects. The SleepEx function is defined as

DWORD SleepEx(
    DWORD dwMilliseconds,
    BOOL  bAlertable
);




The dwMilliseconds parameter defines how long in milliseconds SleepEx will wait. If dwMilliseconds is set to INFINITE, SleepEx waits indefinitely. The bAlertable parameter determines how a completion routine will execute. If bAlertable is set to FALSE and an I/O completion callback occurs, the I/O completion function is not executed and the function does not return until the wait period specified in dwMilliseconds has elapsed. If it is set to TRUE, the completion routine executes and the SleepEx function returns WAIT_IO_COMPLETION.

Figure 8-8 outlines how to structure a simple server application that is capable of managing one socket request using completion routines as described above. The application illustrates the following programming steps:


1. Create a socket and begin listening for a connection on a specified port.

2. Accept an inbound connection.

3. Create a WSAOVERLAPPED structure for the accepted socket.

4. Post an asynchronous WSARecv request on the socket by specifying the WSAOVERLAPPED structure as a parameter and supplying a completion routine.

5. Call WSAWaitForMultipleEvents with the fAlertable parameter set to TRUE, and wait for an overlapped request to complete. When an overlapped request completes, the completion routine automatically executes and WSAWaitForMultipleEvents returns WSA_WAIT_IO_COMPLETION. Inside the completion routine, post another overlapped WSARecv request with a completion routine.

6. Verify that WSAWaitForMultipleEvents returns WSA_WAIT_IO_COMPLETION.

7. Repeat steps 5 and 6.

Figure 8-8. Simple overlapped sample using completion routines

#include <winsock2.h>
#include <stdio.h>

#define DATA_BUFSIZE 4096   // assumed size; the excerpt does not define it

void CALLBACK WorkerRoutine(DWORD Error,
    DWORD BytesTransferred,
    LPWSAOVERLAPPED Overlapped,
    DWORD InFlags);

SOCKET AcceptSocket;
WSABUF DataBuf;
char Buffer[DATA_BUFSIZE];

int main(void)
{
    WSAOVERLAPPED Overlapped;
    WSAEVENT EventArray[1];
    SOCKET ListenSocket;
    DWORD Flags, RecvBytes, Index;

    // Step 1:
    // Start Winsock, and set up a listening socket
    ...

    // Step 2:
    // Accept a new connection
    AcceptSocket = accept(ListenSocket, NULL, NULL);

    // Step 3:
    // Now that we have an accepted socket, start
    // processing I/O using overlapped I/O with a
    // completion routine. To get the overlapped I/O
    // processing started, first submit an
    // overlapped WSARecv() request.

    Flags = 0;

    ZeroMemory(&Overlapped, sizeof(WSAOVERLAPPED));

    DataBuf.len = DATA_BUFSIZE;
    DataBuf.buf = Buffer;

    // Step 4:
    // Post an asynchronous WSARecv() request
    // on the socket by specifying the WSAOVERLAPPED
    // structure as a parameter, and supply
    // the WorkerRoutine function below as the
    // completion routine

    if (WSARecv(AcceptSocket, &DataBuf, 1, &RecvBytes,
        &Flags, &Overlapped, WorkerRoutine)
        == SOCKET_ERROR)
    {
        if (WSAGetLastError() != WSA_IO_PENDING)
        {
            printf("WSARecv() failed with error %d\n",
                WSAGetLastError());
            return 1;
        }
    }

    // Since the WSAWaitForMultipleEvents() API
    // requires waiting on one or more event objects,
    // we will have to create a dummy event object.
    // As an alternative, we can use SleepEx()
    // instead.

    EventArray[0] = WSACreateEvent();

    while (TRUE)
    {
        // Step 5:
        Index = WSAWaitForMultipleEvents(1, EventArray,
            FALSE, WSA_INFINITE, TRUE);

        // Step 6:
        if (Index == WSA_WAIT_IO_COMPLETION)
        {
            // An overlapped request completion routine
            // just completed. Go back to waiting so that
            // more completion routines can be serviced.
            continue;
        }
        else
        {
            // A bad error occurred--stop processing!
            // If we were also processing an event
            // object, this could be an index to
            // the event array.

            return 1;
        }
    }
}

void CALLBACK WorkerRoutine(DWORD Error,
    DWORD BytesTransferred,
    LPWSAOVERLAPPED Overlapped,
    DWORD InFlags)
{
    DWORD RecvBytes;
    DWORD Flags;

    if (Error != 0 || BytesTransferred == 0)
    {
        // Either a bad error occurred on the socket
        // or the socket was closed by a peer
        closesocket(AcceptSocket);
        return;
    }

    // At this point, an overlapped WSARecv() request
    // completed successfully. Now we can retrieve the
    // received data that is contained in the variable
    // DataBuf. After processing the received data, we
    // need to post another overlapped WSARecv() or
    // WSASend() request. For simplicity, we will post
    // another WSARecv() request.

    Flags = 0;

    // Overlapped is already a pointer to the structure
    // used by the original request, so clear and reuse
    // it directly rather than taking its address.
    ZeroMemory(Overlapped, sizeof(WSAOVERLAPPED));

    DataBuf.len = DATA_BUFSIZE;
    DataBuf.buf = Buffer;

    if (WSARecv(AcceptSocket, &DataBuf, 1, &RecvBytes,
        &Flags, Overlapped, WorkerRoutine)
        == SOCKET_ERROR)
    {
        if (WSAGetLastError() != WSA_IO_PENDING)
        {
            printf("WSARecv() failed with error %d\n",
                WSAGetLastError());
            return;
        }
    }
}





Thank you for the resource, but after looking over that and searching IOCP, I gather the advantages of these methods come through using multiple threads, which I don't quite understand yet.

Quite the opposite: these kinds of models allow you to manage several thousand connections without the need to create a thread for each one.

OT Note: Although threaded programming is now encouraged, with the arrival of multicore CPUs and even cheaper dual-CPU systems, the number of threads should be close to the number of CPU cores, not in the thousands.
