Hi,

I'm wondering if anyone knows how to calculate the upload speed of a Berkeley socket in C++. My send call isn't blocking and takes 0.001 seconds to send 5 megabytes of data, but takes a while to recv the response (so I know it's uploading).

This is a TCP socket to an HTTP server and I need to asynchronously check how many bytes of data have been uploaded / are remaining. However, I can't find any API functions for this in Winsock, so I'm stumped.

Any help would be greatly appreciated.

EDIT: I've found the solution, and will be posting as an answer as soon as possible!

EDIT 2: Proper solution added as answer, will be added as solution in 4 hours.

+2  A: 

You can get a lower bound on the amount of data received and acknowledged by subtracting the value of the SO_SNDBUF socket option from the number of bytes you have written to the socket. This buffer may be adjusted using setsockopt, although in some cases the OS may choose a length smaller or larger than you specify, so you must re-check after setting it.
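A minimal sketch of that bound, as plain arithmetic with no socket calls. The function and parameter names are illustrative and not part of any API; sndBufSize is assumed to be the value read back with getsockopt(SO_SNDBUF), and bytesWritten is assumed to be a running total the application keeps itself.

```cpp
#include <cassert>
#include <cstdint>

// Lower bound on the bytes the peer has acknowledged (hypothetical helper).
// bytesWritten: total bytes accepted by send() so far, tracked by the caller.
// sndBufSize:   SO_SNDBUF as read back from getsockopt(), not the value requested.
std::uint64_t ackedLowerBound(std::uint64_t bytesWritten, std::uint64_t sndBufSize)
{
    // Worst case: the send buffer is completely full of unacknowledged data.
    return bytesWritten > sndBufSize ? bytesWritten - sndBufSize : 0;
}

// Matching upper bound on the bytes that may still be unacknowledged (hypothetical helper).
std::uint64_t remainingUpperBound(std::uint64_t totalBytes,
                                  std::uint64_t bytesWritten,
                                  std::uint64_t sndBufSize)
{
    assert(bytesWritten <= totalBytes);
    return totalBytes - ackedLowerBound(bytesWritten, sndBufSize);
}
```

With a 64 KiB buffer the bound is off by at most 65536 bytes, which is why a smaller SO_SNDBUF gives a tighter estimate.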

To get more precise than that, however, you must have the remote side inform you of progress, as Winsock does not expose an API to retrieve the amount of data currently pending in the send buffer.

Alternatively, you could implement your own transport protocol on top of UDP, but implementing rate control for such a protocol can be quite complex.

bdonlan
My only remote-side protocol option is HTTP, so I just need a way of checking how many bytes Winsock has actually sent to the server.
Saul Rennison
Then you'll have to reduce the SNDBUF size to the point where you can make measurements, I suppose... Note that this may negatively impact performance, though.
bdonlan
A: 

If your app uses packet headers like

0001234DT

where 000123 is the packet length for a single packet, you can consider using MSG_PEEK + recv() to get the length of the packet before you actually read it with recv().
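As a rough sketch of the parsing step, assuming the length is a fixed-width field of ASCII decimal digits at the start of the header: parseLengthPrefix and its parameters are hypothetical names, and the bytes passed in are assumed to come from a recv() call made with MSG_PEEK, so they stay in the socket buffer for the real read.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Parses a fixed-width ASCII decimal length field from the start of a peeked header.
// Returns -1 if fewer than `width` bytes have arrived or a non-digit is found.
// (Hypothetical helper; the field width is an assumption about the protocol.)
long parseLengthPrefix(const char* header, std::size_t available, std::size_t width)
{
    assert(header != nullptr);
    if (available < width)
        return -1; // header not fully received yet, peek again later

    long length = 0;
    for (std::size_t i = 0; i < width; ++i) {
        const unsigned char c = static_cast<unsigned char>(header[i]);
        if (!std::isdigit(c))
            return -1; // malformed header
        length = length * 10 + (c - '0');
    }
    return length;
}
```

For the header shown above, parseLengthPrefix("0001234DT", 9, 6) reads the first six characters and yields 123.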

The problem is send() is NOT doing what you think - it is buffered by the kernel.

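// ERR_CHK() and now() are application helpers, not library calls.
// On Winsock, declare sz as int and cast &flag to (char *).
int flag = 0;
socklen_t sz = sizeof(flag);
ERR_CHK(getsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &flag, &sz));
fprintf(stdout, "%s: listener socket send buffer = %d\n", now(), flag);
sz = sizeof(flag);
ERR_CHK(getsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &flag, &sz));
fprintf(stdout, "%s: listener socket recv buffer = %d\n", now(), flag);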

See what these show for you.

When you recv on a NON-blocking socket that has data, it normally does not have MB of data parked in the buffer ready to recv. In my experience the socket has ~1500 bytes of data per recv. Since you are probably reading on a blocking socket, it takes a while for the recv() to complete.

Socket buffer size is probably the single best predictor of socket throughput. setsockopt() lets you alter socket buffer size, up to a point. Note: these buffers are shared among sockets in a lot of OSes like Solaris. You can kill performance by twiddling these settings too much.
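One way to see why buffer size matters is the usual window-over-round-trip estimate: at most one send buffer of unacknowledged data can be in flight per round trip. The sketch below is only that back-of-the-envelope ceiling, assuming the send buffer is the sole limit (no congestion, no receiver window); the function name is made up for illustration.

```cpp
#include <cassert>

// Rough ceiling on throughput when only the send buffer limits it:
// no more than sndBufBytes can be unacknowledged during one round trip.
// (Hypothetical helper; ignores congestion control and the receiver's window.)
double maxThroughputBytesPerSec(double sndBufBytes, double rttSeconds)
{
    assert(sndBufBytes >= 0.0 && rttSeconds > 0.0);
    return sndBufBytes / rttSeconds;
}
```

For example, a 65536-byte buffer over a 50 ms round trip caps out near 1.25 MiB/sec, however fast the link is.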

Also, I don't think you are measuring what you think you are measuring. The real efficiency of send() is the measure of throughput on the recv() end. Not the send() end. IMO.

jim mcnamara
A: 

Since you don't have control over the remote side, and you want to do it in the code, I'd suggest doing a very simple approximation. I assume a long-lived program/connection. One-shot uploads would be too skewed by ARP, DNS lookups, socket buffering, TCP slow start, etc.

Have two counters - length of the outstanding queue in bytes (OB), and number of bytes sent (SB):

  • increment OB by number of bytes to be sent every time you enqueue a chunk for upload,
  • decrement OB and increment SB by the number returned from send(2) (modulo -1 cases),
  • on a timer sample both OB and SB - either store them, log them, or compute running average,
  • compute outstanding bytes per second/minute/whatever, same for sent bytes.
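The two counters above can be sketched as follows. This is a minimal illustration that is not tied to any socket API: UploadCounters, onEnqueue, onSend and bytesPerSecond are made-up names, and the caller is assumed to pass in the raw return value of send() and to take the samples on its own timer.

```cpp
#include <cassert>
#include <cstdint>

// OB and SB from the list above (hypothetical type, names are illustrative).
struct UploadCounters {
    std::uint64_t outstanding = 0; // OB: bytes queued but not yet accepted by send()
    std::uint64_t sent = 0;        // SB: bytes accepted by send() so far

    // Call when a chunk is queued for upload.
    void onEnqueue(std::uint64_t bytes) { outstanding += bytes; }

    // Call with the return value of send(); -1 (error) and 0 change nothing.
    void onSend(long long sendResult)
    {
        if (sendResult <= 0)
            return;
        const std::uint64_t n = static_cast<std::uint64_t>(sendResult);
        assert(n <= outstanding);
        outstanding -= n;
        sent += n;
    }
};

// Average rate between two timer samples of the same counter.
double bytesPerSecond(std::uint64_t before, std::uint64_t now, double elapsedSeconds)
{
    assert(now >= before && elapsedSeconds > 0.0);
    return static_cast<double>(now - before) / elapsedSeconds;
}
```

Sampling `sent` once per second and feeding consecutive samples to bytesPerSecond gives the push rate; a steadily growing `outstanding` means the app is producing data faster than the network accepts it.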

The network stack does buffering and TCP does retransmission and flow control, but that doesn't really matter. These two counters will tell you the rate at which your app produces data, and the rate at which it is able to push it to the network. It's not a method to find out the real link speed, but a way to keep useful indicators about how well the app is doing.

If the data production rate is below the network output rate, everything is fine. If it's the other way around and the network cannot keep up with the app, there's a problem: you need either a faster network, a slower app, or a different design.

For one-time experiments just take periodic snapshots of netstat -sp tcp output (or whatever that is on Windows) and calculate the send-rate manually.

Hope this helps.

Nikolai N Fetissov
+1  A: 

I solved my issue thanks to bdonlan suggesting to reduce SO_SNDBUF. However, to use this code, note that your code must use Winsock 2 (for overlapped sockets and WSASend). In addition to this, your SOCKET handle must have been created similarly to:

SOCKET sock = WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP, NULL, 0, WSA_FLAG_OVERLAPPED);

Note the WSA_FLAG_OVERLAPPED flag as the final parameter.

In this answer I will go through the stages of uploading data to a TCP server, and tracking each upload chunk and its completion status. This concept requires splitting your upload buffer into chunks (minimal existing code modification required) and uploading it piece by piece, then tracking each chunk.

My code flow

Global variables

Your source file must have the following global variables:

#define UPLOAD_CHUNK_SIZE 4096

int g_nUploadChunks = 0;
int g_nChunksCompleted = 0;
WSAOVERLAPPED *g_pSendOverlapped = NULL;
int g_nBytesSent = 0;
float g_flLastUploadTimeReset = 0.0f;

Note: in my tests, decreasing UPLOAD_CHUNK_SIZE results in increased upload speed accuracy, but decreases overall upload speed. Increasing UPLOAD_CHUNK_SIZE results in decreased upload speed accuracy, but increases overall upload speed. 4 kilobytes (4096 bytes) was a good compromise for a file ~500kB in size.

Callback function

This function increments the bytes sent and chunks completed variables. It is called after a chunk has been completely uploaded to the server.

void CALLBACK SendCompletionCallback(DWORD dwError, DWORD cbTransferred, LPWSAOVERLAPPED lpOverlapped, DWORD dwFlags)
{
    g_nChunksCompleted++;
    g_nBytesSent += cbTransferred;
}

Prepare socket

Initially, the socket must be prepared by reducing SO_SNDBUF to 0.

Note: In my tests, any value greater than 0 will result in undesirable behaviour.

int nSndBuf = 0;
setsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char*)&nSndBuf, sizeof(nSndBuf));

Create WSAOVERLAPPED array

An array of WSAOVERLAPPED structures must be created to hold the overlapped status of all of our upload chunks. To do this I simply:

// Calculate the number of upload chunks we will have to create.
// nDataBytes is the size of the data you wish to upload.
// Integer ceiling division avoids float rounding on large uploads.
g_nUploadChunks = (nDataBytes + UPLOAD_CHUNK_SIZE - 1) / UPLOAD_CHUNK_SIZE;

// Overlapped array, should be delete'd after all uploads have completed
g_pSendOverlapped = new WSAOVERLAPPED[g_nUploadChunks];
memset(g_pSendOverlapped, 0, sizeof(WSAOVERLAPPED) * g_nUploadChunks);
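As a sketch of how a buffer divides into chunks, here is the same arithmetic in standalone form, with no Winsock dependency. chunkCount and chunkLength are illustrative names and not part of the code in this answer; every chunk is full-sized except possibly the last.

```cpp
#include <cassert>
#include <cstddef>

// Number of chunks needed for totalBytes, using integer ceiling division.
// (Hypothetical helper.)
std::size_t chunkCount(std::size_t totalBytes, std::size_t chunkSize)
{
    assert(chunkSize > 0);
    return (totalBytes + chunkSize - 1) / chunkSize;
}

// Length of chunk `index`, which starts at byte offset index * chunkSize.
// (Hypothetical helper.)
std::size_t chunkLength(std::size_t totalBytes, std::size_t chunkSize, std::size_t index)
{
    assert(index < chunkCount(totalBytes, chunkSize));
    const std::size_t offset = index * chunkSize;
    const std::size_t left = totalBytes - offset;
    return left < chunkSize ? left : chunkSize;
}
```

A 500000-byte upload with 4096-byte chunks needs 123 chunks, the last of which carries 288 bytes.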

Upload data

All of the data that needs to be sent is, for example purposes, held in a variable called pszData. Then, using WSASend, the data is sent in blocks whose size is defined by the constant UPLOAD_CHUNK_SIZE.

WSABUF dataBuf;
DWORD dwBytesSent = 0;
int err;
int i, j;

for(i = 0, j = 0; i < nDataBytes; i += UPLOAD_CHUNK_SIZE, j++)
{
    int nTransferBytes = min(nDataBytes - i, UPLOAD_CHUNK_SIZE);

    dataBuf.buf = &pszData[i];
    dataBuf.len = nTransferBytes;

    // Now upload the data
    int rc = WSASend(sock, &dataBuf, 1, &dwBytesSent, 0, &g_pSendOverlapped[j], SendCompletionCallback);

    if ((rc == SOCKET_ERROR) && (WSA_IO_PENDING != (err = WSAGetLastError())))
    {
        fprintf(stderr, "WSASend failed: %d\n", err);
        exit(EXIT_FAILURE);
    }
}

The waiting game

Now we can do whatever we wish while all of the chunks upload.

Note: the thread which called WSASend must be regularly put into an alertable state, so that our 'transfer completed' callback (SendCompletionCallback) is dequeued from the APC (Asynchronous Procedure Call) queue.

In my code, I continuously looped until g_nUploadChunks == g_nChunksCompleted. This is to show the end-user upload progress and speed (can be modified to show estimated completion time, elapsed time, etc.)

Note 2: this code uses Plat_FloatTime as a timer that returns seconds. Replace it with whatever timer your code uses (or adjust accordingly).

g_flLastUploadTimeReset = Plat_FloatTime();

// Clear the line on the screen with some default data
printf("(0 chunks of %d) Upload speed: ???? KiB/sec", g_nUploadChunks);

// Keep looping until ALL upload chunks have completed
while(g_nChunksCompleted < g_nUploadChunks)
{
    // Wait for 10ms so then we aren't repeatedly updating the screen
    SleepEx(10, TRUE);

    // Update chunk count
    printf("\r(%d chunks of %d) ", g_nChunksCompleted, g_nUploadChunks);

    // Not enough time passed?
    float flNow = Plat_FloatTime();
    float flElapsed = flNow - g_flLastUploadTimeReset;
    if(flElapsed < 1.0f)
        continue;

    // Reset timer
    g_flLastUploadTimeReset = flNow;

    // Calculate kibibytes per second over the time that actually elapsed
    float flByteRate = (g_nBytesSent / 1024.0f) / flElapsed;
    printf("Upload speed: %.2f KiB/sec", flByteRate);

    // Reset byte count
    g_nBytesSent = 0;
}

// Delete overlapped data (not used anymore)
delete [] g_pSendOverlapped;

// Note that the transfer has completed
printf("\nTransfer completed successfully!\n");
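Plat_FloatTime is specific to my codebase. A portable stand-in could look like the sketch below, which assumes a C++11 compiler; secondsNow and kibPerSecond are made-up names. The rate helper divides by the measured interval, since the loop wakes up slightly later than one second after the last reset.

```cpp
#include <cassert>
#include <chrono>

// Seconds since an arbitrary fixed point, read from a monotonic clock.
// (Hypothetical replacement for Plat_FloatTime.)
double secondsNow()
{
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}

// KiB/sec over a measured interval (hypothetical helper).
double kibPerSecond(unsigned long long bytes, double elapsedSeconds)
{
    assert(elapsedSeconds > 0.0);
    return (static_cast<double>(bytes) / 1024.0) / elapsedSeconds;
}
```

steady_clock is used rather than system_clock so that a wall-clock adjustment during the upload cannot produce a negative or inflated interval.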

Conclusion

I really hope this helps somebody in the future who wishes to calculate upload speed on their TCP sockets without any server-side modifications. I have no idea how detrimental to performance SO_SNDBUF = 0 is, although I'm sure a socket guru will point that out.

Saul Rennison