Programming inquiry

I’m sure some of you have already tackled this problem, and if so, feel free to shake your heads and write this off as n00btacular. I’m just curious how quickly anyone can devise a solution.

This problem is pretty specific, “let n be an integer” style. Here goes:

You’re writing a networked program in C, specifically using TCP. You’re going to receive some data from the server, but you’re not sure how much there will be. Your only indicators will be, somewhere in the first hundred bytes or so, a “Length” field, and a “\r\n\r\n” marking the end of the header and the start of the body, which will be Length bytes long.

If you said “HOLY CARP IT’S HTTP!!1”, you’d be correct, monsieur.

But here’s the problem. You don’t know the size of the header section of the incoming data (some headers could be missing, as many HTTP headers are optional), and you certainly won’t know the size of the incoming body until you find the Length field somewhere in the header and the subsequent “\r\n\r\n.”

There is a function, “recv()”, which receives up to a specified number of bytes over a TCP connection. But since you have no idea how much information is coming over the wire, you may have to call “recv()” several times before you possess all the data. What would be the best implementation of a function which performs this “recv()” operation several times internally and returns a single reference to the entire block of received data?

The call to this function would look something like this (in pseudo-code):

dataBuffer = receiveAll(socketIdentifier, AnyOtherAdditionalParameters)

It’s certainly not an impossible task. Really, the challenge lies in weighing the pros and cons of each solution, particularly with regard to overall performance and efficiency, and choosing an implementation that solves the problem in the best way possible.

Of course, since C is so finicky about memory, there is also a (rather significant) challenge in making sure all the allocated blocks mesh cleanly, but that lies in the tweaking of the implementation, not the implementation itself.
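One way to keep the allocated blocks meshing cleanly is to put all the growth logic in a single place. Below is a minimal sketch of a growable byte buffer; the names `byte_buf` and `byte_buf_append` are made up for illustration, and the doubling strategy is just one reasonable choice, not the only one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: data holds len valid bytes out of cap allocated. */
struct byte_buf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append n bytes from src, doubling the capacity until they fit.
   Returns 0 on success, -1 if realloc fails (the buffer is left unchanged). */
int byte_buf_append(struct byte_buf *b, const char *src, size_t n)
{
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 256;   /* assumed starting size */
        while (cap < b->len + n)
            cap *= 2;
        char *tmp = realloc(b->data, cap);    /* realloc(NULL, ...) acts like malloc */
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

Start from a zeroed struct (`struct byte_buf b = {0};`), append each chunk as it arrives, and `free(b.data)` when done. Doubling keeps the number of reallocations logarithmic in the final size, at the cost of some unused capacity.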

And by “tweaking” I mean “debugging.” 😛

Oh. And before anyone says anything…yesIknowit’sFridaynightandI’mworking. I’m text messaging/shamelessly flirting with my girlfriend as I program, does that count as a social engagement?

*grin*

About Shannon Quinn

Oh hai!
This entry was posted in Programming, The Lady.

4 Responses to Programming inquiry

  1. Dan Sibbernsen says:

    Eh, just to get the ball rolling (this is by no means an EFFICIENT solution):

    I’m going to make some assumptions as follows, correct me if I’m wrong.
    1) You know how many bytes are coming in each transmission.
    2) None of the headers have been corrupted (i.e. the \r\n\r\n would always be there)

    [Pseudo-code]

    bool notFound = true;
    int index = 0;
    while(notFound)
    {
        dataBuffer = receiveAll(socketIdentifier, AnyOtherAdditionalParameters)
        //stop 3 bytes early so index+3 never runs past the received data
        for(index = 0; index + 3 < BytesReceived && notFound; index++)
        {
            if(dataBuffer[index…index+3] is “\r\n\r\n”)
                notFound = false;
        }
    }

    //found the end of the header, which contains the length field
    //now just get the rest of the data according to the length (of course, only getting as much as you need, since part of the body might have come in with the data)
    dataBuffer = receiveAll(socketIdentifier, length of body to be sent)
    [/Pseudo-code]

    That would be my first idea. I’ll post more if I get time to think about this 🙂

    also, do you know if there’s a list of text transformations for the comments that are posted here? I’m probably going to be commenting a lot, and would like them to be readable 🙂

  2. magsol says:

    I realize I phrased the question terribly. Permit me to take another stab at it.

    You have information coming in (I’ll dub this “input”), which you can access via the function “recv()”. The overall format of the input has two distinct and mutually exclusive possibilities, enumerated here:

    1: blahblahasjo8ehbadasdfiog\r\n\r\n
    2: bljablahblahContent-Length: 12345\r\nkljaogear\r\n\r\n[blahblahblahblahblahblahlahbkadfas]

    In the first case, you don’t know how long the header is, but you do know it is ended with a “\r\n\r\n”. In the second case, again you don’t know how long the header is, but again it ends with a “\r\n\r\n” and it contains a field called “Content-Length” specifying the EXACT number of bytes in the transmission AFTER the “\r\n\r\n”.
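To make the two cases concrete, here is a small sketch that tells them apart once the bytes are in memory. The function names `header_end` and `content_length` are hypothetical. The sketch matches “Content-Length:” with exact case, as in the example above (real HTTP field names are case-insensitive), and it does not guard against absurdly large values overflowing a `long`.

```c
#include <stddef.h>
#include <string.h>

/* Offset of the first body byte (just past "\r\n\r\n"),
   or 0 if the terminator is not in buf[0..len) yet. */
size_t header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}

/* Value of the Content-Length field inside the first hdr_len bytes,
   or -1 if the field is absent (case 1) or has no digits. */
long content_length(const char *buf, size_t hdr_len)
{
    static const char key[] = "Content-Length:";
    const size_t klen = sizeof key - 1;

    for (size_t i = 0; i + klen <= hdr_len; i++) {
        if (memcmp(buf + i, key, klen) != 0)
            continue;
        size_t j = i + klen;
        while (j < hdr_len && (buf[j] == ' ' || buf[j] == '\t'))
            j++;                                  /* skip optional whitespace */
        if (j == hdr_len || buf[j] < '0' || buf[j] > '9')
            return -1;
        long n = 0;
        while (j < hdr_len && buf[j] >= '0' && buf[j] <= '9')
            n = n * 10 + (buf[j++] - '0');        /* parse digits by hand: buf is not NUL-terminated */
        return n;
    }
    return -1;
}
```

With these two, a complete case 1 message is one where `header_end` is nonzero and `content_length` is -1, while a case 2 message is complete once `header_end(...) + content_length(...)` bytes have arrived.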

    You seem to have understood that well enough, I just wanted to clarify that for your assumption #1.

    Here’s the real challenge. “recv()”, which hands back some arbitrary-sized portion of the data, takes a STATIC input buffer in which it puts the chunk of data it receives. “recv()” will make a best possible effort to receive all the data in one chunk, and if the type of input conforms to type 1 above, it is very possible that all the data will fit into a fairly large static buffer in a single call to “recv()”.

    Problem is, you simply don’t know if that’s going to be the case. Even in the case of type 1, you may only get a portion of the header (you’ll know this by not finding a “\r\n\r\n” in the received chunk), and thus you’ll have to make another call to “recv()”.
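One wrinkle worth sketching: the “\r\n\r\n” can be split across two chunks, so checking each received chunk on its own can miss it. A possible approach, assuming the chunks are accumulated into one contiguous buffer, is to rescan from three bytes before the point where the previous search stopped. The name `scan_for_terminator` and the `scanned` bookkeeping are illustrative, not from any library.

```c
#include <stddef.h>
#include <string.h>

/* Look for "\r\n\r\n" in buf[0..len), where the first *scanned bytes were
   already examined by earlier calls. Backs up 3 bytes so a terminator that
   straddles two chunks is still found. Returns the offset just past the
   terminator, or 0 (and updates *scanned) if it has not arrived yet. */
size_t scan_for_terminator(const char *buf, size_t len, size_t *scanned)
{
    size_t i = (*scanned >= 3) ? *scanned - 3 : 0;

    for (; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    *scanned = len;
    return 0;
}
```

Call it after every “recv()” with the total number of bytes accumulated so far. Because each byte is examined only a constant number of times, the search stays linear even when the header dribbles in a few bytes at a time.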

    So what you’re actually doing is WRITING the function “recvAll()”, which calls “recv()” as many times as it needs to and makes the checks necessary to determine what type of input is arriving, and whether or not it has all arrived. If you want to throw memory management in too, that’s an added level of complexity, but you’ll get added street cred from me. 😛
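For reference, here is one rough sketch of what such a function could look like on a POSIX system. It is not the solution mentioned in this comment, just an illustration under stated assumptions: the name `recv_all`, its signature, the 1024-byte starting size, and the two static helpers are all invented here, the field name is matched with exact case, and a missing “Content-Length” is treated as “no body” (case 1).

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Offset just past the first "\r\n\r\n" in buf[0..len), or 0 if absent. */
static size_t find_header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return 0;
}

/* Value of "Content-Length:" within the header, or 0 if the field is absent. */
static size_t find_body_len(const char *buf, size_t hdr_len)
{
    static const char key[] = "Content-Length:";
    const size_t klen = sizeof key - 1;

    for (size_t i = 0; i + klen <= hdr_len; i++) {
        if (memcmp(buf + i, key, klen) != 0)
            continue;
        size_t j = i + klen, n = 0;
        while (j < hdr_len && buf[j] == ' ')
            j++;
        while (j < hdr_len && buf[j] >= '0' && buf[j] <= '9')
            n = n * 10 + (size_t)(buf[j++] - '0');
        return n;
    }
    return 0;
}

/* Reads one header (plus body, if Content-Length is present) from sock.
   Returns a malloc'd buffer the caller must free and stores its size in
   *out_len; returns NULL on error or if the peer closes early. */
char *recv_all(int sock, size_t *out_len)
{
    size_t cap = 1024, len = 0, hdr = 0, total = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (hdr == 0 || len < total) {
        /* Once the total is known, never ask for more than is still missing. */
        size_t want = hdr ? total - len : 1024;
        if (len + want > cap) {
            cap = hdr ? total : cap * 2;   /* exact size if known, else double */
            char *tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        ssize_t got = recv(sock, buf + len, want, 0);
        if (got <= 0) {                    /* error, or connection closed */
            free(buf);
            return NULL;
        }
        len += (size_t)got;
        if (hdr == 0) {
            hdr = find_header_end(buf, len);
            if (hdr != 0)
                total = hdr + find_body_len(buf, hdr);
        }
    }
    *out_len = len;
    return buf;
}
```

A caller would write something like `size_t n; char *data = recv_all(sock, &n);` and `free(data)` afterwards. Two trade-offs to be aware of: the header search restarts from the beginning after every “recv()”, which is simple but wasteful for long, fragmented headers, and until the header terminator is seen the function may read past the end of the current message if the server sends several back to back.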

    I wrote a solution over the weekend that I can post, as it appears to work (though I fully plan on testing it much more thoroughly in the near future). I hope this all makes sense now.

  3. thelegacyx4 says:

    shannon, you are a nerd. ahahahaha

  4. magsol says:

    I checked “yes” to the question “Are you a nerd?” on the GT application for admission.

    By the way, Double-D called. He wants his aviators back.
