How Common are Data Errors in TCP?

9 comments, last by tufflax 12 years, 10 months ago

[quote name='tufflax' timestamp='1308871164' post='4827047']
I'm sending the exact same message about 30000 times or so

(zero? (.remaining rbuf)) (let [len (.. rbuf (flip) (getShort))



Something to consider: The maximum number in a signed short is 32767. If you use a message sequence number or similar, and store it as a short, you would get an error after that value.

Sounds like you have a reproducible case, though. That's good! Print out the number of messages you have received every 100 messages or so, and check what the value is after the crash. Then set a breakpoint after that number of messages and re-run the case, so you can debug at the point where the crash happens. Or, if you have access to VMware Workstation, try using Replay Debugging.

You can also log all the data to a big file after receiving it, for later analysis. You may be able to then pipe that file back into the server to repeat the behavior that clients already had, to reproduce the crash faster so you can debug it.

I'm not that familiar with Clojure (been 20 years since I did Scheme :-) so I didn't take the time to read through all your code, sorry. Maybe someone else on the board?
[/quote]
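For anyone following along, here is a minimal sketch of the signed-short limit mentioned in the quote. It assumes the 2-byte field is read with `getShort`, as in the fragment above; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not taken from the code discussed in this thread.
class ShortLengthDemo {
    // Write `value` into a 2-byte field and read it back as getShort() does (signed).
    static int readSigned(int value) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort((short) value);
        buf.flip();
        return buf.getShort();
    }

    // Same round trip, but mask the result so the field is treated as unsigned (0..65535).
    static int readUnsigned(int value) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort((short) value);
        buf.flip();
        return buf.getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(readSigned(32767));   // 32767
        System.out.println(readSigned(32768));   // -32768
        System.out.println(readSigned(40000));   // -25536
        System.out.println(readUnsigned(40000)); // 40000
    }
}
```

So a counter or length stored this way only misbehaves once it passes 32767; a fixed message length below that is unaffected.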

The number is the length of the message, not a sequence number of some kind. If it were off I would notice right away, because I use a ByteArrayInputStream to read the messages back, and that would not work unless I gave it a byte array with a whole object in it (the byte array is the next <length> bytes of the SocketChannel). When I said it was the exact same message over and over I didn't lie. :P
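Since TCP is a byte stream, a single read can return part of a message or several messages at once, so length-prefixed framing usually needs a buffer that survives between reads. Below is a minimal sketch of that idea in plain Java. It is not the code from this thread: `FrameDecoder` and `feed` are hypothetical names, and it assumes a 2-byte big-endian length prefix treated as unsigned.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: accumulates bytes and emits only complete messages.
class FrameDecoder {
    private final ByteBuffer pending; // stays in "write mode" between calls

    FrameDecoder(int capacity) {
        // put() throws BufferOverflowException if more than `capacity` bytes pile up.
        pending = ByteBuffer.allocate(capacity);
    }

    // Feed whatever one read produced; returns every message that is now complete.
    List<byte[]> feed(byte[] chunk) {
        pending.put(chunk);
        pending.flip(); // switch to read mode
        List<byte[]> frames = new ArrayList<>();
        while (pending.remaining() >= 2) {
            pending.mark();
            int len = pending.getShort() & 0xFFFF; // assumption: unsigned length
            if (pending.remaining() < len) {
                pending.reset(); // body incomplete: rewind to the length prefix
                break;
            }
            byte[] body = new byte[len];
            pending.get(body);
            frames.add(body);
        }
        pending.compact(); // keep leftover bytes and return to write mode
        return frames;
    }

    public static void main(String[] args) {
        FrameDecoder decoder = new FrameDecoder(64);
        // The message "hi" framed twice (00 02 68 69), split awkwardly across three reads.
        System.out.println(decoder.feed(new byte[] {0}).size());
        System.out.println(decoder.feed(new byte[] {2, 104}).size());
        System.out.println(decoder.feed(new byte[] {105, 0, 2, 104, 105}).size());
    }
}
```

Each returned array holds exactly one message body, so it can be handed to a ByteArrayInputStream as described above. A rare, hard-to-reproduce failure is at least consistent with a read that happened to split a message, though nothing in this thread confirms that is the cause here.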
Hm, now the server has run for an hour or so without running into the problem, and the client has sent over 200 000 messages to it. Not so reproducible after all...

But maybe it's a good idea for me to use a framework such as MINA anyway. Do any of you have any experience with it? Any tips?

Btw, Antheus, I have not been able to find any info on bugs in NIO. It sounds strange that Sun (or Oracle?) would just not fix bugs in it. And MINA is built upon it, so it must be manageable somehow? You are making me paranoid. :P

