Hey, thanks for the reply!
About the 50% CPU: now that you ask, I didn't pay attention to that detail, lol! I only looked at the top of "htop", so perhaps that figure was for just 1 core. My process was listed at the top, eating most of it, though.
I'll have to look into "time", and kernel-level profilers sound interesting too, but are probably quite involved to set up?
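
For anyone following along: what "time" adds over a glance at htop is the split between user and system CPU time. A large system share would point at per-packet kernel/syscall overhead rather than at the application code. A rough in-process equivalent, as a sketch only (Python is used for brevity; `cpu_split` and `busy_loop` are made-up names, and `busy_loop` merely stands in for the real receive loop):

```python
import os
import time

def cpu_split(work):
    """Run work() and return (user_s, system_s, wall_s) used by this process."""
    t0 = os.times()
    w0 = time.monotonic()
    work()
    t1 = os.times()
    w1 = time.monotonic()
    return (t1.user - t0.user, t1.system - t0.system, w1 - w0)

def busy_loop():
    # Hypothetical stand-in for the real workload (e.g. the receive loop).
    total = 0
    for i in range(200_000):
        total += i * i
    return total

if __name__ == "__main__":
    user_s, system_s, wall_s = cpu_split(busy_loop)
    print(f"user {user_s:.3f}s  sys {system_s:.3f}s  wall {wall_s:.3f}s")
```

If the system share dominates for the real receive loop, reducing the number of system calls per packet is the more promising direction.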
Cache coherency, or rather the lack of it... that rings a bell, I think that problem is present here.
While it does seem very plausible that my CPU just hasn't got enough horsepower, I still wonder what iperf is doing better (if not exactly impressively), and why, that one day, I seemed to get consistently zero packet loss after I set 8 MB buffers (4 MB was just barely not enough; the effect seemed roughly proportional to the buffer size).
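
One thing worth checking with the 8 MB buffers is whether the kernel actually granted the requested size. On Linux, the SO_RCVBUF request is silently capped by net.core.rmem_max, and the value read back is double what was granted because the kernel accounts for bookkeeping overhead. A minimal sketch, assuming a plain IPv4 UDP socket (`open_udp_receiver` and its parameters are made-up names, not from the setup described above):

```python
import socket

def open_udp_receiver(port, rcvbuf_bytes=8 * 1024 * 1024, bind_addr="0.0.0.0"):
    """Open a UDP socket, request a receive buffer, return (socket, granted size)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request the buffer size; Linux caps this at net.core.rmem_max.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.bind((bind_addr, port))
    # Read back what the kernel actually set (on Linux: twice the granted value).
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted

if __name__ == "__main__":
    sock, granted = open_udp_receiver(0)  # port 0: let the OS pick a free port
    print("requested", 8 * 1024 * 1024, "bytes, kernel reports", granted)
    sock.close()
```

If the reported value is well below twice the request, the rmem_max limit was hit and the buffer is smaller than intended.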
I will try out what happens with only one system call!
Edit:
Ok, I tried it with only receive, without poll, and with the timeout check moved to a low-frequency thread. That reduced the CPU load slightly at best. I wonder why it's not always the same: sometimes core0 sits at 95..99% at my target data rate, and then there is no packet loss, but when it hits 100%, not surprisingly, there is.
Core1 is mostly < 2% busy.
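
The receive-only variant described above (one blocking receive per packet, no poll, timeout checked from a slow thread) can be sketched roughly like this. It is an illustration under assumptions, not the actual code: all names (`RxState`, `receive_loop`, `watchdog`, `demo`) are made up, and `demo` just exercises the two threads over loopback.

```python
import socket
import threading
import time

class RxState:
    """Shared between the receive thread and the watchdog thread."""
    def __init__(self):
        self.packets = 0
        self.last_rx = time.monotonic()
        self.timed_out = False

def receive_loop(sock, state, stop):
    buf = bytearray(2048)            # bigger than one datagram at MTU 1500
    while True:
        sock.recv_into(buf)          # one blocking receive per packet, no poll()
        if stop.is_set():            # woken by the dummy datagram sent on shutdown
            break
        state.packets += 1
        state.last_rx = time.monotonic()

def watchdog(state, stop, timeout_s=5.0, period_s=0.2):
    # stop.wait() returns False when the period elapses, True once stop is set.
    while not stop.wait(period_s):
        if time.monotonic() - state.last_rx > timeout_s:
            state.timed_out = True

def demo(n_packets=50):
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    addr = rx.getsockname()
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    state, stop = RxState(), threading.Event()
    threads = [
        threading.Thread(target=receive_loop, args=(rx, state, stop)),
        threading.Thread(target=watchdog, args=(state, stop)),
    ]
    for t in threads:
        t.start()
    for _ in range(n_packets):
        tx.sendto(b"x" * 64, addr)
    # Give the receiver a moment to drain the socket before shutting down.
    deadline = time.monotonic() + 5.0
    while state.packets < n_packets and time.monotonic() < deadline:
        time.sleep(0.01)
    stop.set()
    tx.sendto(b"", addr)             # empty datagram wakes the blocked receiver
    for t in threads:
        t.join()
    rx.close()
    tx.close()
    return state.packets, state.timed_out

if __name__ == "__main__":
    print(demo())
```

The point of the design is that the hot path does nothing but receive and update a timestamp, while the timeout logic runs a few times per second on the otherwise idle second core.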
I disabled the LXDE desktop completely to see whether the barely-enough-CPU case (99%) would be more repeatable then, but it's not.
I'll look into whether all parts of my gear support jumbo frames, to reduce the number of packets/sec I get...
=> Too bad, the iMX6 won't go above 1500 MTU.
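
For reference, the arithmetic behind the jumbo-frame idea: with IPv4 and UDP headers (20 + 8 bytes, assuming no IP options), an MTU of 1500 leaves 1472 bytes of payload per packet, while a typical jumbo MTU of 9000 would leave 8972, i.e. roughly 6 times fewer packets for the same data rate. A small sketch; the data rate in it is purely hypothetical, since the actual target rate isn't stated here:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

def udp_payload_per_packet(mtu):
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER

def packets_per_second(payload_bits_per_s, mtu):
    """Packets/sec needed to carry the given payload rate with full-size packets."""
    return payload_bits_per_s / (8 * udp_payload_per_packet(mtu))

if __name__ == "__main__":
    rate = 400e6  # hypothetical payload rate in bit/s, not taken from the post
    for mtu in (1500, 9000):
        print(mtu, round(packets_per_second(rate, mtu)), "packets/sec")
```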