[.net] memset?

Started by
19 comments, last by Pixar 14 years, 1 month ago
Hey! Is there a way to replicate the functionality of "memset" in C#? I'm doing some unsafe hacking and need to zero out a block of memory very quickly. Thanks!
Not going to happen.

People don't use C# for its runtime performance.
Well, there are pointers etc. to get runtime performance where it's needed, so I figured that if I'm allowed to work with raw buffers, maybe there's a fast way to zero a buffer.
Maybe you can do it with p/invoke, see the end of this discussion:
http://www.gamedev.net/community/forums/topic.asp?whichpage=1&pagesize=25&topic_id=389036

("People don't use C# for its runtime performance." Hmm. C#'s killer feature isn't speed, but that doesn't mean we shouldn't look for ways to optimize.)
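Unless there's a specific .NET Framework function to do it, you could just use memset.

        // msvcrt.dll is the Windows C runtime; memset's count is in bytes (size_t).
        [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
        private static unsafe extern void memset(int[] dest, int c, UIntPtr count);

        private void button1_Click(object sender, EventArgs e)
        {
            int[] x = new int[10];
            for (int i = 0; i < x.Length; ++i) x[i] = 10;
            unsafe
            {
                // 4 bytes == one int, so only x[0] is zeroed here.
                memset(x, 0, new UIntPtr(4u));
            }
            label1.Text = x[0].ToString();
        }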
Thanks, I'll use the standard memset for now. Does that DLL come with Windows or .NET or both? It kinda sucks that it won't be portable though.

Regarding the speed of C#, I must say that so far I'm quite impressed. I'm writing a software renderer, and although it's not a completely fair comparison, I think it fares pretty well against Quake. Most of the time is spent in the rasterizer for both engines (>90%), so testing by just looking at a wall that fills the screen, I get that Quake is about 3x faster than my version. But Quake gets at least a factor of two speedup by using an optimized assembler version of the inner rasterizer loop (basically exploiting the fact that Pentium and later processors can do floating point adds in parallel if you don't fetch the results right away), and Quake also uses a subdivided affine perspective correction (only computing the perspective correction every 8 pixels, IIRC), whereas my version does a completely accurate perspective correction. Also, I use 32-bit textures and a 32-bit frame buffer, so that might work in Quake's favour as well. Overall I think C# is very close to C in speed for this application (which is a rather extreme case for the 80/20 rule, admittedly); most of the speed difference is due to me, not C#.

I wrote it in C# to test out the concepts for fun, never really thinking that it would be this fast, but I've been quite impressed so far! Everything is written in the most naive and obvious way and it's still very fast compared to one of the fastest rasterizers around.
I'd expect the overhead of the P/Invoke to negate any speed increase you may see (although I'm really not sure you would actually see any speedup). This is especially true if you don't tag the declaration with:

[System.Security.SuppressUnmanagedCodeSecurity]

If you don't do this, .NET performs a security stack walk on every call to check that the callers are allowed to run unmanaged code (well, that is my understanding of what this suppresses). Needless to say, it'd be *much* slower without this in your case.
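For reference, a tagged declaration might look something like the sketch below. It assumes Windows (msvcrt.dll) and the cdecl calling convention; the names NativeMethods and ZeroWithMemset are made up for illustration and are not from the posts above. The array is pinned explicitly so the GC can't move it while the native call runs.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;

static class NativeMethods
{
    // Assumption: Windows, where msvcrt.dll exports the C runtime's memset.
    // SuppressUnmanagedCodeSecurity skips the per-call security stack walk.
    [SuppressUnmanagedCodeSecurity]
    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr memset(IntPtr dest, int c, UIntPtr count);

    // Hypothetical helper: zeroes a whole int[] through the native memset.
    public static void ZeroWithMemset(int[] buffer)
    {
        // Pin the array so its address stays valid during the call.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // memset counts bytes, not elements.
            uint byteCount = (uint)(buffer.Length * sizeof(int));
            memset(handle.AddrOfPinnedObject(), 0, new UIntPtr(byteCount));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Whether this actually beats a managed loop depends on the buffer size, so it's worth timing both.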

The speed you claim for your ray tracer is impressive. I've written a couple of little ray tracers before in C# and C++ at the same time (for comparison), and it ended up that C# managed the complex ray tracer better, while C++ managed the linear time one better (as expected). It was still interesting, however. Surprisingly, the linear ray tracer with C++ floating point accuracy optimisations off was almost exactly the same speed as the C# one. I can't remember if this was .NET 1.1 or 2.0 though.
Well it wasn't a ray-tracer, it was a rasterizer.
I've been pleasantly surprised about many things in C#, speed-wise. For instance, I was sure that garbage collection was slower than manual memory management, but if you benchmark it you'll see that heap allocation is about twenty times faster in a garbage collected language. This really shouldn't be surprising if you know how garbage collection works (allocation in C# is a pointer increment with a check, roughly five instructions, whereas in C++ it's a huge mess of best-fit/first-fit algorithms, roughly a hundred instructions), but somehow I had bought into the myth of the slowness of garbage collection without benchmarking it (bad programmer, BAAAD programmer!!!).
At any rate, C++ still gains a lot due to fewer runtime checks (out-of-bounds, overflow etc. -- which you can turn off!) and the fact that C++ programs typically allocate on the stack more often, but it's surprising that C# programs fare so incredibly well in comparison!
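To make the "pointer increment with a check" idea concrete, here is a toy sketch of bump allocation. It is only an illustration of the principle: the BumpAllocator class and its byte-array "heap" are hypothetical and are not how the CLR's allocator is actually implemented.

```csharp
using System;

// Hypothetical illustration only; not the CLR's real allocator.
sealed class BumpAllocator
{
    private readonly byte[] heap;
    private int next; // offset of the first free byte

    public BumpAllocator(int capacity)
    {
        heap = new byte[capacity];
    }

    // Returns the offset of the new block, or -1 when the heap is full
    // (the point at which a real collector would run a collection).
    public int Allocate(int size)
    {
        if (size < 0 || size > heap.Length - next)
            return -1;          // the check
        int start = next;
        next += size;           // the pointer increment
        return start;
    }
}
```

The fast path is one comparison and one addition, which is where the "roughly five instructions" figure comes from; the cost of garbage collection is paid later, when the heap fills up.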

About the memset, it turns out that using an unchecked loop and setting the memory one word at a time (rather than one byte at a time as in memset?) is a little bit faster than using P/Invoke.
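Roughly, the loop looks like the sketch below (compile with /unsafe). The BufferUtil class name and the byte-tail handling are assumptions added for illustration; pointer indexing skips the array bounds checks, and the buffer is written one 32-bit word at a time.

```csharp
using System;

static class BufferUtil
{
    // Zeroes byteCount bytes starting at dest, one 32-bit word at a time,
    // then clears any leftover bytes individually.
    public static unsafe void Zero(byte* dest, int byteCount)
    {
        int* words = (int*)dest;
        int wordCount = byteCount / sizeof(int);
        for (int i = 0; i < wordCount; ++i)
            words[i] = 0;

        for (int i = wordCount * sizeof(int); i < byteCount; ++i)
            dest[i] = 0;
    }

    // Convenience overload: pins the array for the duration of the loop.
    public static unsafe void Zero(int[] buffer)
    {
        fixed (int* p = buffer)
        {
            Zero((byte*)p, buffer.Length * sizeof(int));
        }
    }
}
```

Calling BufferUtil.Zero(frameBuffer) on an int[] clears the whole array without leaving managed code, so it also stays portable.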
fantastic.
It will be interesting to see what you have created. *hint hint* [wink]

Looking back, I should have proofread my previous post :-) *sigh*
Quote:Original post by sebastiansylvan
But Quake gets at least a factor of two speedup by using an optimized assembler version of the inner rasterizer loop (basically exploiting the fact that Pentium and later processors can do floating point adds in parallel if you don't fetch the results right away)

Assembly isn't required for that. The compiler (including the C# compiler) and CPU will both attempt to schedule the instructions to make this possible.

This topic is closed to new replies.
