squashed_bug

[.net] byte [] to an IntPtr


I'm assuming you want to convert a byte pointer into an int pointer?
Since they are both pointers, just use either a C-style cast or a
C++ reinterpret_cast:

unsigned char* ptr; // our pointer to the byte data ('byte' isn't a built-in C++ type)

int* pInt = (int*)ptr; // C style

int* pInt2 = reinterpret_cast<int*> (ptr); // C++

Note that ints are 32 bits in size (on 32-bit systems),
so incrementing pInt by 1 will jump 4 elements ahead in the byte array.

Hope this helps!

(I don't use .NET, but I'm assuming the same still applies.)

Use the following method from System.Runtime.InteropServices.Marshal:

public static IntPtr UnsafeAddrOfPinnedArrayElement(Array arr, int index)
Gets the address of the element at the specified index inside the specified array.

arr: The Array containing the desired element.
index: The index in the arr parameter of the desired element.

MSDN Link
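
Roughly, the call looks like this (just a sketch - note that it does not pin anything for you, so you pin the array yourself first, as discussed further down):

byte[] data = new byte[256];

// Pin the array so the GC can't relocate it while we hold the raw address.
GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
try
{
    // Address of data[0], valid only while the handle is held.
    IntPtr ptr = Marshal.UnsafeAddrOfPinnedArrayElement(data, 0);
    // ... hand ptr to whatever needs it ...
}
finally
{
    handle.Free();
}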

Assuming you are using C#, I believe the best way is as follows,

using the System.Runtime.InteropServices.Marshal class:


byte[] data = ...;

IntPtr ptr = Marshal.AllocHGlobal(data.Length);

try
{
    Marshal.Copy(data, 0, ptr, data.Length);

    // deal with ptr
}
finally
{
    Marshal.FreeHGlobal(ptr);
}



There are very few cases where you need to use unsafe code. This is not one of them. This is far, far safer (not that pointers are safe at all). It is also debatable if it's actually any slower.


UnsafeAddrOfPinnedArrayElement is dangerous because the array's address in memory may well change at any time. It is intended for use with pinned memory in C++/CLI (as is my understanding). It also does *not* validate that the input is pinned. Apparently you can pin memory in C# using GCHandle, but I haven't done this before, so I don't know how safe it is - and importantly, I can't see how you pin an Array rather than an IntPtr. So in summary, I don't think it's a safe option.


GCHandle gch = GCHandle.FromIntPtr(ptr);

// ...

gch.Free();

As the great Washu tells us, only resort to unsafe code if you are absolutely certain the technique is not possible without it (which is highly unlikely). The performance hit from using unsafe code is HUGE. Check out his journal or blog for more info.

For large arrays you should not copy elements around. In this case, if you do not want to resort to unsafe code, use the following snippet:

GCHandle h0 = GCHandle.Alloc(array, GCHandleType.Pinned);
try
{
// h0.AddrOfPinnedObject() will get you your IntPtr
}
finally
{
h0.Free();
}

Quote:
Original post by Fiddler
For large arrays you should not copy elements around. In this case, if you do not want to resort to unsafe code, use the following snippet:

GCHandle h0 = GCHandle.Alloc(array, GCHandleType.Pinned);
try
{
// h0.AddrOfPinnedObject() will get you your IntPtr
}
finally
{
h0.Free();
}


Awesome. Where/how did you find/figure this out? [smile]

This is by far the best way, and safe provided you don't overrun the buffer, I guess.

Quote:
Original post by capn_midnight
The performance hit from using unsafe code is HUGE.


I'm pretty sure that this isn't correct. I've spent a substantial amount of time researching this subject, and I've found that there is no performance penalty for unsafe code per se in C#.

There is a very slight penalty to using the 'fixed' keyword (which is often associated with unsafe) because the runtime pins the array, but even that usually isn't bad unless it's used in an inner loop (normally you'd use 'fixed' outside the loop).
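
For example, a minimal sketch of what I mean by pinning once outside the loop (buffer here is just some byte[]):

byte[] buffer = new byte[4096];

unsafe
{
    fixed (byte* p = buffer)   // pay the pinning cost once
    {
        for (int i = 0; i < buffer.Length; ++i)
            p[i] = 0;          // raw pointer work inside the loop
    }
}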

Just to prove my point, take the following code:


static unsafe int test1(int* ptr, int index)
{
    return ptr[index];
}

static int test2(int[] array, int index)
{
    return array[index];
}

static void Main(string[] args)
{
    int[] foo = new int[1024];

    unsafe
    {
        fixed (int* bar = &foo[0])
        {
            test1(bar, 0);
        }
    }

    test2(foo, 0);
}




Now, let's take a look at the machine code that test1() generates:


return ptr[index];
00000028 mov eax,dword ptr [edi+esi*4]
0000002b mov ebx,eax




and test2()


return array[index];
00000028 cmp edi,dword ptr [esi+4]
0000002b jb 00000032
0000002d call 79333451
00000032 mov eax,dword ptr [esi+edi*4+8]
00000036 mov ebx,eax




Clearly, using the unsafe pointer is faster, because there is no array bounds check - which you can see in the first three instructions of test2().

There is a penalty to using the 'fixed' keyword, which includes a null check and pinning the array, but clearly if you were going to iterate over all of the elements in the array, using the unsafe pointer would be substantially faster.

[Edited by - krum on April 11, 2007 9:57:10 AM]

Quote:
Original post by gharen2
I can confirm that math code that's optimized with unsafe code is much faster.

Really? That's funny because my results pulled straight from the JIT tend to disprove that.

Unsafe code, especially where the 'fixed' keyword is concerned, does add a relatively large amount of overhead to managed code.

Quote:
Original post by krum
Quote:
Original post by capn_midnight
The performance hit from using unsafe code is HUGE.


I'm pretty sure that this isn't correct. I've spent a substantial amount of time researching this subject, and I've found that there is no performance penalty for unsafe code per se in C#.

There is a very slight penalty to using the 'fixed' keyword which is often associated with unsafe because the runtime pins the array, but even that usually isn't bad unless used in an inner loop (normally you'd use 'fixed' outside of a loop).

There can be a huge hit to unsafe code. As I have demonstrated in my journal.
Quote:

Just to prove my point, take the following code:

*** Source Snippet Removed ***

Now, lets take a look the machine code that test1() generates

*** Source Snippet Removed ***

and test2()

*** Source Snippet Removed ***

Clearly, using the unsafe pointer is faster, because there is no null reference check - which is obvious as the first three instructions of test2().

There is a penalty to using the 'fixed' keyword, which includes a null reference check and the pointer is pinned, but clearly if you were going to iterate over all of the elements in the array using the unsafe pointer would be substantially faster.

...
Wrong
class Program {
    static void Main(string[] args) {
        int[] numbers = new int[1024];
        Random r = new Random();

        for (int i = 0; i < numbers.Length; ++i) {
            numbers[i] = r.Next();
        }

        AddOneSafe(numbers);

        unsafe {
            fixed (int* p = numbers) {
                AddOneUnsafe(p, numbers.Length);
            }
        }

        Console.ReadKey();
    }

    static void AddOneSafe(int[] array) {
        for (int i = 0; i < array.Length; ++i) {
            ++array[i];
        }
    }

    static unsafe void AddOneUnsafe(int* array, int length) {
        for (int i = 0; i < length; ++i) {
            ++array[i];
        }
    }
}


produces the following two pieces of assembly:

!u 00152ff0
Normal JIT generated code
ConsoleApplication1.Program.AddOneSafe(Int32[])
Begin 00220128, size 1a
00220128 56 push esi
00220129 33D2 xor edx,edx
0022012B 8B7104 mov esi,dword ptr [ecx+4]
0022012E 85F6 test esi,esi
00220130 7E0E jle 00220140
00220132 8D449108 lea eax,[ecx+edx*4+8]
00220136 830001 add dword ptr [eax],1
00220139 83C201 add edx,1
0022013C 3BF2 cmp esi,edx
0022013E 7FF2 jg 00220132
00220140 5E pop esi
00220141 C3 ret

!u 00152ff8
Normal JIT generated code
ConsoleApplication1.Program.AddOneUnsafe(Int32*, Int32)
Begin 00220158, size 1c
00220158 57 push edi
00220159 56 push esi
0022015A 8BF9 mov edi,ecx
0022015C 8BF2 mov esi,edx
0022015E 33C9 xor ecx,ecx
00220160 85F6 test esi,esi
00220162 7E0D jle 00220171
00220164 8D148F lea edx,[edi+ecx*4]
00220167 830201 add dword ptr [edx],1
0022016A 83C101 add ecx,1
0022016D 3BCE cmp ecx,esi
0022016F 7CF3 jl 00220164
00220171 5E pop esi
00220172 5F pop edi
00220173 C3 ret


Quote:
Original post by Washu

Wrong
*** Source Snippet Removed ***
produces the following two pieces of assembly:
*** Source Snippet Removed ***


With all due respect, Washu, this has nothing to do with unsafe code - rather, it's a good demonstration of how poor the JIT's optimizer is. Either way, I think we'll find that the safe code may take at least one additional clock cycle per loop due to the extra addend in the address calculation:


lea eax,[ecx+edx*4+8]


vs


lea edx,[edi+ecx*4]


I believe the fact that the unsafe code has a few more instructions to set the function up is largely insignificant. Even if the CPU can do both LEA instructions in the same number of clock cycles, I think to suggest that in this case unsafe code produces a substantial performance penalty is going overboard.

The more extreme case would be if you had an array of 4x4 transformation matrices, say from an animation, that you needed to multiply with a parent transform. Using safe code could be tremendously slower because it would need to copy each matrix at least twice for each operation, whereas using pointers could reduce the memory bandwidth requirements.

My point is that using unsafe code per se is not slower. What you do with it is - like anything - a whole different story.





Quote:
Original post by krum

With all due respect, Washu,


So, I thought I'd test my theory and added some timing code to your example:


using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;
using System.Security;

namespace washu
{
    class Program
    {
        [DllImport("kernel32.dll"), SuppressUnmanagedCodeSecurity]
        extern static void QueryPerformanceCounter(out long clock);

        static void Main(string[] args)
        {
            int[] numbers = new int[1000000];
            Random r = new Random();

            for (int i = 0; i < numbers.Length; ++i)
            {
                numbers[i] = r.Next();
            }

            long t0, t1, t2;

            for (int i = 0; i < 100; ++i)
            {
                QueryPerformanceCounter(out t0);

                AddOneSafe(numbers);

                QueryPerformanceCounter(out t1);

                unsafe
                {
                    fixed (int* p = numbers)
                    {
                        AddOneUnsafe(p, numbers.Length);
                    }
                }
                QueryPerformanceCounter(out t2);

                long time1 = t1 - t0;

                Console.WriteLine(String.Format("safe={0} unsafe={1}", (t1 - t0) / numbers.Length, (t2 - t1) / numbers.Length));
            }

            Console.ReadKey();
        }

        static void AddOneSafe(int[] array)
        {
            for (int i = 0; i < array.Length; ++i)
            {
                ++array[i];
            }
        }

        static unsafe void AddOneUnsafe(int* array, int length)
        {
            for (int i = 0; i < length; ++i)
            {
                ++array[i];
            }
        }
    }
}





What we find is that the safe code does, in fact, take on average one additional clock cycle per element on my computer. I think you'll find my testing methodology to be accurate enough. But again, this isn't about how well the JIT optimizes, but about the performance penalty of the unsafe code block. Unsafe by itself does not generate additional machine code. Using pointers might, and 'fixed' is an issue - I do not dispute that.

On my computer QueryPerformanceCounter() returns CPU clocks. YMMV.


safe=15 unsafe=13
safe=15 unsafe=12
safe=15 unsafe=14
safe=14 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=12
safe=14 unsafe=17
safe=14 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=12
safe=13 unsafe=12
safe=13 unsafe=12
safe=14 unsafe=13
safe=14 unsafe=12
safe=15 unsafe=14
safe=14 unsafe=13
safe=15 unsafe=14
safe=14 unsafe=12
safe=13 unsafe=13
safe=14 unsafe=12
safe=14 unsafe=13
safe=14 unsafe=13
safe=15 unsafe=14
safe=14 unsafe=13
safe=14 unsafe=12
safe=15 unsafe=12
safe=15 unsafe=14
safe=14 unsafe=14
safe=14 unsafe=12
safe=15 unsafe=13
safe=15 unsafe=13
safe=14 unsafe=12
safe=14 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=13
safe=13 unsafe=12
safe=14 unsafe=12
safe=14 unsafe=14
safe=14 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=13
safe=15 unsafe=13
safe=14 unsafe=13
safe=14 unsafe=13
safe=13 unsafe=12
safe=13 unsafe=12
safe=14 unsafe=13
safe=14 unsafe=12
safe=14 unsafe=13
safe=14 unsafe=12
safe=14 unsafe=13

Wouldn't something totally cryptic like:


static unsafe void AddOneUnsafe(int* array, int length)
{
    int* a = array;
    for (int i = 0; i < length; ++i)
        ++*(a++);
}


get better performance if you were using pointers? Wouldn't that get rid of the multiplication? Given a large enough dataset, shouldn't the performance gain offset the 'fixed' penalty?

Then again, I'm just a n00b, I have no idea what I'm talking about. I am honestly curious.

Quote:
Original post by RipTorn
Quote:
Original post by Fiddler
For large arrays you should not copy elements around. In this case, if you do not want to resort to unsafe code, use the following snippet:

GCHandle h0 = GCHandle.Alloc(array, GCHandleType.Pinned);
try
{
// h0.AddrOfPinnedObject() will get you your IntPtr
}
finally
{
h0.Free();
}


Awesome. Where/how did you find/figure this out? [smile]

This is by far the best way, and safe provided you don't buffer overrun I guess


I think I first saw that snippet in the msdn documentation for manual marshalling, but I can't remember for sure. Really, I must have read all available articles on P/Invoke, marshalling, garbage collection, unsafe code and reflection about 3 times each, when I was writing the new Tao.OpenGL bindings :)

I haven't done any real-world tests on this code, but it should be on par with 'fixed' regarding performance; what I'm worried about is that the try-finally block may add a lot of overhead, and I can't see a way to safely remove it. Moreover, according to MSDN, pinning should not be used very frequently or for small objects, since it reduces GC performance - in those cases it might be better to just copy the data. On the other hand, this technique is better for large objects, especially since there is a chance that the GC will not move them even when they aren't pinned.

Quote:
Original post by krum
With all due respect, Washu, this has nothing to do with unsafe code - rather this is a good demonstration of how poor the JIT's optimizer is. Either way, I think we'll find that the safe code may take at least one additional clock cycle per loop due to the additional add

The JIT optimizer has an extremely short amount of time to run. Unlike C++ compilation, where a build can run for hours without anyone caring, .NET jitting has to happen within microseconds. Frankly, there just aren't a hell of a lot of optimizations you can do in that short a span of time. I should note that the x64 code produced is much better than the x86 code, though (simply because of certain instruction set guarantees).
Quote:
lea         eax,[ecx+edx*4+8]
vs
lea         edx,[edi+ecx*4]


I believe the fact that the unsafe code has a few more instructions to set the function up is largely insignificant. Even if the CPU can do both LEA instructions in the same number of clock cycles, I think to suggest that in this case unsafe code produces a substantial performance penalty is going overboard.

The setup code is expensive, but a loop like that does not introduce a significant hit for either one. It does, however, kill your claim that unsafe code in a loop would be significantly faster than safe code. The reality is that an LEA of [r + r * c + c] is about 1 to 1.5 cycles slower than [r + r * c]; pipelining and cache misses will more than make up for that amount of time.
Quote:
The more extreme case would be if you had an array of 4x4 transformation matrices, say from an animation, that you needed to multiply with a parent transform. Using safe code could be tremendously slower because it would need to copy each matrix at least twice for each operation, whereas using pointers could reduce the memory bandwidth requirements.

My point is that using unsafe code per se is not slower. What you do with it is - like anything - a whole different story.

Eh, careful there. There are plenty of managed operations you can do that will eliminate the copies, ref parameters being one of the big ones. Dropping straight down to unsafe code just because it "appears" faster is a mistake. Furthermore, the setup code you so diligently tossed away adds an invisible overhead that will hit you when you least expect it.

First and foremost, any time a collection runs with pinned objects, those objects don't get moved, and this fragments the heap. That fragmentation slows down allocations, and also slows down further collections when they happen (the GC ends up having to do more work). Now, the chance of a collection happening during a short pinned duration is fairly small. The cost of locking and unlocking the critical section in order to pin the object is not so small.

Furthermore, matrix multiplication typically requires per-element access, something that I've shown in my journal (with the simplest of operations, calculating the magnitude of a vector) is NOT the best generated code in the world for pinned pointers. The fact is, the JIT spends very little time optimizing unsafe code; it spends more time optimizing managed code. There are areas where "unsafe" code might be faster, for instance ones that deal with memory copies - although you would be hard pressed to beat the built-in memory copy (as it uses what was, in 2005, considered the most optimal form of a memory copy for the P4/AMD chipsets). Even MDX doesn't do that, instead delegating its copy operations to a P/Invoke of memcpy.
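
For instance, a rough sketch of the by-ref style (Matrix4 and its layout here are just placeholders, not an actual MDX or SlimDX type):

public struct Matrix4
{
    public float M11, M12, M13, M14;
    public float M21, M22, M23, M24;
    public float M31, M32, M33, M34;
    public float M41, M42, M43, M44;

    // ref/out parameters: no 64-byte struct copies going in or coming out.
    public static void Multiply(ref Matrix4 a, ref Matrix4 b, out Matrix4 result)
    {
        result = default(Matrix4);
        result.M11 = a.M11 * b.M11 + a.M12 * b.M21 + a.M13 * b.M31 + a.M14 * b.M41;
        // ... same pattern for the remaining 15 elements ...
    }
}

Called as Matrix4.Multiply(ref world, ref parent, out combined), the only data movement is the field reads and writes themselves.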

Frankly, unsafe code is unsafe for many reasons, not just because it can't be verified, but because you are taking great risks in attempting to outguess the JIT and "optimize" your code. If you want to optimize your code, then using in-assembly pre-compiled machine code would be recommended. This way you can hand optimize it using the latest instruction sets, vectorization, and other operations.

Now, there are certainly cases where unsafe code can beat managed code. More often than not, though, it falls under the 80/20 rule and such optimizations aren't optimizations, just wastes of time. You're better off investing in better algorithms first.

Another one for Washu, I guess. As per my earlier post, I have to expose a byte* to an unmanaged DLL (the only time I use pointers, and maybe that's what the OP wanted it for - yes, I'm one of the unsafe bandwagoners :D). The question is about how I currently do it, which is like so:

fixed (byte* _bp = byteArray)
{
    somethingorother(_bp);
}

Can this be changed to an attribute on the imported prototype so it can auto-marshal the array to a pointer? I was reading http://blogs.msdn.com/ericgu/archive/2004/07/13/181897.aspx and they mentioned a possible change, although that was just an idea for Whidbey, so it's of little use to me. As mentioned in that thread, when you're dealing with multiple parameters it can yield some fugly code (doing it the way I do now).

Cheers.

Quote:
Original post by Niksan2
Another one for Washu I guess, as per my earlier post on the method I use, mainly because I have to expose a byte* to an unmanaged dll (the only time I use pointers, and maybe that's what the OP wanted it for), for the unsafe bandwagoners :D, the question being I currently do it like so.

fixed (byte *_bp = byteArray)
{
somethingorother(_bp);
}

Can this be changed to an attribute as part of the invoked prototype so it can auto marshal the array to a pointer ? I was reading http://blogs.msdn.com/ericgu/archive/2004/07/13/181897.aspx and they mentioned some possible change, although this was an idea for whidby so much use for me, as mentioned in that thread, when you're dealing with multiple paramaters it can yield some fugly code (doing it as I do now).

Cheers.

Take a look at the MarshalAsAttribute, especially with UnmanagedType.LPArray.
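
i.e. something along these lines on the imported prototype (somethingorother comes from your example, the DLL name is a placeholder, and SizeParamIndex only really matters if the native side writes back into the array):

[DllImport("your.dll")]
static extern void somethingorother(
    [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] byte[] data,
    int length);

// Called with the managed array directly, no fixed block needed:
// somethingorother(byteArray, byteArray.Length);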

Ah, great stuff. Don't suppose you know offhand if the marshalling would be any slower? In most cases I'm doing a single call, but in certain places I have to do several thousand; if it's negligible it doesn't really matter.

Thanks again.

Quote:
Original post by Niksan2
Ah great stuff, don't suppose you know off hand if the marshal would be any slower ? most cases i'm doing a singular call, but in certain places I have to do several thousand, if it's negligible it doesn't really matter.

Thanks again.

Profile and see.


static unsafe class Program
{
    private const string dllName = "test.dll";

    [DllImport(dllName)]
    public static extern void test([MarshalAs(UnmanagedType.LPArray)] byte[] data, int length);

    [DllImport(dllName)]
    public static extern void test(byte* data, int length);

    static void test1()
    {
        for (int i = 0; i < 1000000; i++)
        {
            byte[] d = new byte[20000];
            test(d, d.Length);
        }
    }

    static void test2()
    {
        for (int i = 0; i < 1000000; i++)
        {
            byte[] d = new byte[20000];
            fixed (byte* dp = d)
            {
                test(dp, d.Length);
            }
        }
    }

    [STAThread]
    static void Main(string[] cmdLine)
    {
        test1();
        test2();
    }
}





I tried it with the code above; the DLL's test function was merely a memset, and I profiled via dotTrace, which said:

5202ms for test1() 1763ms spent calling test()
5004ms for test2() 1765ms spent calling test()

So, for better-looking code, there's hardly anything between them (unless you need to call it several million times).

Cheers :)

[Edited by - Niksan2 on April 17, 2007 4:58:36 AM]

Hi squashed_bug,

Please read the following article:
http://blog.rednael.com/2008/08/29/MarshallingUsingNativeDLLsInNET.aspx

It's an in-depth article about marshalling between native code and managed code. It shows which types are interoperable, how to import a DLL, how to pass strings, how to pass structures, and how to dereference pointers.

Everything you need to know right there.

When I was doing performance tests on the SlimDX math library, pinning reference type vectors and then doing simple operations (add/sub/dot/etc) turned out to be consistently slower than simply copying value type vectors on the stack and processing them like that. The absolute fastest method was using value type vectors and passing both input and output by reference.
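
For reference, the by-ref shape looks something like this (a sketch, not the actual SlimDX code):

public struct Vector3
{
    public float X, Y, Z;

    // Both inputs and the output go by reference: no struct copies, no pinning.
    public static void Add(ref Vector3 left, ref Vector3 right, out Vector3 result)
    {
        result.X = left.X + right.X;
        result.Y = left.Y + right.Y;
        result.Z = left.Z + right.Z;
    }
}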

