
problem with very large float array


I am having some trouble trying to allocate a very large float array. It was originally a three-dimensional array that could be as large as 1024 * 1024 * 392. I have since made it a one-dimensional array, and the program crashes while allocating it, with something like std::bad_alloc. I am trying this on a system with 2 GB of RAM, and this array should only take about 1.6 GB. What is the problem?

That is 1.6 GB of contiguous memory you are requesting, if I'm not mistaken. The OS might not be able to give you that much in one go. Also, why are you requesting so much memory up front?

Quote:
Original post by corrington_j
I am having some trouble trying to allocate a very large float array. It was originally a three-dimensional array that could be as large as 1024 * 1024 * 392. I have since made it a one-dimensional array, and the program crashes while allocating it, with something like std::bad_alloc. I am trying this on a system with 2 GB of RAM, and this array should only take about 1.6 GB. What is the problem?


Are you declaring the array inside a function (as in float array[size])? Then it would overflow your stack. If you are using new/delete or malloc/free, then I'm not sure; maybe there is some limit within Windows (it can't allocate a big block if it has to keep it all contiguous) or in your program. Can you split it into 4 sections, maybe?

The physical memory size shouldn't matter much, since things that don't all fit into main memory are paged in and out as needed anyway.

So why does it work with a three-dimensional array? Is that not contiguous memory? If I made it a two-dimensional array, would it maybe work? Also, it used to be a global, non-dynamic array, and that worked. Why?

It's defined as ProcessedImage = new float[ImageDimX * ImageDimX * NumPixelsZ];
It can be as large as 1024 * 1024 * 392, and it crashes when the size is 1024 * 1024 * 300 or greater. It seems like it should just be able to flow over into virtual memory, even if there isn't enough physical memory. Also, I am using VC++ .NET 2003; are there any settings I need to change for large memory blocks? I also tried splitting it into two blocks, and it fails while creating the second block. Please help. This array interfaces with lots of existing code, and I don't want to have to change all the other code.

You are trying to reserve and commit about 1.5 GB at once. You could easily be running out of virtual address space, or of physical storage, requesting that much memory in one go.

What you want to try first is to see whether there's even that much virtual address space available to reserve. You can do this with VirtualAlloc and MEM_RESERVE. Next, commit pieces of this address space bit by bit until you find your limit.

If that limit is far below the memory you need, then you might just have to page your data to temporary files manually and only process small portions of it at once.

What do you want to do with this array? Is it for a simulation?

If most of the array contains zeros, then another implementation can reduce memory usage. Maybe you should take a look at Blitz++.

http://www.oonumerics.org/blitz/

Quote:
Original post by corrington_j
So why does it work with a three-dimensional array? Is that not contiguous memory? If I made it a two-dimensional array, would it maybe work? Also, it used to be a global, non-dynamic array, and that worked. Why?


A 3-dim array example: float t[3][3][3]. This allocates 3x3 memory chunks of 12 bytes (3 floats) that can be placed anywhere (but they are most likely to be placed contiguously), if I'm not mistaken, and therefore it may or may not be contiguous memory.

If you need that much RAM, you probably need to rethink your algorithm. What are you actually doing? Usually you can break the work up so you don't need 1.6 GB of memory in one go.

Anyway, requesting that much RAM and wondering why it fails is silly: it is likely to fail, since you probably don't have that much physical memory free with Windows running, let alone a contiguous block of address space.

Consider using a memory mapped file, where the OS pages individual memory pages in and out automatically for you. Look up mmap() on Linux and CreateFileMapping() on Win32.

Quote:
Original post by nife
Quote:
Original post by corrington_j
So why does it work with a three-dimensional array? Is that not contiguous memory? If I made it a two-dimensional array, would it maybe work? Also, it used to be a global, non-dynamic array, and that worked. Why?


A 3-dim array example: float t[3][3][3]. This allocates 3x3 memory chunks of 12 bytes (3 floats) that can be placed anywhere (but they are most likely to be placed contiguously), if I'm not mistaken, and therefore it may or may not be contiguous memory.


From a programming point of view, an array of type[n][m][p] is a contiguous memory space of n*m*p*sizeof(type) bytes. You can cast it to a type* and iterate over the array as you want. This is different from a type*** - don't do that, please! It's horrific :/

From a system point of view, the memory can be non-contiguous - but then the system is responsible for implementing the mechanisms that make the program believe it is a contiguous block (see the "Inside Windows" (now called "Windows Internals") series by Russinovich and Solomon).

Quote:
Quoting MSDN:
Typically, a process can access up to 2 GB of memory address space (assuming the /3GB switch was not used), with some of the memory being physical memory and some being virtual memory.

I guess 1.6 GiB is a bit big :) I suggest you map a disk file - you'll have to provide a good management framework to make it efficient, but it will have a better chance of succeeding.

HTH,

Actually, in both cases - mapping a file into memory by hand, or letting the OS map the pagefile automatically - you consume about 1.5 GB of virtual address space.

On the IA-32 architecture your address space is 4 GB. Usually the operating system reserves 2 GB of it, so only the other 2 GB remain for your application.

No matter whether you allocate 1024*1024*392*sizeof(float) bytes as a 1D array or a 3D array, you should run into the same problems.

What you can do about it is use sparse matrices. Or you may implement a wrapper for accessing the array which caches parts of it without consuming the whole address space (using C++ operator overloading).

I'm curious why you need so much memory... and what algorithm you are going to use to process almost half a billion floats!

OK, so how come it works if I define it as a global, non-dynamic array?
float processedImage[1024 * 1024 * 392];
There is no problem with this; it just flows over into virtual memory with 1 GB of RAM, and on the system we are actually running it on, 2 GB is plenty. Can someone please explain this to me, because I don't really understand why it works one way and not the other. Just an explanation of what is going on here, thanks.

Have you tried using it yet? Compilers have a nasty (well, sometimes nasty) habit of 'optimising' your code by stripping out things you've declared that they think you aren't using.

Quote:
Original post by corrington_j
Yes, I did try using it, and with no problems. So why the difference between the dynamic and non-dynamic array?


Non-dynamic -> on the stack or in static storage.
Dynamic -> in the free store.

The stack has a maximum size fixed when the program is built.
Static variables are placed in their own region of memory when the program loads.

1.6GB IS TOO BIG!
The 32-bit address space can only address 4 GB of memory. Thinking that the OS is going to be kind enough to ensure that there is 1.6 GB of contiguous unused address space is a bit on the optimistic side. Even allocating one byte of memory in two places in the address space (the 1.4 GB mark and the 2.8 GB mark) would be enough to prevent there being a large enough contiguous range.
Chances are your global allocation did not actually work either. The app probably silently failed to load properly or something. Does it fail to load when you make your global array 2 GB in size? If not, then the compiler or the OS is lying to you.

Also, the maths for the array size [1024 * 1024 * 392] may be done with shorts, since the factors are all less than 32767, so the result would be, uh... zero? Try [1024L * 1024L * 392L].

You definitely should not be using so much RAM anyway. Stop trying to get it to work with so much RAM before you waste any more time. You really HAVE to make it use less RAM somehow.
It's not for a voxel engine, is it? You could use shorts instead of floats, or some kind of fixed point, or compress them in some other way. You could lower the detail of the grid and generate high-detail surface variations via procedural content generation instead, for example.
You are going to have to lose accuracy somewhere, I think.

Well, I'm working on some old code for tomographic reconstruction of cell image slices - not exactly game related, but I like this forum. The problem is that there is a lot of old code which interfaces with this large array, so I would like to avoid making a major change to it.

Yeah I am usually not here for game-related reasons either. Hence most of my posts being in the General Programming forum.

Did you yourself write this "old code"? If not then I can understand not wanting to change it.
I can see why the data is so huge now. So it's 392 slices eh?! I'm guessing that each point is a brightness between zero and one? I'm also gonna assume that 256 brightness levels are enough?

I think you're gonna have to get your hands dirty with this old code. Under the assumptions I've made, you can reduce the memory requirements by a factor of four by using a byte instead of a float.
I know these kinds of things are scary, but I really think you'll have to touch that old code. Don't worry, we can all provide tips for making the changes as painless as possible, and review your changes for you perhaps if you are worried about them, and it'll come out better than ever.

If you decide to go down this path let us know and we will help you some more. I have worked for years with a "Product Support" team that maintained all of the old programs, so I have some good experience for this.
The first step would be to search for every place where the name of the array appears in this "old code", and make a note of what is happening at each one of those places... Don't change anything at this point though.

Quote:
Original post by iMalc
Also the maths for the array size [1024 * 1024 * 392] may be done with shorts since they are all less than 32767, so the result would be, uh... zero? Try [1024L * 1024L * 392L].


Nope, they'll all be ints. Unless he's working on a 16-bit system, because, well, then he's got 16-bit ints. In that case his array size would indeed turn out to be 0. But that would cause a compile error, wouldn't it?

pointer["any integer type you want"]

(ULONG)1024*1024*392 = 411041792 = 0x18800000 < 0xFFFFFFFF (ULONG_MAX = ~0UL)

This is not a problem (you could also use signed long integers).

Also, 16-bit systems are able to address 64K!

I didn't write the old code, and it was quite a mess when I first got it. It's my first contracting job, so it has been interesting. The reconstruction can use less memory, so it currently works, but it could run faster if I could create this array all at once rather than having to do the calculations in sections. I think I will just stick with doing it in two sections to decrease the memory usage. Also, I could not use single bits for the data, because the images are grayscale, not just 1 or 0.

Quote:
Original post by corrington_j
...
Also, i could not use bits for the data, because the images are grayscale, not just 1 or 0.


Why use floats, then? Use unsigned char!
