
# Issue with memory mapped files

Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

10 replies to this topic

### #1Chris_F  Members   -  Reputation: 2207

Posted 16 July 2013 - 04:47 PM

I'm trying to work with the memory mapped file API in Windows, and am having some pretty serious issues. I'm trying to generate a 16GB file so I started with a little test to see how things would go. Here is the code:

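// Create the output file; CREATE_NEW fails if output.raw already exists.
HANDLE file = CreateFile("output.raw", GENERIC_READ | GENERIC_WRITE, 0, NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
// The maximum size is passed as high and low 32-bit halves: 0x4 high, 0 low = 16 GiB.
HANDLE filemap = CreateFileMapping(file, NULL, PAGE_READWRITE, 0x4, 0, NULL);
// Offset 0 with length 0 maps the entire file mapping into the address space.
unsigned char* data = (unsigned char*)MapViewOfFile(filemap, FILE_MAP_ALL_ACCESS, 0, 0, 0);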


As far as I know I set this up right. This should create a 16GB file and give me a pointer to the full 16GB of data. To make sure it works I output a magic number every 4 million bytes or so, for the entire length of the file, and then I opened it up in a hex editor to make sure it had worked, which it had. Then I tried writing a gigabyte's worth of data, again spread out over the entire 16GB. My computer promptly froze and 10 minutes later, unable to move my mouse or get any other kind of feedback, I pulled the plug.

What am I doing wrong?

### #2Chris_F  Members   -  Reputation: 2207

Posted 16 July 2013 - 05:40 PM

OK, so I just noticed that if I write 4GB of data to the first 4GB of the file, my memory usage jumps 4GB in task manager. It doesn't say that my program is using the 4GB of memory, but presumably the OS is using it to cache the data while it writes it to disk?

How the heck do I stop this? The whole reason I'm even trying to use this is because I need to work with an array that is much larger than my memory.

### #3Khatharr  Crossbones+   -  Reputation: 2939

Posted 16 July 2013 - 05:56 PM

There are limits to how much you can have paged at one time. You need to use 'views' of the map to isolate the section that you're wanting to work with. It's described in the MSDN docs on the subject. Please check those out and keep us posted.

Edited by Khatharr, 16 July 2013 - 05:58 PM.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

### #4Chris_F  Members   -  Reputation: 2207

Posted 16 July 2013 - 06:03 PM

> There are limits to how much you can have paged at one time. You need to use 'views' of the map to isolate the section that you're wanting to work with. It's described in the MSDN docs on the subject. Please check those out and keep us posted.

I need to be able to address the entire 16GB (very random access pattern) and I don't understand why I shouldn't be able to on a computer that has an address space that is 16 exabytes.

### #5Bacterius  Crossbones+   -  Reputation: 8324

Posted 16 July 2013 - 09:36 PM

Have you tried passing the FILE_FLAG_RANDOM_ACCESS flag to CreateFile? It advises the OS you will be doing random access on that file so it may reclaim the virtual memory pages more quickly. Ideal behaviour in your case would be caching the last few megabytes read in case you suddenly start reading sequentially, but no further to avoid exhausting physical memory, and I would expect the OS to do just that.

So, yeah, try the flag to see if it does anything. Either way Windows isn't going to be running a statistical analysis on your memory access patterns to figure out how best to manage your huge mapping, though I admit it is kind of dumb just letting the system run out of memory like that.

The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

- Pessimal Algorithms and Simplexity Analysis

### #6achild  Crossbones+   -  Reputation: 1631

Posted 16 July 2013 - 10:46 PM

If it's going to be entirely random and your computer has less than 20 GB of RAM, this is going to be painful no matter what. Does this at least target SSD drives? Or is it targeting normal hard disk drives? The best flags for performance will be quite different between these two...

Also if it is so random, can you do this on multiple threads at once? Or does the algorithm/whatever require serial processing?

 Why not just say it all huh?

Okay, so worst case scenario is that you are using a normal hdd. In this case, you have two options.

1. Use default options, hint that it will use random access. Typically, Windows is going to keep caching as more and more pages of memory get read (since you are not releasing anything and have to map it all at once). It is possible the random access flag will help this somewhat, but don't count on much here (I'd love to be wrong). You will typically get decent performance until memory fills up, followed by a very long time of "frozen" computer. Windows is simply paging a bunch of stuff in and out of memory. No need to pull the plug usually. Just go eat some lunch. Possibly take a nap. Seriously.
2. For consistent time without the crazy hiccups, use the write-through flag. Now you can skip the cache. There may be another flag... perhaps direct buffer when you open the file or something. I can look tomorrow at our code and see. Single threaded for both of these options is the way to go. Perhaps it is best to put all your I/O on its own single thread though...

On the other end of the spectrum would be a computer with 20+ GB RAM, and 2-4 SSDs in a RAID 0 configuration (sure it's slightly risky but not nearly as flimsy as the internet would have you believe)

1. No need for write-through, it will actually slow you down. 2-4 threads writing is the sweet spot. Test this though. All our tests were for maximum sequential throughput in high speed video acquisition. Your usage is very different. There's really not much more to say here... you'll notice two different limits at different stages of testing: (1) your RAM bus speed, and (2) your total SSD speed. Again, writing such small bits randomly will probably prove this entirely wrong. Remember you'll actually be reading/writing chunks based on (1) memory page size and (2) sector size, so there is not much difference between an 8 byte and a 4096 byte transaction.

I'm guessing your configuration is with a normal hdd given your frozen computer. I'd say shoot for something in between these 2 situations. With any luck you can at least switch it with a known stable SSD. You'll see a dramatic difference.

Is it okay to ask what the application for this is?

Edited by achild, 16 July 2013 - 11:04 PM.

### #7RIGHT_THEN  Members   -  Reputation: 109

Posted 17 July 2013 - 12:20 AM

Chris_F,

I am not an expert, but even with such a large amount of RAM the file is still not read all at once. Could it be a paging problem on the system? The RAM could be occupied by other processes on the system, so the file may not get as much space as desired, and the maximum size allowed on your system may be less than the demand your application is putting on it.

If at all it has anything to do with paging then at least on Windows 7 this is

I am sure you know about paging, but just in case: paging is when the computer uses hard disk space to assist RAM because there is not enough of it.

I am sorry if my opinion is not related to your problem.

Thanks

Regards

### #8samoth  Crossbones+   -  Reputation: 4660

Posted 17 July 2013 - 01:38 AM

There are several things you need to consider.

If your goal is mainly to create a file of this length, then it is not necessary to map the view. Creating the file mapping (assuming proper access rights and protection mode) will already create a file of the desired length.

Mapping the view is another story. As you've pointed out, your computer should be able to address 16 exabytes. Yes, but only if your program is compiled as a 64-bit binary (on a 64-bit computer running a 64-bit operating system). You didn't specify what compiler you're using, but if, like most people, you use a 32-bit compiler, mapping such a large memory area will simply fail, even if your OS is 64 bits.

Note that creating a mapping also consumes considerable amounts of memory and is non-negligible work, so one wouldn't want to actually map such huge amounts of memory unless really necessary. 16 GiB of memory corresponds to somewhat over 4 million page table entries, which consume 64 MiB of physical memory.

Mapping 16 GiB when your address space allows for it but you do not have the physical RAM (you'll probably need upwards of 20 GiB) is yet another story, and of course there are the working set size limits.

The maximum working set is limited to ridiculously small sizes by default, unless you change it. Which means that pages will be added to and removed from your working set all the time with such a huge dataset. This means literally millions of page faults (although they are "soft" faults as long as the OS does not run low on zero pages)

Lastly, the amount of zero memory pages is not unlimited. Normally you never notice that, because the idle task always clears unused pages and you normally don't consume that many, so there's always spare ones. However, asking for 16 GiB in one go (and also touching it!) may make you feel it. The OS is required to zero all pages, both in the file on the disk and pages that are mapped into your address space. Since both are "opaque" to your application, the OS cheats as much as it can to hide the overhead. It will for example allocate a file without initializing it and "remember" that anything that is to be read from those sectors is "zero". The same goes for the pages in your address space, it merely "remembers" that these pages are new pages that you've never accessed.

However, eventually, the OS has to write all those zero sectors to disk. Also, eventually, it has to provide a physically present zeroed memory page (at the very least, the moment you try to access one -- but possibly sooner). At some point, the OS cannot cheat any more, but has to do the actual work. This is when you start to feel it.

Unluckily, Windows is not particularly intelligent when it comes to dirty page writeback either (though I don't know if other operating systems are much smarter). I've experienced this when I wrote a quick-and-dirty free-sector-eraser for my wife's computer (for doing a "kind of security wipe" when getting a computer upgrade with the requirement to turn the old machine in, but with a still intact Windows installation -- so the plan was to first delete the documents folder using Explorer, and then overwrite the now free sectors on the disk 5-6 times simply by writing huge files until the disk was full, thus overwriting all available disk space with random values).

My little tool would map a 1 GiB file, fill the memory with random data, close the mapping, and create/map the next 1 GiB file. The idea was that this memory mapping was probably the most efficient way of writing huge amounts of "data" to the disk. Pages that you unmap become "free" as soon as they are no longer "dirty".

New zero pages that you allocate do not need to physically exist unless you touch them, at which point they fault. So, yes, there is a bit of racing for physical pages, but the worst thing to happen is the writer thread is stalled by a page fault which needs to wait for a page to become free and zeroed. That's OK, because all we want is the disk to keep writing with maximum possible speed, we don't actually care about the application's performance. This theory proved true until the machine ran out of physical memory.

At that time, Windows started copying dirty pages to the page file (no, I'm not joking!), pages which it should have just flushed to disk and been done with, to make physical pages available for newly touched pages. It would then page the dirty pages in again (paging out the now freshly touched pages), and finally write them to disk. Result: writing at ca. 110 MiB/s for the first 2 seconds (pretty much the disk drive's theoretical maximum), then a drop to ca. 2 MiB/s.

Now, for the funny part, the "fix" was simply inserting a Sleep(5000); after unmapping each file (*cough*). This gave the disk just about enough time to flush enough pages to disk and throw them away before the application was asking for new pages. What a crap solution, but it worked...

### #9Chris_F  Members   -  Reputation: 2207

Posted 17 July 2013 - 12:31 PM

> Is it okay to ask what the application for this is?

Nothing serious. I was just trying to render a large (128K x 128K) Sacks spiral. I was being lazy about it so I thought if I simply memory mapped the entire file, I could render it naively and not have to waste any thought optimizing the problem. I knew it would be slow as heck with a mechanical hard drive, but I thought Windows would handle it more gracefully, i.e. not grind to a halt.

### #10achild  Crossbones+   -  Reputation: 1631

Posted 17 July 2013 - 12:58 PM

I see. Well, best suggestion is to not use any file flags such as random_access. To keep the system from hanging, you will want to judiciously use FlushViewOfFile followed by UnmapViewOfFile. On the other hand, that will be very slow if you do it every time you plot a pixel. Hmmm...

It would possibly be much, much, much faster to map, say, 1 or 2 GB at a time. Go through all the values you are plotting, and if they are within the current "mapped" region of the output image, plot them. Otherwise ignore. Then do the next 1 or 2 GB, and go through all values again. The calculations are going to be tremendously quicker than mapping and flushing+unmapping 64 KB pages every time you access a pixel.

### #11Waterlimon  Crossbones+   -  Reputation: 2441

Posted 17 July 2013 - 02:06 PM

I guess you could have a list of buckets representing mappable areas of the file.

Then, when you create your plottable values, you throw them in the appropriate bucket (what portion of the file they would be based on x,y), including x,y,value.

After you have a lot of them, you go through the buckets one by one:

1. Load the chunk of the file corresponding to the bucket

2. Write all the values to that file chunk (using the x,y coord you stored along with the value)

3. Repeat for the next bucket

This would make the writing to file more sequential instead of having to repeatedly load and unload different parts which would happen with completely random access.

o3o
