new and delete operators

Started by
28 comments, last by ShlomiSteinberg 20 years, 11 months ago
I'm not convinced memory fragmentation is going to be that big of a concern. If you're allocating and freeing the same type (size, pattern) of memory every frame, the holes you leave behind after a frame are going to be the exact same holes you fill next frame, so your overall memory image will stay pretty much intact. Memory managers usually aren't written by complete bozos.

quote:
it's not that avoiding it is a hardcore optimization that requires hours of work.


I agree. No one is saying to always use dynamic allocation instead of static arrays. I'm just saying the performance hit of using dynamic allocation is minuscule, and thus it becomes a non-issue compared to where you *should* be spending your optimization effort.

quote:
so, YES, it did kill the frame rate. of course you can say "who cares, if you had it running at 80fps and your fill rate was the limit anyway".


AGAIN, if you *profile* your code and discover that excessive memory allocations are in the top 5 of your CPU usage, by all means fix it. But if you really have a culling tree that requires "creating and deleting thousands of elements each frame", I'd question the design of the tree in the first place - I'd bet there were other places that needed optimization more urgently than the calls to new/delete. Doing thousands of anything each frame is a candidate for optimization, not just memory allocation.

quote:
but if you know that you will need something all the time, or even every frame, why not allocate it once and be done with it?


Sure. But it's the programmer's call whether dynamic allocation or reserving memory you don't need is the lesser of two evils. It depends on the programmer's style and the design of the app. It's a trade-off. It's wrong to just assume that you should never use dynamic allocation in your main loop. That assumption is based on another assumption, that calling 'new' is expensive - as evidenced by the initial replies in this thread. That assumption is wrong - as evidenced by actually profiling real code.

quote:
if you allocate it all when you load a level you can be sure it's there and will "fit" into memory


Agreed. But a level is a case where you know beforehand what memory requirements you'll need. That's not always the case.
Brian
miserere nostri Domine miserere nostri
quote: Original post by BriTeg
I'm not convinced memory fragmentation is going to be that big of a concern. If you're allocating and freeing the same type (size, pattern) of memory every frame, the holes you leave behind after a frame are going to be the exact same holes you fill next frame, so your overall memory image will stay pretty much intact. Memory managers usually aren't written by complete bozos.


that situation shouldn't be a problem. if you leave your frame the way you found it, nothing can happen. but imagine creating a lot of projectiles which live for different lengths of time. from time to time items are spawned, and sooner or later you can't tell when something will be allocated and freed again. it's not too likely, but possible, that frequently allocated smaller objects leave more and more holes while larger objects have to go to higher and higher addresses to find enough space. not too likely, but a situation where i feel i've lost control and potentially something could go wrong. solving problems that might or might not appear in very special situations might not be affordable, but especially when dealing with software from certain companies, i wish they would have at least thought about the obvious problems.

quote:
...and thus it becomes a non-issue compared to where you *should* be spending your optimization effort.


as soon as it takes more than 5 seconds, or you have to think about which is better... yes, forget it, use the way that's easier to do and come back later if you have to. but especially the good old question of list vs. array for projectiles is one where everybody makes his own decision, and mine was to sacrifice 100kb in favour of cache friendliness and less work. particle systems might be a very similar problem.
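The array-over-list trade-off above can be sketched as a fixed-capacity pool with swap-and-pop removal. This is a minimal illustration, not anyone's actual code from the thread; the `Projectile` fields and the 4096 capacity are assumptions chosen for the example.

```cpp
#include <cstddef>

// Hypothetical projectile type, for illustration only.
struct Projectile {
    float x, y, vx, vy;
};

// Fixed-capacity storage: trades a fixed memory footprint for zero
// per-frame allocation and contiguous (cache-friendly) iteration.
class ProjectileArray {
public:
    static const std::size_t kCapacity = 4096;  // assumed upper bound

    bool spawn(const Projectile& p) {
        if (count_ == kCapacity) return false;  // pool exhausted
        items_[count_++] = p;
        return true;
    }

    // Swap-and-pop removal: O(1), but element order is not preserved.
    void kill(std::size_t i) {
        items_[i] = items_[--count_];
    }

    std::size_t size() const { return count_; }
    Projectile& operator[](std::size_t i) { return items_[i]; }

private:
    Projectile items_[kCapacity];
    std::size_t count_ = 0;
};
```

Iterating the live projectiles each frame then touches one contiguous block of memory, which is the cache benefit the post alludes to.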

quote:
AGAIN, if you *profile* your code and discover that excessive memory allocations are in the top 5 of your CPU usage, by all means fix it. But if you really have a culling tree that requires "creating and deleting thousands of elements each frame", I'd question the design of the tree in the first place - I'd bet there were other places that needed optimization more urgently than the calls to new/delete. Doing thousands of anything each frame is a candidate for optimization, not just memory allocation.


the tree itself was a typical tree: 4 children, bbox, pointer to patch. the problem was traversing it, as recursion had horrible overhead. using a list and iterating instead was better, but resulted in constantly adding and removing nodes. reserving space for the list would have worked, of course, but then it wouldn't have been too different from the array i ended up with anyway.
if i remember the numbers, that was 30%, 70% and 99.9% of the speed i had with brute force. so by now i'm really careful about trees if i don't have a very high number of leaves.
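For reference, the middle option described above (iteration without per-frame node allocation) can be done with an explicit stack backed by a fixed array. A sketch under assumptions: the `Node` layout and the `visible` flag stand in for the bbox test, and the 256-entry stack is an assumed depth bound, not a general guarantee.

```cpp
#include <cstddef>

// Hypothetical quadtree node matching the shape described above:
// four children plus a payload (bbox and patch pointer omitted).
struct Node {
    Node* child[4];
    bool visible;  // stand-in for the bbox-vs-frustum test
};

// Iterative traversal with an explicit stack: no recursion overhead
// and no per-frame allocation, since the stack is a fixed array.
std::size_t countVisible(Node* root) {
    Node* stack[256];  // assumed sufficient; no overflow check in this sketch
    std::size_t top = 0, visible = 0;
    if (root) stack[top++] = root;
    while (top > 0) {
        Node* n = stack[--top];
        if (!n->visible) continue;  // culled: skip the whole subtree
        ++visible;
        for (int i = 0; i < 4; ++i)
            if (n->child[i]) stack[top++] = n->child[i];
    }
    return visible;
}
```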

that was by far my most useless optimization so far. culling was the main problem (as the app wasn't doing much else on the cpu anyway), but some things can't be optimized further (ok, i didn't care to lay hands on the assembler output).

quote: That assumption is based on another assumption, that calling 'new' is expensive - as evidenced by the initial replies in this thread. That assumption is wrong - as evidenced by actually profiling real code.


an assumption that msvc seemed to prove. with 500 new/delete calls i measured about 4ms, which would be about 25% of the time i can spend on one frame. of course, not running the app from within vc++ and compiling with the right options reduced that to 0.1ms. so i admit the time isn't as much of a problem as i thought. fragmentation still might be, all the more so the less memory your system has.
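The 4ms-vs-0.1ms gap above is exactly the kind of thing a tiny timing harness exposes. A sketch of such a measurement; the allocation size (16 floats) and count are arbitrary stand-ins, and absolute numbers will differ wildly between a debug heap (e.g. running under the VC++ debugger) and an optimized release build.

```cpp
#include <chrono>

// Time n new[]/delete[] pairs of small blocks and return milliseconds.
// Only meaningful in a release build; the debug heap inflates it hugely.
double timeAllocs(int n) {
    float** blocks = new float*[n];

    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) blocks[i] = new float[16];
    for (int i = 0; i < n; ++i) delete[] blocks[i];
    auto t1 = std::chrono::steady_clock::now();

    delete[] blocks;
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Measuring with and without the debugger attached, and in both build configurations, reproduces the comparison the post describes.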

quote:
Agreed. But a level is a case where you know beforehand what memory requirements you'll need. That's not always the case.


not always, but you can often estimate. assume your fastest "entity spawning" object creates 100/s (and pretend it's a shooter with 32 players max... for an mmorpg that's pointless, but those usually aren't based on fast gameplay anyway). reserving 3200 projectiles would be enough. if, of course, you would have to reserve 10mb when you usually don't need more than 100kb, forget about it. but again, that's my personal preference: having everything in its place and feeling in control (at least until i write behind an array boundary and spend an eternity finding out why a value suddenly changed into nonsense).
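The back-of-the-envelope estimate above can be written down directly. Note the one-second maximum projectile lifetime is an assumption implied by the 3200 figure, not something stated in the post:

```cpp
// Worst-case capacity estimate: players * spawn rate * max lifetime.
// All figures are the hypothetical shooter numbers from the post.
constexpr int kMaxPlayers = 32;
constexpr int kShotsPerSecond = 100;      // fastest spawner
constexpr int kMaxLifetimeSeconds = 1;    // assumed, to match 3200
constexpr int kProjectileCapacity =
    kMaxPlayers * kShotsPerSecond * kMaxLifetimeSeconds;

static_assert(kProjectileCapacity == 3200,
              "matches the reservation figure discussed above");
```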

and thinking about the example above, there would be more urgent problems, i think. like not firing 100 bullets if you have less than 100fps, and the additional processing to create the missing ones later and calculate their correct positions. or in the case of hitscan... ouch, you couldn't even fix it without potentially dealing damage to someone who's long gone. alright, don't spend too much on memory issues. concerning the initial post: if it's 500 floats every single frame, i would definitely either make it static or not call delete before the end. even if it's just to feel better, because no matter how much work it really is, it seems to be useless work.
f@dz
http://festini.device-zero.de
quote: Original post by Trienco
but imagine creating a lot of projectiles which live for different lengths of time. from time to time items are spawned, and sooner or later you can't tell when something will be allocated and freed again. it's not too likely, but possible, that frequently allocated smaller objects leave more and more holes while larger objects have to go to higher and higher addresses to find enough space.


I'm still not convinced. What is a lot of projectiles? 20? 50? 500? You're allocating/deallocating memory within only a few kilobytes, max. Hardly enough to make Swiss cheese out of your RAM. I guess until someone can provide an example with real data and a real memory fragmentation image, this is all speculation on both sides.

quote:
not too likely, but a situation where i feel i've lost control and potentially something could go wrong.


I don't understand what you mean about losing control.

quote:
but especially the good old question of list vs. array for projectiles is one where everybody makes his own decision, and mine was to sacrifice 100kb in favour of cache friendliness and less work. particle systems might be a very similar problem.


Again, your decision was a good one. That doesn't mean it's bad if someone else chose to call 'new' for each projectile or particle.

BTW, I ended up changing my saver, and it only took me 10 minutes to change from calling 'new' several hundred times per frame to using a static array. I now only call 'new' once per frame. It *is* a cleaner, easier way to do it, but now it sets aside a large chunk of memory it usually doesn't fully use. The fps didn't change at all, but I do feel a little more confident that because the design is a little simpler now, there's less chance for a bug in my memory handling.
Brian
miserere nostri Domine miserere nostri
quote: Original post by BriTeg
I'm still not convinced. What is a lot of projectiles? 20? 50? 500? You're allocating/deallocating memory within only a few kilobytes, max. Hardly enough to make Swiss cheese out of your RAM. I guess until someone can provide an example with real data and a real memory fragmentation image, this is all speculation on both sides.


maybe i'm a little radical: i had 500 ships with 2 guns each, firing 5 shots per second and living for a few seconds. so i'd estimate about 25000 projectiles max, and maybe a third in use. though i remember i was using a list too (with the constructor and destructor adding and removing the projectile from the list.. i don't know if i will ever again have code that's supposed to look like new Projectile(ShipIter->Pos); and then just forget about it *g*). back then the ai for all the ships was the bigger problem.
i think it should have been safe: as all projectiles were the same size, a new one could always fill a gap. but i can't foresee that with objects of different sizes, and that's what i mean by losing control (or at least feeling like i do): i can't tell if there's a situation where things go wrong.

quote:
BTW, I ended up changing my saver, and it only took me 10 minutes to change from calling 'new' several hundred times per frame to using a static array. I now only call 'new' once per frame. It *is* a cleaner, easier way to do it, but now it sets aside a large chunk of memory it usually doesn't fully use. The fps didn't change at all, but I do feel a little more confident that because the design is a little simpler now, there's less chance for a bug in my memory handling.


hehe.. see? that's what i mean about at least feeling a little better, even if it's not making much difference. it feels a little more neat and tidy ;-) (as long as doing so won't waste several 100kb or more.. though having a look at some games, i don't think anybody would care about wasting several mb today *fg*)

f@dz
http://festini.device-zero.de
If you guys want to run your games on a 486DX2, ok, that can make a difference. But now? You have 1500MHz, 3000MHz, a GF4, an ATI 9700! 25000 or 50000 or 100000 projectiles with new and delete will never slow your FPS down! (as long as you don't do all the news at the same time :o)

I always use linked lists and I think it's the best way.
..and that's the kind of attitude which causes bloat and means simple things require faster and faster cpus.

Just because the power is there doesn't mean you shouldn't think about how you do things and try to get the most out of the cpu/gpu combination, not just think 'oh, i'll do it this way because the system will be able to handle it if it's fast enough'.
hehe.. first someone complains about us pc programmers not caring about memory, and now it's the same with cpu cycles.
and here i stand and just claim that in 90% of all cases you either need more cpu cycles or more memory.

and until not so long ago i had 650mhz, and i still aim at machines with 1ghz max. having faster machines should let us make better games, not the same games over and over with less and less efficient code (or, while we're at it, better looking games with less and less real gameplay.. there's really a limit to how many games i need where i run around in first person and shoot everything that moves, or command units in a topdown view in an rts).
f@dz
http://festini.device-zero.de
indeed, i'm not against using all the power a fast machine can give, as long as you use it sanely; no cases of 'oh, let's not worry about that, processors are fast enough' style coding.

Granted, i did learn my trade on an Atari STe, so i tend to be a tad paranoid about using too many clock cycles, but that can be forgiven as it was an 8MHz machine.
hehe.. when i was programming on an atari st, i didn't even know what clock cycles were (though multiple-choice text adventures really aren't the kind of games where you need to *lol*)
f@dz
http://festini.device-zero.de
the simple answer is, there is no answer ...

dynamic memory allocation must be used wisely. you CANNOT efficiently keep widely changing sorted lists in static arrays. sometimes an array stored as a heap is good, or any other form of array-based tree; other times you need different access than a tree structure gives you, such as "remove all elements from aardvark to avarice", and in that case a dynamic list is the only efficient choice. these decisions have NOTHING to do with modern computers at all; they are fundamental computer science theory. some things can be stored well in static structures like arrays, others cannot, and benefit from pointer-based lists and trees. note that you can still write your own memory pooling library without much trouble at all, so if you want to change sizes every frame, you can use a thread-local memory pool manager to get near-100% efficiency while still using individually allocated structures. but like everything else, there are more wrong ways to optimize memory management than right ones (methods that hurt performance in certain usage situations).
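To make the "write your own memory pooling library" point concrete, here is a minimal sketch of a fixed-size block pool. Because every block is the same size, a freed block can always be reused by the next allocation, which is exactly why pools sidestep the fragmentation concerns discussed earlier in the thread. Assumptions: fixed capacity, no thread safety, no growth; names like `Pool` are invented for the example.

```cpp
#include <cstddef>

// Fixed-size block pool with an intrusive free list: O(1) allocate
// and deallocate, and no fragmentation since all blocks are equal.
template <std::size_t BlockSize, std::size_t BlockCount>
class Pool {
public:
    Pool() : free_(nullptr) {
        // Thread every block onto the free list up front.
        for (std::size_t i = 0; i < BlockCount; ++i) {
            Block* b = reinterpret_cast<Block*>(storage_ + i * BlockSize);
            b->next = free_;
            free_ = b;
        }
    }

    void* allocate() {
        if (!free_) return nullptr;  // pool exhausted
        Block* b = free_;
        free_ = b->next;
        return b;
    }

    void deallocate(void* p) {
        // Freed block goes back on the free list; no holes can form.
        Block* b = static_cast<Block*>(p);
        b->next = free_;
        free_ = b;
    }

private:
    struct Block { Block* next; };
    static_assert(BlockSize >= sizeof(Block),
                  "block must be large enough to hold a free-list link");
    alignas(std::max_align_t) unsigned char storage_[BlockSize * BlockCount];
    Block* free_;
};
```

A real pooling library would add per-size pools, overflow handling, and debug checks; this only shows the core free-list mechanism.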

