
Member Since 29 Jul 2001

#5295318 use ID or Pointers

Posted by on 06 June 2016 - 12:06 PM

This post, and its accompanying comment thread, may be of interest.

#5295304 Overloading new

Posted by on 06 June 2016 - 11:26 AM


Thanks for quick reply, so every time I push an element vector calls new? Is there any way I can get it to allocate memory in one shot in the beginning then or do I have to write one? 



You can use vector's "reserve" member function to tell it to reserve enough capacity for a specified number of items. That will cause it to make only a single allocation (until you insert more than that capacity, at which point it must expand).
You're still going to have the same problem, though: you'll need a custom allocator for that vector (or you'll need to not use a vector here).


The easiest thing to do is simply take an off-the-shelf demo allocator that calls directly into malloc. Here's one. Create your vector with that allocator specified in the template and it won't attempt to call back into your own tracking allocator.
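To make the idea concrete, here is a minimal sketch of such an allocator. The name `MallocAllocator` is made up for illustration; it is not the demo allocator linked above, just the same technique: route straight to malloc/free so a vector never touches an overloaded global operator new.

```cpp
#include <cstdlib>
#include <new>
#include <vector>

// Minimal allocator that goes straight to malloc/free, bypassing any
// overloaded global operator new. Illustrative sketch only.
template <typename T>
struct MallocAllocator {
    using value_type = T;

    MallocAllocator() = default;
    template <typename U>
    MallocAllocator(const MallocAllocator<U>&) {}

    T* allocate(std::size_t n) {
        if (void* p = std::malloc(n * sizeof(T)))
            return static_cast<T*>(p);
        throw std::bad_alloc();
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

template <typename T, typename U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) { return false; }
```

Combined with reserve(), this gives you the "one shot" behavior: `std::vector<int, MallocAllocator<int>> v; v.reserve(1024);` performs a single malloc, and subsequent push_backs up to that capacity allocate nothing.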

#5295297 Overloading new

Posted by on 06 June 2016 - 11:07 AM

Your vector<ptrdesc> is calling back into operator new on push_back. You'll need to supply it with a custom allocator, or use another container that does not itself depend on operator new.

#5295289 College? Life? Urggg!?

Posted by on 06 June 2016 - 10:12 AM

As the others have said, computer science degree. More importantly, not a game development degree as offered by some schools. If you can find a real computer science degree with a concentration or special program or minor in game development, that would be ideal. I know they're available, but there aren't that many options. A regular computer science degree would do just fine, though.


In terms of the actual details of what classes to take and all, worry about that once you're enrolled. It will depend on the specifics of the school. Generally it'll be best to focus on systems-level programming (computer architecture, operating systems, etc.) rather than high-level stuff (functional programming, big data processing, highly theoretical work, etc.).

#5294833 How does Runge-Kutta 4 work in games

Posted by on 03 June 2016 - 12:47 PM

I need to replace this with Runge-Kutta somehow. 

No, you don't.


I don't think that most physics engines use RK4 at all, most use semi-implicit euler for its balance of speed and stability. It's ok for simple stuff like mass/spring systems, but once you incorporate collision detection and response with RK4 there is not much benefit except in certain cases. For each substep you still need to test for collisions and respond so the performance is about 4x worse than 1st-order integration methods. The integration accuracy is better than just performing 4 1st order steps, but there are many headaches involved with using RK4 in a full physics simulation for games.

Yep. RK4, in short, is garbage. I always found it bizarre that Gaffer on Games recommends it by citing its accuracy, which isn't a particularly important integration property for a game. Stability (energy conservation/symplectic behavior) is a vastly more useful property, and RK4 doesn't have it. In most cases, you can simply use semi-implicit Euler and be on your way. Hell, it's faster to drive semi-implicit Euler at a higher frequency than to run RK4.
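For reference, semi-implicit Euler is a one-line change from explicit Euler. This sketch uses a 1-D spring as the test force; the key point is the ordering: velocity is updated first, then position uses the *new* velocity, which is what makes the method symplectic and keeps energy bounded.

```cpp
// Semi-implicit (symplectic) Euler for a 1-D spring.
// Sketch only; State/stepSemiImplicitEuler are illustrative names.
struct State { float x, v; };

State stepSemiImplicitEuler(State s, float k, float m, float dt) {
    float a = -(k / m) * s.x;  // spring acceleration, F = -kx
    s.v += a * dt;             // velocity first...
    s.x += s.v * dt;           // ...then position, using the NEW velocity
    return s;
}
```

Flip the two update lines (position first, using the old velocity) and you get explicit Euler, which steadily gains energy and blows up; the semi-implicit ordering oscillates with bounded amplitude indefinitely.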

#5293669 Do you usually prefix your classes with the letter 'C' or something e...

Posted by on 26 May 2016 - 02:55 PM

I've simplified to the essentials over the years. I prefix interfaces with I, 'cause C# still rolls that way and I like it. (Non-pure abstract base classes are suffixed with Base, usually.) A single underscore to label _private or _protected class variables*. And g_ for globals, because those should look gross. That's pretty much the extent of it.


* The underscore thing also works great in languages that don't HAVE private scoping, or when I don't actually private scope them but they're considered implementation details.
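Put together, the conventions look something like this (all names here are invented for illustration):

```cpp
// I prefix for pure interfaces, C#-style.
class IRenderer {
public:
    virtual ~IRenderer() = default;
    virtual void Draw() = 0;
};

// Base suffix for a non-pure abstract base class.
class RendererBase : public IRenderer {
protected:
    int _drawCalls = 0;            // _ marks private/protected details
public:
    void Draw() override { ++_drawCalls; }
    int DrawCalls() const { return _drawCalls; }
    virtual void Present() = 0;    // still abstract, hence Base not I
};

int g_frameCount = 0;              // g_ makes globals look gross on purpose
```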

#5293616 .Net DX12

Posted by on 26 May 2016 - 10:29 AM

That's the second time someone's asked this week. I haven't worked with SharpDX so I can't speak to that experience apart from where he used our code -_- But I felt that duplicating his work was not necessarily productive. If that's really not the case, and people really want a hand written SlimDX-based wrapper, then I'll see if I can pull something together.

#5293613 how to chose open_gl libary?

Posted by on 26 May 2016 - 10:25 AM

You didn't really explain what you expect the library to do for you, especially if you want to learn about all this "from scratch". If that's your goal, I would truly do it from zero without any libraries. That said, I quite like SDL 2.x for handling windowing/input functionality across systems.

#5293608 what good are cores?

Posted by on 26 May 2016 - 09:51 AM




Memory bandwidth is the bottleneck these days.

Bring on the triple channel! I was very upset when I learned that DDR3 implementations weren't supporting triple channel! I think it was only one or two intel boards that would. Of course you could always build a system using server hardware.
I was far more disappointed when I read several articles about how "we don't need triple channel memory". Well ya no shit we can't make good use of triple channel if it isn't available to develop on numb-nuts!


Quad channel on DDR4 shows next to no improvement, never mind triple channel.


Why does it show no improvement?


Let's talk about that, actually.

Can the OS not facilitate operations on multiple memory channels in parallel?
Does the software showing no improvement not make use of multiple channels?

The OS cannot see the multiple channels, in fact. More on this in a moment.

It does seem to me though, that if you create a program that creates blocks of data on each channel it is a trivial act to utilize all four channels and achieve that maximum throughput.

How do you create blocks of data on each channel? I'll wait.


You have to remember, first and foremost, that any given program does not interact with the actual memory architecture of the system. Not ever. Let's work from the bottom up - a single stick of memory. No, wait, that's not the bottom. You have individual chips with internal rows and columns, themselves arranged into banks on the DIMM. Access times to memory are variable depending on access patterns within the stick!


But let's ignore the internals of a stick of RAM and naively call each of them a "channel". How do the channels appear to the OS kernel? Turns out they don't. The system memory controller assembles them into a flat address space ("physical" addressing) and gives that to the kernel to work with. Now a computer is not total chaos, and there is rhyme and reason to the mapping between physical address space and actual physical chips. Here's an example. There are no guarantees that this is consistent across any category of machines, of course. Also note that the mapping may not be zero based and please read the comments in that article regarding Ivy Bridge's handling of channel assignment.


Oh but wait, we're not actually interacting with any of that in development. All of our allocations happen in virtual address space. That mapping IS basically chaos. There's no ability to predict or control how that mapping will be set up. It's not even constant for any given address during the program's execution. You have no ability to gain any visibility into this mapping without a kernel mode driver or a side channel attack. 


Just a reminder that most programmers don't allocate virtual memory blocks either. We generally use malloc, which is yet another layer removed.


The answer to "how do you create blocks of data on each channel" is, of course, that you don't. Even the OS doesn't, and in fact it's likely to choose an allocation scheme that actively discourages multi-channel memory access. Why? Because it has a gigantic virtual<->physical memory table to manage, and keeping that table simple means faster memory allocations and less kernel overhead in allocation. It's been a while since I dug into the internals of modern day kernel allocators, but if you can store mappings for entire ranges of pages it saves a lot of memory versus having disparate entries for each and every memory page. Large block allocations are also likely to be freed as blocks, making free list management easier. Long story short, the natural implementation of an allocator leans towards creating contiguous blocks of memory. How do you deal with that as a CPU/memory controller designer? Based on the link above, you simply alternate by cache line. Or, you scramble the physical address map to individual DRAM banks and chips. Remember that Ivy Bridge channel assignment bit? Yep, that's what happened.
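A toy model of cache-line interleaving makes the point. The bit positions below are hypothetical; real controllers (including the Ivy Bridge scrambling mentioned above) vary and are invisible to software, which is exactly the argument.

```cpp
#include <cstdint>

// Toy model: with simple cache-line interleaving, the channel is picked
// from the physical address bits just above the line offset. The exact
// bits are hypothetical; real controllers differ and may scramble them.
constexpr std::uint64_t kCacheLineBits = 6;  // 64-byte lines

int channelOf(std::uint64_t physAddr, int numChannels) {
    return static_cast<int>((physAddr >> kCacheLineBits) % numChannels);
}
```

Note what this implies: a contiguous physical block automatically stripes across every channel, line by line. The allocator's preference for contiguous blocks and the hardware's interleaving together mean nobody has to (and nobody can) place data "on a channel" deliberately.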


Frankly, the benefits of multi-channel memory probably show up almost exclusively in heavily multitasking situations that are heavy on memory bandwidth. I bet browsers love it :D

#5293391 what good are cores?

Posted by on 25 May 2016 - 10:08 AM


Memory bandwidth is the bottleneck these days.


Bring on the triple channel! I was very upset when I learned that DDR3 implementations weren't supporting triple channel! I think it was only one or two intel boards that would. Of course you could always build a system using server hardware.


I was far more disappointed when I read several articles about how "we don't need triple channel memory". Well ya no shit we can't make good use of triple channel if it isn't available to develop on numb-nuts!


Triple channel is nonsense. It never showed up as beneficial to memory bandwidth outside synthetic benchmarks and very specialized uses. In any case, on the CPU side my personal feeling is that memory bandwidth isn't nearly as big a problem as latency, when it comes to games. It's chaotic accesses and cache misses that kill us. The GPU, on the other hand, can never have too much bandwidth. We're seeing some great new tech on that front with HBM(2) and GDDR5X.

Isn't 33ms still more responsive than 66ms?  :wink:

You also need to be aware that D3D/GL like to buffer an entire frame's worth of rendering commands, and only actually send them to the GPU at the end of the frame, which means the GPU is always 1 or more frames behind the CPU's timeline.

Of course, VR was where that really screwed us, much more so than input latency. That's why we wound up with this: https://developer.oculus.com/documentation/mobilesdk/latest/concepts/mobile-timewarp-overview/

#5292723 how much PC do you need to build a given game?

Posted by on 20 May 2016 - 09:25 PM

Recommended specs are what happen at the end of the dev cycle, after optimization work. During dev, a game requires much more power because it hasn't been optimized yet, and you may have any number of quick and dirty hacks to get things done. There are also productivity concerns - our game doesn't use a hex-core i7 effectively at all, but the build sure as hell does.

#5290910 I aspire to be an app developer

Posted by on 09 May 2016 - 07:01 PM

Moved to For Beginners.

#5288614 GPL wtf?

Posted by on 25 April 2016 - 10:44 AM

It would be helpful if you supplied the original article, rather than your interpretation of it.

#5286316 Best laptop for game development?

Posted by on 11 April 2016 - 10:21 AM

Both the Dell Inspiron 15 7000 series and Dell XPS 15 are excellent laptops. The Lenovo Y700 seems to be a great choice as well. Whichever you pick, I would opt for a model with a dedicated GPU if you can. I would not touch MSI again.


Of the laptops you listed just now... the T540 has a dedicated GPU so I would probably put that at the top of the list.

#5284557 When would you want to use Forward+ or Differed Rendering?

Posted by on 31 March 2016 - 07:47 PM

Crudely speaking, the cost of rendering in forward is N objects * M lights. This means that heavily lit, geometrically complex environments get very expensive. Deferred was developed because the cost of rendering for that approach is N objects + M lights. Lighting in deferred is very cheap, even with thousands of lights if they're small. I've used deferred pipelines in the past to run dense particle systems with lighting from every particle, and stuff like that. The downsides are massive bandwidth requirements, alpha blending problems, anti-aliasing problems, and material limitations.


Forward+ and its variations were developed to get the benefits of cheap lighting from deferred, but without all of the other problems deferred has. While bandwidth use is still pretty high, it tends to cooperate much better with varied materials, alpha blending, and AA. It also leverages compute tasks for better overall utilization of the GPU. In general, I would encourage Forward+/tiled forward as the default rendering pipeline of choice on modern desktop/laptop hardware, unless you have a specific reason not to.
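The N * M versus N + M distinction above is the whole argument in a nutshell; as a back-of-envelope cost model (purely illustrative, ignoring per-pass constants, bandwidth, and overdraw):

```cpp
// Crude shading-cost model for the two pipelines discussed above.
// Classic forward shades every object/light pair; deferred pays once
// per object (G-buffer pass) plus once per light (lighting pass).
// Illustrative only -- real costs have large per-term constants.
long forwardCost(long objects, long lights)  { return objects * lights; }
long deferredCost(long objects, long lights) { return objects + lights; }
```

At 1000 objects and 100 lights the model gives 100,000 "units" for forward versus 1,100 for deferred, which is why deferred (and later Forward+) made many small lights practical.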