Jump to content

  • Log In with Google      Sign In   
  • Create Account


Member Since 29 Jul 2001
Offline Last Active Today, 10:12 AM

Posts I've Made

In Topic: Do you usually prefix your classes with the letter 'C' or something e...

26 May 2016 - 02:55 PM

I've simplified to the essentials over the years. I prefix for interfaces, cause C# still rolls that way and I like it. (Non-pure abstract base classes are suffixed with Base, usually.) Single underscore to label _private or _protected class variables*. And g_ for globals because those should look gross. That's pretty much the extent of it.


* The underscore thing also works great in languages that don't HAVE private scoping, or when I don't actually private scope them but they're considered implementation details.

In Topic: .Net DX12

26 May 2016 - 10:29 AM

That's the second time someone's asked this week. I haven't worked with SharpDX so I can't speak to that experience apart from where he used our code -_- But I felt that duplicating his work was not necessarily productive. If that's really not the case, and people really want a hand written SlimDX-based wrapper, then I'll see if I can pull something together.

In Topic: how to chose open_gl libary?

26 May 2016 - 10:25 AM

You didn't really explain what you expect the library to do for you, especially if you want to learn about all this "from scratch". If that's your goal, I would truly do it from zero without any libraries. That said, I quite like SDL 2.x for handling windowing/input functionality across systems.

In Topic: what good are cores?

26 May 2016 - 09:51 AM




Memory bandwidth is the bottleneck these days.

Bring on the triple channel! I was very upset when I learned that DDR3 implementations weren't supporting triple channel! I think it was only one or two intel boards that would. Of course you could always build a system using server hardware.
I was far more disappointed when I read several articles about how "we don't need triple channel memory". Well ya no shit we can't make good use of triple channel if it isn't available to develop on numb-nuts!


Quad channel on DDR4 shows next to no improvement, nevermind triple channel.


Why does it show no improvement?


Let's talk about that, actually.

Can the OS not facilitate operations on multiple memory channels in parallel?
Does the software showing no improvement not make use of multiple channels?

The OS cannot see the multiple channels, in fact. More on this in a moment.

It does seem to me though, that if you create a program that creates blocks of data on each channel it is a trivial act to utilize all four channels and achieve that maximum throughput.

How do you create blocks of data on each channel? I'll wait.


You have to remember, first and foremost, that any given program does not interact with the actual memory architecture of the system. Not ever. Let's work from the bottom up - a single stick of memory. No, wait, that's not the bottom. You have individual chips with internal rows and columns, themselves arranged into banks on the DIMM. Access times to memory are variable depending on access patterns within the stick!


But let's ignore the internals of a stick of RAM and naively call each of them a "channel". How do the channels appear to the OS kernel? Turns out they don't. The system memory controller assembles them into a flat address space ("physical" addressing) and gives that to the kernel to work with. Now a computer is not total chaos, and there is rhyme and reason to the mapping between physical address space and actual physical chips. Here's an example. There are no guarantees that this is consistent across any category of machines, of course. Also note that the mapping may not be zero based and please read the comments in that article regarding Ivy Bridge's handling of channel assignment.


Oh but wait, we're not actually interacting with any of that in development. All of our allocations happen in virtual address space. That mapping IS basically chaos. There's no ability to predict or control how that mapping will be set up. It's not even constant for any given address during the program's execution. You have no ability to gain any visibility into this mapping without a kernel mode driver or a side channel attack. 


Just a reminder that most programmers don't allocate virtual memory blocks either. We generally use malloc, which is yet another layer removed.


The answer to "how do you create blocks of data on each channel" is, of course, that you don't. Even the OS doesn't, and in fact it's likely to choose an allocation scheme that actively discourages multi-channel memory access. Why? Because it has a gigantic virtual<->physical memory table to manage, and keeping that table simple means faster memory allocations and less kernel overhead in allocation. It's been a while since I dug into the internals of modern day kernel allocators, but if you can store mappings for entire ranges of pages it saves a lot of memory versus having disparate entries for each and every memory page. Large block allocations are also likely to be freed as blocks, making free list management easier. Long story short, the natural implementation of an allocator leans towards creating contiguous blocks of memory. How do you deal with that as a CPU/memory controller designer? Based on the link above, you simply alternate by cache line. Or, you scramble the physical address map to individual DRAM banks and chips. Remember that Ivy Bridge channel assignment bit? Yep, that's what happened.


Frankly, the benefits of multi-channel memory probably show up almost exclusively in heavily multitasking situations that are heavy on memory bandwidth. I bet browsers love it :D

In Topic: Descriptor binding: DX12 and Vulkan

25 May 2016 - 12:08 PM

I'm also curious to hear how people are starting to take a crack at this.