
Matias Goldberg

Member Since 02 Jul 2006

#5316944 When will DX11 become obsolete?

Posted by Matias Goldberg on Today, 12:23 PM

I don't know why I have to keep repeating myself.

It's because you're wrong and folks here have already explained why.


Down-votes only tell me that somebody is panicking for no reason.

I am as panicked as I am worried that motorcycles will take over cars because they're faster and consume less fuel (conveniently ignoring that cars can transport more people, are more comfortable, withstand adverse weather conditions better, and offer a higher chance of survival in the event of an accident).

Most indie titles might not need to get the kind of performance DX12 is offering, but that doesn't mean you should run away from it (neither was the question limited to indie/niche markets).

Indies don't get scared away by the higher performance. They get scared away by the high maintenance cost.
Not only is it more complex to code, you also have to keep one codepath per vendor (that means three) because the optimization strategies are different. Not to mention you have to update those codepaths as new hardware is released, and if you did something illegal by spec that just happened to work everywhere, you may find yourself fixing your code when it breaks in four years because it's suddenly incompatible with just-released hardware.


Edit: Just an example:

  1. On NVIDIA you should place CBVs, SRVs and UAVs in the root signature much more often. Also avoid interleaving compute and graphics workloads as much as possible.
  2. On AMD you should do exactly the opposite: avoid using the root signature (except for a few parameters, especially the ones that change every draw), and interleave compute & graphics as much as possible.
  3. On Intel (and AMD APUs) you should follow AMD's rules, but also avoid staging buffers, because host-only memory lives in system RAM; the memory upload strategies are different.

Have fun dealing with all three without messing up. Also keep up with new hardware: these recommendations may change in the future.
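To make the NVIDIA-vs-AMD point concrete, here's a minimal sketch of the two root-parameter styles using the d3dx12.h helpers. It's purely illustrative (my own example, with made-up register numbers), not a complete root signature:

#include <d3d12.h>
#include "d3dx12.h"

void buildExampleRootParams()
{
    // NVIDIA-leaning layout: put the frequently-changing CBV directly in the root.
    CD3DX12_ROOT_PARAMETER nvParams[1];
    nvParams[0].InitAsConstantBufferView( 0 );            // CBV at register b0, in the root

    // AMD/Intel-leaning layout: keep the root tiny, reference the CBV via a descriptor table.
    CD3DX12_DESCRIPTOR_RANGE range;
    range.Init( D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 0 );  // one CBV starting at register b0
    CD3DX12_ROOT_PARAMETER amdParams[1];
    amdParams[0].InitAsDescriptorTable( 1, &range );
}

The shader sees the same b0 either way; only how the binding reaches the GPU differs, which is exactly why the codepaths diverge per vendor.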

#5316691 When will DX11 become obsolete?

Posted by Matias Goldberg on 25 October 2016 - 05:42 PM

Hard to tell, since DX12 is not considered the successor of DX11. They're meant to live alongside each other: DX12 is the "for experts" API aimed at dedicated teams with large resources, while DX11 is friendlier and easier to use.


Though whether DX11 ends up decaying because MS focuses more on DX12, only time will tell. Note that MS is still updating DX11: the last update, Direct3D 11.4, was released in August 2016.

#5316390 A quick way to determine if 8 small integers are all equal to each other

Posted by Matias Goldberg on 23 October 2016 - 07:24 PM

Create a table with 256 entries; use 32-bit values on x86, 64-bit on x64, or 128-bit with SSE.
The entries of the table would be:

#define MAKE_MASK( x ) (((x) << 24u) | ((x) << 16u) | ((x) << 8u) | (x))

const uint32_t table[256] =
{
    MAKE_MASK( 0 ),
    MAKE_MASK( 1 ),
    MAKE_MASK( 2 ),
    /* ... */
    MAKE_MASK( 255 )
};

Then just do a 32/64/128-bit compare:

uint8_t idx = ((uint8_t*)_array)[0];
if( table[idx] == ((uint32_t*)_array)[0] &&  table[idx] == ((uint32_t*)_array)[1] )
   //All 8 values are equal
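
For completeness, here's a self-contained version of the same table trick (my own test harness, not from the original thread). memcpy is used for the 32-bit loads to sidestep the strict-aliasing/alignment issues of the raw casts above:

#include <cstdint>
#include <cstdio>
#include <cstring>

#define MAKE_MASK( x ) (((x) << 24u) | ((x) << 16u) | ((x) << 8u) | (x))

static uint32_t table[256];

static bool allEqual8( const uint8_t *a )
{
    uint32_t lo, hi;
    memcpy( &lo, a,     sizeof( lo ) );   // first 4 bytes
    memcpy( &hi, a + 4, sizeof( hi ) );   // last 4 bytes
    return table[a[0]] == lo && table[a[0]] == hi;
}

int main()
{
    for( uint32_t i = 0; i < 256; ++i )
        table[i] = MAKE_MASK( i );

    const uint8_t equal[8]    = { 7, 7, 7, 7, 7, 7, 7, 7 };
    const uint8_t notEqual[8] = { 7, 7, 7, 7, 7, 7, 7, 6 };
    printf( "%d %d\n", allEqual8( equal ), allEqual8( notEqual ) ); // prints "1 0"
    return 0;
}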

#5316172 Matrix 16 byte alignment

Posted by Matias Goldberg on 21 October 2016 - 10:51 PM

Not in my experience, _mm_loadu_ps() was only a few % slower (maybe 1 cycle at most) than _mm_load_ps() when I did the benchmarks on Intel i7, and that extra cost is not even measurable when the address is aligned. Use aligned loads whenever you can ensure alignment, but it seems like more of a micro-optimization. You'll save more time by thinking carefully about how to lay out data for better cache utilization so that you don't pay tens of cycles each memory access.

YMMV (your mileage may vary). Expensive, power-hungry CPUs like the Intel i7 have the lowest penalty, but on certain architectures the performance hit is big (Atom, AMD CPUs). This problem also comes back to bite you if you later port to other platforms (e.g. ARM).
Furthermore, how much slower it is depends on how well the CPU masks the penalty of unaligned access. If you're hitting certain bottlenecks (such as bandwidth limits) the CPU won't be able to mask it well, and that 1% grows.

You'll save more time by thinking carefully about how to lay out data for better cache utilization so that you don't pay tens of cycles each memory access.

Ensuring alignment is correct is part of carefully thinking about how to lay out the data. Furthermore, ensuring correct alignment takes literally seconds of programming work, if not less, and it doesn't make things unreadable or harder to maintain either.
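
For what it's worth, this is the kind of one-liner I mean. A minimal sketch (my own example, assuming a plain float-array matrix type): alignas guarantees the 16-byte alignment that _mm_load_ps requires.

#include <xmmintrin.h>

struct alignas(16) Matrix4
{
    float m[16];
};

__m128 loadRow0( const Matrix4 &mat )
{
    return _mm_load_ps( &mat.m[0] ); // safe: every Matrix4 is 16-byte aligned
}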

#5315850 How to understand GPU profiler data and use it to trace down suspicious abnor...

Posted by Matias Goldberg on 19 October 2016 - 12:03 PM

For example, my GPU profiler told me that it takes 1.5ms for a compute shader to copy a 1080p R8G8B8A8 image from CPU writable memory to a same format Texture2D on default heap every frame. Does that sound normal given that I am using GTX680m?

You need to do the math. Get the maximum transfer-rate specs of your system: the GPU, the PCI-E bus, system RAM, etc. (theoretical on-paper specs are fine, whether you got them online or via a tool like GPU-Z, but it's much better if you work with data from a specialized benchmark tool you ran on your own system). Once you've got the transfer speeds of your system, do the math and check whether you're hitting one of the limits. FYI, a 1080p RGBA8888 image needs 8 bits x 4 channels x 1920 x 1080 = 66,355,200 bits, which is 8,294,400 bytes, 8,100 KB, or 7.9 MB.
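
As a sanity check of those numbers (my own arithmetic, assuming the 1.5 ms figure covers the whole transfer): effective bandwidth is just bytes divided by time.

#include <cstdio>

int main()
{
    const double bytes   = 1920.0 * 1080.0 * 4.0;  // ~8.29 MB per frame
    const double seconds = 1.5e-3;                 // 1.5 ms
    printf( "%.2f GB/s\n", bytes / seconds / 1e9 ); // prints about 5.53
    return 0;
}

Compare that ~5.5 GB/s against whatever your benchmarked PCI-E / system RAM figures are; if you're well below them, the bottleneck is probably somewhere else (the compute shader itself, or synchronization), not the raw transfer.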

#5315385 CPU + GPU rendering with OpenGL

Posted by Matias Goldberg on 15 October 2016 - 11:08 PM

You'll need to use a PBO, keeping it mapped with persistent storage and using fences for synchronization.
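Roughly, the setup looks like this. A minimal sketch of a persistently mapped PBO with a fence (my own illustration; it assumes a GL 4.4 context with a loader already set up, that the destination texture is created and bound, and texSizeBytes/width/height/cpuPixels are placeholders; error handling omitted):

// include your GL loader's header (glad, GLEW, ...) plus <cstring> and <cstdint>

GLuint  pbo    = 0;
void   *mapped = nullptr;
GLsync  fence  = 0;

void createPersistentPbo( GLsizeiptr texSizeBytes )
{
    glGenBuffers( 1, &pbo );
    glBindBuffer( GL_PIXEL_UNPACK_BUFFER, pbo );
    const GLbitfield flags = GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;
    glBufferStorage( GL_PIXEL_UNPACK_BUFFER, texSizeBytes, nullptr, flags );
    mapped = glMapBufferRange( GL_PIXEL_UNPACK_BUFFER, 0, texSizeBytes, flags );
}

void uploadFrame( const void *cpuPixels, GLsizeiptr texSizeBytes, GLsizei width, GLsizei height )
{
    // Wait until the GPU has finished reading this region from the previous upload.
    if( fence )
    {
        glClientWaitSync( fence, GL_SYNC_FLUSH_COMMANDS_BIT, UINT64_MAX );
        glDeleteSync( fence );
    }

    memcpy( mapped, cpuPixels, texSizeBytes );               // CPU writes into the mapping
    glBindBuffer( GL_PIXEL_UNPACK_BUFFER, pbo );
    glTexSubImage2D( GL_TEXTURE_2D, 0, 0, 0, width, height,
                     GL_RGBA, GL_UNSIGNED_BYTE, nullptr );   // source = bound PBO, offset 0
    fence = glFenceSync( GL_SYNC_GPU_COMMANDS_COMPLETE, 0 ); // guard the region again
}

In a real renderer you'd carve the PBO into two or three regions with one fence each, so the CPU writes region N+1 while the GPU is still reading region N and neither ever stalls.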
There is one thing I don't get though:

Only GPU rendering with fixed textures takes 4.0 ms which is okay for OpenGL but then I cannot write freely to the depth buffer unless there is an extension for that. Copying back from fake depth buffers all the time would stall the GPU while waiting for the output as the next input texture.

I assume you don't know about gl_FragDepth?

What is that depth buffer manipulation you do? Why do you need it? What are you trying to achieve?

#5315323 "Self-taught" 18yo programmer asking for carrier advice. Seriously.

Posted by Matias Goldberg on 15 October 2016 - 08:19 AM

My experience is the complete opposite of the other answers given here.


I am also self-taught, since I was 14. By the time I graduated from school I was 17 and knew x86 assembly, C & C++. I had fully read the Intel architecture manuals, and I had completed small projects. One of them I posted on Sourceforge: a library that did vertex TnL transforms in assembly, and through it I learnt the fundamentals that would help me years later with vertex shaders. There were other small apps too, from a simple program that enumerated all processes and let you change their window position and size (even if the original program didn't allow you to), to a full-blown image viewer (which sadly I never released) that dealt with a lot of image formats (BMP, JPG, PNG, TGA, GIF, TIFF, DDS) back in a time when image viewers struggled to support them all. That program in particular gave me a lot of useful tools that would later help me with pixel shaders, because I had developed lots of screen-space filters back then in C.


But I didn't pursue a CS course. Programming to me was a hobby. I did it for fun, and I felt doing it professionally would end up like those NBA players who started out having fun playing basketball and ended up with stressful professional careers where money was everything.


I ended up studying accountancy. Luckily, here accountancy is a very thorough career. There were a lot of mandatory math & statistics courses that helped me with programming. I learnt:

  • Limits
  • Derivatives
  • Integration
  • Definite Integration
  • Cauchy's theorem
  • Taylor series, Maclaurin series
  • Matrices
  • Probability theory
  • Combinatorics
  • Bayes Theorem
  • Binomial, Gauss, Poisson curves
  • Chi Squared test

(note that in school I had already seen stuff like conics & solving polynomials via Ruffini's rule)


So... I was really lucky. A lot of the stuff needed in CS I learnt at uni by studying something completely different (note that I never gave up being self-taught! Btw, I read every GDC and SIGGRAPH presentation that came out for free, every single year!). At some point while studying I got approached by a small startup (thanks to my growing reputation in open source communities and gamedev forums... that's how I built my portfolio) to work with them. Working alongside very talented people allowed me to further hone my skills. I learnt a lot from them. And that's how I slowly started working professionally, getting more and more gigs, eventually signing NDAs with big companies, making lots of contacts and getting good references. Turns out I was wrong: doing what you love and getting paid is great.

When I graduated I hung the degree on the wall, but I don't practice. Note however that this degree wasn't useless:

  • As a freelancer, becoming an accountant helped me a lot with my own finances, as well as dealing with foreign commerce, banks, taxes, Law, and protecting my rights (and knowing my obligations!).
  • Institutions like governments care a lot about whether you have a degree.
  • In the end I had the rare trait of knowing both finance & accountancy in depth and programming in depth. I prefer working on multimedia stuff (3D, graphics, audio... games), but I have to admit the finance programming world is very profitable.



So what would you recommend? Doing the programming degree, that probably won't teach me much, or doing something else while building my portfolio? Or maybe I should get an internship somewhere? (if  you happen to offer one, I'd happily agree to join). Look, guys, this post is not so that I can boast about how awesome my skills are. I just spent all these years on hard work and would really like to hear your opinions so that I don't have to go back in time and start all over again at uni, because that would really hurt.

That will happen whatever field you study. Back when I started accountancy everything was new to me and it was awesome. But by the time I was in my last years, we tended to clash a lot with the professors.

Some professors are awesome and know a lot, some may not know the answer to everything but accept student input.

But you know it when you see a charlatan who somehow got his/her degree yet can't do anything better than teach. And everything he/she says will always be right (don't you dare prove them wrong!). You suffer a lot and it can be very frustrating.

The only difference when you know programming beforehand is that this frustration comes sooner. That's all. I had a friend who was studying CS at the same time I was doing accountancy, and we had this conversation:

  • Hey man... in this question here the teacher corrected this last year's exam about the behavior of a C++ constructor. What do you think?
  • Let me... <analyzes the question and the student's answer>. Nope, your teacher is wrong, the student was right.
  • How so...?
  • <I proceed to show him proof the teacher was wrong with code that denies her assertion using printf() inside constructors and custom operators in Clang, GCC, MSVC in both Release & Debug builds>
  • No... it can't be... Are you... sure? but... but.... she corrected the student...
  • <I look up the C++ standard; after several minutes I find what I was looking for and point him to the clause that definitively proved the teacher wrong>
  • Oh man.... this sucks! You're right! But... well what should I do... mmm.... if the teacher corrected him, I guess I should write down what the teacher says it should happen.
  • Yep, that's what we do when the teacher is a @!!###!@.
  • Yeah, tell me about it

The same thing happened when a teacher told a friend that OOP was perfect, made everything better, and should be aggressively applied everywhere, and would get angry if someone hinted otherwise. This friend contacted me because he really felt OOP couldn't be that good, and I had to show him talks from Mike Acton and the PS3 OOP-pitfalls presentations, which made him feel a lot better and less crazy.


But the thing is... I've had the same problems studying accountancy, with issues specific to that field I won't share.

Something you need to learn: this problem doesn't go away when you finish university. This same problem will come back in the form of employers telling you to do something when you know there is a better way, or when you strongly disagree. If you are your own employer, you will have clients that will try to impose on you which language you should use and how to do your job (even if they know nothing about programming!!!). You learn to either walk away or deal with them smartly (i.e. find a way to show them you did things the way they want, while behind the scenes it isn't really so, keeping them happy). But remember that a client that is a problem isn't a client; it's a problem.

There are also awesome employers and awesome clients. Not everyone is bad. It just happens that one awful guy makes a bigger imprint on your memory than four great employers/clients in a row.


So... in the end, if you're going to study a different field, then like frob said, make sure the basics are still covered (algebra, calculus, combinatorics, statistics, algorithms) and look into something that would complement what you know. In my case I got law & business. But it could just as well be medicine if you're into bioengineering (or... you know... the bioengineering career).

#5314894 Unroll

Posted by Matias Goldberg on 12 October 2016 - 02:47 PM

How old is that ATI card?

If it's too old you'll run into two problems:

  • Radeon HD 2000-4000: the proprietary OpenGL drivers are no longer updated. They're known for being a bit buggy now.
  • Pre-Radeon HD 2000 (i.e. ATI Radeon X1000 series): these cards were so bad they drove us crazy back in the day. Unrolling your loop 8 times could be hitting the limit of 256 instruction slots these cards had.

#5314889 Linear Sampler for Texture3D

Posted by Matias Goldberg on 12 October 2016 - 02:20 PM

Another thing to consider is mipmaps. If your UVWs vary too much, the GPU will likely be dropping to lower mips, and the results will only be as good as the quality of that mip allows.

#5314671 Fast Approximation to memcpy()

Posted by Matias Goldberg on 11 October 2016 - 08:35 AM

Just map the same physical region to another virtual address. Same content, zero copy. Different addresses!
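
If it helps, here's a minimal Linux sketch of the idea (my own illustration, assuming glibc >= 2.27 for memfd_create; error handling omitted). The equivalent on Windows is aliasing the same section via CreateFileMapping/MapViewOfFile.

#include <sys/mman.h>   // mmap, memfd_create
#include <unistd.h>     // ftruncate
#include <cstring>
#include <cstdio>

int main()
{
    const size_t size = 4096;
    int fd = memfd_create( "aliased", 0 );  // anonymous, file-backed pages
    ftruncate( fd, size );

    // Two different virtual addresses backed by the same physical pages.
    char *a = static_cast<char*>( mmap( nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 ) );
    char *b = static_cast<char*>( mmap( nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0 ) );

    strcpy( a, "hello" );   // write through one mapping...
    printf( "%s\n", b );    // ...read it back through the other: prints "hello"
    return 0;
}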

#5313465 Are games with large worlds compiled with fp:/fast?

Posted by Matias Goldberg on 01 October 2016 - 07:32 AM

AAA is a chaotic world. For instance the first releases of Skyrim were built without optimizations.

#5313445 Talent Systems Discussion

Posted by Matias Goldberg on 30 September 2016 - 11:19 PM

FFX has the sphere grid system. In a way, it's a skill tree in a specific layout, except there may be multiple paths to unlock a particular skill; some of them are short while others are longer.


FFV had a crystal job system. I can't remember exactly how it worked (it's been 11 years since I last played it), but you had several job classes your characters could switch between (as the story progressed you unlocked more job classes; some of them were optional, through sidequests). Your chars had individual levels, and also class levels.

Your chars first needed to gain experience in a class before it was useful. E.g. if you had a lv 99 character who was lv 1 as a white mage, he would have a lot of mana but barely any useful spells (just something like Cure1). Mastering a job sometimes gave side effects outside that job (i.e. a permanent health or mana increase regardless of the current job, or unlocking a few commands when going job-less).

You could only switch jobs outside of battle. Switching jobs was complex because if a character was a powerful black mage and you suddenly made him a knight, you'd end up with a weak knight, and you also had to change his equipment. It was very fun, but also very hardcore and time consuming. Something great when I was younger and had the time, not so much now.

#5313234 peekmessage while vs if (main/game loop)

Posted by Matias Goldberg on 29 September 2016 - 07:12 AM

Just in case vstrakh's answer isn't clear: the first one is right, the second one is wrong (it will process only one event per frame).
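
For reference, this is the difference being discussed. A minimal sketch of the message pump (my own reconstruction, since the original snippets aren't quoted here):

#include <windows.h>

void pumpMessages()
{
    MSG msg;

    // Right: 'while' drains every pending message each frame.
    while( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) )
    {
        TranslateMessage( &msg );
        DispatchMessage( &msg );
    }

    // Wrong: 'if' handles at most one message per frame, so input backs up.
    // if( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) ) { ... }
}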

#5313127 Why do new engines change their file extensions and formats all the time?

Posted by Matias Goldberg on 28 September 2016 - 05:32 PM

It seems that backward compatibility is not possible when changing file formats. -_-

It's not that it's impossible, but rather that we don't want to or we don't care.


Games load proprietary file formats, and they do not owe anyone the obligation to maintain any kind of backwards or forwards compatibility (as does happen with MS Office, Maya, Blender, 3DS Max, GIMP, Photoshop, etc.). These files are for internal use only.


Furthermore, backwards compatibility limits flexibility & performance. A newer format may store the same data but in a different layout. For example, the new format may have reversed the order in which LODs are stored so that the lower-detail LODs get parsed first, streamed & displayed while the higher quality LODs are still loading in the background.

Why weren't they stored in reverse order in the first place? Maybe an oversight, a mistake, or simply because years ago the engine didn't support background streaming, or mesh loading was not a performance concern at all.
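
As a made-up illustration of that kind of layout change (not any real engine's format):

#include <cstdint>

struct MeshHeaderV1   // old layout: full-detail LOD 0 first, lowest detail last
{
    uint32_t numLods;
    // followed by: LOD 0, LOD 1, ..., LOD N-1
};

struct MeshHeaderV2   // new layout: lowest-detail LOD first, so it can be parsed
{                     // and displayed while the rest is still streaming in
    uint32_t numLods;
    // followed by: LOD N-1, ..., LOD 1, LOD 0
};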

#5312931 Stencil Write with Clip/Discard

Posted by Matias Goldberg on 27 September 2016 - 05:25 PM

No, unless you've forced early Z via earlydepthstencil (HLSL) or early_fragment_tests (GLSL), in which case the stencil operation will happen before the pixel shader gets a chance to discard the fragment.