Hodgman

Member Since 14 Feb 2007
Offline Last Active Today, 04:56 AM

Posts I've Made

In Topic: Sharpen shader performance

Today, 12:17 AM

[quote name="LM-Crashy" post="5313172" timestamp="1475128726"]
[quote]
Making a shader transpiler and a new shader language is a whole project in itself, Hodg :D When you don't have a tooling team that makes that stuff, the time might be best invested in something else.
[/quote]
Yeah, this is practically beyond the capacity of an indie studio...
[/quote]
Good thing GLSLOptimizer is open source :)
If it takes a few days to integrate GLSLOptimizer and it saves you a week in tweaking GLSL code, then you can't afford not to :wink:

There are a few other existing projects that convert D3D bytecode to GLSL, too. I'm thinking about taking that direction in the future rather than making a new language on top of HLSL/GLSL.

In the past we looked at making an HLSL-like language and a transpiler - I didn't work on it, but it only took one person a week to get the prototype working.

In Topic: How to make a Direct3D program work in OpenGL context?

Yesterday, 11:59 PM

Both D3D and OpenGL use left-handed coordinates, but a small tweak to your projection matrices lets you work with right-handed coordinates. Just use the same math/matrix code for both APIs (except that your GL projection matrices need to produce an NDC with -1<z<1, while your D3D projection matrices need to produce an NDC with 0<z<1).
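To make that depth-range tweak concrete, here's a minimal sketch (right-handed view space, row-vector convention; the function and struct names are mine, not from any particular engine) showing that the two matrices differ only in the terms that remap z:

```cpp
#include <cmath>

struct Mat4 { float m[4][4] = {}; };

// Right-handed perspective projection (camera looks down -Z).
// The only API difference is the NDC depth range:
//   GL-style:  z_ndc in [-1, 1]     D3D-style: z_ndc in [0, 1]
Mat4 PerspectiveRH(float fovY, float aspect, float zn, float zf, bool zeroToOne)
{
    const float f = 1.0f / std::tan(fovY * 0.5f);
    Mat4 p;
    p.m[0][0] = f / aspect;
    p.m[1][1] = f;
    if (zeroToOne) {               // D3D convention: z maps to [0,1]
        p.m[2][2] = zf / (zn - zf);
        p.m[3][2] = (zn * zf) / (zn - zf);
    } else {                       // GL convention: z maps to [-1,1]
        p.m[2][2] = (zn + zf) / (zn - zf);
        p.m[3][2] = (2.0f * zn * zf) / (zn - zf);
    }
    p.m[2][3] = -1.0f;             // w' = -z_view (right-handed)
    return p;
}

// Transform a view-space z and do the perspective divide.
float ProjectZ(const Mat4& p, float zView)
{
    float z = p.m[2][2] * zView + p.m[3][2];
    float w = p.m[2][3] * zView;
    return z / w;
}
```

With this, the near plane lands on z_ndc = -1 (GL) or 0 (D3D) and the far plane on 1 in both, while every other row of the matrix is identical.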

In Topic: Amd openGL driver crashes

Yesterday, 08:16 PM

Which line of your code does it crash on, and what's the exact crash message?

In my experience, NVidia drivers tend to tolerate (and even encourage) non-standard code, whereas AMD drivers tend to follow the letter of the spec and crash if you deviate from it. So I wouldn't assume a driver bug immediately.

It's been a while since I read the GL spec... So as a guess, try getting the status of the link operation before detaching the shaders, instead of after.
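That ordering would look something like this (a sketch only; it assumes a current GL context and existing vertexShader/fragmentShader handles, so it won't run standalone):

```cpp
// Link, then query status and the info log *before* detaching the
// shaders -- the guess above is that some drivers handle queries
// after detach less gracefully.
GLuint program = glCreateProgram();
glAttachShader(program, vertexShader);
glAttachShader(program, fragmentShader);
glLinkProgram(program);

GLint linked = GL_FALSE;
glGetProgramiv(program, GL_LINK_STATUS, &linked);
if (linked != GL_TRUE) {
    GLint len = 0;
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &len);
    std::vector<char> log(len > 1 ? len : 1);
    glGetProgramInfoLog(program, (GLsizei)log.size(), nullptr, log.data());
    fprintf(stderr, "link failed: %s\n", log.data());
}

// Only now detach (and optionally delete) the shaders.
glDetachShader(program, vertexShader);
glDetachShader(program, fragmentShader);
```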

In Topic: When should I use int32_t rather than int

Yesterday, 06:33 PM

Yeah I'm in the "use int/uint/float by default, and other sized types when you must" camp.

You should only specify the size when it actually matters, and let the compiler do the work of figuring out the best size as much as possible.  The places specific sizes will matter vary, but an incomplete list would include file formats, network IO, talking to hardware at a low level, and various other relatively uncommon tasks.  So, for instance:
  for (int32_t i=0; i<10; ++i)  // Don't do this.  
instead use:
  for (int i=0; i<10; ++i)

 
I disagree with this. If you suddenly compile for a platform where int is 16 bits, you may end up with all sorts of weird and hard-to-find bugs. This has actually happened to me. I've always considered the part where int/long/etc are compiler-defined to be an incredibly flawed design decision. It's supposed to help with portability, but in my opinion/experience it does the exact opposite, since the same code might behave completely differently on another target.

Technically if you follow All8's advice including the bolded bit, then you won't get weird bugs :wink:
"When it actually matters" is:
* When you need to use a specific memory layout -- e.g. you require a struct have a 4 byte field.
* When you're working with "large" ranges -- e.g. a list of 100k elements.
 
So All8's loop should be:
static_assert( 10 <= INT_MAX );
for (int i=0; i<10; ++i)...


In my case, I don't care about porting my game engine to a 16-bit platform, so I'm actually happy to include something like: 
static_assert( INT_MAX >= 0x7FFFFFFF, "Code assumes a 32/64 bit platform. Porting will be hell." );

int doesn't give you any guarantee on the number of bytes it uses, so when you need this guarantee you need to use one of the sized integers, e.g. for serialization. For performance there is actually int_fast32_t. E.g., IIRC on the PS3 the fast integer was actually 64-bit and they didn't want to make the regular int 64-bit. We used fast int in for loops. Personally, I would not worry about this now.

PPC64 could do int64 at full speed, so there was no need to avoid int64 for perf reasons, but int32 was also full speed IIRC -- there were instructions to operate on half a register as an int32. Most compilers should still implement int as int32_t, as it's the same speed but half the RAM usage.
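To make the "when it actually matters" cases from this thread concrete, here's a small sketch (the FileHeader struct, its magic value, and its field names are made up for illustration):

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstring>

// Where exact sizes matter -- file formats, network IO -- spell them out.
// The fixed-width types guarantee this on-disk layout; plain int would not.
struct FileHeader {
    uint32_t magic;        // 4 bytes
    uint16_t version;      // 2 bytes
    uint16_t flags;        // 2 bytes
    int32_t  entryCount;   // 4 bytes
};
static_assert(sizeof(FileHeader) == 12, "on-disk layout must be exactly 12 bytes");

// Everywhere else, plain int is fine -- guarded by the platform assumption
// from the post above:
static_assert(INT_MAX >= 0x7FFFFFFF, "Code assumes a 32/64 bit platform.");

// Round-trip through raw bytes, the way a loader would.
inline int ReadEntryCount(const unsigned char* buf)
{
    FileHeader h;
    std::memcpy(&h, buf, sizeof h);
    return h.entryCount;
}
```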


In Topic: Basic multithread question

Yesterday, 05:21 PM

Desktop CPUs run all the time and use the same amount of power, because if you don't use the CPU, the kernel will. What happens is that the kernel takes the CPU time to do housekeeping work, and on top of that it runs NOPs if there is nothing to do.


The "housekeeping" work a kernel does is negligible. Maybe sometimes you have some stupid "services" running in the background that e.g. create index lists of filenames, but the kernels themselves (be it Windows or Linux) do not do this kind of stuff.

x86 (since the 8086, to be accurate) uses the HLT instruction to put the CPU/core into a halt state, and via IRQ/INT the CPU/core wakes up again. So there is no need for a NOP busy loop.
FWIW, if you're actually writing a busy-wait / spin loop on x86 these days, then you really must put a very specific NOP instruction in the loop body (the one emitted by _mm_pause/YieldProcessor). It not only signals that hyperthreading should kick in immediately, but also that the CPU should enter a low-power state until the loop condition is met, and de-pipeline the loop body on the fly.

For highly contended locks, it can be useful to spin for some nanoseconds before actually putting the thread to sleep via the kernel, as it's likely that this short pause is enough for the lock to become available.
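Putting those two points together, a minimal spin-then-yield lock sketch (the spin count of 4000 is an arbitrary illustrative tuning value, and yielding stands in for a real kernel sleep/futex wait):

```cpp
#include <atomic>
#include <thread>
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
  #include <immintrin.h>
  #define CPU_PAUSE() _mm_pause()       // the PAUSE "NOP" discussed above
#else
  #define CPU_PAUSE() std::this_thread::yield()
#endif

class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Spin briefly with PAUSE in the loop body, hoping the holder
        // releases within nanoseconds; then give the timeslice back to
        // the kernel instead of burning it.
        for (int spins = 0; flag.test_and_set(std::memory_order_acquire); ++spins) {
            if (spins < 4000)
                CPU_PAUSE();
            else
                std::this_thread::yield();
        }
    }
    void unlock() { flag.clear(std::memory_order_release); }
};
```

Real implementations typically replace the yield branch with an OS wait primitive (futex, Event, etc.) so a long wait costs no CPU at all.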
