
Member Since 14 Feb 2007
Offline Last Active Today, 10:07 PM

#5080454 Boolean operations in shader assembly

Posted by Hodgman on 25 July 2013 - 08:20 AM

Yeah I meant actual dynamic variables in the shader functions (i.e. non-uniform ones).
You can have 16 'constant' (uniform) bools and 16 ints, but these types don't really exist at runtime - there's no instructions to operate on them. If you want to work with them, you'll be copying the results into temporary float registers.

As Washu mentions above, the 16 bool constants are generally used as a 16bit mask that controls static branching.
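To make the "16bit mask" idea concrete, here is a minimal CPU-side sketch. The helper names (PackBoolConstants, IsBoolConstantSet) are made up for illustration and are not part of any shader API; the only assumption is that bit i of the mask stands for bool constant i.

```cpp
#include <cstdint>

// Hypothetical helper: packs 16 boolean shader options into one mask,
// with bit i holding option i.
inline std::uint16_t PackBoolConstants(const bool (&flags)[16])
{
    std::uint16_t mask = 0;
    for (unsigned i = 0; i < 16; ++i)
        if (flags[i])
            mask |= static_cast<std::uint16_t>(1u << i);
    return mask;
}

// Hypothetical helper: tests a single option, the way a static branch
// would select one code path or the other.
inline bool IsBoolConstantSet(std::uint16_t mask, unsigned index)
{
    return index < 16 && ((mask >> index) & 1u) != 0;
}
```

Because the mask is fixed for the whole draw call, every pixel takes the same side of each branch, which is what makes the branching "static".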

#5080447 Is OpenGL or Java Great for Real Time Object Shadows?

Posted by Hodgman on 25 July 2013 - 07:55 AM

"Good real time shadows" is an algorithmic problem. The languages and APIs used are completely irrelevant, so the answer is mu.
There's reasons to criticize GL, and reasons to criticize Java, just as there's reasons to criticize anything... but no, these tools will in no way impact your ability to implement good shadowing algorithms.

What is the strategy to get that great performance of shadows?

What have you tried?

does Java or OpenGL get a "bad rap" in the real time rendering of shadows
I hear or read both Java and OpenGL getting criticized for a performance hit in using them (together or exclusively) for real time shadows.
I specifically want to get at what seems to me is a myth about a significant performance hit when implementing real time shadows in Java or OpenGL

Start with telling us where you got this myth from.

#5080353 implementing scripting system in C++ using LUA or Python

Posted by Hodgman on 25 July 2013 - 01:47 AM

Don't think about each object "running a script" -- after all, when you write a C++ program for enemies in the game, you don't worry about how to make one enemy "run a program", and how to make another enemy have "its own program".


Just think about some of your code being in one language, and some in another. Lua scripts == Python scripts == C++ code, they're all just code.


If in C++ you'd put health inside an enemy class, and then have 100 different instances of that class, then you can do the same in Lua or Python (possibly using tables/dictionaries if not classes)...
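As a rough sketch of the C++ side of that comparison (the Enemy type, its 100-point starting health and SpawnEnemies are all invented for illustration):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical enemy type: the behaviour is written once, and each instance
// only carries its own data.
struct Enemy
{
    int health = 100;

    void TakeDamage(int amount)
    {
        health = (amount >= health) ? 0 : health - amount;
    }
    bool IsAlive() const { return health > 0; }
};

// Creates `count` independent enemies that all share the same code.
std::vector<Enemy> SpawnEnemies(std::size_t count)
{
    return std::vector<Enemy>(count);
}
```

The Lua or Python version would have the same shape: one function (or method) definition, and one table/dictionary/object per enemy holding its health.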

#5080313 OpenGL 4.4 spec is published

Posted by Hodgman on 24 July 2013 - 08:52 PM

I'm excited about ARB_Sparse_Texture, though I'm a little confused as to why they don't support any of the 3-component texture formats.

GPU hardware hasn't supported 3-component texture formats for a long time (aside from packed formats like DXT1).

If you ask GL to give you an RGB texture, on the GPU it will allocate an RGBA texture and pretend that the alpha channel doesn't exist...
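A sketch of what that padding amounts to in memory, written as a plain CPU function. This is only an illustration: ExpandRgbToRgba is a made-up name, and the 255 fill value and byte order are assumptions, since what a driver really does internally isn't specified.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed 8-bit RGB pixels to RGBA, filling alpha with 255.
// Any trailing bytes that do not form a whole pixel are ignored.
std::vector<std::uint8_t> ExpandRgbToRgba(const std::vector<std::uint8_t>& rgb)
{
    const std::size_t pixelCount = rgb.size() / 3;
    std::vector<std::uint8_t> rgba;
    rgba.reserve(pixelCount * 4);
    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        rgba.push_back(rgb[i * 3 + 0]);
        rgba.push_back(rgb[i * 3 + 1]);
        rgba.push_back(rgb[i * 3 + 2]);
        rgba.push_back(255); // padding channel the application never sees
    }
    return rgba;
}
```

Either way, the texture occupies 4 bytes per pixel rather than 3.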

I think the biggest improvement is the conformance test. From what I hear from those who prefer DX over GL, the drivers sometimes behave differently on different cards (with OpenGL). With this change, all the drivers will (should?) have the same behaviour, making it easier to develop OpenGL programs.

Yeah that's something that I always have a whinge about, so this makes me very happy.

[the ARB] has created the first set of formal OpenGL conformance tests since OpenGL 2.0 [and] full certification is mandatory for OpenGL 4.4 and onwards

#5080121 What program is this?

Posted by Hodgman on 24 July 2013 - 07:30 AM

The brainstorming bit at 0:55 looks like FreeMind/XMind/etc

#5080089 Why companies still use C++ and what should I learn then

Posted by Hodgman on 24 July 2013 - 04:53 AM

Look at it this way: When hiring a new driver, the McLaren F1 team would look for someone who knows how to drive a Formula 1 car... but that driver would not have begun their career driving an F1 car; they would have started out driving a cheap two door hatchback with a learner's license along with the rest of us!


C++ is a complex language that not only makes it easy to shoot yourself in the foot, but it ensures that you'll blow off your whole leg when you do so.

If you want to learn C++, it will be much easier to do so after you've mastered a similar language, so that you can just focus on its quirks/pitfalls, rather than all the basics as well!


If you learn C# first, then learning C++ later will be easier. And before you ask if that's a waste of time, experienced programmers learn new languages all the time. Sometimes you're expected to read the manual for a new language one day, and start programming with it the next day. The ability to learn a new language is an ability that you'll also need later, so it doesn't really matter which language you start with, as long as that language acts as a comfortable learning area. In the long run, you'll learn so many different languages that you'll actually get really good at learning new languages!

To begin with though, learning a new language is very time consuming, so it's best to choose a comfortable one.


If you start with C++ first, you will likely have a bad experience with all its sharp edges, and then not enjoy learning new languages.


[edit] P.S. I'm a C++ fanboi, who uses it daily; it's my favourite language, despite its many problems. I learned it early in my career, and did have a bad time (learning through a lot of painful mistakes).

I also use many other languages, because they're often a more suitable tool -- C++, despite my obsession, is not always the best tool to be using.

e.g. I love using C# for making my game's GUI tools because it makes a lot of kinds of tasks very easy for me to solve quickly... I use Lua to write my gameplay code, because I find it's a better fit to those tasks, and I get stuff done quicker with fewer bugs. I'll use PHP to make quick web-apps, even though I think PHP is a horrible language! I use Java for my server code, because the person I'm collaborating with is more comfortable using Java. [/edit]

#5080069 Multi-threaded AI

Posted by Hodgman on 24 July 2013 - 03:37 AM

The problem of using multi threads with AI is that it is very hard to avoid race conditions.

Whether this is true for any area (AI or otherwise) completely depends on the paradigm that you're programming within. There's plenty of paradigms, like functional, where you can easily write multi-threadable software, yet race conditions simply don't exist (aren't possible).
Yeah, if you take an existing OOP system and try and shoe-horn in parallelism, it will be extremely difficult to avoid race-conditions, or to implement parallelism efficiently... which is why this kind of planning should be done up front.
There's two main categories of multi-threaded designs: shared-state and message passing. The right default choice is message-passing concurrency, however, many people are first taught shared-state concurrency in University and go on using it as their default choice.

I will use pathfinding as an example because it is the most common use of AI I have seen in this forum. Say you are running A* on your graph, if one thread has processed a node and another thread changes that same node, the result won't be reliable. Worse results may happen, such as referencing a node that no longer exists, resulting in a segmentation fault.
Using a task system is a good way to have your system executing several independent parts of the code at once (for instance, physics simulation, rendering, sound and particle effects math), but when it comes to AI it is not that simple. Of course this depends a lot on how your system works; if, for instance, your game doesn't allow path blocking (common on tower defense games), you may run pathfinding algorithms at the same time.

A job-graph of your example problem could look like the diagram below:
[diagram: job graph for the pathfinding example]
Green rectangles are immutable data buffers - they're output once by a job, and then used as read-only inputs to other dependent jobs.
Blue rounded-rectangles are jobs - the lines above them are data inputs (dependencies), and the lines below them are outputs (buffers that are created/filled by the job).
As long as the appropriate "wait" commands are inserted in between each job, there's no chance of race conditions here, because all the data is immutable.
Unfortunately, after inserting all the appropriate wait points, this graph becomes completely serial -- the required execution order becomes:
wait for previous frame to be complete
a = launch "Calculate Nav Blockers"
wait for a
b = launch "Check paths still valid"
wait for b
c = launch "Calculate new paths"
wait for c
launch "Move Actors"
This means that within this job graph itself, there's no obvious parallelism going on...
However, the above system is designed so there is no global shared state, and there's no mutable buffers that can cause race-conditions, so each of those blue jobs can itself be parallelized internally!
You can run the "Calculate Nav Blockers" job partially on every core, then once all of those sub-jobs have completed, you can run "Check paths still valid" on every core, etc... Now your entire system is entirely multi-threaded, using every core, with a pretty simple synchronisation model, and no chance of deadlocks or race-conditions.
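A minimal sketch of that "run one job on every core, wait, then run the next" pattern using plain std::thread. The names RunStageOnAllCores and RunTwoStages are invented here, the two stages are stand-ins for real work, and spawning fresh threads per stage is only for brevity (a real engine would more likely reuse a pool of workers).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs stage(begin, end) on numWorkers threads, each over its own slice of
// [0, itemCount), and returns only when every slice is done. The join is the
// "wait" between two dependent jobs.
void RunStageOnAllCores(std::size_t itemCount, unsigned numWorkers,
                        const std::function<void(std::size_t, std::size_t)>& stage)
{
    numWorkers = std::max(1u, numWorkers);
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < numWorkers; ++w)
    {
        const std::size_t begin = itemCount * w / numWorkers;
        const std::size_t end = itemCount * (w + 1) / numWorkers;
        workers.emplace_back([&stage, begin, end] { stage(begin, end); });
    }
    for (std::thread& t : workers)
        t.join();
}

// Two dependent stages: the first fills a buffer, the second only reads it.
// No element is written by more than one thread, so no locks are needed.
std::vector<int> RunTwoStages(std::size_t n, unsigned numWorkers)
{
    std::vector<int> blockers(n), valid(n);
    RunStageOnAllCores(n, numWorkers, [&](std::size_t b, std::size_t e) {
        for (std::size_t i = b; i < e; ++i)
            blockers[i] = static_cast<int>(i) * 2; // stand-in for stage 1 work
    });
    RunStageOnAllCores(n, numWorkers, [&](std::size_t b, std::size_t e) {
        for (std::size_t i = b; i < e; ++i)
            valid[i] = blockers[i] + 1;            // stand-in for stage 2 work
    });
    return valid;
}
```

Each stage's output is never modified after the stage finishes, which is why the only synchronisation needed is the wait between stages.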
Now the only problem is that due to the fact that you probably won't partition your work perfectly evenly (and you won't keep every core in synch perfectly evenly), you'll end up with "stalls" or "pipeline bubbles" in your schedule.
e.g. say you've got 3 cores, and that the "Calculate Nav Blockers" job takes 0.4ms on core#1, 0.3ms on core#2 and 0.5ms on core#3. The next job ("Check paths still valid") can't start until all three cores have finished the previous job. This means that core#1 will sit idle for 0.1ms, and core#2 will sit idle for 0.2ms...
Or visually, rows are CPU cores, horizontal axis is time, blue is "Calculate Nav Blockers", red is "Check paths still valid":
[diagram: per-core timeline showing the stalls]
You can mitigate this issue by having more than one system running at a time. For example, let's say that the AI system is submitting these red/blue jobs, but the physics system is also submitting a green and an orange job simultaneously. Now, when the worker CPU cores finish the blue job, and are unable to start on the red job, they'll instead grab the next available job they can, which is the physics module's "green" job. By the time they're finished computing this job, all cores have finally finished with the "blue" one, so the "red" one can now be started on, and the total amount of stalling is greatly reduced:
[diagram: per-core timeline with the physics jobs filling the gaps]

I do have another question. How much benefit is possible here. Say single threaded vs multithread with 2/3/4 cores available?

In theory, 2x/3x/4x total performance.
In practice, even if your software is perfectly parallelizable, there's other bottlenecks, like CPU-caches often being shared between cores (e.g. only two L2 caches between four CPU cores), which stop you from ever hitting the theoretical gains.
As to how much of a boost you can get in an RTS game... Almost everything that's computationally expensive is parallelizable, so I would say "a good boost".
I'd personally aim for maybe 1.5x on a dual-core, and 3x on a quad core.
People used to say that "games aren't parallelizable", but those people were just in denial. The entire industry has been writing games for 3-8 core CPUs for over half a decade now, and has proved these people wrong. PS3 developers have even managed to learn how to write games for multi-core NUMA CPUs, which has similarities to distributed programming.
To take advantage of parallelism in places we thought impossible, we often just need to re-learn how to write software...

#5080028 Multi-threaded AI

Posted by Hodgman on 23 July 2013 - 10:34 PM

Yeah that's the way that I utilize multiple cores, and it's the same method that the last few console games that I've worked on have used too.

To really simplify how the "job" type model works, a simple API might look something like this:
struct Job { void(*function)(void*); void* argument; }; 
JobHandle PushJob( Job* );//queue up a job for execution
void WaitForJob( JobHandle );//pause calling thread until a specific job has been completed
bool RunJob();//pick a job from the queue and run it, or just return false if the queue is empty
The main thread adds jobs to the queue using PushJob. If the main thread wants to access data that's generated by a job, then it must call WaitForJob (with the handle it got from PushJob) to ensure that the Job has first been executed before using those results.
Your worker threads simply call RunJob over and over again, trying to perform the work that's being queued up by the main thread.
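For the curious, one way those three functions could be implemented is sketched below. This is not the implementation from any real engine: JobHandle being a shared completion flag is an assumption (the API above doesn't say what a handle is), the queue is a plain mutex-protected deque, and WaitForJob helps run queued jobs instead of purely sleeping, so that the sketch also works with zero worker threads.

```cpp
#include <atomic>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>

struct Job { void (*function)(void*); void* argument; };

// Assumption: a handle is a shared "finished" flag.
using JobHandle = std::shared_ptr<std::atomic<bool>>;

namespace {
struct QueuedJob { Job job; JobHandle done; };
std::mutex g_queueMutex;
std::deque<QueuedJob> g_queue;
}

// Queue up a job for execution.
JobHandle PushJob(Job* job)
{
    JobHandle handle = std::make_shared<std::atomic<bool>>(false);
    std::lock_guard<std::mutex> lock(g_queueMutex);
    g_queue.push_back(QueuedJob{*job, handle});
    return handle;
}

// Pick a job from the queue and run it, or return false if the queue is empty.
bool RunJob()
{
    QueuedJob next;
    {
        std::lock_guard<std::mutex> lock(g_queueMutex);
        if (g_queue.empty())
            return false;
        next = g_queue.front();
        g_queue.pop_front();
    }
    next.job.function(next.job.argument); // run outside the lock
    next.done->store(true, std::memory_order_release);
    return true;
}

// Returns once the given job has completed. While waiting, the caller runs
// other queued jobs; if there are none, it yields to the worker threads.
void WaitForJob(JobHandle handle)
{
    while (!handle->load(std::memory_order_acquire))
        if (!RunJob())
            std::this_thread::yield();
}
```

A production version would more likely use a lock-free queue and put idle workers to sleep, but the contract is the same: results are only safe to read after WaitForJob returns.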

I also use another similar pattern, where I'll have all of my threads call a single function, but passing in a different thread ID, which is used to select a different range of the data to work on, e.g.
inline void DistributeTask( uint workerIndex, uint numWorkers, uint items, uint* begin, uint* end )
{
	uint perWorker = items / numWorkers;	
	*begin = perWorker * workerIndex;
	*end = (workerIndex==numWorkers-1)
		? items                      //ensure last thread covers whole range
		: *begin + perWorker;
	*begin = perWorker ? *begin : min(workerIndex, (u32)items);   //special case,
	*end   = perWorker ? *end   : min(workerIndex+1, (u32)items); //less items than workers
}

void UpdateWidgets( uint threadIdx, uint numThreads )
{
  uint begin, end;
  DistributeTask( threadIdx, numThreads, m_numWidgets, &begin, &end );
  for( uint i=begin; i!=end; ++i )
  {
    //...update widget i...
  }
}

#5079502 Handling ISP throttling of our servers...

Posted by Hodgman on 21 July 2013 - 11:45 PM

I'm guessing that the server's total uploads per second is more than the capabilities of your users... at which point packets start filling up a queue faster than they're being sent, which causes 99% packet loss to start occurring, which you're detecting as massive ping times.

You need to either tell your users how many players they can support on their upload bandwidth, and/or optimize your game to use less bandwidth.


Most residential DSL type plans will have ~10x more download bandwidth than upload bandwidth -- e.g. 20Mb/s down and 1Mb/s up, or 5Mb/s down and 0.25Mb/s up.

This kind of connection is fine when you're acting as a client, but is not ideal for a server.


1Mb/s is equal to ~122KiB/s. So if a user only has a connection with 1Mb/s of upload bandwidth, and you want them to be able to host 30 players, then you need to design your game so that each client only requires the server to send them <4KiB/s of data.

Most games that are designed for residential user hosting simply don't support 30 players...


Diagnose the problem:

* Ask your users what speeds they're promised from their ISPs.

* Ask them to use a site like http://www.speedtest.net/ to test their actual upload speeds. 

* Add code to your game to measure the amount of data that you're sending in each direction per second.


You should be able to come up with some guidelines for hosting -- e.g. the server requires 3KiB/s of upload bandwidth per player.


Also, if you add this measuring code so you know how much network traffic you're generating, you might find that certain systems are using a disproportionate amount of data, and that you've got some good targets for optimisation work...
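The measuring code can be very small. A sketch, assuming you can hook every place the game sends or receives a packet; BandwidthMeter and its method names are invented for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical byte counter: call OnSend/OnReceive wherever the game touches
// the socket, and call Sample once per reporting interval.
class BandwidthMeter
{
public:
    struct Rates { double sentKiBPerSec; double receivedKiBPerSec; };

    void OnSend(std::size_t bytes)    { m_sent += bytes; }
    void OnReceive(std::size_t bytes) { m_received += bytes; }

    // Returns the average rates over the elapsed interval and resets the
    // counters so that the next interval starts from zero.
    Rates Sample(double elapsedSeconds)
    {
        Rates r{0.0, 0.0};
        if (elapsedSeconds > 0.0)
        {
            r.sentKiBPerSec = m_sent / 1024.0 / elapsedSeconds;
            r.receivedKiBPerSec = m_received / 1024.0 / elapsedSeconds;
        }
        m_sent = m_received = 0;
        return r;
    }

private:
    std::uint64_t m_sent = 0;
    std::uint64_t m_received = 0;
};
```

Keeping one meter per game system (movement, chat, projectiles, ...) rather than one global meter is what would show you which system is using a disproportionate share.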


What kind of network synchronisation model are you using for your FPS? Have you based it on another FPS game's model, like Unreal, Half-Life, Quake 3, etc? How often do you send out updates, etc? Can you tweak this, so that people with worse connections can update their clients at 15Hz, while better servers can update at 30Hz, etc?

#5079485 API Wars horror - Will it matter?

Posted by Hodgman on 21 July 2013 - 10:24 PM

PS3 OpenGL-based LibGCM

I wouldn't call GCM "OpenGL-based". It's a procedural C-API, which makes it look a lot more like GL code than D3D code, but its design is completely different from GL's design. If anything, I'd say it's based on nVidia's internal GeForce7 driver code, stripped down (which they'd then build GL and D3D interfaces on top of).
In any case, Migi0027, there's not much chance you'll ever end up using GCM at this point. You'd only be using it if you got a job at a professional studio who was making PS3 games, and most of them will very soon be making PS4 games instead ;)

There's no war. The only platform where you have a choice is desktop Windows, and on Windows Direct3D works better. End of story

This. Every different platform has its own "native" graphics API.
If you want to write truly "cross platform" code, you'll have to know: D3D9, D3D11, GL1, GL2, GL3, GL4, GLES, GLES2, WebGL, GX, GCM, GCX, etc, etc, etc...


Simply learning OpenGL is in no way a silver bullet for having "cross platform" code.
With many of the above, you'll also have to make changes to your code not just for each platform, but for different hardware/drivers within a particular platform. e.g. some phones may support a particular texture format, while others don't, etc... or some PC GPUs might support some shader operation, while others don't.

Why is Direct3D preferred by the large companies?

I prefer it on Windows, because it's more reliable than OpenGL on Windows, due to the fact that most of D3D is implemented by Microsoft, and is thus the same on everyone's PC.
To contrast: GL has no central body that enforces compliance with the specification. On Windows, OpenGL is implemented in the graphics driver, so 3 users with 3 different GPUs/Drivers, might be running three completely different implementations of OpenGL! My code might work fine for one of them, and have bugs for the other two that I've never noticed, because I've never tested my code with their particular driver.

On Mac, OpenGL is mostly implemented by Apple (like how MS implements D3D) -- they've stepped up to enforce compliance with the spec, so this problem doesn't exist as much, so GL is less objectionable on Mac. In any case, GL is your only option on Mac, so all the engine companies that you've quoted do also support GL for their Mac versions.
On Linux, your only choice is OpenGL, but the situation is just as dire as on Windows. There is no central authority (e.g. MS/Apple) to ensure that GL strictly follows the spec, so like on Windows, OpenGL is entirely implemented by your graphics driver. Your game might run fine on one driver, and be buggy on another driver. To test your game, you need to test it against every different GPU driver that you want to support... but, if you want to support Linux, it's your only choice to deal with this shitfest.


To sum that up:

Windows - MS does work to ensure D3D is stable. GL is still an option, but with only each GPU manufacturer supporting it on their own.

MacOS - Apple does work to ensure GL is stable. There is no other option.

Linux - You're on your own with GL or Wine. Plus you've got distros that even refuse to support official nVidia/ATI drivers out of ideology...

Web - It's the wild west, a new frontier! Use Flash, HTML5 Canvas, WebGL, Java, Silverlight, etc at your own risk.

Mobile - You're on your own with GLES. Test every device you want to support to discover their quirks!

So personally, I choose to use D3D on Windows and OpenGL on Mac for stability/ease-of-development, even though it means twice the work inside my low-level renderer library.
Basically, I have to port my code to every different platform that I want to support. I just see this as a fact of life. With OpenGL's inconsistencies (outside of MacOS), I don't like to use it because I'm also forced to port my code within a platform (e.g. supporting different drivers).
That's not to say that GL on Windows is the wrong choice though! It is entirely usable, and certainly a lot of professional Windows games have been released using OpenGL!
I disagree with almost everything marcClintDion said above, except for this bit of truth:

you should choose which ever one feels most natural for you.
If you feel that DX is the better API then use it, otherwise don't, or learn both.

I started with OpenGL because it made sense to me sooner than DX did.

Except I'd say that you should learn both anyway, in the long run!


I did also find GL easier to get started with when I was first learning ;)

Should I port my engine to OpenGL. (I have no thoughts of making my engine cross-platform)

If you want to release it on Mac or Linux, then yes, you'll have to port it. Otherwise, there's no reason to do that work.

But my real worry is, that when going to high school/university, I've read that they use OpenGL, so will I have to relearn it all?

This shouldn't really be a problem. Once you've learnt one graphics API, it's very easy to learn others.
I don't even worry about it any more... as long as I've got a reference manual for a new API, I can start working with it almost immediately.

#5079321 What would you normally do if you have 2 sets of APIs to call upon?

Posted by Hodgman on 21 July 2013 - 08:36 AM

What he said ^
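//write once (USE_OPENGL is a placeholder for whatever build flag selects the API):
#ifdef USE_OPENGL
typedef glMatrix GfxMatrix;
#else
typedef D3DXMATRIX GfxMatrix;
#endif
//many uses:
GfxMatrix GetViewMatrix();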
But in this particular case, just use the same matrix structure in both versions of the code. If you need conversion functions internally, then write them.
struct GfxMatrix {};

//one conversion per API; they need distinct names because C++ can't overload on return type alone
D3DXMATRIX CastToD3D( const GfxMatrix& ) {...}
glMatrix   CastToGL( const GfxMatrix& ) {...}

//or: this may be valid if you can ensure your class and the one you're mapping to use the exact same memory layout
const D3DXMATRIX& CastToD3DRef( const GfxMatrix& mat ) { return *(const D3DXMATRIX*)&mat; }
However, neither D3D nor GL requires you to use some specific matrix type. They both accept float[16] at the lowest level... You're talking about using some "glMatrix" type (not part of OpenGL, what is this library?) and also using D3DX. You can just pick one of those libraries and use it for both GL and D3D!
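A sketch of the "one matrix type for both" idea, assuming a bare float[16] layout. GfxMatrix's members and the Identity helper are illustrative, not taken from either library:

```cpp
// One engine-side matrix type shared by the GL and D3D code paths.
struct GfxMatrix
{
    float m[16];

    // Raw pointer to the 16 floats, which is what the low-level API calls take.
    const float* Data() const { return m; }
};

// The struct must stay tightly packed for the raw-pointer trick to be valid.
static_assert(sizeof(GfxMatrix) == sizeof(float) * 16, "GfxMatrix must be 16 packed floats");

GfxMatrix Identity()
{
    GfxMatrix r = {};
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
    return r;
}
```

The pointer from Data() can then be handed to calls that take a const float*, such as glUniformMatrix4fv on the GL side or SetVertexShaderConstantF on the D3D9 side. You'd still need to settle on one row-major/column-major convention and transpose where an API expects the other.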

#5079245 Taxes / Deduct Contract Payouts

Posted by Hodgman on 20 July 2013 - 10:24 PM

Which country/state are you in?


Here in Australia, you have to be a company in order to sell stuff. You do this by filling out a short form on the tax office website for free, which gives you a business number immediately.


As a guess for how this would work in your situation:

If you're selling the game as an individual (sole proprietorship), then Ouya makes payments to just you. You then pay tax on that full amount (or Ouya pays the tax on your behalf), as it's income into your "company".

Your company then has a secondary expense, of paying royalties to an artist. This is just a usual business expense, and doesn't exempt you from paying the initial taxes.

You make these payments to the other company (your artist's sole proprietorship), and he pays taxes on his income (so the money going to the artist has essentially been double taxed).


Alternatively, the two of you can register a partnership, so that Ouya can pay money into the partnership.

#5079136 How to organize a C++ project and its external libraries

Posted by Hodgman on 20 July 2013 - 08:48 AM

The external libraries I use (currently only SFML and pugixml) are not under version control yet though. I would like to change that, so that the repository contains everything that is needed to build the project. I also would like to keep it more platform and compiler independent than it currently is
Yeah I prefer to make an "external" directory, inside my project's directory, which contains all the dependencies, rather than having all my dependencies existing independently outside of my project directory.

The reason for this is simply configuration management and quality assurance. If you're relying on independent external libraries, then different people on your team (or different PC's of your own) might produce different results, due to those different people/PCs having different versions of the external libraries installed.

By bundling them into your own project, you ensure that it is always built the exact same way with the exact same code.


The layout I'd normally use would leave the external libraries alone. My project files (CMake etc) will be a bit messy as each external library will use a different layout for its own files, but it's easier to just leave their distro alone and deal with this slightly messy config on my end, rather than try to reorganize their files to the way I'd prefer every single time I update to a new version of their library...

project/msvc/...     compiler specific stuff (projects/workspaces/etc)
project/src/...      code that's internal to the library
project/include/...  code for users of the library
project/bin/...      where it's built to
project/external/... third party libraries required
project/external/foo/... the foo distribution *untouched*
project/external/bar/... the bar distribution *untouched*

#5079083 Do you extend GLSL?

Posted by Hodgman on 19 July 2013 - 10:11 PM

I'm redoing my GL support in the near future, but my plan is to parse my shader code offline, converting it into an intermediate representation, and then using that to generate streamlined GLSL code for the engine to use (with all includes/preprocessor statements resolved, basic optimisations applied, whitespace/variable names trimmed, etc).


Screw using raw GLSL.
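As a taste of the "includes resolved offline" part of that plan, here is a deliberately naive sketch. ResolveIncludes is a made-up function; it assumes includes are written as a whole line of the form #include "name", that the files are already loaded into a map, and it does none of the parsing into an intermediate representation described above.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Recursively replaces lines of the form  #include "name"  with the text of
// that file, looked up in an in-memory map. Unknown names are left untouched,
// and the depth cap is a crude guard against include cycles.
std::string ResolveIncludes(const std::string& source,
                            const std::map<std::string, std::string>& files,
                            int depth = 0)
{
    std::istringstream in(source);
    std::string line, out;
    const std::string tag = "#include \"";
    while (std::getline(in, line))
    {
        const std::size_t end = line.rfind('"');
        if (depth < 16 && line.compare(0, tag.size(), tag) == 0 && end >= tag.size())
        {
            const auto it = files.find(line.substr(tag.size(), end - tag.size()));
            if (it != files.end())
            {
                out += ResolveIncludes(it->second, files, depth + 1);
                continue;
            }
        }
        out += line + "\n";
    }
    return out;
}
```

The output is a single flat string, so the engine only ever hands the driver shader text with no include directives left in it.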

#5078940 "Standard" Resolution Performance

Posted by Hodgman on 19 July 2013 - 09:43 AM

It won't make a difference to performance unless you're on some platform that can only output video signals in some restricted set of resolutions.

e.g. a hypothetical current gen console might be built to always output at 720p or 1080p, so if you internally use a lower resolution, then you have to explicitly pay the cost of resizing your framebuffer to 720p. However, these systems might also provide hardware support to assist in this operation... Nonetheless, you will burn a fraction of a millisecond performing the scaling.


If we take what he's saying less literally, and replace "performance" with "latency", he might be slightly closer to the truth, at least when displaying on a television.

Some displays will be optimized specifically for their native resolutions only. If you send them a signal for a different resolution, they might be forced to internally re-scale the signal to match their native resolution. This operation, when done by a television, often takes one frame's worth of time -- i.e. so a 60Hz signal will have 16.6ms of latency added to it if rescaling is performed inside the TV.

Technically, there is no performance penalty, because this rescaling happens "for free" in parallel, inside the television... but your overall input latency is hugely affected.

Modern HDTVs (especially early or cheap ones) are often offenders in this category -- sometimes playing current gen console games that are 720p on a 1080p TV will result in an extra 16.6ms of latency, which is ridiculous.

However, there's no way to query this. Some TVs won't add any extra latency depending on the resolution. Some will add latency if you don't output a "standard" signal. Some will add latency if you don't output a specific signal (e.g. 1080p only). Some TVs will always add a huge amount of latency for no good reason!

The best rule of thumb I can think of here would be to prefer outputting at the monitor's maximum supported resolution (which should be its native resolution).