Shannon Barber

Member Since 23 Jun 2000
Offline Last Active Sep 23 2016 07:35 PM

#5309750 Faster Sin and Cos

Posted on 06 September 2016 - 08:52 PM

What kind of windowing are you using?

#5307563 CryEngine V scripting in C#? Really?

Posted on 24 August 2016 - 03:17 AM

I believe Lua is the "preferred" (for lack of a better term) scripting language in CryEngine, but they did add a C# interface.


The reason why so many Unity games look so similar is due to the readily available common asset pool.  

The technical issues of the engine would be next in line and whether it's C++ or C# is pretty far down the list.  


The days of pushing a PC "to the limit" are just gone and left in the dust behind us.  

There's not even a lot of focus on getting everything out of the GPU because all of the blockbuster titles are console ports.


Basically, unless you are writing AVX-based algorithms, what language you use isn't relevant anymore (with regard to performance).

And you can always write some bits in native code if you need to and dynamically link them in.


To me, the portability of the language and its supporting libraries is the dominating characteristic and that's where C# just comes unglued. (I wish C++14 had been released ten years earlier.) YMMV.

#5307560 Can a catmull rom take the place of a dubins curve?

Posted on 24 August 2016 - 03:07 AM

In context, Catmull-Rom has stability issues. It could easily produce a very unnatural curve.  

Whereas a Dubins curve is better behaved because of the constraints it follows (you guide it with tangent constraints).

#5307119 Ide For Linux

Posted on 21 August 2016 - 10:56 PM

Sublime if no one has mentioned that yet

#5303564 Multithreaded User Interface

Posted on 01 August 2016 - 05:46 PM

You just have to render it to a separate plane, and if you're not running that low-level then render to textures and implement your own double or triple buffering.

It's worth looking for an extension to get a plane if you try it.



I'm not sure where the notion of fast-enough UI code is coming from - in my experience it's the slowest thing there is.

In cars we use separate threads for overlays because they are so slow and we have to meet performance targets for rendering gauges.

#5303224 Is This A 'thread Deadlock'?

Posted on 30 July 2016 - 01:55 PM

That's a primitive (with a reader-writer) which means updating it is atomic which means it's not a locking problem.

He didn't use 'volatile' so the check got optimized out.
Adding locking will probably fix the issue, but it's not guaranteed, and it's overkill for such a simple case.



volatile int g_done;
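
For comparison, here is a minimal sketch of the same flag done with std::atomic instead of volatile. This is a substitution, not what the post proposes: volatile keeps the compiler from optimizing the check away, while std::atomic (C++11 and later) also gives defined behavior when one thread writes the flag and another reads it. The names g_done_flag, signal_done and is_done are made up for illustration.

```cpp
#include <atomic>

// Hypothetical replacement for `volatile int g_done;`.
std::atomic<bool> g_done_flag{false};

// Called by the thread that wants the worker to stop.
void signal_done() {
    g_done_flag.store(true, std::memory_order_release);
}

// Polled by the worker loop. An atomic load is re-read on every call,
// so the compiler cannot hoist the check out of the loop.
bool is_done() {
    return g_done_flag.load(std::memory_order_acquire);
}
```

A worker would then spin on something like `while (!is_done()) { /* work */ }`, still with no lock involved.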

#5302834 Should video game mechanics be scripted?

Posted on 27 July 2016 - 07:12 PM

Most will need to be coded.

You're unlikely to get the performance of the effect you want if you implement it all in script.


#5301027 Finding that balance between optimization and legibility.

Posted on 16 July 2016 - 07:10 PM

There is no possible way that code speeds up the calculation of sin and cos values for vectors, and it introduces a problem with reentrancy (it's not thread-safe).


When optimizing this sort of code there are three things you must do to achieve state-of-the-art performance.

1) Ensure you are using the greatest known mathematical reduction of the algorithm

2) Eliminate all branches (even if it means more calculations)

3) Use vectorized operations, i.e. SIMD (SSE, NEON, AVX, et al.)
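
To make point 2 concrete, here is a small sketch of trading a branch for a few extra operations. It is only an illustration (the function name is made up), and a modern compiler may well generate branch-free code for the plain `a < b ? a : b` on its own, so measure before assuming it helps.

```cpp
#include <cstdint>

// Branchless minimum of two 32-bit integers.
// (a < b) is 0 or 1; negating it gives a mask of all zeros or all ones,
// which selects either b unchanged or b ^ (a ^ b) == a.
std::int32_t branchless_min(std::int32_t a, std::int32_t b) {
    const std::int32_t mask = -static_cast<std::int32_t>(a < b);
    return b ^ ((a ^ b) & mask);
}
```

It does more arithmetic than the branching version, but there is nothing for the branch predictor to get wrong.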


Optimizing the code for vector operations can be very annoying.

Algorithms tend to favor separate arrays for each element/dimension as opposed to interleaved arrays, which are more convenient to deal with.
Separate arrays cut down on loading and packing time of the SIMD registers, and that can be critical to utilizing all available computation units.
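
A rough sketch of the two layouts, with made-up type and function names. The loop over the separate arrays reads contiguous floats from each component, which is the access pattern that auto-vectorizers and hand-written SIMD loads prefer. No intrinsics are used here, so whether it actually vectorizes depends on the compiler and flags.

```cpp
#include <cstddef>
#include <vector>

// Interleaved layout: convenient, but x/y/z of one point sit next to
// each other, so a wide load picks up a mix of components.
struct PointAoS { float x, y, z; };

// One array per component: each array is a contiguous run of floats.
// All three vectors are assumed to have the same length.
struct PointsSoA {
    std::vector<float> x, y, z;
};

// Squared length of every point; `out` is resized to match.
void squared_lengths(const PointsSoA& p, std::vector<float>& out) {
    out.resize(p.x.size());
    for (std::size_t i = 0; i < p.x.size(); ++i) {
        out[i] = p.x[i] * p.x[i] + p.y[i] * p.y[i] + p.z[i] * p.z[i];
    }
}
```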


Doing the above and eliminating any IEEE-754 or C-standard overhead (e.g. if the rounding rules of the unit are different from the standard's, it has to perform a conversion when storing) is how you make it fast.

The old fsincos instruction got it done in about 137 clock cycles; SSE2 and newer should have faster or more vectorized options.

If you can sacrifice accuracy, you can use an approximation of the sin and cos values; those algorithms are generally just multiplies and accumulates, and you can get it done in a lot less than 100 clock cycles.
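
As an illustration of the multiply-and-accumulate kind of approximation, here is a sketch that uses a degree-7 Taylor polynomial in Horner form. The assumptions are mine: the input is already reduced to [-pi/2, pi/2], and plain Taylor coefficients are used for simplicity (a minimax fit would do better for the same cost). Over that range the error stays below roughly 2e-4. The function name is made up.

```cpp
// Approximate sin(x) for x in [-pi/2, pi/2].
// Horner form: four multiplies by x2, one by x, and three adds.
// Coefficients are -1/3!, 1/5!, -1/7! from the Taylor series.
float approx_sin(float x) {
    const float x2 = x * x;
    return x * (1.0f + x2 * (-1.0f / 6.0f
              + x2 * (1.0f / 120.0f
              + x2 * (-1.0f / 5040.0f))));
}
```

There are no branches in it, and the same expression maps directly onto SIMD lanes if you feed it from separate arrays.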

#5294857 responsiveness of main game loop designs

Posted on 03 June 2016 - 02:53 PM

To do this optimally you need a buffer of input events with hardware time-stamps.

I was hoping that DirectInput would evolve towards that but it was abandoned and we're back to sucking messages from the pump.

That at least provides an ordered list of events but without the time-stamps you cannot even implement something as simple as pulling back the plunger for a pinball game accurately.

You get quantized to your polling rate and experience jitter corresponding to your input-stack and thread stability.


You could compensate for this by introducing the uncertainty of your input into your hit detection.

When you receive an event you know it happened between 'just now' and one polling period ago, which gives you a delta-time.
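
A sketch of what carrying that uncertainty into hit detection could look like. Everything here is an assumption for illustration: the struct, the field names, and the choice to treat a hit as "the uncertainty interval overlaps the target window".

```cpp
// An input event seen by a polling loop: it really happened somewhere
// between (poll_time - poll_period) and poll_time.
struct PolledEvent {
    double poll_time;    // seconds, when the poll observed the event
    double poll_period;  // seconds between polls
};

// Earliest moment the event could actually have occurred.
double earliest_time(const PolledEvent& e) {
    return e.poll_time - e.poll_period;
}

// True if the event could have fallen inside [window_start, window_end],
// i.e. the uncertainty interval overlaps the hit window.
bool may_hit(const PolledEvent& e, double window_start, double window_end) {
    return earliest_time(e) <= window_end && e.poll_time >= window_start;
}
```

With an 8 ms polling period, an event observed at t = 1.000 s is treated as having happened anywhere from 0.992 s to 1.000 s.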


Humans do not act "at 5Hz".  

I can push a button for less than 10 ms and routinely demonstrate how HMIs cannot handle button presses that quick (and we show on an oscilloscope that the button was indeed pressed for 7-12 ms).


60 Hz vs. 120 Hz makes a notable difference for FPS games and it's probably due to the triple-buffering.

3 x 1/60 s = 50 ms (versus 25 ms at 120 Hz). That's an eternity.

#5290999 Custom editor undo/redo system

Posted on 10 May 2016 - 12:15 PM

The (de)serialization (second) approach can have a significant performance impact for moderate data-sets.

In a tool I wrote long ago we had to stop doing it that way and use the command pattern with undo/redo stacks (otherwise every time the end-user made a trivial modification there was a pause as the data was serialized.)
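
For anyone who hasn't seen it, a minimal sketch of the command pattern with undo/redo stacks. This is not the code from that tool. The class names and the SetValue example command are made up to show the shape: each command stores only what it needs to reverse itself, so a trivial edit costs a trivial amount of work.

```cpp
#include <memory>
#include <utility>
#include <vector>

// One reversible edit.
struct Command {
    virtual ~Command() = default;
    virtual void apply() = 0;
    virtual void revert() = 0;
};

// Example command (hypothetical): change one integer field.
class SetValue : public Command {
public:
    SetValue(int& target, int new_value)
        : target_(target), new_(new_value), old_(target) {}
    void apply() override { old_ = target_; target_ = new_; }
    void revert() override { target_ = old_; }
private:
    int& target_;
    int new_;
    int old_;
};

class History {
public:
    // Run a command and make it undoable; a new edit clears the redo stack.
    void execute(std::unique_ptr<Command> cmd) {
        cmd->apply();
        undo_.push_back(std::move(cmd));
        redo_.clear();
    }
    // Revert the most recent command; returns false if there is none.
    bool undo() {
        if (undo_.empty()) return false;
        undo_.back()->revert();
        redo_.push_back(std::move(undo_.back()));
        undo_.pop_back();
        return true;
    }
    // Re-apply the most recently undone command; false if there is none.
    bool redo() {
        if (redo_.empty()) return false;
        redo_.back()->apply();
        undo_.push_back(std::move(redo_.back()));
        redo_.pop_back();
        return true;
    }
private:
    std::vector<std::unique_ptr<Command>> undo_;
    std::vector<std::unique_ptr<Command>> redo_;
};
```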

#5290183 Is it C# Territory?

Posted on 04 May 2016 - 08:03 PM


Your good reason is "We have hundreds of thousands of man hours invested in our giant aging C++ code base, thus we'll be keeping that around. kthxbye."


Sunk cost fallacy...


A good reason would involve comparing the expected results of the new product against the quality of the existing product to see if the value of improvement exceeds the implementation cost and risks.



... and that cost would be hundreds of millions of dollars.

#5289321 Question about Open World Survival Game Engines

Posted on 29 April 2016 - 02:50 PM

You're going about this completely backwards. Normally you gather a team of engineers based on your pitch (and proper compensation of course), and you let them decide what technology to use for the project, since they'll be able to make a much more educated decision than you ever will.


I actually disagree. Just like running a guild, you have to pick the game and schedule first then recruit people that fit and want it.
If you just grab a bunch of people because they are good at X/Y/Z you will end up with a group of talented people that have no feasible way of working together.

#5289313 Question about Open World Survival Game Engines

Posted on 29 April 2016 - 02:15 PM

For a professional project I would see if I could license Forgelight 2 (that's the SOE/Daybreak engine used in PlanetSide 2 and H1Z1).

Next I would look at Unreal 4 and determine the work it would take to scale to large worlds - ARK is progressing albeit with some issues.

Practically speaking Unreal 4 is way easier to "get off the ground" with than wooing Daybreak to license FL2 to a vaporware team - especially when the plan is to create a competing project.


Programmers will be the least of your problems.

(There are a lot of programmers with boring day-jobs that will be willing to moonlight on an interesting project.)

The biggest problem is core business organization and project management.

There is no demonstrated way to complete such a project *on schedule* without an operating budget.

Mods and total conversions get made, but they are generally not completed according to the original schedule.

The next major problem is the creation and integration of high-quality artwork and sound.


For structuring such a thing I would create a bitcoin-mining-like value generation and then assign it to the people working on it.

There's some software infrastructure to create for tracking (e.g. a tray-icon tool you click to clock-in/clock-out, detect away, detect screwing around on reddit, et al.) and just that infrastructure could be its own company.

By bitcoin-mining-like I mean I would track the time people put in and have accomplishment of milestones unleash value that is then distributed among the people that contributed time to make it happen.

That amount becomes how much of the company they own (e.g. issue private stock).
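
The bookkeeping for that could be as simple as the sketch below. The split proportional to hours is my assumption for the illustration (any weighting could be used instead), and the function name is made up.

```cpp
#include <cstddef>
#include <vector>

// Split the value released by a milestone in proportion to hours logged.
// `hours` holds each contributor's tracked time; returns one share each.
std::vector<double> split_milestone_value(const std::vector<double>& hours,
                                          double milestone_value) {
    double total = 0.0;
    for (double h : hours) total += h;

    std::vector<double> shares(hours.size(), 0.0);
    if (total <= 0.0) return shares;  // nobody logged time: nothing to assign

    for (std::size_t i = 0; i < hours.size(); ++i) {
        shares[i] = milestone_value * hours[i] / total;
    }
    return shares;
}
```

Summing each person's shares across milestones gives the number that would turn into their ownership stake.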

#5289298 Returning by value is inevitable?

Posted on 29 April 2016 - 01:34 PM

Anyone care to explain what is wrong with the static variable solution? I've used this for years and always thought it was a convenient and fast solution.


It's not technically different than returning the result in a global variable, except it's even less obvious what is going on.

(Others have pointed out why this is bad: it breaks thread-safety and, I presume, also exception safety.)

Over the past 10 years or so, code like this in the C standard library (strtok, asctime and friends) has been deprecated or supplemented by better-designed reentrant functions.


e.g. You suggested they do this:

#include <math.h>

float g_result;
void do_stuff(float a, float b) {
   g_result = sqrtf(a*a + b*b);  /* result leaks out through a global */
}
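
The post's example stops there, so the following is my sketch of the obvious alternative rather than a reconstruction of what came next: return the result by value. The function name is made up.

```cpp
#include <cmath>

// Same computation, but the result comes back by value: each caller
// gets its own copy and no shared state is read or written.
float length2d(float a, float b) {
    return std::sqrt(a * a + b * b);
}
```

Two calls can no longer step on each other, whether they come from two threads or from the same expression.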

#5289293 Data alignment on ARM processors

Posted on 29 April 2016 - 01:16 PM

The best way to solve this problem really depends on the details.


Given what you have posted: since you have to perform big/little endian reordering anyway, you can get de-aliasing for free if you use a macro instead of a function, as long as you write the macro to access the data byte-by-byte.


If the function adapts to the system/data then you need a corresponding no-swap macro that is hard-coded to move 4 bytes (that will be much faster than invoking memcpy for such a small amount of data).
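
A sketch of what such a pair of macros might look like. The macro names are made up, and both assume the pointers are `unsigned char *` (or `const unsigned char *` for the source) and that the data is 32-bit big-endian.

```cpp
#include <cstdint>

// Read a big-endian 32-bit value one byte at a time. Only bytes are
// touched, so `p` may point at any address, aligned or not.
#define LOAD_BE32(p)                                   \
    ((static_cast<std::uint32_t>((p)[0]) << 24) |      \
     (static_cast<std::uint32_t>((p)[1]) << 16) |      \
     (static_cast<std::uint32_t>((p)[2]) << 8)  |      \
      static_cast<std::uint32_t>((p)[3]))

// No-swap counterpart: move exactly 4 bytes, again byte by byte.
#define COPY4(dst, src)          \
    do {                         \
        (dst)[0] = (src)[0];     \
        (dst)[1] = (src)[1];     \
        (dst)[2] = (src)[2];     \
        (dst)[3] = (src)[3];     \
    } while (0)
```

Whether this really beats a 4-byte memcpy depends on the compiler, since many will inline a small fixed-size memcpy, so it is worth checking the generated code on your target.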