
Member Since 27 Aug 2002
Offline Last Active Apr 30 2015 11:54 AM

#5226467 Extremely modular software architecture: GOOD or BAD?

Posted by Krohm on 30 April 2015 - 03:55 AM

The question you should ask yourself is not whether being modular is good or bad, or whether being abstracted is good or bad.

The question should be: is my system "connected" to my real-world problem?


As you go over the design again and again and grow more abstract, you drift away from the real design. You know that already.

Always keep in touch with your design. Artistic decisions do influence your system. Consider:

  • We need massive smoke-simulating particle systems VS our most intensive particle system simulates sparks from a hanging cable and counts 20 triangles per frame
  • We need continuous streaming of our massive graphics VS this is 2015, just load everything into memory.

I encourage you to write "exploratively". That means, yes, write stuff you're not sure about.

Once you start building on top of the exploratively-designed stuff, it will be the right time to iterate on it. At that point, you'll hopefully have a better understanding of the problem.

If something is "disconnected" from your problem, it should be allowed to exist only if it preserves your sanity.



Be careful with scripting. You must think about what you want the scripts to do and not do. I would honestly leave it out for a few iterations.

#5225818 Direction Booleans into single Integer

Posted by Krohm on 27 April 2015 - 06:21 AM


#5225815 Character control from a Physics Engine perspective

Posted by Krohm on 27 April 2015 - 06:14 AM

I started with a dynamic controller, but it got really messy and instable ...
I'm seconding this. Dynamic controllers just don't blend with logic easily and they are a huge pain at artist-level.


For an FPS or a platformer a kinematic controller is a must.

#5224973 GPU Ternary Operator

Posted by Krohm on 23 April 2015 - 12:07 AM

I understand why the ternary operator branches on the GPU if there is any in-line computation, like:

// Branching on the (value+1)!
float value = 3;
value = (value < 4) ? (value) : (value+1);

Where is this written? The AMD IL compiler (the compiler that turns the compiled HLSL/GLSL/CL-C bytecode into HW ISA) transforms that into a register selection. When the condition evaluates to a vector type, it is even defined to be equivalent to an intrinsic. I'm pretty sure you're not considering OpenCL-C here. This is just FYI; I take some liberties, as there are no operators at GPU level either.
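What a "register selection" means can be sketched on the CPU. This is only an illustration in Python, not what any particular shader compiler emits: the helper name `select` is made up here, and the arithmetic blend is a stand-in for a hardware conditional move (unlike a real select, the blend would misbehave with infinities or NaN). The point is that both sides are computed and one result is picked, with no jump.

```python
def select(condition: bool, if_true: float, if_false: float) -> float:
    """Pick one of two already-computed values without an if/else.

    Hypothetical helper: the blend below only imitates a conditional move.
    """
    mask = float(condition)  # 1.0 when the condition holds, 0.0 otherwise
    return mask * if_true + (1.0 - mask) * if_false


value = 3.0
# Both candidates (value and value + 1) are evaluated up front,
# then one of them is selected.
value = select(value < 4.0, value, value + 1.0)
print(value)
```

Whether a given compiler actually does this for a given ternary is, again, up to the implementation.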


TL;DR: it's implementation dependent.


High-level languages don't have the necessary control to specify how divergent code can be. That's a good problem. There should be if modifiers.

#5223610 ascii game camera

Posted by Krohm on 16 April 2015 - 02:05 AM

What he's trying to do is scroll blocks by characters, one at a time.

Instead, since he takes the player x and maps it to a character by dividing by the block size, he only gets the block the player is in.


Solution: divide the character position by the block size, keep the original value (which is probably already there) and also take the modulo by block size. This gives you the position of the character relative to its containing char tile.
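A minimal sketch of that division plus modulo, assuming integer positions and a made-up block size of 8 (the names `BLOCK_SIZE` and `split_position` are illustrative, not taken from the original code):

```python
from typing import Tuple

BLOCK_SIZE = 8  # assumed tile width, in the same units as the player position


def split_position(player_x: int, block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Return (tile index, offset inside that tile) for a world position."""
    tile, offset = divmod(player_x, block_size)
    return tile, offset


print(split_position(19))  # (2, 3): third tile, 3 units into it
```

The tile index is what the division alone already gave; the offset is the extra piece needed to scroll by less than a whole block.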


At this point you have to update your char tile drawing routines to draw only part of a tile. Maybe with advanced console I/O you can also draw outside the visible area; should that be the case you wouldn't need to mess with sub-tile drawing. I'm not aware of those possibilities.

#5222197 Relation between TFLOPS and Threads in a GPU?

Posted by Krohm on 09 April 2015 - 04:37 AM

There's no relationship because GPUs have no real "threads" in the CPU sense. The choice (of DirectCompute) to use the word thread is so bogus I cannot believe it made it into the final documentation. That said...


It depends on the chip family and even on the specific segment.

The number of operations executing per second is just: $$ops = processingElements \times clockRate_{Hz}$$



Which takes us to the magic world of processing elements: what are those?

They are the part of the ALU carrying out the useful work. Many people think 1 PE ~ 1 thread, and given current GPU capabilities you can indeed work that way. But in practice 1 PE is a much more fine-grained element and you are given the choice of how to set up the PEs to make up "threads".


The native concept of a "thread" for a GPU is the Wavefront (AMD GCN, OpenCL) or the Warp (NV). They are basically the same thing: packs of 64 (AMD) or 32 (NV) processing elements.

I am going to use the word thread for your convenience, but be warned it is an inaccurate term.

Marketing wants you to believe a PE is a thread, but if that were the case then a single CPU thread using SSE would be quad-threaded.


The number of "threads" executing at a given time (assuming you always saturate the device) is

$$threads = processingElements / threadSize$$


So for example the GM200 Titan X has 3072 "cores" (marketing jargon) which are really 3072 PEs (CL jargon). With a warp size of 32, you have 96 threads in flight. WRONG! You have 96 warps!

If tomorrow NV decides their warp size becomes 16, you'll have 96*2 warps.

This is at each given clock.
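The two formulas can be checked with a few lines of arithmetic. The 3072 PE count and the warp sizes come from the text above; the 1 GHz clock is a round placeholder chosen for illustration, not the actual clock of any card, and the function names are made up.

```python
def ops_per_second(processing_elements: int, clock_rate_hz: float) -> float:
    """ops = processingElements * clockRate (one operation per PE per clock)."""
    return processing_elements * clock_rate_hz


def warps_per_clock(processing_elements: int, warp_size: int) -> int:
    """threads = processingElements / threadSize, where a 'thread' is a warp."""
    return processing_elements // warp_size


TITAN_X_PES = 3072          # from the text above
PLACEHOLDER_CLOCK_HZ = 1e9  # assumed round number, for illustration only

print(ops_per_second(TITAN_X_PES, PLACEHOLDER_CLOCK_HZ))
print(warps_per_clock(TITAN_X_PES, 32))  # 96
print(warps_per_clock(TITAN_X_PES, 16))  # 192, i.e. 96*2
```

Note that the operation count does not depend on the warp size at all, which is the whole point: how PEs are grouped into "threads" says nothing about throughput.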


During processing, the GPU will switch across several warps. The number of warps in flight depends on the device and on the actual program being executed. There's usually an upper bound, but I'm not well versed in NV architectures.


EDIT: I messed up the second formula somehow.

#5218417 Amateur Looking For Advice On Where To Go Next

Posted by Krohm on 23 March 2015 - 03:24 AM

I'm not sure Java is going anywhere so I support your idea of going away.

I honestly wouldn't start from C++ nowadays. C# and JavaScript are better candidates in my opinion for the time being.


I'm pretty sure UE4 allows some fairly extensive scripting without even writing code (through Blueprints). I strongly suggest playing a bit with some engine, maybe only from a level design perspective. It allows you to keep a view of the target at a higher level. It likely doesn't make you a better programmer, but if you want to ship a whole product, you must think about the whole picture.

#5217627 Mantle programming guide and white paper released

Posted by Krohm on 19 March 2015 - 07:20 AM

Pretty ironically,


From Anandtech

So pull up a chair, get comfortable, and find large quantities of caffeine as this isn’t the sort of material for a quick read – the PDF weighs in at a hefty 435 pages. That’s pretty much par for the course when it comes to API guides though – the Direct3D 11 API is almost certainly just as long (though I couldn’t seem to find a comparable PDF).


OpenGL 4.5 core: 825 pages.

OpenGL 4.5 compatibility: 999 (yikes!) - no idea how much is really shared

GLSL: 209 pages.




#5215991 Decompressing PNG / JPG images on the GPU

Posted by Krohm on 12 March 2015 - 12:42 AM

I'm pretty sure AMD has a GPU-accelerated media pipeline. No idea how much is available, how much is GPU, how much is their own internal ASICs. Odds are it works with their GPUs only, anyway...




Give up, man. This is only going to be painful. D3D9-level devices do not have enough flexibility, and compared to modern CPUs they might not have enough of a performance advantage either. D3D9-SM3 devices might have been worth talking about in 2008. Maybe.

#5214730 Placing enemies in the map/world

Posted by Krohm on 05 March 2015 - 07:59 AM

Interesting question.


As I've tried to prototype a SHMUP game some time ago I can appreciate some complications.

Is your game something like this?


Personally I haven't found existing editors much of a help - maybe I haven't searched hard enough. The main problem is not so much the code but rather figuring out the numbers to put there, as enemy patterns must be effectively authored in screen space and easily visualized with full time control for fast iteration/tweaking.


What I would do today: 1) google for tools again 2) hack together an HTML5 <canvas> utility to pour out JSON data.


This is in contrast to more static designs such as in an FPS: I've had no trouble adapting Blender there, but note enemy movements in that case are a very different thing. Given the number of paths, it would be quite inconvenient to pack this data in Blender.

#5213844 Dynamic Memory and throwing Exceptions

Posted by Krohm on 02 March 2015 - 02:27 AM

It's maybe worth recalling that not everybody is AAA. Exceptions are extremely handy, and considering the first few posts of this thread are clearly written by someone who doesn't have an accurate view of what's going on, I'd suggest sticking to what C++ prescribes as canon as long as there isn't a specific product to talk about.

#5213151 Component-based architecture - entity and component storage

Posted by Krohm on 26 February 2015 - 01:06 PM

As a side note, if physics is your thing, play with some physics API first!



Then you have a really weird definition of a component. His game objects are very clearly composed and not monolithic. That's all using components means, in any context; ECS is _hardly_ the first place the word "component" has ever been used in computer science or even game development.

Well, you got me there. I should have been more explicit: the word component in this case is meant exclusively in the CES sense.


#5213091 Component-based architecture - entity and component storage

Posted by Krohm on 26 February 2015 - 09:06 AM

No idea what exactly is going on there but what you have done isn't a component thing to me.

Just because you can put arbitrary "component" object handles inside an array, which allows you to build "entities", does not mean you are component based.

The above is not component based either; it's switching behaviors exactly like a monolithic entity would do. I'll agree that it adds some very slight flexibility.


No idea what a "physics" component is supposed to be either. I assume it is a rigid body representation.


Here is ECS, condensed to its core.


There are no entities. There are only the components.

See Fig-2.gif



The message "between the lines", showcased in the above diagram, is: the execution/update of components is independent from other types and can be - in theory - completely asynchronous. I dare anyone to write a fully async, fully ECS-only system, but that's for another time.


So, what does that mean? It means basically the opposite of this:

One idea that came to mind was having a vector of pointers for each type of component and passing the corresponding vector to the corresponding system.

You should really have the components exist in the systems only and link to them when needed, rather than keeping them floating around and putting them back in when needed. Seriously, where do you think rigid bodies are going to go each frame? In the physics library. Where do you think the models will go if not in the rendering subsystem? There is really no point in taking them out: you take out references / pointers to them and let them live in their own land, using a base class or a proxy of some sort. Internally the subsystem accesses everything it needs, while externally you don't.
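A rough sketch of "components live in their system, the outside only holds handles". Everything here is hypothetical (`PhysicsSystem`, `RigidBody`, `create_body` and the 1D motion are invented for illustration, not taken from any physics library); it only shows the ownership direction being argued for.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RigidBody:
    position: float
    velocity: float


class PhysicsSystem:
    """Owns every rigid body; callers only ever hold integer handles."""

    def __init__(self) -> None:
        self._bodies: Dict[int, RigidBody] = {}
        self._next_handle = 0

    def create_body(self, position: float, velocity: float) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self._bodies[handle] = RigidBody(position, velocity)
        return handle

    def step(self, dt: float) -> None:
        # The system iterates its own storage; nothing is passed in per frame.
        for body in self._bodies.values():
            body.position += body.velocity * dt

    def position_of(self, handle: int) -> float:
        return self._bodies[handle].position


physics = PhysicsSystem()
player_body = physics.create_body(position=0.0, velocity=2.0)  # just a handle
physics.step(0.5)
print(physics.position_of(player_body))  # 1.0
```

An "entity" in this picture is nothing more than whatever bag of handles the game code chooses to keep together.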

#5211676 Problems with partial OpenCL kernel dispatch

Posted by Krohm on 19 February 2015 - 08:00 AM

Wait, you can specify read_only on globals? It was my understanding it was for image objects only.

Parameter 7 to clEnqueueNDRangeKernel is currently (int)ArraySize(waitOn). Leaving aside that it is a cl_uint, the events pointed to must complete and I have no idea what is going on with them.


Considering PNG typically goes with integers, I would also check the way you mangle the resulting data.


Finally, some drivers have watchdogs and will kill dispatches if they take too much time to run. Considering the inner loop seems to be doing nothing (the value is trashed right away) I think that's fairly indicative. Am I missing some side effect?

#5211671 terrain editor resolution based on height

Posted by Krohm on 19 February 2015 - 07:39 AM

i want to have the ability to adjust the resolution on the area where the terrain are raised/lowered,

Interesting idea in theory. In practice this would require a perfectly regular structure such as the point grid to become irregular. I guess you could do some sort of quadtree to provide more resolution. I remember a paper about quadtree-accelerated parallax occlusion mapping which could be adapted to your uses. It's not very complicated but I have doubts it's really worth it. Last time I checked the Unreal developer network it looked like not even Unreal (3) had support for that so I have doubts about its usefulness.


how do i handle or store the height data? right now the height data are just stored in a 2x2 array and each array is equivalent to one vertex in terrain grid

That's fairly peculiar. Why are you doing that? I would just use a 16-bit grayscale map. Or perhaps RGBA32 for super extra precision, but I don't see much of a point in thinking of this as a bidimensional sample of sorts. Please elaborate, I'm curious.
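For reference, a 16-bit grayscale height map is just one flat buffer of unsigned 16-bit samples, one per grid vertex. A minimal sketch, assuming row-major layout and clamping on edit (the class and method names are invented; `array("H")` is an unsigned short, which is 16 bits on common platforms):

```python
from array import array

MAX_HEIGHT = 65535  # largest value a 16-bit sample can hold


class HeightMap:
    """Regular grid of 16-bit height samples stored in one flat buffer."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        self.samples = array("H", [0] * (width * height))

    def get(self, x: int, y: int) -> int:
        return self.samples[y * self.width + x]

    def raise_by(self, x: int, y: int, delta: int) -> None:
        """Raise or lower one vertex, clamping to the 16-bit range."""
        index = y * self.width + x
        new_value = self.samples[index] + delta
        self.samples[index] = max(0, min(MAX_HEIGHT, new_value))


terrain = HeightMap(4, 4)
terrain.raise_by(1, 2, 300)
print(terrain.get(1, 2))  # 300
```

Saving this buffer out as a 16-bit grayscale image is then a matter of the image library in use, which is outside this sketch.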


I'm pretty sure what you're looking for can be done using quadtrees.