

Member Since 13 Sep 2012

#5270379 Question about data and resources protection

Posted by TheChubu on 10 January 2016 - 06:03 AM

Multiplayer games will end up checking these files as well, as it is trivial to modify these resources to gain a competitive advantage.
Very true. I was thinking more along the lines of an MMORPG. Wouldn't want anyone replacing the smoke grenade sprite with an empty one in a game like CS:GO or CoD.


I believe I once heard of a game where the players would lower the graphics settings to a minimum, since cloaks/smoke grenades at low quality settings affected visibility less than at higher quality. That sounds like a hard nut to crack : /

#5270322 Question about data and resources protection

Posted by TheChubu on 09 January 2016 - 06:15 PM

If it's a single player game, that stuff will be local and what Josh/Swift/Sean said applies.


If it's a multi player game, the important data will be on the server (player progression, stats, characters, etc), so it shouldn't be modifiable by the player, and the unimportant data (models, textures, sounds) will be local resources, for which, again, what Josh/Swift/Sean said applies.


Being overly paranoid with the game data is annoying from the POV of the users. More often than not some parameter isn't exposed in the options menu, and people have to dig through the game's files to fix it and play their game. A typical scenario in older games: the game resolution/refresh rate was set higher than what the monitor supported, so people would edit the config files so they could at least launch the game.


Easy modification is what allowed modders to fix hundreds, if not thousands, of big and small bugs in the Elder Scrolls series. And being pragmatic, if your users go out of their way to fix something in your game and make said modification available to your other users, that's good for you: someone just handed you their time on a silver platter, free of charge.


Let them mod.

#5269710 D3D11 Shadow Mapping with Deferred Rendering

Posted by TheChubu on 06 January 2016 - 05:52 PM

When exactly is not relevant, it just needs to happen before the light pass.


In the light pass, for the lights that can cast shadows, you sample the shadowmap and compute lighting from there (instead of doing the normal pass without shadows). That result gets added to the light accumulation buffer. You add there all lighting results (shadowed or non-shadowed).
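
A rough CPU-side sketch of that accumulation, purely for illustration. In practice this happens in the light shader with additive blending into the accumulation buffer; here `visibility` is a made-up stand-in for the result of the shadowmap comparison (1 = fully lit, 0 = fully shadowed), and all names are hypothetical:

```java
/** Illustrative only: sums light contributions for one pixel. */
final class LightAccumulation {
  /**
   * contribution[i]: lighting from light i without shadows.
   * castsShadow[i]:  whether light i has a shadowmap.
   * visibility[i]:   stand-in for the shadowmap test result, ignored for non-casters.
   */
  static float accumulate(float[] contribution, boolean[] castsShadow, float[] visibility) {
    float sum = 0f;
    for (int i = 0; i < contribution.length; i++) {
      // Shadow casters get attenuated; everything else is added as-is.
      sum += castsShadow[i] ? contribution[i] * visibility[i] : contribution[i];
    }
    return sum;
  }
}
```

Both kinds of lights end up in the same sum, which is the point: the accumulation buffer doesn't care whether a result was shadowed or not.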

#5269538 How to support multiple rendering paths in my engine?

Posted by TheChubu on 05 January 2016 - 08:34 PM

The way I see it, the hard part isn't the shaders but the data flow:


  • With deferred you just collect all opaque objects, draw them, then collect all lights, then draw them (more or less).
  • With forward shading you have to do everything simultaneously, ie, grab an opaque object, collect data from lights that affect it, then draw.
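
The two flows above could be sketched like this. It's a minimal sketch with invented types (`Renderable`, `Light`, a crude 1D `affects` test) that only logs what would be drawn; the loop structure is what matters, not the names:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical scene object; x is a stand-in for a real position. */
final class Renderable {
  final String name; final float x;
  Renderable(String name, float x) { this.name = name; this.x = x; }
}

/** Hypothetical light with a crude 1D influence test instead of real culling. */
final class Light {
  final String name; final float x; final float radius;
  Light(String name, float x, float radius) { this.name = name; this.x = x; this.radius = radius; }
  boolean affects(Renderable r) { return Math.abs(r.x - x) <= radius; }
}

final class RenderPaths {
  /** Deferred: geometry and lights are handled in two independent passes. */
  static List<String> deferred(List<Renderable> opaque, List<Light> lights) {
    List<String> log = new ArrayList<>();
    for (Renderable r : opaque) log.add("gbuffer " + r.name);
    for (Light l : lights) log.add("light " + l.name);
    return log;
  }

  /** Forward: each object needs its affecting lights gathered before its draw. */
  static List<String> forward(List<Renderable> opaque, List<Light> lights) {
    List<String> log = new ArrayList<>();
    for (Renderable r : opaque) {
      StringBuilder sb = new StringBuilder("draw " + r.name + " with");
      for (Light l : lights) if (l.affects(r)) sb.append(' ').append(l.name);
      log.add(sb.toString());
    }
    return log;
  }
}
```

Notice that the forward path needs the object/light relationship resolved per draw, which is the part that tends to leak into the rest of the engine.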


I'm not even sure if it's worth it. I'd focus on having one good render path first, since that will take a lot of time anyway...

#5269520 [SOLVED] Uniform buffer actually viable?

Posted by TheChubu on 05 January 2016 - 06:48 PM

It also does really feel like a "hack", and it really only works with instancing.


You think that is a hack? Well, Unity stores instance ID numbers as part of the texture coords, so I'd say it's an improvement :D And I remind you that you're using textures to store things that aren't textures; what is that if not a hack? :P Providing an instance ID has always been a hack because the API doesn't provide a direct way to do it. Only very recently did we get gl_DrawID, and even that isn't widely supported nor does it perform well. It's an issue you need to work around given what you have.


Anyway, here is Mathias' explanation of how you can have IDs for whatever you want to draw, which is what's implemented in Ogre3D:




TL;DR: Set up a fixed attribute with indices from 0 to MAX_INSTANCES once, and just manipulate it with the instanced call's instance ID. Oh, and before you mention "I'd have to draw everything with instanced draw calls!", Mantle doesn't even have non-instanced drawing IIRC, and Vulkan probably won't either, so I'd suggest getting used to it.


 so this is technically doable for some shaders where it's clear how much data will be written from the start


Not really necessary. You don't need to know that; you just need to know the size of the data your struct instance contains and how large a range glBindBufferRange can handle on your GPU. Like this:


// Pseudocode. Allocate temp buffer up to max bindable ubo range.
Buffer buffer = alloc(MAX_UBO_RANGE);
// Uploaded task counter.
int tasksUploaded = 0;
while (!buffer.full() && !tasks.empty())
{
  Task t = tasks.next();
  // Write the task's per-instance data (write() is a placeholder name).
  // Pad to vec4 if necessary here.
  buffer.write(t);
  ++tasksUploaded;
}
// Here ring buffer works its magic.
ubo.bindNextRange(TRANSFORM_SLOT, buffer.size());
// Draw what you have uploaded so far.
drawTasks(tasks, tasksUploaded);


There, something like that. It's just an iteration: write all the data to a buffer, then glBufferSubData or glMapBuffer it to your UBO. Then repeat until you have drawn everything.


(Here you can find a nice explanation on the internal differences between UBOs and TBOs http://www.yosoygames.com.ar/wp/2015/01/uniform-buffers-vs-texture-buffers-the-2015-edition/ )


Again, this reduces state changes and interaction with the driver dramatically. A similar technique is suggested in an NVIDIA GTC presentation, although they do things slightly differently. Look it up, it was called something like "Advanced OpenGL scene rendering". There are various PDFs around with data on how indexing in the shader impacts performance vs the amount of time it saves on the CPU side (an overall win even on older hardware, from what I read).


Graham Sellers, of AMD/Mantle/Vulkan/Modern OpenGL fame, also suggested in an Ogre3D thread storing all meshes in as few buffers as possible, separated only by vertex input format. Here, read the whole thread, there's good stuff in it:




In that way you can also reduce all the buffer/vao binding to a minimum.


I don't think I'm able to do that from Java besides hoping that the memcopy function I'm using for unsafe memory access outside the Java heap (= C performance) does that under the hood, which seems unlikely.


Now that you mention it, maaaybe HotSpot does something like that. Although it probably works just for copies between Java arrays. Maybe Spasi can do something about this in LWJGL...

#5269326 Questions About Blur Effect

Posted by TheChubu on 04 January 2016 - 09:02 PM

This Intel article is a good resource. It discusses blurring techniques, optimizations and other considerations:



#5269275 Java, still being a good option for game dev in 2016 or there are other optio...

Posted by TheChubu on 04 January 2016 - 04:41 PM

The single biggest problem you're going to have with Java is version-of-the-week hell.  Write once run anywhere is a joke. 
Have you had an issue where desktop Java applications broke in between releases of the same Java version (ie, from Java 7 u60 to Java 7 u72 or something)? Otherwise it's an unfounded myth.


I've run my projects in more VM versions of OpenJDK and OracleJDK than I can count, and played Minecraft with all the damn updates that came every two weeks or so, and never had an actual VM compatibility problem. They do take retro compatibility very seriously (to a fault even), which is why no API was ever actually removed from the runtime.


Hell, I've never in years had an issue where Eclipse would crash because some VM was incompatible, and Eclipse is a massive application.


And please, don't even mention applets. They shouldn't exist; it's a Good Thing™ that most of them stopped working.


In any case, there wouldn't be a single damn problem if you could just provide a link to download the JRE, but Oracle is composed top to bottom from a pile of steaming stinky assholes and they bundle crapware with their damn JRE distributions (luckily they don't do it with the JDK). So yeah, bundle a JRE (20MB to 40MB; the libGDX guys provide a tool to reduce the size of the VM by removing unwanted crap).


Still this is an issue that you will have in some measure whatever you choose. C# needs the .NET runtime (or Mono depending on the platform, which is a whole other issue altogether), Java needs the JRE, C++ will need whatever MSVC runtime you're using (or some specific glibc version depending on the platform), etc.


The answer to all of those is: Ship the dependencies with your application, and save yourself a headache.

#5269080 [SOLVED] Uniform buffer actually viable?

Posted by TheChubu on 03 January 2016 - 05:51 PM

Here: http://www.gamedev.net/topic/655969-speed-gluniform-vs-uniform-buffer-objects/ I ended up implementing the idea I had at that time.


There was also a discussion with Mathias that I can't seem to find, where he explained the instanceId in more detail.


Anyway, say that we have some per instance data. Like mv and mvp matrices:


That's a single struct:


struct Transform
{
  mat4 mvp;
  mat4 mv;
  // Then some padding to respect std140 if necessary. 12 bytes / vec3 at most.
};


Now, that's 128 bytes per struct, right? If you wanted to place them sequentially in a buffer and bind the range for each Transform struct, yeah, you'd need to place 128 bytes, then pad, for every Transform instance.


Let's say we've got our typical 64kB UBO and we define it like this:

layout (std140, binding = TRANSFORM_SLOT) uniform TransformsBlock
{
  Transform[MAX_TRANSFORMS] transforms;
};


Where MAX_TRANSFORMS is the max UBO size divided by the Transform instance size. Given our 64kB UBO, that'd be 512 instances, tightly packed.
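
To make the numbers concrete, here is a small sketch of the arithmetic. The 256 byte offset alignment used in the test values is an assumption (the real value comes from GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT and varies per driver), and the class and method names are made up:

```java
/** Illustrative arithmetic for how many Transform structs fit in a UBO. */
final class UboBudget {
  static final int MAT4_BYTES = 16 * Float.BYTES;     // 64 bytes
  static final int TRANSFORM_BYTES = 2 * MAT4_BYTES;  // mvp + mv = 128 bytes

  /** Rounds size up to the next multiple of alignment (alignment > 0). */
  static int alignUp(int size, int alignment) {
    return ((size + alignment - 1) / alignment) * alignment;
  }

  /** Instances that fit when every instance gets its own aligned bind range. */
  static int paddedCapacity(int bufferBytes, int offsetAlignment) {
    return bufferBytes / alignUp(TRANSFORM_BYTES, offsetAlignment);
  }

  /** Instances that fit when the whole array is bound as a single range. */
  static int packedCapacity(int uboBytes) {
    return uboBytes / TRANSFORM_BYTES;
  }
}
```

With a 64kB buffer, `packedCapacity(64 * 1024)` gives the 512 from above, while `paddedCapacity(64 * 1024, 256)` would only give 256, since each 128 byte struct would be followed by 128 bytes of padding.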


Now the issue here is that while now you don't need to pad, since you're binding a lot of transforms at the same time, you need to index into the array to get the proper one for whatever you're drawing. There are many ways of providing an instance index, like with an additional attribute, with a normal uniform, with the instance ID, with a combination of instance ID and a vertex attribute, etc. I think Mathias talked about the instance indexing in the Vulkan thread, can't remember.


Anyway, once you've got the index per instance uploaded, it's as straightforward as:


mat4 mvp = transforms[instanceId].mvp;


There, now you just need to bind the whole range to that TRANSFORM_SLOT, no more padding in between instance data.


Also, keep in mind that you shouldn't put everything into a single buffer. Separate the data into "globals" (stuff that never changes), per frame parameters, and per instance parameters, and choose appropriate update strategies for each. Mapping a global or per frame buffer for a single tiny update is probably a waste; glBufferSubData would suffice.


The strategy I use right now is to have a sort of ring buffer of a couple MB. I compute the maximum amount of instances I can upload in a single pass, given the kind of blocks that pass needs (say, TransformsBlock and MaterialBlock).


Say that the max is 512. First I bind the ring buffer. Then iterate over the transform data, upload those 512 instances, then bind that range to the transform slot. Then iterate over the material data, upload 512 instances, then bind that range to the material slot. Then draw those 512 instances. Rinse and repeat for the rest of the draw tasks. The only padding I have is in between the kind of block I'm updating. Each block itself has its internal array of structs tightly packed.


Since it's a ring buffer, I just upload to the next available range until it wraps around and starts again. By the time it wraps around, that data will be quite a few frames old if you give it a couple of megabytes.
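
A minimal sketch of the offset bookkeeping for such a ring buffer, assuming nothing about the actual GL calls. `RingRangeAllocator` is a made-up name, and this version does no GPU synchronization at all: like the approach described above, it simply relies on the buffer being big enough that wrapped-over data is no longer in use.

```java
/** Hands out aligned byte ranges from a fixed-size buffer, wrapping at the end. */
final class RingRangeAllocator {
  private final int capacity;
  private final int alignment;
  private int head = 0; // first byte past the most recently handed out range

  RingRangeAllocator(int capacity, int alignment) {
    this.capacity = capacity;
    this.alignment = alignment;
  }

  /** Returns the start offset of the next range of the given size. */
  int next(int size) {
    if (size > capacity) throw new IllegalArgumentException("range larger than buffer");
    // Bind offsets must be aligned, so round the head up first.
    int start = ((head + alignment - 1) / alignment) * alignment;
    // Not enough room left: wrap around and overwrite the oldest data.
    if (start + size > capacity) start = 0;
    head = start + size;
    return start;
  }
}
```

The returned offset is what you'd upload to and then pass as the offset of the bind range call. The padding between block kinds mentioned earlier falls out of the round-up in `next`.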


That means that I can draw 10 thousand different things with 20 updates, 20 bind ranges, and only one buffer binding (drawcall count is a different matter, ideally you can also draw each instance batch with a single draw).
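
For reference, the 20 is just a ceiling division of the instance count by the batch size, counted per kind of block. A tiny sketch with a made-up helper name:

```java
final class BatchMath {
  /** Ceiling division: batches needed to cover all instances. */
  static int batchesFor(int instances, int instancesPerBatch) {
    return (instances + instancesPerBatch - 1) / instancesPerBatch;
  }
}
```

`BatchMath.batchesFor(10000, 512)` comes out at 20, the last batch being only partially filled.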


You can get smarter and pack UBOs in a way to reduce the calls even further via passing reduced forms of the matrices, packing different kinds of data into the same struct (ie, instead of having separate slots for transforms and material, just put them in the same struct), uploading all of the instance data in one step, and then just do a loop of bindRange-draw for all of them, and so on. That way you can handle batches of thousands with a dozen calls tops.

#5268936 Criticism of C++

Posted by TheChubu on 02 January 2016 - 09:55 PM

Yup. Language wars are silly because it doesn't make sense to use only one language in the first place. Every language has its weak points and every language except for Java has its strong points.
:D I'm above that shit Khat, I'll just upvote you.

#5268767 Benefits from manual function loading.

Posted by TheChubu on 01 January 2016 - 08:24 PM

I was wondering if there was any benefit to manually loading functions for OpenGL rather than using a library like GLEW or similar.
If your goal is to waste your time, then that would be one benefit.


Use a lib: GLFW, GLEW, SFML, SDL, etc. OpenGL is hard enough as it is, you don't need to artificially create more complexity around it.

#5268750 Help understanding glewinfo.txt

Posted by TheChubu on 01 January 2016 - 06:50 PM

There are two things in play here: The OpenGL context that you can create and the OpenGL features you can use.


As you guessed, with your card in Windows you can create an OpenGL 3.1 context, and an OpenGL 3.3 context in Linux.


What does that mean? That you can use all the features up to that context version. Contexts are inclusive, so an OpenGL 3.3 context has the 3.0, 3.1 and 3.2 features. If your card didn't support a 3.2 feature but supported all of the 3.3 features, that doesn't mean you can create a 3.3 context, since you're missing the required 3.2 features, and 3.3 includes 3.2.


Now, while your card supports those contexts, it also supports some of the features of other OpenGL versions. Thing is, since it only supports some of those features, your card can't give you a higher context. However, you can use those other features as extensions.


So you create the highest context your card supports, and if you want, import the rest of the features your card supports as extensions. That way while you can't say, create an OpenGL 4.2 context, you could use some of the features introduced in that version via extensions.
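
In code, the check boils down to "is it core in my context, or is the extension exposed?". Here's a minimal sketch with a made-up `GlCaps` class. In a real application you'd presumably fill it from glGetIntegerv(GL_MAJOR_VERSION / GL_MINOR_VERSION) and glGetStringi(GL_EXTENSIONS, i), or let your loader library do it for you; here the values are just passed in:

```java
import java.util.Set;

/** Hypothetical snapshot of what the created context offers. */
final class GlCaps {
  final int major;
  final int minor;
  final Set<String> extensions;

  GlCaps(int major, int minor, Set<String> extensions) {
    this.major = major;
    this.minor = minor;
    this.extensions = extensions;
  }

  /** True if the context version is at least maj.min. */
  boolean atLeast(int maj, int min) {
    return major > maj || (major == maj && minor >= min);
  }

  /** A feature is usable if it is core in this context or exposed as an extension. */
  boolean canUse(int coreMajor, int coreMinor, String extensionName) {
    return atLeast(coreMajor, coreMinor) || extensions.contains(extensionName);
  }
}
```

For example, texture storage became core in OpenGL 4.2, so on a 3.3 context `canUse(4, 2, "GL_ARB_texture_storage")` is only true if the driver lists that extension.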


For example, I use a 3.3 core context in my application, but most of the D3D10 cards out there support a few useful extensions of higher OpenGL versions, like arb_texture_storage, arb_shading_language_420pack, etc.

#5267051 My A* Hierarchical pathfinding.

Posted by TheChubu on 19 December 2015 - 12:08 PM

I see various places where you can simplify equality tests a bit:


// In Node:

// This is a more canonical equals, with the instanceof operator, which is a bit more robust.
@Override
public final boolean equals(Object o) {
  if (this == o) return true;
  if (o == null || !(o instanceof Node)) return false;
  return this.equalsNonNull((Node) o);
}

// Now we do an equals for when we know we're comparing nodes.
public final boolean equals(Node o) {
  if (this == o) return true;
  if (o == null) return false;
  return this.equalsNonNull(o);
}

// Most reduced scenario, we know it's a Node and that it isn't null.
public final boolean equalsNonNull(Node o) {
  // Here we can use the reduced equalsNonNull of Coordinate.
  // If the position could be null, use equals(Coordinate) instead.
  return this.position.equalsNonNull(o.position);
}

// Now equals for Coordinate objects, in Coordinate:

// Most generic case.
@Override
public final boolean equals(Object o) {
  if (this == o) return true;
  if (o == null || !(o instanceof Coordinate)) return false;
  return this.equalsNonNull((Coordinate) o);
}

// Equals for when we know we're comparing coordinates.
public final boolean equals(Coordinate o) {
  if (this == o) return true;
  if (o == null) return false;
  return this.equalsNonNull(o);
}

// Most reduced scenario, we know it's a Coordinate and that it isn't null.
public final boolean equalsNonNull(Coordinate o) {
  return (this.x == o.x && this.y == o.y);
}


And use them where they're needed given assumptions in the surrounding code (ie, you don't always need the generic equals(Object), sometimes you know you're comparing Node/Coordinate, or that something can't be null).


Also, reduce your objects: don't use a "Coordinate" class that only has two fields. Simply put the x/y on the Node or something.
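
Something along these lines, as a minimal sketch. `FlatNode` is a made-up name and a real node would obviously carry more than the coordinates (cost, parent, etc). I've also added a hashCode consistent with equals, since the nodes will probably end up in hash-based collections:

```java
/** Hypothetical flattened node: the coordinates live directly on the node. */
final class FlatNode {
  final int x;
  final int y;

  FlatNode(int x, int y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    // instanceof is false for null, so no separate null check is needed.
    return (o instanceof FlatNode) && equalsNonNull((FlatNode) o);
  }

  /** Reduced comparison for when the caller knows o is a non-null FlatNode. */
  boolean equalsNonNull(FlatNode o) {
    return x == o.x && y == o.y;
  }

  @Override
  public int hashCode() {
    return 31 * x + y;
  }
}
```

Comparing two nodes now touches only the nodes themselves, with no extra hop to a separate coordinate object.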


For each object you have 12 bytes of overhead, and each object access is more or less a pointer indirection (unless HotSpot can work some magic there too). So a typical "Position2D" object would look in memory like this:


12 bytes overhead

+ 4 bytes x coord

+ 4 bytes y coord

+ 4 bytes for 8 byte alignment (everything is 8 byte aligned).


Total of 24 bytes and one mandatory indirection for only 8 bytes of useful data. So flatten your structures a bit. Also look up alternative HashMap and ArrayList implementations, like the ones from fastutil.
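
The same sum as a small sketch, in case you want to play with other field layouts. The 12 byte header and 8 byte alignment are the figures assumed above (they depend on the VM and its settings), and `ObjectFootprint` is a made-up helper, not something the JDK provides:

```java
/** Back-of-the-envelope shallow size estimate under the assumptions above. */
final class ObjectFootprint {
  static final int HEADER_BYTES = 12; // assumed object header size
  static final int ALIGNMENT = 8;     // assumed object alignment

  /** Header plus fields, rounded up to the next multiple of ALIGNMENT. */
  static int shallowSize(int fieldBytes) {
    int raw = HEADER_BYTES + fieldBytes;
    return ((raw + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
  }
}
```

`ObjectFootprint.shallowSize(2 * Integer.BYTES)` gives the 24 bytes from the breakdown: 20 bytes of header and fields, rounded up to the next multiple of 8.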


Can't help with the algorithmic complexity though :D

#5266369 good approach to randomly generate the contents of a dungeon or room

Posted by TheChubu on 14 December 2015 - 05:35 PM



... hehehehehehehe



#5266055 Cross Platforming: Switching to Java?

Posted by TheChubu on 12 December 2015 - 01:43 PM

Oh, well, but is not maintaining Java on multiple platforms way easier?
I might get downvoted but I'm going to say yes. Library linking/loading is standard and behaves the same in all the OSes that (desktop) Java supports.


Now the issue is that desktop Java isn't the same as Android Java. Yes, with desktop Java you get fairly simple multi platform support in all major OSes, with deployment of the application itself as simple as copying exactly the same .jars in all of them. LWJGL is well made, it will automatically load the native lib of the platform you're running the application on (for all the combinations between Linux, Windows, OSX, x86, x86_64).


But Android is different, you will have to code specific parts for it (input, display, sound, etc), moreover, you will need to "downgrade" your language support for whatever Java 6/7 bastard Android supports nowadays. iOS was supported through RoboVM for free, but Xamarin bought the company and moved it to their kind of strategy (ie, gotta pay up monthly). Same scenario if you want to use C#.


Also, while you can reasonably expect the runtime of any OS to run your C++ program (or at worst you need to bundle some tiny binary, say a MSVS2015 redistributable), with Java you need to either bundle a 40-60MB VM (not as complex as it sounds though) or provide a link for the user to download the VM from Oracle's site (and remember, Oracle bundles crapware with their JRE installers). Moreover, the "executable" itself might be multi platform, but it won't get you an OSX installer, a Windows installer or a .deb package. That part you have to do on your own, probably regardless of the language you're using.


I still think it's a better scenario than what you're left with in C++: there are plenty of parts of the standard lib that are the same across desktop and Android, deployment is simpler albeit heavier, and ultimately, Java is a much simpler language to manage than C++.

#5265828 ECS: Systems operating on different component types.

Posted by TheChubu on 10 December 2015 - 10:27 PM

That's an issue I often find myself trying to solve.


See the thing is, you totally can (and I've seen it) create a "messaging" api between systems. Ie, one system fires up an event that goes straight to another system. Which would be the case of the "activate" event for example.


Thing is, I feel like an event/messaging api is bolted on. The system "should" write data into an entity's component, and the system that handles the response should fetch the data when it's its turn. As you described, this isn't something simple to do in many cases.


The specific issue here is when this happens. With event systems you need to figure it out because it's very important. Imagine that the physics system casts a ray and fires an event: the receiver does something with it, but it will probably be a system that also iterates over its entities. If you make a single event system, they might get processed at the same time. If you do it per system, you could make a queue that only gets processed when that system is processed. Or you could try to do without events and just write data, read data, and make it part of the normal entity-component iteration.
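
The per system queue option could look roughly like this. It's a minimal sketch with invented names (`QueuedSystem`, string events), not how dustArtemis or any other framework does it:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical system that defers incoming events until its own update. */
final class QueuedSystem {
  private final Queue<String> inbox = new ArrayDeque<>();
  final List<String> handled = new ArrayList<>();

  /** Other systems can call this at any point during the frame. */
  void post(String event) {
    inbox.add(event);
  }

  /** Called when it is this system's turn; only now are the events handled. */
  void process() {
    String event;
    while ((event = inbox.poll()) != null) {
      handled.add(event);
    }
  }
}
```

The events fired by, say, the physics system just sit in the inbox until `process()` runs, so the receiver never has to deal with an event in the middle of someone else's iteration.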


Order of system processing and system inter-dependencies are very important for getting consistent results, finding bottlenecks and possibly multi threading some of it.


It's still one of the things I'm not decided on, which is why I haven't added an event api on top of dustArtemis.


One idea I had for this kind of problem is to have one system in charge of maintaining a needed spatial structure. When entities get added/removed, system iterates over the affected entities and maintains its spatial structure.


Different systems can hold a reference to that system, and issue queries to it. So the system that's in charge of activating stuff (SystemA) can query "well, for this direction and reach, SystemB, give me what I am hitting". That's a data query, and it means that SystemA depends on SystemB to work, which means that SystemA has to be executed after SystemB has updated its internal spatial structure for that frame. That makes dependencies obvious.
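
A minimal sketch of that relationship, with made-up names and a 1D "spatial structure" (a plain list of positions) standing in for whatever grid or tree you'd really use:

```java
import java.util.ArrayList;
import java.util.List;

/** SystemB in the text: owns and maintains the spatial structure. */
final class SpatialSystem {
  private final List<int[]> entries = new ArrayList<>(); // {entityId, position}

  /** Would be called as entities get added or moved. */
  void track(int entityId, int position) {
    entries.add(new int[] {entityId, position});
  }

  /** Data query: ids of the entities within reach of origin, in insertion order. */
  List<Integer> query(int origin, int reach) {
    List<Integer> hits = new ArrayList<>();
    for (int[] e : entries) {
      if (Math.abs(e[1] - origin) <= reach) hits.add(e[0]);
    }
    return hits;
  }
}

/** SystemA in the text: must run after SpatialSystem was updated for the frame. */
final class ActivationSystem {
  private final SpatialSystem spatial; // the dependency is explicit in the constructor

  ActivationSystem(SpatialSystem spatial) {
    this.spatial = spatial;
  }

  List<Integer> entitiesHit(int origin, int reach) {
    return spatial.query(origin, reach);
  }
}
```

Since ActivationSystem can't even be constructed without a SpatialSystem, the ordering requirement shows up in the code instead of living in someone's head.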


Then it would be a matter of:


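for (entity in entitiesBeingHitted)
  if (entity.hasComponent(ActivableBehavior))
    // Flag the entity as activated for this frame, e.g. by adding a component
    // ("Activated" is a placeholder name).
    entity.addComponent(Activated)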


Then the system in charge of activated stuff can iterate over the activated entities in that frame and (possibly) execute trigger scripts or something, which in turn could plug other events in other subsystems. In this case I think adding/removing components to entities, and whatever consequence that has in the engine, should be a fast operation for this to work properly.


It might get overly complex, and maybe a straightforward but well defined messaging api between systems is better. As I said, I'm evaluating my options.