
Erik Rufelt

Member Since 17 Apr 2002
Offline Last Active Jun 15 2016 08:30 AM

#5270609 Enable Z buffer only for ALPHA CHANNEL > 0

Posted by Erik Rufelt on 11 January 2016 - 04:59 PM

Enable Z buffer and use "discard" in your shader, also known as alpha-testing.


Here is info about alpha testing in D3D9:



(Note that if you want the edges of the net to be partly transparent rather than fully opaque or fully cut out, or if you want the net to be colored like partly transparent glass or similar, this solution won't help. In that case you need to draw all opaque geometry, like the player, first, and then as the final step in your rendering process draw all alpha-blended surfaces sorted back to front.)
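To see why discarding fully transparent texels keeps the depth buffer correct, here is a tiny software depth-test sketch (purely illustrative, the data layout and names are mine, not from any API): a "discarded" pixel writes neither color nor depth, so geometry drawn later can still show through the holes.

```python
# Minimal per-pixel sketch of alpha-tested depth writing.
# Each draw writes color and depth only where alpha > 0, so holes
# in the net leave the depth buffer untouched.

def draw(framebuffer, depthbuffer, pixels):
    """pixels: list of (x, depth, color, alpha); smaller depth = closer."""
    for x, depth, color, alpha in pixels:
        if alpha <= 0.0:
            continue  # 'discard': no color write, no depth write
        if depth < depthbuffer[x]:
            depthbuffer[x] = depth
            framebuffer[x] = color

fb = ["bg"] * 4
db = [1.0] * 4

# Net in front (depth 0.2): opaque at x=0,2 and holes (alpha 0) at x=1,3.
draw(fb, db, [(0, 0.2, "net", 1.0), (1, 0.2, "net", 0.0),
              (2, 0.2, "net", 1.0), (3, 0.2, "net", 0.0)])
# Player behind the net (depth 0.5), covering all pixels.
draw(fb, db, [(x, 0.5, "player", 1.0) for x in range(4)])

print(fb)  # ['net', 'player', 'net', 'player']
```

The same logic is what the hardware depth test does when the fragment shader discards transparent texels.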

#5269865 Sorting with dependencies?

Posted by Erik Rufelt on 07 January 2016 - 11:47 AM

Say you have N objects with M "sort values", where M0 is prioritized as required depth order, M1 is shader, M2 is texture, etc...

Start by sorting by M0, yielding X <= N subgroups with equal M0 (call them X0, X1, ...). Then sort X0 by [M1, M2, ...] to minimize state changes when drawing the objects at the back. Then sort X1 the same way, but re-order the values in M1 for that sort so that the last shader used in X0 comes first; do the same with M2, and so on.

Loop over all Xi to do the same, always with the internal sort-values in Mi modified to put the last used value in Xi-1 at the front.


If you care about more than M0 and M1, you would have to recurse into new subgroups within each Xi after sorting it, to minimize changing texture when the shader changes within Xi.


Should be possible to optimize this reasonably, as every recursion only has to find the matching subgroup of Mi+1 and move it to the front of its parent.



You could make it more advanced by weighting, for example by valuing one M1 change as three M2 changes, so that if you had to change texture four times to avoid a single shader change you would re-order the priority of M1 vs M2. I can't quite picture the exact complexity of that though; probably significantly worse.
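The layered sort above can be sketched like this (the data shapes are my own illustrative assumption: each object is a tuple `(depth_group, shader, texture)`):

```python
from itertools import groupby

def layered_sort(objects):
    # M0: required depth order always comes first.
    objects = sorted(objects, key=lambda o: o[0])
    result = []
    last_shader = last_texture = None
    for _, grp in groupby(objects, key=lambda o: o[0]):
        # Within a depth group, sort by (shader, texture), but float the
        # shader/texture used at the end of the previous group to the front,
        # so the group boundary costs no extra state change.
        ordered = sorted(grp, key=lambda o: (o[1] != last_shader, o[1],
                                             o[2] != last_texture, o[2]))
        result.extend(ordered)
        last_shader, last_texture = result[-1][1], result[-1][2]
    return result

def state_changes(seq):
    # Count shader + texture switches while drawing in order.
    changes, prev_s, prev_t = 0, None, None
    for _, s, t in seq:
        changes += (s != prev_s) + (t != prev_t)
        prev_s, prev_t = s, t
    return changes

objs = [(0, "sA", "t1"), (0, "sB", "t2"), (1, "sB", "t2"), (1, "sA", "t1")]
naive = sorted(objs)                # plain sort by (M0, M1, M2)
ordered = layered_sort(objs)
print(state_changes(naive), state_changes(ordered))  # 8 6
```

The naive per-group sort re-starts at the same shader every group (8 switches here), while re-ordering each group to begin with the previously used shader/texture saves the changes at the group boundaries (6 switches).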

#5269823 A better way to communicate with relay server?

Posted by Erik Rufelt on 07 January 2016 - 07:37 AM


InternetOpen followed by InternetConnect / HttpOpenRequest / HttpSendRequest if you use HTTP(S). WinInet can automatically use the system proxy settings where required.

#5269627 How can I draw a svg to a SDL_Surface with nanosvg?

Posted by Erik Rufelt on 06 January 2016 - 09:09 AM

Write the 'img' buffer to a file and open it in an image editor that supports raw files, or just in a hex editor, to make sure it contains reasonable data. If it does, the problem is in getting it onto the SDL surface; if it doesn't, the problem is in drawing the SVG. Either way you know where to investigate.


(Also, in this case your code seems to do absolutely nothing with the SDL surface: you create it, lock it, and then just exit your function? Probably a logic error there; maybe you want to pass the surface by reference, or have your function return a pointer to the new surface.)

#5266037 OpenGL Efficient Rendering of 2D Sprites

Posted by Erik Rufelt on 12 December 2015 - 10:17 AM

One thing to remember is that a draw call per sprite usually isn't too bad unless you have on the order of several thousand or more. GL draw calls are pretty fast, assuming you still use a texture atlas and don't switch textures for each one (and sort by texture either way, even with the atlas).

It's uncommon for sprites to have any bottleneck other than fillrate, unless you draw something like very many small particles. If your sprites are reasonably large, then sorting by texture, or even by area within a texture to improve cache usage, can give more benefit by itself than every possible optimization of the vertices combined.


When sprites are sorted by texture then changing textures won't matter too much, so several atlases isn't a bad idea, and if you have sprites with many animation frames you will probably end up with one texture per sprite for the animated ones to fit all the frames.

Count the number of texture switches: if you only save a few switches by combining one atlas with another, it probably isn't worth it, but if you need to switch atlas between every other sprite, then combining them has a large benefit.
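Counting switches for a given draw order is trivial; here is a small sketch (the sprite representation, just its atlas name, is my own assumption):

```python
def texture_switches(draw_order):
    # Count how many times the bound texture changes while drawing in order.
    switches, current = 0, None
    for tex in draw_order:
        if tex != current:
            switches += 1
            current = tex
    return switches

unsorted_order = ["atlasA", "atlasB", "atlasA", "atlasB", "atlasA"]
sorted_order = sorted(unsorted_order)

print(texture_switches(unsorted_order))  # 5
print(texture_switches(sorted_order))    # 2
```

Run this over a typical frame's draw list before and after sorting, and compare the numbers to decide whether merging atlases is worth the effort.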

#5265960 Issues With Sprite Batching

Posted by Erik Rufelt on 11 December 2015 - 05:56 PM

I don't really know of one... always use interleaved arrays :) But you can have all positions in one VBO and all texture coordinates in another VBO if you really want to.

It could be good in special cases, for example if you have a very large vertex, say 128 bytes, and part of that vertex, say 8 of the bytes, has to be dynamic and updated for every vertex every frame, whereas the other 120 bytes are static.

If you have 1000 vertices in your model, then you can either use one VBO and update 128k data each frame, or you can use two VBOs and separate the small part of the vertex that needs to be dynamic into its own VBO, and only update the small VBO with 8k data per frame while keeping 120k static.
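The arithmetic from the example above, spelled out:

```python
# Per-frame upload cost: one interleaved VBO vs splitting the dynamic
# bytes into their own VBO. Numbers taken from the example above.
vertices = 1000
full_vertex = 128           # bytes per interleaved vertex
dynamic_part = 8            # bytes that change every frame
static_part = full_vertex - dynamic_part

single_vbo_upload = vertices * full_vertex    # must re-upload everything
split_vbo_upload = vertices * dynamic_part    # only the dynamic VBO
static_bytes = vertices * static_part         # uploaded once, then untouched

print(single_vbo_upload, split_vbo_upload, static_bytes)  # 128000 8000 120000
```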

Another usage can be instancing (though I've never tried it like this). This link talks about it under instanced arrays https://www.opengl.org/wiki/Vertex_Specification

#5265830 Issues With Sprite Batching

Posted by Erik Rufelt on 10 December 2015 - 10:56 PM

It still only sees the last VBO that's been bound.


Either way you always need one VAO per VBO. But if the same VBO is drawn twice with different shaders then it can share the VAO if the shaders are compatible.


(Though technically a VAO can use more than one VBO, if you don't use interleaved vertex attributes and source some of them from a different VBO. But a VAO will always have attribute 0 from exactly the last glVertexAttribPointer call made on index 0 while the VAO was active; any new call for an index overwrites the old setting for that index.)

#5265724 Issues With Sprite Batching

Posted by Erik Rufelt on 10 December 2015 - 07:58 AM

Yes, multiple programs with the same VAO must have the same inputs (vertex attributes), and they must have the same (or compatible) indices.

So if one program has "position" bound to attribute index 0 and another has it on attribute index 1, then it will go wrong.


In general I would have one VAO for each combination of (shaderprogram, vertexbuffer, indexbuffer), or you could share between shaders if you know they use the same attributes.
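The "one VAO per combination" bookkeeping can be sketched as a simple cache keyed by the tuple (the GL handles are stood in for by plain ints, and `create_vao` is a hypothetical stand-in for the real glGenVertexArrays plus attribute setup):

```python
from itertools import count

_vao_ids = count(1)
vao_cache = {}

def create_vao(program, vbo, ibo):
    # Stand-in for glGenVertexArrays + glBindVertexArray +
    # glVertexAttribPointer setup; returns a fake integer handle.
    return next(_vao_ids)

def get_vao(program, vbo, ibo):
    # One VAO per (shader program, vertex buffer, index buffer) combination,
    # created lazily and reused on subsequent lookups.
    key = (program, vbo, ibo)
    if key not in vao_cache:
        vao_cache[key] = create_vao(program, vbo, ibo)
    return vao_cache[key]

a = get_vao(program=1, vbo=10, ibo=20)
b = get_vao(program=1, vbo=10, ibo=20)  # same combination: reused
c = get_vao(program=2, vbo=10, ibo=20)  # different program: new VAO
print(a, b, c)  # 1 1 2
```

If you know two programs use identical attribute layouts, you can normalize the `program` part of the key to share VAOs between them.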

#5265537 Issues With Sprite Batching

Posted by Erik Rufelt on 08 December 2015 - 11:28 PM

The VAO captures the bindings from your glVertexAttribPointer and glEnableVertexAttribArray calls to the vertex buffer, so you need a different VAO per batch. Only the last batch created will be the "active" VAO state, as each batch overwrites the previous one (the glVertexAttribPointer calls for the same vertex-attrib indices in each batch overwrite those made for the previous batch).

The first long answer here seems to give good explanations: https://stackoverflow.com/questions/26552642/when-is-what-bound-to-a-vao

#5264772 FPS frustrations

Posted by Erik Rufelt on 03 December 2015 - 12:48 PM

That may explain a lot. Vsync can't really be perfect in windowed mode, though it depends on your OS and system settings. On Windows 8 and up (I think) the whole desktop is composited and swapped, so your Present actually just copies to another buffer (which is always RGBA Float16 I believe, not sure exactly) that is then swapped. I'm also not sure whether this is true on the desktop always or just for the Metro apps; perhaps someone else here has done more apps targeting Windows beyond 7 and knows for certain?

There are some articles on MSDN about this if you're targeting those systems; they changed quite a bit. This is (again, I think, or at least probably partly) the reason why you can't shut off DWM composition on later Windows.


In general: what happens if you have two windows that both present their buffers to the screen, can they both be vsynced?

The driver may be able to make this happen, but it's somewhat imperfect; windowed mode is not really the target for vsync. There's also the whole frame-buffering issue. In fullscreen the driver will buffer up frames and schedule them for swapping as vsync approaches, which isn't really enabled in windowed mode by default. (You may be able to use some DXGI functions to override that, not sure, and it probably also requires the latest Windows, as it depends on the new composition; again I'm somewhat guessing.) I'm not sure whether it's a technical limitation in windowed mode or just that they don't bother, to avoid having to deal with edge cases when there are two apps running DX, etc.


To get windowed-mode vsync as good as possible, I've found the best way is to call Flush or Finish (or whatever the function is called in your particular API version) to make the driver finish everything on the GPU, and then either call Present with vsync, or disable vsync and manually wait using DXGI or DirectDraw to present when the last update is finished. (That pause is the vblank, I think it's called: the slight gap between when one screen redraw finishes and the next begins.)

Then make sure you wait between frames until that vblank is done, so you don't present twice on the same vblank, which can happen if your app would otherwise run at something like 1000 FPS; so combine this with some wait or sleep like you have now.

#5264741 FPS frustrations

Posted by Erik Rufelt on 03 December 2015 - 09:42 AM

Are you doing window mode by the way?

If so that sounds like a pretty good solution, but if you do fullscreen you really should try to use vsync.. it will be quite apparent if the game has any camera movement and isn't vsynced.

#5264631 FPS frustrations

Posted by Erik Rufelt on 02 December 2015 - 02:54 PM

I think your problem is almost certainly in DirectX or related, and not in your app.

Go into DirectX Control Panel (64 bit if your app is 64 bit), and then on the Direct3D 9 tab set all debug levels to max, and choose the Debug version. Then build in debug and run with the VS debugger so you get all debug output in the VS output window, and you should get warnings if you have any.

Once there are no warnings, set everything to release and no warnings in the control panel, then build a release version of your app and run that (without any debugger attached, like from Windows).

Also clean and reinstall your graphics drivers (latest non-beta version) and perhaps the DirectX runtime as well.

(If you don't have debug versions in the control panel you need to install the DirectX SDK I think..)


Also, if you have a bunch of debug prints or other logging, disable all of it in release mode; such things can interfere. Text output in DirectX probably allocates buffers every frame as well, so try disabling that too, just to rule it out.

Many Windows functions work differently and may use different memory allocators when run with a debugger from VS.

#5264422 FPS frustrations

Posted by Erik Rufelt on 01 December 2015 - 10:25 AM

Depending on the GC it may just ignore garbage collection until there is enough to collect.. though it does seem quite strange. Is there something in C# to force the GC to run each frame?

I know on iOS Apple has scoped pools for GC, and recommends running each frame with a pool that is cleaned up immediately after, to avoid any buildup that could cause frame-skipping.


In general your problem isn't uncommon; lots of people complain about this all the time, and sometimes gamers hit it in games on Steam etc. (though rarely what you describe here, mostly something like a single frame skip). What OS/hardware are you running, and do you have anything like auto-switching between a discrete and an integrated GPU?


Do you run in release mode?

Otherwise try that: build a release package and run it from outside VS. It could be something much simpler, like DirectX outputting debug information. (Also make sure to check all DirectX debug output: turn on debug mode in the DirectX control panel and read all the messages in the VS output window when running with the debugger; some unexpected states there can impact performance in strange ways.)


And finally, make sure you don't leave any DirectX objects around to clean up. DirectX may run its own garbage collection that you probably can't see in VS. For example, if you update dynamic buffers every frame then, depending on exactly how that is done, DX may choose to keep allocating new memory every frame instead of reusing it, and then sooner or later run out and have to defragment VRAM or similar. (This especially could differ depending on whether DX runs in debug mode or not.)

#5259630 Drawing with GDI works, but I can't do the same thing in GDI+

Posted by Erik Rufelt on 29 October 2015 - 03:14 PM

Note that BeginPaint does return the DC correctly; it's just that no area is flagged for update at that time.


For example, if you have a large window of say 500 x 500 pixels with an image in it, and you drag another window above its top left corner covering an area of 100x100 pixels, then WM_PAINT will be sent by Windows and BeginPaint will return a DC that will only paint on those 100x100 pixels that were invalidated by the other window.

The reason for this is so that no unnecessary drawing occurs.


If you want to flag a specific area for update and receive a WM_PAINT you can do so with InvalidateRect.


I would guess that DefSubclassProc handles WM_PAINT and calls Begin/EndPaint itself, thereby validating the update rect, so that your own BeginPaint call returns a DC with an empty update rect, clipping all your draw commands.


You can read about this on MSDN: https://msdn.microsoft.com/en-us/library/windows/desktop/dd183362%28v=vs.85%29.aspx

#5259590 Drawing with GDI works, but I can't do the same thing in GDI+

Posted by Erik Rufelt on 29 October 2015 - 10:27 AM

Don't call startup and shutdown on GDI+ on every draw; call GdiplusStartup once when your program starts and GdiplusShutdown once when it exits.

Also you call EndPaint twice in your code.


Lastly, your working examples use GetDC while the GDI+ version uses BeginPaint; perhaps DefSubclassProc already clears the dirty rect so nothing is left to draw. (The DC from Begin/EndPaint will clip drawing to the area the window thinks requires an update, and mark that area as no longer requiring updates, so the next draw will do nothing.)