
Member Since 18 Jan 2008
Offline Last Active Today, 01:58 PM

#5216248 Vulkan is Next-Gen OpenGL

Posted by samoth on 13 March 2015 - 05:20 AM

That's true but it's hard to call OpenGL itself open-source. That's just an open-source implementation.

But the "open" in OpenGL really only means that the standard is an open standard, not that its complete implementation has to be open source (which, however, doesn't mean that open source implementations aren't allowed and don't exist... Mesa being one example).


Now, given the fact that all IHVs basically must supply a working OpenGL implementation for at least a couple of years to come (personally, I'd guess two decades) to cater to both existing software and existing developers who are unwilling to migrate to a different API (quite possibly, Vulkan will not be an API for everybody!), it makes sense to provide a single open-source implementation (maybe a major IHV joint venture) which covers the current state of the art, or most of it.

At that point, when the open standard also has an "official" open source implementation, I find it legitimate to call it "open source", too. Nothing funny about it, if you ask me.


Once you have that single working implementation, all you need to support is your platform-specific Vulkan layer. As long as this works, everything else works, too, and the open source community can (and will happily) take care of maintaining the other beast.

This is both cost-effective, and enduser-friendly.

#5215719 Packet safety

Posted by samoth on 10 March 2015 - 04:21 PM

Why maximum period would be 255?

8 bits of state can have 2^8 = 256 different states. Ideal random number generators have periods that are of length 2^N-1 (don't ask me why it's 2^N-1, not 2^N, there sure is a mathematical reason for that too, but I wouldn't know it).


Anything you do in your pseudorandom generator is deterministic, so X inputs cannot possibly give more than X outputs (though they can give fewer if the generator doesn't have maximum period).
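To make the period limit concrete, here is a minimal sketch that brute-forces every shift triple for an 8-bit xorshift and measures the cycle length. The function names (`xorshift8_step`, `cycle_length`) and the left/right/left shift order are assumptions for illustration, not taken from any particular library. Since the all-zero state maps to itself, no other seed can ever have a cycle longer than 255.

```python
from itertools import product

def xorshift8_step(x, a, b, c):
    """One xorshift round on an 8-bit state (shift left, right, left)."""
    x ^= (x << a) & 0xFF
    x ^= x >> b
    x ^= (x << c) & 0xFF
    return x

def cycle_length(seed, a, b, c):
    """Count the steps until the state returns to `seed`.

    Every xorshift step is invertible, so the state always comes back.
    """
    x = xorshift8_step(seed, a, b, c)
    steps = 1
    while x != seed:
        x = xorshift8_step(x, a, b, c)
        steps += 1
    return steps

# Try all 7*7*7 shift triples and keep those that visit every nonzero state.
triples = list(product(range(1, 8), repeat=3))
full_period = [t for t in triples if cycle_length(1, *t) == 255]
longest = max(cycle_length(1, *t) for t in triples)

print("longest cycle found:", longest)
print("triples with full period:", full_period)
```

Most triples give a much shorter cycle, which is why the shift constants cannot be picked at random.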


If you want longer periods, you need more state. One method to do that is to simply use larger integer types (but be cautious, the magic constants on the xorshift generator are not just some arbitrary, random constants, they need to have precisely chosen characteristics as described here, so if you use a different integer size (32 or 64 bits) you need different shift constants).

Another method to add state is to have a "lag register" which is nothing but an additional memory location (or an array of values) that you update one value per iteration. This is how e.g. Mersenne Twister manages to get such a long period without running excruciatingly slow -- it does one iteration for every random number that you want, reading and updating a single, different seed from an array of 624 integers.


The amount of state is not necessarily the same as the number of bits of output, though. You can easily have a generator with 1024 bits of state that outputs 4 bits at a time. You will still only have 16 possible numbers in those 4 bits, but the period will be much longer before it repeats.


The other problem with 8-bit random numbers is that 8 bits are way too little. 16 bit is something I might be able to guess in considerable time if I'm very lucky. 32 bit is something that's unfeasible to guess (I can't practically keep sending you billions of packets until one makes it through). On the other hand, if there are only 256 possible sequence numbers, I only need to send 256 packets and one of them will be guaranteed to be correct. A slow home internet access is nowadays able to send out 2,000 packets per second no problemo (some people will be able to send 20-30 times as many).

You really want 16-bit sequence numbers at the very least (32 bit is better). While it is still possible to try them all through, it's not very practical.
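A quick back-of-the-envelope calculation shows how long it takes to try every possible value. This is only a sketch: the helper name `seconds_to_try_all` is made up, and the 25x multiplier for a fast connection is an assumption picked from the "20-30 times" range above.

```python
def seconds_to_try_all(bits, packets_per_second):
    """Time needed to send one packet for every possible value of `bits` bits."""
    return (2 ** bits) / packets_per_second

SLOW_LINE = 2_000            # packets per second, figure from the text
FAST_LINE = SLOW_LINE * 25   # assumption: somewhere in the 20-30x range

for bits in (8, 16, 32):
    slow = seconds_to_try_all(bits, SLOW_LINE)
    fast = seconds_to_try_all(bits, FAST_LINE)
    print(f"{bits:2d} bits: {slow:14.3f} s on a slow line, {fast:12.3f} s on a fast one")
```

At 2,000 packets per second, 8 bits fall in a fraction of a second and 16 bits in about half a minute, while 32 bits take weeks.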




"xorshiftstar1024() & 0xff" can generate same numbers repeatedly.

Yes, it can and it will. That's a consequence of the fact that 1024 bits of state map to 8 bits of output, there is no other way than there be collisions (and incidentally the same number twice in a row, sometimes).

However, that is relatively harmless (OK, for a sequence number that is actually pretty bad, but what you describe isn't really a sequence number, it's more a kind of "obfuscated check number") since it will not generate the same sequence of numbers over and over again, even if some numbers repeat every now and then. In fact, this is something that makes a wire analysis much harder. If, say, both the hypothetical internal states 123456789 and 987654321 generate the number 4, you can't tell from looking at the traffic (which only shows "4") what the internal state is. (Truth being told, on a not cryptographically secure generator, you actually can... but not that easily, not by looking at one or two packets. You'll need dozens or hundreds of packets for that and a lot of knowledge.)


If you think it's a problem, you can add the output of the generator to your current sequence number, so sequence numbers will only repeat once they wrap around (be wary of possible undefined behavior on signed integer overflow).
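A minimal sketch of that idea, assuming a 32-bit sequence number and an 8-bit generator output. The name `next_sequence` and the extra `+ 1` (so that a generator output of zero still advances the number) are assumptions, not part of the suggestion above. Python integers never overflow, so the wraparound is done with an explicit mask; in C or C++ you would use an unsigned type for the same effect.

```python
MASK32 = 0xFFFFFFFF

def next_sequence(seq, generator_output):
    """Advance a 32-bit sequence number by a pseudorandom, non-zero step."""
    step = 1 + (generator_output & 0xFF)   # assumption: 1..256, never zero
    return (seq + step) & MASK32           # explicit unsigned wraparound

seq = 0xFFFFFF00
for out in (0x00, 0x7F, 0xFF):             # stand-ins for generator outputs
    seq = next_sequence(seq, out)
    print(hex(seq))
```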

#5213860 Not sure if scam...

Posted by samoth on 02 March 2015 - 03:35 AM

Also ask yourself why a person who speaks poor English would settle for half the amount if you can't afford to give him 26 keys, and how he managed to negotiate with over 40 publishers in the first place, given his English skills (and to settle a deal for one key each with each of the 26 quite popular ones).


Besides, I'm the attorney of the late Mr. Bambuga Mumbago who left 20 million dollars to his heirs. Unluckily, for political reasons that I can't disclose, they cannot directly access the money. If you are willing to cover some minor legal fees ($2.000) and receive the money so you can pass it on to the rightful owners, I shall give you a $100.000 reward. But hurry, if none of the heirs has claimed the money by next week, the bank will confiscate everything.

#5212898 Adjusting the direction

Posted by samoth on 25 February 2015 - 05:33 AM

Unless objects can only ever rotate on two axes, this is impossible -- you need at least two vectors.


Why? Well, because a single velocity vector only tells you in what direction your object looks, but not how it is rotated around that very axis. So there is an infinite number of possible solutions, each one being equally correct and incorrect.


So, in addition to your "forward" vector, you also need an "up" vector. If you are OK with objects having one degree of freedom less, you can simply use the world's "up" vector. Do a cross product on the two and you get a third vector. If you used the world "up" do the cross product backwards to orthogonalize the "up" vector, too (if your object has its own up vector, that's not necessary since it's already orthogonal). Normalize.


Now you have three unit vectors that are orthogonal to each other. There goes the upper left 3x3 portion of your 4x4 rotation matrix.


Translation goes into the right column.
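The steps above can be sketched roughly like this. It assumes column vectors, a right-handed basis with x = right, y = up, z = forward, and a hypothetical helper named `orientation_matrix`; your engine's conventions may well differ. Note that it fails if "forward" is parallel to "up", since the cross product is then zero.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("zero-length vector (is forward parallel to up?)")
    return tuple(c / length for c in v)

def orientation_matrix(forward, up, position):
    """Build a 4x4 matrix from a forward vector, an up hint and a position."""
    f = normalize(forward)
    r = normalize(cross(up, f))   # third axis from the two given vectors
    u = cross(f, r)               # "backwards" cross product: orthogonal up
    return [
        [r[0], u[0], f[0], position[0]],
        [r[1], u[1], f[1], position[1]],
        [r[2], u[2], f[2], position[2]],
        [0.0,  0.0,  0.0,  1.0],
    ]

for row in orientation_matrix((1.0, 2.0, 3.0), (0.0, 1.0, 0.0), (5.0, 6.0, 7.0)):
    print(row)
```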

#5212889 Why do games not have items 'one sale' in their stores

Posted by samoth on 25 February 2015 - 05:01 AM

I think one reason may be that it allows a moderately wealthy player to gain currency by "doing nothing", which is almost always undesirable. In a single-player game, it makes the game too easy (you're supposed to gain fortune by killing stuff, not by standing in the shop), and in a multiplayer game, it negatively impacts the economy by inflating inflation (inflating inflation... say that three times!).


Unless a lot of care is put into the pricing or unless there is always a huge gap between bid/ask, a player could (and some will!) buy everything that is on sale regardless of need, and re-sell it with a margin. They'll do that all day until they can afford all the best items.

#5212728 Texture updates: something I'm missing or driver quirk?

Posted by samoth on 24 February 2015 - 11:10 AM

I don't quite understand what you are doing with "default buffers" or what these are (maybe a few lines of code would help better understanding what you're doing).


Regardless, glTexImage2D has three (not two) modes of operation:

  1. No buffer object is bound and data = nullptr    → Reserve memory for a texture of the specified size, do nothing else (not even zero memory). If something happens to be in that memory, you will see it. Whatever it is, it's undefined. It might just as well be some old buffer's contents that "looks about right" but is a few frames old.
  2. No buffer object is bound and data != nullptr   → Reserve memory for a texture of the specified size, and then read pixel data from data. Presumably this works by copying into an unnamed buffer object nowadays, but it's still OpenGL 1.1 functionality. Which, in any case, must run synchronously since the driver cannot know for how long the data pointed to by data remains valid, so don't expect stellar performance.
  3. Buffer object is bound   → Reserve memory for a texture of the specified size, type-pun data into an unsigned integer, and read pixel data from the buffer object starting at the offset data. This is what you will want to do in almost every case.

glTexSubImage2D, by contrast, only knows two very similar modes of operation: Copy data either from a buffer object (if any) or from the pointer data (if there's no buffer) into already allocated texture storage.

#5212159 How to limit your FPS ?

Posted by samoth on 21 February 2015 - 03:28 PM

A few nitpicks and corrections (which I consider important details nevertheless) on these:

YieldProcessor - Either just a NOP, or an energy efficient NOP on newer CPUs. Basically an incredibly tiny sleep. A must if you're ever building a low-level busy wait (which is something you should probably never be doing...)

According to the official documentation, it's about enhancing performance, not so much about saving energy or a tiny sleep:

Improves the performance of spin-wait loops. When executing a “spin-wait loop,” a Pentium 4 or Intel Xeon processor suffers a severe performance penalty when exiting the loop because it detects a possible memory order violation. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation in most situations, which greatly improves processor performance. For this reason, it is recommended that a PAUSE instruction be placed in all spin-wait loops.

SwitchToThread - go for a trip through the kernel to see if there's another thread you can switch to. IIRC, only gives away your timeslice to other threads of equal priority within your process. Probably still conserves a bit of power while it wastes time.

This function picks another thread that is ready to run on the same processor (not necessarily from your process!) if there is any. In other words, this will never cause a thread to migrate (which is a good thing). The bad thing is, of course, if there isn't any other thread ready for your current CPU, you have only burnt CPU for nothing (but if you were willing to give up the CPU, that isn't so harsh since apparently you didn't have anything urgent to do).

Sleep(0) - very similar to the above, a tiny bit less strict on who it's allowed to give up time to.
Sleep(1) - actually give up your timeslice for sure.

These two will cause threads to migrate under certain conditions (depending on the number of quantums since they've last run), which is possibly/likely disadvantageous overall.

The documentation of Sleep(0) states that the thread gives up the remainder of its time slice but remains ready to run. Which means that it is immediately re-scheduled with other threads of the same priority. In that respect it is very similar to sched_yield(). However, Microsoft leaves it undefined where on the queue the thread is placed whereas Linux defines that it is placed at the end. That means that as long as a different thread of equal priority is ready to run, it is guaranteed to run under Linux whereas under Windows you simply don't know what's going to happen.
In my experience, Sleep(0) is disastrous, don't use it. I've sporadically had 60-100 milliseconds pass before the thread is scheduled again. Which, of course, isn't quite acceptable.

Sleep(1) is still not very precise before Windows 8.1 (and to make things worse, Windows 2000/XP/Vista round up whereas Windows 7/8 round down, see e.g. here) but it is much closer to what one would want. You might get 2 or 3 ms if you're unlucky (sleeping doesn't guarantee your thread runs again immediately afterwards, it merely guarantees that it becomes ready after the time you specify), but that's as good as it gets on a non-RT OS.



Raising the timer resolution is somewhat of a solution for the bad resolution of both Sleep and timers, but it is also the exact opposite of what one would want to do to be CPU or energy-conservative, causing about 15 times as many timer interrupts and also 15 times as many context switches (since the scheduler runs more often, too).
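The "about 15 times" figure is easy to check. This sketch assumes the commonly seen default tick of 1/64 second (15.625 ms) and a requested resolution of 1 ms; both numbers are assumptions and will differ between machines.

```python
DEFAULT_TICK_MS = 15.625   # assumption: default timer tick of 1/64 s
RAISED_TICK_MS = 1.0       # assumption: resolution requested by the application

def interrupts_per_second(tick_ms):
    """Timer interrupts per second for a given tick length in milliseconds."""
    return 1000.0 / tick_ms

default_rate = interrupts_per_second(DEFAULT_TICK_MS)
raised_rate = interrupts_per_second(RAISED_TICK_MS)
ratio = raised_rate / default_rate

print(f"{default_rate:.0f} -> {raised_rate:.0f} interrupts/s ({ratio:.1f}x)")
```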


Changing the timer resolution is frowned upon by a lot of people, but in practice, it's what everybody does. Google Chrome and Windows Media Player being well-known examples of applications that do it (and arguably for no good reason).


Note that the system uses the minimum time of what any application on the computer sets, for all processes, and if you forget to reset it, the resolution will stick.

#5211505 What do you use for handling sounds?

Posted by samoth on 18 February 2015 - 12:20 PM


I wouldn't realistically work for less than 50,000€ per year, so that's what I'd have to assume as "salary cost" whether it's being paid or not. Anything else would be fraudulent.


Well, gamedev is my hobby not a commercial endeavour and I don't pay myself for it. I wouldn't be earning money for that time so it has no monetary value. If I was taking time off work or doing it as a self employed job that would be different. At present my costs for developing my game are zero. [...]

That is certainly one valid way of seeing it, but the other way is surely valid, too. What's interesting is what the FMOD guys (who have to pay their bills, too, so they'd probably rather like you pay them than not) think about the "worthless" time that you have invested the moment you start making a little profit (maybe just a few thousand) with that game, since then it isn't "non profit" any more. They might be cool or they might not be. They might say "OK, time didn't cost you anything, live and let live", or they might say "Woah, dude... there's at least like... 5,000 work hours in this project, and time is money" and sue you for breaking the license agreement. It's impossible to know their point of view on that (unless you explicitly ask them).


The view that free time is money is, albeit in your favour, for example applied when you take up a loan to build a house (at least it is in Germany, might be different in other places). If you do the interior fitting yourself, the work that you put in (which is "worthless" since you do it in your free time without pay) counts as "extra capital" and is subtracted from your loan during risk assessment and interest calculation.

#5211411 What do you use for handling sounds?

Posted by samoth on 18 February 2015 - 06:19 AM

Althought consider, that FMOD is free for many scenarios, check it out.

That can turn out being a tough one, though. It's free up to 100k USD everything included.


I wouldn't realistically work for less than 50,000€ per year, so that's what I'd have to assume as "salary cost" whether it's being paid or not. Anything else would be fraudulent. Assuming a mere 1€ = $1 (which is not so unrealistic now -- thank you Mario Draghi, may you die from a pancreatic tumor!) this rules out any game which takes 2 years to produce, assuming zero cost otherwise (no artist and writer to pay, no computers, no sound recording equipment, no office, no software licenses, no electricity, internet, website, advertising, ... nothing).


With somewhat more realistic figures that include the things and people you inevitably have to pay, the 100k limit places you much closer to any game you can plan, develop, and ship in under half a year or so. How many non-trivial games have you conceived, developed, and shipped within half a year? Me: none.


Seeing how obviously the prime interest of the FMOD makers is (like everybody's prime interest) getting paid, how do you negotiate this after the fact?

Maybe they're a lot more relaxed than one would think, I wouldn't know. However, I wouldn't want to take the chance of suddenly being confronted with a library fee based on "Oh come on, this must have cost more than 100k" or any such thing. Not if something that is completely free is readily available.


The fee is of course very moderately priced and totally acceptable when you're employed by a company (someone else is paying, who cares), and it's mighty fine if the crowd over at Kickstarter gave you 100k for your project, but paying $500 $3,000 for "nothing" (as compared to just using a different API that doesn't cost anything) is a bit of a different thing for an independent developer who has to pay the bills himself.

#5211397 Hiding savedata to prevent save backup

Posted by samoth on 18 February 2015 - 04:57 AM

The obvious things (such as anything on the client can be broken, and don't do this, don't annoy people...) have been said, but if you still insist, what I would do is something like this:

  1. Calculate a 64-bit CRC (or cryptographic hash if you will) from the latest valid savefile, and set the file modification time of the folder where savefiles are stored to that value (or, some other folder, the data file folder if you have one). This will not prevent someone from re-imaging his drive but it will prevent the most naive attempts at copying over old savefiles. Sophisticated backup software, unlike Explorer copy, also retains modification dates, but then again, who remembers to also copy the containing folder, or an unrelated folder like the data folder! It's clever enough to fool the most naive users until someone discovers how it works and makes a post on the internet (which is probably a week or two for a not-so-well-known game).
  2. If you want to be more clever than that, you can use alternate datastreams or extended attributes (somewhat at the expense of portability), but that will not hold someone who has a minimum of technical knowledge back either (it will thwart average users, though). ADS is stunningly trivial to use if you know how (just append the "magic formula" to the filename), and a surprisingly good way to hide stuff from an unaware end-user type of person. It's also a good way of totally fucking up if anything different from NTFS gets involved, unluckily (though this may indeed be an advantage for you, prevents copying files to FAT filesystems...).
  3. Also save every level you've visited (randomly generated, I figure, as "roguelike" implies?). This not only allows the player to go back to a level with all loot where he dropped it (which is a nice feature), it also gives you a somewhat bigger data set, which means backing it up is more expensive. Though of course, in an age of terabyte harddisks, who cares if you have 10 or 15 megabytes of save data. Saving every level also gives you the possibility of creating a hash chain, so level 3 will validate level 2, and level 2 will validate level 1. The last one could validate the last valid save file (either instead of, or in combination with a file modification time or any such hack).
  4. Use encryption. Yes, you said you don't need it, but you do. It is much easier to do nasty secret stuff and to keep your little secrets if the other person cannot trivially read everything. A binary file which is still somewhat parseable if you know how is nice for holding back the casual end-user, but a binary file which is only "total random garbage" (encrypted) is better. It needs not even be a particularly good encryption, since breaking the encryption is trivial for anyone dedicated to do so anyway (seeing how both key and algorithm are stored on the machine). But even poor encryption will reliably ruin the average user's experience unless your game is famous enough that people start putting up automated crack tools onto the internet. If that happens, congratulations, means you've succeeded.
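The hash chain from point 3 could look roughly like the following sketch. SHA-256, the all-zero starting value and the names `build_chain` and `verify_chain` are assumptions chosen for illustration. Each digest covers the level data plus the previous digest, so the newest digest vouches for everything before it.

```python
import hashlib

START = b"\x00" * 32   # assumption: fixed starting value for the first link

def chain_digest(previous_digest, level_data):
    """Digest of one level, bound to the digest of the level before it."""
    return hashlib.sha256(previous_digest + level_data).digest()

def build_chain(levels):
    """Return one digest per level; each depends on all earlier levels."""
    digests = []
    previous = START
    for data in levels:
        previous = chain_digest(previous, data)
        digests.append(previous)
    return digests

def verify_chain(levels, digests):
    """True if the stored digests match the level data, in order."""
    return len(levels) == len(digests) and build_chain(levels) == digests

levels = [b"level 1 data", b"level 2 data", b"level 3 data"]
digests = build_chain(levels)
print(verify_chain(levels, digests))

levels[1] = b"level 2 data, restored from an old backup"
print(verify_chain(levels, digests))
```

Restoring a complete old backup (levels and digests together) still verifies, which is why the newest digest would have to be compared against something stored elsewhere, such as the folder timestamp from point 1.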

#5211386 What do you use for handling sounds?

Posted by samoth on 18 February 2015 - 04:14 AM

If OpenAL is what you want and what you feel comfortable with, why don't you just use it? OpenAL-Soft is actively developed and nowhere near deprecated.

#5210797 Modern C++ Book?

Posted by samoth on 15 February 2015 - 04:51 AM

Although Stroustrup's book is obviously the one book, and although it has been recommended by notable experts e.g. at CppCon 2014, I can't really agree on that recommendation.


Not only is the book's form somewhat... well, let's say extravagant (looking like having been photocopied from a manuscript by Mr. Stroustrup himself, different margins on half of the pages, odd and even pages mixed up, etc.). It's also a very, very gentle introduction as if for people who have never programmed in their lives. At least that's what I felt like reading it.


But maybe that's just me... anyway, I had expected something much more "hardcore".

#5209862 Bug when dealing at around 1,000,000.0f of coordinates

Posted by samoth on 10 February 2015 - 02:29 PM

Working on a game of universe scale... I can say float point precision was the first thing I noticed. Solution? For me it was dividing the "scene" up into 3 coordinate systems. 1) Universe scale (where 0.001f was 1ly). 2) Solar System Scale : (0.001f was 1au). And 3) Local scale (where 0.001f = 1km).
But why not simply use 64-bit integers where "universe scale" is measured in units of, say, 1,000 km and "solar system scale" is measured in units of nanometers?


This is, unlike floating point, perfectly deterministic with no rounding issues, no surprises, and no weird special cases, and uniformly distributed precision.


Nanometer resolution should be enough for everybody. Outside a solar system, it doesn't matter whether you're off a few thousand kilometers, since the next closest thing is at about 10^12 kilometers, so there's no observable difference and no meaningful way of travelling other than with some hypothetical faster-than-light speed. Hence a 1,000km resolution is mighty fine, too.
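It is worth checking how far a signed 64-bit integer actually reaches in each unit. The sketch below uses approximate conversion constants and a made-up helper `half_range_km`. Under these assumptions, 1,000 km units reach close to a billion light years in each direction, but nanometer units reach only about 9.2 million km (well under 1 AU). A coarser unit such as micrometers (roughly 61 AU each way) may therefore be the safer choice for a whole solar system.

```python
INT64_MAX = 2**63 - 1
KM_PER_LIGHT_YEAR = 9.4607e12   # approximate
KM_PER_AU = 1.495978707e8       # approximate

def half_range_km(unit_in_km):
    """Largest distance from the origin, in km, for a signed 64-bit coordinate."""
    return INT64_MAX * unit_in_km

universe_ly = half_range_km(1_000.0) / KM_PER_LIGHT_YEAR
nanometer_km = half_range_km(1e-12)
nanometer_au = nanometer_km / KM_PER_AU
micrometer_au = half_range_km(1e-9) / KM_PER_AU

print(f"1,000 km units: +/- {universe_ly:.3e} light years")
print(f"nanometer units: +/- {nanometer_km:.3e} km ({nanometer_au:.3f} AU)")
print(f"micrometer units: +/- {micrometer_au:.1f} AU")
```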

#5209600 Does gpu guaranties order of execution?

Posted by samoth on 09 February 2015 - 08:44 AM

Yes, although there is a "but...".


The GPU generally makes no promises and gives no guarantees whatsoever, and indeed works much differently from what one would "intuitively" expect.


However, the graphics or compute API that you use (such as e.g. OpenGL, CUDA, Direct3D) will usually give one or the other guarantee, and most of the time it does not matter in which order operations happen anyway.


Unless of course, when it matters... that's when you need to use things like barrier (CL) or memoryBarrier (GLSL) or glFenceSync on a higher level, or functions like glTextureBarrier.


Now, when does it matter in which order things are processed?


As a rule of thumb, it usually doesn't matter as long as you stick with the more "traditional" render pipeline:

  • It doesn't matter whether you process vertex 5, 34, or 732 first, they are not dependent on each other. You wouldn't know a difference and you don't care.
  • It matters that all vertices of a primitive (such as a triangle) have been processed before the geometry shader is invoked. The implementation ensures this is the case, simply by processing all the vertices, and then invoking the geometry shader (you need not care).
  • It matters that all vertex/geo/tessellation stuff (... belonging to one draw call) is done before the fragment shader is run. Again, this is trivially assured by how the pipeline works.

Triangles are rasterized to fragments with some unknown (unknown to you) method and are then processed in parallel in groups of 2x2 or larger (this is necessary for partial derivatives / mip calculation). Some fragments may be shaded although they are not part of a triangle at all, they will be discarded but are still shaded. Some may not pass a test (depth, stencil, whatever) and be discarded. Some fragments may be shaded twice (think fragments on the diagonal of a fullscreen quad, which is really just two triangles from the point of view of the hardware). Some will be weighted using some known or unknown or tuneable function (think multisampling).


Usually, rather than just 2x2, something like 64 or so fragments will be processed in parallel in a shader core running the same identical instructions at the same time (with several thousand queued, swapped in and out on demand to cover for texture/memory latency), and a few dozen or hundred execution units will run independently of each other.


Whatever! Not your problem! It is guaranteed (by the API contract, so it's finally the driver's problem) that what comes out is the same as-if everything happened exactly in the order that you specified. This is still relatively easy for the implementation to guarantee, since while you are allowed to read pretty much everything, you can only ever write to a single exactly specified location (in other words, you have gather functionality, but not scatter). So all the implementation really needs to be doing is not mess up its own order of rasterization and blending.


So far for the easy part. Now there are atomic counters and shader load/store, which allow you to do... scatter -- write to more or less arbitrary locations, concurrently. This is where it gets ugly.


If you use shader load/store, you must take extra care. Writing to haphazard variables or memory locations not knowing which one of your fragments will be shaded first can, and will, lead to surprising results. It doesn't make a difference whether fragment 43772 is shaded before fragment 43775 if each one can only ever write to its own output, which is under the control of the driver. But it matters a lot when they both write a value to memory location 123456 or if they both modify a counter, and this happens in a different order than you had expected.
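The difference can be illustrated with a toy simulation. This is plain sequential Python, not GPU code; the fragment IDs, the `shade` function and the two "run" helpers are invented for the example. The only point is that the result is order-independent when every fragment writes its own output, and order-dependent when they all write to one shared location.

```python
def shade(fragment_id):
    """Stand-in for a fragment shader: some value derived from the fragment."""
    return fragment_id * 2

def run_gather(order):
    """Each fragment writes only to its own output slot."""
    outputs = {}
    for frag in order:
        outputs[frag] = shade(frag)
    return outputs

def run_scatter(order):
    """Every fragment writes to the same shared location; the last one wins."""
    shared = None
    for frag in order:
        shared = shade(frag)
    return shared

fragments = list(range(8))
reverse = fragments[::-1]

print(run_gather(fragments) == run_gather(reverse))
print(run_scatter(fragments), run_scatter(reverse))
```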

#5209564 Arcsynthesis OpenGL tutorials contradict OpenGL documentation?

Posted by samoth on 09 February 2015 - 04:55 AM

I don't think they strictly contradict each other. The way buffer usage flags are worded is admittedly somewhat hard to understand and not entirely unambiguous in every way. That's of course intentional, too, since the usage flags are very general hints, and not mandatory (the implementation may totally ignore what you tell it).


STREAM usage suggests that you upload data (once), then use it once or maybe twice (e.g. for drawing something) and then don't need it any more. On the next occasion, very soon (usually the next frame), you will do the same thing, but with new data. Think of presenting a video. Once displayed, the old frame isn't very interesting any more, you likely want to display a different one next time.


That is, in other words, more or less exactly what Arcsynthesis says ("set data [...] generally once per frame"), and it is in some way what the official docs say, too ("modified once and used at most a few times").


Why would OpenGL want to know anyway? The implementation/driver might choose not to allocate a dedicated block of GPU memory for your data, but instead DMA it in right when it's being used. Since you're only going to use that data once (or maybe twice) that will probably do, and reserving GPU memory for data that is accessed many thousand times is more efficient.