
FGBartlett
  1. Sorry, I meant lag the audio -- the incoming audio must be delayed to match the video lag. These delays are easy to build: a continuously running sampler writes into a ring buffer, and an offset between the write position and the output sampling position determines the audio delay.
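The delay line described above can be sketched as a ring buffer whose read position trails the write position by the video lag. This is a minimal illustration; the class and member names are mine, not from any particular audio API, and it assumes the delay is smaller than the buffer capacity.

```cpp
#include <cstddef>
#include <vector>

// Sketch of an audio delay line: samples are written continuously into a
// ring buffer and read back 'delaySamples' behind the write position.
// The offset between the two positions is the audio delay.
// Assumes delaySamples < capacity.
class DelayLine {
public:
    DelayLine(std::size_t delaySamples, std::size_t capacity)
        : buf_(capacity, 0.0f), delay_(delaySamples), write_(0) {}

    // Push one input sample; return the sample from 'delay_' samples ago.
    float process(float in) {
        buf_[write_ % buf_.size()] = in;
        // The read position trails the write position by the delay.
        std::size_t read = (write_ + buf_.size() - delay_) % buf_.size();
        float out = buf_[read];
        ++write_;
        return out;
    }

private:
    std::vector<float> buf_;
    std::size_t delay_;
    std::size_t write_;
};
```

Until the buffer has filled past the offset, the output is silence (zeros), which is exactly the video-matching lag.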
  2. [quote name='Hodgman' timestamp='1341293591' post='4955170'][quote name='FGBartlett' timestamp='1341197175' post='4954738']You can still benefit from more threads than cores; here is an extreme example: a single core/single lane machine. Is there ever any [b]need[/b] for multithreading on such a machine?[/quote]No, there isn't very often a [b]need[/b] for it -- it may be one way to solve the problem, but I assure you there's probably a single-threaded solution as well. [quote]But it provides performance and behavior you can't achieve in a single threaded model. as in --------Please wait....scene loading---------...[/quote]For example, background loading of scenes definitely is possible in a single-threaded game... [quote]Sure; suppose you have a prep thread that is waiting on I/O or some other condition, like a FIFO being less than half full. It can yield while waiting to another thread. If you are single threaded, then that single thread eats all the latency in your model, and latency can't be hidden at all.[/quote]Threading should not be used for I/O-bound tasks (only processing-bound tasks). The OS is designed to handle I/O-bound tasks asynchronously without using extra threads already -- use the OS's functionality instead of reinventing it with extra threads. If a single-threaded program wanted something to occur when a FIFO was half full, it would likely use a callback triggered by the push function. [quote]If you have threads that never yield at all, then each thread will try to consume a complete core. (Depending on the O/S, it will still occasionally lose bandwidth and be parked, but it will always be crying for attention.) Sometimes that is a necessity, but if such tight polls or freewheeling threads can be minimized, modern O/S can manage lots of well behaved yielding threads far in excess of the number of available cores/lanes...[/quote]But if you're writing a [i]high-performance real-time system[/i] ([i]such as a modern game engine[/i]), then you want a small number of threads that hardly ever yield, to get predictable performance. Yielding a thread on Windows is totally unpredictable in length, with your only guarantee being that it's unlikely to be longer than 5 seconds ([i]although yes: that case shouldn't occur unless you're massively oversubscribed[/i])... [quote]Headsup with the design of the threadsafe FIFO; it must use a two-step allocate and release model, because there is finite execution time between when a handle is pulled and when it is prepped or consumed. But that is easily done.[/quote]What's this "two-step allocate and release model"? Is that specific to your OpenGL resource FIFO, or are you talking about thread-shared FIFOs in general? [quote]A FIFO object is basically tracking a head and a tail in a circular fashion, with some maximum FIFO size. The FIFO should provide booleans for IsFull, IsEmpty, etc.[/quote]Functions like IsFull and IsEmpty are nonsensical in the context of a shared structure like a multiple-producer/multiple-consumer FIFO -- they can never be correct. It makes sense for Push and Pop to be able to fail ([i]if the queue was full or empty[/i]), but simply querying some state of the structure, such as IsFull, is useless, because by the time you've obtained your return value it may well be incorrect, and any branch you make on that value is very dubious.[/quote]

     Re: FIFO half full. There is no need for this to be precise; the point is, if you respond to the event FIFO EMPTY, it is too late. The assumption is that 'about half a FIFO' is enough latency to respond and keep the FIFO not empty, which is all the draining thread cares about. You never want to starve the draining thread, or you get a stall.

     Re: two-step FIFO accessors. They can always be safely used; the STL single-step variants can sometimes be used, with more care, and if your resource model changes you have to review each usage. So I always use the two-step scheme. IOW, if the two-step variants are used, they are always thread safe; if the STL single-step variants are used, they are sometimes thread safe. It depends on your resource model -- how and whether resources are cycled, reused, or shared. [i]If you are using a pooled resource model[/i] (resources cycled/reused), where both the filling thread and the draining thread spend finite time between accessing a FIFO member and doing something with it, you don't want the act of accessing the FIFO member to change the state of the FIFO; you want the act of releasing that FIFO member to change the state of the FIFO. The two-step FIFO usage makes that explicit. (It can usually be handled implicitly without the two-step process... and that will work as long as it does. The explicit two-step process makes it harder to implement this wrong.) It can be, and usually is, arranged that every filling thread is done with the resource before touching the FIFO state.
     And ditto the considerations for the draining thread. The assumption is that only the filling thread adds to the FIFO and only the draining thread pulls from the FIFO. So the filling thread a] gets the next FIFO slot, b] does something to the associated resource (even if just to define it), and c] releases the FIFO slot, changing the FIFO state. Ditto the draining thread. Otherwise, if the act of accessing the FIFO slot simultaneously changes the FIFO state, you could have a condition where an EMPTY FIFO immediately changes state to NOT EMPTY, the waiting draining thread accesses the resource in process, and the filling thread is not finished prepping the resource.

     The above assumes that some kind of pool of resources is being reused/cycled, not being continuously allocated anew by the filling thread. In that case, a filling thread -could- create the resource complete and then change the state of the FIFO, and no harm done. And the draining thread can as well, because the scheme is not re-using resources (like a buffer, FBO, VBO, or texture handle) but continuously allocating and destroying them. But if you switch to a rotating pool of resources (to eliminate the constant creation/destruction of resources), then you might run into this need for 'two-step' FIFOs. These two-step thread safe FIFO things are actually pretty simple. They are tracking a few integers (head, tail, max, count) and maybe maintaining a few booleans (FIFO EMPTY, FIFO FULL, FIFO HALF FULL, etc., to drive events).

     The models I usually use don't actually try to push resource objects themselves through any FIFOs -- the resources come from a pool and aren't copied, but are recycled. A pool manager allocates a new resource if a free one of the requested flavor/size isn't available. When the first filling thread in a chain needs a resource, it requests it from a PoolManager. When the last draining thread in a chain is done with a resource, it returns it to the PoolManager. The PoolManager, most of the time, is simply changing some state value on the resource, to mark it served or available. I also usually wrap the resource with some pool attributes so I can trace which thread is currently banging on a particular resource. The models I use push handles to resources through the FIFOs. Because the locks are on the FIFOs and not the resources, because thread access to these FIFOs is a low % of total thread bandwidth (the beginning and end of each thread's process cycle), and because mods to the FIFO are trivial, you really have to work at it to serialize your threads using this model. The concept of the FIFO itself isolates resources between the threads. The locks are not on the resources, but on the objects that isolate the resources. In that sense, the resources themselves are never locked, but are isolated just the same. The things that are locked are seldom accessed (on a % basis), so no thread is ever left starved waiting on a resource conflict while any lengthy process is being done on it.

     A draining thread only cares about FIFO EMPTY. If its source FIFO is not EMPTY, it can process; if it is EMPTY, only its filling thread can change its state. Same with the filling thread: it only cares about FIFO FULL. If its output FIFO is not FULL, it can process; if it is FULL, only its draining thread can change its state. In most chained thread models, the gating thread is the final compositor thread that pulls resources from its source FIFOs at whatever frame rate is required. The filling threads that service the FIFOs either need to keep up or else the render thread will be starved and frames will be skipped. But that is always the case, even in a single thread model; the output is usually driven by some target frame rate.
     This gets hairy in real-time video processing models, in which there exist both an input contract (sampled video input frames) and an output contract (output video frames). This is a 'two clock' gating problem, even if it is the same clock, and in this case the function of all those FIFOs in the process is to provide compliance for latency. This is why video processors almost always have a video processing delay; there is significant compliance in the streaming model, to accommodate latency. Video processors must lag the output video to accommodate this. You can always tell when they don't, because the audio will be ahead of the video by the amount of the video processing lag. This is why, in the old days with DirectShow, you always saw canned examples of video to disk, and disk to video, but never video to video... it was a largely rigid model tolerant of only one gating sink or gating source. You can always cache ahead disk access, smooth it out and gate video out, or gate video in and cache it to disk, but gating both video in and video out in a streaming model is a challenge. And FIFOs as caches are critical elements. Also, no way anything like that happens in a single threaded model. If live video input (not from store, but live video) ever becomes a significant part of game processing, this will become apparent. Games might tolerate glitched/missed/stuttering frames in the playback, but broadcasters definitely do not.

     I also disagree re: multithreading I/O, even async. If your process spends any time at all waiting for an async I/O to complete, that time waiting can be put to better use. I just completed a project that demanded the highest possible throughput to disk, and it was a streaming model that was not only async but multithreaded; in practical terms, it was the difference between a disk access light that blinked and one that was solid, running at full bus bandwidth. This also required lining up write sizes with target sector size multiples, unrelated to threading; but while this is occurring, a 400Hz-update streaming waterfall plot is being handled as well, as part of the same streaming chain. (The GUI thread isn't updated at 400Hz, but the FBOs are updated in the background at 400Hz and presented to the foreground GUI at a reduced frame rate as a streaming, freezable/scrollable waterfall plot, without interrupting the continuous stream to disk.) I don't think anything close to that is possible in a single threaded model. Not only would you be trying to do it with maybe 1/8th the available bandwidth, but any time spent waiting for async I/O to complete is lost.
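The pool manager sketched in the post above can be reduced to a free list of recycled handles. This is my own minimal interpretation (class and method names invented for illustration): the lock is held only for the brief free-list update, never while a thread is actually working on the resource.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Rough sketch of the PoolManager idea: handles to pooled resources are
// recycled rather than created and destroyed each cycle. A new "resource"
// (here just a fresh handle value) is allocated only when no freed handle
// of the requested kind is available.
class PoolManager {
public:
    // Acquire a handle: reuse a freed one if available, else allocate new.
    std::uint32_t acquire() {
        std::lock_guard<std::mutex> lock(mtx_);
        if (!freeList_.empty()) {
            std::uint32_t h = freeList_.back();
            freeList_.pop_back();
            return h;
        }
        return nextHandle_++;  // stand-in for real resource creation
    }

    // Release: the last draining thread in the chain returns the handle.
    void release(std::uint32_t h) {
        std::lock_guard<std::mutex> lock(mtx_);
        freeList_.push_back(h);
    }

    // Total resources ever created (not currently checked out).
    std::size_t allocated() const { return nextHandle_; }

private:
    std::mutex mtx_;
    std::vector<std::uint32_t> freeList_;
    std::uint32_t nextHandle_ = 0;
};
```

A real version would key the free list by flavor/size and stamp the wrapped resource with the owning thread, as the post describes.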
  3. Here is an approach. You have two full-frame textures you want to composite in some fashion, TLeaving and TArriving. You've set those up with Texture2D samplers so that your fragment shader can sample each of them. At the beginning of the transition, you want 100% TLeaving and 0% TArriving. At the end of the transition, you want 0% TLeaving and 100% TArriving. In between, you want a uniform (set on each frame by your driving CPU code) that varies from 0% to 100% to drive your transition. If it's a wipe, then you can imagine the transformed X coordinate in clip space to be what is driven as the boundary between sampling TLeaving and TArriving in your shader. (By 'frame' I mean a render cycle during this transition. You decide how fast you want to drive the transition -- how many render cycles it is going to take to complete. You drive that externally; your shader responds to the current render cycle/frame. If it is 60 frames, then frame 0 is 0%, frame 59 is 100%, etc.) You could use an if, but don't. Instead, calculate the clip X (gl_PointCoord.x, 0 to 1) for this Transition%, divide every clip X by that value, and assign to an integer to get 0 or 1 (the index of the sampler to use, not to be confused with the normalized range of the clip space, 0 to 1). Use that as an index to sample either sampler[0] or sampler[1]. The alternative, using an if (pos.x > transClipX), will create unique paths in your shader, causing a stall. A stall isn't fatal, but your shader will execute faster if every fragment instance takes the same conditional path. (i.e. if you can arrange it, only use conditionals when each path will be the same for every instance in a workset.) Make it a vertical wipe by using clip Y (gl_PointCoord.y). Make it a fade by alpha blending the two samples, etc.
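The branch-free index trick above can be checked with ordinary scalar code. This is a sketch under my own naming (in the real shader this would be GLSL operating on the normalized X and the transition uniform); the epsilon guard is my addition to avoid dividing by zero at the very start of the transition.

```cpp
#include <algorithm>

// Branch-free wipe index, as described in the post: divide the fragment's
// normalized x by the transition boundary t and truncate to an integer,
// clamped to 1. Fragments with x < t get index 0; fragments with x >= t
// get index 1. Which of TLeaving/TArriving sits at which sampler index is
// a binding choice made by the CPU-side code.
int wipeSamplerIndex(float x, float t) {
    float boundary = std::max(t, 1e-6f);  // avoid divide-by-zero at t == 0
    int idx = static_cast<int>(x / boundary);  // 0 if x < t, >= 1 otherwise
    return std::min(idx, 1);
}
```

Every fragment executes the same instructions regardless of which side of the boundary it is on, which is exactly why the divergent `if` can be avoided.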
  4. You can still benefit from more threads than cores; here is an extreme example: a single core/single lane machine. Is there ever any need for multithreading on such a machine? Sure; suppose you have a prep thread that is waiting on I/O or some other condition, like a FIFO being less than half full. It can yield while waiting to another thread. If you are single threaded, then that single thread eats all the latency in your model, and latency can't be hidden at all. Another approach would be to put that thread in a tight loop constantly polling to see if the FIFO was less than half full, but that defeats the point of using FIFOs... let's assume it's a filling or prep thread; its job is to make sure that its draining thread never sees an empty FIFO. The prep thread can periodically detect 'half empty', wake up and process some number of resources, and then yield again. Same for multi-core/multi-lane machines. As long as threads can intelligently yield when possible (they are waiting for some condition, like a FIFO to be less than half full), then that yielded time can be used by another thread. If you have threads that never yield at all, then each thread will try to consume a complete core. (Depending on the O/S, it will still occasionally lose bandwidth and be parked, but it will always be crying for attention.) Sometimes that is a necessity, but if such tight polls or freewheeling threads can be minimized, a modern O/S can manage lots of well behaved yielding threads far in excess of the number of available cores/lanes...
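A "well behaved yielding" prep thread of the kind described might look like the following sketch. The names and the half-full threshold are illustrative; std::condition_variable stands in for whatever event mechanism the O/S provides, so the waiting thread consumes no CPU instead of spinning in a tight poll.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Sketch: the filling (prep) thread sleeps whenever the FIFO is at least
// half full, and is woken to top it back up as the draining thread pulls.
// The half-FIFO slack is the latency compliance the post talks about.
// Assumes capacity >= 2.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : cap_(capacity) {}

    void push(int v) {
        std::unique_lock<std::mutex> lock(m_);
        // Yield (sleep) until the FIFO drains below half full.
        belowHalf_.wait(lock, [&] { return q_.size() < cap_ / 2; });
        q_.push(v);
        notEmpty_.notify_one();
    }

    int pop() {
        std::unique_lock<std::mutex> lock(m_);
        notEmpty_.wait(lock, [&] { return !q_.empty(); });
        int v = q_.front();
        q_.pop();
        belowHalf_.notify_one();  // may wake the prep thread
        return v;
    }

private:
    std::size_t cap_;
    std::mutex m_;
    std::queue<int> q_;
    std::condition_variable belowHalf_, notEmpty_;
};

// One filling thread, one draining thread; returns values in drain order.
std::vector<int> runDemo(int total, std::size_t capacity) {
    BoundedQueue fifo(capacity);
    std::thread producer([&] {
        for (int i = 0; i < total; ++i) fifo.push(i);
    });
    std::vector<int> out;
    for (int i = 0; i < total; ++i) out.push_back(fifo.pop());
    producer.join();
    return out;
}
```

While the prep thread sleeps inside `wait`, the core is free for other threads, which is the whole argument for more threads than cores on a yielding workload.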
  5. I think conditional branching in shaders only incurs a performance hit if each SIMD instance might take a unique branch. If so, you will get a SIMD 'stall': the instances sharing one branch will execute in the current active working set, then the other branch set will execute, and eventually they will 'sync' up after all branch sets and once again grind away efficiently as SIMD across the entire working set. But just because you take a hit doesn't mean you can't do it. Just be aware there is a hit. Instrument and measure the hit; compare normal cases with extremes. It might not be so bad -- that totally depends on the logic. This issue becomes more front and center with OpenCL, but it also applies to shaders.
  6. OpenGL is perfectly happy with multiple contexts sharing the same handle space via sharelisting. (However, not all flavors of OpenGL currently support sharelisting; OpenGL ES, for example, though it might in the future, according to the folks at Khronos.)

     For 'desktop' OpenGL, it is often beneficial to have 'prep' threads and a main render thread that consumes resources prepared by those prep threads. But for that to happen, the contexts in each thread must share the same handle space. Prep threads can be used to isolate, away from the main render thread, the disk and other latency that must be dealt with to prepare resources; the render thread should only ever deal with prepared resources.

     A resource pool manager (that delivers available handles and accepts freed handles), plus sharelisted threads isolated by threadsafe FIFOs, is more than adequate to guarantee collision free operation without expensive locks. (The only locks required are in the low duty cycle updates to FIFO state and pool manager state; the prep threads and main render thread spend most of their duty cycle prepping and rendering, and little time changing FIFO state, which is simply a matter of updating a couple of integers for head and tail.)

     Headsup with sharelisting: make sure all contexts that are going to be sharelisted are requested before any of them is selected as a current OpenGL context, and this for sure before any resource handles are allocated among the contexts that will be sharelisted. Sharelisted means 'share the same resource handle space', which is required for multithreaded OpenGL.

     Headsup with the design of the threadsafe FIFO: it must use a two-step allocate and release model, because there is finite execution time between when a handle is pulled and when it is prepped or consumed. But that is easily done. A FIFO object is basically tracking a head and a tail in a circular fashion, with some maximum FIFO size. The FIFO should provide booleans for IsFull, IsEmpty, etc.

     You don't have to do any of that when you write a game. It adds complexity. But it provides performance and behavior you can't achieve in a single threaded model.

     as in --------Please wait....scene loading---------...
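A minimal single-producer/single-consumer version of that two-step FIFO might look like the sketch below. The names are mine, and this is deliberately the SPSC setting the posts assume (where each side's emptiness/fullness query is reliable for that side); a production version would add the half-full events and careful memory ordering.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Two-step FIFO sketch: acquiring a slot does NOT change the FIFO state
// seen by the other thread; only the explicit publish/release does. The
// consumer therefore can never see a slot the producer is still prepping.
// Single producer, single consumer; head/tail are free-running counters.
template <typename T, std::size_t Slots>
class TwoStepFifo {
public:
    bool isEmpty() const { return head_.load() == tail_.load(); }
    bool isFull() const { return tail_.load() - head_.load() == Slots; }

    // Step 1 (filling thread): borrow the next free slot to prep in place.
    // Returns nullptr if the FIFO is full.
    T* acquireForFill() {
        if (isFull()) return nullptr;
        return &slots_[tail_.load() % Slots];
    }
    // Step 2 (filling thread): make the prepped slot visible to the drainer.
    void publish() { tail_.fetch_add(1); }

    // Step 1 (draining thread): borrow the oldest published slot.
    // Returns nullptr if the FIFO is empty.
    T* acquireForDrain() {
        if (isEmpty()) return nullptr;
        return &slots_[head_.load() % Slots];
    }
    // Step 2 (draining thread): hand the slot back, freeing it for reuse.
    void release() { head_.fetch_add(1); }

private:
    std::array<T, Slots> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to drain
    std::atomic<std::size_t> tail_{0};  // next slot to fill
};
```

Note how a slot acquired for fill stays invisible (the FIFO still reads as empty) until `publish()` runs, which is the property the two-step model exists to guarantee.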