


JoeJ last won the day on August 20

JoeJ had the most liked content!

Community Reputation

2941 Excellent


About JoeJ



  1. JoeJ

    wavefronts in-flight

    I think this takes much longer too; even accessing LDS takes longer than 4 cycles, I guess (does anyone know some numbers?). What you mean is probably reading data from registers. You want to have data in registers if possible (which are now accessible even from neighboring threads with SM 6). Secondly, you want to cache data in LDS memory, which is much faster than global memory. This is also the common way to share data over the whole workgroup and mostly the key to efficient parallel algorithms. It's also the main difference between pixel and compute shaders. If the term 'Prefix Sum' is new to you in this context, I recommend the OpenGL SuperBible chapter about compute shaders no matter what API you use - it was enlightening to me. Finally, you want to access global memory as little as possible. The 4 cycles per instruction should be correct, because one GCN SIMD is 16 lanes wide, and a 64-thread wavefront is executed in 4 alternating steps. But AFAIK this has no effect on how we should program.
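
For anyone new to the 'Prefix Sum' term used above, here is a minimal CPU-side sketch in Python of a workgroup-style inclusive scan (the Hillis-Steele variant). It only simulates the idea: the list `lds` stands in for the workgroup's shared LDS array, and rebuilding the list each step plays the role of the barrier a real compute shader would need. The function name is made up for illustration, and a real shader would be written in HLSL/GLSL and tuned quite differently.

```python
def workgroup_inclusive_scan(values):
    """Simulate a Hillis-Steele inclusive prefix sum over one workgroup."""
    lds = list(values)  # stands in for the shared (LDS) array of the workgroup
    n = len(lds)
    offset = 1
    while offset < n:
        # Each "thread" i reads from the old buffer; building a new list
        # emulates the barrier between the read phase and the write phase.
        lds = [lds[i] + lds[i - offset] if i >= offset else lds[i]
               for i in range(n)]
        offset *= 2
    return lds


print(workgroup_inclusive_scan([3, 1, 4, 1, 5, 9, 2, 6]))
# [3, 4, 8, 9, 14, 23, 25, 31]
```

The scan needs log2(n) passes over shared memory and no round trips to global memory in between, which is why this kind of sharing through LDS matters.
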
  2. JoeJ

    wavefronts in-flight

    I doubt this but I don't know the details. However, I recommend this guide, which covers older GPUs too, but I did not read that: http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf OpenCL 1.x is the same as compute shaders; it just lacks the advanced command list stuff and indirect dispatch we see in DX12 / VK. So the guide is worth reading, especially the chapter about memory access patterns. Usually yes. But too many wavefronts in flight can cause cache thrashing and so even reduce performance. Consoles can reduce occupancy on demand; on PC you can only reserve more LDS than necessary or waste registers to limit occupancy. I have used this trick on older NV GPUs, where it indeed improved performance slightly in rare cases, but I have not tried it again in recent years. I heard it can be important with async compute beside rendering, but I have no experience myself. In practice I almost always see a linear increase in performance with better occupancy, at least up to 8 wavefronts, not so much beyond that. But this always depends on what you do. I assume what you mean is that the compiler may rearrange instructions so the read happens earlier (which may require more registers to hold the data). This can likely help if the data is in cache; a global read can take 400-800 cycles(!). My own experience here is: it is totally worth spending a lot of time on optimizing register and LDS usage. Be sure to use a debug tool that shows you this, otherwise you are blindfolded. Typically I end up with a speedup of factor 4 after optimization. (On NV this number is much smaller.) But I am talking about pretty complex compute shaders; with simple stuff there might not be many options.
  3. Sure, sorry. Hollow objects can be modeled without COM offset, and if you want to model the mass distribution so precisely, you likely use multiple bodies anyway, e.g. a heavy body for the engine. (I know neither if this works well enough nor if those engines provide out-of-the-box vehicles.) Totally agree. So I'll stop arguing. I just said those engines are used widely, which reduces the need for physics expertise in many cases, and hacks and tricks are common practice. We can say the same about graphics. If you, or the developers in your environment, go beyond that, that's a good thing. And if a degree is required to get a job I can't help that. Personally I applied only once for a game dev job and they would have hired me, but unfortunately I decided against that in favor of another job nearer to my location.
  4. JoeJ

    wavefronts in-flight

    I assume 'Cayman' is an older, by now totally outdated architecture compared to current GCN. At least the article is from 2010. GCN was a big change from previous generations, so the older stuff is useless to know.
  5. JoeJ

    wavefronts in-flight

    The idea is to switch to other wavefronts when the current one waits on global memory operations. So they do not run simultaneously, and instructions do not need to be in sync; they can even be wavefronts from different dispatches. The most important thing to know is that all 10 wavefronts share the CU's registers and LDS memory. That's why we aim for low register and LDS usage, so we have as many wavefronts in flight as possible to hide memory latency.
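
To make the register / LDS trade-off above a bit more concrete, here is a rough occupancy estimate in Python. It is only a sketch under stated assumptions: the limit of 10 wavefronts comes from the post, while the other defaults (256 VGPRs available per lane on each SIMD, 64 KiB of LDS per CU, 4 SIMDs per CU) are the figures commonly quoted for GCN and should be checked against the docs for your hardware. It ignores SGPR limits and allocation granularity, and the function name is made up for illustration.

```python
def gcn_waves_per_simd(vgprs_per_thread, lds_bytes_per_group=0, waves_per_group=1,
                       max_waves=10, vgpr_budget=256,
                       lds_per_cu=64 * 1024, simds_per_cu=4):
    """Estimate how many wavefronts can be in flight per SIMD.

    All hardware defaults are assumptions (commonly quoted GCN figures),
    not values queried from a driver.
    """
    # Register limit: every resident wavefront needs its own VGPRs.
    by_vgpr = vgpr_budget // max(vgprs_per_thread, 1)
    limit = min(max_waves, by_vgpr)

    # LDS limit: workgroups on one CU share its LDS, and the resident
    # wavefronts are assumed to spread evenly over the CU's SIMDs.
    if lds_bytes_per_group > 0:
        groups_per_cu = lds_per_cu // lds_bytes_per_group
        by_lds = (groups_per_cu * waves_per_group) // simds_per_cu
        limit = min(limit, by_lds)
    return limit


for vgprs in (24, 32, 64, 128):
    print(vgprs, "VGPRs ->", gcn_waves_per_simd(vgprs), "waves per SIMD")
```

With these assumed numbers, a shader needs to stay at or below 25 VGPRs to keep all 10 wavefronts resident, and a 64-thread workgroup that reserves 16 KiB of LDS is limited by LDS rather than by registers.
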
  6. JoeJ

    Bumpy World

    That's beautiful geometry - I did not expect that from hearing you talk about prisms and octrees. I want to dig caves into this stuff... seems fun! To make your explanation clear you would need to add some illustrations, maybe in future posts. (I do not get anything.)
  7. Isn't the collision normal only used to calculate a sign, while its exact direction affects neither the collision impulse nor the contact force? (I may be wrong... too much time has passed.) And complaining about missing COM / inertia tensor offset functionality is really exaggerated, because the need for this is very rare. That's like complaining about the approximate 3-number representation of inertia everybody agrees upon, although not every real object has the mass distribution of an ellipsoid. The point I am trying to make is that games are made without targeting realism at all, and problems are solved with hacks. The dead ragdoll jitters? No problem, increase the sleep threshold. The player can push a car because both have a mass of one? No problem, make the car unmovable by the player. The barrel jitters if the player holds it against a wall? No problem - add forcefield GFX to the gravity gun and make the held object wobble all the time - turned into a feature, problem solved. Isn't that how the majority of games are made? Even simulation games?
  8. This sounds more like your personal vision of how all games should be made, but I doubt it's reality for most of them. For one, there is a lot of specialization happening in games, leading to special tasks being offloaded to very few experts in a given field. And those results end up in middleware, engines and tools, often affordable or even almost free. So your view is really that of one of the few experts and maybe not representative of the whole industry. Think of the army of artists necessary to make a game, and also the majority of code where a degree in rocket science would be just a waste. Second, I rarely see impressive simulations in games. E.g. the popular physics libs used everywhere cannot even handle mass ratios of 1:10 well. Game developers often work around such limitations instead of fixing them. They fake and trick stuff, and they succeed with it. It has always been this way, and even with constant progress in better tech it will remain so. Games are primarily the result of creativity, not science. Personally I agree with your vision, if I get you right. Many game programmers are addicted to the idea of simulating reality. But we should not forget that we are talking about games here, which are primarily about entertainment and not science. So there MUST be a place for less educated people as well, even in coding positions. Ruling out their creativity would be fatal. On the other hand, in times where everybody can make an FPS just by downloading UE, your opinion needs to be said out loud more than ever. Keep it coming, but consider there might be people with similar thoughts and skills without a university background.
  9. Did you experiment with prevoxelization and thus streaming static parts of the scene? I assume this would make sense at least for distant cascades where dynamic objects can be ignored, but I'm unsure if it's still worth it when dynamic objects need to be added.
  10. I really like this work: I think it is a diffusion approach, so it avoids calculating expensive visibility. I experimented with this too 10 years ago, using a grid of spherical harmonics. I gave it up in favor of a surfel approach I'm still working on, but it's very promising.
  11. Ok, but then I would just say university is not necessary. I dare to claim I could go through the UE4 source and there should be nothing where my math skills are insufficient to understand how it works - I have done most of those things myself already, and I consider most of it 'easy, but lots of work'. By easy I mean I had no problems learning this stuff myself, and I did not even learn trig or solving a simple system of two linear equations in school. So I still wonder why you think university is such a requirement. Like I said earlier, I regret every day that I did not go to university, because of missing math skills. But this comes from my passion for research, e.g. working on walking ragdolls, or currently quadrangulation and vector field design - open questions and stuff that has not really made it into the games industry yet. L. Spiro's example is better. PBR is definitely a field of active research in games, but after working more than a decade on realtime GI I would not be frightened to dig in there either - it's just about integrating stuff locally, not globally. That said just about math skills (which always worry me...). As an industry outsider I cannot comment on the questions from the OP.
  12. What would be some examples related to game development? I'd just like to figure out if I am missing something essential. Not having been at university, I do not know what you mean in practice, and I guess that's the same for others of my kind.
  13. NV's implementation is very slow, I've heard (but that was many years ago). AFAIK it has not been used in a game yet. I think they used anisotropic voxels (6 colors each) and an octree. Everybody else ruled both of them out. Instead of anisotropic voxels it's better to just use one more subdivision, and instead of an octree it's better to use plain volume textures for speed. That's what people say... personally I would still give octrees a try, though. CryEngine's approach is very interesting: IIRC they use reflective shadow maps, and the voxels are used just for occlusion. Likely they use one byte per voxel to utilize hardware filtering, but using just one bit would work too. This means big savings in memory, so the resolution can be higher and light leaking becomes less of a problem. They have a detailed description on their manual pages. The developers of the PS4 game 'The Tomorrow Children' have a very good paper on their site. They really tried a lot of optimization ideas and have great results, so it's a must-read if you missed it. An unexplored idea would be to use oriented bricks of volumes. So you could prevoxelize dynamic models like vehicles too and just update their transform instead of voxelizing each frame. Similar to how UE4 uses SDF volumes for shadows.
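
The byte-versus-bit point above is easy to put into numbers. The following Python sketch is plain arithmetic for a dense cubic occlusion grid; the resolutions are arbitrary examples, not values taken from CryEngine or any other engine, and the function name is made up for illustration.

```python
def voxel_grid_bytes(resolution, bits_per_voxel):
    """Memory for a dense cubic voxel grid with `resolution` voxels per axis."""
    return resolution ** 3 * bits_per_voxel // 8


MIB = 1024 * 1024
print(voxel_grid_bytes(256, 8) / MIB)  # one byte per voxel at 256^3
print(voxel_grid_bytes(256, 1) / MIB)  # one bit per voxel at 256^3
print(voxel_grid_bytes(512, 1) / MIB)  # one bit per voxel at 512^3
```

A 512^3 grid at one bit per voxel costs the same memory as a 256^3 grid at one byte per voxel, so the saving buys one extra subdivision per axis. The price is that a bit-packed grid cannot use hardware filtering directly.
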
  14. No, it will have a barely noticeable effect, because indirect lighting is dominated by diffuse reflection in practice. Exceptions would be extremely artificial scenes with walls of shiny metal, for example. (For metals you likely get better results using the greyish specular color instead of the almost black diffuse.) But even if you stored view-dependent information, doing that only for the locations of the lights would not work well, I think. Simple interpolation would look more wrong than using Lambert, and spending effort to improve that would not be worth it and would still cause artifacts. To do better you would need directional information at the voxels themselves so they store their environment (e.g. using spherical harmonics / ambient cubes / spherical gaussians etc.), but this quickly becomes impractical due to memory limits. With voxel cone tracing your real problem is that you cannot even model diffuse interreflection with good accuracy, at least not for larger scenes. So you worry about a very minor detail here. If you really want to show complex materials even in reflections, probably the best way would be to trace cone paths. (Or to use DXR of course.)
  15. JoeJ

    An idea for a video game

    Nice, but then your game would be like those that you criticized, only the end would differ (although remarkably). I would try to utilize the idea more, maybe divide the game into 3 acts: 1. You're a divorced cop, your ex-wife gets killed and you start an unofficial investigation. Some of your police buddies help you out. You find evidence that the son of the big gangster boss dated your wife... 2. You start to sabotage the son's life (like you said in the OP). You find more evidence, but not enough to act officially... 3. Showing the cutscene you have planned for the ending. It turns out the son, after killing himself because he became totally crazy, had planned to build hospitals for the poor, and his love for your ex-wife was true... With this plot twist the game could change a lot - maybe more action, darker, but still using the same mechanics, so players that liked it still like it. (Think of the movie From Dusk Till Dawn, although the change there is extreme.) The problem now is that the gangster boss has found out you were after his son, and he sends killers after your only daughter. Also the police are after you now, so no more allies. You fail to save her in the end. The story is banal, but that's what I came up with in a few minutes. My point is that the game changes and keeps going on, and it punishes you for your mistakes. I think of something like Limbo with shooting; there was one great UE3 game in this style, small but very good. Can't remember the name right now... Bioshock Infinite is another example, but things do not really happen surprisingly in those games.