About Helicobster

  1.   OpenCL isn't designed for task parallelism, but OpenMP is actually pretty well suited to forking off a single thread at a time.   And compared to OpenCL, it's very easy to use OpenMP in an existing C++ project.   #pragma omp task would be a good place to start.
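A rough sketch of what `#pragma omp task` can look like in practice. The `sum_of_squares` and `run_jobs` names are made up for illustration, and tasks need a compiler that implements OpenMP 3.0 or later. If OpenMP is switched off, the pragmas are ignored and the same code simply runs serially.

```cpp
#include <vector>

// Hypothetical stand-in for one independent unit of work.
long long sum_of_squares(int n) {
    long long total = 0;
    for (int i = 1; i <= n; ++i) total += static_cast<long long>(i) * i;
    return total;
}

// Forks one task per job. Each task writes only to its own slot in results,
// so no locking is needed.
std::vector<long long> run_jobs(const std::vector<int>& jobs) {
    std::vector<long long> results(jobs.size());
    #pragma omp parallel
    {
        // One thread creates the tasks; the rest of the team picks them up.
        #pragma omp single
        {
            for (int i = 0; i < static_cast<int>(jobs.size()); ++i) {
                #pragma omp task firstprivate(i)
                results[i] = sum_of_squares(jobs[i]);
            }
        } // implicit barrier: every task has finished before we move on
    }
    return results;
}
```

The `single` block matters here: without it, every thread in the team would run the loop and spawn its own copy of each task.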
  2. about OpenMP

    OpenMP is supported in Visual C++ 2010, but it isn't enabled by default when you make a new project. Go to Project->Properties, then Configuration Properties, then C/C++, Language, and set Open MP Support to Yes.

    Kambiz and Yourself are correct, too - the compiler is optimising away the code branches that don't go anywhere.

    Try switching on OpenMP support in your project and running this:

    [source]
    #include <cmath>
    #include <cstdlib>
    #include <iostream>
    #include <omp.h>

    int main(void)
    {
        double t1 = omp_get_wtime();
        float sum = 0.0f; // new!
        for (int i = 0; i < 8; i++)
        {
            float a = 0;
            for (int j = 0; j < 10000000; j++)
            {
                a += std::sqrt(static_cast<float>(j));
            }
            sum += a; // new!
        }
        double t2 = omp_get_wtime();
        std::cout << "time: " << t2 - t1 << std::endl;
        std::cout << "sum: " << sum << std::endl << std::endl; // new!

        sum = 0.0f; // new!
        // reduction(+:sum) gives each thread a private partial sum and adds them
        // together at the end, so the threads don't race on the shared variable.
    #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < 8; i++)
        {
            float a = 0;
            for (int j = 0; j < 10000000; j++)
            {
                a += std::sqrt(static_cast<float>(j));
            }
            sum += a; // new!
        }
        double t3 = omp_get_wtime();
        std::cout << "time: " << t3 - t2 << std::endl;
        std::cout << "sum: " << sum << std::endl << std::endl; // new!
        std::cout << "speed improvement: " << ((t2 - t1) / (t3 - t2)) << "x" << std::endl; // new!

        system("pause");
        return 0;
    }
    [/source]

    You should see a speed improvement that's close to your CPU's core-count. I get around 3.8x on a quad-core.
  3. The problem is that the transparent pixels bordering the opaque pixels are still light grey in their RGB values, even when they have zero alpha. So when you take anything other than a nearest-neighbour sample of those borders, you get a blend of each pixel and its neighbours. So you're getting a blend of alpha values [i]and[/i] a blend of RGB values. So say you get a 50-50 blend of opaque warrior and transparent background. That's a 50-50 blend of alphas (in this case: 50%) and a 50-50 blend of RGB values, in this case a shade of grey that's between the background colour and the warrior's colour. The result you're getting is in fact completely correct - it's your source material that's gone wrong. The solution is to always, [b][i]always[/i] save your sprite art with a black background[/b], even where it's completely transparent. Some programs will solve the problem for you automatically, but most will not. Don't expect a game engine or realtime graphics API to get it right, because alpha blending always expects the foreground layer's transparent areas to fade to black.
  4. SPH simulation - pressure force doens't work

    Is there any particular reason you're dividing the pressure force by the particle density? [quote name='Sepii' timestamp='1347201426' post='4978303'] particles[ i ].velocity += ( particles[ i ].force / particles[ i ].density + gravity ) * dt; [/quote] That means that as particles pile up, the pressure force will decrease. That's the opposite of what you want to happen. You should instead divide the force by each particle's mass, such that [b]F[/b] = [i]m[/i][b]a[/b]. Density is not a substitute for mass in this case. Density shouldn't even be considered as a per-particle value - you're only storing it so that you can sample the density [i]field[/i] via a smoothing kernel. Simply removing that reference to density in the integration step should get you some fluid-like behaviour.
  5. OpenCL ports and wrappers

    Did you know that there's already an official C++ wrapper for object-oriented use of OpenCL? It's in here: http://www.khronos.org/registry/cl/ specifically: http://www.khronos.org/registry/cl/api/1.2/cl.hpp docs here: http://www.khronos.org/registry/cl/specs/opencl-cplusplus-1.1.pdf It's relatively simple and painless to use. First, include cl.hpp. Then you create a cl::Context, some cl::Kernel objects, and some cl::Buffer or cl::Image objects, and finally you load your kernels into a cl::CommandQueue for execution. There are OpenGL-friendly versions of the buffer & image classes in cl.hpp that you can use as VBOs or FBOs for speedy display, too, so if you already have an OpenGL project that you can use for "fast prototyping", you should be able to incorporate a CL context with some texture-processing kernels and see results straight away. Last I checked, though, NVidia weren't using cl.hpp in their GPU computing SDK, and their OpenCL examples were pretty crudely ported from their CUDA examples, accessing the OpenCL API through some rather impenetrable C code. ATI's Stream SDK (aka AMD APP SDK) has some decent examples of proper use of the C++ wrapper, though. ATI seem to be a lot more interested in OpenCL than NVidia are, and this is reflected in their sample code. So even if you're on or targeting NVidia hardware specifically, you may as well install the AMD APP SDK just to have better sample code to learn from. Most of it should still compile & run with few modifications, regardless of your hardware setup - but bear in mind that there are OpenCL [i]extensions[/i] that may be unsupported on one platform or the other.
  6. In all my years working with 3D animation software, I have [i]never[/i] seen an FBX file work as advertised. At best, you can get some object placement information out of them, but the rest you should consider incomplete or otherwise suspect. Anyway - it looks like what's happening is that the Meshsmooth modifier is converting the object from an old-fashioned triangle mesh to winged-edge 'poly' format, but the FBX exporter doesn't notice that the surface is no longer a triangle mesh. You have a couple of easy options: 1: Apply a 'Turn to Mesh' modifier above any Meshsmooth modifier before exporting. This will (non-destructively) convert the surface back into old-fashioned triangles. 2: Replace all Meshsmooth modifiers with Turbosmooth modifiers. Meshsmooth has more features, but it's more or less deprecated. Turbosmooth is far more efficient, and it outputs a triangle mesh.
  7. Issues with blending (transparency)

    [quote name='wolfscaptain' timestamp='1301866910' post='4793961'] Since you can usually enable back face culling, remove geometries that are too far or near, and do all sorts of other optimizations, how many faces are there overlapping in a usual scene anyway? 3? 4? 5? 10? Is that really enough to overwhelm today's technology? [/quote] If you're rendering stuff like smoke as billboard particles, then you can easily have a hundred semi-transparent fragments overlapping. Same if you're relying on blending for rendering vegetation etc. without ugly artifacts. That's why that sort of thing is usually sorted for back-to-front rendering, instead. There [i]are[/i] good methods for remembering the depth, transparency & colour of every fragment you draw, so that they can be drawn in the right order and blended correctly, but they're generally not realtime, and they tend to use a lot of memory. This family of techniques is called 'A-buffer' rendering. It's hard to know in advance how many fragments could end up stacked in each pixel of your A-buffer, so instead of allocating enough memory for hundreds of fragments for every single pixel (which could use hundreds of megabytes), enough is allocated to render small portions of the image at a time (often called 'chunks' or 'buckets'), to a generous and/or variable fragment depth. Pixar's PhotoRealistic RenderMan and many other offline renderers work that way, when they're not raytracing instead.
  8. [quote name='Vectorian' timestamp='1302093680' post='4795012'] When textures are stored on the graphics card, they are upscaled to the closest power-of-two size, so 257x257 becomes a 512x512 sized texture in memory. Also in ye olde times graphics cards did not support non-power-of-two texture sizes at all. [/quote] Nope - most GPUs these days support non-power-of-two texture sizes. If it supports OpenGL 2.0, it'll support any reasonable resolution you can throw at it. There is still a slight advantage to power-of-two texture sizes, though. When creating MIP-maps for trilinear-filtered texturing, each level of the MIP-map is half the size of the one before it. If the texture's size is a power of two, then each pixel of each MIP-map level can be made by simply averaging four pixels from the level before it. This 'binning' reduction is easier to optimise, and it gives cleaner & more predictable results than resampling the image. It's usually handled by the GPU, but you can still reasonably expect power-of-two textures to load a little faster, and/or look a little better at a distance. There's no [i]dis[/i]advantage to using power-of-two texture sizes, either, so while it's not [i]necessary[/i] at all, it's kinda habitual. Much like if you're selling random stuff in a garage sale, you're more likely to offer things for $5, $20 or $100, than for more arbitrary prices like $7, $23.45 or $102. Round numbers are nice.
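The four-pixel averaging is simple enough to sketch. This assumes a single-channel 8-bit image stored row by row, and `downsample_2x2` is a name I've made up. Drivers and GPUs do their own thing internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the next MIP level of a single-channel image whose width and height
// are both even: each output pixel is the rounded average of one 2x2 block.
std::vector<std::uint8_t> downsample_2x2(const std::vector<std::uint8_t>& src,
                                         int width, int height) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    const int out_w = width / 2;
    const int out_h = height / 2;
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(out_w) * out_h);
    for (int y = 0; y < out_h; ++y) {
        for (int x = 0; x < out_w; ++x) {
            const int i = (2 * y) * width + 2 * x;  // top-left of the 2x2 block
            const int total = src[i] + src[i + 1] + src[i + width] + src[i + width + 1];
            dst[y * out_w + x] = static_cast<std::uint8_t>((total + 2) / 4);
        }
    }
    return dst;
}
```

With a power-of-two size you can keep calling this until you hit 1x1 and every level divides evenly. With something like 257x257 the first halving already leaves a leftover row and column, and you're into resampling.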
  9. A workable structure for physics?

    [quote name='SeanH' timestamp='1302115257' post='4795150'] Acceleration is based on Time, so you'll want to store your object's maximum speed. If you store its maximum speed, you'll be able to have objects that move at different speeds. If you don't, you'll need to set up a global constant that you can use for all the objects in your world. [/quote] If you don't know how game physics is usually done, should you really be giving advice on it? If you limit objects' speeds like that, then a player with a top speed of 10km/h riding a train with a top speed of 100km/h would be dragged through the back of the train as it accelerates. Nobody likes getting dragged through a train. The better way to give objects different speeds is by giving them appropriate masses, appropriate forces to move them around, and appropriate forces to resist their movement. It's not that hard to do, and it will make any game more fun (and nicer to look at) than just [i]guessing[/i] at how objects might move. Even super-simple Newtonian physics with boxes & Euler integrators will work, like it worked in Super Mario Bros. and every decent platformer ever since. If you fake your physics naively instead, your game will stop being fun as soon as players encounter unexpected weirdness like not being able to ride a train, or encountering a "strange curve" for planetary gravity that throws them straight into the sun. MrTwiggy - you were probably taught that forces can be represented in non-vectory ways because your teacher was accustomed to doing physics with a pencil & paper. On a computer, it's far more useful & efficient to think in vectors. Basically they just encapsulate X-position & Y-position together (and the same for velocities & forces), but no matter how fancy you want to get with your physics, vectors will always be faster & friendlier than things like compass headings or top speeds.
  10. A workable structure for physics?

    [quote] If you were thinking of writing your own physics engine I would highly recommend you not to. There are libraries freely available, why not use one of those? [/quote] Or, better yet, use several. http://www.adrianboeing.com/pal/index.html The great thing about physics simulation is that concepts like force & mass & velocity are universal to all simulation methods, so you can replace the actual simulation engine whenever you like, and usually expect similar or identical results. Even if you're determined to write your own (which isn't a bad idea - it's just a lot of work, including [i]literally[/i] reinventing the wheel), it would pay to be able to switch engines now & then, just to verify that your results are similar to everyone else's results.
  11. Point / Vector understanding

    [quote name='alvaro' timestamp='1301923729' post='4794184'] Writing more code to reduce functionality that shouldn't exist is a good thing, and we do it all the time. The most obvious example is making data members private: The user of the class shouldn't access that data directly, and we write extra code to make sure that it stays that way. [/quote] I think you meant to say "[b]if[/b] the user of the class shouldn't access that data directly, we write extra code to make sure that it stays that way". Personally, I only make members private if there's the potential for something to go wrong otherwise. I'm not in the habit of adding gatekeeper code out of superstition - I add it where it's actually necessary. This means I write a little extra code to prevent users from having to conduct their own sanity-checks when using volatile data. Result: less code overall, and fewer hoops to jump through. [quote] Let me give you one example of how treating every three-coordinate object the same way sometimes just doesn't work. You import an object from a 3D modeling software, and it comes with vertices and normals. Now you would like to make the object taller, applying an affine transform that effectively multiplies the z coordinate of every point by 2. How should this transformation affect normals? [/quote] Uh... I'd scale the vertices and their normals with separate functions. Why do you assume I'd scale the vertices and their normals with the same transform? You seem to be under the impression that by treating points & vectors as the same thing, I'm capable of forgetting what I'm using them for at any given time. Like I said before, the context in which they're used makes it clear enough how to use them. By the same token, we don't need separate classes for 'float' and 'real' to keep us from accidentally dividing when we want to subtract, or multiplying when we want to add.
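For the stretch-it-taller case, the "separate functions" can be as small as this. It's a sketch for axis-aligned, non-zero scale factors only, and the `Vec3`, `scale_point` and `scale_normal` names are made up. For a general matrix the usual approach is to transform normals by the inverse transpose.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Positions scale directly by the per-axis factors.
Vec3 scale_point(Vec3 p, Vec3 s) { return { p.x * s.x, p.y * s.y, p.z * s.z }; }

// Normals scale by the reciprocal factors, then get renormalised.
// Assumes every component of s is non-zero.
Vec3 scale_normal(Vec3 n, Vec3 s) {
    const Vec3 r{ n.x / s.x, n.y / s.y, n.z / s.z };
    const float len = std::sqrt(dot(r, r));
    return { r.x / len, r.y / len, r.z / len };
}
```

Take a surface sloping at 45 degrees with tangent (0, 1, -1) and normal (0, 1, 1). Doubling z turns the tangent into (0, 1, -2), and `scale_normal` gives a normal along (0, 1, 0.5), which is still perpendicular to it. Same three floats, different function, no need for a separate class.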
  12. Point / Vector understanding

    [quote name='alvaro' timestamp='1301666518' post='4793064'] I can measure the angle between two vectors, but not between two points. I can measure the length of a vector, but there's no such thing as the length of a point. You can normalize a vector, but not a point. You can scale a vector by 10, but not a point. Those should be enough examples. [/quote] Your examples are of things you've decided that you're not going to be able to do, because you've decided in advance to make a distinction between points & vectors. Those of us who do not make that semantic distinction do those things all the time, however, without issue. If point==vector, angles & lengths & scales all become meaningful & obvious automatically. Any possible ambiguities are resolved by the contexts in which they're used. For example if you're trying to scale a point by 10 [i]because you're scaling vertices of a 3D mesh[/i], it makes perfect sense in that context. There's no context in which you'd need to find the angle between [i]just two[/i] points, though, so that issue never arises. Well... not for much longer than it takes to realise what you're doing, anyway, and think of a third point to use. It sounds like your academic background has left you with unnecessary baggage, and now you're using [i]more[/i] code to produce [i]less[/i] functionality. You may be quite sure you'd be deemed correct in academic circles, but are you sure that the extra complexity is doing you & your code [i]any good[/i]?
  13. If all your springs are doing is flowing in short chains from a constraint or fixed point (say - a cape attached to a character's shoulders, or a flower blowing in the breeze) then yeah, it doesn't really matter if you skimp on the higher-order aspects of the physics. But you did ask for the "best way", and not just "a way" to stabilise your cloth simulation. From your initial description, it sounds like your system is already showing the shortcomings of techniques that don't store velocity & acceleration separately. It's something I've seen myself countless times, and the solution is always the same: store velocity (or last-position in addition to current-position) for generally pleasing motion, and store acceleration (or the forces that induce those accelerations) for stability & realism. Chances are it'll cost you only a few kilobytes of extra RAM and a few milliseconds of extra CPU time, but it'll let you build spring & mass systems as jiggly or as taut as you like.
  14. When you say 'perturb', do you mean you're modifying the nodes' positions directly? Or are you modifying their velocities, or accelerations? And what do you mean by "all operations are performed on the same buffer"? That sounds like trouble. It sounds like you're not storing acceleration, but applying spring forces directly to velocity and/or position all in one 2D loop. Whatever integration method you're using (Verlet, Euler, RK4 etc.), if you've got particles interacting with each other, you should be storing a position, a velocity and an acceleration per particle, or (depending on the integration method you're using) their functional equivalents. At the very least you need a way of storing & retrieving each particle's position, and both the first and second derivatives of its position (ie: velocity and acceleration). Otherwise the outcome of any physical simulation would depend on the order in which particles were processed. By the time one element has been modified, the conditions for processing the next element will have changed, and you'll have an accumulation of errors that propagates throughout the system. If you're not storing acceleration (or force) per particle, you should do so. So instead of 'perturbing' the particles all in one loop, you should have a spring-wise loop to collect the spring forces acting on each particle (including damping), a particle-wise loop to collect the other forces acting on each particle (gravity, wind etc.), and an integration loop to apply those forces and update the particles' positions. Then the order in which particles & springs are processed won't matter at all, and you can even parallelise the simulation for multi-core CPUs or GPUs. Adding more connections won't help in your case, but they can still be fun. Diagonal springs (when they're working correctly) will counteract shearing/skewing of the cloth, but that's not really necessary for a net. 
Longer connections (connecting particles to their neighbours' neighbours) will stiffen the cloth, but again, for a tennis net, that's probably not necessary, except perhaps along its top edge. Springs going up, down, left & right would be sufficient.
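A bare-bones sketch of that three-loop structure, under my own assumptions: the struct layouts, the `pinned` flag and the semi-implicit Euler integrator are placeholders, and the gravity pass comes first so it can double as the reset of the force accumulators.

```cpp
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

struct Particle {
    Vec2 pos, vel, force;
    float mass;
    bool pinned;   // e.g. the net's attachment points on the posts
};

struct Spring {
    int a, b;      // indices of the two particles it connects
    float rest_length, stiffness, damping;
};

void step(std::vector<Particle>& particles, const std::vector<Spring>& springs,
          Vec2 gravity, float dt) {
    // Particle-wise loop: reset each accumulator to the external forces.
    for (Particle& p : particles)
        p.force = { p.mass * gravity.x, p.mass * gravity.y };

    // Spring-wise loop: reads positions & velocities, writes only to forces.
    for (const Spring& s : springs) {
        Particle& a = particles[s.a];
        Particle& b = particles[s.b];
        const Vec2 d{ b.pos.x - a.pos.x, b.pos.y - a.pos.y };
        const float len = std::sqrt(d.x * d.x + d.y * d.y);
        if (len == 0.0f) continue;  // direction undefined, skip this spring
        const Vec2 dir{ d.x / len, d.y / len };
        const float closing = (b.vel.x - a.vel.x) * dir.x + (b.vel.y - a.vel.y) * dir.y;
        const float f = s.stiffness * (len - s.rest_length) + s.damping * closing;
        a.force.x += f * dir.x;  a.force.y += f * dir.y;
        b.force.x -= f * dir.x;  b.force.y -= f * dir.y;
    }

    // Integration loop: only now do velocities and positions change.
    for (Particle& p : particles) {
        if (p.pinned) continue;
        p.vel.x += p.force.x / p.mass * dt;
        p.vel.y += p.force.y / p.mass * dt;
        p.pos.x += p.vel.x * dt;
        p.pos.y += p.vel.y * dt;
    }
}
```

Because nothing moves until the last loop, every spring sees the same snapshot of the system, so shuffling the order of the springs or particles shouldn't change the result beyond floating-point rounding.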
  15. Just so you know, you don't have to reduce the size of the FBO itself. You can just create it at a generous size, then just draw a smaller picture in the corner when you need to, and then only display the corner you've drawn. Some platforms might not like it if you were constantly making new FBOs of different sizes.