
Poll - plane equation convention.


I recently changed my math code base because of SIMD optimizations, so I wondered: which convention are you using?

(1) Plane.N * P + Plane.d; // (* P.w), * being the dot product
(2) Plane.N * P - Plane.d

In other words, for you, does the signed distance d (projected onto the normal axis) go from plane to origin (1), or from origin to plane (2)?
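For reference, a minimal scalar sketch of the two conventions (struct and function names are purely illustrative, not from any particular code base):

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Convention (1): points on the plane satisfy dot(N, P) + d == 0,
// so d = -dot(N, anyPointOnPlane).
struct PlaneAdd {
    Vec3  N;   // unit normal
    float d;
    float signedDistance(const Vec3& P) const { return dot(N, P) + d; }
};

// Convention (2): points on the plane satisfy dot(N, P) - d == 0,
// so d = dot(N, anyPointOnPlane).
struct PlaneSub {
    Vec3  N;   // unit normal
    float d;
    float signedDistance(const Vec3& P) const { return dot(N, P) - d; }
};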

Signed distance is measured along the normal. I use a ray/plane intersection to find the distance, and it works like this:

d     = -(PlaneNormal dot PlaneOrigin)
numer =  (PlaneNormal dot RayOrigin) + d
denom =   PlaneNormal dot RayVector

If denom == 0, the normal is orthogonal to the ray direction, so the ray is parallel to the plane and we can't intersect. Otherwise:

distance = -(numer / denom)



The signed distance from a point is basically what you get by shooting a ray from the point along the plane's normal. It looks something like this:

distance = (Point dot PlaneNormal) - (PlaneNormal.x*PlaneOrigin.x + PlaneNormal.y*PlaneOrigin.y + PlaneNormal.z*PlaneOrigin.z);
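Putting the two formulas above into compilable form, here is a minimal C++ sketch (function and parameter names are my own, purely illustrative):

#include <optional>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Signed distance from point p to the plane through planeOrigin with unit normal n.
inline float signedDistance(const Vec3& p, const Vec3& n, const Vec3& planeOrigin)
{
    // dot(n, p) + d with d = -dot(n, planeOrigin)
    return dot(n, p) - dot(n, planeOrigin);
}

// Parametric distance t along the ray origin + t*dir to the plane, if it exists.
inline std::optional<float> intersectRayPlane(const Vec3& origin, const Vec3& dir,
                                              const Vec3& n, const Vec3& planeOrigin)
{
    const float denom = dot(n, dir);
    if (denom == 0.0f)                 // ray parallel to the plane: no intersection
        return std::nullopt;
    const float d     = -dot(n, planeOrigin);
    const float numer =  dot(n, origin) + d;
    return -(numer / denom);
}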

EDIT:

It doesn't matter, because the signed distance from a plane normal to the plane will always be 1.

As for the signed distance from a point, I write it like this:

Distance=Plane.SignedDistanceTo(Point);

[edited by - RhoneRanger on November 28, 2003 12:17:11 AM]

I use the one where you subtract; it makes the most sense because d is then the distance of the plane from the origin, in the direction of the plane normal. All of the Quake engines use it, and this is a retarded poll.

z80: Isn't the form Ax + By + Cz + D = N . P + D used pretty much everywhere?
Yep, except in the non-retarded community (ROFL).

Rhone :

You should clear your mind about PLANE EQUATIONS (Google search). I never mentioned ray/plane intersections in this thread, only planes (equations).

Mathematically, in 3D linear algebra a plane is a hyperplane: the orthogonal (thus complementary) 2D sub-space of a 1D vector. In affine math you just add a translation T, which condenses to T*N = our d in question. But here I am not basing my judgement on math conventions, but on efficiency.

Otherwise you are right, the convention won't change the result (of course), BUT IT MAY CHANGE CODE SPEED (SIMD). I thought my post was implicitly clear about that.

shadow12345:
If you knew me a bit better you would know I never waste time here writing nonsense. Just because (2) is the standard used in the Quake engines (and I also used it independently, well before Quake1) does not mean it's the right choice. The + choice (1) makes it possible to use fewer SIMD instructions. Quake3 does not use 3DNow or SSE optimizations, which is why the game is slow on old PCs, so it's not the best example.
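To make the SIMD point concrete, here is a minimal SSE sketch of convention (1) (my own illustration, not code from any engine mentioned here): with the plane stored as (A, B, C, D) and the point as (x, y, z, 1), the signed distance is a single 4-wide multiply plus a horizontal sum; convention (2) would need an extra subtraction of d (or storing -d) on top of this.

#include <xmmintrin.h>

// plane = (A, B, C, D), point packed as (x, y, z, 1)
inline float planeDistanceAdd(__m128 plane, __m128 pointW1)
{
    __m128 prod = _mm_mul_ps(plane, pointW1);                       // (A*x, B*y, C*z, D*1)
    // Horizontal sum of the four lanes using plain SSE1 shuffles.
    __m128 shuf = _mm_shuffle_ps(prod, prod, _MM_SHUFFLE(2, 3, 0, 1));
    __m128 sums = _mm_add_ps(prod, shuf);                           // pairwise sums
    shuf        = _mm_movehl_ps(shuf, sums);
    sums        = _mm_add_ss(sums, shuf);                           // A*x + B*y + C*z + D in lane 0
    return _mm_cvtss_f32(sums);
}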

"and this is a retarded poll"
ROFL. Thanks, I have 24 years of game programming experience; I started at the age of 10 and learned trigonometry and linear algebra at the same time. If you have something to learn about 3D math, boy, ask me. Thanks for answering anyway.

Question: are you also one of those who believe one can't beat Visual C++ with hand-written code or even code tweaks?

Well, here is a clear example. The coder can choose whether he uses choice (1) or (2); the compiler can't do it for him. And if you choose (1) and compile with VectorC, you may produce a better .asm/.obj. I hope you learned two things today:

- keep your hurried judgements ("retarded") to yourself until you are sure.
- think outside the box.

[edited by - Charles B on November 29, 2003 7:23:04 AM]

I type fast, I code fast, and I write better and faster code than what I have read in the Quake sources (1). Now if you don't want to profit from the valuable time spent here by some experienced 3D coders like me, don't even waste your time reading threads. I see the level constantly degrading here because the noobs seem to become more pretentious and impolite every day, and the real coders get tired of helping these ungrateful noobs. However, I still admire Carmack, especially for the implementation job he has done on the BSP trees in Quake1.

I posted more garbage but I deleted it. You're right, you're obviously god and I'm *totally* incompetent.
*too many edits, back to work*
[edited by - Shadow12345 on November 29, 2003 2:36:36 PM]

[edited by - Shadow12345 on November 29, 2003 2:37:41 PM]

[edited by - Shadow12345 on November 29, 2003 2:38:03 PM]

[edited by - Shadow12345 on November 29, 2003 2:38:26 PM]

Guest Anonymous Poster
I use the

- normal dot point

format in my SIMD library.

PS: not everyone uses VC++. Some of us code on the PS2, etc. And yes, especially with SIMD code, you can still beat the compiler. VC++, gcc, and CW still generate redundant loads and stores.

"you''re obviously god"

No, I am not, since God is unique and I mentioned experienced coders (plural). Anyway, if writing better code samples (1) than id Software makes someone a god, then I believe you don't believe much. Being one of them, I suppose, requires some curiosity and open-mindedness; that is, for instance, not trying to bash threads when you are not able to imagine the initial intentions behind them.

You could be interested in the fact that I am writing the fastest asm/C/C++ multiplatform math lib ever produced. Dot products in less than 2 cycles, etc... that is, as good as hand-written asm... in portable C/C++. And making such a piece of software requires asking a well-chosen set of apparently innocent questions... I could write 30 pages about it, but you would quickly be lost. So keep on coding... but don't be too blind, else it's actually wasting time. While you'll be struggling to reach 30 FPS, my first debug version of the same soft will run at 500 FPS.

(1) I don't talk into the void; back to planes:

Quake1 code:

if( pPlane->N.x > 0 )
    dist += pPlane->N.x * pAABox->xmin;
else
    ...

My code:

sx = signbit(pPlane->N.x);
sy = signbit(pPlane->N.y);
sz = signbit(pPlane->N.z);
d  = AABox[0][sx]*pPlane->N.x + AABox[1][sy]*pPlane->N.y + AABox[2][sz]*pPlane->N.z + ...

This means a box/plane test is nearly as fast as a single plane equation. The code in the tutorial on GameDev is 8 times slower...
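For context, a self-contained sketch of that sign-bit idea (array layout and names are my own assumption, not Charles's actual code): the sign bit of each normal component selects the box corner that maximises the dot product, so a whole-box-behind-plane test becomes one dot product and a compare.

#include <cmath>   // std::signbit

struct Plane { float nx, ny, nz, d; };        // convention (1): n.p + d = 0

// box[axis][0] = min, box[axis][1] = max along that axis.
inline bool boxBehindPlane(const float box[3][2], const Plane& pl)
{
    // signbit is 1 for a negative component: pick the max corner (index 1) when the
    // component is >= 0 and the min corner (index 0) otherwise, maximising n.p.
    const int sx = std::signbit(pl.nx) ? 0 : 1;
    const int sy = std::signbit(pl.ny) ? 0 : 1;
    const int sz = std::signbit(pl.nz) ? 0 : 1;

    const float dist = box[0][sx] * pl.nx
                     + box[1][sy] * pl.ny
                     + box[2][sz] * pl.nz
                     + pl.d;                  // signed distance of the farthest corner
    return dist < 0.0f;                       // the whole box is on the negative side
}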

At last, anonymous, another constructive post. Yep, the load/store issue is crazy, since the same logic has applied to the FPU (since the first Pentium) and, more recently, to the integer unit. Sadly, in Visual C++ 6, Intel intrinsics such as:

x = _m_paddd(x, *pSomeData);

compile to:

movq  mm1, [pSomeData]
paddd mm0, mm1

instead of the single paddd mm0, [pSomeData] one would hope for.

This results in more code, more wasted cycles, and unnecessary register consumption (while we're always so limited with only 8). So after 10 years of hardcore coding I still wonder why enormous corporations that make billions in profit (MS, Intel) can't even produce a final compile pass that reorders/rewrites the asm, instead of assuming the processor's reordering stage will cover for their laziness, which in practice is usually false.

I am not courageous enough to do it myself. But grr, it's not our job.


[edited by - Charles B on November 29, 2003 3:12:46 PM]

Ironically, my real name is Charles. Anyway, it's not difficult to confuse people; it's being smart enough to teach dumb people that would really make you stand out. Also notice I've successfully enticed you with every provoking post, malleable as clay.

Sure, my coder ego is of the same order of magnitude as my skills. But beyond everything, it's code truth that counts. To me, code truth is measured in 1/clock cycles, number of bugs, and the time required for someone to extend the code.

But ego is not the point; the point is, if you read carefully, that it entitled me to expand the scope of the subject. So thanks for your sneaky constructive approach. I also like the law of talion when someone pisses me off. But sure, not everyone is currently concerned with writing an ultra-fast portable math lib. And if I am doing it, it's simply because such a thing does not exist, or is not affordable, for our team project, which will be highly CPU-intensive and also scalable/portable. However, maybe one day soon I will release the math lib to the community. That's the main reason I posted such an innocent poll.

This lets me see:
- first degree: the actual convention people use.
- second degree: their level of concern for actual math optimizations (any single function is small, but many rivers...).
- third degree: it catches the attention of coders who would be interested in the same subjects, especially those working on the PS2.

[edited by - Charles B on November 29, 2003 3:34:56 PM]

Hey I think it's all cool stuff. I gave my reply on what I use, but I guess my reasoning wasn't really even pertinent to your purpose (I guess you're trying to code it in the fastest way possible, whereas I just use the method that makes the most sense to me).

I definitely think it's important to squeeze as many clock cycles as possible out of a math library with the new graphics technology we have. I admit I do a lot of stupid stuff in my own math library that most people would scoff at, e.g. copying things by value, which snags you with hidden copy constructors and the like. I'm only at the point where I want to get stuff working. (I'm only 17, I've only been coding this engine for about half a year, and I have taken no special math courses, but I am bright and can figure things out on my own from various math books. I've implemented complex collision detection between moving objects while keeping track of where they are in the world based on the BSP tree, and I've developed a very interactive entity hierarchy which is actually turning into a real game with simple 3D animated enemies, comparable in complexity to GLQuake, with about as many lines of code.)

So, does it really make *that* huge of a difference in performance whether you add or subtract the plane's distance? My engine runs wicked fast, upwards of 1700 FPS depending on resolution and what you are looking at in the world (no joke about that number either, and I don't have the fastest system), so I've only done basic code profiling and I don't worry too much about housekeeping stuff, only algorithms and highly complex math.

[edited by - Shadow12345 on November 29, 2003 3:59:42 PM]

[edited by - Shadow12345 on November 29, 2003 4:01:14 PM]

Well then, congratulations on your (lone) efforts. 17 is roughly the age when I started basic 3D. It was also more difficult in one sense, since it was all software. But managing collision detection and BSP trees at the age of 17 is a cool start. So my apologies for my harsh answers. But in the end this leads to constructive exchanges.

Well, if you render Quake scenes, first, the poly count is that of the old generation of 3D engines, so the amount of CPU processing on today's hardware is not that big. Today's state-of-the-art technology should render 30K, 50K, 100K or more polys after all CLOD and culling (occlusion too) has been done.

To a first approximation you are right that it's wise to assume most work is done by the 3D hardware. However, CLOD requires CPU work, and so do many other techniques. For instance, my terrain engine is infinite in precision and distance. To do this I need to feed the GPU with a lot of data, updated per frame or at a lower rate, that is computed on the CPU. You mentioned physics. Today's games are far from optimal in the way they implement physics. Imagine a game where any object that could be animated in real life could also be animated in the virtual scene; for instance, you fire a rocket into a wall (made of individual bricks when you get close) and the bricks explode and may hit the player (collision detection).

So as you can see, if you want to use GPU capacity to the max, the CPU also becomes a serious bottleneck for the most ambitious new challenges. Personally, for instance, I generate new LODs of textures on the fly with all lighting, self-shadowing, etc... You can see details down to one millimetre. The GPU, even with pixel shaders, would be strangled, because I use 16 shadowing passes per light to simulate penumbra. On the CPU I can divide the refresh rate by 100 and update once per second in texture space, with hardware blending between transitions. This is just one example.

Now for the math lib more specifically. Look at the most-used functions found with your profiler. There are certainly many of them apart from one or two main loops, and everywhere you will find basic maths: Sqrt, RSqrt, dot/cross products, quaternions, etc... OK, now here is something I tell you not out of vanity but because I was surprised myself, in one of the fields I consider myself a master of. There is an incredible difference between optimal C/C++ code relying on asm or intrinsics (let's call it OPTI) and standard OOP math classes (I call it STD). The factor is (read carefully):

- between 10 and 100 versus STD in debug mode
- between 2 and 20 versus STD in release mode
- OPTI is at worst only 50% slower than the most carefully scheduled hand-written assembly (twenty years of asm coding experience in my bag), and it sometimes even beats it, because my C macros/functions let the compiler reorder code in big functions, which would be impossible with macros in asm coding, for instance.

So as you can see there is a lot of room for improvement in common math libs.

Now here is my conclusion. I always tend to resist the fashions of the day in computer science. Such simplistic assertions are made by "software engineers" who think, for instance, that compilers are so good that they do everything for you once you code decently in C++. That's false. I always see layers in my software architecture; in the lower layers I still keep a fair amount of crazy hacks, but all in all my code compiles on a Mac, PC or Unix seamlessly, even when I code on Windows for a week. Today the illusion is that one can do everything with (vertex/pixel) shaders. Yep, everything in a demo. But you won't fill my infinite scenes with shaders alone; you need algorithms. Remember that the CPU is still where the most freedom and creative potential lies, because it's a generic processor; that's where you design the highest-level and thus the most crucial optimizations. It used to be BSP trees; now it's ABTs or quadtrees.

So remember this: the GPU is the lair of brute force. The CPU is the lair of subtle force, of algorithms. The real potential of the CPU may in fact still totally surpass the equivalent GPU implementations. Thus code-level optimizations on the CPU, the purpose of my math lib, must be weighted by a huge factor, 100 in my terrain-textures example.

This assertion will only become false when the boundary between the GPU and the CPU becomes very blurry. It is already the case in the very interesting but complex architecture of the PS2: on the PS2 you need to handle the CPU AND the GPU very precisely, not only the GPU. On a PC that is also what one should try to do.

(I read it again; sorry, it's late and my English sucks. I hope you understood what I meant.)


[edited by - Charles B on November 29, 2003 9:31:26 PM]

Everything you said makes sense (I did have a little bit of trouble understanding what you meant, so I read what you wrote twice), and I'm not at the point where I have to worry about things like per-pixel lighting and stencil shadows, so I have plenty of processing power to do what I want. It's difficult enough to actually write a game engine, and there are no tutorials out there that say 'this is how you write a game engine'. You mentioned rendering in software. I was wondering, how many developers (aside from the device driver writers who implemented OpenGL and Direct3D) actually know how to write a software renderer? I know the basics of casting a ray out into the world to intersect objects, finding where it intersects the view plane, and interpolating across the polygon to visit each pixel on each scanline applying the texture/color, but I haven't actually written a software renderer and I don't know if it is a good idea to spend the time to do so. Would that be something I should take the time to learn and implement? I'm wary of going into extremely difficult things like that because I'm afraid it will take too long, and I could instead be heavily focusing on my BSP game engine (which renders with OpenGL only right now).

Also, are you actually going through and implementing square root? That'd be very impressive. I've looked at Carmack's InvSqrt function but I don't understand it, and I've tried writing my own several times. The first was a 'guessing' program that refined its guess at the sqrt until it got within a certain value, with a certain percent accuracy; the second time I tried doing it using derivatives, only to find that in order to compute the derivative of a sqrt you need to call sqrt (the derivative of sqrt(x) is 1/(2*sqrt(x))), so that didn't work.


quote:

Such simplistic assertions are made by "software engineers" who think, for instance, that compilers are so good that they do everything for you once you code decently in C++.


Yeah, I definitely do not believe that, but at the same time I've tried finding the 'perfect tweaks' for the compiler and it seems next to impossible without trying to dissect things and re-write everything in asm (which is what you are doing with the mathlib).

I've got a question because you code on the PS2: exactly what is under the hood of a PS2, and who writes the drivers for its CPU and GPU and fabricates the hardware? For example, I visited SGI this summer (yes, for real, I have pictures!) and they have their own custom supercomputers with special architecture (they have a low frequency, about 700 MHz per processor, but the actual computing power easily matches a 2.2 GHz P4, plus they have upwards of 64 processors in one supercomputer; I think they made it low frequency to be stable). However, SGI doesn't make their own processors; they only come up with the designs and then IBM actually produces the processors. I'm wondering if the PS2 is just a Pentium III for a CPU, like the Xbox, with a special integrated GPU of some sort melded with already existing technology and software, or if the hardware and software is a custom design just like SGI's systems.

I've got a lot more to say/ask, but I think that's enough for right now.

Oh, also, I can send you the code for my project sometime if you want.


[edited by - Shadow12345 on November 29, 2003 10:00:40 PM]

[edited by - Shadow12345 on November 29, 2003 10:02:26 PM]

quote:

I've got a question because you code on the PS2: exactly what is under the hood of a PS2, and who writes the drivers for its CPU and GPU and fabricates the hardware? [...] I'm wondering if the PS2 is just a Pentium III for a CPU, like the Xbox, with a special integrated GPU of some sort melded with already existing technology and software, or if the hardware and software is a custom design just like SGI's systems.

Basically, in the PS2 you have a slow and cacheless CPU, two vector units (basically one (VU0) for doing work in collaboration with the CPU, and one (VU1) for transforming vertex streams and driving the graphics chip), and then the GS, the graphics chip, which is the nice part of the whole thing. There is also a fast DMA controller to transfer data all around the place.

The GS has good brute-force potential and very good fillrate IMO, but the fact that you have to code so much on the VU1 is a real pain in the ass, as is the lack of cache. Parallelization makes it hard to develop and optimize, but PCs are beginning to have the same problem of balancing the workload with modern GPUs now...

For the sake of developing a math lib using VU0, you are at the mercy of the compiler allocating SIMD registers and optimising away copies, so there's not much to do if it doesn't (like CodeWarrior; SN Systems' compiler is much better, but it seems (I hope, since I work with CW) that Metrowerks will catch up)...

BTW, Charles, I see you're French; which game company are you working at?

@Tramboi
I don't think there are many valuable companies left in France after the crash, though there are many skilled guys (like Yann L, for instance). So I currently work on my own funds (to survive) and I have started an international team to make a technically ambitious online game. (Check my profile if you want the link to Small Big Game For Real Coders.) Concerning my past experience, I was just fed up with being "exploited" by carpet sellers who know nothing about game dev and who several times ruined my attempts to do something worth releasing. You probably know how uncomfortable game dev can be anywhere; in France it was just hell.

@Shadow
You misunderstood me on two points:
- I am not writing a software renderer today, though I have coded many in the past. I simply use the CPU for some things, like generating textures (or light maps, if you want) on the fly. It's not quite the same.
- I have never coded on a PS2, though I regret not knowing this interesting machine a bit better, because I am somewhat of a masochist coder. I am just interested in getting feedback from console coders to see if there could be a way to make a totally cross-platform math lib. Currently it's MacOS X (AltiVec) and PC, Unix, Windows (MMX/3DNow/SSE/SSE2) compatible.

About RSqrt and Sqrt: I simply cut-and-pasted the "Carmack" square root code for the FPU version. I also use the fast RSQRT of 3DNow. All in all it's just 0.1% of the math code base, and I am not sure Carmack created this code; such things have been known for decades. What counts is implementation hacks depending on a certain generation of hardware. You also have to read about the IEEE 754 floating-point format. Otherwise it's based on series expansion (but I doubt you'll learn about that before you are 19 or 20, depending on your studies). However, it's not so complex to understand, and yes, it's based on derivatives.

One of the secrets of the "Carmack" code is this :

In general (x^e)' = e*x^(e-1), and RSqrt(x) = x^(-0.5), so RSqrt'(x) = -0.5*x^(-1.5).

So:

(1) RSqrt(x+dx) = RSqrt(x) - 0.5*x^(-1.5)*dx + O(dx*dx).

So when dx is small, the second-order term O(dx*dx) is negligible. There are some mathematical properties of RSqrt which make this equation very useful, but that's too complex to explain here.

This equation is the basis of what's called Newton-Raphson refinement. You want to compute y = RSqrt(x). Once you have a rather good approximation (*), call it y' = RSqrt(x'), where y' is close to y and thus x' is close to x, with dx = x - x', you can exploit equation (1):

And if you work a little bit with (1) you will find the explanation for this "magic" line (2) :
y*=(1.5f-0.5f*x*y*y);

(*) study the code of Carmack and try to find what this approximation is.
i = magic - (i >> 1); // based on the IEEE 754 format, exponent and mantissa.

And Sqrt(x) = x*RSqrt(x).

BTW, (2) can be very useful for computing speedy, exact point-light contributions along scanlines (on a lightmap row, for instance), because as you step u to u+du the light vector's length varies continuously. No full RSqrt required, just (2).
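For completeness, here is a minimal C sketch of the trick being described: the well-known Quake III-style routine, with the float/int reinterpretation done via memcpy rather than the original pointer cast (0x5f3759df is the widely documented "magic" constant):

#include <string.h>   /* memcpy */
#include <stdint.h>

/* Fast approximate 1/sqrt(x): IEEE 754 bit hack plus one Newton-Raphson step. */
static float fast_rsqrt(float x)
{
    uint32_t i;
    float y;

    memcpy(&i, &x, sizeof i);            /* reinterpret the float bits as an integer */
    i = 0x5f3759df - (i >> 1);           /* i = magic - (i >> 1): first approximation */
    memcpy(&y, &i, sizeof y);

    y = y * (1.5f - 0.5f * x * y * y);   /* equation (2): Newton-Raphson refinement */
    return y;
}

/* And, as noted above, Sqrt(x) = x * RSqrt(x). */
static float fast_sqrt(float x) { return x * fast_rsqrt(x); }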

Personally, I use the formula:
Ax + By + Cz + Dw = 0;
That way, when I do Dot(Plane, Point) I get the distance from the plane to the point (knowing that my plane's w = -dot(Plane.N, a point on the plane) and my point's w = 1).

Because of this I have to use 4D dot products. I find this works perfectly: in the case Plane . Point the w's multiply to add d, and in everything else (Vector . Point, Vector . Plane, Vector . Vector, Quaternion . Vector (untested)) at least one w = 0, so both w's are ignored.

I assume this 4D stuff helps with SIMD, as SSE is 4x float, but I'm too lazy to code in asm, and when the time comes I'll probably just get ICC or something and optimize it for SSE1, as everyone will have it by then :-D

For argument's sake, I store vectors, quaternions, points and planes in the same type. It works rather well, as most of them have common functions (e.g. dot products), and because of that I code like there's no tomorrow (i.e. I do all my debugging immediately after writing a function, then never touch it again, only ever replace it entirely).

So I guess you could say I do "(1) Plane.N * P + Plane.d;", except the + Plane.d is hidden in the dot product as Plane.d*1.
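A small sketch of that homogeneous layout (struct and function names are mine, purely illustrative): store the plane as (A, B, C, D), give points w = 1 and direction vectors w = 0, and a single 4D dot product covers both cases.

struct Vec4 { float x, y, z, w; };

inline float dot4(const Vec4& a, const Vec4& b)
{
    return a.x*b.x + a.y*b.y + a.z*b.z + a.w*b.w;
}

// Plane stored as (A, B, C, D) with D = -dot(normal, pointOnPlane).
// A point (x, y, z, 1): dot4 yields the signed distance, D folded in via D*1.
// A direction (x, y, z, 0): dot4 yields just the projection onto the normal, D ignored.
inline Vec4 makePoint (float x, float y, float z) { return {x, y, z, 1.0f}; }
inline Vec4 makeVector(float x, float y, float z) { return {x, y, z, 0.0f}; }

// Usage: float dist = dot4(plane, makePoint(px, py, pz));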

Fine, dreddlox, so if we decide to release our lib as GPL you'll be able to map your own routines to our perf macros/functions, or search/replace in your code to update it. You'll benefit not only from the SSE but also from the 3DNow optimizations, and even from some FPU code (all scheduled C or asm).

For quaternions (q) I don't have the simplified formula in mind, but it's more complex than a dot product. I am sure there are some cross products in it, since for a vector v it's v' = q*v*q^(-1).

I have only implemented the most common functions so far. I am almost sure it's quicker to convert quat -> mat4x3 (transposed sub-matrix) and then compute matrix * vector in SIMD (especially SSE) once you have several vectors to rotate/translate.
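For reference, a minimal sketch of that conversion (the standard unit-quaternion formula; struct layout and names are my own): once the 3x3 rotation part is built, each additional vector costs only a matrix * vector, which maps well onto SIMD.

struct Quat { float w, x, y, z; };            // assumed to be unit length
struct Mat3 { float m[3][3]; };

// Standard unit-quaternion to rotation-matrix conversion.
inline Mat3 quatToMat3(const Quat& q)
{
    const float xx = q.x*q.x, yy = q.y*q.y, zz = q.z*q.z;
    const float xy = q.x*q.y, xz = q.x*q.z, yz = q.y*q.z;
    const float wx = q.w*q.x, wy = q.w*q.y, wz = q.w*q.z;

    return { {
        { 1.0f - 2.0f*(yy + zz), 2.0f*(xy - wz),        2.0f*(xz + wy)        },
        { 2.0f*(xy + wz),        1.0f - 2.0f*(xx + zz), 2.0f*(yz - wx)        },
        { 2.0f*(xz - wy),        2.0f*(yz + wx),        1.0f - 2.0f*(xx + yy) }
    } };
}

// Rotating many vectors then amortises the conversion cost: v' = M * v per vector.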

We wonder whether people here would be interested in our very high-performance, portable math lib: for instance, a C dot product in 2 clock cycles, and highly scheduled vertex-array functions (best possible parallelism, unrolling, mostly hand-written). Since we are putting in a lot of effort, with many of the benefits going to our own devs later, there are currently two possible strategies for us:

(1) A "light version", free for indies, with small royalties in case their products go commercial. What do you think about it?

(2) A full GPL version. Our benefit would mainly be having users expand the code base (win-win cooperative logic). Possibly some clock-cycle competitions would replace old asm or C code with more optimal code (though I doubt many can beat me).

I'll probably open a new thread about it. Would some of you be interested in (1) or (2)? Any comments?
