Jump to content

  • Log In with Google      Sign In   
  • Create Account

Calling all IT Pros from Canada and Australia.. we need your help! Support our site by taking a quick sponsored surveyand win a chance at a $50 Amazon gift card. Click here to get started!

Vilem Otte

Member Since 11 May 2006
Offline Last Active Today, 04:39 AM

#5199626 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 22 December 2014 - 08:21 PM

Well... I've been re-reading that code today in the morning and something caught my eye...


Your ray generation for AA rays is wrong!

Camera.x = Camera.x + (col + 0.5f) / 2.0f;
Camera.y = Camera.y + (row + 0.5f) / 2.0f;

Your 'Camera' is actually your origin (which must be shared between all the rays, you shouldn't move your camera.


A little theory - how antialiasing work ... well... let us start better, with simple question, why do we have aliasing?

1.) Let's draw an image - the following rectangles are pixels, and points are the positions where we shoot a ray:


│          │    

│  ■  │  ■  │  ■ 

│     │     │    


│     │     │    

│  ■  │  ■  │  ■ 

│     │     │    



For 2x2 anti-aliasing we can use for example this (uniform sampling)


│ ■ ■ ■ ■ │ ■ ■

│     │     │    

│ ■ ■ │ ■ ■ │ ■ ■


│ ■ ■ │ ■ ■ │ ■ ■

│     │     │    

│ ■ ■ │ ■ ■ │ ■ ■



Or this (Random sampling)








│    │     



There is one very important note, all the rays originate from the same point (so you will not move camera), but they have different ray direction vector.


How the computation works - each of your pixel is actually 4-sided pyramid, with apex in your camera origin - all you need is to compute 4 boundary vectors (minimum and maximum ray directions), all antialiasing samples will fall between the bounds of these (you can distribute them uniformly, randomly, or on some well-guessed spots (Quincunx AA (better than uniform in a box), poisson (one of the best), etc.).

#5198609 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 16 December 2014 - 03:11 PM

Why your image turned black (respectively is black and white).


CColor class uses 1 byte to store each color channel (0-255), you are summing 4 values of range (0-255), you have overflow! You have to take care while working with data types - your have to sum colors F.e. in 32-bit value per channel. E.g.:

int r = 0;
int g = 0;
int b = 0;

for(int row = 0; row < 2; row++) {
    for(int col = 0; col < 2; col++){
        Vector3D direction = new Vector3D(xdir, ydir, zdir).normalize();
        Ray ray = new Ray(Camera, direction);
        CColor c = Trace(ray,1);
        r = r + (int)c.r;
        g = g + (int)c.g;
        b = b + (int)c.b;

r = r / 4;
g = g / 4;
b = b / 4;

return new CColor(r, g, b);

You can also use floats instead of int (or technically even short - 16-bit integers - will work). That is why you saw the issue you saw.


Ad. CColor - it stores the color value in 1 integer - 1 byte for red, 1 byte for green, 1 byte for blue and 1 byte for alpha; technically you can initialize it with float, but it will still store channel just in 1 byte. This is why I'd recommend creating your own class for holding color (in our software we use "float4" - 4 floats packed in class). Why? Simply because we need more range than 1 byte per channel (and using of CColor-like type may be confusing - you can use it, for output - but not for computations).

#5198422 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 15 December 2014 - 03:51 PM

Congratulations, you've just built your first path tracer that can converge. It still has some serious problems (apart from problems when handling large scale scenes), when you overcome these it will start becoming useful smile.png.

  1. Convergence in directly lit areas takes ages, convergence in shadowed areas too.
  2. Your ray generation when you hit surface is... well, naive is the best word describing it (we denote techniques (or algorithms) as naive, when they produce correct result (and they are simple), but they are definitely not efficient)
  3. Allowing for different shapes and/or materials
  4. Allowing for TONs of shapes and materials in scene

Okay, brace for informations!


1. Explicit path tracing


The general problem of path tracing is this: If the path we're following doesn't hit the light, the contribution of this path is zero. The smaller your lights get, the lower probability of hitting them we have (and for perfect point lights we are at zero probability of hitting the light!).


But, don't worry - we don't write some black magic code to have fast path tracers for direct lights. The actual idea is quite simple, if at each given step in path we can connect to (any) light the path will have some contribution to the result (aka, we separate direct lighting from indirect lighting). I've did the full derivation in my Bachelor thesis few years ago (for full derivation, see chapter 3.2 http://is.muni.cz/th/396530/fi_b/Bachelor.pdf, you don't need to remember that though ... shameless self-promotion ph34r.png) - the whole idea works like this, at each step in path we sample one (random) light, compute direct illumination from it (note, we need to cast shadow ray to determine visibility between current path step hitpoint and random point on that light) and continue in the path (if we accidentally hit the light, we have to discard that value!).


This way it is quite easy to achieve very good looking images in few samples per pixel ... there will still be problems (caustics still takes quite long to converge, dark room standing next to lit room, etc.) - they also have solutions (Bi-Directional Path Tracing).


Alright, I'm talking enough for this - time for some code:

float4 trace(Ray r, Scene s) {
    // Calculate next step on path
    RayResult res = s.findClosestIntersection(r);
    float4 hitPoint = r.origin + r.direction * res.distance * EPS;
    float3 normal = s.objects[res.hitID].getNormal(res.barycentric);
    float3 color= s.objects[res.hitID].getColor();

    float4 result = float4(0.0, 0.0, 0.0, 0.0);

    // Explicit step
    if(s.objects[res.hitID].isLight() == false) {
        int lightID = s.getRandomLight();
        float4 randomLightPoint = s.objects[lightID].getRandomPosition();
        Ray shadow = new Ray(hitPoint, (randomLightPoint - hitPoint).normalize());
        RayResult shadowResult = s.findClosestIntersection(shadow);
        if (s.objects[shadowResult.hitID].isLight() == true) {
            result = dot(normal, shadow.getDirection()) * attenuation * color * s.objects[shadowResult.hitID].getColor();

    // Next step, beware, you have to correctly weight depending on your PDF!
    return result + weight * color * trace(r.generateNextPathStep(), s);

2. Sampling....


Assuming you are still reading smile.png ... there is a way how to heavily implement areas where there is only indirect illumination. Now a little theory - before I described a way you use for generating your ray directions, the way works, but it is bad - you are working with perfectly diffuse materials (as for now, once you just into reflective you will have to change your ray generator to work for them), now the current generator will work for reflected rays too, but it has to (randombly) pick exactly the ray in direction of perfect reflection of incident ray (btw. probability of doing that is equal to 0). That is why we use different ray generator for reflected rays.


Where am I getting? All your samples should be weighted by cosine of angle between ray and normal of the surface (to keep energy conservation and of course to be physically correct). Now, as each sample is multiplied by the cosine of that angle - it means that rays at grazing (high angles - almost parallel with the surface they are hitting) have very, very low weight (their contribution is almost zero). Technically we want to generate more rays in direction similar to normal and less rays in directions close to be perpendicular to normal.


For further description you can also look inside my thesis (Chapter 3.3). There is also a code provided for faster uniform random number generator (and thus ray direction generator), and for cosine-weighted random number generator (better for diffuse surfaces). Of course for each different shading model you have another "ideal" ray generator. This technique is called importance sampling (you compute mainly those samples that are important).


3. Shapes & Materials


This is the part where you most likely get very soon, I'd advice to implement previous 2 before that (1. will allow you to see the resulting image very fast, thus increasing your speed in debugging shapes and materials; 2. (if implemented correctly & in a clever way) will allow you to use different sampling methods for different materials, which is very, very important for any useful path tracer.


4. Acceleration Structures


Once you have 1, 2, 3, and 4 - it is time to jump into large scenes. I won't dive into them now, but to give you little preview:




This Crytek Sponza image is what I've rendered just now on NVidia Mobility GeForce 720M (which is low end), at full laptop display resolution (1366x768 ... minus something on window borders), frame rate 12 fps; On Radeon R7 (the power of desktop) next to it runs at realtime frame rates (I didn't do the benchmark though) ... that is why acceleration structures are so important.


Note, there is some aliasing, because - no AA filter is running in that image, no texture filter is running in that image (and yes, it decreases performance), no packet tracing or other kind of hack is used, the transparent materials are root of all evil (I have to terminate ray and create new one), acceleration structure is SplitBVH, ray tracing runs on OpenCL, the resulting buffer is showed using OpenGL and PBO.

#5198358 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 15 December 2014 - 11:27 AM

Uhm, I'm not sure why you don't get exceptions with that code (as you should get them, there are few things done wrong + you won't see progressive at all afaik) ... time to refresh my java memory:

private float[] accumulator;
private int samples;

// You need 1 float value per channel!
public void RestartProgressiveRendering() {
    this.accumulator = new float[this.canvas.getWidth() * this.canvas.getHeight() * 3];
    this.samples = 0;

public void RenderOneStep() {
    int width = this.canvas.getWidth();
    int height = this.canvas.getHeight();
    int i = 0;
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            CColor sample = ComputeSample(x, y, width, height);
            // The accumulator is used to sum all the samples since last RestartProgressiveRendering, thus you 
            // have to either use +=, or write it this-like
            accumulator[i * 3 + 0] = accumulator[i * 3 + 0] + sample.r;
            accumulator[i * 3 + 1] = accumulator[i * 3 + 1] + sample.g;
            accumulator[i * 3 + 2] = accumulator[i * 3 + 2] + sample.b;


public Canvas DisplayResult() {
    int width = this.canvas.getWidth();
    int height = this.canvas.getHeight();
    Paint p = new Paint();
    int i = 0;
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            // Here you get each single channel from accumulator and divide by pixels
            this.canvas.drawPoint(Color.rgb((int)(accumulator[i * 3 + 0] / (float)samples), 
                (int)(accumulator[i * 3 + 1] / (float)samples), 
                (int)(accumulator[i * 3 + 2] / (float)samples)));


Just few notes on optimization:

  • in public void RestartProgressiveRendering method, you shouldn't re-allocate the buffer (even though allocations sometimes seems to be cheaper in Java virtual machines than in compiled languages, it isn't as cheap as zero-outing the buffer). The (re)allocation should be put into separate method called Resize (you should call that when user F.e. rotates screen)
  • Using canvas for this is in my opinion not performance wise. Technically you want to use OpenGL ES PBO (which is available for ES 3.0), for older versions there is GL_OES_EGL_image which works almost the same (for transforming data from CPU memory to GPU memory and thus better for rendering).

#5198242 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 14 December 2014 - 10:40 PM

Alright, for some reason I woke up very early in the morning... so let's get started:


For each pixel you want: c=1/n sum_0^n x_n



  • c is the resulting color of the pixel
  • n is the number of samples rendered
  • x_n is the sample

The most obvious and easiest way to perform progressive rendering is to use a buffer where you accumulate samples, like:

Buffer accumulator;
int samples;


    samples = 0;

    for each pixel (i)
        accumulator[i] += ComputeSample(i);

    for each pixel (i)
        resultPixel[i] = accumulator[i] / (float)samples;

Now, this approach works (note, the shown code is written like pseudo-code, you can do a ton of optimizations in there using parallelization and moving that ugly slow division prior to for loop and use multiplication inside).


Each time you move something in your scene (or your camera has moved) you have to call ResetProgressiveRendering. There are several problems with this approach:

  1. You have to hold 32-bit floating point buffer for accumulator (or 16-bit floating point, or 16-bit integer ... but you will hit precision problems early ... for 16-bit integer it will be as soon as 256 samples per pixel, for 16-bit fp probably a bit later (it depends!) ... 32-bit fp should be okay)
  2. 32-bit fp RGB buffer means ... width * height * 4 * 3 bytes of data at least, that is 4 times more than for 8-bit integer RGB buffer; it is not such a big deal on PC, but on android devices you don't want to waste most of the memory on output buffer

There is another approach, where you can also always keep the buffer you're already showing and divide at the point where you add new sample into buffer. Time for pseudo-code:

Buffer result;
int samples;


    samples = 0;

    for each pixel (i)
        result[i] = result[i] * samples / (float)(samples + 1) + ComputeSample(i) / (float)(samples + 1);


Note, you must pre-compute samples / (float)(samples + 1) and 1.0f / (float)(samples + 1) ... most compilers are not intelligent enough to pre-compute these parts of the equation and you would be doing a lot more divisions (for 1000x1000 pixels it is 2M divisions per frame (from Intel Optimization Manual on modern Core i7 single fp division eats 24 clocks, multiplication 1 or 2 clocks (can't remember correctly) ... this means that by not pre-computing these you do at least 22 more clocks 2 milion times per frame ... thats 44M clocks wasted -> just to give you little insight into the optimization - and why it heavily matters here)


This solution has an advantage, if you do correct casting, result can be 8-bit integer RGB buffer - that means less memory, less fillrate used and that means more speed!


Of course it also has disadvantages:

  1. If you're not careful, you will lose precision (very quickly), otherwise you will jus lose it (definitely a lot faster than in the previous case for 32-bit fp solution). Although, if your path tracing algorithm is good enough (F.e. 4 samples per pixel at time, doing bi-directional path tracing ,etc.) it may be enough to combine just several frames of progressive resulting in smooth image (with smooth caustics!) ... quite good, don't you think?
  2. As the buffer where you sum samples is your result buffer, it may cause some troubles with tone-mapping or such (in case you combine your samples wrongly - because those inside result buffer will be already after tone mapping (in case of tone mapping)) ... or you can use F.e. 16-bit fp RGB buffer instead, and do tone mapping after everything (which will work well too!).

It is up to you which one will you select, honestly on Android I'd prefer going the way that uses the least memory and the least fillrate (and the least processing power), as there isn't that much computing power as on the PC. Note, inside NTrace project we have a path tracer (actually there is more path tracing algorithms, they use different sampling strategy, etc.) that uses 2nd solution on PCs and it works well (if I remember correctly we got quite good results with 8-bit integer targets there too!).

#5198225 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 14 December 2014 - 07:55 PM

What you generally want is a buffer where you accumulate samples and divide by number of sample count. I will elaborate more tomorrow as it is 3am here atm and there would be tons of mistakes from my side. wink.png

#5197702 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 11 December 2014 - 07:04 PM

I'm still at work today (I've got a demo day tomorrow with one application, so I will be most likely sleeping for just few hours). But as we're running tests that take very long time, so...


Why you see black/colored pixels


The hint you want had already been mentioned:

You cast a ray from camera origin to sphere - you compute your hitpoint as "HitPoint = Origin + Distance * Direction", which will most likely get you INSIDE the sphere (due to precision of floating point numbers).


Now, a quick hack is using something like "HitPoint = Origin + Distance * 0.99 * Direction", this isn't a real solution, but this hack works 99% of time and for start it is absolutely valid to use it. Technically you would like the constant to be relative to your distance (floats are quite precise between 0.0 and 1.0, less precise in thousands, and even less precise when you get to millions, billions and up ... the higher your value (in absolute value) is, the worse the problem is, and the constant offset also needs to be higher).


For a quick view into precision problem - go check this site https://randomascii.wordpress.com/2012/03/08/float-precisionfrom-zero-to-100-digits-2


E.g. modify your hitpoint calculation as:

HitPoint = ray.origin.add(ray.direction.multiply(distance * 0.99f));

This way you should see something. (It won't be perfect and lots of rays miss, you will need better sampling than just randomly shoot rays - once you got this running, I will describe few ways how to improve it).

#5197109 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 09 December 2014 - 02:22 AM

Your random number generator and logic behind it is wrong. Sadly I have to go off from PC in few minutes so I won't be able to give full answer and some background why is that (I will do that later).


As for now - your random direction should be normalized (unless you normalize it in ray constructor - which I don't think you do). There are few rules about the "random" direction and generally random number generation behind - you should use different random number generators for different surfaces. As of now, go with:

// Initialization of RNG, try to do this only once for whole random direction generator object - SEED is some number (F.e. 123456 ... or current timestamp, or anything :) )
Random rng = new Random(SEED);


// Your random direction into sphere generation code

Vector3D v = new Vector3D();
v.x = rng().NextDouble() * 2.0 - 1.0;
v.y = rng().NextDouble() * 2.0 - 1.0;
v.z = rng().NextDouble() * 2.0 - 1.0;


// Your random direction into hemisphere
Vector3D v = GenerateRandomDirectionIntoSphere();
if (v.dot(normal) < 0.0)
    // Where negate does return new Vector3D(-v.x, -v.y, -v.z);
    v = v.negate();

The process of "how" you generate random numbers determines how good is your path tracer (I will elaborate later ;) ).

#5196637 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 06 December 2014 - 10:52 AM



Your ray generation code is a nonsense... for F.e. x = 0, y = 0 you create a vector (0, 0, 0) and normalize it - all of its components are divided by sqrt(0^2 + 0^2 + 0^2) ... afaik sqrt(0) = 0 (or technically +0 or -0, both are correct answers) ... nothing changes the fact that you're dividing by 0. (Fyi. in mathematics sqrt(0) depends on convention - there are multiple definitions - one of the conventions also specifies that 0^2 is undefined, thus also sqrt(0) is undefined).


You want your rays to go from left to right, where x=0 will be in the middle column of pixels, and y=0 in middle row of pixels - also your z determines your field of view angle. Technically you want something like this:

float xdir = (x / (float)width) * 2.0f - 1.0f; // Keep x direction in interval <-1; 1>
float ydir = ((y / (float)height) * 2.0f - 1.0f) * aspect; // Keep y direction in interval <-1; 1>, aspect is (height / width) in your case
float zdir = 1.0f / (float)tan(fov); // Where fov represents field-of-view angle (in radians)
Ray ray = new Ray(camera, new Vector3D(xdir, ydir, zdir).normalize());

I hope xdir and ydir makes sense (their computation), zdir can be calculated using trigonometry.


This will generate correctly your ray directions, assuming your intersection code is correct - you should see a sphere.

#5195808 Writing my own pathtracer on mobile device

Posted by Vilem Otte on 01 December 2014 - 07:40 PM

Welcome to the forums.


I'm glad there is another fan of ray tracing and another brave one to try writing them. Although for the beginning, allow me to share few things I've learned (mostly the hard way).


First of all, you need to know at least some math (3D math -> vectors, solving equations, etc.). I would recommend you to try naive ray tracing first (just primary rays, then try whitted ray tracing) - and next you can dive into more complex stuff. Be sure to check out few ways how to optimize your renderer (BVHs, KDTrees, Grids, Nested Grids, etc.), and don't forget also to optimize your intersection routines (you will need them precise and fast for path tracing). Once you've got this, you got basics to jump into wonderful world of global illumination.


Android is indeed an interesting plaform, I'm though not sure how fast you will get without native code (Java is slow - but for learning, why not ... you can also always move to native code later - and optimize from there).


Anyways, I wish you best of luck in wonderful world of infinitely bouncing rays, feel free to ask any further question, I'll try to answer... to your questions so far:


1.) Currently I'm working mostly with high performance ray tracing (where we decided to use just single type of primitive - triangle). Although I've worked with CPU-based general ray tracers, the approach worked like this:


There was a Primitive class - it had several virtual functions (including the one for Ray-Primitive test, and also GetAABB (Axis-aligned bounding box) which was used when building KD-Tree for the scene). Triangle, Sphere, Plane, etc. all these inherited Primitive class and implemented those functions. This approach has some overhead (V-tables), but it is the most clean way to allow multiple primitives inside your scene.


2.) Yes - in my current ray tracing project I have "Math" part of project, which contains "Numeric" folder - where you have all numeric classes needed -> float4 (4 component vector), mat4 (4x4 matrix), quat (quaternion - to represent spatial rotations).


Apart from that I also have "Geometric" folder - where I do have all primitives (Ray, Plane, Axis-Aligned Bounding Box, Sphere, Triangle, etc. - note. even though I use only triangles inside the scene it is important to note, that the others are also used inside the project).


Even though it is not absolutely necessary to distinguish the Math and Geometric structures, I recommend to do so (as your ray tracer will grow it can become bloody mess and then you will need refactoring).


3. and 4.) For naive ray tracing this is correct - like:

for (y = 0 ... height)
    for (x = 0 ... width)
        shadePixel(x, y);

Now, in naive (very simple ray tracer) your shadePixel function will look like:

shadePixel(x, y)
    d = INFINITY;
    id = -1;
    hit = false;
    for (i ... numOfObjects)
        if (id' = intersect(object[i], ray[x, y], &d'))
            if (d' < d & d' > 0)
                d = d'
                id = id'
                hit = true
    if (hit == true)
        pixel[x, y] = object[id].color

This is very basic shadePixel function - it guarantees that you always write out color of closest hit along the ray (that is non-negative).


I recommend to do 5. and 6.) once you have at least basic functionality like the above written. Hint - you can use recursion inside shadePixel, you just have to calculate secondary ray direction and origin (origin is a hitpoint, direction depends on what material have you hit).


If you update the progress (and possibly show results and some parts of code - I can give you further hints ;).

#5192613 Good C (not C++) math library?

Posted by Vilem Otte on 13 November 2014 - 03:18 AM

#Nairou C11 won't be in MSVC for quite long time (As Ravyne said there isn't a solid C99 compatibility as for now). Also I don't think Microsoft sees C compiler as their main focus (they focus on the C# mainly), and I don't really believe the situation will get better in next years.


I would second SIMDx86, although it has been 2 years since the last time I've used it - it is now marked as "inactive" on SF (there is SSE, SSE3 and Altivec implementation as for now - no AVX).

#5190071 Tube AreaLight - Falloff problem

Posted by Vilem Otte on 29 October 2014 - 08:34 PM



Important note - I'm trying to describe area lighting a bit, I couldn't test your code at this point. Anyways maybe you find my notes useful:


What we have - a tube light defined by 2 points (L0 and L1) and radius ® ... (for the simplicity let us assume that it forms a capsule shape). What we need to calculate for each point we're shading (P):

  • Closest point on tube (this can be used for attenuation)

Our 2 points define a line. Let us define that line as L0 + t * (L1 - L0) ... and we're searching for 't'. This can be done simply using formula derived here -> http://mathworld.wolfram.com/Point-LineDistance3-Dimensional.html - e.g.: t = dot(L1 - P, L2 - L1) / dot(L1 - L0, L1 - L0) - now we clamp t between 0 and 1 (e.g. if it is smaller than 0, then it equals 0 ... if it is larger than 1, then it equals 1). Thus we got our closest point on line... (lets note it as Cl)


Closest point on capsule (Cc) thus equals Cc = Cl + normalize(P - Cl) * r; (NOTE: for actual distance it is enough to use d = length(P - Cl) - r).


From here we can correctly compute attenuation.

  • Calculate diffuse light

Now this is a challenge. What we actually need to compute right now is how large solid angle is occupied by the capsule in hemisphere above given point (in direction of its normal).


Note: Your approach in code (using 2 endpoints to light it) is just wrong - you are not lighting using area light, but just using 2 points.


Generally solving solid angle (which is possible for simple analytic shapes like capsule is), would give us how much of whole capsule is visible at given point, from here we can calculate diffuse light. I won't write full process of integration here (I've quickly searched net whether someone actually computed equation for solid angle of capsule and I couldn't find any results) - it would take some time with fully running brain (it is generally not a good idea to do this kind of computations at 3am), so I will try during tomorrow.


The problem can be also solved using Monte-Carlo approach - using some good random number generator generate N points over the capsule (Note: that they need to be uniformly distributed over its surface - this is also a challenge, especially for capsule!), and use these points to calculate diffuse light.


  • Calculate specular light

We can't really take approach of monte carlo here (actually we can, but N would really need to be large number - and thus it would be slow), because seeing specular of 16 point lights instead of reflected capsule wouldn't look really convincing (using 10000 on the other hand, for good sized capsule, would look convincing though - but that is insane approach for realtime graphics).


We can do better, all we need is to derive ray-capsule intersection test (actually we're fine with ray-sphere and ray-cylinder tests - as every capsule can be composed of these ... Note: we could also use these 3 to simplify integration inprevious step). For each shading pixel we generate reflection vector and test that one against capsule (which is reflected (aka specular) light).


Now this will work for non-rough surfaces (wet surfaces), for rough surfaces we could use this approach - we generate reflected ray and compute minimal distance between capsule and reflected ray (note, it can be simplified to minimal distance between 2 lines), based upon this distance and radius of our light we can create quite good looking smooth specular (that won't "sparkle", compared to direct ray-capsule intersection).

#5161607 float or double depth buffer?

Posted by Vilem Otte on 19 June 2014 - 07:19 PM

OK, I've tried the different versions (lowest time - no triangle on screen, highest time - closeup view, triangles all over the screen):

tie50.exe -> from 22ms to 44ms

tie51.exe -> from 19ms to 31ms

tie53.exe -> from 22ms to 43ms
tie53.exe -> from 19ms to 38ms

#5161178 float or double depth buffer?

Posted by Vilem Otte on 17 June 2014 - 05:58 PM

I've tried it - I can confirm that no malware or old school virus has been installed (or it overcame my AV here). Regarding speed, I got 22ms to 44ms on Haswell-based Core i3 (in laptop). So it was quite fast...


What was visible though, your depth test isn't precise, the parts on hull flickered, dunno why (maybe it was just on my machine, or it's just setup of projection matrix that has intentionally set up near and far plane to test the precision).

#5160806 Build a render farm using DirectX and radiosity?

Posted by Vilem Otte on 16 June 2014 - 03:44 AM

On Windows you don't physically need a monitor hooked up, but in order to use D3D the video card needs to be set up as a display device with a compatible WDDM display driver. The same goes for CUDA, and I would assume OpenGL/OpenCL (I don't have experience with those so I can't tell you for sure).


Actually with OpenCL, the situation is a little bit more "complicated". You can use different device types to execute your OpenCL kernels - as for now, there are 3 supported:

CL_DEVICE_TYPE_CPU - OpenCL will use host processor for computation. This option works fine in terminal without starting X (in linux), although you would need CPU-based renderfarm for this (of course with enough CPUs with lots of cores you can beat GPUs) - but it will most likely be more expensive.

CL_DEVICE_TYPE_GPU - OpenCL will use GPU device to perform the computations. Each vendor implemennts his own OpenCL driver differently:

  • AMD implementation allows running OpenCL applications without running actual xserver (I think you need root privileges to run the app like that - but the last time I tried this was like year ago, things might have changed since then - back then the feature was called "console mode" in clinfo ... since Catalyst 13.4 as far as I remember)
  • NVidia, I haven't digged into this one. I know it is possible for CUDA, so I bet you can do the same with their OpenCL.

CL_DEVICE_TYPE_ACCELERATOR - Special OpenCL accelerators (IBM CELL Blade or Xeon Phi), I would like to get my hands on Xeon Phi and try it out. But well, it's very expensive and also hard to get (I mean like compare it to any current GPU in terms of "what you get for your money")