Hi all!
Last weekend I took the time to code a class which can rasterize triangles basing my work on some code I could find on the net, particularly
this one over at devmaster. Over the course of this week I extended it and now I would say it is a complete solution.
Features:
- The rasterizer can support an arbitrary number of render targets (you will most likely use two, a color buffer and a depth buffer)
- The rasterization is completely decoupled from the actual shading of the visible pixels. You can configure the rasterizer with a pixel shader which does the actual work of computing and assigning a color value.
- It does only use integer math because i intend to use it on the GP2X which does not have an FPU.
- The rasterizer is tile based. Currently it uses blocks of 8x8 pixels.
- It interpolates an arbitrary number of integer varyings across the triangle (so you can use fixed point here). This is done perspectively correct for the corners of each 8x8 block and affine within each block to avoid the costly per pixel divide.
- It supports a clipping rectangle.
- It provides a means for the pixel shader to compute the derivative of the interpolated varyings. This is needed for example to compute the texture mimmap level from the texture coordinates.
- It allows for an early depth test. For example the shader could store the minimum depth value for each 8x8 block and than discad a whole block if the minimum expected depth value for this block is greater than the one stored.
- The source code is actually quite short ~600 lines with a lot of comments.
The only problem I can see right now is with small or large but thin triangles. Because the rasterizer is tile based it must at least scan a whole 8x8 block and test each pixel if it is inside the triangle or not.
Large triangles are handled quite efficiently since for the inner part only the corners are tested for inout.
What do you think about this. How big a performance problem might this be when targeting the GP2X?
I include the code here for everyoune to look at. I would be very thankful for any input of possible improvements.
Header file:
/*
Copyright (c) 2007, Markus Trenkwalder
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the <ORGANIZATION> nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef RASTERIZER_EE4F8987_BDA2_434d_A01F_BB3446E535C3
#define RASTERIZER_EE4F8987_BDA2_434d_A01F_BB3446E535C3
class Rasterizer {
public:
// some constants
static const int MAX_RENDER_TARGETS = 2;
static const int MAX_VARYING = 8;
static const int BLOCK_SIZE = 8;
public:
// Type definitions
struct Vertex {
int x, y; // in 28.4 fixed point
int z; // range from 0 to 0x7fffffff
int w; // in 16.16 fixed point
int varyings[MAX_VARYING];
};
// holds the pointers to each render target.
// These will be passed to the fragment shader which then can write to
// the pointed to location. Note: Only the first n pointers will be
// valid where n is the current number of render targets
struct BufferPointers {
void* ptr[MAX_RENDER_TARGETS];
};
// This is the data the fragment shader gets
struct FragmentData {
int z;
int varying[MAX_VARYING];
};
typedef FragmentData PixelBlock[BLOCK_SIZE][BLOCK_SIZE] ;
class RenderTarget {
public:
virtual int width() = 0;
virtual int height() = 0;
virtual int stride() = 0;
virtual void *buffer_pointer() = 0;
virtual int element_size() = 0;
virtual void clear(int x, int y, int w, int h) = 0;
};
class FragmentShader {
public:
// This provides a means for an early depth test.
// x and y are the coordinates of the upper left corner of the current block.
// If the shader somewhere stores the minimum z of each block that value
// can be compared to the parameter z.
// returns false when the depth test failed. In this case the whole block
// can be culled.
virtual bool early_depth_test(int x, int y, int z) { return true; }
// This notifies the shader of any render target clears.
// This is meant to be used in conjunction with the early depth test to update
// any buffers used
virtual void clear(int target, int x, int y, int w, int h) {}
// To compute the mipmap level of detail one needs the derivativs in x and y of
// the texture coordinates. These can be computed from the values in the pixel
// block since all the fragment values have alredy been computed for this block
// when this is called
virtual void prepare_for_block(int x, int y, PixelBlock b) {}
// This tells the rasterizer how many varyings this fragment shader needs
virtual int varying_count() = 0;
// This is called once for each visible fragment inside the triangle
// x and y are the coordinates within the block [0, BLOCK_SIZE[
// the pixel block is indexed with p[y][x] !!!
virtual void shade(const BufferPointers&, const PixelBlock& b, int x, int y) = 0;
};
private:
// Variables
struct RenderTargetParams {
int count;
int minwidth, minheight;
// cache these params to avoid too
// many virtual function calls
int stride[MAX_RENDER_TARGETS];
int element_size[MAX_RENDER_TARGETS];
} rendertarget_params_;
RenderTarget *rendertargets_[MAX_RENDER_TARGETS];
FragmentShader *fragment_shader_;
struct {
int x0, y0, x1, y1;
} clip_rect_;
private:
bool setup_valid();
public:
// constructor
Rasterizer();
public:
// main interface
// Set the render targets.
// This resets the clipping rectangle
void rendertargets(int n, RenderTarget* rt[]);
// set the fragment shader
void fragment_shader(FragmentShader *fs);
void clear();
void clear(int target);
void clip_rect(int x, int y, int w, int h);
// The triangle must be counter clockwise in screen space in order to be
// drawn.
void draw_triangle(const Vertex &v1, const Vertex &v2, const Vertex &v3);
};
#endif
Implementation file:
/*
Copyright (c) 2007, Markus Trenkwalder
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the <ORGANIZATION> nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include "rasterizer.h"
#include <cmath>
#include <cassert>
#include <algorithm>
#ifndef _MSC_VER
#include <stdint.h>
#else
#include "stdint.h"
#endif
////////////////////////////////////////////////////////////////////////////////
// utility functions
namespace {
inline int min(int a, int b, int c)
{
return std::min(std::min(a,b), c);
}
inline int max(int a, int b, int c)
{
return std::max(std::max(a,b), c);
}
inline void compute_plane(
int v0x, int v0y,
int v1x, int v1y,
int v2x, int v2y,
int z0, int z1, int z2, int64_t plane[4])
{
const int px = v1x - v0x;
const int py = v1y - v0y;
const int pz = z1 - z0;
const int qx = v2x - v0x;
const int qy = v2y - v0y;
const int qz = z2 - z0;
/* Crossproduct "(a,b,c):= dv1 x dv2" is orthogonal to plane. */
const int64_t a = (int64_t)py * qz - (int64_t)pz * qy;
const int64_t b = (int64_t)pz * qx - (int64_t)px * qz;
const int64_t c = (int64_t)px * qy - (int64_t)py * qx;
/* Point on the plane = "r*(a,b,c) + w", with fixed "r" depending
on the distance of plane from origin and arbitrary "w" parallel
to the plane. */
/* The scalar product "(r*(a,b,c)+w)*(a,b,c)" is "r*(a^2+b^2+c^2)",
which is equal to "-d" below. */
const int64_t d = -(a * v0x + b * v0y + c * z0);
plane[0] = a;
plane[1] = b;
plane[2] = c;
plane[3] = d;
}
inline int solve_plane(int x, int y, const int64_t plane[4])
{
assert(plane[2] != 0);
return (int)((plane[3] + plane[0] * x + plane[1] * y) / -plane[2]);
}
template <int denominator>
inline void floor_divmod(int numerator, int &floor, int &mod)
{
assert(denominator > 0);
if(numerator >= 0) {
// positive case, C is okay
floor = numerator / denominator;
mod = numerator % denominator;
} else {
// Numerator is negative, do the right thing
floor = -((-numerator) / denominator);
mod = (-numerator) % denominator;
if(mod) {
// there is a remainder
floor--; mod = denominator - mod;
}
}
}
// Fixed point division
template <int p>
inline int32_t fixdiv(int32_t a, int32_t b)
{
#if 0
return (int32_t)((((int64_t)a) << p) / b);
#else
// The following produces the same results as the above but gcc 4.0.3
// generates fewer instructions (at least on the ARM processor).
union {
int64_t a;
struct {
int32_t l;
int32_t h;
};
} x;
x.l = a << p;
x.h = a >> (sizeof(int32_t) * 8 - p);
return (int32_t)(x.a / b);
#endif
}
// Perform a fixed point multiplication using a 64-bit intermediate result to
// prevent overflow problems.
template <int p>
inline int32_t fixmul(int32_t a, int32_t b)
{
return (int32_t)(((int64_t)a * b) >> p);
}
} // end anonymous namespace
////////////////////////////////////////////////////////////////////////////////
Rasterizer::Rasterizer() :
fragment_shader_(0)
{
rendertarget_params_.count = 0;
}
bool Rasterizer::setup_valid()
{
return rendertarget_params_.count >= 1 &&
fragment_shader_ != 0;
}
void Rasterizer::clear()
{
for (int i = 0; i < rendertarget_params_.count; ++i)
clear(i);
}
void Rasterizer::clear(int target)
{
assert(target <= rendertarget_params_.count);
rendertargets_[target]->clear(0, 0,
rendertarget_params_.minwidth, rendertarget_params_.minheight);
// notify shader about clear (might want to update internal data structutes)
if (fragment_shader_)
fragment_shader_->clear(target, 0, 0,
rendertarget_params_.minwidth, rendertarget_params_.minheight);
}
void Rasterizer::clip_rect(int x, int y, int w, int h)
{
if (rendertarget_params_.count == 0) return;
clip_rect_.x0 = std::max(0, x);
clip_rect_.y0 = std::max(0, y);
clip_rect_.x1 = std::min(x + w, rendertarget_params_.minwidth);
clip_rect_.y1 = std::min(y + h, rendertarget_params_.minheight);
}
////////////////////////////////////////////////////////////////////////////////
// main interface
void Rasterizer::rendertargets(int n, RenderTarget* rt[])
{
assert(n <= MAX_RENDER_TARGETS);
RenderTargetParams &rtp = rendertarget_params_;
rtp.count = n;
if (n == 0) return;
rtp.minwidth = rt[0]->width();
rtp.minheight = rt[0]->height();
for (int i = 0; i < n; ++i) {
rendertargets_ = rt;
rtp.minwidth = std::min(rtp.minwidth, rt->width());
rtp.minheight = std::min(rtp.minheight, rt->height());
// cache these to avoid too many virtual function calls later
rtp.element_size = rt->element_size();
rtp.stride = rt->stride();
}
clip_rect_.x0 = 0;
clip_rect_.y0 = 0;
clip_rect_.x1 = rtp.minwidth;
clip_rect_.y1 = rtp.minheight;
}
void Rasterizer::fragment_shader(FragmentShader *fs)
{
assert(fs != 0);
fragment_shader_ = fs;
}
void Rasterizer::draw_triangle(const Vertex &v1, const Vertex &v2, const Vertex &v3)
{
if (!setup_valid()) return;
int64_t zPlane[4];
int64_t wPlane[4];
int64_t vPlane[MAX_VARYING][4];
compute_plane(v1.x, v1.y, v2.x, v2.y, v3.x, v3.y, v1.z, v2.z, v3.z, zPlane);
compute_plane(v1.x, v1.y, v2.x, v2.y, v3.x, v3.y, // interpolate 1/w across triangle
fixdiv<16>(1 << 16, v1.w),
fixdiv<16>(1 << 16, v2.w),
fixdiv<16>(1 << 16, v3.w),
wPlane);
int varying_count = fragment_shader_->varying_count();
for (int i = 0; i < varying_count; ++i)
compute_plane(
v1.x, v1.y, v2.x, v2.y, v3.x, v3.y,
fixdiv<16>(v1.varyings, v1.w),
fixdiv<16>(v2.varyings, v2.w),
fixdiv<16>(v3.varyings, v3.w),
vPlane
);
// Deltas
const int DX12 = v1.x - v2.x;
const int DX23 = v2.x - v3.x;
const int DX31 = v3.x - v1.x;
const int DY12 = v1.y - v2.y;
const int DY23 = v2.y - v3.y;
const int DY31 = v3.y - v1.y;
// Fixed-point deltas
const int FDX12 = DX12 << 4;
const int FDX23 = DX23 << 4;
const int FDX31 = DX31 << 4;
const int FDY12 = DY12 << 4;
const int FDY23 = DY23 << 4;
const int FDY31 = DY31 << 4;
// Bounding rectangle
int minx = (min(v1.x, v2.x, v3.x) + 0xF) >> 4;
int maxx = (max(v1.x, v2.x, v3.x) + 0xF) >> 4;
int miny = (min(v1.y, v2.y, v3.y) + 0xF) >> 4;
int maxy = (max(v1.y, v2.y, v3.y) + 0xF) >> 4;
// consider clipping rectangle
minx = std::max(minx, clip_rect_.x0);
maxx = std::min(maxx, clip_rect_.x1);
miny = std::max(miny, clip_rect_.y0);
maxy = std::min(maxy, clip_rect_.y1);
// Start in corner of 8x8 block
minx &= ~(BLOCK_SIZE - 1);
miny &= ~(BLOCK_SIZE - 1);
BufferPointers buffers;
for (int i = 0; i < rendertarget_params_.count; ++i)
buffers.ptr = (char*)rendertargets_->buffer_pointer() + miny * rendertargets_->stride();
// Half-edge constants
int C1 = DY12 * v1.x - DX12 * v1.y;
int C2 = DY23 * v2.x - DX23 * v2.y;
int C3 = DY31 * v3.x - DX31 * v3.y;
// Correct for fill convention
if(DY12 < 0 || (DY12 == 0 && DX12 > 0)) C1++;
if(DY23 < 0 || (DY23 == 0 && DX23 > 0)) C2++;
if(DY31 < 0 || (DY31 == 0 && DX31 > 0)) C3++;
// Loop through blocks
for(int y = miny; y < maxy; y += BLOCK_SIZE)
{
for(int x = minx; x < maxx; x += BLOCK_SIZE)
{
// Corners of block
int x0 = x << 4;
int x1 = (x + BLOCK_SIZE - 1) << 4;
int y0 = y << 4;
int y1 = (y + BLOCK_SIZE - 1) << 4;
// Evaluate half-space functions
bool a00 = C1 + DX12 * y0 - DY12 * x0 > 0;
bool a10 = C1 + DX12 * y0 - DY12 * x1 > 0;
bool a01 = C1 + DX12 * y1 - DY12 * x0 > 0;
bool a11 = C1 + DX12 * y1 - DY12 * x1 > 0;
int a = (a00 << 0) | (a10 << 1) | (a01 << 2) | (a11 << 3);
bool b00 = C2 + DX23 * y0 - DY23 * x0 > 0;
bool b10 = C2 + DX23 * y0 - DY23 * x1 > 0;
bool b01 = C2 + DX23 * y1 - DY23 * x0 > 0;
bool b11 = C2 + DX23 * y1 - DY23 * x1 > 0;
int b = (b00 << 0) | (b10 << 1) | (b01 << 2) | (b11 << 3);
bool c00 = C3 + DX31 * y0 - DY31 * x0 > 0;
bool c10 = C3 + DX31 * y0 - DY31 * x1 > 0;
bool c01 = C3 + DX31 * y1 - DY31 * x0 > 0;
bool c11 = C3 + DX31 * y1 - DY31 * x1 > 0;
int c = (c00 << 0) | (c10 << 1) | (c01 << 2) | (c11 << 3);
// Skip block when outside an edge
if(a == 0x0 || b == 0x0 || c == 0x0) continue;
#define CLIP_TEST(X, Y) ((X) >= clip_rect_.x0 && (X) < clip_rect_.x1 && (Y) >= clip_rect_.y0 && (Y) < clip_rect_.y1)
// test for the clipping rectangle
bool clip00 = CLIP_TEST(x, y);
bool clip10 = CLIP_TEST(x + 7, y);
bool clip01 = CLIP_TEST(x, y + 7);
bool clip11 = CLIP_TEST(x + 7, y + 7);
// skip block if all is clippled
if (!clip00 && !clip10 && !clip01 && !clip11) continue;
bool clip_all_in = clip00 && clip10 && clip01 && clip11;
//! compute attribute interpolants at corners
FragmentData f00;
FragmentData f10;
FragmentData f01;
FragmentData f11;
int xx1 = (x + BLOCK_SIZE) << 4;
int yy1 = (y + BLOCK_SIZE) << 4;
f00.z = solve_plane(x0, y0, zPlane);
f10.z = solve_plane(xx1, y0, zPlane);
f01.z = solve_plane(x0, yy1, zPlane);
f11.z = solve_plane(xx1, yy1, zPlane);
if (!fragment_shader_->early_depth_test(x, y, std::min(std::min(std::min(f00.z, f10.z), f01.z), f11.z)))
continue;
int w00 = fixdiv<16>(1 << 16, solve_plane(x0, y0, wPlane));
int w10 = fixdiv<16>(1 << 16, solve_plane(xx1, y0, wPlane));
int w01 = fixdiv<16>(1 << 16, solve_plane(x0, yy1, wPlane));
int w11 = fixdiv<16>(1 << 16, solve_plane(xx1, yy1, wPlane));
for (int i = 0; i < varying_count; ++i) {
f00.varying = fixmul<16>(solve_plane(x0, y0, vPlane), w00);
f10.varying = fixmul<16>(solve_plane(xx1, y0, vPlane), w10);
f01.varying = fixmul<16>(solve_plane(x0, yy1, vPlane), w01);
f11.varying = fixmul<16>(solve_plane(xx1, yy1, vPlane), w11);
}
//! compute attribute step y left and right
struct varying_step_t {
struct step_info_t {
int step;
int rem;
int error_term;
step_info_t():error_term(0){}
int dostep() {
int r = step;
error_term += rem;
if (error_term >= BLOCK_SIZE) {
error_term -= BLOCK_SIZE;
r++;
}
return r;
}
};
step_info_t z;
step_info_t varying[MAX_VARYING];
varying_step_t(FragmentData& p1, FragmentData& p2, int vc)
{
floor_divmod<BLOCK_SIZE>(p2.z - p1.z, z.step, z.rem);
for (int i = 0; i < vc; ++i) {
floor_divmod<BLOCK_SIZE>(p2.varying - p1.varying, varying.step, varying.rem);
}
}
};
varying_step_t step_left(f00, f01, varying_count);
varying_step_t step_right(f10, f11, varying_count);
BufferPointers block_buffers = buffers;
#define RENDER_TARGET_LOOP for (int i = 0; i < rendertarget_params_.count; ++i)
#define STEP_POINTERS_BY_ELEMENTSIZE(VAR, FACTOR) { RENDER_TARGET_LOOP (char*&)VAR.ptr += FACTOR * rendertarget_params_.element_size; }
#define STEP_POINTERS_BY_STRIDE(VAR) { RENDER_TARGET_LOOP (char*&)VAR.ptr += rendertarget_params_.stride; }
#define STEP_FRAGMENTDATA(FDVAR, STEPVAR) { FDVAR.z += STEPVAR.z.dostep(); for (int i = 0; i < varying_count; ++i) FDVAR.varying += STEPVAR.varying.dostep(); }
// only copy the neccessary varyings
#define EFFICIENT_COPY(SRC, DST) { DST.z = SRC.z; for (int i = 0; i < varying_count; ++i) DST.varying = SRC.varying; }
#define BLOCK_BEGIN fragment_shader_->prepare_for_block(x, y, pixel_block); for (int iy = 0; iy < BLOCK_SIZE; ++iy) { BufferPointers inner = block_buffers; STEP_POINTERS_BY_ELEMENTSIZE(inner, x); for (int ix = 0; ix < BLOCK_SIZE; ++ix) {
#define BLOCK_END STEP_POINTERS_BY_ELEMENTSIZE(inner, 1); } STEP_POINTERS_BY_STRIDE(block_buffers); }
PixelBlock pixel_block;
bool skip_flag[BLOCK_SIZE][BLOCK_SIZE];
memset(skip_flag, 0, sizeof(skip_flag));
if (!clip_all_in) {
for (int iy = 0; iy < BLOCK_SIZE; ++iy)
for (int ix = 0; ix < BLOCK_SIZE; ++ix)
if (!CLIP_TEST(ix + x, iy + y))
skip_flag[iy][ix] = true;
}
// Accept whole block when totally covered
if(a == 0xF && b == 0xF && c == 0xF)
{
// first compute all fragment data
for(int iy = 0; iy < BLOCK_SIZE; iy++)
{
//! compute attribute step x for this scanline
varying_step_t stepx(f00, f10, varying_count);
FragmentData fragment_data = f00;
for(int ix = 0; ix < BLOCK_SIZE; ix++)
{
EFFICIENT_COPY(fragment_data, pixel_block[iy][ix]);
STEP_FRAGMENTDATA(fragment_data, stepx);
}
//! step left and right attrib y
STEP_FRAGMENTDATA(f00, step_left);
STEP_FRAGMENTDATA(f10, step_right);
}
//! fragment_shader_block (can now use derivatives of attributes)
if (clip_all_in) {
BLOCK_BEGIN
fragment_shader_->shade(inner, pixel_block, ix, iy);
BLOCK_END
} else {
BLOCK_BEGIN
if (!skip_flag[iy][ix])
fragment_shader_->shade(inner, pixel_block, ix, iy);
BLOCK_END
}
}
else // Partially covered block
{
int CY1 = C1 + DX12 * y0 - DY12 * x0;
int CY2 = C2 + DX23 * y0 - DY23 * x0;
int CY3 = C3 + DX31 * y0 - DY31 * x0;
for(int iy = 0; iy < BLOCK_SIZE; iy++)
{
int CX1 = CY1;
int CX2 = CY2;
int CX3 = CY3;
//! compute attribute step x for this scanline
varying_step_t stepx(f00, f10, varying_count);
FragmentData fragment_data = f00;
for(int ix = 0; ix < BLOCK_SIZE; ix++)
{
if(!(CX1 > 0 && CX2 > 0 && CX3 > 0))
skip_flag[iy][ix] = true;
// we still need to do this since the fragment shader might want
// to compute the derivative of attibutes
EFFICIENT_COPY(fragment_data, pixel_block[iy][ix]);
CX1 -= FDY12;
CX2 -= FDY23;
CX3 -= FDY31;
STEP_FRAGMENTDATA(fragment_data, stepx);
}
CY1 += FDX12;
CY2 += FDX23;
CY3 += FDX31;
//! step left and right attrib y
STEP_FRAGMENTDATA(f00, step_left);
STEP_FRAGMENTDATA(f10, step_right);
}
//! fragment_shader_block (can now use derivatives of attributes)
BLOCK_BEGIN
if (!skip_flag[iy][ix])
fragment_shader_->shade(inner, pixel_block, ix, iy);
BLOCK_END
}
}
for (int i = 0; i < rendertarget_params_.count; ++i)
(char*&)buffers.ptr += BLOCK_SIZE * rendertargets_->stride();
}
}
Test program:
/*
Copyright (c) 2007, Markus Trenkwalder
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the <ORGANIZATION> nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include "SDL.h"
#include "rasterizer.h"
#include <cmath>
#include <algorithm>
class SDL_SurfaceRenderTarget : public Rasterizer::RenderTarget {
SDL_Surface *surface;
public:
SDL_SurfaceRenderTarget(SDL_Surface *s):surface(s) {}
virtual int width() { return surface->w; }
virtual int height() { return surface->h; }
virtual int stride() { return surface->pitch; }
virtual int element_size() { return sizeof(int); }
virtual void* buffer_pointer() { return surface->pixels; }
virtual void clear(int x, int y, int w, int h) {}
};
class TestFragmentShader : public Rasterizer::FragmentShader {
public:
virtual int varying_count() { return 3; }
virtual void shade(const Rasterizer::BufferPointers& ptrs, const Rasterizer::PixelBlock& block, int x, int y)
{
unsigned int* color_buffer = (unsigned int*)ptrs.ptr[0];
// unfortunaltely at the corners of the triangle we can get negative
// values for the interpolants -> std::max
int r = std::max(0, block[y][x].varying[0]);
int g = std::max(0, block[y][x].varying[1]);
int b = std::max(0, block[y][x].varying[2]);
int color = r << 16 | g << 8 | b;
*color_buffer = color;
}
};
int main(int ac, char *av[])
{
SDL_Init(SDL_INIT_VIDEO);
SDL_Surface *screen = SDL_SetVideoMode(320, 240, 32, SDL_SWSURFACE);
Rasterizer r;
SDL_SurfaceRenderTarget color_target(screen);
TestFragmentShader fragment_shader;
Rasterizer::RenderTarget* rendertargets[] = { &color_target};
r.rendertargets(1, rendertargets);
r.fragment_shader(&fragment_shader);
r.clip_rect(45, 70, 100, 100);
Rasterizer::Vertex v[3];
v[0].x = (int)(120.0f * 16.0f);
v[0].y = (int)(50.0f * 16.0f);
v[0].z = 0;
v[0].w = 1 << 16;
v[0].varyings[0] = 255;
v[0].varyings[1] = 0;
v[0].varyings[2] = 0;
v[1].x = (int)(20.0f * 16.0f);
v[1].y = (int)(100.0f * 16.0f);
v[1].z = 0x7fffffff;
v[1].w = 1 << 16;
v[1].varyings[0] = 0;
v[1].varyings[1] = 255;
v[1].varyings[2] = 0;
v[2].x = (int)(150.0f * 16.0f);
v[2].y = (int)(220.0f * 16.0f);
v[2].z = 0x7fffffff >> 1;
v[2].w = 1 << 16;
v[2].varyings[0] = 0;
v[2].varyings[1] = 0;
v[2].varyings[2] = 255;
SDL_Rect rect;
rect.x = 45;
rect.y = 70;
rect.w = 100;
rect.h = 100;
SDL_FillRect(screen, &rect, 0xffffffff);
r.draw_triangle(v[0], v[1], v[2]);
SDL_Flip(screen);
SDL_Event e;
while (SDL_WaitEvent(&e) && e.type != SDL_QUIT);
SDL_Quit();
return 0;
}
I don't include the stdint.h file. You can get it yourself if needed.
OK, i now have a domain www.trenki.net where I uploaded a vastly improved version of the renderer. The above code is obsolete, the features are retained.
[Edited by - Trenki on August 22, 2007 5:11:28 AM]