Recreate GPU Pipeline at CPU Level

Started by fs1
12 comments, last by fs1 10 years, 4 months ago
I am working on a research project, and I was wondering if you can recreate (at the CPU level) a GPU rendering pipeline - from the drawing primitives through to the homogeneous coordinates (I suppose you do this by reading constants, shaders, etc.). For example, taking those same primitives and doing some CPU work to reach the same output the GPU produces.
Appreciate some insights on this!
Regards
fs1

Absolutely, this has been done many times. In fact, DirectX now comes with the WARP device, which implements the entire D3D pipeline on the CPU. There are also commercial products like SwiftShader that do the same thing.

To get started, you should google "software rasterization". That should get you plenty of reading material and tutorials to teach you the basics.

Great, thanks. Just one question: is this done by reading the shaders through the standard DirectX API?

You can try this simple pipeline:

unsigned char* buffer;        // screen buffer
float* vertex;                // 3D-space vertices
float view_matrix[16];        // search for the 'LookAt' matrix to learn how to create one
float projection_matrix[16];  // search for the frustum matrix and then the perspective matrix (they are related)
int width;
int height;

buffer = new unsigned char[width * height * 3]; // RGB, 8 bits per channel
vertex = new float[4 * 3];                      // 3 XYZW points to draw a simple triangle; w == homogeneous coord slot (check the projection matrices part)

view_matrix = LookAt(eye, at, up);
projection_matrix = Perspective(fov, aspect, near, far); // inside it you can normally find a Frustum(left, right, top, bottom, near, far) call

foreach (point in vertex)
{
    point.w = 1.0f;

    if (!has_vertex_shader)
    {
        point = Multiply(view_matrix, point);
        point = Multiply(projection_matrix, point);
    }
    else point = VertexShader(point);

    point /= point.w; // 'w' will differ from 1.0 depending on where the point sits between 'near' and 'far'

    point.x = ((point.x + 1.0f) * 0.5f) * width;  // 2D screen-space x
    point.y = ((point.y + 1.0f) * 0.5f) * height; // 2D screen-space y
}

foreach (triangle in vertex) // iterate the topology you defined
{
    Bresenham(buffer, triangle); // use a Bresenham-style triangle raster algorithm to fill the buffer
    // For each pixel to be painted, call 'pixel = PixelShader(pixel)';
}

Flush(buffer); // sends the buffer to the graphics API (you could use glTexImage2D, for instance)
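For reference, here is a minimal sketch of what the Perspective() call above might build. It assumes an OpenGL-style clip space (z in [-1, 1]) and column-major storage; the function name and the out parameter are illustrative, and other conventions (D3D's z in [0, 1], row-major matrices) change the exact entries.

#include <cmath>

// Hypothetical helper matching the Perspective(fov, aspect, near, far) call above.
// Assumes OpenGL-style clip space and column-major storage; other conventions differ.
void Perspective(float fov_y_radians, float aspect, float z_near, float z_far, float out[16])
{
    const float f = 1.0f / std::tan(fov_y_radians * 0.5f);

    for (int i = 0; i < 16; ++i) out[i] = 0.0f;

    out[0]  = f / aspect;                                 // x scale
    out[5]  = f;                                          // y scale
    out[10] = (z_far + z_near) / (z_near - z_far);        // z remap
    out[11] = -1.0f;                                      // copies -z_view into w_clip (drives the perspective divide)
    out[14] = (2.0f * z_far * z_near) / (z_near - z_far); // z translation
}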

Starting from this you can look into optimizations and where to fit your pseudo-shader calls; there is also depth buffer theory and so on (see the sketch below for one way the depth test and the pixel-shader hook can fit into the raster loop).
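A hedged sketch of that last point: the Framebuffer type, ShadePixel name, and function-pointer PixelShader here are placeholders for the example, not part of any real API. The idea is just that the triangle filler calls into this for every pixel it wants to paint.

#include <vector>
#include <cstdint>
#include <limits>

struct Color { uint8_t r, g, b; };

struct Framebuffer {
    int width, height;
    std::vector<uint8_t> color;  // width * height * 3, RGB
    std::vector<float>   depth;  // width * height, initialized to +infinity

    Framebuffer(int w, int h)
        : width(w), height(h),
          color(w * h * 3, 0),
          depth(w * h, std::numeric_limits<float>::infinity()) {}

    // Called for every pixel the triangle filler decides to paint.
    void ShadePixel(int x, int y, float z, Color (*PixelShader)(int, int))
    {
        const int idx = y * width + x;
        if (z >= depth[idx]) return;       // depth test: keep only the nearest fragment
        depth[idx] = z;                    // depth write

        const Color c = PixelShader(x, y); // programmable stage: user-supplied callback
        color[idx * 3 + 0] = c.r;
        color[idx * 3 + 1] = c.g;
        color[idx * 3 + 2] = c.b;
    }
};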

Have fun!

[quote]
I am working on a research project, and I was wondering if you can recreate (at the CPU level) a GPU rendering pipeline - from the drawing primitives through to the homogeneous coordinates (I suppose you do this by reading constants, shaders, etc.). For example, taking those same primitives and doing some CPU work to reach the same output the GPU produces.
[/quote]

[url=http://www.mesa3d.org/]Mesa[/url] provides a couple of different OpenGL software implementations (up to ~GL 3.2 or so; support for newer versions is a work in progress).

There is nothing magic about the GPU. It's more efficient at certain kinds of algorithms, but there is absolutely nothing it does that a CPU can't do. The GPU is just much faster at certain highly parallelized vector mathematics and has dedicated ASICs for a few algorithms (texture filtering, triangle rasterization, blending, etc.) that can be completely implemented on a CPU (just not nearly as efficiently).
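As an illustration of one of those fixed-function pieces done in software, here is a minimal sketch of bilinear texture filtering on the CPU. The Texture struct and clamp-to-edge addressing are assumptions made for the example, not any particular API.

#include <vector>
#include <cmath>
#include <algorithm>

// Assumed single-channel texture, just for the example.
struct Texture {
    int width, height;
    std::vector<float> texels; // width * height values

    float Fetch(int x, int y) const {
        x = std::clamp(x, 0, width - 1);   // clamp-to-edge addressing
        y = std::clamp(y, 0, height - 1);
        return texels[y * width + x];
    }
};

// Bilinear filter at normalized coordinates (u, v) in [0, 1].
float SampleBilinear(const Texture& tex, float u, float v)
{
    // Map to texel space, centering samples on texel centers.
    const float x = u * tex.width  - 0.5f;
    const float y = v * tex.height - 0.5f;

    const int   x0 = static_cast<int>(std::floor(x));
    const int   y0 = static_cast<int>(std::floor(y));
    const float fx = x - x0; // fractional parts drive the blend weights
    const float fy = y - y0;

    const float t00 = tex.Fetch(x0,     y0);
    const float t10 = tex.Fetch(x0 + 1, y0);
    const float t01 = tex.Fetch(x0,     y0 + 1);
    const float t11 = tex.Fetch(x0 + 1, y0 + 1);

    const float top    = t00 + (t10 - t00) * fx; // lerp along x
    const float bottom = t01 + (t11 - t01) * fx;
    return top + (bottom - top) * fy;            // lerp along y
}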

Many games shipped back before GPUs were a thing, or before even simple consumer-grade fixed-function triangle-rasterization hardware existed. Some "D3D9 capable" GPUs actually didn't support vertex shaders in hardware and required that they be emulated by the driver (the hardware only implemented the pieces of the pipeline from rasterization onward).

There are current graphics APIs designed for non-realtime rendering in Hollywood movies and the like that extend or replace the standard GPU pipeline and use completely custom shading languages offering far more flexibility than HLSL/GLSL allow (which in turn means that only pieces of them can run on GPU hardware). These renders are often farmed out to huge render farms; modern versions of this software do make use of GPUs for processing, but relatively recent versions were CPU-only.

You could, with only a few days (maybe even only a few hours) of work, write a perspective-correct, depth-buffered, textured triangle rasterizer that used Lua or Python as its vertex and pixel shading language. My alma mater requires students to write a perspective-correct, depth-buffered triangle rasterizer (with custom attributes but not shaders) over two semesters (while learning the basic math required for this and still being neophytes with C). Some students managed to make theirs so fast that they were wrongly accused of cheating and using the GPU.
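For anyone wondering what "perspective-correct" means in code: attributes are interpolated as attribute/w together with 1/w across the triangle, then divided back per pixel. A minimal sketch, where the function name is illustrative and the barycentric weights are assumed to come from your rasterizer:

// Perspective-correct interpolation of one attribute across a triangle.
// b0, b1, b2 are the screen-space barycentric weights of the pixel (summing to 1),
// a0..a2 the attribute at each vertex, and w0..w2 the clip-space w of each vertex.
float InterpolatePerspective(float b0, float b1, float b2,
                             float a0, float a1, float a2,
                             float w0, float w1, float w2)
{
    // Interpolate attribute/w and 1/w linearly in screen space...
    const float inv_w    = b0 / w0 + b1 / w1 + b2 / w2;
    const float a_over_w = b0 * (a0 / w0) + b1 * (a1 / w1) + b2 * (a2 / w2);

    // ...then divide to recover the perspective-correct attribute value.
    return a_over_w / inv_w;
}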

Sean Middleditch – Game Systems Engineer – Join my team!

Great, thank you all for your inputs. I will follow your insights.

Just one last question: how can I reconstruct a GPU render pipeline from the CPU?

Let's say I want to recreate the GPU Pipeline, with CPU code. Can I query the GPU to get shaders, effects, and all necessary information to recreate it?

Shaders and effects will be "user-defined" programmable sections of your graphics engine. You'd need to add the ability to parse shader and effect files, and you'd need to support whatever shaders and effects you want in your engine (see the sketch below for one simple way to expose programmable stages).
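One low-effort alternative to parsing shader files in a CPU pipeline is to expose the programmable stages as plain callbacks. This is just a sketch of that idea, not how any particular engine does it; the stage names and types are made up for the example.

#include <functional>
#include <cstdint>

// Illustrative types for the example.
struct Vec4  { float x, y, z, w; };
struct Color { uint8_t r, g, b, a; };

// The "programmable" stages of a toy software pipeline, supplied by the user.
struct PipelineState {
    std::function<Vec4(const Vec4&)>       vertex_shader; // returns a clip-space position
    std::function<Color(float u, float v)> pixel_shader;  // returns a color from interpolated UVs
};

// Example usage: a pass-through vertex shader and a checkerboard pixel shader.
PipelineState MakeCheckerboardPipeline()
{
    PipelineState state;
    state.vertex_shader = [](const Vec4& v) { return v; };
    state.pixel_shader  = [](float u, float v) {
        const bool on = (static_cast<int>(u * 8) + static_cast<int>(v * 8)) % 2 == 0;
        const uint8_t c = on ? 255 : 0;
        return Color{c, c, c, 255};
    };
    return state;
}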

Back in the day I was forced to write one when MS bought BRender(?) and turned it into DirectX 1.0. I was about to license BRender and MS took it off the market. It took them a year to turn it into DirectX 1.0; in the meantime, I wrote my own perspective-correct, texture-mapped poly engine for SIMTrek/SIMSpace. As I recall, one of the optimizations you'll want is the Sutherland-Hodgman clipping algorithm, which clips triangles to the frustum before rasterization (a sketch of the single-plane step is below).
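For reference, the core of Sutherland-Hodgman is clipping a polygon against one plane at a time; the full frustum clip just runs this against each of the six planes. A minimal sketch, where Vec4 and the plane representation are assumptions made for the example:

#include <vector>

struct Vec4 { float x, y, z, w; };

// Signed distance of a point from a plane given as (a, b, c, d); positive means "inside".
static float PlaneDistance(const float plane[4], const Vec4& p)
{
    return plane[0] * p.x + plane[1] * p.y + plane[2] * p.z + plane[3] * p.w;
}

static Vec4 Lerp(const Vec4& a, const Vec4& b, float t)
{
    return { a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t, a.w + (b.w - a.w) * t };
}

// Sutherland-Hodgman: clip a convex polygon against a single plane.
// Run this once per frustum plane to clip a triangle before rasterization.
std::vector<Vec4> ClipAgainstPlane(const std::vector<Vec4>& poly, const float plane[4])
{
    std::vector<Vec4> out;
    for (size_t i = 0; i < poly.size(); ++i) {
        const Vec4& cur  = poly[i];
        const Vec4& next = poly[(i + 1) % poly.size()];
        const float dc = PlaneDistance(plane, cur);
        const float dn = PlaneDistance(plane, next);

        if (dc >= 0.0f) out.push_back(cur);  // current vertex is inside: keep it
        if ((dc >= 0.0f) != (dn >= 0.0f)) {  // edge crosses the plane: emit the intersection point
            const float t = dc / (dc - dn);
            out.push_back(Lerp(cur, next, t));
        }
    }
    return out;
}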

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Great, thank you. So you mean that if you have the disassembled shader, you can recreate/parse all the arithmetic between the input and the output of the primitives, is that correct?


[quote]
So you mean that if you have the disassembled shader, you can recreate/parse all the arithmetic between the input and the output of the primitives, is that correct?
[/quote]

Yes. That's already done by both WARP and the reference rasterizer. WARP does it by translating the D3D bytecode into x86 assembly. You might find http://www.virtualdub.org/blog/pivot/entry.php?id=350 interesting.

However, you're not going to exactly match a hardware implementation by doing it in software. Various precision issues will affect the result, along with Inf/NaN handling, and those do vary between different hardware.

Great, thank you all so much!

This topic is closed to new replies.
