RegularKid

Is Carmack Crazy?

RegularKid    139
Ok, I'm currently making a GBA game using a Wolfenstein-like engine (raycasting). Now, I have Carmack's code from Wolfenstein but can't seem to figure out how he does the drawing. Here is the problem I'm having: If I am close up to a wall, then the wall obviously fills the entire screen. In order to draw each wall strip (2 bytes wide each) I need to figure out each wall pixel's texture coordinate. Now, I have optimized this as much as I can think of using huge look-up tables and pure assembly but can't seem to use fewer than 5 or 6 instructions per pixel. Since the resolution I'm using is 240x160, then this equals around 96000 to 115200 instructions to draw the entire screen!!! This is obviously too much and my frame rate is really bad! But, for the life of me I can't figure out Carmack's drawing code! He must be doing something crazy to get the speed he needs. The GBA port of Wolfenstein runs very smoothly (I'm estimating 30-60 fps), so it's definitely possible to draw all this and still have game logic going on. So, my question is: Does anyone have any idea how Carmack is able to draw so much and keep a good framerate? Or does anyone have any suggestions for speeding up the drawing part of a Wolfenstein-like raycasting engine, since that is the most costly part? Thanks soooooo much. Any help would be absolutely wonderful!

walkingcarcass    116
A Wolfenstein-style raycaster could conceivably do 1 or 2 cycles per pixel plus a few extra per line. Post some code.

********


A Problem Worthy of Attack
Proves Its Worth by Fighting Back

coelurus    259
Look at how Carmack did the texturing code. The strip renderer is done the other way around than usual, but that's not very important. What's important is that he makes one specialized piece of code for EVERY wall height. So, when he's going to draw a wall with height 20, he picks the pre-compiled code for drawing a wall with height 20.
I found this in the PC version; I have no idea how it works on the GBA. When you gain some FPS for your raycaster, try to figure out how they made DOOM.
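
One way to picture the per-height idea in portable C is a lookup table per wall height instead of generated machine code. The sketch below is not Carmack's code and is only a stand-in for it: `row_map`, `build_row_maps`, `draw_strip`, and the 64-texel texture size are all assumptions made for illustration, and walls taller than the screen would still need clipping.

```c
#include <assert.h>
#include <stdint.h>

#define TEX_SIZE   64   /* assumed texture height in texels */
#define MAX_HEIGHT 160  /* screen height in pixels */

/* row_map[h][y] = texture row to sample for screen row y of a wall h pixels tall. */
static uint8_t row_map[MAX_HEIGHT + 1][MAX_HEIGHT];

/* Build every per-height table once, at start-up, so no scaling math
   is left in the drawing loop. */
void build_row_maps(void)
{
    for (int h = 1; h <= MAX_HEIGHT; h++)
        for (int y = 0; y < h; y++)
            row_map[h][y] = (uint8_t)((y * TEX_SIZE) / h);
}

/* Draw one wall strip. tex_col points at one texture column (TEX_SIZE texels),
   dst at the top pixel of the strip, stride is the distance between rows. */
void draw_strip(uint16_t *dst, int stride, const uint16_t *tex_col, int height)
{
    assert(height >= 1 && height <= MAX_HEIGHT);
    const uint8_t *map = row_map[height];  /* select the table for this height */
    for (int y = 0; y < height; y++) {
        *dst = tex_col[map[y]];
        dst += stride;
    }
}
```

The tables cost about 25 KB here, which is the memory-for-speed trade this approach makes; a true compiled scaler would go one step further and replace the inner loop with straight-line code per height.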

RegularKid    139
coelurus, I'm not sure I follow what you are saying. I've looked through his code a lot but I can't seem to find the exact place where he is actually drawing the walls. By pre-compiled code, do you mean that he is using lookup tables? Thanks

dede    132
quote:

Since the resolution I'm using is 240x160, then this equals around 96000 to 115200 instructions to draw the entire screen!!!



nope, only about 480.

His raycasting engine is not per pixel; it's on a flat plane. His engine works per "vertical wall chunk". I think he places 1/4th of the wall, and draws that 1/4 of a wall, then moves over and draws the next 1/4th. It makes it go much faster.

of course, you lose 1 degree of freedom, but back then, it was amazing.

Guess
I'm probably going to assume the GBA version of Wolf3D is much more advanced than the computer version. They are probably using the GBA Doom port and modifying it to read Wolf3D files.

RegularKid    139
dede,

So he doesn't draw the 240 vertical strips (or 120 if each is 2 bytes wide) separately? How is that possible? Depending on where the player is looking in the world, there are probably a million possible combinations of vertical strips next to each other. How does he just draw a chunk of scaled, textured wall? I'm confused.

Guest Anonymous Poster   
The farther away the camera is from the wall, the wider the strips get if you were in fact "in the world." The strips won't get wider on screen but will cover more of the wall's texture. Look around your cubicle: can you read what's 15 feet away in size 9 font? I sure can't, because as you move farther away from an object, the object loses resolution in your perspective, so to speak. When you do this deliberately in your own code, e.g. with terrain polygons, I think it's called "CLOD," or Continuous Level of Detail. Or something like that. I may be wrong, but this is just from hearing people talk in the forums.

RegularKid    139
Yeah, I understand how to scale my vertical wall strips with textures and everything, but the problem is that I have to calculate each pixel on a vertical strip to figure out what texture pixel to use. This is all fine, except if I am very close to a wall, this means that the wall strips will cover the entire screen. So, I have to figure out the texture pixel for all 160 pixels of 120 strips (resolution is 240x160 and each vertical strip is 2 bytes wide). So this is 19200 pixels to calculate. I have optimized my pixel loop down to 5 assembly instructions per pixel (a load, a store, an add, a compare, and a branch) but it is still too slow for the GBA. Since the GBA Wolfenstein runs very fast, I am led to believe that THERE MUST BE A TRICK that Carmack is using to draw all this and still retain a good frame rate. I'm just asking if anyone knows what technique he is using to do this. coelurus and dede offered some suggestions but I'm not sure I understand them exactly. Any other suggestions? Thanks!!

63616C68h    122
Could you explain in more detail what you're doing? I ask because I've never used the terminology you speak of: 'strips', and the way you combine it with terms I have heard of. I could probably help you, but I'm not following exactly why you need these "5 instructions" per pixel. Or you could explain for those of us who are not fluent in GBA jargon but may be able to "port" advice to your situation through correspondence in architecture.

dmh    122
>> (a load, a store, an add, a compare, and a branch)
You can surely avoid the compare and branch. I suppose you have those for handling overflows, right? A better alternative for overflows, which only works for powers of 2, is a simple AND instruction.
Say you want a value to wrap in the range 0..255: just ADD whatever you need to and then AND with 255.
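
The ADD-then-AND wrap described here can be sketched in C as follows. The names `TEX_SIZE`, `TEX_MASK`, and `advance_wrapped` are hypothetical, and the trick only holds while the size is a power of two, which the static assertion checks.

```c
#include <assert.h>
#include <stdint.h>

#define TEX_SIZE 256u             /* must be a power of two */
#define TEX_MASK (TEX_SIZE - 1u)  /* 255, i.e. 0xFF */

static_assert((TEX_SIZE & TEX_MASK) == 0, "TEX_SIZE must be a power of two");

/* Advance a coordinate and wrap it into 0..TEX_SIZE-1
   with one ADD and one AND, no compare or branch. */
static inline uint32_t advance_wrapped(uint32_t coord, uint32_t step)
{
    return (coord + step) & TEX_MASK;
}
```

For example, `advance_wrapped(250, 10)` gives 4, since 260 wraps past 255.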

hope this helps...

deemage

RegularKid    139
Sure, no prob:

1. I first have to load the texture pixel into a register
2. Next I store it at the current video buffer spot
3. Next I add to the video buffer address to get to the next pixel down
4. Then I compare the video buffer address to see if I am done with the current wall strip
5. Finally I branch to the next wall strip if I am done

This is as optimized as I can think of (I use lookup tables).
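
The five steps above roughly correspond to the C loop below. This is a sketch, not the poster's actual assembly: it assumes the lookup-table stage has already produced a column scaled to `height` entries, and `draw_strip_baseline` and `instructions_per_frame` are hypothetical names. The second function just reproduces the thread's instruction estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Draw one strip from a column that has already been scaled to `height`
   entries (assumed to come from the lookup tables). */
void draw_strip_baseline(uint16_t *dst, size_t stride,
                         const uint16_t *scaled_col, size_t height)
{
    assert(dst != NULL && scaled_col != NULL);
    uint16_t *end = dst + height * stride;
    while (dst != end) {                 /* 4. compare, 5. branch */
        uint16_t texel = *scaled_col++;  /* 1. load the texture pixel */
        *dst = texel;                    /* 2. store it in the video buffer */
        dst += stride;                   /* 3. step to the next pixel down */
    }
}

/* Instruction estimate for a full screen of wall strips. */
unsigned long instructions_per_frame(unsigned strips, unsigned rows,
                                     unsigned per_pixel)
{
    return (unsigned long)strips * rows * per_pixel;
}
```

With 120 strips of 160 rows, 5 instructions per pixel gives 96000 and 6 gives 115200, matching the figures in the first post. Instruction counts understate the real cost, since loads, stores, and taken branches each take more than one cycle on the GBA's CPU.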

Pseudo    100
Why do you have to compare the address to see if you're at the end of a strip every pixel? Isn't the # of pixels per strip fixed? If so, you could simply "unroll" your loop and eliminate that compare/jump for every pixel but the last one. (Yes, that would be a lot of assembly, but you can copy/paste.)
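
A partial version of this unrolling can be sketched in C, assuming a full-height strip of 160 rows. Unrolling by 8 is an arbitrary choice for illustration, not what the post prescribes (it suggests unrolling the whole strip), and `draw_full_strip_unrolled` and `PUT_ROW` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STRIP_ROWS 160  /* fixed strip height: the full screen */
#define UNROLL     8    /* rows drawn per loop iteration */

static_assert(STRIP_ROWS % UNROLL == 0, "STRIP_ROWS must be a multiple of UNROLL");

/* One load, one store, one pointer step: no compare or branch. */
#define PUT_ROW do { *dst = *src++; dst += stride; } while (0)

/* Draw a full-height strip with one compare/branch per 8 pixels
   instead of one per pixel. */
void draw_full_strip_unrolled(uint16_t *dst, size_t stride, const uint16_t *src)
{
    for (int block = 0; block < STRIP_ROWS / UNROLL; block++) {
        PUT_ROW; PUT_ROW; PUT_ROW; PUT_ROW;
        PUT_ROW; PUT_ROW; PUT_ROW; PUT_ROW;
    }
}
```

The loop overhead drops from 160 compare/branch pairs to 20 per strip. An optimizing compiler may already do something similar on its own, so this matters most in hand-written assembly.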

Entz    122
Agreed. Knowing the # of pixels/strip will help immensely, as will loop unrolling.

Also, how many pixels are you reading and writing at once?
Just one or two? ARM registers are 32 bits wide, not 16. You'll see huge speed improvements by moving multiple pixels at once. If strips are only 2 bytes wide (1 pixel?), then why not make them 2 or more pixels wide or, if possible, scan left to right instead of up and down?
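
A minimal sketch of the 32-bit idea, assuming 16-bit pixels and a strip that is two pixels wide, so each row becomes a single word store. `pack_pixels` and `draw_wide_strip` are hypothetical names; in an 8-bit paletted mode the same word would hold four pixels instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack two 16-bit pixels into one word. On a little-endian CPU such as
   the GBA's, `left` ends up at the lower address. */
static inline uint32_t pack_pixels(uint16_t left, uint16_t right)
{
    return (uint32_t)left | ((uint32_t)right << 16);
}

/* Draw a strip two pixels wide with one 32-bit store per row.
   row_words is the framebuffer width measured in 32-bit words. */
void draw_wide_strip(uint32_t *dst, size_t row_words,
                     const uint16_t *left_col, const uint16_t *right_col,
                     size_t height)
{
    assert(((uintptr_t)dst & 3u) == 0);  /* word stores need 4-byte alignment */
    for (size_t y = 0; y < height; y++) {
        *dst = pack_pixels(left_col[y], right_col[y]);
        dst += row_words;
    }
}
```

This halves the number of stores per pair of columns, at the cost of requiring strips to start on an even pixel.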

Maybe post the ASM code you are using (or email it to me), I might be able to give you some suggestions on optimizing it ...

Is it possible that Carmack is using the tile mode instead of Bitmap mode and only rendering small optimized chunks at once?


[edited by - Entz on March 16, 2003 7:27:26 PM]

merlin9x9    174
Like coelurus said, the "trick" is simply having a different routine for each possible strip height. So, instead of one generalized routine that has to look at every pixel in a texture strip, the routine for a given height only handles and draws what's truly necessary. If I recall correctly, this code is generated at runtime, which makes the source even more obfuscated.

Guest Anonymous Poster   
An interesting problem. For maximum speed, I would try something like:

Write out one series for the longest strip (pull pixel, put pixel, add; 160 times). Then calculate the wall strip height, and jump to an instruction part-way through the series (160-calc height) rather than using a loop instruction.

That would give you one jump for the whole strip, plus one pull/put/increment for each point. You trade RAM for speed.
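
In C, jumping part-way into an unrolled series can be approximated with a `switch` whose cases fall through. The sketch below keeps the series to 8 rows so it stays readable; the post's version would have 160. `draw_strip_jump_in`, `PUT_ROW`, and `MAX_ROWS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ROWS 8  /* kept small for illustration; the post suggests 160 */

/* One pull, one put, one increment. */
#define PUT_ROW do { *dst = *src++; dst += stride; } while (0)

/* Draw `height` rows by jumping into the middle of an unrolled series.
   Entering at `case height` leaves exactly `height` PUT_ROWs to execute. */
void draw_strip_jump_in(uint16_t *dst, size_t stride,
                        const uint16_t *src, int height)
{
    assert(height >= 0 && height <= MAX_ROWS);
    switch (height) {      /* the single jump for the whole strip */
    case 8: PUT_ROW;       /* fall through */
    case 7: PUT_ROW;       /* fall through */
    case 6: PUT_ROW;       /* fall through */
    case 5: PUT_ROW;       /* fall through */
    case 4: PUT_ROW;       /* fall through */
    case 3: PUT_ROW;       /* fall through */
    case 2: PUT_ROW;       /* fall through */
    case 1: PUT_ROW;       /* fall through */
    case 0: break;
    }
}
```

The code size grows with the maximum strip height, which is the RAM-for-speed trade the post describes.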

HTH
