3D Graphics Primer 1: Textures.
What is a texture?
By classical definition a texture is the visual and esp. tactile quality of a surface (Dictionary.com).
Since current games lack the ability to convey tactile sensations, a texture in game terms simply refers to the visual quality of a surface, with an implicit tactile quality. That is, a rock texture should give the impression of the surface of a rock, and depending on the type, a rough or smooth tactile quality. We see these types of surfaces in real life and feel them in real life. So when we see the texture, we know what it feels like without needing to touch it due to our past experiences. But a lot more goes into making a convincing texture beyond the simple tactile quality.
As you will learn as you read on, textures in games is a very complex topic, with many elements involved in creating them for realtime rendering.
We will look at:
- Texture File Types & Formats
- Texture Image Size
- Texture File Size
- Texture Types
- Tiling Textures
- Texture Bit Depth
- Making Normal Maps (Brief overview only)
- Pixel Density and Fill Rate Limitation
- Further Reading
Creating and using textures is such a big subject that covering it entirely within this one primer is simply not sensible. All I can sensibly achieve here is a skimming over the surface, so here are some links to further reading matter.
Beautiful Yet Friendly - Written by Guillaume Provost, hosted by Adam Bromell. This is a very interesting article that goes into some depth about basic optimizations and the thought process when designing and modeling a level. It goes into the technical side of things to truly give you an understanding on what is going on in the background. You can use the information in this article to find out how to build models that use fewer resources -- polygons, textures, etc. -- for the same results.
This is the reason why this is the first article I am linking to: it is imperative to understand the topics discussed in this article. If you need any extra explanation after reading it, you can PM me and I am more than happy to help. However, parts of this article go outside the texture realm of things and into the mesh side, so keep that in mind if you're focusing on learning textures at the moment.
UVW Mapping Tutorial - by Waylon Brinck. This is about the best tutorial I have found for a topic that gives all 3D artists a headache: unwrapping your three-dimensional object into a two-dimensional plane for 2D painting. It is the process by which all 3D artists place a texture on a mesh (model). NOTE: while this tutorial is very good and will help you in learning the process, UVW mapping/unwrapping is just one of those things you must practice and experiment with for a while before you truly understand it.
Poop In My Mouth's Tutorials - By Ben Mathis. Probably the only professional I know who has such a scary website name, but don't be afraid! I swear there is nothing terrible beyond that link. He has a ton of excellent tutorials, short and long, that cover both the modeling and texturing processes, ranging from normal-mapping to UVW unwrapping. You may want to read this reference first before delving into some of his tutorials.
Texture File Types & Formats
In the computer world, textures are really nothing more than image files applied to a model. Because of this, a variety of common computer image formats can be used. These include, .TGA, .DDS, .BMP, and even .JPG (or .JPEG). Almost any digital image format can be used, but some things must be taken into consideration:
In the modern world of gaming, being heavily reliant on shaders, formats like the .JPG format are rarely used. This is because .JPG, and others like it, are lossy formats, where data in the image file is actually thrown away to make the file smaller. This process can result in compression artifacts The problem is that these artifacts will interfere with shaders, because these rely on having all the data contained within the image intact. Because of this, lossless formats are used -- formats like .DDS (if lossless option chosen), .BMP, and .TGA.
However, there is such a thing called S3TC (also known as "dxt") compression. This was a compression technique developed for use on Savage 3D graphics cards, with the benefit of keeping a texture compressed within video memory whereas non-S3TC-compressed textures are not. This results in a 1:8 or greater compression ratio and can allow either more textures to be used in a scene, or can be used to increase the resolution of a texture without using more memory. S3TC compression can be made to work with any format, but is most commonly associated with the .DDS format.
Just like the .jpg and other lossy formats, any texture using S3TC will suffer compression artifacts, and as such is not suitable for normal maps, (which we'll discuss a little later on).
Even with S3TC it is common to use a lossless format for the texture format, and then apply S3TC when necessary. This is done to provide an artist with the ability to have lossless textures when needed -- e.g. for normal maps -- but then provide them with a method for compression on textures that could benefit from S3TC compression, such as diffuse textures.
Texture Image Size
The engineers who design computers and their component parts like us to feed data to their hardware in chunks that have dimensions defined as powers of two. (E.g. 16 pixels, 32 pixels, 64 pixels, and so on.) While it is possible to have a texture that is not the power of two, it is generally a good idea to stick to power-of-two sizes for compatibility reasons (especially if you're targeting older hardware). That is, if you're creating a texture for a game, you want to use image dimensions that are the power of two. Examples, 32x32, 16x32, 2048x1024, 1024x1024, 512x512, 512x32, etc. Say for example, you have a mesh/model and you're UV Unwrapping, for a game you must work within dimensions that are a power of two.
Powers of two include: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, and so on.
What if you want to use images that aren't powers of two? In such cases, you can often use uneven pixel ratios. This means you can create your texture at, say, 1024x768, and then you can save it as 1024x1024. When you're applying your texture to your mesh you can stretch it back out in proportion. However, it is best to go for a 1:1 pixel ratio and create the texture starting at the power of 2, but the stretching is one method for getting around this if needed
Please refer to the "Pixel Density and Fill Rate Limitation" section for more in depth info on how exactly to choose your image size.
Texture File Size
File size is important for a number of reasons. The file is the actual amount of memory (permanent or temporary) that the texture requires.
For an obvious example, an uncompressed .BMP could be 6MB, this is the space it requires to be saved on a hard drive and within the video and/or system RAM. Using compression we can squeeze the same image into a file size of, say, 400 KB, or possibly even smaller. Compression, like that used by .JPG and other similar formats, will only do compression on permanent storage media, such as hard drives. That is to say, when the image is stored in video card memory it must be uncompressed within the memory, so you truly only get a benefit on storing the texture on its permanent medium but not in memory during use.
The key benefit of the S3TC compression system is that it compresses on hard drives, discs, and other media, while also staying compressed within video card memory.
But why should you care what size it is within the video memory?
Video cards have onboard memory called, unsuprisingly enough, video memory, for storing data that needs to be accessed fast. This memory is limited, so considerations on the artists' part must be used. The good news is that video card manufacturers are constantly adding more of this video memory -- known as Video RAM (or VRAM for short). Where once 64 MB was considered good, we can now find video cards with up to 1 GB of RAM.
The more textures you have, the more RAM will be used. The actual amount is based primarily on the file size of each texture. Other data take up video memory, such as the model data itself, but the majority is used for textures. For this reason, it is a good idea to both plan your video memory usage and test how much you're using once in the engine. It is also advised that you have a minimum hardware configuration for what you want your game to run on. If this is the case, then you should always make sure your video memory usage does not go over the minimum target hardware's amount.
Another advantage of in-memory compression like S3TC is that it can increase available bandwidth. If you know your engine on your target hardware may be swapping textures back and forth frequently (something that should be avoided if possible, but is a technique used on consoles), then you may want to consider having the textures compressed and then decompress them on the fly. That is to say, you have the textures compressed, and then when they're required, they're transported and then decompressed as they're added to video memory. This results in less data having to be shunted across the graphics card's bus (transport line) to and from the computer's main memory, resulting in less bandwidth utilization, but with the added penalty of a few wasted processing clocks.
Now we're going to discuss such things are diffuse, normal, specular, parallax and cube mapping.
Aside from the diffuse map, these mapping types are common in what have become known as 'next-gen' engines, where the artist is given more control over how the model is rendered by use of control maps for shaders. Diffuse maps are textures which simply provide the color and any baked-in details. Games before shaders simply used diffuse textures. (You could say this is the 'default' texture type.) For example, when texturing a rock, the diffuse map of the rock would be just the color and data of that rock. Diffuse textures can be painted by hand or made from photographs, or a mixture of both.
However, in any modern engine, most lighting and shadow detail is preferred to not be 'baked' (i.e. pre-drawn directly into the texture) into the diffuse map, but instead to have just the color data in the diffuse and use other control maps to recreate such things as shadows, specular reflections, refraction effects and so on. This results in a more dynamic look for the texture overall, helping its believability once in-game. 'Baking' such details will hinder the game engine's ability to produce "dynamic" results, which will cause the end result to look unrealistic.
There is sometimes an exception to this rule: if you're providing "general" shading/lighting like ambient occlusion maps baked (merged) into the diffuse, then it is ok. These types of additions into the diffuse are general enough that they won't hinder the dynamic effect of running in a realtime engine, while achieving a higher level of realism.
Another point to remember is that while textures sourced from photographs tend to work very well in 3D environment work, it is often frowned upon to use such 'photosourced textures' for humans. Human diffuse textures are usually hand-painted.
Normal maps are used to provide lighting detail for dynamic lighting, however this is involved in an even more important role, as we will discuss shortly. Normal maps get their name from the fact that they recreate normals on a mesh using a texture. A 'normal' is a point (actually a vector) extending from a triangle on a mesh. This tells the game engine how much light the triangle should receive from a particular light source -- the engine simply compares the angle of the normal with the position and angle of the light itself and thus calculates how the light strikes the triangle. Without a normal map, the game engine can only use the data available in the mesh itself; a triangle would only have three normals for the engine to use -- one at each point -- regardless of how big that triangle is on the screen, resulting in a flat look. A normal map, on the other hand, creates normals right across a mesh's surface. The result is the capability to generate a normal map from a 2 million poly mesh and have this lighting detail recreated on a 200 poly mesh. This can allow an artist to recreate a high-poly mesh with relatively low polygons in comparison.
Such tricks are associated with 'next-gen' engines, and are heavily used in screenshots of the Unreal 3 Engine, where you can see large amounts of detail yet able to run in realtime due to the actual amount of polys used. Normal maps use the different color channels (red, green, and blue) for storing gray scale data of lighting information at different angles, however there is no set standard for how these channels can be interpreted and as such sometimes require a channel to be flipped for correct function in an engine.
Normal maps can be generated from high-poly meshes, or they can be image generated, that is, being generated from a gray scale map. However, high-poly generated normal maps tend to be preferred as they tend to have more depth. This is for various reasons, but the main one is that it is difficult for most to paint a grayscale map that is equal to the quality you'll receive straight from a model. Also, you will often find that the image generators tend to miss details, requiring the artist to edit the normal map by hand later. However, it is not impossible to receive near equal results using both methods, each have their own requirements, it is up to you on which to use in each situation
(Please refer to the tutorial section for extended information on normal maps.)
Specular maps are simple in creation. They are easy to paint by hand, because the map is simply a gray scale map that defines the specular level for any point on a mesh. However, in some engines and in 3D modeling applications this map can be full-color so the brightness defines the specular level and color defines the color of the specular highlights. This gives the artist finer control to produce more lifelike textures because the specific specular attribute of certain materials can be more defined and closer to reality. In this way you can give a plastic material a more plastic-like specular reflection, or simulate a stone's particular specular look.
Parallax mapping picks up where regular normal mapping fails. We create normals on a model by use of a normal map. A parallax map does the same thing as a normal map except it samples a grayscale map within the alpha channel. Parallax mapping works by using the angles recorded in the tangent space normal maps along with the heightmap to calculate which way to displace the texture coordinates. It uses the grayscale map to define how much the texture should be extruded outwards, and uses the angle information recorded in the tangent normal map to determine the angle to offset the texture. Doing things this way a parallax map can then recreate the extrusion of a normal map, but without the flattening that visible in ordinary normal mapping due to lack of data at different perspectives. It mainly gets its name from how the effect is created by the parallax effect. The end results are cleaner and deeper extrusions with less flattening that occurs in normal mapping because the texture coordinates are offset with your perspective.
Parallax mapping is usually more costly than normal mapping. Parallax also has the limitation of not being used on deformable meshes because of the tendency for the texture to "swim" due to the way textures are offset.
Cube mapping uses a texture that is not unlike an unfolded box and acts just like a sky box when used. Basically this texture is designed like an unfolded box where it all is folded back together when being used. This allows us to create a 3d dimensional reflection of sorts. The result is very realistic precomputed reflections. For example, you would use this technique to create a shiny, metal ball. The cube map would provide the reflection data. (In some engines, you can tell the engine itself to generate a cube map from a specific point in the scene and have it apply the result to a model. This is how we get shiny, reflective cars in racing games, for example. The engine is constantly taking cube map 'photos' for each car to ensure it reflects its surroundings accurately.)
Now, while you can have unique, specific textures made for specific models, a common thing to do to both save time and video memory, is tiling textures. These are textures which can be fitted together like floor or wall tiles, producing a single, seamless texture/image. The benefit is that you can texture an entire floor in an entire building using a single tiling texture, which both saves the artist's time, and video memory due to fewer textures being needed.
A tiling texture is achieved by having the left and right side of a texture blend into each other, and the same for the bottom and top of the texture. Such blending can be achieved by the use of a specialist program, a plugin, or by simply offsetting the texture a little to the left and down, cleaning up the seams, and then offsetting back to the original position
After you create your first tiling texture and test it, you're bound to see that each texture will produce a 'tiling artifact' which shows how and where the texture is tiled. Such artifacts can be reduced by avoiding high-contrast detail, unique detail (such as a single rock on a sand texture), and by tiling the texture less.
Texture Bit Depth
A texture is just an image file. This means most of the theory you're familiar with when it comes to working with images also applies to textures. One such would be bit depth. Here you will see such numbers as 8, 16, 24, and 32 bits. These each correspond to the amount of color data that is stored for each image.
How do we get the number 24 from an image file? Well, the number 24 refers to how many bits it contains. That is, that you have 3 channels, red, green, and blue, all of which are simply channels which contain a gray scale image, but are added together to produce a full color image. So black, in the red channel, means "no red" at that point, while white in the red channel means "lots of red". Same applies to blue and green. When these are combined, they produce a full color image, a mix of Red, Green and Blue (if using the RGB color model). The bits come in by the fact that they define how many levels of gray each channel has: 8 bits per channel, over 3 channels, is 24 bits total.
8 bits gives you 256 levels of gray. Combining the idea that 8 bits gives you 256 levels of gray and that each channel is simply gray scale and different levels of gray define a level within that color, we can then see that a 24 bit image will give us 16,777,216 different colors to play with. That is, 8 bits x 3 channels= 24 bits, 8 bits= 256 gray scale levels, so 256 x 3= 16,777,216 colors. This knowledge comes in useful when at certain times it is easier to edit the RGB channels individually, with a correct understanding you can then delve deeper into editing your textures.
However, with the increase in shader use, you'll often see a 32 bit image/texture file. These are image files which contain 4 channels, each of 8 bits: 4 x 8 = 32. This allows a developer to use the 4th channel to carry a unique control map or extra data needed for shaders. Since each channel is gray scale, a 32 bit image is ideal to carry a full color texture along with an extra map within it. Depending on your engine you may see a full color texture with the extra 4th channel being used to hold a gray scale map for transparency (more commonly known as an "alpha channel"), specular control map, or a gray scale map along with a normal map in the other channels to be used for parallax mapping.
As you paint your textures you may start to realize that you're probably not using all of the colors available to you in a 24 bit image. And you're probably right, this is why artists can at times use a lower bit depth texture to achieve the same or near the same look with a lesser memory footprint. There are certain cases where you will more than likely need a 24 bit image however: If your image/texture contains frequent gradations in the color or shading, then a 24 bit image is required. However, if your image/texture contains solid colors, little detail, little or no shading, and so on, you can probably get away with a 16, 8, or perhaps even a 4 bit texture.
Often, this type of memory conservation technique is best done when the artist is able to choose/make his own color pallette. This is where you hand pick the colors that will be saved with your image, instead of letting the computer automatically choose. By using a careful eye you have the possibility to choose more optimal colors which will fit your texture better. Basically, in a way, all you're doing is throwing out what would be considered useless colors which are being stored in the texture but not being used.
Making Normal Maps
There are two methods for creating normal maps:
- Using a detailed mesh/model.
- Creating a normal map from an image.
The first method is part of a common workflow that nearly all modelers who support normal map generation use. For generating a normal map from a model you can either generate it out to a straight texture, or if you're generating your normal map from a high-poly mesh, it is common to then model the low poly mesh around your high-poly mesh. (Some artists have prefer for modeling the low-poly version first while others like to do the high then the low, in the end there is no perfect way, its just preference.)
For example: you have a high-poly rock, you will then model/build a low poly mesh around the high, then UVW unwrap it, and generate the normal map from your high-poly version. Virtual "rays" will be cast from the high- to the low-poly model -- a technique known as "projecting". This allows for a better application of your high-poly mesh normal map to your low-poly mesh since you're projecting the detail from the high to the low. However, some applications will switch the requirements and have your low poly mesh be inside your high, and others allow the rays for generating the normal map to be cast both ways. So refer to your application tutorials for how to do this as it may vary.
Creating a normal map from an image.
This method can be quicker than the more common method described above. For this, all you need is an edited version of your diffuse map. The key to good-looking image-based normal maps is to edit out any unneeded information for your grayscale diffuse texture. If your diffuse has any baked-in specular, shadows, or any information that does not define depth, this needs to be removed from the image. Also, anything that is extra, like strips of paint on a concrete texture, that too should be edited out. This is because, just like bump maps, and displacement maps, the colour of the pixels defines depth, with lighter pixels being "higher" than darker pixels. So, if you have specular (will turn white when made gray), it will be interpreted as a bump in your normal map, you don't want this if the specular in fact lays on a flat surface. The same applies to shadows and everything else mentioned: it will all interfere with the normal map generation process. You simply want to only have various shades of gray represent various depths, any other data in the texture will not produce the correct results.
For generating normal maps you can always use the Nvidia Plugin, however it takes a lot of tweaking to get a good looking normal map. As such, I recommend Crazy Bump!. Crazy Bump will produce some very good normal maps if the given texture it is generating it from is good.
Combining the two methods.
It is common, even if you're generating a normal map from a 3d high-poly mesh to then generate an image generated normal map and overlay it over the high-poly generated one. This is done by generating one from your diffuse map, filling the resulting normal map's blue channel with 128 neutral gray, and then overlaying this over your high-poly generated one. This is done to add in those small details that only the image can generate. This way you get the high frequency detail along with the nice and cleanly generated mid-to-low frequency detail from your high-poly generated normal map.
Pixel Density and Fill Rate Limitation
Let's say you have a coin that you just finished UVW unwrapping, it will indeed be very small once in-game, however you decide it would be fine to use a 1024x1024 texture.
What is wrong with the above situation? Firstly, you shouldn't need to UVW unwrap a coin! Furthermore, you should not be applying the 1024x1024 texture! Not only is this wasteful of video memory, but it will result in uneven pixel density and will increase your fill rate on that model for no reason. A good rule of thumb is to only use the amount of resources that would make sense based on how much screen space an object will take up. A building will take up more of the screen than a coin, so it needs a higher resolution texture and more polygons. A coin takes up less screen space and therefore needs fewer polys and a lower resolution texture to obtain a similar pixel density.
So, what is pixel density? It is the density of each pixel from a texture on a mesh. For example, take the UVW unwrapping tutorial linked to in the "Texture Image Size" section: There you will see a checkered pattern, this is not only used to make sure the texture is applied right, but to also keep track of pixel density. If you increased the pixel density, you would see the checkered pattern get more dense; if you decrease the density, the checkered pattern would be less dense, with fewer squares showing.
Maintaining a consistent pixel density in a game helps all of the art fit together. How would you feel if your high pixel density character walks up to a wall with a significantly lower pixel density? Your eyes would be able to compare the two and see that the wall looks like crap compared to the character, however would this same thing happen if the character were near the same pixel density of the wall? Probably not -- such things only become apparent (within reason) to the end user if they have something to compare it to. If you keep a consistent pixel density throughout all of the game's assets, you will see all of it fits together better.
It is important to note that there is one other reason for this, but we'll come to it in a moment. First, we need to look at two related problems that can arise: transform(ation) limited and fill-rate limited modeling. A transform-limited model will have less pixel density per polygon than a fill-rate limited model where it has a higher pixel density per polygon. The theory is that a model takes longer on either processing the polys, or processing the actual pixel-dense surface. Knowing this, we can see that our coin, with very few polys will have a giant pixel density per polygon, resulting in a fill rate limited mesh. However, it does not need to be fill rate limited if we lower the texture resolution, resulting in a lower pixel density.
The point is that your mesh will be held back when rendering based on which process takes longer: transform or fill rate. If your mesh is fill rate limited then you can speed up its processing by decreasing its pixel density, and its speed will increase until you reach transform limitation, in which your mesh is now taking longer to render based on the amount of polygons it contains. In the latter case, you would then speed up the processing of the model by decreasing the amount of polygons the model contains. That is, until you decrease the polygon count to the point where you're now fill rate limited once again! As you can see, it's a balancing act. The trick is to maximize the speed of the transform and fill rate processing (minimize the impact of both as much as you can), to get the best possible processing speed for your mesh.
That said, being fill rate limited can sometimes be a good thing. The majority of "next-gen" games are fill rate limited primarily because of their use of texture/shading techniques. So, if you can't possibly get any lower on the fill rate limitation and you're still fill rate limited, then you have a little bit of wiggle room to work around where you can actually introduce more polygons with no performance hit. However, you should always try to cut down on fill rate limitations when possible because of general performance concerns
Some methods revolve around splitting up a single polygon into multiple polygons on a single mesh (like a wall for example). This works by then decreasing the pixel density and processing (shaders) for the single polygon by splitting the work into multiple polygons. There are other methods for dealing with fill rate limitation, but mainly it is as simple as keeping your pixel density at a reasonable level.
It is fitting that after we discuss pixel density and fill rate limitation that we discuss a thing called Mipmapping. Mipmaps (or mip maps) are a collection of precomputed lower resolution image copies for a texture contained in the texture. Let's say you have a 1024x1024 texture. If you generate mipmaps for your texture, it will contain the original 1024x1024 texture, but it will also contain a 512x512, 256x256, 128x128, 64x64, 32x32, 16x16, 8x8, 4x4, 2x2 version of the same texture, (exactly how many levels there are is up to you). These smaller mipmaps (textures) are then used in sequence, one after the other, according to the model's distance from the camera in the scene. If your model uses a 1024x1024 texture up close, it may be using a 256x256 texture when further away, and an even smaller mipmap texture level when it's way off in the distance. This is done because of many things:
- The further away you are from your mesh, the less polygonal and texture detail is needed. This is because all displays have a fixed display resolution and it is physically impossible for the player to decipher the detail of a 1024x1024 texture and 6,000 polygon mesh when the model takes up only 20 pixels on screen. The further away we are from the mesh, the fewer the polygons and the lower the texture resolution we need to render it.
- Because of the whole fill rate limitation described above, it is beneficial to use mipmaps as less texture detail must be processed for distant meshes. This results is a less fill-rate-heavy scene because only the closer models are receiving larger textures, whereas more distant models are receiving smaller textures.
- Texture filtering. What happens when the player tries to view a 1024x1024 texture at such a distance that only 40 pixels are given to render the mesh on screen? You get noise and aliasing artefacts! Without mipmaps, any textures too large for the display resolution will only result in unneeded and unwanted noise. Instead of filtering out this noise, mipmaps use a lower resolution texture for different distances, this results in less noise.
It is important to note, that while mipmaps will increase performance overall, you're actually increasing your texture memory usage. The mipmaps and the whole texture will be loaded into memory. However, it is possible to have a system where the user or dev can select the highest mipmap they want and the ones higher than this limit will not be loaded into memory (as the system we're using now), however the mipmaps which meet or are lower than this limit will still be loaded into memory. It is widely agreed that the benefits of mipmaps vastly outweigh the small memory hit.
NOTE: Mesh detail can also affect performance, so the equivalent method used for mesh detail is known as LOD -- "Level Of Detail. Multiple versions of the mesh itself are stored at different levels of detail. The less-detailed mesh is rendered when it's a long way away. Like mipmaps, a mesh can have any number of levels of detail you feel it requires.
The image below is of a level I created which makes use of most of what we've discussed. It makes use of S3TC compressed textures, normal mapping, diffuse mapping, specular mapping, and tiled textures.