Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

19 replies to this topic

### #1KaiserJohan  Members

Posted 26 June 2013 - 02:26 PM

I am making a game and have a resource file format for loading 3d models, with textures and meshes etc defined like this:

/* PackageHeader definition */
struct PackageHeader
{
std::string mSignature;
uint8_t mMajorVersion;
uint8_t mMinorVersion;

};

/* PackageMesh definition */
struct PackageMesh
{
std::vector<Vec3> mVertexData;
std::vector<Vec3> mNormalData;
std::vector<Vec2> mTexCoordsData;
std::vector<uint32_t> mIndiceData;
uint16_t mMaterialIndex;
bool mHasMaterial;

PackageMesh();
};

/* PackageTexture definition */
struct PackageTexture
{
std::string mName;
std::vector<uint8_t> mTextureData;
uint32_t mTextureWidth;         // width/height in pixels
uint32_t mTextureHeight;
ITexture::TextureFormat mTextureFormat;
ITexture::TextureType mTextureType;

PackageTexture();
};

/* PackageMaterial definition */
struct PackageMaterial
{
std::string mName;
PackageTexture mDiffuseTexture;
Vec3 mDiffuseColor;
Vec3 mAmbientColor;
Vec3 mSpecularColor;
Vec3 mEmissiveColor;

PackageMaterial();
};

/* PackageModel definition */
struct PackageModel
{
std::string mName;
std::vector<PackageModel> mChildren;
std::vector<PackageMesh> mMeshes;
Mat4 mTransform;

PackageModel();
};

/* JonsPackage definition */
struct JonsPackage
{
std::vector<PackageModel> mModels;
std::vector<PackageMaterial> mMaterials;

JonsPackage();
};

I am using Boost Serialization to save/load from the filesystem, which up until now has been absolutely wonderful, as it requires almost no code.

However, after importing some 3D models and then trying to load them again, the loading times are enormous: it takes almost 30 seconds to read the file from the filesystem and deserialize it.

This is the code to serialize/deserialize:

JonsPackagePtr ReadJonsPkg(const std::string& jonsPkgName)
{
std::ifstream jonsPkgStream(jonsPkgName.c_str(), std::ios::in | std::ios::binary);        // TODO: support opening of older resource packages
JonsPackagePtr pkg(HeapAllocator::GetDefaultHeapAllocator().AllocateObject<JonsPackage>(), boost::bind(&HeapAllocator::DeallocateObject<JonsPackage>, &HeapAllocator::GetDefaultHeapAllocator(), _1));

if (jonsPkgStream && jonsPkgStream.good() && jonsPkgStream.is_open())
{
std::stringstream buf(std::ios_base::binary | std::ios_base::in | std::ios_base::out);
buf << jonsPkgStream.rdbuf();
buf.seekg(0);
jonsPkgStream.close();

boost::archive::binary_iarchive iar(buf);

iar >> (*pkg.get());
}

jonsPkgStream.close();

return pkg;
}

bool WriteJonsPkg(const std::string& jonsPkgName, const JonsPackagePtr pkg)
{
std::ofstream outStream(jonsPkgName.c_str(), std::ios::out | std::ios::binary | std::ios::trunc);
bool ret = false;

if (outStream.is_open())
{
boost::archive::binary_oarchive oar(outStream);
oar << (*pkg.get());

ret = true;
}

return ret;
}

Here is an image of the VS2012 performance analysis:

The resource file I am using is about 26 MB on disc and contains 3 package models and 14 package textures. What can I do about this? Is my file format design a dead end?

EDIT:

Constructors:

/* JonsPackagePtr definition */
typedef boost::shared_ptr<JonsPackage> JonsPackagePtr;

/*
*/
bool WriteJonsPkg(const std::string& jonsPkgName, const JonsPackagePtr pkg);

{
}

/* PackageModel inlines */
inline PackageModel::PackageModel() : mName(""), mTransform(1.0f)
{
}

/* PackageMesh inlines */
inline PackageMesh::PackageMesh() : mMaterialIndex(0), mHasMaterial(false)
{
}

/* PackageTexture inlines */
inline PackageTexture::PackageTexture() : mName(""), mTextureWidth(0), mTextureHeight(0), mTextureFormat(ITexture::UNKNOWN_FORMAT), mTextureType(ITexture::UNKNOWN_TYPE)
{
}

/* PackageMaterial inlines */
inline PackageMaterial::PackageMaterial() : mName(""), mDiffuseColor(0.0f), mAmbientColor(0.0f), mSpecularColor(0.0f), mEmissiveColor(0.0f)
{
}

/* JonsPackage inlines */
inline JonsPackage::JonsPackage()
{
}


Boost::serialization::serialize, non-intrusive:

template<class Archive>
{
}

template<class Archive>
void serialize(Archive & ar, JonsEngine::PackageModel& model, const unsigned int version)
{
ar & model.mName;
ar & model.mChildren;
ar & model.mMeshes;
ar & model.mTransform;
}

template<class Archive>
void serialize(Archive & ar, JonsEngine::PackageMesh& mesh, const unsigned int version)
{
ar & mesh.mVertexData;
ar & mesh.mNormalData;
ar & mesh.mTexCoordsData;
ar & mesh.mIndiceData;
ar & mesh.mMaterialIndex;
ar & mesh.mHasMaterial;
}

template<class Archive>
void serialize(Archive & ar, JonsEngine::PackageTexture& texture, const unsigned int version)
{
ar & texture.mName;
ar & texture.mTextureData;
ar & texture.mTextureWidth;
ar & texture.mTextureHeight;
ar & texture.mTextureFormat;
ar & texture.mTextureType;
}

template<class Archive>
void serialize(Archive & ar, JonsEngine::PackageMaterial& material, const unsigned int version)
{
ar & material.mName;
ar & material.mDiffuseTexture;
ar & material.mDiffuseColor;
ar & material.mAmbientColor;
ar & material.mSpecularColor;
ar & material.mEmissiveColor;
}

template<class Archive>
void serialize(Archive & ar, JonsEngine::JonsPackage& pkg, const unsigned int version)
{
ar & pkg.mModels;
ar & pkg.mMaterials;
}

template<class Archive>
void serialize(Archive & ar, glm::detail::tmat4x4<glm::mediump_float>& transform, const unsigned int version)
{
ar & transform[0];
ar & transform[1];
ar & transform[2];
ar & transform[3];
}

template<class Archive>
void serialize(Archive & ar, glm::detail::tvec4<glm::mediump_float>& vec, const unsigned int version)
{
ar & vec.x;
ar & vec.y;
ar & vec.z;
ar & vec.w;
}

template<class Archive>
void serialize(Archive & ar, glm::detail::tvec3<glm::mediump_float>& vec, const unsigned int version)
{
ar & vec.x;
ar & vec.y;
ar & vec.z;
}

template<class Archive>
void serialize(Archive & ar, glm::detail::tvec2<glm::mediump_float>& vec, const unsigned int version)
{
ar & vec.x;
ar & vec.y;
}


Edited by KaiserJohan, 27 June 2013 - 06:17 AM.

### #2swiftcoder  Senior Moderators

Posted 26 June 2013 - 09:02 PM

- You may see a performance benefit from disabling iterator checking (#define _HAS_ITERATOR_DEBUGGING 0), or by testing a release build.

- Why are you reading the whole file into a stringstream before loading the archive from it?

- boost::serialization is feature-rich, but not all that fast. Consider a simpler alternative, like Google's protocol buffers.

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #3frob  Moderators

Posted 26 June 2013 - 10:24 PM

With 10% of the time being spent in std::vector::iterator::operator!=(), that is a very clear sign that iterator debugging is killing performance.

Chances are good that you are also spending a huge amount of time in stack frame checks. You find them in VC++ debug builds, and in real-world tests they decrease performance by about 5 times.  (That is, a function that calls lots of other functions that should require 10 microseconds requires 50 microseconds with stack frame checks enabled.)

It is usually good to have (at least) 3 builds.  One is fully debug.  Don't use it unless you absolutely must.  One is fully release with all optimizations enabled. That is what you sell and QA against.

The third is a build with some optimizations turned on (such as automatic inlining) and some debugging info turned off (such as checked iterators and stack frame checks).  That is a good one for general development.

Check out my book, Game Development with Unity, aimed at beginners who want to build fun games fast.

Also check out my personal website at bryanwagstaff.com, where I occasionally write about assorted stuff.

### #4KaiserJohan  Members

Posted 27 June 2013 - 01:21 AM

I'm gonna give it a try in a release build. I can't help but feel that if it already takes 30-ish seconds, even cutting the loading time by a factor of 5 still leaves it way too long for such a small number of assets, but I'll try it first.

### #5frob  Moderators

Posted 27 June 2013 - 01:34 AM

Microsoft decided to have checked iterators in release builds, too. You need to specifically turn them off. It is a speed/safety decision, but most everyone decides the speed is more important. See this MSDN article for details.

### #6KaiserJohan  Members

Posted 27 June 2013 - 04:15 AM

Are you sure? Reading http://msdn.microsoft.com/en-us/library/vstudio/hh697468.aspx, it seems to imply that checked iterators are disabled by default in Release mode, but can be turned on if necessary.
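
One way to settle this for a particular toolchain is to ask the compiler what the headers actually ended up with. The sketch below is not from the thread: it assumes the MSVC macro names `_ITERATOR_DEBUG_LEVEL`, `_SECURE_SCL` and `_HAS_ITERATOR_DEBUGGING`, and `IteratorCheckSummary` is a made-up helper name. On other compilers the macros are simply reported as not defined.

```cpp
#include <string>   // any standard header; on MSVC it pulls in the macro defaults

// Hypothetical helper (not part of the original posts): reports the
// iterator-checking macros as seen by this translation unit.
std::string IteratorCheckSummary()
{
    std::string out;

#if defined(_ITERATOR_DEBUG_LEVEL)
    out += "_ITERATOR_DEBUG_LEVEL=" + std::to_string(_ITERATOR_DEBUG_LEVEL) + "\n";
#else
    out += "_ITERATOR_DEBUG_LEVEL=<not defined>\n";
#endif

#if defined(_SECURE_SCL)
    out += "_SECURE_SCL=" + std::to_string(_SECURE_SCL) + "\n";
#else
    out += "_SECURE_SCL=<not defined>\n";
#endif

#if defined(_HAS_ITERATOR_DEBUGGING)
    out += "_HAS_ITERATOR_DEBUGGING=" + std::to_string(_HAS_ITERATOR_DEBUGGING) + "\n";
#else
    out += "_HAS_ITERATOR_DEBUGGING=<not defined>\n";
#endif

    return out;
}
```

Printing the result from both the Debug and the Release configuration (for example with `std::cout << IteratorCheckSummary();`) shows which checks are really on. As far as I know, VS2010 and later default to `_ITERATOR_DEBUG_LEVEL` 2 in Debug and 0 in Release, while VS2005/2008 left `_SECURE_SCL` at 1 in Release, which may be why the two posts above disagree.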

### #7LorenzoGatti  Members

Posted 27 June 2013 - 05:27 AM

I am making a game and have a resource file format for loading 3d models, with textures and meshes etc defined like this:

...
struct PackageHeader
{
std::string mSignature;
uint8_t mMajorVersion;
uint8_t mMinorVersion;

};
...


I am using Boost Serialization to save/load from the filesystem, which up until now has been absolutely wonderful, as it requires almost no code.

However, after importing some 3D models and then trying to load them again, the loading times are enormous: it takes almost 30 seconds to read the file from the filesystem and deserialize it.

This is the code to serialize/deserialize:


...
iar >> (*pkg.get());
...


Are you kidding? Your post doesn't include the struct constructors or the actual serialization and deserialization code (I assume it is defined in the missing JonsPackagePtr class), the only places where you might be doing something slow. Post a real code sample.

Omae Wa Mou Shindeiru

### #8KaiserJohan  Members

Posted 27 June 2013 - 06:24 AM

Added it. I don't see what it would change, though, as nothing fancy is done; I would assume the problem is in the design layout.

edit: JonsPackagePtr is just a typedef for shared_ptr

Edited by KaiserJohan, 27 June 2013 - 06:25 AM.

### #9Hodgman  Moderators

Posted 27 June 2013 - 08:06 AM

Do you make modifications to the contents of those std::vectors during runtime, or are they read-only assets once they're loaded?

I've largely given up on serialization libraries for use with assets, and just load them in-place, e.g.
//https://code.google.com/p/eight/source/browse/include/eight/core/blob/types.h
#include "eight/core/blob/types.h"

struct PackageHeader
{
StringOffset mSignature;
uint8_t mMajorVersion;
uint8_t mMinorVersion;
};
struct PackageMesh
{
Offset<List<Vec3>> mVertexData;
Offset<List<Vec3>> mNormalData;
Offset<List<Vec2>> mTexCoordsData;
Offset<List<uint32_t>> mIndiceData;
uint16_t mMaterialIndex;
uint16_t mHasMaterial;
};
struct PackageTexture
{
StringOffset mName;
Offset<List<uint8_t>> mTextureData;
uint32_t mTextureWidth;//in pixels
uint32_t mTextureHeight;
uint32_t mTextureFormat;
uint32_t mTextureType;
};
struct PackageMaterial
{
StringOffset mName;
PackageTexture mDiffuseTexture;
Vec3 mDiffuseColor;
Vec3 mAmbientColor;
Vec3 mSpecularColor;
Vec3 mEmissiveColor;
};
struct PackageModel
{
StringOffset mName;
Offset<List<PackageModel>> mChildren;
Offset<List<PackageMesh>> mMeshes;
Mat4 mTransform;
};
struct JonsPackage
{
Offset<List<PackageModel>> mModels;
Offset<List<PackageMaterial>> mMaterials;
};

JonsPackage* Load( FileStuff& files, const std::string& name )
{
char* bytes = files.ReadAllTheBytes( name );
return (JonsPackage*)bytes;//Done, no serialization. Format on disc is same as format in memory.
}
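
For reference, the "read all the bytes" step can be sketched with nothing but the standard library. `files.ReadAllTheBytes` above belongs to the poster's own file system; `ReadAllBytes` below is a made-up stand-in that returns an owning buffer instead of a raw `char*`.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for files.ReadAllTheBytes: loads a whole file with a
// single read() call. Returns an empty vector if the file cannot be read.
std::vector<char> ReadAllBytes(const std::string& path)
{
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary | std::ios::ate);
    if (!in)
        return std::vector<char>();

    const std::streamoff size = in.tellg();
    if (size <= 0)
        return std::vector<char>();

    std::vector<char> bytes(static_cast<std::size_t>(size));
    in.seekg(0, std::ios::beg);
    in.read(bytes.data(), static_cast<std::streamsize>(size));
    if (!in)
        bytes.clear();   // short read: treat as failure

    return bytes;
}
```

The cast from the post would then be applied to `bytes.data()`. The buffer has to stay alive for as long as the cast pointer is in use, and this only works if the file really was written in the in-memory layout (same endianness, same struct packing).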

### #10KaiserJohan  Members

Posted 27 June 2013 - 08:31 AM

They are read-only once they are deserialized.

That solution looks awesome. If I can skip the serialization library I'm all for it, as I can then see no reason for it. Can you give a brief explanation of how this solution would work?

### #11LorenzoGatti  Members

Posted 28 June 2013 - 01:27 AM

Now that there is code, it doesn't seem to do anything suspicious.

• A 26MB file is large, but is it bloated compared to the source assets it was created from? Did you inspect it?
• Are textures compressed? In the file, they should be. The Boost Serialization library allows you to read and write arbitrary byte arrays, so integrating libjpeg, libpng, or other mature libraries to encode and decode image data shouldn't be difficult.
• Maybe std::vector isn't the most lightweight choice. Did you benchmark arrays of primitive types with their special wrapper objects (http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/wrappers.html#arrays)?

Edited by LorenzoGatti, 28 June 2013 - 01:34 AM.

### #12frob  Moderators

Posted 28 June 2013 - 09:34 AM

Are textures compressed? In the file, they should be. The Boost Serialization library allows you to read and write arbitrary byte arrays, so integrating libjpeg, libpng, or other mature libraries to encode and decode image data shouldn't be difficult.

Yes and no.  Yes, they should be using a compressed format, and no they should not be those formats and they should require absolutely zero decoding.

Games don't use jpeg or png.  Those formats require quite a lot of processing to turn into formats that the game can actually use.  Game artists don't use those formats either, they use psd.

Games use DXT1-5 for images because they can be fed directly to the video card in both OpenGL and DirectX and they are nicely compressed.  Even mobile devices support DXT-format images.

If your game is using jpeg images, that's an issue right there.  Jpeg is great for web pages and photos, but horrible for just about everything else.

Edited by frob, 28 June 2013 - 09:35 AM.

### #13KaiserJohan  Members

Posted 28 June 2013 - 02:40 PM

As an update, doing a release build (it took a while to get the projects set up) drastically improved the situation, going from 30-40 seconds to some 400 ms. Like night and day.

I like the idea of a "light-debug" build with _HAS_ITERATOR_DEBUGGING set to 0. Are there any other configurations which could improve debug-build speeds?

### #14frob  Moderators

Posted 28 June 2013 - 03:49 PM

There are all kinds of flags.  Pulling from my current project (not the latest version of Visual Studio) I get these.

/O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "WIN32_LEAN_AND_MEAN" /D "VC_EXTRALEAN" /D "NOMINMAX" /D "_CRT_SECURE_NO_DEPRECATE" /D "_CRT_NONSTDC_NO_DEPRECATE" /D "_SCL_SECURE_NO_DEPRECATE" /D "_SECURE_SCL=0" /D "_MBCS" /FD /MD /GS- /arch:SSE /fp:fast /GR- /W4 /WX /Zi

There are a few more that I left out, such as precompiled headers, include directories, and output file names. But that should give you an idea for what kinds of options go to the compiler for a faster debug build.

### #15swiftcoder  Senior Moderators

Posted 28 June 2013 - 03:56 PM

/D "NDEBUG"

Are you usually in the habit of defining NDEBUG for debug builds?

It's a little odd to turn off assertions in favour of speed...

### #16BGB  Members

Posted 28 June 2013 - 04:17 PM

FWIW, actually, a lot of games (Quake 3, Xonotic, Minecraft, ...) have used formats like JPEG and PNG.

Rage uses HD Photo / JPEG-XR.

...

OTOH, lots of games have also used raw DXT, such as via DDS or VTF.

generally, we want DXT on the GPU though, either via conversion on load, or using a DXT-based format for storage.

a minor downside of on-disk storage of raw DXTn (DXT1 or DXT5) is that it does often take more space than a JPEG version of the texture.

some additional filtering and compression can help here, making it much more competitive space-wise with JPEG (basically, we want an algorithm to merge and eliminate similar looking pixel-blocks, as well as possibly perform some additional entropy coding, *1).

*1: in my case, I am using a custom format here, which uses "block reduction" and a combination of custom LZ77 and Deflate. I am mostly left considering this as an option for video-mapped textures (vs the use of a customized JPEG variant...).

FWIW: JPEG -> DXTn conversion can be sped up some by making a specialized decoder which decodes directly to DXTn.

Edited by cr88192, 28 June 2013 - 04:23 PM.

### #17frob  Moderators

Posted 28 June 2013 - 04:29 PM

/D "NDEBUG"

Are you usually in the habit of defining NDEBUG for debug builds?

It's a little odd to turn off assertions in favour of speed...

Oh, quite.  We have a custom assertion system.  Yeah, unless you have your own custom assertion library, you probably want that one left on for a mixed debug/release build.

### #18Khatharr  Members

Posted 28 June 2013 - 05:19 PM

a minor downside of on-disk storage of raw DXTn (DXT1 or DXT5) is that it does often take more space than a JPEG version of the texture.

Isn't that because JPEG is lossy, though? Or is it still better size with the quality maxxed out?

Edit - nvm, I'm looking at DXT now and I see that it's also lossy.

Question - Do adapters convert DXT textures to table/map when loading, or do they just store in DXT format and do the calculations per-render?

Edited by Khatharr, 28 June 2013 - 05:45 PM.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

### #19BGB  Members

Posted 28 June 2013 - 06:58 PM

a minor downside of on-disk storage of raw DXTn (DXT1 or DXT5) is that it does often take more space than a JPEG version of the texture.

Isn't that because JPEG is lossy, though? Or is it still better size with the quality maxxed out?

Edit - nvm, I'm looking at DXT now and I see that it's also lossy.

Question - Do adapters convert DXT textures to table/map when loading, or do they just store in DXT format and do the calculations per-render?

AFAIK: it is done per-render.

namely, DXT is as it is to save video RAM vs the use of raw RGBA textures.

a lot of JPEG's small size is due to the way the DCT / quantization / entropy coding work. JPEG will often end up using a moderately small number of bits for each block (as a lot of values become zero and are simply skipped), so on average the images come out fairly small.

OTOH, DXT reserves bits for each pixel in every block, as well as storing explicit per-block color information, ...

also, if using DXT5 and mipmaps, these can make the image around 4x larger than if using DXT1 with no mipmaps, which is also an issue.

IME, for many images, DXT1 (with no mipmaps) will be around similar size to a 90-95% quality JPEG, but using DXT5 and/or mipmaps makes it a lot bigger (whereas turning down the quality will make a JPEG a lot smaller).

for JPEG images, it is more common to generate the mipmaps dynamically (so they are not stored). (DXTn doesn't leave as much room for efficiently generating mipmaps dynamically).

standard JPEG also lacks alpha-channel support; this can be hacked on, and the alpha channel will usually compress down pretty well.

whereas, using DXT5 (with a full alpha-channel) effectively doubles the image size.
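
to put rough numbers on the sizes above, here is a small sketch (not from the thread; the function names are made up). it only assumes the standard block sizes: DXT1 stores each 4x4 texel block in 8 bytes, DXT5 in 16 bytes, and a full mip chain halves the dimensions down to 1x1.

```cpp
#include <algorithm>
#include <cstdint>

const std::uint32_t kDxt1BytesPerBlock = 8;   // 4x4 texels in 64 bits
const std::uint32_t kDxt5BytesPerBlock = 16;  // 4x4 texels in 128 bits

// Bytes needed for one mip level stored as 4x4 texel blocks.
std::uint64_t LevelBytes(std::uint32_t width, std::uint32_t height, std::uint32_t bytesPerBlock)
{
    const std::uint64_t blocksWide = std::max<std::uint32_t>(1u, (width + 3u) / 4u);
    const std::uint64_t blocksHigh = std::max<std::uint32_t>(1u, (height + 3u) / 4u);
    return blocksWide * blocksHigh * bytesPerBlock;
}

// Bytes for the top level alone, or for the full chain down to 1x1.
std::uint64_t TextureBytes(std::uint32_t width, std::uint32_t height,
                           std::uint32_t bytesPerBlock, bool withMipmaps)
{
    std::uint64_t total = LevelBytes(width, height, bytesPerBlock);
    while (withMipmaps && (width > 1u || height > 1u))
    {
        width = std::max<std::uint32_t>(1u, width / 2u);
        height = std::max<std::uint32_t>(1u, height / 2u);
        total += LevelBytes(width, height, bytesPerBlock);
    }
    return total;
}
```

for a 1024x1024 texture this gives 524288 bytes for DXT1 without mipmaps and 1398128 bytes for DXT5 with a full mip chain, so by this arithmetic the ratio comes out closer to 2.7x than 4x (2x from the format, roughly 4/3 from the mip chain).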

...

but, granted, some of this is why secondary compression can squeeze down DXTn images somewhat.

for the direct JPEG -> DXT conversion option, this mostly means shaving off a lot of the upper end of the JPEG decoding logic: once the DCT blocks are decoded, we convert the YUV macroblocks directly into DXT (each 16x16 macroblock goes more or less directly into DXT blocks).

granted, there are still time costs, so more directly compressed DXT blocks are faster to decode.

### #20Hodgman  Moderators

Posted 28 June 2013 - 11:36 PM

They are read-only once they are deserialized.

That solution looks awesome. If I can skip the serialization library I'm all for it, as I can then see no reason for it. Can you give a brief explanation of how this solution would work?

The function I posted shows that there is no work to be done in deserialization (it's just a cast of bytes to your structure type). The magic behind that is in the changes to your structures included above the function. The header that I include at the top is open-source (MIT license), or you can implement your own pretty easily by copying what the header does.

Pointers have been replaced with Offset<T>'s, which are basically just a 32-bit integer that holds a relative offset value.
To convert an Offset<T> into a T* at runtime, the operation is:

int& offset = ...;//given an integer offset variable
char* bytes = (char*)&offset;//start from the address of the offset variable itself
bytes += offset;//increment the byte pointer by the value of the integer
T* object = (T*)bytes;//cast the new address to the desired object type


In that header, this is done inside Offset<T>::operator->, so that you can use these offset variables as if they were pointers!

Now that we're not using pointers (which are absolute memory addresses), and we're instead using relative/local memory offsets, we're able to save them to disk. The same offset value can be stored in disk and used in RAM, so no processing is required on load -- instead, every time that you use the Offset<T> variable as a pointer, you pay the cost of an extra addition instruction, which is negligible. Also, performance may be improved despite that additional addition, as now the locality of all your data is guaranteed to be great (it's all in a single contiguous allocation, whereas the std::vector's are at the mercy of wherever new feels like storing your data).
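
To make the idea concrete, here is a minimal sketch of such an offset type. It is not the code from the linked header, just an illustration under the assumptions stated in the comments; `Blob` and `MakeBlob` are made-up names for the demo.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Minimal sketch of a self-relative offset (an illustration, not the actual
// header). The stored value is the distance in bytes from the address of
// mOffset itself to the target object.
template <typename T>
struct Offset
{
    std::int32_t mOffset;

    const T* Get() const
    {
        const char* bytes = reinterpret_cast<const char*>(&mOffset);
        return reinterpret_cast<const T*>(bytes + mOffset);
    }
    const T* operator->() const { return Get(); }
    const T& operator*() const { return *Get(); }
};

// Demo blob: an offset followed somewhere later by the value it refers to.
struct Blob
{
    Offset<float> mValueRef;
    float mValue;
};

Blob MakeBlob(float value)
{
    Blob blob;
    blob.mValue = value;
    // Distance from the offset member to the value member.
    blob.mValueRef.mOffset =
        static_cast<std::int32_t>(offsetof(Blob, mValue) - offsetof(Blob, mValueRef));
    return blob;
}
```

Because the offset is relative to its own address, a `Blob` can be copied byte-for-byte to a different address (or to disk and back) and `*blob.mValueRef` still finds the value, which is exactly what a raw pointer cannot do.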

The other change to your structures is I've replaced the std::vector<T>'s with List<T>'s from that header. This is a variable-sized structure, which begins with an int32 "header" containing the length of the array, and then the header is followed by the actual array data.

Because List<T> is a variable-sized struct, you can't embed it as a member easily, because its size isn't known at compile time, so your member variable is only big enough to hold the header. The actual array data will overlap with the other members. To fix that, I use offsets to lists in the modified version of your code:

struct Foo_Broken
{
List<Bar> a;
List<Bar> b;//uh oh, a's data will overwrite b's header
};
struct Foo_Fixed
{
Offset<List<Bar>> a;//the list is somewhere else, not right here, no overflow
List<Bar> b;//this one doesn't *have* to use an offset.
//Just be aware that now Foo_Fixed is a variable-sized structure, because it will be followed by b's data!
};
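
A minimal sketch of what such a list type could look like (again an illustration, not the actual header; `Sum` is a made-up helper). It assumes the element type needs no more than 4-byte alignment, so that the array can start directly after the 32-bit count.

```cpp
#include <cstdint>

// Sketch of a variable-sized list: a 32-bit element count, immediately
// followed in memory by the elements themselves.
template <typename T>
struct List
{
    std::int32_t mCount;

    // The array data starts right after the header, i.e. at (this + 1).
    const T* Begin() const { return reinterpret_cast<const T*>(this + 1); }
    const T* End() const { return Begin() + mCount; }
    const T& operator[](std::int32_t index) const { return Begin()[index]; }
};

// Example consumer: works with the usual begin/end style of iteration.
std::int64_t Sum(const List<std::int32_t>& list)
{
    std::int64_t total = 0;
    for (const std::int32_t* it = list.Begin(); it != list.End(); ++it)
        total += *it;
    return total;
}
```

Viewing the four 32-bit values `{3, 10, 20, 30}` through a `const List<std::int32_t>*` gives a list of three items (10, 20, 30). Note that `sizeof(List<T>)` covers only the header, which is why a following member would overlap the array data.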

So that's the "deserialization"/runtime part covered, which is easy. The tricky part is the serialization routines to get data into this format.

Personally, I generate all my data from some C# tools, so I've made some extensions to C#'s BinaryWriter to help with this task... but the same ideas would work with any kind of "binary file writer" class that can write data of different sizes, can tell you its position in the file, and lets you jump back and forth in the file.

Say I wanted to write some data to match the C++ struct of:

struct Header
{
Offset<List<float>> data1;
Offset<List<u8>> data2;
};

I'd use some C# code like this:

List<float> data1 = ...
List<byte> data2 = ...
//^^ inputs
BinaryWriter writer = ...
//^^ output file

//first write some placeholder data for the header structure (two 32-bit offsets), but remember their positions

//now to write the List<float> data1 member
//first, rewind to headerData1pos, and overwrite it with the offset from there to here, then fast-forward back to here
writer.Write32( data1.Count );//write the list header - 32bit array size
foreach( var data in data1 )
writer.WriteFloat( data );//write the array contents

//now to write the List<u8> data2 member
//again, rewind to the header, write the actual offset value in it, then fast-forward back to the end of the file
writer.Write32( data2.Count );//write the list header - 32bit array size
foreach( var data in data2 )
writer.Write8( data );//write the array contents

The resulting file's bytes can then be loaded into RAM in your C++ app, cast to a Header*, and it will just work.
To save space on disc, your file system / file loader can implement some kind of compression on storing/loading files if you want, like ZLIB/GZ/LZMA/etc...

If the inputs to the above routine were 1 float with hex value 0x12345678, and 4 bytes with the values 1, 2, 3 and 4, the output file would look like this (when interpreted as groups of 32-bit integers expressed as hex):

0: 0x00000008 // data1 offset - jump forward 8 bytes to line #2
1: 0x0000000C // data2 offset - jump forward 12 bytes to line #4
2: 0x00000001 // data1 list header - 1 item in array
3: 0x12345678 // our float value
4: 0x00000004 // data2 list header - 4 items in array
5: 0x04030201 // our 4 byte values (in little endian order, the right hand byte is written/read before the left hand byte).
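
For anyone who would rather write the data from C++, the same placeholder-and-patch steps can be sketched against an in-memory byte buffer. This is not the C# tool from above; every name in it (`Append`, `PatchOffsetToEnd`, `WriteHeaderBlob`, `ReadI32`, `FloatFromBits`) is made up for the illustration, and values are written in the host's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Append the raw bytes of a trivially copyable value; returns where it landed.
template <typename T>
std::size_t Append(std::vector<std::uint8_t>& out, const T& value)
{
    const std::size_t pos = out.size();
    out.resize(pos + sizeof(T));
    std::memcpy(&out[pos], &value, sizeof(T));
    return pos;
}

// "Rewind" to the 32-bit placeholder at pos and overwrite it with the
// distance from the placeholder to the current end of the buffer.
void PatchOffsetToEnd(std::vector<std::uint8_t>& out, std::size_t pos)
{
    const std::int32_t relative = static_cast<std::int32_t>(out.size() - pos);
    std::memcpy(&out[pos], &relative, sizeof(relative));
}

// Writes the layout of: struct Header { Offset<List<float>> data1; Offset<List<u8>> data2; };
std::vector<std::uint8_t> WriteHeaderBlob(const std::vector<float>& data1,
                                          const std::vector<std::uint8_t>& data2)
{
    std::vector<std::uint8_t> out;

    // Placeholders for the two offsets; remember their positions.
    const std::size_t data1Pos = Append(out, std::int32_t(0));
    const std::size_t data2Pos = Append(out, std::int32_t(0));

    // data1: patch its offset, then write the count followed by the items.
    PatchOffsetToEnd(out, data1Pos);
    Append(out, static_cast<std::int32_t>(data1.size()));
    for (std::size_t i = 0; i < data1.size(); ++i)
        Append(out, data1[i]);

    // data2: same again.
    PatchOffsetToEnd(out, data2Pos);
    Append(out, static_cast<std::int32_t>(data2.size()));
    for (std::size_t i = 0; i < data2.size(); ++i)
        Append(out, data2[i]);

    return out;
}

// Helpers for inspecting the result without alignment or aliasing concerns.
std::int32_t ReadI32(const std::vector<std::uint8_t>& blob, std::size_t pos)
{
    std::int32_t value = 0;
    std::memcpy(&value, &blob[pos], sizeof(value));
    return value;
}

float FloatFromBits(std::uint32_t bits)
{
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Feeding it one float with the bit pattern 0x12345678 and the bytes 1, 2, 3, 4 produces a 24-byte buffer whose 32-bit words match the dump above. A real writer would probably also pad before each list so that the element type's alignment is respected.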

Question - Do adapters convert DXT textures to table/map when loading, or do they just store in DXT format and do the calculations per-render?

GPUs have dedicated hardware to perform DXT decompression on pixels at the last possible moment (right when the shader asks for a pixel). This greatly improves performance because it means that the texture is still compressed even in the texture cache, which means more data can be cached at once, and less bandwidth is required per texture-fetch.

also, if using DXT5 and mipmaps, these can make the image around 4x larger than if using DXT1 with no mipmaps, which is also an issue.

To use this as an excuse to expand on the above statement on DXT performance: mipmaps are also extremely important for performance (regardless of texture format), because they improve locality of texture fetches / reduce bandwidth of fetches (during minification scenarios). So you should use them in most cases. As mentioned by cr88192, with DXT textures, mipmaps are usually saved on disc, but with other formats like JPEG, they're often generated on-load.

Edited by Hodgman, 29 June 2013 - 12:01 AM.
