Optimize Binary Texture Loading

9 comments, last by GalacticCrew 5 years, 8 months ago

Hello,

I have a custom binary image file format. It is essentially a custom version of DDS made up of two important parts:


struct FileHeader
{
	dword m_signature;
	dword m_fileSize;
};

struct ImageFileInfo
{
	dword m_width;
	dword m_height;
	dword m_depth;
	dword m_mipCount;  // at least 1
	dword m_arraySize; // at least 1
	SurfaceFormat m_surfaceFormat;
	dword m_pitch;     // length of a scanline
	dword m_byteCount;
	byte* m_data;
};

It uses a custom BinaryIO class I wrote to read and write binary data. The majority of the data is unsigned int, which is a dword, so I'll only show the dword functions:


bool BinaryIO::WriteDWord(dword value)
{
	//Either condition alone is an error, so this must be || rather than &&
	if (!m_file || (m_mode == BINARY_FILEMODE::READ))
	{
		//log: file null or you tried to write to a read only file!
		return false;
	}

	byte bytes[4];
	bytes[0] = (value & 0xFF);
	bytes[1] = (value >> 8) & 0xFF;
	bytes[2] = (value >> 16) & 0xFF;
	bytes[3] = (value >> 24) & 0xFF;

	m_file.write((char*)bytes, sizeof(bytes));

	return true;
}

//-----------------------------------------------------------------------------
dword BinaryIO::ReadDword()
{
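	//Either condition alone is an error, so this must be || rather than &&
	if (!m_file || (m_mode == BINARY_FILEMODE::WRITE))
	{
		//log: file null or you tried to read from a write only file!
		return 0;
	}

	dword value;
	byte bytes[4];
	m_file.read((char*)bytes, sizeof(bytes));
	//Cast before shifting so the top byte never shifts into the sign bit of an int
	value = ((dword)bytes[0] | ((dword)bytes[1] << 8) | ((dword)bytes[2] << 16) | ((dword)bytes[3] << 24));
	return value;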
}

So, as you can imagine, you end up with a read loop like this:


byte* inBytesIterator = m_fileInfo.m_data;

for (unsigned int i = 0; i < m_fileInfo.m_byteCount; i++)
{
	*inBytesIterator = binaryIO.ReadByte();
	inBytesIterator++;
}

And finally, to pass it to D3D11, we have the following:


//Pass the Data to the GPU: Remembering Mips
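	//One entry per subresource: every mip of every array slice
	const unsigned int subresourceCount = m_arraySize * m_mipCount;
	D3D11_SUBRESOURCE_DATA* initData = new D3D11_SUBRESOURCE_DATA[subresourceCount];
	ZeroMemory(initData, sizeof(D3D11_SUBRESOURCE_DATA) * subresourceCount);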

	//Used as an iterator
	byte* source = texDesc.m_data;
	byte* endBytes = source + m_totalBytes;


	int index = 0;
	for (int i = 0; i < m_arraySize; i++)
	{
		int w = m_width;
		int h = m_height;

		for (int j = 0; j < m_mipCount; j++)
		{
			//Size of this mip level, recomputed as w and h shrink
			int numBytes = GetByteCount(w, h);

			if ((m_mipCount <= 1) || (w <= 16384 && h <= 16384))
			{
				initData[index].pSysMem = source;
				initData[index].SysMemPitch = GetPitch(w);
				initData[index].SysMemSlicePitch = numBytes;
				index++;
			}

			if (source + numBytes > endBytes)
			{
				LogGraphics("Too many Bytes!");
				return false;
			}

			//Advance to the start of the next mip's data
			source += numBytes;

			//Divide by 2
			w = w >> 1;
			h = h >> 1;

			if (w == 0) { w = 1; }
			if (h == 0) { h = 1; }
		}
	}

It seems rather slow, particularly for big textures. Is there any way I could optimize this? As the renderer grows to handle multiple textured objects, the loading times may become problematic. At the moment it takes around 2 seconds to load a 4096x4096 texture. You can see the output in the attached image.

Thanks.

 

TextureBall.png


Reading a file a byte or a DWORD at a time is incredibly inefficient. I would liken that approach to issuing a Draw Call per Triangle when rendering a model rather than exploiting the fact that there's an API available to draw multiple triangles in a single call.

When reading the pixel data, why are you not reading at least an entire mip's worth of bytes in a single call to ReadFile?
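
To make the suggestion above concrete, here is a minimal sketch of reading a whole block (for example one mip, or the entire pixel section) with a single stream read. It uses std::ifstream rather than ReadFile so it stays portable; ReadBlock and its parameters are hypothetical names, not part of the original BinaryIO class.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads byteCount bytes starting at offset with ONE
// stream read instead of one call per byte.
// Returns an empty vector if the file cannot be opened or is too short.
std::vector<std::uint8_t> ReadBlock(const std::string& path,
                                    std::uint64_t offset,
                                    std::size_t byteCount)
{
    std::ifstream file(path, std::ios::binary);
    if (!file) { return {}; }

    file.seekg(static_cast<std::streamoff>(offset), std::ios::beg);

    std::vector<std::uint8_t> buffer(byteCount);
    file.read(reinterpret_cast<char*>(buffer.data()),
              static_cast<std::streamsize>(byteCount));

    // gcount() reports how many bytes the last read actually delivered.
    if (static_cast<std::size_t>(file.gcount()) != byteCount) { return {}; }
    return buffer;
}
```

With the header already parsed, something like ReadBlock(path, headerSize, m_byteCount) would replace the whole per-byte loop, where headerSize stands for however many bytes precede the pixel data in your format.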

Adam Miles - Principal Software Development Engineer - Microsoft Xbox Advanced Technology Group

Read the entire file into RAM with a single asynchronous read operation. If you're writing Windows-specific code, look into "overlapped I/O" for how to get the OS to copy a file into your own buffers asynchronously. Then once that's done (some number of frames later, possibly), use the header information to fill in your array of D3D11_SUBRESOURCE_DATA objects with the right offsets into the pixel section of the file. Then call CreateTexture2D (again, possibly on a background thread).
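
The "fill in the offsets from the header" step might look roughly like the sketch below. It assumes an uncompressed format with a fixed number of bytes per pixel, tightly packed rows, and array slices stored one after another, each with its full mip chain. SubresourceLayout and BuildLayouts are hypothetical names; the struct just mirrors the three D3D11_SUBRESOURCE_DATA fields so the sketch compiles without d3d11.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for D3D11_SUBRESOURCE_DATA: an offset into the pixel
// section instead of a pointer, plus the two pitch values.
struct SubresourceLayout
{
    std::size_t   offset;     // byte offset of this mip within the pixel data
    std::uint32_t rowPitch;   // bytes per scanline (SysMemPitch)
    std::uint32_t slicePitch; // bytes in the whole mip (SysMemSlicePitch)
};

// Walks array slices, then mips, which matches D3D11's subresource order
// (index = mip + arraySlice * mipCount).
std::vector<SubresourceLayout> BuildLayouts(std::uint32_t width,
                                            std::uint32_t height,
                                            std::uint32_t mipCount,
                                            std::uint32_t arraySize,
                                            std::uint32_t bytesPerPixel)
{
    std::vector<SubresourceLayout> layouts;
    layouts.reserve(static_cast<std::size_t>(mipCount) * arraySize);

    std::size_t offset = 0;
    for (std::uint32_t slice = 0; slice < arraySize; ++slice)
    {
        std::uint32_t w = width;
        std::uint32_t h = height;
        for (std::uint32_t mip = 0; mip < mipCount; ++mip)
        {
            const std::uint32_t rowPitch   = w * bytesPerPixel;
            const std::uint32_t slicePitch = rowPitch * h;
            layouts.push_back({offset, rowPitch, slicePitch});

            offset += slicePitch;       // next mip starts right after this one
            w = (w > 1) ? (w / 2) : 1;  // halve, but never below 1
            h = (h > 1) ? (h / 2) : 1;
        }
    }
    return layouts;
}
```

Each pSysMem is then just the start of the loaded pixel data plus the layout's offset, so no per-mip copying is needed. Block-compressed formats would need a different pitch calculation.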

12 hours ago, ajmiles said:

Reading a file a byte or a DWORD at a time is incredibly inefficient. I would liken that approach to issuing a Draw Call per Triangle when rendering a model rather than exploiting the fact that there's an API available to draw multiple triangles in a single call.

When reading the pixel data, why are you not reading at least an entire mip's worth of bytes in a single call to ReadFile?

The original reason was in case I needed to implement an endian flip. If I ignore platforms that are not x86, then it is a simple case of doing:


//-----------------------------------------------------------------------------
bool BinaryIO::WriteBuffer(byte * data, unsigned int byteCount)
{
	//Fail if the file is bad OR it was not opened for writing
	if (!mFile || (mMode != BINARY_FILEMODE::WRITE_BINARY))
	{
		//log: file null or you tried to write to a read only file!
		return false;
	}

	mFile.write((char*)data, byteCount);

	return true;
}

//-----------------------------------------------------------------------------
bool BinaryIO::ReadBuffer(byte * data, unsigned int byteCount)
{
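	//Fail if the file is bad OR it was not opened for reading
	if (!mFile || (mMode != BINARY_FILEMODE::READ_BINARY))
	{
		//log: file null or you tried to read from a write only file!
		return false;
	}

	mFile.read((char*)data, byteCount);

	//read() sets failbit if fewer than byteCount bytes were available
	return !mFile.fail();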
}

Which loads pretty much instantaneously, so I'll just leave it at that for the time being.

1 hour ago, Jemme said:

The original reason was in case I needed to implement an endian flip

If you port it to an Xbox360, then you can endian-flip your texture files ahead of time, and store the pre-flipped pixel data on the disc :) 

With PowerPC now largely consigned to the scrap heap in terms of hardware that might run a game, I think you can be pretty sure that little endian will cover all your x86 and ARM needs. If in the future it doesn't for some reason, either produce two sets of assets that are pre-swapped into the right format or get the GPU to do the endian swapping for you on load rather than burden the CPU.

With D3D12 (and Vulkan?) you can even set up your ShaderResourceViews to have an arbitrary swizzle on sampling, so for 'free' you could always swap RGBA back around to ABGR (and vice versa) without ever touching the underlying data.

Adam Miles - Principal Software Development Engineer - Microsoft Xbox Advanced Technology Group

32 minutes ago, Hodgman said:

If you port it to an Xbox360, then you can endian-flip your texture files ahead of time, and store the pre-flipped pixel data on the disc :) 

I think having textures in an intermediate format which gets converted to the target format (big endian on PPC, little endian on x86) at asset compile time might work as well.

http://9tawan.net/en/

1 hour ago, Hodgman said:

If you port it to an Xbox360, then you can endian-flip your texture files ahead of time, and store the pre-flipped pixel data on the disc :) 

Yeah, that makes sense: loading can benefit from reading the entire file at once, while the asset compile step can do the pre-flip.
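
For what it's worth, the pre-flip itself is tiny if it ever becomes necessary. Below is a minimal sketch of what an offline converter could run over 32-bit data before writing the target file. ByteSwap32 and SwapWordsInPlace are hypothetical names, and which fields or pixel formats actually need swapping depends on the format (plain 8-bit-per-channel byte arrays generally do not).

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Reverses the byte order of one 32-bit value (e.g. a header dword).
inline std::uint32_t ByteSwap32(std::uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

// Reverses the byte order of every complete 32-bit word in the buffer.
// Any trailing bytes that do not form a full word are left untouched.
void SwapWordsInPlace(std::vector<std::uint8_t>& data)
{
    for (std::size_t i = 0; i + 4 <= data.size(); i += 4)
    {
        std::swap(data[i],     data[i + 3]);
        std::swap(data[i + 1], data[i + 2]);
    }
}
```

Since this runs in the asset compiler, the runtime loader can keep doing one bulk read with no per-value work.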

10 hours ago, mr_tawan said:

I think having textures in an intermediate format which gets converted to the target format (big endian on PPC, little endian on x86) at asset compile time might work as well.

Yeah, I recommend doing this for ALL of your game assets. Don't save any game-ready / shippable files manually; have them all be generated from intermediate files via some kind of build system.

If you want to change your binary model format, or how textures are packed/compressed, or convert text configuration files to some binary format, etc, then you just have to change your build process and recompile the assets. Without an asset compiler, you'd have to go and re-export every model in order to change the model file format, which could be a huge amount of manual labour. 

Hey Jemme,

I developed my game (engine) in C# using SharpDX as a wrapper for DirectX. I can send you the source code I use for loading DDS textures, if you want. It is based on several code snippets I found online, but I improved it. It loads even large textures extremely fast.

I also have to fully agree with Hodgman! I spent a lot of time developing an asset converter. I have XML files in my game projects listing all the assets I want to use, including paths to their files and connected files (textures, materials, ...), and the asset converter sucks everything in and poops my own custom file format out. For Galactic Crew it takes around 20 minutes to convert all 8 GB of source data into the 4 GB of data I use in my game, but once you have written such a tool, adding assets to your projects becomes a no-brainer.
