# 1024x1024 heightmap impossible??

This probably stems from my bad memory management, but oh well. My comp craps out if I try to build a 1024x1024 heightmap with LoD using geomipmapping. For each pixel on the map, I create two triangles and put them into an octree (all x, y, z values are floats). I notice, however, that doing the same with a 512x512 map took 266 megs of RAM! What does everyone do to get around memory issues? I don't see a way around this one, but I hear everyone saying how they implement 1024x1024 heightmaps. Right now I don't use meshes; each node in the octree has attached to it all of the polygons that are in the node, and all their vertices are stored independently of all the others.

##### Share on other sites
Not enough info. How about listing your data structures, what they actually store, and their relationships?

Equally, try a little practical experimentation. Disable creation of certain parts of the data structure and see how the memory behaves, and for different size heightmaps.

##### Share on other sites
Your octants are too small. Use 32x32 or something ... 2x2 is nothing!

##### Share on other sites
2 likely problems:

1.) You're subdividing your octree far too many generations. For optimal rendering, each leaf node should have a few thousand triangles.

2.) 1024x1024x6x32 = 201,326,592 bytes for storage. Debugger overhead adds another 10-15% to that. You're probably running out of memory.
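The arithmetic in point 2 can be checked with a few lines of C++. This is only a sketch: the name `unindexedBytes` is made up for illustration, and the 32-byte vertex is the size assumed in the post, not something measured from the original code.

```cpp
#include <cstdint>

// Bytes needed when every heightmap cell stores two independent triangles
// (6 vertices) and no vertex is shared with a neighbouring cell.
// The function name and parameters are hypothetical, for illustration only.
constexpr std::uint64_t unindexedBytes(std::uint64_t width,
                                       std::uint64_t depth,
                                       std::uint64_t bytesPerVertex) {
    return width * depth * 6 * bytesPerVertex;  // 2 triangles x 3 vertices
}

// Same figure as in the post: 1024 x 1024 x 6 x 32.
static_assert(unindexedBytes(1024, 1024, 32) == 201326592ULL,
              "matches the number quoted above");
```

Halving the map to 512x512 only divides this by four, which is why the per-vertex layout matters more than the map size.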

##### Share on other sites
OK, my octants generally have 512 triangles in them, but you're telling me that's too small, so I'll up them a bit. Here is the class that stores the polygons (each polygon is just a triangle for now, so you can see I have to invoke the Draw() function many times):

```cpp
#ifndef POLYGON_H_
#define POLYGON_H_

// CVector3, Cross() and Normalize() are the poster's own math helpers; the GL
// headers providing glMultiTexCoord2fARB are assumed to be included already.
class CPoly
{
public:
	CPoly() : Vertices(NULL), texles(NULL), details(NULL), m_nVertices(0) {}
	void Draw();	// was declared void* but never returned a value
	CVector3 *GetVerts() { return Vertices; }
	int GetVertCount() { return m_nVertices; }
	void SetVerts(CVector3 *verts);	// call SetVertCount() first
	void SetVertCount(int num) { m_nVertices = num; }
	CVector3 GetPoint(int j) { return Vertices[j]; }
	CVector3 Normal;
private:
	CVector3 *Vertices;
	CVector3 *texles;	// base texture coords (only x and y are used)
	CVector3 *details;	// detail texture coords (only x and y are used)
	int m_nVertices;
};

inline void CPoly::SetVerts(CVector3 *verts)
{
	Vertices = verts;
	texles = new CVector3[m_nVertices];
	details = new CVector3[m_nVertices];
	Normal = Cross(Vertices[0] - Vertices[1], Vertices[2] - Vertices[1]);
	Normal = Normalize(Normal);
	if (Normal.y < 0)	// keep the normal pointing up
		Normal = Normal * -1;
	for (int i = 0; i < m_nVertices; i++)
	{
		texles[i].x = verts[i].x / 16384.0f;
		texles[i].y = verts[i].z / 16384.0f;
		details[i].x = verts[i].x / 128;
		details[i].y = verts[i].z / 128;
	}
}

inline void CPoly::Draw()
{
	int vertices = m_nVertices;
	glBegin(GL_POLYGON);
	while (vertices--)
	{
		glMultiTexCoord2fARB(GL_TEXTURE1_ARB, details[vertices].x, details[vertices].y);
		glMultiTexCoord2fARB(GL_TEXTURE0_ARB, texles[vertices].x, texles[vertices].y);
		glVertex3f(Vertices[vertices].x, Vertices[vertices].y, Vertices[vertices].z);
	}
	glEnd();
}

#endif
```

Currently testing with larger leaves. (max subdivisions was 7, lol). EDIT: changed max triangles to 2048 and maxsubdivisions to 3 and usage dropped by 30 megs.

[edited by - uber_n00b on June 5, 2004 3:51:04 PM]

##### Share on other sites
its not impossible.
*checking*
17424kb vertex buffer
16kb index buffer
144kb terrain patches

so a 1024x1024 terrain uses less than 18000kb if you really keep it small. if its an nvidia that doesnt goes "screw you" and drops to 0.00001fps if vertex data isnt 4byte aligned like my ati you can get it down to 4100kb vertex buffer, making it about 4.5mb (plus 8mb heightmap if you keep it around). if you dont use vbo it also works but bigger terrains (over 2048) were again horribly slow.

for 4096:
vertex buffer: 69700kb
index buffer: 16kb
patches: 576kb
total: ~70600kb ati or ~17256kb nvidia

heightmap uses 2bytes as 1byte for height is too limited.

how?

first: its redundant and useless to store x,z coordinates for the whole terrain, instead each patch has an x,y,z offset (if you stick with floats or shorts for height you dont need y offset, if you use bytes you do).

so you get 1024/32= 32 patches in each direction, being 1024 patches total. one patch looks like this:

TerrainPatch* next; //general next pointer for lists
TerrainPatch* Adj[4]; //the 4 neighbours, necessary because they arent stored as simple 2d array
Buffer* VBO; //buffer isnt important (either va or vbo)
unsigned VBOffset; //offset into the vertex buffer, switching for every patch kills performance
float [numlevels]; //the screen space errors for each level
short x,y,z; //position offset, you can use bytes and multiply with 256 before adding it, but patches really arent the problem
unsigned char LOD;

60bytes per patch, 60kb total.

index buffers are constant, though the way i subdivide results in twice the number of detail levels (2x rather than 4x the triangles per step). with all the buffers you need it might be more than 16kb, but even 50kb wouldnt matter.

vertex buffers are created for every 2048x2048 piece of the terrain, so here we get away with one. but data is patchwise, ie you get redundant vertices at the edges. else shorts might not be sufficient as indices (and you make sure data isnt spread over 30mb). each patch has 33x33 vertices. we only need height (float on ati, byte should do on nvidia). 4356 or 1089byte per patch, 4356 or 1089kb total.

x and z are stored exactly once in a separate buffer, so thats 33x33x2(x4)= 2178 or 8712byte.
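A CPU-side sketch of that layout, for illustration only: one shared 33x33 table of x,z values, a per-patch array of heights, and a per-patch offset added at the end. In the post this addition happens in the vertex program; the type and function names here (`GridXZ`, `makeSharedXZ`, `worldPosition`) are invented, and heights are kept as floats for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kPatchVerts = 33;  // 33x33 vertices per patch, as in the post

struct Vec3 { float x, y, z; };
struct GridXZ { float x, z; };   // hypothetical name for the shared x,z pair

// Built once and reused by every patch.
std::vector<GridXZ> makeSharedXZ() {
    std::vector<GridXZ> xz;
    xz.reserve(static_cast<std::size_t>(kPatchVerts) * kPatchVerts);
    for (int row = 0; row < kPatchVerts; ++row)
        for (int col = 0; col < kPatchVerts; ++col)
            xz.push_back({static_cast<float>(col), static_cast<float>(row)});
    return xz;
}

// Rebuilds a full position from the shared x,z table, the patch's own
// height array and the patch's position offset.
Vec3 worldPosition(const std::vector<GridXZ>& xz,
                   const std::vector<float>& heights,
                   std::size_t i, Vec3 patchOffset) {
    assert(i < xz.size() && i < heights.size());
    return {xz[i].x + patchOffset.x,
            heights[i] + patchOffset.y,
            xz[i].z + patchOffset.z};
}
```

With two floats per entry the shared table comes to 33 x 33 x 2 x 4 = 8712 bytes, the larger of the two figures above.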

-set general terrain stuff (textures, some buffers)
-set patch offset and buffer if necessary
-set patch position offset
-render

the tree will be a little confusing. first: no pointers. hunting pointers down a tree is kind of wasteful and for terrains you know that ALL leaves will be filled. so the idea is storing all nodes in an array, starting with 0 for the root, 1,2,3,4 for the four nodes on level 1, 5,6,7,8 for the subnodes of node 1 and so on. the beauty is that +-1 will move back and forth between nodes on a level (sorted by parent) and bitshifts by 2 move you up and down one level (with this numbering the parent of node i is (i-1)>>2 and its first child is (i<<2)+1).

also, you can calculate the number of levels and tell at which index your leaves are starting. by storing the patch array in the same order you dont need any pointers from leaf/node to array, you just subtract the index of the first leaf and get the patch index.
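The index arithmetic for such a pointerless quadtree can be sketched as below. The helper names are invented for illustration; the formulas follow from the numbering described above (root 0, children of node 1 at 5..8), assuming every level is completely filled.

```cpp
#include <cassert>
#include <cstdint>

// Implicit quadtree stored level by level in one array:
// node 0 is the root, 1..4 are its children, 5..8 the children of node 1, ...

// Index of the first node on a level (level 0 is the root).
// Level l holds 4^l nodes, so all levels before it hold (4^l - 1) / 3 nodes.
inline std::uint32_t levelStart(std::uint32_t level) {
    assert(level < 16);  // keeps 4^level inside 32 bits
    return ((std::uint32_t{1} << (2 * level)) - 1) / 3;
}

inline std::uint32_t parentOf(std::uint32_t node) {
    assert(node > 0);  // the root has no parent
    return (node - 1) >> 2;
}

inline std::uint32_t firstChildOf(std::uint32_t node) {
    return (node << 2) + 1;  // the children are firstChild .. firstChild + 3
}

// Leaves and patches are stored in the same order, so the patch index is
// just the distance from the first leaf.
inline std::uint32_t patchIndexOf(std::uint32_t leaf, std::uint32_t firstLeaf) {
    assert(leaf >= firstLeaf);
    return leaf - firstLeaf;
}
```

For a 1024x1024 terrain with 32x32 patches there are 1024 = 4^5 leaves, so the tree has 6 levels, `levelStart(5)` = 341 is the first leaf and `levelStart(6)` = 1365 is the total node count. Note that leaves sorted by parent are not in row-major order, which is why the patch array has to use the same ordering.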

a node looks like this:
short Px, Py, Pz; //its position (center)
unsigned short Size; //only for bounding box (not necessary)
unsigned short Height; //ditto

so, unless you want to display the bounding box at any point thats 8byte per node. the radius is rounded up (spheres arent too precise anyway so this wont kill us).

the tree itself needs this:
SCNode* Data; //the array of nodes, obviously
unsigned numLevels; //number of levels (tree depth)
unsigned numNodes; //number of nodes (useful to have around)
unsigned firstLeaf; //index of the first leaf

so for culling we only use bounding spheres (not too precise, but dirt cheap; bounding box tests would only be used along the frustum edges/planes, and drawing a handful of patches more again wont kill us). to avoid a separate list of visible patches we have the generic next pointer in each patch. we keep one list pointer: for the first visible patch we set the list pointer to it and its next to 0. for every further visible patch we set that patch's next to the last visible patch and the list pointer to the new patch. for drawing we just follow the pointers until we reach next=0.
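A minimal sketch of the sphere test and the intrusive visible list described above. The `Plane`, `Sphere` and `Patch` types and the function names are assumptions made for this example (the real patch struct has more members), and the frustum planes are assumed to have normals pointing into the frustum.

```cpp
#include <array>
#include <cstddef>

struct Plane  { float nx, ny, nz, d; };      // inside when nx*x + ny*y + nz*z + d >= 0
struct Sphere { float x, y, z, radius; };

struct Patch {
    Sphere bounds;
    Patch* next = nullptr;  // the generic next pointer, reused for the visible list
};

// Conservative test: a sphere is rejected only if it lies completely
// behind at least one plane.
bool sphereVisible(const Sphere& s, const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return false;
    }
    return true;
}

// Pushes a patch onto the front of the visible list and returns the new head.
Patch* pushVisible(Patch* head, Patch& patch) {
    patch.next = head;
    return &patch;
}

// Walks the list the same way the draw loop would, until next is null.
std::size_t countVisible(const Patch* head) {
    std::size_t n = 0;
    for (const Patch* p = head; p != nullptr; p = p->next) ++n;
    return n;
}
```

Because the list lives inside the patches themselves, building it each frame needs no allocation; the head pointer just has to be reset to null before culling.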

texture coordinates are generated in the vertex shader. two sets for global textures and per patch textures using parameters 1/patchsize and 1/mapsize that are multiplied to the x,z position without and with added position offset. 1/patchsize can be adjusted if you want tiling (2/patchsize to tile twice etc.)
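The same computation written out on the CPU, as a rough sketch of what the vertex shader would do. The function names and the `tiling` parameter are illustrative assumptions; the post only states that the x,z position is multiplied by 1/patchsize (without the offset) and by 1/mapsize (with the offset added).

```cpp
struct UV { float u, v; };

// Per-patch texture coordinates: local x,z scaled by 1/patchSize.
// tiling = 2 corresponds to the "2/patchsize to tile twice" remark.
UV patchUV(float localX, float localZ, float patchSize, float tiling = 1.0f) {
    const float scale = tiling / patchSize;
    return {localX * scale, localZ * scale};
}

// Global texture coordinates: local x,z plus the patch offset,
// scaled by 1/mapSize.
UV globalUV(float localX, float localZ,
            float offsetX, float offsetZ, float mapSize) {
    const float scale = 1.0f / mapSize;
    return {(localX + offsetX) * scale, (localZ + offsetZ) * scale};
}
```

For example, the middle of a 32-unit patch maps to 0.5 in the per-patch texture no matter where the patch sits, while its global coordinate depends on the patch offset.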

so on an nvidia with 128mb memory (or better 256) you should be able to cram a 8192x8192 terrain completely into video memory and use "only" 69mb video memory plus 1-2mb system memory (plus 128mb heightmap, if you keep it around).

though obviously with a terrain this size you should start thinking about less brutish ways than keeping the whole terrain in memory all the time.

you could use 4 bytes per vertex and instead of just having y you would send x,y,z,1. then its 4byte aligned and you dont need an extra buffer with x,z coordinates (its still a ton of redundant data, but ati pretty much forces you to waste 3 bytes per vertex).

if you dont like vertex programs/shaders and now think that 256x256x256 terrains are lame: you can still translate to the right offset, but i noticed that doing so pulls down the performance a lot more than just setting parameters for the vertex program (obviously, as its performing pointless matrix multiplication and requires pushing/popping a lot.. you could try to fetch the matrix once, change the last column to the right position and use loadmatrix to directly set it.. maybe that would be speeding it up).

im wondering: will graphic cards ever add some kind of geometry compression in addition to texture compression? or maybe even general buffer compression, as it seems that everything you can store in video memory will soon just be a general buffer with data thats interpreted however you tell it to be.

[edited by - Trienco on June 6, 2004 3:06:07 AM]

##### Share on other sites
"if its an nvidia that doesnt goes "screw you" and drops to 0.00001fps if vertex data isnt 4byte aligned like my ati you can"
I'm also having a strange lag on nvidia cards at present. What exactly do you mean by the statement above? If you're using floats only, your data is always 4 byte aligned!

[edited by - Dtag on June 6, 2004 3:32:43 AM]

##### Share on other sites
quote:
Original post by Dtag
"if its an nvidia that doesnt goes "screw you" and drops to 0.00001fps if vertex data isnt 4byte aligned like my ati you can"
I'm also having a strange lag on nvidia cards at present. What exactly do you mean by the statement above? If you're using floats only, your data is always 4 byte aligned!

right, but using floats when even bytes are more than you need means wasting a ton of memory (here: over 50mb).

so while anything above a handful of vertex data that isnt 4byte aligned completely kills anything resembling performance on ati, i couldnt detect the slightest difference on nvidia (of course i cant promise the driver isnt converting it to float while filling the buffer). nvidia might also advise using native types for best performance, but in stark contrast to ati their hardware doesnt seem to have that much trouble handling non native data.

if newer nvidia cards show the same problem then i'll simply be frustrated, and use all that brilinear here and hidden brilinear there as an excuse to not buy any more graphics hardware until the first raytracing cards come out ,-)

[edited by - Trienco on June 6, 2004 7:08:33 AM]

##### Share on other sites
That method works, but it's kind of limiting in that:

1.) You only get 256 possible height levels. That might cause an issue for smooth interpolation if there's a quick change in height.

2.) You can't really do anything other than terrain and other "grid aligned" data with it; you'd need a completely separate rendering methodology for any other geometry you use.

Which, in short, means it'd be great for making pretty terrain demos (of which there are already millions of look-alikes), but kludgy for most anything else.

If you can always be 100% sure that your data is going to essentially be "2d", it's a fine way to go though, I suppose.

A better solution:

Using "full sized" 32-byte vertices (position x,y,z, normal x,y,z, and a single set of texture coordinates) takes 33,554,432 bytes of memory (32MB). A 2-byte index buffer would eat up an additional 12,582,912 bytes (12MB), assuming you have 6 indices for every vertex. If your data is static, you don't even need to store it locally, and it can reside in video memory. You can further reduce index buffer usage by using triangle strips.

Considerably less memory used than the original requirement, and still nearly as flexible as storing separate triangles for every piece of data. The hardware only does 1024x1024 transformations (instead of 1024x1024x6). This is why index buffers are your friend.
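As a hedged illustration of the indexed approach, the sketch below builds a triangle-list index buffer for one square grid of shared vertices. The function name is made up, the winding order is arbitrary and would need to match your culling setup, and the grid is limited to 256 vertices per side because a 16-bit index can address at most 65,536 vertices; a full 1024x1024 map would therefore be split into chunks (or use 32-bit indices).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds indices for a vertsPerSide x vertsPerSide grid stored row by row.
// Each cell becomes two triangles (6 indices) that reference shared vertices.
std::vector<std::uint16_t> buildGridIndices(int vertsPerSide) {
    assert(vertsPerSide >= 2 && vertsPerSide <= 256);  // 256 * 256 fits 16 bits
    const int cells = vertsPerSide - 1;
    std::vector<std::uint16_t> indices;
    indices.reserve(static_cast<std::size_t>(cells) * cells * 6);
    for (int z = 0; z < cells; ++z) {
        for (int x = 0; x < cells; ++x) {
            const auto i0 = static_cast<std::uint16_t>(z * vertsPerSide + x);
            const auto i1 = static_cast<std::uint16_t>(i0 + 1);
            const auto i2 = static_cast<std::uint16_t>(i0 + vertsPerSide);
            const auto i3 = static_cast<std::uint16_t>(i2 + 1);
            indices.insert(indices.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return indices;
}
```

A 33x33 grid yields 32 x 32 x 6 = 6144 indices (12,288 bytes) over only 1089 vertices, compared with 6144 separately stored vertices for the same triangles without an index buffer.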
