SNES Like 5 Layer 320x240 Graphics Library

18 comments, last by Promit 10 years ago

No "NVIDIA or AMD or Intel OpenGL or DirectX license to unlock its full potential" is needed, but it runs at 12 FPS if you don't have a license; if you have a license, it takes NO TIME to render at all.
GeForce 7600GT or higher needed. (Consult the source code for the required extensions.)
If you are an enthusiast and don't have money, use this library, connect it to a TV, and it's awesome.
It runs at 12 FPS on a GeForce 7600GT, and it's slow only for graphics, so it's a kind of fixed FPS: if you use CPU time for heavy work it won't slow down at all, like 12 FPS fixed forever unless you do some damned thing...
License: LGPL, CC0


It is basically a 2D library with a graphical environment similar to THE CONSOLE, the Super Nintendo Entertainment System. We have this awesome and fast CPU, so why would we want to use a crippled public OpenGL library for this? Because doing it on the GPU is not only COOL but also gives you totally free CPU time per frame. The GL draw call is done in another process using nvglvt.dll or similar things, and it's basically the same thing as just doing Sleep(1000/12) in another thread.

So, if you try to implement this kind of environment, like 32-bit 320x240 with 5 layers, using GPU blits only, it gives you about 3 FPS no matter how great your graphics card is.

The more objects you have, the slower it gets. That means you CAN'T make even Starcraft 1 if you only use what is given to you.

The point is, the CPU is FREE and the 12 FPS is just graphics, so do some crazy stuff like voxel 3D filling on the CPU, path tracing in the vertex shader and so on. It is about using only the uncrippled DP4 vertex shader processing line.


Think about the Raspberry Pi.


What exactly is this? I see no documentation at all.

Friend, it means nothing to you if you don't understand what I have written.

Try to run it on your old and slow computer. If it gives you 2000 FPS it means nothing to you, but if it gives you 300 FPS you have a problem.

Also, the code is so simple; just follow through Lesson45.cpp and vs2.txt.

The problem is that we DON'T understand what you wrote. Is this something you made? Is this just something you found and thought was cool and wanted to share?

What's the purpose? Is it supposed to be a SNES emulator running on the GPU? Etc?

Waramp. Before you insult a man, walk a mile in his shoes. That way, when you do insult him, you'll be a mile away, and you'll have his shoes.

I've taken a look at the source code and feel qualified to answer on OP's behalf.
To answer your questions: Yes. No, yes, and yes. Retro-style graphics library. No.

Basically what we have here is a software rendering library that draws quads for every emulated pixel. The library can support up to fifteen layers. It's fancy though because it uses vertex buffers and OpenGL.

If anybody wants a taste of what the code is like without downloading, you should look no farther than Lesson45.cpp@300. The code should speak for the quality of the library itself:
bool CMesh :: LoadHeightmap( char* szPath, float flHeightScale, float flResolution )
{
	m_nVertexCount = (int) wdt*hgt*4;
	m_pVertices = new CVec[m_nVertexCount];
	m_pLayer1 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 2)
		m_pLayer2 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 3)
		m_pLayer3 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 4)
		m_pLayer4 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 5)
		m_pLayer5 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 6)
		m_pLayer6 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 7)
		m_pLayer7 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 8)
		m_pLayer8 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 9)
		m_pLayer9 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 10)
		m_pLayer10 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 11)
		m_pLayer11 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 12)
		m_pLayer12 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 13)
		m_pLayer13 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 14)
		m_pLayer14 = new CColor[m_nVertexCount];
	if(NUM_LAYERS >= 15)
		m_pLayer15 = new CColor[m_nVertexCount];
	mLayers[0] = m_pLayer1;
	mLayers[1] = m_pLayer2;
	mLayers[2] = m_pLayer3;
	mLayers[3] = m_pLayer4;
	mLayers[4] = m_pLayer5;
	mLayers[5] = m_pLayer6;
	mLayers[6] = m_pLayer7;
	mLayers[7] = m_pLayer8;
	mLayers[8] = m_pLayer9;
	mLayers[9] = m_pLayer10;
	mLayers[10] = m_pLayer11;
	mLayers[11] = m_pLayer12;
	mLayers[12] = m_pLayer13;
	mLayers[13] = m_pLayer14;
	mLayers[14] = m_pLayer15;

	for(int y=0; y< hgt; ++y)
	{
		for(int x=0; x< wdt; ++x)
		{
			m_pVertices[y*wdt*4 + x*4 + 0].x = (float)x;
			m_pVertices[y*wdt*4 + x*4 + 0].y = (float)y;
			m_pVertices[y*wdt*4 + x*4 + 0].z = 0.0f;
			m_pVertices[y*wdt*4 + x*4 + 0].w = 1.0f;
			m_pVertices[y*wdt*4 + x*4 + 1].x = (float)x+1.0f;
			m_pVertices[y*wdt*4 + x*4 + 1].y = (float)y;
			m_pVertices[y*wdt*4 + x*4 + 1].z = 0.0f;
			m_pVertices[y*wdt*4 + x*4 + 1].w = 1.0f;
			m_pVertices[y*wdt*4 + x*4 + 2].x = (float)x+1.0f;
			m_pVertices[y*wdt*4 + x*4 + 2].y = (float)y+1.0f;
			m_pVertices[y*wdt*4 + x*4 + 2].z = 0.0f;
			m_pVertices[y*wdt*4 + x*4 + 2].w = 1.0f;
			m_pVertices[y*wdt*4 + x*4 + 3].x = (float)x;
			m_pVertices[y*wdt*4 + x*4 + 3].y = (float)y+1.0f;
			m_pVertices[y*wdt*4 + x*4 + 3].z = 0.0f;
			m_pVertices[y*wdt*4 + x*4 + 3].w = 1.0f;

			for(int i=0; i<NUM_LAYERS; ++i)
			{
				mLayers[i][y*wdt*4 + x*4 + 0].r = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 0].g = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 0].b = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 0].a = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 1].r = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 1].g = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 1].b = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 1].a = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 2].r = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 2].g = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 2].b = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 2].a = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 3].r = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 3].g = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 3].b = 0.0f;
				mLayers[i][y*wdt*4 + x*4 + 3].a = 0.0f;
			}
		}
	}

	return true;
}
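
For comparison, the same per-pixel quad layout can be built with a loop over a container instead of fifteen numbered members. This is only a sketch, not the library's code: `Vec4`, `Color4`, `PixelMesh` and `MakePixelMesh` are hypothetical stand-ins for `CVec`, `CColor` and `CMesh`, assumed to hold four floats per element, and the layer count is passed as a parameter rather than read from `NUM_LAYERS`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the library's CVec and CColor (assumed: four floats each).
struct Vec4   { float x = 0.0f, y = 0.0f, z = 0.0f, w = 1.0f; };
struct Color4 { float r = 0.0f, g = 0.0f, b = 0.0f, a = 0.0f; };

struct PixelMesh {
    int width = 0;
    int height = 0;
    std::vector<Vec4> vertices;               // four corners per emulated pixel
    std::vector<std::vector<Color4>> layers;  // one color array per layer
};

// Builds one unit quad per pixel and numLayers transparent-black color arrays.
PixelMesh MakePixelMesh(int width, int height, int numLayers) {
    PixelMesh mesh;
    mesh.width = width;
    mesh.height = height;

    const std::size_t vertexCount =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height) * 4;
    mesh.vertices.resize(vertexCount);
    // A default-constructed Color4 is already (0, 0, 0, 0), so no fill loop is needed.
    mesh.layers.assign(static_cast<std::size_t>(numLayers),
                       std::vector<Color4>(vertexCount));

    // Corner offsets in the same order as the excerpt: (0,0), (1,0), (1,1), (0,1).
    static const float kCornerX[4] = {0.0f, 1.0f, 1.0f, 0.0f};
    static const float kCornerY[4] = {0.0f, 0.0f, 1.0f, 1.0f};

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t base =
                (static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                 static_cast<std::size_t>(x)) * 4;
            for (int c = 0; c < 4; ++c) {
                Vec4& v = mesh.vertices[base + static_cast<std::size_t>(c)];
                v.x = static_cast<float>(x) + kCornerX[c];
                v.y = static_cast<float>(y) + kCornerY[c];
                v.z = 0.0f;
                v.w = 1.0f;
            }
        }
    }
    return mesh;
}
```

Calling `MakePixelMesh(320, 240, 5)` would give 307,200 vertices and five color arrays of the same length, which is the layout the excerpt produces by hand.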

DirectX/NVIDIA license needed but it's 12FPS if you don't have license...

What? No! It's 12 FPS because you're loading image resources every frame and you're blitting those images in software. Both of those things are terribly slow! Lesson45.cpp@188:

// Render
int width, height;
unsigned char *ht_map = LoadImg(".\\imgs\\water.png", &width, &height);
BlitTest(g_pMesh->mLayers, 0, width, height, 0, height, ht_map);
BlitTest(g_pMesh->mLayers, 0, width, height, 0, 0, ht_map);

for(int y=0; y< hgt/height; ++y)
	for(int x=0; x< wdt/width; ++x)
		BlitTest(g_pMesh->mLayers, 0, width, height, x*width, y*height, ht_map);

unsigned char *penguin = LoadImg(".\\imgs\\penguin.png", &width, &height);

//for(int i=1; i< NUM_LAYERS; ++i)
	//BlitTest(g_pMesh->mLayers, i, width, height, 315+i*2, 235+i*2, penguin);
for(int i=1; i< NUM_LAYERS; ++i)
	BlitTest(g_pMesh->mLayers, i, width, height, -30+i*2, -5+i*2, penguin);
FreeImg( ht_map );
FreeImg( penguin );
/*char temp[100];
sprintf(temp, "%d %d", width, height);
MessageBox(NULL, temp, temp, MB_OK);
BlitTest(g_pMesh->mLayers, 0, width, height, 0, 0, ht_map);
BlitTest(g_pMesh->mLayers, 0, width, height, 0, height, ht_map);*/

glBindBufferARB(GL_ARRAY_BUFFER, g_pMesh->m_nVBOVertices);
glBufferDataARB(GL_ARRAY_BUFFER, sizeof(CVec)*g_pMesh->m_nVertexCount, (GLvoid*)g_pMesh->m_pVertices, GL_STATIC_DRAW);
glVertexAttribPointerARB(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
for(int i=0;i<NUM_LAYERS;++i)
{
	glBindBufferARB(GL_ARRAY_BUFFER, g_pMesh->m_nColors[i]);
	glBufferDataARB(GL_ARRAY_BUFFER, sizeof(CVec)*g_pMesh->m_nVertexCount, (GLvoid*)g_pMesh->mLayers[i], GL_STATIC_DRAW);
	glVertexAttribPointerARB(i+1, 4, GL_FLOAT, GL_FALSE, 0, 0);
}
glDrawArrays( GL_QUADS, 0, g_pMesh->m_nVertexCount );	// Draw All Of The Triangles At Once
SwapBuffers (g_window->hDC);					// Swap Buffers (Double Buffering)
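
To put a rough number on that render loop: every `glBufferDataARB` call in it re-sends a whole array each frame. The sketch below only does the arithmetic. It assumes `CVec` and `CColor` are four 32-bit floats (16 bytes) and that `wdt`, `hgt` and `NUM_LAYERS` are 320, 240 and 5 as in the thread title; `VertexCount` and `BytesUploadedPerFrame` are hypothetical helpers, not part of the library.

```cpp
#include <cstdint>

// Assumption: one vertex attribute is four 32-bit floats, like a vec4.
constexpr std::uint64_t kBytesPerAttribute = 4 * sizeof(float);

// Vertices needed when every emulated pixel is drawn as its own quad.
constexpr std::uint64_t VertexCount(std::uint64_t width, std::uint64_t height) {
    return width * height * 4;
}

// Bytes re-sent per frame: one position array plus one color array per layer.
constexpr std::uint64_t BytesUploadedPerFrame(std::uint64_t width,
                                              std::uint64_t height,
                                              std::uint64_t layers) {
    return VertexCount(width, height) * kBytesPerAttribute * (1 + layers);
}

static_assert(VertexCount(320, 240) == 307200,
              "320x240 pixels need 307,200 vertices");
static_assert(BytesUploadedPerFrame(320, 240, 5) == 29491200,
              "roughly 28 MiB of vertex data per frame");
static_assert(BytesUploadedPerFrame(160, 120, 5) * 4 ==
                  BytesUploadedPerFrame(320, 240, 5),
              "halving both dimensions quarters the upload");
```

Under those assumptions the loop pushes roughly 28 MiB of vertex data through the driver every frame, on top of decoding two PNGs and blitting them on the CPU.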

OK friends, you guys either know about licensing problems or don't know what you are talking about.

We OpenGL programmers without proper jobs have always tried to make some 3D or 2D game with existing hardware.

I've had access to properly licensed NVIDIA and AMD cards, and even shitty code that loads a 1 MB 3D mesh ran at 2000 FPS, and they do it all on the CPU.

I asked, why is it so fast here and so slow at home?

My previous boss said, yeah... you need to PAY NVIDIA and AMD to get full speed on OpenGL or DirectX or the Assembly SDK.

What I'm saying here is, there's no need to pay if you are interested in just playing along... I mean, did you guys ever even make a finished game with existing hardware without licensing? I did.

It runs fast, but it should run at 500 FPS at least if I had licensed it correctly.

Play with my source code and you guys will get it. Change int wdt=320 to 160 and int hgt=240 to 120 and you will see the FPS go up to 300.

It runs at 12 FPS not because I load images every frame but because the NVIDIA driver cripples and blocks it.

The comments above got it right that it uses the vertex shader. Why did I do it? The vertex shader is the only thing that is not crippled on unlicensed hardware.

The comments above got it wrong: loading a PNG every frame does NOT slow it down. Updating the VBO every frame does not slow it down either.

Well, you guys just wanted me to explain it all, but really, if someone doesn't get it, doesn't like it, and doesn't understand WHAT I WROTE IN THE FIRST PLACE, then whether you like it or not, it's NOT FOR YOU.

I'm confused. Is this a trolling attempt?

Hello to all my stalkers.

I think the most accurate thing here is the OP's name: "walking time bomb" indeed...!

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

OK here is the WHOLE explanation since you guys want to know what it is all about.

MFU and other functionality in the vertex shader is all slow and crippled even if you have the correct license.

Look at my vs2.txt code:

in  vec4 in_Pixel;
in  vec4 in_Layer1;
in  vec4 in_Layer2;
in  vec4 in_Layer3;
in  vec4 in_Layer4;
in  vec4 in_Layer5;
in  vec4 in_Layer6;
in  vec4 in_Layer7;
varying vec3 color;
varying float test;

void main()
{
	gl_Position = gl_ModelViewProjectionMatrix * in_Pixel; // Vertex Array Object's First Vertex Buffer Object it Simulates Pixel. Remember. This is Vertex Shader. Don't transform it.

	vec4 curLayer = in_Layer2; // VAO's Second VBO. Color Buffer. Thus, First Layer of Bitmap.
	vec4 colR = vec4(in_Layer1.r, 0.0, 0.0, 0.0); // Read RGB pixel from current Bitmap.
	vec4 colG = vec4(in_Layer1.g, 0.0, 0.0, 0.0);
	vec4 colB = vec4(in_Layer1.b, 0.0, 0.0, 0.0);

	vec4 col2R = vec4(curLayer.r, 0.0, 0.0, 0.0);
	vec4 col2G = vec4(curLayer.g, 0.0, 0.0, 0.0);
	vec4 col2B = vec4(curLayer.b, 0.0, 0.0, 0.0);
	vec4 col2W = vec4(curLayer.a, 0.0, 0.0, 0.0);
	vec4 oneMinusLayer2Alpha2 = vec4(1.0, curLayer.a, 0.0, 0.0); // For blending

	vec4 oneMinusLayer2Alpha1 = vec4(1.0, -1.0, 0.0, 0.0); // For blending
	float colAf = dot(oneMinusLayer2Alpha1, oneMinusLayer2Alpha2);
	// dot oneMinusLayer2Alpha2 oneMinusLayer2Alpha1 ?
	// 1.0*1.0 + curLayer.a*-1 + 0.0*0.0 + 0.0*0.0 == 1.0-curLayerAlpha
	float colRf = dot(colR, vec4(colAf)); //RGB * Alpha
	float colGf = dot(colG, vec4(colAf));
	float colBf = dot(colB, vec4(colAf));
	float col2Rf = dot(col2R, col2W); // RGB * 1.0 not used here but in next layer it will be used as current layer alpha
	float col2Gf = dot(col2G, col2W);
	float col2Bf = dot(col2B, col2W);
	float colFinalRf = dot(vec4(colRf, 1.0, 0.0, 0.0), vec4(1.0, col2Rf, 0.0, 0.0));
	float colFinalGf = dot(vec4(colGf, 1.0, 0.0, 0.0), vec4(1.0, col2Gf, 0.0, 0.0));
	float colFinalBf = dot(vec4(colBf, 1.0, 0.0, 0.0), vec4(1.0, col2Bf, 0.0, 0.0));

	color = vec3(colFinalRf,colFinalGf,colFinalBf);
}
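
Written out, the chain of `dot` calls reduces to an ordinary alpha blend of layer 2 over layer 1, per channel: final = layer1 * (1 - alpha2) + layer2 * alpha2. Here is a CPU-side sketch of the same arithmetic; `Rgba` and `BlendOver` are hypothetical names that do not appear in vs2.txt.

```cpp
// Hypothetical color type: four floats in the 0..1 range.
struct Rgba { float r, g, b, a; };

// Blends `top` over `bottom` using top's alpha, which is what the shader's
// dot products work out to for each color channel.
Rgba BlendOver(const Rgba& bottom, const Rgba& top) {
    const float keep = 1.0f - top.a;      // colAf in the shader
    return Rgba{
        bottom.r * keep + top.r * top.a,  // colRf + col2Rf
        bottom.g * keep + top.g * top.a,  // colGf + col2Gf
        bottom.b * keep + top.b * top.a,  // colBf + col2Bf
        1.0f                              // assumption: the shader only outputs RGB
    };
}
```

Folding this function over the layers from bottom to top would composite all of them; the shader as posted only blends `in_Layer1` and `in_Layer2`.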

It is all about using the SO ABUNDANT DP4 unit line in the GPU processor, with 1 GB on average PREFETCHABLE from the GPU.

First of all, let's clear up a misconception: you don't pay NVIDIA, AMD, Intel or anyone else for using the OpenGL or DirectX APIs. They are free, and if someone has told you that you have to pay to get better performance then you have been misinformed (lied to).

You can write code, compile it, run it, give those programs away, etc., all without paying anyone, and you will enjoy access to the same performance as everyone else, including big companies.

The reason your program runs slowly is the way that it is written. It is not because of NVIDIA or AMD, just because of your code.

fastCall22 has given you some of the reasons why it is slow and they should be easy enough to understand:

  • Loading the image every frame - ask yourself this question: which is faster, loading something, copying, then deleting it and repeating that every frame, or loading it once, using it until the program ends and only then releasing it? The answer is obviously that it is always faster to do less work so loading/deleting it only once is always going to be faster.
  • Software blitting - this is very slow and unnecessary. You should look into drawing textured quads (2 triangles): load your images (once only), turn them into textures, apply those textures to your quads, and render them using OpenGL instead of blitting to the layers in software.
  • Updating your buffers every frame - Your data isn't changing, so why are you updating them? Create them, fill them with data, and then use them every frame without updating them anymore. This goes back to the first problem of doing more work than you need or want to do each frame. This one comes with a second problem though: you've told OpenGL that you won't update them very often by using the STATIC_DRAW flag... but then you are updating them all of the time. That means that OpenGL has to do a lot of extra work, which means that _you_ are slowing it down.
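
A minimal sketch of the first and third points, with the image decoding and the GPU upload replaced by counters so that it runs without OpenGL. `ImageCache`, `LayerBuffer`, `RunStaticScene` and their members are hypothetical names for illustration, not the library's API.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Decodes each image at most once and hands out the cached pixels afterwards.
class ImageCache {
public:
    const std::vector<unsigned char>& Get(const std::string& path) {
        auto it = images_.find(path);
        if (it == images_.end()) {
            ++decodeCount_;  // stands in for an expensive LoadImg-style decode
            it = images_.emplace(path, std::vector<unsigned char>(4, 0)).first;
        }
        return it->second;
    }
    int DecodeCount() const { return decodeCount_; }

private:
    std::map<std::string, std::vector<unsigned char>> images_;
    int decodeCount_ = 0;
};

// CPU-side copy of one layer that is re-uploaded only after it has changed.
class LayerBuffer {
public:
    explicit LayerBuffer(std::size_t size) : data_(size, 0.0f) {}

    void Set(std::size_t index, float value) {
        if (data_[index] != value) {
            data_[index] = value;
            dirty_ = true;
        }
    }

    // Call once per frame; a real version would issue the buffer update here.
    void UploadIfDirty() {
        if (!dirty_) return;
        ++uploadCount_;
        dirty_ = false;
    }
    int UploadCount() const { return uploadCount_; }

private:
    std::vector<float> data_;
    bool dirty_ = true;  // the initial contents still need one upload
    int uploadCount_ = 0;
};

// Simulates `frames` frames of an unchanging scene and returns the upload count.
int RunStaticScene(ImageCache& cache, LayerBuffer& layer, int frames) {
    for (int i = 0; i < frames; ++i) {
        cache.Get("water.png");
        cache.Get("penguin.png");
        layer.UploadIfDirty();
    }
    return layer.UploadCount();
}
```

Over 100 frames of a static scene this performs two decodes and one upload, where the posted render loop would perform 200 decodes and 100 uploads per buffer. For data that really does change every frame, a usage hint such as `GL_DYNAMIC_DRAW` or `GL_STREAM_DRAW` describes the access pattern better than `GL_STATIC_DRAW`.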

Everyone has to start learning somewhere, and I started with the NeHe tutorials a long time ago as well, but part of learning involves listening to what other people are trying to tell you.


"Ars longa, vita brevis, occasio praeceps, experimentum periculosum, iudicium difficile"

"Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgement difficult."

