Fragment shader variables count (iPhone4)


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

5 replies to this topic

#1 Martin Perry   Members   -  Reputation: 1378


Posted 07 May 2014 - 05:51 AM

I have a high number of variables in my fragment shader: 30 uniforms (mostly vec4) and about 20 local variables (vec3, float, vec4). It runs just fine on an iPhone5S, but I have a serious problem on an iPhone4: GPU time is 1s / frame, and 98% of that time is shader run time.

According to Apple's documentation:

 

OpenGL ES limits the number of each variable type you can use in a vertex or fragment shader. The OpenGL ES specification doesn’t require implementations to provide a software fallback when these limits are exceeded; instead, the shader simply fails to compile or link. When developing your app you must ensure that no errors occur during shader compilation, as shown in Listing 10-1.

 

 

 

But I don't quite understand this. Do they provide a SW fallback or not? I get no errors during compilation or linking of the shader, and yet performance is poor. I have commented almost everything out and left just 2 texture lookups and the directional light computation. I changed the other functions to return just vec4(0,0,0,0).


Edited by Martin Perry, 07 May 2014 - 05:52 AM.



#2 L. Spiro   Crossbones+   -  Reputation: 16455


Posted 07 May 2014 - 08:44 AM

But I don't quite understand this. Do they provide a SW fallback or not? I get no errors during compilation or linking of the shader, and yet performance is poor. I have commented almost everything out and left just 2 texture lookups and the directional light computation. I changed the other functions to return just vec4(0,0,0,0).

There is no software fallback on any iOS device, and your case essentially proves this isn’t the issue anyway.
The compiler will strip unused uniforms entirely; only uniforms that contribute to the output remain in any shaders. Even if you do access them, if they still don’t contribute to the output they are eliminated (as are local variables).

If you have removed enough to return hard-coded black mixed with a few texture reads, you have undoubtedly eliminated enough uniforms and locals that you are nowhere near any device limits.
Your bottleneck should be elsewhere.


You should post the shaders before and after you reduce their complexity, though, to be sure this is the case.


L. Spiro

Edited by L. Spiro, 07 May 2014 - 06:48 PM.


#3 Martin Perry   Members   -  Reputation: 1378


Posted 07 May 2014 - 09:55 AM

I have about 30 shaders with high complexity (30+ uniforms in each of them). My render loop currently uses only one shader, which I reduced to the bare minimum (you are right, uniforms are stripped away during build).

 

My current shader looks like this:


vec4 CalcAmbientLight(Material mat, float fIntensity)
{
    return mat.vAmbient * en_vAmbLightColor * fIntensity;
}

uniform sampler2D normal_buffer;
uniform sampler2D bg_buffer;

uniform vec2 canvasSize;

uniform SpotLight en_spotLight[6];

varying vec2 vTexCoord;

void main()
{
    vec4 normalMap = texture2D(normal_buffer, vTexCoord); // normal_buffer is RGBA (GL_RGBA)
    vec3 bgColor = texture2D(bg_buffer, vTexCoord).rgb;   // bg_buffer is RGB (GL_RGB)

    vec3 vNormal = normalize((2.0 * normalMap.rbg) - 1.0);

    vec3 posWS = vec3(vTexCoord.x * canvasSize.x, normalMap.a, vTexCoord.y * canvasSize.y);

    //----------------------------------------------------------------------------------
    Material mat;
    mat.vAmbient = vec4(bgColor, 1.0);
    mat.vDiffuse = vec4(bgColor, 1.0);
    mat.vSpecular = vec4(1.0);
    mat.fShiness = 30.0;

    vec4 vPhong = CalcAmbientLight(mat, 0.2);

    vPhong.a = 1.0;
    gl_FragColor = vPhong;
}

The vertex shader is a simple pass-through. I am rendering a fullscreen quad at the resolution of the iPhone4 screen.

 

With this piece of "nothing" code, I get 45fps (I know it is debug mode, but it is still way too low compared to 60fps). The same code runs at 60fps on an iPhone5S. CPU is at 4.4ms / frame. GPU takes 20.1ms.
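To put those timings in context, here is a minimal sketch in C of the frame-budget arithmetic, using the 4.4ms and 20.1ms figures from the post. The helper names `frame_budget_ms` and `max_fps_for_stage` are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Time available per frame, in milliseconds, at a given refresh rate. */
static double frame_budget_ms(double refresh_hz) {
    assert(refresh_hz > 0.0);
    return 1000.0 / refresh_hz;
}

/* Upper bound on frame rate if a single stage (CPU or GPU)
 * takes stage_ms per frame and nothing else limits throughput. */
static double max_fps_for_stage(double stage_ms) {
    assert(stage_ms > 0.0);
    return 1000.0 / stage_ms;
}

/* Returns 1 if the stage fits inside the budget for the refresh rate. */
static int fits_budget(double stage_ms, double refresh_hz) {
    return stage_ms <= frame_budget_ms(refresh_hz);
}
```

With these helpers, `fits_budget(4.4, 60.0)` holds for the CPU while `fits_budget(20.1, 60.0)` does not, and `max_fps_for_stage(20.1)` gives a ceiling just under 50fps. That suggests the frame is GPU-bound, which is consistent with the replies that follow.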

 

Here is also my sequence of calls from the Xcode frame trace:

//use render to texture for the whole scene
#0 glBindFramebuffer(GL_FRAMEBUFFER, 7)
#1 glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 6, 0)
#2 GL_FRAMEBUFFER_COMPLETE <- glCheckFramebufferStatus(GL_FRAMEBUFFER)
#3 glBindRenderbuffer(GL_RENDERBUFFER, 8)
#4 glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 8)
#5 GL_FRAMEBUFFER_COMPLETE <- glCheckFramebufferStatus(GL_FRAMEBUFFER)
#6 glUseProgram(17)
#7 glBindTexture(GL_TEXTURE_2D, 3)
#8 glBindVertexArray(1)
#9 glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 44)
#10 glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, nullptr)
#11 glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0)
#12 glBindVertexArray(0)
//----------------------------- End of main rendering --------
//now render texture to screen
#13 glBindFramebuffer(GL_FRAMEBUFFER, 1)
#14 glBindRenderbuffer(GL_RENDERBUFFER, 2)
#15 glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, 2)
#16 GL_FRAMEBUFFER_COMPLETE <- glCheckFramebufferStatus(GL_FRAMEBUFFER)
#17 glBindRenderbuffer(GL_RENDERBUFFER, 1)
#18 glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 1)
#19 GL_FRAMEBUFFER_COMPLETE <- glCheckFramebufferStatus(GL_FRAMEBUFFER)
#20 glUseProgram(25)
#21 glBindTexture(GL_TEXTURE_2D, 6)
#22 glBindVertexArray(2)
#23 glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 4)
#24 glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, nullptr)
#25 glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0)
#26 glBindVertexArray(0)
#27 glDiscardFramebufferEXT(GL_FRAMEBUFFER, 1, {GL_DEPTH_ATTACHMENT})
#28 glBindRenderbuffer(GL_RENDERBUFFER, 2)
#29 ["Context 1" presentRenderbuffer:GL_RENDERBUFFER]



Edited by Martin Perry, 07 May 2014 - 10:43 AM.


#4 C0lumbo   Crossbones+   -  Reputation: 2723


Posted 07 May 2014 - 12:46 PM

45 fps when rendering at full resolution on an iPhone4 might well be normal, even with a shader that simple. An iPhone4 GPU is only slightly superior to a 3GS GPU, but it has 4 times the number of pixels to fill, so it really struggles with fill rate.
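The fill-rate point can be made concrete with a small sketch in C. It assumes the commonly quoted screen sizes (480x320 for the 3GS, 960x640 for the iPhone 4) and counts two fullscreen passes per frame, as in the frame trace above (one draw into the texture, one draw to the screen). The function names are hypothetical.

```c
#include <assert.h>

/* How many times more pixels screen A has than screen B. */
static double pixel_ratio(long wa, long ha, long wb, long hb) {
    assert(wa > 0 && ha > 0 && wb > 0 && hb > 0);
    return (double)(wa * ha) / (double)(wb * hb);
}

/* Fragments shaded per second when `passes` fullscreen quads are
 * drawn every frame at the given frame rate (no overdraw assumed). */
static long fragments_per_second(long width, long height, long passes, long fps) {
    assert(width > 0 && height > 0 && passes > 0 && fps > 0);
    return width * height * passes * fps;
}
```

Under these assumptions, `pixel_ratio(960, 640, 480, 320)` is 4, and `fragments_per_second(960, 640, 2, 60)` comes to 73,728,000 fragments per second before any texture fetch or lighting cost is counted.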

 

Remember that vsync is always on, so your reported 45fps might in reality be 59fps getting rounded down due to vsync.
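One simplified way to see how vsync can hide the true frame cost is sketched below in C. It assumes a strict model in which every frame waits for the next vertical blank, so a frame that misses one refresh interval is shown for two. Real drivers may buffer frames differently, and an averaged fps counter can land between the steps, so treat this as an illustration only. The function names are hypothetical.

```c
#include <assert.h>

/* Number of whole refresh intervals a frame occupies when presentation
 * must wait for the next vertical blank (simplified model). */
static int vsync_intervals(double frame_ms, double refresh_hz) {
    assert(frame_ms > 0.0 && refresh_hz > 0.0);
    double interval_ms = 1000.0 / refresh_hz;
    int n = (int)(frame_ms / interval_ms);
    if ((double)n * interval_ms < frame_ms) {
        n += 1; /* round up to the next vertical blank */
    }
    return n;
}

/* Frame rate actually shown on screen under that model. */
static double displayed_fps(double frame_ms, double refresh_hz) {
    return refresh_hz / (double)vsync_intervals(frame_ms, refresh_hz);
}
```

In this model a frame that takes 16.9ms (about 59fps unsynchronised) occupies two intervals and is displayed at 30fps, so a counter averaging a mix of one-interval and two-interval frames could plausibly report a value such as 45fps.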

 

Also, adding logical buffer discard and clear commands can speed things up. At the start of your frame, the first thing OpenGL does is copy the previous frame's framebuffer contents into the new framebuffer, because the driver has no way of telling that you don't want them.

 

edit: Oops, just spotted the discard command. I think a glClear would probably still help, though.


Edited by C0lumbo, 07 May 2014 - 12:57 PM.


#5 kalle_h   Members   -  Reputation: 1799


Posted 07 May 2014 - 02:33 PM

 

But I don't quite understand this. Do they provide a SW fallback or not? I get no errors during compilation or linking of the shader, and yet performance is poor. I have commented almost everything out and left just 2 texture lookups and the directional light computation. I changed the other functions to return just vec4(0,0,0,0).

I don’t believe they have a software fallback, and your case essentially proves this isn’t the issue anyway (at least at face value for now).
The compiler will strip unused uniforms entirely; only uniforms that contribute to the output remain in any shaders. Even if you do access them, if they still don’t contribute to the output they are eliminated (as are local variables).

If you have removed enough to return hard-coded black mixed with a few texture reads, you have undoubtedly eliminated enough uniforms and locals that you are nowhere near any device limits.
Your bottleneck should be elsewhere.


You should post the shaders before and after you reduce their complexity, though, to be sure this is the case.


L. Spiro

 

 

There definitely is a software fallback. I once hit it when I tried to support too many lights in a vertex shader and performance dropped to >400ms per frame. Then I removed one light and frame time dropped to 10ms. The profiler clearly indicated that most of the time was spent in the software shader pipeline.



#6 L. Spiro   Crossbones+   -  Reputation: 16455


Posted 07 May 2014 - 06:29 PM

With this piece of "nothing" code, I get 45fps (I know it is debug mode, but it is still way too low compared to 60fps).

That is correct for an iPhone 4 whose OpenGL ES view has a contentScaleFactor of 2.0.
At work we allow game teams to set a preferred contentScaleFactor on supported devices, except for iPhone 4, which we force to 1.0.
In other words, your numbers are perfectly reasonable and you simply need to disable retina mode.
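As a rough illustration of why dropping contentScaleFactor to 1.0 helps, the sketch below (in C) computes the drawable size from a view size in points and estimates the new GPU time. It assumes a 320x480 point fullscreen view and that GPU cost scales linearly with pixel count, which only holds when the frame is fill-bound. `PixelSize`, `drawable_size`, and `scaled_gpu_ms` are hypothetical names, not part of any Apple API.

```c
#include <assert.h>

typedef struct {
    int width;
    int height;
} PixelSize;

/* Drawable size in pixels for a view measured in points. */
static PixelSize drawable_size(int points_w, int points_h, double scale) {
    assert(points_w > 0 && points_h > 0 && scale > 0.0);
    PixelSize s = { (int)(points_w * scale), (int)(points_h * scale) };
    return s;
}

/* Rough GPU time after a resolution change, assuming cost is
 * proportional to the number of pixels shaded (fill-bound case). */
static double scaled_gpu_ms(double gpu_ms, PixelSize from, PixelSize to) {
    double from_px = (double)from.width * (double)from.height;
    double to_px = (double)to.width * (double)to.height;
    assert(from_px > 0.0);
    return gpu_ms * to_px / from_px;
}
```

With a scale of 2.0 the drawable is 640x960, and with 1.0 it is 320x480, a quarter of the pixels. Under the linear assumption, the 20.1ms GPU time reported earlier would fall to roughly 5ms, well inside a 60Hz frame.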


There definitely is a software fallback. I once hit it when I tried to support too many lights in a vertex shader and performance dropped to >400ms per frame. Then I removed one light and frame time dropped to 10ms. The profiler clearly indicated that most of the time was spent in the software shader pipeline.

There is no software fallback under any circumstances for any iOS device released to date (nor will there likely ever be).
Your slowdown could be because of your lighting equations or because the complexity of your shader increased significantly with more lights and was harder to optimize, etc., but it was not because of software shaders.


L. Spiro

Edited by L. Spiro, 07 May 2014 - 06:48 PM.




