
OpenGL C++ vs. Java: an experience



Greetings,

 

I am writing a block-based game. I began in Java using JOGL, having had recent experience with that pair on another project. Later, in deference to the many conversations about JOGL vs. LWJGL, I ported to LWJGL and posted the experience here (performance was identical, but LWJGL was easier to use).

 

Well, in speaking with some friends I learned that it's not as hard to write portable C++ code as I had thought, particularly with C++11 (I was terrified of having to mess with Xcode to get a Mac port working!). So, given that C++ is my favorite language, I ported my game again.

 

Why I think this matters

There are often threads here (and elsewhere) asking which language a person should learn, or whether certain language/platform choices are good ones for making games. My view, where most OpenGL-based games are concerned, is that the performance bottlenecks are almost always on the GPU, and since modern GL usage uploads nearly all of the data to the GPU, the language will rarely matter. Consider Minecraft: could this game be "better" or faster if it had been written in C++? Sure, the code could run faster (if written well), but the frame rate would not necessarily improve. To say it another way, Minecraft could be improved more through optimization within its current language (Java) than by a move to another language.

 

So why would I rather work in C++?

I know it best. I have many more years of experience in C++, and with many years of assembler behind me as well, I am able to write much more performant code in it. For the vast majority of uses this would not matter, but being a bit of an optimization fanatic, I like to keep resource use low "just in case." That is impractical, but it is part of my character as a programmer.

 

So what were my results after porting?

My Java code was written "just to make it work," i.e., no optimization whatsoever.  The C++ code was as trivial a port as could be done, also with no optimization.

 

Note: I spent a fair amount of time optimizing GPU usage, and that work is language-agnostic, but I did essentially no CPU-side optimization.

 

The program renders as many as four million block faces, meshed into perhaps two million triangles (one million quads). It uses frustum culling, depth testing, and a few tricks to eliminate geometry that does not need to be rendered, but nothing else.
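For the curious, the culling itself is nothing exotic. A minimal sketch of frustum culling against per-chunk bounding boxes (not my actual code; it assumes a column-major view-projection matrix, as most GL math libraries produce, and one axis-aligned box per chunk) looks something like this:

// Build the six frustum planes from a column-major view-projection matrix
// (Gribb/Hartmann method), then cull any chunk whose box is fully outside a plane.
#include <array>

struct Plane { float a, b, c, d; };       // plane equation: ax + by + cz + d = 0
struct AABB  { float min[3], max[3]; };   // one box per chunk

std::array<Plane, 6> extractFrustumPlanes(const float m[16]) {
    auto row = [&](int r) { return std::array<float, 4>{ m[r], m[4 + r], m[8 + r], m[12 + r] }; };
    const auto r0 = row(0), r1 = row(1), r2 = row(2), r3 = row(3);
    auto plane = [](const std::array<float, 4>& a, const std::array<float, 4>& b, float s) {
        return Plane{ b[0] + s * a[0], b[1] + s * a[1], b[2] + s * a[2], b[3] + s * a[3] };
    };
    return { plane(r0, r3, +1), plane(r0, r3, -1),    // left, right
             plane(r1, r3, +1), plane(r1, r3, -1),    // bottom, top
             plane(r2, r3, +1), plane(r2, r3, -1) };  // near, far
}

bool outsideFrustum(const AABB& box, const std::array<Plane, 6>& planes) {
    for (const Plane& p : planes) {
        // Test the corner farthest along the plane normal (the "positive vertex").
        const float x = (p.a >= 0.0f) ? box.max[0] : box.min[0];
        const float y = (p.b >= 0.0f) ? box.max[1] : box.min[1];
        const float z = (p.c >= 0.0f) ? box.max[2] : box.min[2];
        if (p.a * x + p.b * y + p.c * z + p.d < 0.0f)
            return true;   // entirely on the negative side of this plane: cull it
    }
    return false;          // at least partially inside (rare false positives are harmless)
}

Running this per chunk rather than per block keeps the test cheap; the depth test and the other tricks then take care of the rest.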

My results were as follows:
* No change in framerate (~265 fps, minimum, in both languages, and > 3,000 maximum)
* No change in CPU use while playing (~10%, but it is as yet single threaded on a 6-core box)
* Perhaps 15% faster world generation time in C++
* Slightly smaller memory use in C++ as opposed to Java

 

Although the raw numbers vary from machine to machine, the outcome (essentially no difference) is the same across many different PCs and GPUs.

From personal experience, I know that:
* I can bring the CPU use down in C++, and also bring down the world-gen time (with a little optimization)
* I can bring the memory use down dramatically in C++

 

But one more part of the experience bears some mention: development and debugging.

 

In Java, all builds are created equal. They all run with full debug support, and they run fairly fast.

 

In C++, release builds are the fastest but lack even trivial debug support. Debug builds are unbelievably slow.

 

Also, porting Java to another platform requires only a few command-line switch changes, whereas porting C++ requires a platform-specific build which, while easier than I had thought (using GLFW and similar libraries), is still much more technically challenging than in Java.
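For reference, the platform-neutral part of the code really is small. A minimal GLFW window-and-context setup (a sketch, not my game's startup code; the GL version requested here is an assumption) is roughly:

#include <GLFW/glfw3.h>   // pulls in the basic GL declarations as well
#include <cstdio>

int main() {
    if (!glfwInit()) { std::fprintf(stderr, "glfwInit failed\n"); return 1; }

    // Request a 3.3 core profile context; macOS needs the forward-compat hint.
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GL_TRUE);

    GLFWwindow* window = glfwCreateWindow(1280, 720, "Block game", nullptr, nullptr);
    if (!window) { glfwTerminate(); return 1; }
    glfwMakeContextCurrent(window);
    glfwSwapInterval(1);   // vsync

    while (!glfwWindowShouldClose(window)) {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        // ... render the world here ...
        glfwSwapBuffers(window);
        glfwPollEvents();
    }
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}

The same source builds on Windows, Linux, and macOS; only the build scripts (and the libraries you link against) differ.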

 

Conclusion

For an experienced C++ programmer, particularly one working on a long-term project where future requirements may demand the utmost performance and where the time spent on platform-specific builds is comparatively small, C++ is what I will choose to use.

 

However, if tasked with building something with more certain requirements, or where the CPU-related needs are known and moderate (at least not high), Java is the better choice, at least insofar as I know I can complete the work, and the cross-platform port, in far less time.

To put it another way: assuming Notch is reasonably skilled, Minecraft would not be better, nor give a higher FPS, had it been written in C++. Therefore, for most games, people should choose the language they are most productive in.

A few details, for those who care
The data in this post does not depend on the tech specs, but I know I often want them when I read such things, so here they are.

I am running Windows 8.1 on an Intel Core i7-980X (6-core) CPU with 24 GB of triple-channel DDR3-1600 and a Radeon HD 5870 GPU. I am developing with Microsoft Visual Studio 2013 Express (for C++) and IntelliJ IDEA Community Edition (for Java).


The GPU may be strong, but with voxel worlds you suddenly find yourself bottlenecked just about everywhere. :(

The good thing about the CPU side is that you can use thread pools to gradually generate the world meshes in the background, which is basically a must-have for complex voxel engines with raycast lighting and the like.
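Something along these lines is enough to get started (a rough sketch only; ChunkData, ChunkMesh, and buildMesh are placeholders, and std::async here is just standing in for a real thread pool):

#include <future>
#include <vector>

struct ChunkData {};   // voxel data for one chunk (placeholder)
struct ChunkMesh {};   // CPU-side vertex/index arrays (placeholder)

ChunkMesh buildMesh(const ChunkData& chunk) {
    ChunkMesh mesh;
    // ... face culling / greedy meshing of the chunk would go here ...
    return mesh;
}

int main() {
    std::vector<ChunkData> dirtyChunks(64);
    std::vector<std::future<ChunkMesh>> jobs;

    // Kick off meshing on background threads; the render thread keeps drawing
    // the old meshes in the meantime.
    for (const ChunkData& chunk : dirtyChunks)
        jobs.push_back(std::async(std::launch::async, buildMesh, std::cref(chunk)));

    // Later (e.g. once per frame), collect finished meshes and upload them to
    // the GPU from the GL thread only.
    for (auto& job : jobs) {
        ChunkMesh mesh = job.get();   // a real engine would poll with wait_for()
        // glBufferData(...) / glBufferSubData(...) here, on the GL thread
        (void)mesh;
    }
}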

But the GPU side stays the same, and it is not quite solved for me. Decent FPS is easy enough when you have a monster GPU.

I only wish I had more options for how I render the data, especially given how many draw commands I need to go through. The rest is scalable, such as the settings and especially the view distance. In the end you could run my engine on just about anything with OpenGL 3.x support, more or less; it just wouldn't be the same as having terrain all the way to the horizon.


Whenever I play Minecraft I keep thinking to myself, "I wonder how much faster/cleaner/more stable this would have been if they had something like value types in C#/.NET." I'm not saying Java is a poor choice for game development, but rather that each language has its strengths and weaknesses, and I'm skeptical that MC is a game for which Java truly shines.

 

Also, doing a straight port from one language/platform to another is rarely interesting to look at, since it usually means you won't take full advantage of the new language. It's entirely possible, for example, that you could have created very different solutions in C++ that simply are not possible in Java. Of course, if you're GPU bound then it won't matter.


Also, doing a straight port from one language/platform to another is rarely interesting to look at, since it usually means you won't take full advantage of the new language.

I would only apply this to his memory-use comparison. Java has garbage collection, so it probably keeps objects around in memory for longer than his C++ implementation does. The rest of the comparison results seem interesting to me, especially the point he makes that optimized C++ code doesn't yield better frame rates than the Java counterpart. To me, this clearly means that game development nowadays is all about GPU rather than CPU optimization.

OK, maybe the point about CPU use is also invalid, since a single thread running at full speed on a six-core CPU would be expected to consume about 16% of the total (100% / 6 cores). He mentions ~10%, but whether that is 10% of one core or 10% of all cores matters too. It also means that his thread does some waiting somewhere that might not be needed in such a comparison (since it never gets to 16%... or is that power saving?).



No, it would be expected to consume ~8% of the total, due to Hyper-Threading on that CPU (12 logical CPUs, so 100% / 12 ≈ 8%). This means we cannot be sure that he is in fact GPU bound; he could just as well be CPU bound. Also, even if he is GPU bound, that only means that _he_ is GPU bound, not that games in general are. It's also possible that he shifted work from the CPU to the GPU because some rendering/optimization techniques that are hard to do efficiently in Java would have been easier in another language such as C++. Please note that I'm not trying to compare C++ to Java with this statement, but rather to point out a flaw in this type of language comparison.


UPDATE

 

Regarding some of the posts here: my game was GPU bound. I have deep and lengthy experience with JNI and other native-interop issues, and I promise that ~8k JNI calls per frame were not taxing the CPU, the RAM, or the PCI-e bus. In any case, if the JNI calls had been contributing significantly to my FPS, one would have expected an improvement in C++ (no marshalling), which did not occur. (I have since analyzed performance on several GPUs and found the bottleneck to be geometry processing on all but one, and fill rate on that outlier; neither is language related.)

 

Now, the comparison I made is salient for the type of game I am building: largely static geometry, where nearly everything is uploaded to the GPU for continual processing.

 

Java garbage collection can seriously impact games, albeit more so on core-limited devices (such as mobile), but my code had no measurable loss of performance from garbage collection: I am a bit of an optimization geek and everything that could cause these problems is pooled (I have used and taught Java since about '98).

 

I have recently done some testing on various GPUs and found that some drivers do rather unpredictable things with respect to how they move data from RAM to VRAM. In particular, the standard "best practice" methods for moving data do not apply to some of the Radeon 5xxx series drivers, on which glBufferData is every bit as fast as mapping and uploading in a background thread. My tests show that glBufferData initiates a DMA transfer from the RAM buffer directly to the GPU, whereas conventional wisdom says the driver first copies the data. Of course this is an optimization, but it means that if you reuse the buffer before the DMA transfer is complete, you will get some very interesting results (and I did!). Ironically, this particular optimization will benefit Java implementations a little more than C++ ones.
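For reference, the two upload paths I am comparing look roughly like this in plain OpenGL 3.x (generic sketches with placeholder arguments, not my engine's code; use whichever function loader you already have):

#include <GL/glew.h>   // or any loader that provides the GL 3.x entry points
#include <cstring>

// Path 1: the simple route. Re-specifying the store lets the driver schedule
// the copy (or DMA) however it likes.
void uploadWithBufferData(GLuint vbo, const void* vertices, GLsizeiptr byteCount) {
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, byteCount, vertices, GL_STREAM_DRAW);
}

// Path 2: orphan the buffer, map it, copy in, unmap. The memcpy can be done by
// a worker thread, as long as the map/unmap calls stay on the GL thread.
void uploadWithMapping(GLuint vbo, const void* vertices, GLsizeiptr byteCount) {
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, byteCount, nullptr, GL_STREAM_DRAW);   // orphan
    void* dst = glMapBufferRange(GL_ARRAY_BUFFER, 0, byteCount,
                                 GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
    if (dst) {
        std::memcpy(dst, vertices, byteCount);
        glUnmapBuffer(GL_ARRAY_BUFFER);
    }
}

On the drivers I mention above, the first path was every bit as fast as the second, with the caveat about reusing the source buffer too early.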

 

Finally, my real update: after rewriting some code, particularly my task scheduler (the logic that spreads work across all CPU cores), my performance is up about 10x from where I started. When I back-ported this logic to Java, I got only a 3x improvement, so something about Java threads causes them to suffer a significant loss compared to native C++11 threads. This is visible in the task manager: while both programs have the same number of threads, only the C++11 version pegs all cores; the Java one leaves them at around 35% utilization.
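For anyone curious, the skeleton underneath a scheduler like that is nothing special. A minimal sketch of a fixed pool of C++11 worker threads (the standard mutex/condition-variable pattern, not my actual scheduler) looks like this:

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class WorkerPool {
public:
    explicit WorkerPool(unsigned n = std::thread::hardware_concurrency()) {
        if (n == 0) n = 1;                      // hardware_concurrency() may report 0
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~WorkerPool() {
        { std::lock_guard<std::mutex> lock(mutex_); stop_ = true; }
        cv_.notify_all();
        for (std::thread& t : workers_) t.join();
    }
    void submit(std::function<void()> task) {
        { std::lock_guard<std::mutex> lock(mutex_); tasks_.push(std::move(task)); }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (stop_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();   // e.g. generate or mesh one chunk
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
};

// Usage: WorkerPool pool; pool.submit([] { /* generate one chunk */ });
// With enough independent tasks queued, this keeps every core busy.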

 

I would love to pursue this and understand why, for my own edification, but as I am no longer working in Java, I just don't care anymore. :-)

 

In conclusion, I believe the following statement to be true: if you write a game in C++ and it uses, on average, less than 50% of a typical CPU, the game could be written to perform nearly as well in Java. If it uses less than 30% of the CPU, it could perform just as well in Java.

 

This is to say what I have always said: you should program in the language you know best (C++, Java, C#, VB,...) as that will maximize your chance of completing a game.  If you have performance problems, chances are you can fix them algorithmically, and if they are FPS related, chances are you can NOT fix them with a language change (unless you are CPU bound).



I am a bit of an optimization geek and everything that could cause these problems is pooled (I have used and taught Java since about '98).
What resources do you pool most often? 3D applications typically use lots of vectors and matrices, and Java has the peculiarity of needing direct buffers to interface with native code like the OpenGL library. How do you handle those?
