Where to start.. it's not the method itself you need to optimize, it's the way you're storing the cubes. At the moment your method is identifying adjacent cubes on the fly, for each cube. But the problem is that each cube is being checked against every single other cube to determine if it is surrounded by cubes on all sides, and hence if it is occluded and should not be rendered. The main issue here is that you are not using the right data structure to store your cubes. Your current data structure (a raw array of cubes) provides no sane way to check if a cube is completely surrounded by other cubes. It's just not efficient, and no amount of micro-optimizing your current approach will make it efficient: your data structure is not giving you the information you need. What you want is a better way to represent your world such that you can very quickly answer the question "should I draw this cube" without iterating through the entire list of cubes to find out.
There are many possible approaches, the easiest (but probably least flexible) is perhaps to store your cubes inside a 3D array, with their positions inferred from their index in the array, and then adjacency check consists of simply checking if the 6 surrounding array indices contain cubes. Of course this is still not particularly efficient, you will have poor cache locality and will probably still be doing too much work. Another option is to store pointers to adjacent cubes inside your cube structure while constructing your world (and keeping track of them as you add and remove cubes) which could be interesting, and might make flood-fill type occlusion algorithms easy to implement.
But here's something to consider: do you really need 100% accuracy on the adjacency test? What if you could have a very quick CPU test that culls, say, 95% of all the cubes that will never be seen, and let the graphics card churn through the rest? Wouldn't that be a better tradeoff than going to great lengths to ensure that every last occluded cube is never drawn, at an immense computational cost to the CPU? Think of frustum culling, which is a related - though quite distinct - optimization technique. Does the CPU cull every off-screen triangle itself? No, that would be stupid, it only roughly checks if large blocks of the world are off-screen, and if it misses some that's fine, all that matters is that it culled most of the off-screen triangles at a very low cost. Of course frustum culling does not do the same thing as occlusion culling (the former reduces draw calls whereas the latter tries to get less triangles to the rasterization stage) but the basic idea is the same: it doesn't have to be perfectly accurate, just correct and good enough for your game to run faster.
With that in mind, what would be a fast but approximate approach to determining occlusion for a cube world? One way which springs to mind is to store your cubes in a special octree, which tracks which nodes are completely full with cubes, and that way you can take a look at the topmost nodes of the tree, discard huge chunks of cubes if they are occluded by other huge chunks of the same size. For instance, think of the inside of a mountain, there is probably a 5x5x5 chunk inside it which is surrounded by other 5x5x5 chunks on all six sides, and so you can immediately discard those 125 cubes as occluded and not draw them at all - that is the idea I am trying to convey. You can then look at the smaller nodes one level deeper, and do the same thing. Eventually, if you carried on this process down to the deepest level of the octree, you would be left with only the surface of your cube world, with no occluded cubes drawn at all. It would probably be faster than your current approach, but you will notice that it is likely not worth it going through the entire tree, as going deeper gets progressively more expensive. It's up to you to decide where to stop to achieve optimum balance between fast CPU-side culling and letting the GPU deal with leftover cubes, likely through empirical testing.
Note I haven't tested the above approach, and there are likely much better alternatives out there for voxel-style worlds like yours, and you should probably not use the above algorithm literally. It was just to illustrate that using a better data structure which maps well to the problem you are trying to solve is the way to go, and that sometimes going for perfection can actually make things run slower. And, yes, it's a bit more involved than storing cubes in a list and doing it the naive way, but if everything was simple enough to be done efficiently with only arrays and for loops there would be no need to come up with advanced data structures! :)
(btw, you are comparing floats in your code, this is not very robust: if your positions can only be integers, then make it so)