It doesn't matter if I add a renderbuffer or not, as long as I have a depth-texture target the screen is black
If I don't have depth-texture target it draws fine, just without depth-testing anymore
So, with no test it works, but with a test it fails. That suggests the test is wrong, or at least not configured the way you expect. At a minimum, make sure glDepthMask, glDepthRange and glDepthFunc are set as you expect them to be. Personally, it sounds to me like depth testing is working, just not the way you expect, and your fragment output is being rejected.
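To illustrate how a functioning depth test can still reject every fragment, here is a minimal sketch in plain C that models the comparison selected by glDepthFunc. The function depth_test_passes and the DF_* names are made up for this example; only the numeric values correspond to the GL tokens (GL_NEVER through GL_ALWAYS). It is a model of the comparison, not how any driver implements it.

```c
/* Values match the GL depth-function tokens GL_NEVER .. GL_ALWAYS.
   The DF_* names are hypothetical, chosen to avoid clashing with GL headers. */
enum {
    DF_NEVER    = 0x0200,
    DF_LESS     = 0x0201,
    DF_EQUAL    = 0x0202,
    DF_LEQUAL   = 0x0203,
    DF_GREATER  = 0x0204,
    DF_NOTEQUAL = 0x0205,
    DF_GEQUAL   = 0x0206,
    DF_ALWAYS   = 0x0207
};

/* Hypothetical model of the per-fragment depth comparison.
   'incoming' is the fragment's depth, 'stored' is the value already in
   the depth attachment. Returns 1 if the fragment would be kept. */
int depth_test_passes(unsigned int func, float incoming, float stored)
{
    switch (func) {
    case DF_NEVER:    return 0;
    case DF_LESS:     return incoming <  stored;
    case DF_EQUAL:    return incoming == stored;
    case DF_LEQUAL:   return incoming <= stored;
    case DF_GREATER:  return incoming >  stored;
    case DF_NOTEQUAL: return incoming != stored;
    case DF_GEQUAL:   return incoming >= stored;
    case DF_ALWAYS:   return 1;
    default:          return 0; /* unknown function: treat as rejected */
    }
}
```

With the default GL_LESS, a fragment at depth 0.5 passes against a buffer cleared to 1.0 but fails against a buffer holding 0.0. So one possibility worth ruling out, and this is a guess rather than a diagnosis, is that the depth attachment is never cleared (GL_DEPTH_BUFFER_BIT) or is cleared to a value other than 1.0. Without any depth attachment the test is effectively disabled, which would match what you see in the second case.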
It's probably also worth reviewing (googling) the concept of 'framebuffer completeness' so you are more certain about what a framebuffer can and can't do. That will cover the minimum requirements and clarify which attachments need to be in place.
Checking for framebuffer completeness with glCheckFramebufferStatus is a good idea and should surface any obvious errors. You may find that a framebuffer with only a depth attachment is reported as incomplete, which is possibly where your actual problem lies. This might be specific to GL ES, but it is worth checking anyway.
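As a rough sketch of how the completeness check could be made readable in a log, the helper below maps a status code to its token name. The function framebuffer_status_name and the FB_* names are hypothetical; the numeric values are those of the corresponding GL ES 2.0 tokens, repeated here only so the snippet compiles without a GL header.

```c
/* Values match the GL ES 2.0 framebuffer status tokens.
   The FB_* names are hypothetical stand-ins for the GL_FRAMEBUFFER_* tokens. */
enum {
    FB_COMPLETE                      = 0x8CD5,
    FB_INCOMPLETE_ATTACHMENT         = 0x8CD6,
    FB_INCOMPLETE_MISSING_ATTACHMENT = 0x8CD7,
    FB_INCOMPLETE_DIMENSIONS         = 0x8CD9,
    FB_UNSUPPORTED                   = 0x8CDD
};

/* Hypothetical helper: turn a glCheckFramebufferStatus result into text. */
const char *framebuffer_status_name(unsigned int status)
{
    switch (status) {
    case FB_COMPLETE:
        return "GL_FRAMEBUFFER_COMPLETE";
    case FB_INCOMPLETE_ATTACHMENT:
        return "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT";
    case FB_INCOMPLETE_MISSING_ATTACHMENT:
        return "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT";
    case FB_INCOMPLETE_DIMENSIONS:
        return "GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS";
    case FB_UNSUPPORTED:
        return "GL_FRAMEBUFFER_UNSUPPORTED";
    default:
        return "unknown framebuffer status";
    }
}
```

In a real program you would call this right after attaching the textures or renderbuffers, with the framebuffer still bound, for example framebuffer_status_name(glCheckFramebufferStatus(GL_FRAMEBUFFER)), and log anything other than GL_FRAMEBUFFER_COMPLETE before drawing.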
Specifically, if I'm reading your 'chain' correctly, you may be seeing two different problems: misconfigured depth parameters when you do have a renderbuffer, and an incomplete framebuffer when you don't.