I suppose these past few days have really shown me how thin my knowledge of 3D is.
I did notice that Urho's profiler is really nice, but after playing around with different engines and testing things, I realised that what I measured was not exactly what I should have been measuring. A lot was said about draw calls, and I figured that 1 draw call per model is not so bad and that 130 draw calls would be fine.
I noticed that in both libgdx and cocos2dx, one of the causes of the slowdowns seems to be animation. If I use non-animated objects I can place a lot more of them, which made me try more vertices per bone. I subsurfaced my model twice, so it had 16 times the vertices in Blender. In Blender I had 12530 triangles, which seems to become close to 40k triangles.
So I tried rendering 121 animated meshes of 40k triangles each. To my surprise, the fps dropped from 30 to around 15 on cocos2dx, and to 15 on urho from roughly 50 fps without shadows and 40 with shadows, while on libgdx the fps stayed at 11, the same as with the old mesh. I guess adding even more triangles would make them all equal, once rendering uses the full capacity.
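To put rough numbers on this, here is a small Python sketch that converts the fps figures above into milliseconds per frame and estimates how many triangles are submitted per second. The mesh count, triangle count and fps values come from this post. The helper names are made up for illustration, and I am assuming urho ended up at 15 fps both with and without shadows. Frame time is easier to compare than fps because it adds up linearly.

```python
# Numbers taken from the post; the triangle count is approximate ("close to 40k").
MESH_COUNT = 121
TRIANGLES_PER_MESH = 40_000


def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def triangles_per_second(fps):
    """Approximate number of triangles submitted per second at a given fps."""
    return MESH_COUNT * TRIANGLES_PER_MESH * fps


# (engine, fps with the old mesh, fps with the 40k-triangle mesh)
# Assumption: urho dropped to 15 fps in both the shadow and no-shadow case.
results = [
    ("cocos2dx", 30, 15),
    ("urho, no shadows", 50, 15),
    ("urho, shadows", 40, 15),
    ("libgdx", 11, 11),
]

for engine, before, after in results:
    extra = frame_time_ms(after) - frame_time_ms(before)
    print(
        f"{engine}: {frame_time_ms(before):.1f} ms -> "
        f"{frame_time_ms(after):.1f} ms (+{extra:.1f} ms), "
        f"~{triangles_per_second(after) / 1e6:.1f}M triangles/s"
    )
```

Seen this way, the heavier mesh costs each engine a different number of extra milliseconds per frame, which is a more direct comparison than the fps drop alone.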
So my conclusion is that speed doesn't seem to be an issue, as long as I understand why the slowdowns happen. I am also sorry if anyone who has been reading these posts and my tests has drawn the wrong conclusions from my flawed testing methods. At least I am now more aware of this myself.