OK, I looked at the 2D GJK explanation. While I don't think it is optimal, it should still yield the correct result.
The direction of the y axis should not matter in your implementation, since you check the direction of the normal vectors anyway.
You do not mention how you actually calculate the normal vectors. For a vector x=[x1, x2], a normal can be calculated as nx=[x2, -x1]; its negation [-x2, x1] is the other perpendicular.
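To make the normal calculation concrete, here is a minimal sketch in Python. The function names (perpendicular, dot, normal_towards) are my own and not taken from your code, and I am assuming you pick the sign of the normal with a dot-product test against some target direction. Because the sign is decided by that test, it does not matter which way the y axis points.

```python
def perpendicular(v):
    """Rotate a 2D vector by 90 degrees: [x1, x2] -> [x2, -x1]."""
    x1, x2 = v
    return (x2, -x1)


def dot(a, b):
    """Dot product of two 2D vectors."""
    return a[0] * b[0] + a[1] * b[1]


def normal_towards(edge, target):
    """Return the perpendicular of `edge` that points towards `target`.

    Hypothetical helper: if the first perpendicular points away from
    `target` (negative dot product), its negation is used instead.
    """
    n = perpendicular(edge)
    if dot(n, target) < 0:
        n = (-n[0], -n[1])
    return n
```

For example, perpendicular((1, 0)) gives (0, -1), and normal_towards((1, 0), (0, 1)) flips that to (0, 1) because the first candidate points away from the target.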