it's up to you
there aren't that many matrix representations to choose from.
regular ortho from (-x to x, -y to y)
NDC (which means you don't need to specify a projection at all), which goes from (-1 to 1, -1 to 1)
pixel-based ortho (0 to x, 0 to y) where Y goes downwards positively
and various modes of perspective with varying FOV, typically used with 3D, because it looks 3D :)
what you use is entirely up to you, but it's common to use orthographic projections with 2D
keep in mind you still have a Z axis, just there's no way for you or the player to perceive it, unless you use math-magic
SDL is just easier overall. It provides you with all the multimedia aspects you need, including window, input, sound and other things.
If you use OpenGL barebones, you will need to implement many concepts on your own, such as sprites
You'll also need to provide music & sound (typically as streams & samples)
this is nothing too complicated with today's libraries, and if you already have a good grasp of whatever language you are using, you will be ok
your challenges lie in learning OpenGL (which is inadequately documented, and I say that as someone who knows OpenGL well), and implementing concepts for your game
If you are using glm:: you already have much of the barebones OpenGL bits covered. Create an orthographic projection with whatever coordinate system you desire
The easiest version is probably mapping pixels 1:1
Implement sprites, create a train (sprites following sprites), and have it follow a path
if you can do that, you can create a game no probs :)
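here's one way the train exercise could look, logic only, no rendering. This is just a sketch under my own assumptions: Sprite, Train, moveTowards, the waypoint list and the spacing rule are all invented for illustration, and a real sprite would also carry a texture and size.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical minimal sprite: only a position in pixel coordinates.
struct Sprite { Vec2 pos; };

// Move `from` towards `target` by at most `maxStep`; returns the new position.
// Returns `target` exactly when it is within reach.
Vec2 moveTowards(Vec2 from, Vec2 target, float maxStep) {
    float dx = target.x - from.x, dy = target.y - from.y;
    float dist = std::sqrt(dx * dx + dy * dy);
    if (dist <= maxStep || dist == 0.0f) return target;
    return { from.x + dx / dist * maxStep, from.y + dy / dist * maxStep };
}

struct Train {
    std::vector<Sprite> cars;      // cars[0] is the engine
    std::vector<Vec2> path;        // waypoints the engine visits in order
    std::size_t nextWaypoint = 0;
    float speed = 100.0f;          // pixels per second
    float spacing = 20.0f;         // gap kept between neighbouring cars, in pixels

    void update(float dt) {
        if (cars.empty()) return;
        // Engine: head for the next waypoint, switch to the following one on arrival.
        if (nextWaypoint < path.size()) {
            Vec2 target = path[nextWaypoint];
            cars[0].pos = moveTowards(cars[0].pos, target, speed * dt);
            if (cars[0].pos.x == target.x && cars[0].pos.y == target.y)
                ++nextWaypoint;
        }
        // Followers: each car is pulled towards the car ahead of it,
        // stopping `spacing` pixels short (sprites following sprites).
        for (std::size_t i = 1; i < cars.size(); ++i) {
            Vec2 ahead = cars[i - 1].pos;
            float dx = ahead.x - cars[i].pos.x, dy = ahead.y - cars[i].pos.y;
            float dist = std::sqrt(dx * dx + dy * dy);
            if (dist > spacing)
                cars[i].pos = moveTowards(cars[i].pos, ahead, dist - spacing);
        }
    }
};
```

call update(dt) once per frame with the frame time in seconds, then draw each car at its pos. The followers here are dragged along like beads on a string; if you want them to trace the exact path of the engine, you'd record the engine's past positions instead.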
Ask if anything is unclear