First, I would recommend the book Game Engine Architecture, Second Edition, so you can see what a game engine does and what a graphics engine needs. The two overlap, since every game engine includes a graphics engine too.
Graphics Engine
First, set up your general rendering pipeline before going anywhere else. At the start there will be a single rendering step, as long as you are not using a stencil buffer or alpha blending. Later, for example when rendering your own UI, you will need separate rendering steps, because depending on the UI system you will need the stencil buffer too, so keep that in mind when designing your first version.
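To make the idea of rendering steps concrete, here is a minimal sketch of a pipeline that runs passes in the order they were added. The names RenderPass, RenderPipeline and needsStencilBuffer are my own for illustration, not from any library, and the pass bodies are left as callbacks where your OpenGL calls would go.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One rendering step, e.g. "scene" or "ui" (hypothetical type for illustration).
struct RenderPass {
    std::string name;
    bool usesStencil = false;        // lets the pipeline know what buffers to allocate
    std::function<void()> execute;   // the actual draw calls would live in here
};

class RenderPipeline {
public:
    void addPass(RenderPass pass) { passes_.push_back(std::move(pass)); }

    // Runs every pass in the order it was added.
    void renderFrame() const {
        for (const auto& pass : passes_) {
            if (pass.execute) pass.execute();
        }
    }

    // True if at least one pass asked for a stencil buffer.
    bool needsStencilBuffer() const {
        for (const auto& pass : passes_) {
            if (pass.usesStencil) return true;
        }
        return false;
    }

private:
    std::vector<RenderPass> passes_;
};
```

You would start with one "scene" pass and add a "ui" pass later without touching the scene code.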
Second, write a camera class that wraps a mat4 type for your view matrix. These days any general transform is done by the view matrix together with the global projection matrix you choose (e.g. for 3D perspective rendering), so you should become familiar with vector math. Moving around with a camera is also very helpful for checking that everything is rendered correctly.
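As a rough sketch of such a camera class, the following builds a right-handed look-at view matrix by hand. Vec3 and Mat4 here are tiny stand-ins I made up so the example compiles on its own; in practice you would likely take them from a math library. The matrix is stored column-major, which is the layout OpenGL expects.

```cpp
#include <array>
#include <cmath>

// Minimal stand-in math types (assumptions for this sketch only).
struct Vec3 { float x, y, z; };

struct Mat4 {
    // Column-major storage: element (row r, column c) is m[c * 4 + r].
    std::array<float, 16> m{};

    static Mat4 identity() {
        Mat4 r;
        r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
        return r;
    }
};

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

class Camera {
public:
    // Right-handed look-at: the camera looks down its own -Z axis.
    void lookAt(Vec3 eye, Vec3 target, Vec3 up) {
        Vec3 f = normalize(sub(target, eye)); // forward
        Vec3 s = normalize(cross(f, up));     // right
        Vec3 u = cross(s, f);                 // corrected up

        view_ = Mat4::identity();
        view_.m[0] = s.x;  view_.m[4] = s.y;  view_.m[8]  = s.z;
        view_.m[1] = u.x;  view_.m[5] = u.y;  view_.m[9]  = u.z;
        view_.m[2] = -f.x; view_.m[6] = -f.y; view_.m[10] = -f.z;
        // Translation part: move the world opposite to the camera position.
        view_.m[12] = -dot(s, eye);
        view_.m[13] = -dot(u, eye);
        view_.m[14] = dot(f, eye);
    }

    const Mat4& view() const { return view_; }

private:
    Mat4 view_ = Mat4::identity();
};
```

The result of view() is what you would upload as the view uniform to your shader and combine with the projection matrix.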
Once the base is done, I would create a simple Mesh class that handles the basic rendering of one single mesh. It should also hold a mat4 for the mesh's individual transform in the world, plus everything you need to render the mesh, such as the VBO and VAO, and maybe a Material and a Texture class. The Material refers to a shader and holds that material's settings for the shader, such as the texture you use, colors, and alpha blending if necessary. Getting shaders into OpenGL, loading textures and bringing everything together may take a bit more work.
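How Mesh, Material and Texture relate to each other could look roughly like the data layout below. This is only a sketch under my own assumptions: GpuHandle stands in for OpenGL's GLuint so the example compiles without OpenGL headers, no actual GL calls are made, and needsBlending is a hypothetical helper.

```cpp
#include <array>
#include <cstdint>
#include <memory>

using GpuHandle = std::uint32_t; // stand-in for GLuint (assumption for this sketch)

struct Texture {
    GpuHandle id = 0; // would be filled by your texture loader
    int width = 0;
    int height = 0;
};

// A material points at a shader program and carries that shader's settings.
struct Material {
    GpuHandle shaderProgram = 0;
    std::shared_ptr<Texture> texture;
    std::array<float, 4> color{{1.0f, 1.0f, 1.0f, 1.0f}}; // RGBA
    bool alphaBlend = false;
};

struct Mesh {
    // Per-mesh world transform, column-major, starting as identity.
    std::array<float, 16> transform{{1.0f, 0.0f, 0.0f, 0.0f,
                                     0.0f, 1.0f, 0.0f, 0.0f,
                                     0.0f, 0.0f, 1.0f, 0.0f,
                                     0.0f, 0.0f, 0.0f, 1.0f}};
    GpuHandle vao = 0;
    GpuHandle vbo = 0;
    std::int32_t vertexCount = 0;
    std::shared_ptr<Material> material; // shared, so many meshes can use one material
};

// Meshes that need blending are typically drawn after the opaque ones.
inline bool needsBlending(const Mesh& mesh) {
    return mesh.material &&
           (mesh.material->alphaBlend || mesh.material->color[3] < 1.0f);
}
```

Sharing the Material through a pointer means changing one setting affects every mesh that uses it, which is usually what you want.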
To get more advanced, you might build up your scene structure by creating a scene graph. Modern game engines have a scene graph whose nodes refer to the objects rendered in the scene, and these nodes may contain sub-nodes for child objects attached to them, like a sword attached to a warrior model. The scene graph iterates through the nodes and skips any node that is out of sight, together with all child nodes attached to it. Otherwise it renders the node and its child nodes to the screen, applying the parent node's transform to the child nodes. (The warrior model is located at point 0,-<its feet to the ground>,0, while the sword is additionally transformed relative to the warrior model's origin so that it fits into its right or left hand.)
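The traversal described above can be sketched like this. To keep it short I use a plain translation (Vec3) instead of a full mat4 per node, and the visibility test and draw call are passed in as callbacks; SceneNode, renderGraph and the callback types are names I chose for illustration.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
inline Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

struct SceneNode {
    std::string name;
    Vec3 localOffset{0.0f, 0.0f, 0.0f}; // relative to the parent; a real engine stores a mat4
    std::vector<std::unique_ptr<SceneNode>> children;

    SceneNode* addChild(const std::string& childName, Vec3 offset) {
        auto child = std::make_unique<SceneNode>();
        child->name = childName;
        child->localOffset = offset;
        children.push_back(std::move(child));
        return children.back().get();
    }
};

using VisibleFn = std::function<bool(const Vec3& worldPos)>;
using DrawFn = std::function<void(const SceneNode& node, const Vec3& worldPos)>;

// Depth-first traversal: the parent's world transform is carried down to the
// children, and an invisible node culls its whole subtree.
inline void renderGraph(const SceneNode& node, Vec3 parentWorld,
                        const VisibleFn& visible, const DrawFn& draw) {
    Vec3 world = add(parentWorld, node.localOffset);
    if (!visible(world)) return; // skips this node and everything attached to it
    draw(node, world);
    for (const auto& child : node.children) {
        renderGraph(*child, world, visible, draw);
    }
}
```

With a warrior at (10, 0, 0) and a sword attached at offset (0.5, 1, 0), the sword ends up being drawn at (10.5, 1, 0) without storing a world position of its own.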
Particle systems are also a kind of Mesh class, but instead of a single mesh they manage, depending on the particle system, a point cloud or textured quads with some kind of alpha blending. They also apply transforms to the points based on mathematical expressions that describe how the particle system spreads these objects through space. This may be a curve, a line or even a spherical movement over a certain lifetime.
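Here is a minimal sketch of the CPU side of such a particle system, using the simplest case of movement along a straight line. The class and member names are my own; for a curve or spherical movement you would replace the position update with the matching expression, and the resulting positions would be uploaded to a VBO for drawing.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

struct Particle {
    Vec3 position;
    Vec3 velocity;
    float age;      // seconds since the particle was emitted
    float lifetime; // seconds until the particle is removed
};

class ParticleSystem {
public:
    void emit(Vec3 position, Vec3 velocity, float lifetime) {
        particles_.push_back({position, velocity, 0.0f, lifetime});
    }

    // Moves every particle along its velocity and drops the expired ones.
    void update(float dt) {
        for (auto& p : particles_) {
            p.position.x += p.velocity.x * dt;
            p.position.y += p.velocity.y * dt;
            p.position.z += p.velocity.z * dt;
            p.age += dt;
        }
        particles_.erase(
            std::remove_if(particles_.begin(), particles_.end(),
                           [](const Particle& p) { return p.age >= p.lifetime; }),
            particles_.end());
    }

    const std::vector<Particle>& particles() const { return particles_; }

private:
    std::vector<Particle> particles_;
};
```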
Finally, to complete your graphics engine, you need to get into FBOs (framebuffer objects) for the post-processing pipeline. Everything else you need for a pure graphics engine is done via shaders; lighting, shadows and other effects are applied to the scene in post-processing. There are some techniques described on the internet that are worth a look.
Extending to Game Engine
You wrote that you want a character walking over a terrain in first-person mode, so extending your graphics engine to a game engine takes a few more steps. First, derive a first-person camera from your existing camera, because its movement is a bit different from a third-person camera. Make your camera a scene object in the graph and attach your character as its child, or vice versa, so that moving one moves the other too. Your camera then has to take its parent's transform into account to work.
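A first-person camera derived from a base camera might look roughly like the sketch below. The base Camera here is a stripped-down placeholder I wrote so the example stands alone, and the pitch limit kMaxPitch is an arbitrary value I picked; the point is that looking is driven by yaw and pitch, while walking stays on the ground plane.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// Placeholder base class; your real camera would also hold the view matrix.
class Camera {
public:
    virtual ~Camera() = default;
    Vec3 position{0.0f, 0.0f, 0.0f};
};

constexpr float kMaxPitch = 1.5f; // radians, just below 90 degrees (assumed limit)

class FirstPersonCamera : public Camera {
public:
    // Angles in radians; pitch is clamped so the view cannot flip over.
    void rotate(float yawDelta, float pitchDelta) {
        yaw_ += yawDelta;
        pitch_ = std::max(-kMaxPitch, std::min(kMaxPitch, pitch_ + pitchDelta));
    }

    // Direction the camera looks at; yaw 0 and pitch 0 means straight down -Z.
    Vec3 forward() const {
        return {std::sin(yaw_) * std::cos(pitch_),
                std::sin(pitch_),
                -std::cos(yaw_) * std::cos(pitch_)};
    }

    // Walking ignores pitch, so looking up does not lift the character.
    void walk(float distance) {
        position.x += std::sin(yaw_) * distance;
        position.z -= std::cos(yaw_) * distance;
    }

    float pitch() const { return pitch_; }

private:
    float yaw_ = 0.0f;
    float pitch_ = 0.0f;
};
```

Once the camera is a node in the scene graph, position would become the node's local transform, and the parent's transform would be applied on top of it.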
Movement/control input is a system of its own, and you should first consider how you will handle interactions in your game code. Some engines use events to pass interactions through the game code; others use a fixed update loop and poll, for example, for any input made since the last update. Whichever interaction system you use, keep in mind to use some kind of delta timing; this is important to prevent jumps between frames and updates.
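As a sketch of the polling variant with delta timing, the following uses a fixed-step accumulator: each frame adds its measured duration, and the game logic runs in equal steps as often as that time allows. FixedStepClock, InputState, Player and updatePlayer are hypothetical names for this example.

```cpp
// Turns variable frame times into a whole number of fixed-size update steps.
class FixedStepClock {
public:
    explicit FixedStepClock(double stepSeconds) : step_(stepSeconds) {}

    // Adds the time of the last frame and returns how many updates to run now.
    // Leftover time stays in the accumulator for the next frame.
    int advance(double frameDeltaSeconds) {
        accumulator_ += frameDeltaSeconds;
        int steps = 0;
        while (accumulator_ >= step_) {
            accumulator_ -= step_;
            ++steps;
        }
        return steps;
    }

    double step() const { return step_; }

private:
    double step_;
    double accumulator_ = 0.0;
};

struct InputState { bool forward = false; }; // polled once per update

struct Player {
    double z = 0.0;
    double speed = 2.0; // units per second
};

// Movement is scaled by dt, so the distance covered depends on elapsed time,
// not on how many frames were rendered.
inline void updatePlayer(Player& player, const InputState& input, double dt) {
    if (input.forward) player.z -= player.speed * dt;
}
```

In the main loop you would measure the frame time (for example with std::chrono::steady_clock), call advance with it, and then call updatePlayer once per returned step with clock.step() as dt.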
-----------------------------------
I have built such systems more than once in the past and still do so as part of my job in the games industry, both using complete systems and making my own from scratch with just OS APIs and nothing else. There are plenty of other things to know and take into account, like asset loading/management and HDD optimizations of data packages, multi-threading, and networking up to encryption. Since this is a beginner question, everything here is written as basically as possible; game engine architecture is a whole engineering discipline and needs a lot of research for more general and complex systems.
Keep learning and you will get there.