To get the best rendering speed, you have to optimize at every stage. For example:
Models are uploaded to the GPU's memory once and drawn directly from there, so each frame there is nothing to send except the draw command.
Polygons are sorted by state to reduce the number of times that drawing has to stop so states can be changed (rendering modes, material settings, textures, etc.).
Textures and tiles are atlased into one big image to reduce texture state changes.
There are many other techniques, but I can't rewrite the whole book in a post.
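To make the state-sorting point concrete, here is a rough sketch in plain Python (no graphics API involved). The `DrawItem` fields and the names `state_key`, `count_state_changes` and `sort_by_state` are made up for illustration. A real renderer would have its own notion of what counts as "state".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DrawItem:
    # Hypothetical render-state fields; a real engine tracks many more.
    shader: str
    texture: str
    mesh: str

def state_key(item):
    """The part of an item that forces a state change when it differs."""
    return (item.shader, item.texture)

def count_state_changes(items):
    """Count how often the state differs from the previously drawn item."""
    changes = 0
    current = None
    for item in items:
        key = state_key(item)
        if key != current:
            changes += 1
            current = key
    return changes

def sort_by_state(items):
    """Group items with identical state next to each other."""
    return sorted(items, key=state_key)

items = [
    DrawItem("lit", "grass", "tile_a"),
    DrawItem("lit", "stone", "tile_b"),
    DrawItem("lit", "grass", "tile_c"),
    DrawItem("unlit", "grass", "tile_d"),
    DrawItem("lit", "stone", "tile_e"),
]

print(count_state_changes(items))                 # submission order
print(count_state_changes(sort_by_state(items)))  # after sorting by state
```

With everything on one atlas texture, the texture part of the key is the same for every tile, which is why atlasing cuts down on state changes.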
So you could load all your tiles onto one big texture, then create a static model of your map and upload it to the card. Then you only need to make one draw call to render the map. If needed, you can split the mesh up into smaller meshes. Like you said, you have 4000 tiles. Maybe split it up by 5 or 10 to get smaller renderable chunks.
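Here is a minimal sketch of building that static mesh on the CPU, again in plain Python. It assumes a grid-shaped atlas where tile IDs run left to right, top to bottom, and it assumes an 80x50 map just to get 4000 tiles. The layout (x, y, u, v per vertex) and the names `atlas_uv`, `build_chunk_mesh` and `build_chunks` are my own. The resulting vertex and index lists are what you would upload once to a vertex/index buffer with whatever API you use.

```python
def atlas_uv(tile_id, atlas_cols, atlas_rows):
    """Return (u0, v0, u1, v1) for a tile in a grid atlas, in 0..1 coordinates."""
    col = tile_id % atlas_cols
    row = tile_id // atlas_cols
    u0 = col / atlas_cols
    v0 = row / atlas_rows
    return (u0, v0, u0 + 1 / atlas_cols, v0 + 1 / atlas_rows)

def build_chunk_mesh(tile_map, x0, y0, x1, y1, atlas_cols, atlas_rows):
    """Build one quad (4 vertices, 6 indices) per tile in the given rectangle."""
    vertices = []  # flat list: x, y, u, v per vertex
    indices = []   # two triangles per tile
    for y in range(y0, y1):
        for x in range(x0, x1):
            u0, v0, u1, v1 = atlas_uv(tile_map[y][x], atlas_cols, atlas_rows)
            base = len(vertices) // 4  # index of this quad's first vertex
            vertices += [
                x,     y,     u0, v0,
                x + 1, y,     u1, v0,
                x + 1, y + 1, u1, v1,
                x,     y + 1, u0, v1,
            ]
            indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return vertices, indices

def build_chunks(tile_map, chunk_size, atlas_cols, atlas_rows):
    """Split the map into square chunks, one mesh (one draw call) per chunk."""
    height = len(tile_map)
    width = len(tile_map[0])
    chunks = {}
    for cy in range(0, height, chunk_size):
        for cx in range(0, width, chunk_size):
            chunks[(cx, cy)] = build_chunk_mesh(
                tile_map, cx, cy,
                min(cx + chunk_size, width), min(cy + chunk_size, height),
                atlas_cols, atlas_rows,
            )
    return chunks

# Assumed example data: 80 x 50 = 4000 tiles, 16 tile types on a 4x4 atlas.
tile_map = [[(x + y) % 16 for x in range(80)] for y in range(50)]
chunks = build_chunks(tile_map, chunk_size=10, atlas_cols=4, atlas_rows=4)
print(len(chunks), "chunks instead of", 80 * 50, "per-tile draws")
```

Chunking like this also lets you skip drawing chunks that are completely off screen, at the cost of a few more draw calls than a single big mesh.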
I'm guessing you are drawing your tiles one by one?