Well, it seems like you're iterating "row" from 0 to width, and not from 0 to height... and the converse mistake with "col". Unless your graphics data and screen buffer are column-major, and you're deliberately rendering rotated 90 degrees from the screen's natural orientation?
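For reference, here's roughly what I'd expect the loop bounds to look like for a row-major buffer. This is only a sketch: `SCREEN_WIDTH`, `SCREEN_HEIGHT` and `fill_screen` are names I made up, not anything from your code.

```c
#include <stdint.h>

/* Hypothetical dimensions, deliberately non-square so a swapped
   bound would show up as an under-filled or overrun buffer. */
#define SCREEN_WIDTH  8
#define SCREEN_HEIGHT 4

/* Fill a row-major buffer: "row" runs over the height,
   "col" runs over the width. */
void fill_screen(uint16_t *buf, uint16_t color)
{
    for (int row = 0; row < SCREEN_HEIGHT; ++row) {
        for (int col = 0; col < SCREEN_WIDTH; ++col) {
            buf[row * SCREEN_WIDTH + col] = color;
        }
    }
}
```

If your data really is column-major, the bounds swap and the index becomes `col * SCREEN_HEIGHT + row` instead.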
If you're on a tiny microcontroller like this, why are you burning RAM by copying image data from one "bitmap" area in EEPROM to many "sprite" areas on the RAM heap at runtime? Is EEPROM access significantly slower than RAM access on this platform?
In your DrawSprite method, you only need to calculate the *work pointer once per method call, not once per pixel.
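Something along these lines is what I mean. I haven't seen your full DrawSprite, so the signature and every name here are my own guesses, and this sketch assumes a row-major screen and does no clipping.

```c
#include <stdint.h>

/* Hypothetical signature: copies a sprite_w x sprite_h sprite into a
   row-major screen buffer that is screen_w pixels wide. No clipping. */
void draw_sprite(uint16_t *screen, int screen_w,
                 const uint16_t *sprite, int sprite_w, int sprite_h,
                 int x, int y)
{
    /* Computed once per call, not once per pixel. */
    uint16_t *work = screen + y * screen_w + x;

    for (int row = 0; row < sprite_h; ++row) {
        for (int col = 0; col < sprite_w; ++col) {
            work[col] = sprite[row * sprite_w + col];
        }
        work += screen_w;  /* step down to the next screen row */
    }
}
```

After the initial calculation, the pointer only ever gets bumped by one screen row per sprite row, so there's no multiply in the inner loop.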
And last, if that processor has any kind of CPU cache and cache lines are wider than a UINT16, you will very likely get much better performance if you make Y coordinates your outer loop; currently they're your inner loop.
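To see why the loop order matters, here's a little sketch that works out which buffer offset each loop order touches on its n-th iteration. It assumes a row-major buffer; the function names are invented purely for illustration.

```c
#include <stddef.h>

/* Offset (in pixels) touched on iteration n of a row-major buffer
   that is w pixels wide, when Y is the OUTER loop. */
size_t offset_y_outer(size_t n, size_t w)
{
    size_t y = n / w;
    size_t x = n % w;
    return y * w + x;
}

/* Same, but with X as the outer loop and Y as the inner loop,
   for a buffer that is w pixels wide and h pixels tall. */
size_t offset_x_outer(size_t n, size_t w, size_t h)
{
    size_t x = n / h;
    size_t y = n % h;
    return y * w + x;
}
```

With Y outermost, consecutive iterations touch adjacent pixels, so several of them can share a cache line. With X outermost, each step jumps a whole row (`w` pixels) ahead, which on a wide screen likely lands in a different cache line every time.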