Try glMapBufferRange instead; its access flags (for example GL_MAP_INVALIDATE_BUFFER_BIT and GL_MAP_UNSYNCHRONIZED_BIT) give you finer-grained control over synchronization.
Alternatively, try a regular glTexSubImage without the PBO. The real key to performance here is making the format and type parameters match the internal storage format of the texture, so that the driver can do a DMA transfer without going through any intermediate software conversion. Commonly that means using GL_BGRA for format and either GL_UNSIGNED_BYTE or GL_UNSIGNED_INT_8_8_8_8_REV for type (your mileage may vary depending on your GL implementation). Whatever else you do, never use GL_RGB for format, even if you think it's saving memory. See http://www.opengl.org/wiki/Common_Mistakes#Texture_upload_and_pixel_reads and http://www.opengl.org/wiki/Common_Mistakes#Image_precision for more info on this.
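If your source pixels are tightly packed 24-bit RGB, one way to follow the advice above is to expand them to 4-byte BGRA on the CPU before uploading. Here is a minimal sketch in C, assuming 8 bits per channel and no row padding. The helper name rgb_to_bgra is made up for illustration, and the GL call is shown only in a comment because it needs a current context and a bound texture:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand tightly packed 8-bit RGB pixels into
 * 4-byte pixels laid out in memory as B, G, R, A. That byte order is
 * what format GL_BGRA with type GL_UNSIGNED_BYTE describes.
 * dst must have room for pixel_count * 4 bytes. */
void rgb_to_bgra(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[4 * i + 0] = src[3 * i + 2]; /* blue  */
        dst[4 * i + 1] = src[3 * i + 1]; /* green */
        dst[4 * i + 2] = src[3 * i + 0]; /* red   */
        dst[4 * i + 3] = 0xFF;           /* opaque alpha (an assumption) */
    }
}

/* Upload, assuming a GL_TEXTURE_2D with GL_RGBA8 storage is bound:
 *
 *   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
 *                   GL_BGRA, GL_UNSIGNED_BYTE, bgra_pixels);
 *
 * On little-endian machines GL_UNSIGNED_INT_8_8_8_8_REV describes the
 * same byte layout, so it can be tried as the type as well.
 */
```

Whether this beats handing the driver RGB data and letting it convert depends on your implementation, so measure both.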
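A plain glTexSubImage upload with matching parameters may well be more than fast enough without the PBO. If you have (or are willing to use) GL 4.3, you can also consider glInvalidateTexSubImage, which tells the driver that the old contents of a region are no longer needed, to get even more control.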