You are lucky if you can fix your window overlay grid dimensions as compile-time constants.
Then you can do pointer arithmetic on a fixed-size map array (i.e. ptr+1, ptr-1, ptr+span, ptr+span+1, ptr-span+1, etc.). For example, to access the usual 8 neighbors when gathering open-list candidates, you can use a single index offset per neighbor (an array of the 8 offsets and a for loop) if you make your map a 1-dimensional array.
Use a fixed-size heap queue for the open list (again using index math to walk the nodes of the tree -- a systematic power-of-2 progression for the binary tree). Max depth shouldn't need to be sized for the worst case, as entries can fall off the bottom, and IF they ever become candidates again they can simply be re-added to the open list.
Use flags ON the window A* grid map (overlay) nodes for your closed list.
If you are still using linked lists, then absolutely use node pools instead of paying costly allocate/deallocate overhead.
Crush your map/open-list data down as much as possible to maximize cache hits.
(A 300-wide coordinate range calls for 16-bit ints...) Use bitmasks for flags, and avoid floats for weight values if you can -- try to use bytes/int16s instead.
All needed A* data gets extracted from the real-world map this 300x300 window is overlaid on (yanking it from the chunks at that time).
Eliminate the if-then X/Y edge-of-map tests done when gathering open-list candidates by oversizing the map by +2 and pre-marking the boundary lines as CLOSED (flagged) -- the map edges will then be excluded as part of the normal closed-node test.
Finish the current pathfinding task -- don't timeslice multiple A* pathfinds.
If precalc'd edge-cost values can be stored in the real-world map nodes, then do that (memory is cheap), and make them contiguous so they can be copied/extracted in one block into the A* working data.
IF additional map data needs to be extracted/examined BUT the number of nodes it's required for is largely culled down, THEN that data doesn't need to be pre-extracted into the A* working data (less initial copying AND a smaller A* working set...).
Figure out how to maximize reuse of the same A* working data across multiple objects.
Something rarely included in quoted A* speed figures is the setup time for the working data you will use (interpreting/extracting from the world map, and resetting between A* runs).