If your problem is detecting known patterns in an unknown image and doing something when a moving window touches them, the simplest and most efficient approach is to preprocess the image: find the cars, people, etc. once and build a list of object positions and types. Then, as you display the image with scrolling, you can transform the "triggers" from their fixed screen locations to variable image locations and compare them with the object positions to see whether they hit.
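A minimal sketch of that transform-and-compare step, in Python. The names `DetectedObject`, `Trigger`, `trigger_in_image` and `hits` are hypothetical. It assumes that each object is reduced to a single reference point, that triggers are axis-aligned screen rectangles, and that the scroll convention is screen = scroll + image:

```python
from typing import List, NamedTuple


class DetectedObject(NamedTuple):
    kind: str  # e.g. "car", found once during preprocessing
    x: int     # reference point, image coordinates
    y: int


class Trigger(NamedTuple):
    left: int    # rectangle fixed on the screen
    top: int
    right: int
    bottom: int


def trigger_in_image(trigger: Trigger, dx: int, dy: int) -> Trigger:
    """Move a screen-space trigger into image space (assumes screen = scroll + image)."""
    return Trigger(trigger.left - dx, trigger.top - dy,
                   trigger.right - dx, trigger.bottom - dy)


def hits(objects: List[DetectedObject], trigger: Trigger,
         dx: int, dy: int) -> List[DetectedObject]:
    """Return the objects whose reference point lies inside the trigger."""
    t = trigger_in_image(trigger, dx, dy)
    return [o for o in objects
            if t.left <= o.x <= t.right and t.top <= o.y <= t.bottom]
```

This scans every object on every frame, which is fine for a short list; a sorted list or spatial index pays off once there are many objects.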
1D example: cars at x=4 and x=150 in the image (the location of some conventional reference point, say the front wheel hub). If a car scrolls past screen x=90, a collision-imminent alarm begins. The scrolling amount is D, with screen x = D + image x. The first car triggers when you scroll to D=86, which can fall in the middle of a move (e.g. D decreasing from 88 to 85), while the second triggers at D=-60. Since you know D, the trigger sits at image x = 90 - D, so you can search a sorted list of object locations for the items that fall in the interval the trigger swept between the previous and current frame.
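The sorted-list search can be sketched with Python's standard `bisect` module. The function name `crossed` and its parameters are hypothetical, and the sketch assumes the object list holds plain image x positions already sorted in ascending order:

```python
from bisect import bisect_left, bisect_right
from typing import List


def crossed(sorted_xs: List[int], trigger_screen_x: int,
            d_prev: int, d_cur: int) -> List[int]:
    """Return the image x positions the trigger swept over between two frames."""
    # screen x = D + image x, so the trigger sits at image x = trigger - D.
    a = trigger_screen_x - d_prev
    b = trigger_screen_x - d_cur
    lo, hi = min(a, b), max(a, b)  # works for either scroll direction
    start = bisect_left(sorted_xs, lo)   # first position >= lo
    end = bisect_right(sorted_xs, hi)    # one past the last position <= hi
    return sorted_xs[start:end]


cars = [4, 150]
print(crossed(cars, 90, 88, 85))    # D moves 88 -> 85, sweeping image x 2..5
print(crossed(cars, 90, -58, -61))  # D moves -58 -> -61, sweeping image x 148..151
```

The interval here is closed at both ends, so an object sitting exactly on a frame boundary would be reported in two consecutive frames; making one end exclusive avoids the double trigger.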