Grab an HTML table off a web browser
One of my joys at work is developing tools to automate my own tasks and thereby improve my productivity. One such task involves grabbing a table from an HTML document. I'm fairly well versed in the Windows API methods of manipulating another program, and while I feel like I could programmatically control the buttons, edit boxes, and text labels of a web browser (the stuff at the top), I wouldn't know how to grab the table in the web page area itself. The table is larger than the screen, so you have to scroll to read it all.
Suggestions?
If the web browser is IE, you should be able to do this pretty easily. If you have a handle to the window, you can get a pointer to an IHTMLDocument2 object from it. You then have full DOM access to the webpage and can do whatever you want with it. We do something very similar in one of our apps, except that instead of targeting IE itself, we target the instances of IE that Outlook uses to display its email messages.
Have a look at this KB article: http://support.microsoft.com/kb/q249232/.
You can just write your own client to request the webpage. I know how to do this through Python, but Ruby and Perl should be very viable options as well. Once you have the webpage data, then it’s a matter of parsing it as necessary and extracting the relevant content out of it.
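To make that concrete, here is a rough sketch of the fetch-and-parse approach in Python, using only the standard library. The names `TableExtractor`, `parse_tables`, and `fetch_tables` are made up for this example. It assumes the table is present in the HTML the server sends back (not built by script after loading, and not behind a login), that cells and rows have explicit closing tags, and that tables are not nested.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableExtractor(HTMLParser):
    """Collects the text of every <table> as a list of rows of cell strings.

    Hypothetical helper for illustration: assumes explicit </td>, </th> and
    </tr> closing tags and does not handle nested tables.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []    # finished tables
        self._table = None  # rows of the table currently being read
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._table.append(self._row)
            self._row = self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = self._row = self._cell = None


def parse_tables(html):
    """Return every table found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


def fetch_tables(url):
    """Download a page and return its tables (performs a network request)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return parse_tables(html)
```

You would call `fetch_tables()` with the address of the page you normally open in the browser, then pick the table you want out of the returned list by index. Since the whole document is downloaded, the table being taller than the screen doesn't matter. For messy real-world HTML a dedicated parsing library is probably more forgiving than this sketch.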