
Reading through a web page in C/C++

This topic is 4095 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.



Just wondering how to open a web page as a file. I didn't have this sort of trouble in PHP, but C++ is a whole different story, as I can't seem to get beyond the files on my computer. Point me to some docs or libs I can read through, if possible. Thanks in advance.

Guest Anonymous Poster
Quote:
Original post by ilavos
Just wondering how to open a web page as a file. I didn't have this sort of trouble in PHP, but C++ is a whole different story, as I can't seem to get beyond the files on my computer. Point me to some docs or libs I can read through, if possible. Thanks in advance.


1) Download the web page from its URL using either native sockets or a wrapper such as libCURL or, on Windows, the MSIE components.

2) Either save the contents to a temporary file or work directly with the buffer.
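As a rough illustration of "working directly with the buffer", the sketch below assumes the page has already been downloaded into a `std::string` by whatever method you chose. The function name `read_lines` is made up for this example. Wrapping the buffer in a `std::istringstream` lets you read it with the same `std::getline` loop you would use on a file stream.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split an in-memory page into lines, the same way one would loop over
// std::getline on a std::ifstream. `page` is assumed to already hold the
// downloaded bytes; how it was fetched is outside this sketch.
std::vector<std::string> read_lines(const std::string& page) {
    std::istringstream in(page);  // behaves like a read-only file stream
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        // Web content often uses CRLF line endings; drop the trailing '\r'.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        lines.push_back(line);
    }
    return lines;
}
```

Calling `read_lines(buffer)` on the downloaded data gives one string per line, with no temporary file involved.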

I don't think the AP was giving you two options. I think s/he was suggesting you do 1), then 2). You can't open a file with standard C or C++ file operations with a web address - the file needs to be in a location that your computer sees as a drive, such as your local hard drive or a mapped network drive.

I think the AP is suggesting that you need to download the webpage onto such a drive and then deal with it as a normal file.

I've done some website crawling code before. Basically you can do one of two things:

1) Using standard Winsock, you can create a simple HTTP client that connects to an IP address and receives the incoming data, and then do with that data what you like.
2) What I did was use the Microsoft IE OCX (ActiveX) control. I created an instance and used the "navigate" function to load a web page. Done.
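For the first option, the HTTP side of a "simple HTTP client" comes down to sending a few lines of text and splitting what comes back. The sketch below leaves out the socket calls (connect, send, recv) and covers only the text handling. The names `build_get_request`, `HttpResponse` and `split_response` are invented for illustration. It sends an HTTP/1.0 request on the assumption that you want to avoid chunked transfer encoding, which a server may use when answering HTTP/1.1 requests.

```cpp
#include <string>

// Build a minimal HTTP/1.0 GET request. The Host header is included because
// many servers host several sites on one IP address and need it to choose.
std::string build_get_request(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.0\r\n"
           "Host: " + host + "\r\n"
           "\r\n";  // an empty line ends the request headers
}

// Hypothetical holder for the two parts of a raw response.
struct HttpResponse {
    std::string headers;  // status line plus header lines
    std::string body;     // the page itself
};

// Split everything received from the socket at the first blank line.
// If no blank line is found, the whole input is treated as headers.
HttpResponse split_response(const std::string& raw) {
    HttpResponse r;
    const std::string::size_type pos = raw.find("\r\n\r\n");
    if (pos == std::string::npos) {
        r.headers = raw;
        return r;
    }
    r.headers = raw.substr(0, pos);
    r.body = raw.substr(pos + 4);  // skip the "\r\n\r\n" separator
    return r;
}
```

You would send the result of `build_get_request` over the connected socket, append every chunk you receive to one string until the server closes the connection, and then pass that string to `split_response`.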

The second option gives you an advantage in that you can then use the IE control to traverse the data with objects. (I use it non-visually, but there are some quirks to that, so typically you can just make it 0 pixels wide and tall.) IE will also allow you to pump in raw JavaScript, which is nice for form filling (of course, you can also use the actual HTML interface in the control to set the values manually).

I used the TWebBrowser component in Delphi (which is a wrapper to the IE ocx), but I'm very sure the same can be done in pure C++.

There are more efficient options available in C++, such as InternetReadFile, libCURL, and so on. In fact, using manual sockets and a trivial HTTP request implementation is probably easier, simply because accessing ActiveX/COM stuff from C++ is a bit of a nightmare.

Guest Anonymous Poster
Quote:
Original post by EasilyConfused
I don't think the AP was giving you two options. I think s/he was suggesting you do 1), then 2). You can't open a file with standard C or C++ file operations with a web address - the file needs to be in a location that your computer sees as a drive, such as your local hard drive or a mapped network drive.

I think the AP is suggesting that you need to download the webpage onto such a drive and then deal with it as a normal file.


The AP in fact suggested using whatever method the OP is familiar with to download the HTML page from the web (for example plain sockets, libCURL, MSIE or even ACE) and then either saving the contents to a temporary file or working DIRECTLY with the data in the buffer, if that's feasible. You don't necessarily have to save downloaded data to a file in order to process it further.
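If you do prefer the temporary-file route, a minimal sketch could look like the following. It assumes the downloaded data is already in a `std::string`; the helper names `save_page` and `load_page` are made up for this example. Binary mode is used so the bytes are written exactly as received.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Write the downloaded bytes to `path`. Returns false if the file could not
// be opened or the write failed.
bool save_page(const std::string& path, const std::string& data) {
    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return out.good();
}

// Read a whole file back into memory. Returns an empty string if the file
// cannot be opened.
std::string load_page(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::ostringstream buf;
    buf << in.rdbuf();  // copy the entire stream into the string buffer
    return buf.str();
}
```

Once saved, the file can be opened with `std::ifstream` or `fopen` like any other local file, which is the "normal file" handling discussed above.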



Guest Anonymous Poster
So I went with ApochPiQ's suggestion of using the wininet.h functions. Now get this, there's a syntax error inside this header file! Is this a known problem or what? I'm using MSVC++ 6.0.

Where did you get your copy of the headers from? What is the error you're getting when you compile? Are you sure you didn't do something like include foo.h before wininet.h, and leave off a semicolon at the end of a class definition inside foo.h?


You should note that VC6 is extremely out of date, and using it to write new software is morally tantamount to cutting off your own legs. And arms. And maybe poking out an eye or two.

I would very highly recommend you switch to VC2005 (the Express Edition is free) if your project allows.

Guest Anonymous Poster
It was a missing semicolon error inside the header; it compiled fine in Dev-C++ though. I'll take a look at the 2005 edition. Thanks for your help, I owe you one :)

