save a website
I am interested in saving a page, plus all the related pages that the page links to (but not ads, or really any off-site page). Is there freeware to do this? I'm posting this in General Programming because I imagine it could probably be done in VB pretty easily, and if anyone can offer me advice in that regard I'd be interested.
It should be relatively easy to do in code. I've done something similar-ish in C++ (I know it's not really suited to it, though). You just need to fetch the requested page, then parse the HTML and collect everything it links to (I only parsed <a href=""> tags, but you'd need to handle img and link tags too, at least). If a link points to a domain other than the original page's, you can assume it's an ad or some external link and ignore it.
It should be easy to do roughly, but it'd be more difficult if you want to cover all sorts of edge cases.
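To make the idea above concrete, here's a minimal sketch in Python (the posters above used C++/VB, but the approach is the same): parse one page's HTML, resolve relative URLs, and keep only targets on the same host as the start page. The class and function names are just illustrative, and this only handles the <a>/<img>/<link> tags mentioned above; a real tool would cover scripts, CSS references, edge cases, etc.

```python
# Sketch: collect same-site link targets from one HTML page.
# Assumption: anything on a different host than the start page
# is an ad or external link, so we skip it.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects same-site link targets from a single HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc
        self.links = []

    def handle_starttag(self, tag, attrs):
        # <a> and <link> use href; <img> uses src.
        attr_name = {"a": "href", "link": "href", "img": "src"}.get(tag)
        if attr_name is None:
            return
        target = dict(attrs).get(attr_name)
        if not target:
            return
        absolute = urljoin(self.base_url, target)  # resolve relative URLs
        if urlparse(absolute).netloc == self.base_host:
            self.links.append(absolute)  # same site: keep it
        # different host: assume ad/external, ignore it


def same_site_links(html, base_url):
    """Return the same-site URLs referenced by one page's HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

To turn this into a full site-saver you'd wrap it in a standard breadth-first crawl: fetch the start page (e.g. with urllib.request), save it, queue any same-site links you haven't seen yet, and repeat until the queue is empty.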
Quote: Original post by ddyer
There are many such tools out there. I've used one called "Httrack"
Seconded.
Quote: Original post by discman1028
Quote: Original post by ddyer
There are many such tools out there. I've used one called "Httrack"
Seconded.
Thirded. I use it every time I want to save a website, and it has great flexibility in limiting the crawling to the specified sites.
There is also a GUI version, WinHttrack.
http://www.httrack.com/
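For reference, a typical command-line invocation looks something like the following (example.com is a placeholder; -O sets the output directory, a "+" pattern is a filter that keeps the crawl on the given domain, and -v is verbose). Check the HTTrack documentation for the exact filter syntax for your site.

```shell
# Mirror a site into ./mirror, staying on the original domain.
httrack "https://www.example.com/" -O "./mirror" "+*.example.com/*" -v
```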