save a website

4 comments, last by LorenzoGatti 15 years, 12 months ago
I am interested in saving a page, along with all the related pages it links to (but not ads, or really any off-site page). Is there freeware to do this? I'm posting this in General Programming because I imagine it could probably be done in VB pretty easily, and if anyone can offer me advice in that regard I'd be interested.
There are many such tools out there. I've used one called "Httrack"

---visit my game site http://www.boardspace.net - free online strategy games

It should be relatively easy to do in code. I've done something similar in C++ (even though it's not really the best-suited language for it). You just fetch the requested page, then parse the HTML and collect everything it links to (I only parsed <a href=""> tags, but you'd need to handle img and link tags too, at the very least). If a link points to a domain outside the original page's, you can assume it's an ad or some other external link and ignore it.

It's easy to do roughly, but it gets harder if you want to cover all the edge cases.
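
If you want to roll it yourself along those lines, here's a minimal sketch in C++, assuming libcurl for the HTTP fetch and a simple regex for link extraction. www.example.com is just a placeholder, and a real crawler would also want a proper HTML parser, relative-URL resolution, and recursion into the pages it collects:

// Rough sketch: fetch one page, pull out its links, drop off-site ones.
#include <curl/curl.h>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// libcurl write callback: append received data to a std::string.
static size_t writeToString(char* data, size_t size, size_t nmemb, void* userdata)
{
    static_cast<std::string*>(userdata)->append(data, size * nmemb);
    return size * nmemb;
}

// Download a single URL into a string.
static std::string fetchPage(const std::string& url)
{
    std::string body;
    CURL* curl = curl_easy_init();
    if (curl)
    {
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeToString);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
        curl_easy_perform(curl);
        curl_easy_cleanup(curl);
    }
    return body;
}

int main()
{
    curl_global_init(CURL_GLOBAL_DEFAULT);

    const std::string startUrl = "http://www.example.com/"; // placeholder start page
    const std::string siteHost = "www.example.com";         // only keep links on this host

    std::string html = fetchPage(startUrl);

    // Pull out the targets of href="..." and src="..." attributes
    // (covers <a>, <img> and <link> tags well enough for a sketch).
    std::regex linkRe("(?:href|src)\\s*=\\s*\"([^\"]+)\"", std::regex::icase);
    std::vector<std::string> toSave;

    for (std::sregex_iterator it(html.begin(), html.end(), linkRe), end; it != end; ++it)
    {
        std::string target = (*it)[1].str();
        // Absolute URLs pointing at another host are treated as ads/off-site and skipped.
        if (target.rfind("http", 0) == 0 && target.find(siteHost) == std::string::npos)
            continue;
        toSave.push_back(target);
    }

    // A full version would fetch these, write them to disk and crawl them in turn.
    for (const std::string& t : toSave)
        std::cout << t << "\n";

    curl_global_cleanup();
    return 0;
}

Build it against libcurl (e.g. g++ crawler.cpp -lcurl).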
Linux/UNIX has a tool called 'wget'. A Windows version is available as well. Good luck.
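
For example, an invocation along these lines should mirror a site and stay on the original host (wget doesn't follow links to other hosts during a recursive download unless you ask it to):

wget --mirror --convert-links --page-requisites --no-parent http://www.example.com/

--convert-links rewrites the links in the saved pages so they work locally, and --page-requisites grabs the images and stylesheets the pages need. The URL is just a placeholder for the site you want to save.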
Quote:Original post by ddyer
There are many such tools out there. I've used one called "Httrack"


Seconded.
--== discman1028 ==--
Quote:Original post by discman1028
Quote:Original post by ddyer
There are many such tools out there. I've used one called "Httrack"


Seconded.

Thirded. I use it every time I want to save a website, and it has great flexibility in limiting the crawling to the specified sites.
There is also a GUI version, WinHttrack.

http://www.httrack.com/
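
If you'd rather drive it from the command line, something along these lines should mirror a site while keeping the crawl on it (the URL, output path and filter here are placeholders; the "+" filter is how HTTrack's scan rules limit which URLs get followed):

httrack "http://www.example.com/" -O /path/to/mirror "+*.example.com/*" -v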

Omae Wa Mou Shindeiru

This topic is closed to new replies.
