First Step

I have been working on this little web crawler, and at this point it gets all of the absolute links from the given website (doing relative links will be a pain, ugh...). My next task is to make it keep going after finding all of the links on the first page. That will be easy enough. But making it check whether it's been to a page already will surely slow things down a ton, especially after it's been to a few thousand sites. But oh well, I'm just trying to get a rudimentary version completed.
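
For what it's worth, here is a rough sketch of how the two worries above (relative links and the "have I been here already" check) might be handled. It is written in Python purely for illustration and is not the crawler's actual code; the names `LinkParser`, `extract_links`, `crawl`, `fetch`, and `max_pages` are all made up for this example. It assumes the standard library's `urljoin` is acceptable for resolving relative links, and that a hash set is used for visited pages, in which case the lookup takes roughly constant time on average, so the check may cost far less than feared.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin leaves absolute URLs alone and resolves relative ones.
                absolute = urljoin(self.base_url, value)
                # Drop any "#fragment" so the same page is not queued twice.
                absolute, _ = urldefrag(absolute)
                # Skip non-web schemes such as mailto: or javascript:.
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def extract_links(page_url, html):
    """Return every http(s) link found in `html` as an absolute URL."""
    parser = LinkParser(page_url)
    parser.feed(html)
    return parser.links


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl starting at `start_url`.

    `fetch` is a hypothetical callable that takes a URL and returns the
    page's HTML as a string, or None if the page could not be loaded.
    """
    visited = set()              # hash set: membership test is O(1) on average
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue             # already crawled, skip it
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        for link in extract_links(url, html):
            if link not in visited:
                queue.append(link)
    return visited
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library, and it also means the loop can be tried against a plain dictionary of fake pages before pointing it at a real site.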

Here is a pic of what I have so far. Note that the list of URLs is not the list of URLs that have been visited; they are the ones that were found on the first page. That will change shortly, as this is just for debug purposes.