TheComet

urllib2 - Website loads fine in browser but not in Python


I'm trying to download a page from this website: http://www.digikey.de

If I click on the link from within Firefox, it loads fine.

 

If I try to download the page using urllib2, I keep getting this message:

    <span id="ctl00_mainContentPlaceHolder_lblInvalidRequest"><H2>There was a problem with your request.</H2>

    We are unable to process your request.<br/> Please return to the previous page to try again or contact <a href="mailto:webmaster@digikey.com?subject=Incident Number: 18&#46;ce969d50&#46;1404818243&#46;20b8e71">Digi-Key Webmaster</a> if you feel that you have received this message in error. Please reference the following incident number so we may assist you with this error.
    <br/><br/

Here's the code:

import urllib2
import sys

if __name__ == '__main__':

    # optional proxy
    if len(sys.argv) > 1:
        proxy = {'http': str(sys.argv[1])}
        proxy = urllib2.ProxyHandler(proxy)
        opener = urllib2.build_opener(proxy)
        urllib2.install_opener(opener)

    html = urllib2.urlopen('http://www.digikey.de').read()
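    error = html.find('request')
    if error != -1:
        # Clamp the start index: a match within the first 400 characters
        # would otherwise give a negative start and slice from the end.
        print html[max(error - 400, 0):error + 400]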
    else:
        print 'success'

Can anyone explain to me why it's doing this and how I can fix it?


In my case the web server was rejecting the default User-Agent header that urllib2 sends. Try changing it:

opener.addheaders = [('User-agent', 'Mozilla/5.0')]
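
Note that in the original script `opener` only exists when a proxy argument is given, so the line above needs an opener to attach to. A minimal sketch of one way to wire it up, assuming a hypothetical helper named `make_opener` (not part of urllib2) and an import fallback so the same code also runs on Python 3, where the module is `urllib.request`:

```python
try:
    import urllib2 as urlrequest          # Python 2, as used in this thread
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent module


def make_opener(user_agent='Mozilla/5.0', proxy_url=None):
    """Build an opener that sends a custom User-Agent (hypothetical helper)."""
    handlers = []
    if proxy_url:
        # Same optional HTTP proxy as in the original script.
        handlers.append(urlrequest.ProxyHandler({'http': proxy_url}))
    opener = urlrequest.build_opener(*handlers)
    # Replace the default User-agent entry that every new opener carries.
    opener.addheaders = [('User-agent', user_agent)]
    return opener


opener = make_opener()
# Make plain urlopen() calls go through this opener as well.
urlrequest.install_opener(opener)
```

After `install_opener`, a call such as `urlrequest.urlopen('http://www.digikey.de').read()` should go out with the new header. Whether a given site accepts `Mozilla/5.0` is up to that site, so treat the value as a starting point.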


Yes, that worked wonderfully. Thanks!

For future reference, how did you diagnose the problem?

Since the request worked in my browser, I tried to match the browser's headers exactly. I enabled debug output on the opener to see what urllib2 was sending:

opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
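
For anyone repeating this diagnosis, here is a rough sketch of the same idea. The helper names `make_debug_opener` and `default_user_agent` are made up for illustration, and the import fallback is only there so the code also runs on Python 3. It makes no network request by itself; it only builds the opener and reports the User-Agent that a fresh opener would send.

```python
try:
    import urllib2 as urlrequest          # Python 2, as used in this thread
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent module


def make_debug_opener():
    """Opener whose HTTP(S) handlers print the raw request and reply headers."""
    handlers = [urlrequest.HTTPHandler(debuglevel=1)]
    # HTTPSHandler is only defined when Python was built with SSL support.
    if hasattr(urlrequest, 'HTTPSHandler'):
        handlers.append(urlrequest.HTTPSHandler(debuglevel=1))
    return urlrequest.build_opener(*handlers)


def default_user_agent():
    """Return the User-Agent a fresh opener sends when none has been set."""
    return dict(urlrequest.build_opener().addheaders)['User-agent']


debug_opener = make_debug_opener()
# Prints a string starting with "Python-urllib/" followed by a version.
print(default_user_agent())
```

Calling `debug_opener.open('http://www.digikey.de')` then prints the outgoing request (lines starting with `send:`) and the reply headers to stdout, which can be compared against what the browser's network inspector shows for the same URL.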
