urllib2 - Website loads fine in browser but not in Python

#1 TheComet   Members   -  Reputation: 1460

Posted 08 July 2014 - 05:20 AM

I'm trying to download a page from this website: http://www.digikey.de

If I click on the link from within Firefox, it loads fine.

If I try to download the page using urllib2, I keep getting this message:

    <span id="ctl00_mainContentPlaceHolder_lblInvalidRequest"><H2>There was a problem with your request.</H2>

    We are unable to process your request.<br/> Please return to the previous page to try again or contact <a href="mailto:webmaster@digikey.com?subject=Incident Number: 18&#46;ce969d50&#46;1404818243&#46;20b8e71">Digi-Key Webmaster</a> if you feel that you have received this message in error. Please reference the following incident number so we may assist you with this error.
    <br/><br/

Here's the code:

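import urllib2
import sys

if __name__ == '__main__':

    # optional proxy, passed as the first command-line argument
    if len(sys.argv) > 1:
        proxy_handler = urllib2.ProxyHandler({'http': sys.argv[1]})
        opener = urllib2.build_opener(proxy_handler)
        urllib2.install_opener(opener)

    html = urllib2.urlopen('http://www.digikey.de').read()

    # the error page contains the word 'request'; print the text around it
    error = html.find('request')
    if error != -1:
        # clamp the start so a match within the first 400 characters
        # does not produce a negative slice index
        print html[max(0, error - 400):error + 400]
    else:
        print 'success'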

Can anyone explain to me why it's doing this and how I can fix it?


YOUR_OPINION >/dev/null


#2 Key_46   Members   -  Reputation: 384

Posted 08 July 2014 - 09:14 AM

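In my case the web server was not accepting the default User-Agent header that urllib2 sends. Try changing it on the opener you install (create one with urllib2.build_opener() if you are not going through the proxy branch):

opener.addheaders = [('User-agent', 'Mozilla/5.0')]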
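
For reference, here is a minimal sketch of the whole flow with the header change applied. The helper names `build_opener_with_user_agent` and `fetch` are made up for illustration, and the import fallback to `urllib.request` is an assumption so that the same code also runs on Python 3; the thread itself uses plain urllib2.

```python
try:
    import urllib2  # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 fallback, same opener API


def build_opener_with_user_agent(user_agent='Mozilla/5.0', proxy_url=None):
    """Return an opener that sends `user_agent` instead of the library default."""
    handlers = []
    if proxy_url is not None:
        # optional HTTP proxy, mirroring the original script
        handlers.append(urllib2.ProxyHandler({'http': proxy_url}))
    opener = urllib2.build_opener(*handlers)
    # addheaders is a list of (name, value) tuples added to every request
    opener.addheaders = [('User-agent', user_agent)]
    return opener


def fetch(url, opener):
    """Download `url` through `opener` and return the raw response body."""
    response = opener.open(url)
    try:
        return response.read()
    finally:
        response.close()
```

Calling `fetch('http://www.digikey.de', build_opener_with_user_agent())` would then send the request with the browser-like header. Whether a particular site accepts it is up to the server.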



#3 TheComet   Members   -  Reputation: 1460

Posted 08 July 2014 - 09:39 AM

Yes, that worked wonderfully. Thanks!

For the future, how were you able to diagnose what the problem was?




#4 Key_46   Members   -  Reputation: 384

Posted 08 July 2014 - 09:46 AM

Since the request worked in my browser, I tried to match the browser's request headers exactly. I enabled verbose debug output on the opener to see what urllib2 was actually sending:

 

opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
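
To make that diagnosis repeatable, here is a small sketch that reports the User-Agent a default opener would send and builds an opener with debug output switched on. The function names are hypothetical, and the `urllib.request` fallback is an assumption for Python 3. With `debuglevel=1`, the underlying HTTP connection prints the outgoing request and the response headers to stdout when a request is made.

```python
try:
    import urllib2  # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 fallback, same opener API


def default_user_agent():
    """Return the User-agent value a freshly built opener sends by default."""
    opener = urllib2.build_opener()
    return dict(opener.addheaders).get('User-agent')


def build_debug_opener():
    """Return an opener whose HTTP(S) handlers echo requests and responses."""
    handlers = [urllib2.HTTPHandler(debuglevel=1)]
    # HTTPSHandler is only available when Python was built with SSL support
    if hasattr(urllib2, 'HTTPSHandler'):
        handlers.append(urllib2.HTTPSHandler(debuglevel=1))
    return urllib2.build_opener(*handlers)
```

Comparing the `send:` lines printed by `build_debug_opener().open(url)` with the request headers shown in the browser's network tools makes differences such as the User-Agent easy to spot.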






