urllib2 through basic auth'ed proxy

I see from googling around that this is a popular topic, but I haven't seen
anyone saying "ah, yes, that works", so here it goes.

How does one connect through a proxy which requires basic authorisation?
The following code, stolen from somewhere, fails with a 407:

import urllib2

proxy_handler = urllib2.ProxyHandler({"http": "http://the.proxy.address:3128"})
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
proxy_auth_handler.add_password(
    "The name of the realm sniffed from telnetting to the proxy and doing a get",
    'the.proxy.address', 'theusername', 'thepassword')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
urllib2.install_opener(opener)
f = urllib2.urlopen('http://www.google.com/')

I still get a 407 if I set the realm to None, or if I change the host to
the 'http://the.proxy.address/' form or even the
'http://the.proxy.address:3128' form.

The proxy is squid. Python version is 2.3.4 (I read that this version has a
problem in that it introduces an extra return after the authorisation, but
it isn't even getting to that bit). And yes, going through firefox,
everything works fine.

Can anyone explain to me why this fails, or more importantly, post code
that would work?

Thanks,
alejandro

Mar 29 '06 #1
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
[...]
How does one connect through a proxy which requires basic authorisation?
The following code, stolen from somewhere, fails with a 407:
[...code involving urllib2.ProxyBasicAuthHandler()...]
Can anyone explain me why this fails, or more importantly, code that would
work?


OK, I finally installed squid and had a look at the urllib2 proxy
basic auth support (which I've steered clear of for years despite
doing quite a bit with urllib2). Seems quite broken. Appears to have
been broken back in December 2004, with revision 38092 (note there's a
little revision number oddness in the Python SVN repo, BTW:
http://mail.python.org/pipermail/pyt.../058269.html):

--- urllib2.py (revision 38091)
+++ urllib2.py (revision 38092)
@@ -720,7 +720,10 @@
                 return self.retry_http_basic_auth(host, req, realm)

     def retry_http_basic_auth(self, host, req, realm):
-        user,pw = self.passwd.find_user_password(realm, host)
+        # TODO(jhylton): Remove the host argument? It depends on whether
+        # retry_http_basic_auth() is consider part of the public API.
+        # It probably is.
+        user, pw = self.passwd.find_user_password(realm, req.get_full_url())
         if pw is not None:
             raw = "%s:%s" % (user, pw)
....

That can't be right, can it? With a proxy, you're always
authenticating yourself for the whole proxy, so you want to look up
the password by the proxy's host, not by the URL being requested
(RFC 2617 section 3.2.1). The ProxyBasicAuthHandler subclass
dutifully passes in the right thing for the host argument, but
AbstractBasicAuthHandler ignores it, which means that it never finds
the password -- e.g. if you're trying to connect to python.org through
myproxy.com, it'll be looking for a username/password for python.org
instead of the needed myproxy.com.
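
To make the mismatch concrete, here is a minimal sketch that calls the stock password manager directly, with no network access and no handler involved. The host names are the placeholders used earlier in this thread, "squid-realm" is an assumed realm name, and the try/except import is only there so the snippet also runs on Python 3, where the same class lives in urllib.request.

```python
try:
    from urllib2 import HTTPPasswordMgr          # Python 2, as in this thread
except ImportError:
    from urllib.request import HTTPPasswordMgr   # same class in Python 3

REALM = "squid-realm"  # assumed realm name, for illustration only

mgr = HTTPPasswordMgr()
# Credentials are registered against the proxy, as the original post did.
mgr.add_password(REALM, "the.proxy.address:3128", "theusername", "thepassword")

# Lookup keyed on the proxy host: what ProxyBasicAuthHandler passes in.
by_proxy = mgr.find_user_password(REALM, "the.proxy.address:3128")

# Lookup keyed on the requested URL: what revision 38092 does instead.
by_target = mgr.find_user_password(REALM, "http://www.google.com/")

print(by_proxy)
print(by_target)
```

The first lookup returns the stored username and password; the second returns (None, None), so a handler that looks up by the requested URL never gets credentials to retry with.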

Obviously nobody else uses authenticating proxies either, or at least
nobody who can be bothered to fix urllib2 :-(

A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install):

import urllib2

class DumbProxyPasswordMgr:
    def __init__(self):
        self.user = self.passwd = None
    def add_password(self, realm, uri, user, passwd):
        self.user = user
        self.passwd = passwd
    def find_user_password(self, realm, authuri):
        return self.user, self.passwd

proxy_auth_handler = urllib2.ProxyBasicAuthHandler(DumbProxyPasswordMgr())
proxy_handler = urllib2.ProxyHandler({"http": "http://localhost:3128"})
proxy_auth_handler.add_password(None, None, 'john', 'blah')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
f = opener.open('http://python.org/')
print f.read()

Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!

I'll try to get some fixes in tomorrow so that 2.5 isn't broken (or at
least flag the issues to let somebody else fix them), but no promises
as usual...
John

Mar 30 '06 #2
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
[...]
The proxy is squid. Python version is 2.3.4 (I read that this version has a
problem in that it introduces an extra return after the authorisation, but
it isn't even getting to that bit). And yes, going through firefox,
everything works fine.

[...]

FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
but the code I posted seems to work with 2.3.4. I'm not particularly
interested in what's wrong with 2.3.4's version or your usage of it
(probably both), since bugfix releases for 2.3 are no longer
happening, I believe.

I think the Examples section of the docs on this are wrong too, though
that's a bit of a moot point when the code is as broken as it seems...
John

Mar 31 '06 #3
John J. Lee wrote:
FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
but the code I posted seems to work with 2.3.4. I'm not particularly
interested in what's wrong with 2.3.4's version or your usage of it
(probably both), since bugfix releases for 2.3 are no longer
happening, I believe.


"ah, yes, that works" on 2.3.4. Excellent. (I don't see what's so ugly
about that code, but I'm mostly accustomed to my own)
Thanks,
alejandro
Mar 31 '06 #4
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:
John J. Lee wrote:
FWIW, at a glance, Python 2.3.4 has neither of the bugs I mentioned,
but the code I posted seems to work with 2.3.4. I'm not particularly
interested in what's wrong with 2.3.4's version or your usage of it
(probably both), since bugfix releases for 2.3 are no longer
happening, I believe.


"ah, yes, that works" on 2.3.4. Excellent. (I don't see what's so ugly
about that code, but I'm mostly accustomed to my own)


Supplying a password surely shouldn't be that complicated...
John

Mar 31 '06 #5
jj*@pobox.com (John J. Lee) writes:
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:

[...Alejandro complains about non-working HTTP proxy auth in urllib2...]

[...John notes urllib2 bug...]
A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install):

[...snip ugly code]
Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!

[...]

In fact the following also works with Python 2.3.4:

import urllib2
proxy_handler = urllib2.ProxyHandler({"http": "http://john:blah@localhost:3128"})
print urllib2.build_opener(proxy_handler).open('http://python.org/').read()
....but only just barely skirts around the bugs!-) :-(

(at least, the current bugs: I've no reason to work out what things
were like back in 2.3.4, but the above certainly works with that
version)
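
For anyone wondering why the credentials-in-the-URL form avoids the password lookup altogether: ProxyHandler turns the user:password part of the proxy URL into a Proxy-Authorization header and sends it with the request, so no 407 retry is needed. The sketch below only illustrates how such a header value is formed; proxy_auth_header is a hypothetical helper written for this post, not a urllib2 function, and it assumes plain ASCII credentials with no percent-escapes. The try/except import is there so it runs on Python 2 and Python 3 alike.

```python
import base64

try:
    from urlparse import urlsplit        # Python 2
except ImportError:
    from urllib.parse import urlsplit    # Python 3


def proxy_auth_header(proxy_url):
    """Return a Basic Proxy-Authorization value for the credentials
    embedded in proxy_url, or None if the URL carries none.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(proxy_url)
    if parts.username is None or parts.password is None:
        return None
    raw = "%s:%s" % (parts.username, parts.password)
    # Basic auth is just base64 of "user:password".
    token = base64.b64encode(raw.encode("ascii")).decode("ascii")
    return "Basic " + token


header = proxy_auth_header("http://john:blah@localhost:3128")
print("Proxy-Authorization: " + header)
```

Note that base64 is an encoding, not encryption, so the password is effectively sent in the clear to the proxy either way.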
John

Apr 1 '06 #6
John J. Lee wrote:
jj*@pobox.com (John J. Lee) writes:
Alejandro Dubrovsky <du*******@physics.uq.edu.au> writes:

[...Alejandro complains about non-working HTTP proxy auth in urllib2...]

[...John notes urllib2 bug...]
A workaround is to supply a stupid HTTPPasswordMgr that always returns
the proxy credentials regardless of what the handler asks it for (only
tested with a perhaps-broken 2.5 install, since I've broken my 2.4
install):

[...snip ugly code]
Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!

[...]

In fact the following also works with Python 2.3.4:

import urllib2
proxy_handler = urllib2.ProxyHandler({"http": "http://john:blah@localhost:3128"})
print urllib2.build_opener(proxy_handler).open('http://python.org/').read()

It does too. Thanks again. (I think this version is uglier, but easier to
insert into third party code)
Apr 3 '06 #7
