Created on 2006-05-08.09:03:15 by thimm, last changed 2008-07-01.10:21:34 by rasker.
| msg1421 (view) |
Author: rasker |
Date: 2008-07-01.10:21:34 |
|
Retired
Reason for Retirement: Please confirm if this is still a problem in the latest
version of Smart.
Please reopen this issue in the new bugtracker if it is still an issue.
New bug tracker: http://bugs.launchpad.net/smart
Further details:
https://blueprints.launchpad.net/smart/+spec/bug-reporting-migration.
|
| msg532 (view) |
Author: thimm |
Date: 2006-06-12.18:25:48 |
|
I haven't yet verified whether the presence of pycurl will fix it. If you are aware
of the issue and know this fixes it, go ahead and close it. Should my check fail,
I will reopen it.
|
| msg531 (view) |
Author: mvo |
Date: 2006-06-12.16:58:48 |
|
Thimm, do you think we can close this bug? If pycurl is used, "Connection:
keep-alive" is used automatically, so this is more or less a
packaging/documentation issue (packaging smart with "Depends: pycurl" should fix
it for most users).
|
| msg486 (view) |
Author: niemeyer |
Date: 2006-05-11.20:14:40 |
|
> > Opening new connections for each package, while not ideal, shouldn't
> > kill a web server, should it?
>
> If (when) 60 concurrent smart users open 10-20 connections
> simultaneously, it runs out of memory (OOM).
Yes, that's what I meant. It's not opening new connections for
each package that kills the server (that's what keep-alive would
handle), but opening them in parallel.
> I think per IP it should be exactly one. Anything else is abusing the
> server's resources for no gain. The client does get the bits faster
> than other concurrent clients, though, but at the cost of the total
> bandwidth and server resources.
Indeed.
> No, I haven't. It looks like pycurl is automatically detected at
> run-time, so all I need to do is package pycurl?
Yes, it is. Packaging it should be all that is needed.
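For readers unfamiliar with the pattern, run-time detection of an optional module can be sketched roughly as follows. This is not Smart's actual code; the function name pick_fetch_backend and the backend labels are hypothetical and only illustrate the idea of preferring pycurl when it happens to be installed.

```python
import importlib.util


def pick_fetch_backend():
    """Return the name of the preferred download backend.

    Hypothetical helper: prefers pycurl (which reuses connections)
    when the module can be imported, and falls back to urllib otherwise.
    """
    # find_spec() returns None when a top-level module is not installed,
    # so nothing is imported unless it is actually available.
    if importlib.util.find_spec("pycurl") is not None:
        return "pycurl"
    return "urllib"
```

With this kind of check, a package that declares pycurl as a dependency gets the connection-reusing backend without any configuration change.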
|
| msg485 (view) |
Author: thimm |
Date: 2006-05-11.19:08:50 |
|
> Opening new connections for each package, while not ideal, shouldn't
> kill a web server, should it?
If (when) 60 concurrent smart users open 10-20 connections simultaneously, it
runs out of memory (OOM).
> If you believe that 5 is really a bad idea, let's talk. I'm open to reducing
> this limit.
I think per IP it should be exactly one. Anything else is abusing the server's
resources for no gain. The client does get the bits faster than other concurrent
clients, though, but at the cost of the total bandwidth and server resources.
> Have you tested Smart with pycurl? It should reuse connections
> automatically, like you're suggesting.
No, I haven't. It looks like pycurl is automatically detected at run-time, so
all I need to do is package pycurl?
|
| msg483 (view) |
Author: niemeyer |
Date: 2006-05-11.14:11:28 |
|
> W/o keep-alive smart's concurrent downloading kills a web server, as
> each package/metadata file opens up a new concurrent httpd process on
> the other side.
Opening new connections for each package, while not ideal, shouldn't
kill a web server, should it?
> A typical smart session on a rather often updated system shows up to
> 15 httpd processes simultaneously serving this one IP.
When using URLLIB, Smart is currently limited to 5 active connections.
If you want to make tests, or even patch your local Smart, you can easily
change the constant in fetcher.py (MAXACTIVE). If you believe that 5 is
really a bad idea, let's talk. I'm open to reducing this limit.
> There are keep-alive solutions for urllib2, for instance urlgrabber's
> keepalive.py that can be used as a handler for urllib2. This looks
> easy enough for me to try patching up smart with it, if it's
> considered useful.
There's already a urllib2 fetcher, but it's commented out because it's
not thread safe. I'm not sure if we can do the same thing in urllib.
Have you tested Smart with pycurl? It should reuse connections
automatically, like you're suggesting.
> But most probably keep-alive is not enough as smart deliberately fires
> up package retrievals in parallel, and keep-alive is only of help for
> reusing connections. If a connection is still in use, you end up
> creating a new one. Therefore a serialization procedure is necessary
> when the packages come from the same host (or at least the same
> channel).
Smart does limit the number of active connections already. Improving
that limit shouldn't be hard. OTOH, since you say that you have 15
open connections, there must be something else wrong.
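A cap on active connections that is applied per host, rather than globally, could look roughly like the sketch below. It is not taken from fetcher.py: the HostLimiter class and the MAX_ACTIVE_PER_HOST constant are hypothetical names, and the value 1 simply reflects the "one connection per IP" suggestion made in this thread.

```python
import threading

# Hypothetical per-host cap; the thread only mentions a MAXACTIVE
# constant in fetcher.py, whose exact semantics are not shown here.
MAX_ACTIVE_PER_HOST = 1


class HostLimiter:
    """Allow at most `limit` simultaneous downloads per host."""

    def __init__(self, limit=MAX_ACTIVE_PER_HOST):
        self._limit = limit
        self._lock = threading.Lock()   # guards the semaphore table
        self._semaphores = {}           # host -> BoundedSemaphore

    def _semaphore_for(self, host):
        with self._lock:
            if host not in self._semaphores:
                self._semaphores[host] = threading.BoundedSemaphore(self._limit)
            return self._semaphores[host]

    def run(self, host, download):
        """Call download() once a slot for `host` is free."""
        with self._semaphore_for(host):
            return download()
```

Downloads for different hosts still proceed in parallel; only requests to the same host wait for each other.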
|
| msg480 (view) |
Author: thimm |
Date: 2006-05-08.09:03:13 |
|
W/o keep-alive smart's concurrent downloading kills a web server, as each
package/metadata file opens up a new concurrent httpd process on the other side.
A typical smart session on a rather often updated system shows up to 15 httpd
processes simultaneously serving this one IP.
There are keep-alive solutions for urllib2, for instance urlgrabber's
keepalive.py that can be used as a handler for urllib2. This looks easy enough
for me to try patching up smart with it, if it's considered useful.
But most probably keep-alive is not enough as smart deliberately fires up
package retrievals in parallel, and keep-alive is only of help for reusing
connections. If a connection is still in use, you end up creating a new one.
Therefore a serialization procedure is necessary when the packages come from
the same host (or at least the same channel).
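To illustrate the combination of keep-alive and per-host serialization described above, here is a minimal sketch. It uses Python 3's http.client rather than urllib2 or urlgrabber's keepalive.py, and the class name SerializedHostFetcher and its connection_factory parameter are made up for the example; it is not a patch against Smart.

```python
import http.client
import threading
from collections import defaultdict


class SerializedHostFetcher:
    """Fetch files one at a time per host over a single reused connection."""

    def __init__(self, connection_factory=http.client.HTTPConnection):
        # connection_factory is a hypothetical hook so the class can be
        # exercised without touching the network.
        self._factory = connection_factory
        self._connections = {}                          # host -> open connection
        self._host_locks = defaultdict(threading.Lock)  # host -> lock
        self._registry_lock = threading.Lock()          # guards _host_locks

    def fetch(self, host, path):
        with self._registry_lock:
            host_lock = self._host_locks[host]
        # Holding the per-host lock serializes requests to the same host,
        # so a busy connection never forces a second one to be opened.
        with host_lock:
            conn = self._connections.get(host)
            if conn is None:
                conn = self._factory(host)
                self._connections[host] = conn
            conn.request("GET", path, headers={"Connection": "keep-alive"})
            response = conn.getresponse()
            # The body must be read fully before the connection is reused.
            return response.status, response.read()
```

The sketch leaves out error handling, for example reconnecting when the server closes an idle connection, which a real fetcher would need.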
|
|
| Date | User | Action | Args |
| 2008-07-01 10:21:34 | rasker | set | status: chatting -> done-review; nosy: + rasker; messages: + msg1421; assignedto: rasker |
| 2006-06-14 17:13:49 | mvo | set | title: smart's fetcher.py needs keep-alive and serialization support -> smart's non-pycurl fetcher.py needs keep-alive and serialization support |
| 2006-06-12 18:25:59 | thimm | set | nosy: mvo, thimm, niemeyer; messages: + msg532 |
| 2006-06-12 16:58:49 | mvo | set | nosy: + mvo; messages: + msg531 |
| 2006-05-11 20:14:41 | niemeyer | set | messages: + msg486 |
| 2006-05-11 19:08:51 | thimm | set | nosy: thimm, niemeyer; messages: + msg485 |
| 2006-05-11 14:11:31 | niemeyer | set | status: unread -> chatting; nosy: + niemeyer; messages: + msg483 |
| 2006-05-08 09:03:17 | thimm | create | |
|