Re: unequal load distribution

From: Pavel Jezek (Pavel.Jezek@i.cz)
Date: Fri Oct 17 2003 - 12:13:28 CEST


Hi, thanks for the answer.
Using a simple round-robin algorithm to balance requests equally between the two http caches will probably cause trouble with web sites that track client session information: the target web server will be confused if one request comes from one IP address and the next request, belonging to the same session, comes from another IP.
So I will probably try the second method, the hash algorithm (pen -h), but how should I set the client limit "-c"?
I think that setting this value too low will lose session information (the hash table overflows when other clients connect) and cause the same trouble as simple round-robin.

imho, a better solution would be time-based expiration of the hash table...
imho, the best solution would be to track http requests and balance between the two proxy caches according to session information (such as the client IP / destination server IP pair carried in the http request)
e.g.
client1 always connects to web serverA via cache-isp1
client1 always connects to web serverB via cache-isp1
client1 always connects to web serverC via cache-isp2
...
These entries could expire after a configurable time... ;-)
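
A minimal Python sketch of the idea above, to make the proposal concrete. It is not an existing pen feature: the class name `ExpiringAffinityTable`, the `ttl_seconds` parameter, and the plain round-robin choice for new client/server pairs are all assumptions made for illustration.

```python
import time


class ExpiringAffinityTable:
    """Hypothetical table mapping (client IP, destination server) to a cache.

    Entries that have not been used for ttl_seconds are treated as expired.
    """

    def __init__(self, caches, ttl_seconds, clock=time.monotonic):
        self.caches = list(caches)
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the expiry can be tested
        self.entries = {}           # (client_ip, server) -> (cache, last_used)
        self.next_index = 0         # round-robin position for new pairs

    def route(self, client_ip, server):
        """Return the cache for this pair, keeping live entries sticky."""
        now = self.clock()
        key = (client_ip, server)
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            cache = entry[0]        # entry still valid: reuse the same cache
        else:
            # New or expired pair: assign the next cache in round-robin order.
            cache = self.caches[self.next_index % len(self.caches)]
            self.next_index += 1
        self.entries[key] = (cache, now)   # refresh the last-used time
        return cache

    def purge(self):
        """Drop expired entries and return how many were removed."""
        now = self.clock()
        stale = [k for k, (_, used) in self.entries.items()
                 if now - used >= self.ttl]
        for k in stale:
            del self.entries[k]
        return len(stale)


table = ExpiringAffinityTable(["cache-isp1", "cache-isp2"], ttl_seconds=300)
print(table.route("192.0.2.1", "serverA"))
print(table.route("192.0.2.1", "serverA"))  # same cache as the line above
```

With a table like this, a session only moves to the other cache after it has been idle for longer than the configured time, no matter how many other clients connect in between.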
PJ

----- Original Message -----
From: Ulric Eriksson
To: Pavel Jezek
Cc: pen@siag.nu
Sent: Thursday, October 16, 2003 11:10 PM
Subject: Re: unequal load distribution

On Thu, 16 Oct 2003, Pavel Jezek wrote:

> Hi,
> I'm planning to use PEN as a load balancer for two SQUID http caches and
> distribute load between two lines to the Internet (with different
> capacities, 512kb and 256kb).
> So, is it possible for PEN to forward twice as many requests to the first
> cache (connected to the 512kb line) as to the second cache (256kb), like
> this example?
>
> pen -h 3128 cache-isp1 cache-isp2 cache-isp1

That command line should work unchanged. Pen will see three backend
servers, two of which are in fact the same (cache-isp1). Assuming equal
distribution among the three servers, cache-isp1 will serve twice as many
clients as cache-isp2.
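
The 2:1 split can be illustrated with a small Python sketch. The hash used here (client address modulo the number of backend slots) is an assumption chosen for the illustration, not pen's actual code, and the client addresses are made up.

```python
from collections import Counter
import ipaddress

# Backend slots in command-line order; cache-isp1 is listed twice.
SERVERS = ["cache-isp1", "cache-isp2", "cache-isp1"]


def pick_server(client_ip, servers=SERVERS):
    """Hypothetical hash: client address modulo the number of slots."""
    slot = int(ipaddress.ip_address(client_ip)) % len(servers)
    return servers[slot]


def distribution(clients, servers=SERVERS):
    """Count how many clients land on each distinct backend name."""
    return Counter(pick_server(ip, servers) for ip in clients)


# 240 made-up client addresses, 10.0.0.1 through 10.0.0.240.
clients = [f"10.0.0.{i}" for i in range(1, 241)]
counts = distribution(clients)
print(counts["cache-isp1"], counts["cache-isp2"])  # 160 80
```

Each of the three slots receives a third of the clients, and since two slots point at cache-isp1 it ends up with two thirds of them.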

There's a gotcha to look out for. I assume that you're using the hash as a
means to distribute clients in a predictable way among the squids. Now, if
one of the squids, say cache-isp1, goes down for whatever reason, all the
clients will be sent to cache-isp2. Since the hash algorithm uses client
tracking (what some call stickiness), they will stay on cache-isp2 even
when cache-isp1 comes back up. You probably don't want that. It can be
solved using one of two methods.

1. Use round-robin distribution of clients. That will bypass client
tracking completely, but it also means that every client will access
all the caches. That's bad for caching efficiency.

2. Make the table of tracked clients so small that it will overflow, i.e.
add the command line argument -c 1. The default number is 2048, which is
somewhat appropriate for a small website.

Don't try to use -c 0, because pen insists on storing the clients.

So the complete command line would be (untested):

pen -h -c 1 3128 cache-isp1 cache-isp1 cache-isp2
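
The effect of the table size can be sketched with a toy Python model. Everything in it is an assumption made for illustration rather than pen's real implementation: the modulo hash, stepping to the next slot when a backend is down, and evicting the oldest tracked client on overflow.

```python
from collections import OrderedDict
import ipaddress

SERVERS = ["cache-isp1", "cache-isp1", "cache-isp2"]


class StickyBalancer:
    """Toy model of hash selection plus a bounded client-tracking table."""

    def __init__(self, servers, max_clients):
        self.servers = servers
        self.max_clients = max_clients
        self.down = set()            # names of backends currently down
        self.table = OrderedDict()   # client_ip -> backend, oldest first

    def _hash_pick(self, client_ip):
        # Assumed hash: address modulo slot count, skipping dead backends.
        start = int(ipaddress.ip_address(client_ip)) % len(self.servers)
        for step in range(len(self.servers)):
            name = self.servers[(start + step) % len(self.servers)]
            if name not in self.down:
                return name
        raise RuntimeError("no backend available")

    def connect(self, client_ip):
        name = self.table.get(client_ip)
        if name is None or name in self.down:
            name = self._hash_pick(client_ip)   # untracked, or backend died
        self.table[client_ip] = name
        self.table.move_to_end(client_ip)
        while len(self.table) > self.max_clients:
            self.table.popitem(last=False)      # assumed: evict oldest entry
        return name


def stuck_after_recovery(max_clients):
    """Clients still on cache-isp2 after cache-isp1 went down and came back."""
    lb = StickyBalancer(SERVERS, max_clients)
    clients = [f"10.0.0.{i}" for i in range(1, 7)]
    lb.down.add("cache-isp1")
    for ip in clients:
        lb.connect(ip)               # everyone is sent to cache-isp2
    lb.down.discard("cache-isp1")
    return sum(lb.connect(ip) == "cache-isp2" for ip in clients)


print(stuck_after_recovery(2048))  # 6: every client stays on cache-isp2
print(stuck_after_recovery(1))     # 2: only the clients that hash there
```

In this model the one-entry table forgets each client as soon as the next one connects, so clients fall back to the hash and return to cache-isp1 once it is up again.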

Ulric



This archive was generated by hypermail 2.1.2 : Fri Oct 17 2003 - 12:15:04 CEST