[Spread-users] Unable to support more than 1000 sessions after increasing MAX_FD_EVENTS

Ryan Caudy rcaudy at gmail.com
Sun Aug 24 21:30:28 EDT 2008


In addition to Dan's suggestion, you may also want to check the value
of FD_SETSIZE compiled into your Spread daemon.  Last I knew, the
Spread daemon's event-handling loop used select(), which can only
monitor file descriptors whose numeric value is lower than
FD_SETSIZE.
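
To make the select() limit concrete, here is a minimal sketch in C. It
is not taken from the Spread sources; the helper names
(fd_fits_in_fd_set, select_capacity, report_fd_limits) are made up for
illustration. FD_SETSIZE is fixed when the program is compiled (1024
with glibc on Linux), so raising "ulimit -n" at run time does not raise
it, which would be consistent with a stall at roughly 1000 connections.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: returns 1 if fd may be passed to FD_SET(), else 0.
 * Calling FD_SET() with an fd outside [0, FD_SETSIZE) is undefined. */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: upper bound on the descriptors a select()-based
 * loop can watch, given the process's soft "open files" limit. */
long select_capacity(long nofile_soft_limit)
{
    return nofile_soft_limit < FD_SETSIZE ? nofile_soft_limit : FD_SETSIZE;
}

/* Prints the compile-time FD_SETSIZE next to the run-time RLIMIT_NOFILE
 * soft limit (an unlimited value prints as a very large number).
 * Returns 0 on success, -1 if getrlimit() fails. */
int report_fd_limits(FILE *out)
{
    struct rlimit rl;

    fprintf(out, "FD_SETSIZE (compile time): %d\n", (int)FD_SETSIZE);
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    fprintf(out, "RLIMIT_NOFILE soft limit:  %llu\n",
            (unsigned long long)rl.rlim_cur);
    return 0;
}
```

Calling report_fd_limits(stdout) from a small main() on the daemon host,
built the same way as the daemon, shows both numbers side by side. If
FD_SETSIZE turns out to be the smaller one, that is the effective limit
for any select()-based loop, whatever "ulimit -n" says.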

On Sun, Aug 24, 2008 at 5:42 PM, Rodrick Brown <rodrick.brown at gmail.com> wrote:
> I'm using the Java Spread API to connect to a local Spread daemon,
> version 4.0. Spread was compiled on a 64-bit Linux host.
> Under heavy load my application stops accepting new messages. I see
> about 1000 connections to the local Spread daemon before my
> application throws a SocketException - Socket():
> java.net.SocketException: Too many open files
>
> On the host running the Spread daemon, lsof shows the following
> local connections to the daemon:
>
> spread  16046    spread   10u     IPv4             298424
>    TCP 127.0.0.1:4903->127.0.0.1:35585 (ESTABLISHED)
> ...
> ...
> spread  16046    spread 1013u     IPv4             304225
>    TCP 127.0.0.1:4903->127.0.0.1:49967 (ESTABLISHED)
>
> [root at nybmlx01a logs]# lsof -n |grep -ic ^spread
> 1076 <<--- Total number of connections to localhost via spread.
>
>
> So it seems my Java process has made about 1000 concurrent
> connections to the local daemon.
> I've slightly modified the Spread sources and changed MAX_FD_EVENTS in
> sp_events.h to 10005 so that we can support 5000 concurrent sessions.
> This does not seem to make any difference at all, and once again my
> app stops responding to new requests at around 1000 concurrent
> connections. Is this some kind of bug? I'm not really sure what's
> going on here.
>
> I've also tried setting MAX_FD_EVENTS to 55, and this does seem to
> work as expected: after about 25 connections I'm no longer able to
> create any more sessions. What is strange here is that the error
> raised is "connection refused" instead of the "Too many open files"
> error shown before.
>
> I don't think this is a ulimit problem on the Unix host, as I'm
> setting ulimit -n 60000 in /etc/init.d/spread, which starts the
> Spread daemon as root before it switches to a non-privileged
> account called 'spread':
>
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> max nice                        (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 139264
> max locked memory       (kbytes, -l) 32
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 60000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> max rt priority                 (-r) 0
> stack size              (kbytes, -s) 10240
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 139264
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> Any help will be greatly appreciated. Thanks.
>
>
>
> --
> [ Rodrick R. Brown ]
> http://www.rodrickbrown.com http://www.linkedin.com/in/rodrickbrown
>
> _______________________________________________
> Spread-users mailing list
> Spread-users at lists.spread.org
> http://lists.spread.org/mailman/listinfo/spread-users
>



