[Spread-users] Monitor on Win32?
mstagnaro at hotmail.com
Thu Dec 13 02:40:57 EST 2001
Thanks for the quick reply!
One more for you while I have it....
Does performance on a ring improve with unreliable messages versus reliable
and stronger service levels? Let's say that one daemon is attempting to publish a bunch of
specific group messages at a high rate (say 1,000 group messages at 1 Hz).
Also say that 20 or 30 other daemons are listening for some mixture of these
groups (or all of them). Does ring performance vary depending on how these
messages are published (reliable versus unreliable)?
Will take a closer look at the threaded spuser source and daemon
interaction calls in monitor.
All for now.
----- Original Message -----
From: "Jonathan Stanton" <jonathan at cnds.jhu.edu>
To: "Mike Stagnaro" <mstagnaro at hotmail.com>
Cc: <spread-users at lists.spread.org>
Sent: Thursday, December 13, 2001 12:07 AM
Subject: Re: [Spread-users] Monitor on Win32?
> On Wed, Dec 12, 2001 at 11:42:08PM -0700, Mike Stagnaro wrote:
> > Hi Jonathan,
> > Your questions first....
> > Follow-on questions from me.....
> > Do you have any plans to port monitor to Win32? If not, how difficult would
> > this be? Could it be something like spawning a dedicated UI thread with
> > some form of mutex-protected globals? Monitor (or something like it) seems
> > like a critical tool for system tuning and admin. I primarily target systems
> > based on NT (4, 5, 5.1), so I'd be interested in having a monitor. I'd
> > consider building my own (in copious spare time), but I don't know its
> > internals (and how it communicates with the daemons) well enough to know if
> > this is even something I could/should do. Any thoughts or plans?
> We have not yet planned to port it because there has been very little
> demand. It should not be complex at all. If I were to do it, I would
> probably use the same approach as the threaded spuser and spawn off a
> read_message thread to handle messages received from the daemons, and keep
> the main thread for UI with the user. The actual interaction code to
> send/receive messages from the daemon is very simple (just UDP packets).
> > Also, can you elaborate on the daemon limits in a system? I've read a few
> > things about 128 as a limit within a broadcast segment (ring?)...and a
> > limit of 128 total among all segments. I'm reading this to mean that the
> > total number of nodes in spread.conf can be no greater than 128 in a
> > multi-segment config.....or a single segment config can have no more than
> > 128 nodes. True? If so, why? Is this a limitation imposed by the ring
> > and/or hop protocol? Any insight appreciated.
> You are correct on both conclusions. Each segment can have no more than 128
> processes, and the total number of processes overall is 128. Two reasons
> for the limitations exist. First, a soft limit is that the ring protocol
> starts to have problems when the number of nodes in the ring grows. I've
> seen rings with 50 nodes or so work OK (if the load is not too high, so the
> daemons get scheduled quickly), but the latency to cycle a token around 100
> nodes will be pretty high. Second, a hard limit is that the way membership
> messages are formatted and sent limits the number of actual nodes in the
> system to a little over 128. Removing this limit is possible, but complex,
> which is why it currently remains.
> > Last one for now...
> > Is there a practical (or hardwired) limit on the number of groups an
> > application can join? Let's say that the app lives on the same box as the
> > daemon. With Rendezvous, we have apps with multiple thousands of
> > subscriptions running with no problem....just curious how Spread compares,
> > from what you've seen.
> Spread is designed to support a large, sort of unlimited number of groups
> efficiently. Currently if the membership does not change, i.e. you run
> daemons on a fixed set of nodes and they do not crash, we have tested it
> with 10,000+ groups. The basic datastructure we use (skiplist) was tested
> up to much larger numbers of groups, and it scales as log(n) for
> lookups/inserts. The membership code does have a current limitation of a bit
> over 1000 groups because of the way it synchronizes state. This limitation
> is not conceptually hard to fix; it just needs a bit of time to work on it.
> The cost of each group is entirely local (a record in memory) during
> operation (sending messages). A join or leave of a member to/from a group
> costs one ordered message among all of the daemons. When a network merge
> occurs between a number of daemons, a synchronization message with the
> state of all of the groups has to be sent. This is the only costly part of
> handling large numbers of groups.
> Jonathan R. Stanton jonathan at cs.jhu.edu
> Dept. of Computer Science
> Johns Hopkins University