[Spread-users] Monitor on Win32?

Thu Dec 13 02:07:23 EST 2001

On Wed, Dec 12, 2001 at 11:42:08PM -0700, Mike Stagnaro wrote:
> Hi Jonathan,
> 
> Your questions first....
> 

Thanks. 
> Follow-on questions from me.....
> 
> Do you have any plans to port monitor to Win32?  If not, how difficult would
> this be?  Could it be something like spawning a dedicated UI thread with
> some form of mutex protected globals?  Monitor (or something like it) seems
> like a critical tool to system tuning and admin.  I primarily target systems
> based on NT (4, 5, 5.1), so I'd be interested in having a monitor.  Might
> consider building my own (in copious spare time), but I don't know its
> internals (and how it communicates to the daemons) well enough to know if
> this is even something I could/should do.  Any thoughts or plans?

We have not yet planned to port it because there has been very little
demand. It should not be complex at all. If I were to do it, I would
probably use the same approach as the threaded spuser: spawn off a
read_message thread to handle messages received from the daemons, and keep
the main thread for the UI with the user. The actual interaction code to
send/receive messages from the daemon is very simple (just UDP packets).
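
To make the threading idea concrete, here is a minimal sketch in Python
rather than the monitor's own C code. It assumes nothing about the real
monitor packet format: the names reader_loop and start_reader are
hypothetical, and the payload is treated as opaque bytes. One background
thread blocks on the UDP socket and hands each datagram to the main (UI)
thread through a thread-safe queue, so no hand-rolled mutex-protected
globals are needed.

```python
import queue
import socket
import threading


def reader_loop(sock, inbox, stop):
    """Background thread: receive datagrams and queue them for the UI thread."""
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(65535)
        except socket.timeout:
            continue  # wake up periodically to re-check the stop flag
        except OSError:
            break  # socket was closed underneath us
        inbox.put((addr, data))


def start_reader(sock, poll_seconds=0.2):
    """Start the reader thread; returns (inbox queue, stop event, thread)."""
    sock.settimeout(poll_seconds)
    inbox = queue.Queue()  # thread-safe hand-off between the two threads
    stop = threading.Event()
    thread = threading.Thread(
        target=reader_loop, args=(sock, inbox, stop), daemon=True
    )
    thread.start()
    return inbox, stop, thread
```

The main thread would then read user commands as usual and drain inbox
(for example with inbox.get_nowait()) whenever it wants to display what
the daemons reported.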

> 
> Also, can you elaborate on the daemon limits in a system?  I've read a few
> things about 128 as a limit within a broadcast segment (ring?)...and a
> limit of 128 total among all segments.  I'm reading this to mean that the
> total number of nodes in spread.conf can be no greater than 128 in a
> multi-segment config.....or a single segment config can have no more than
> 128 nodes.  True?  If so, why?  Is this a limitation imposed by the ring
> and/or hop protocol?  Any insight appreciated.

You are correct on both conclusions. Each segment can have no more than 128
daemon processes, and the total number of daemon processes overall is 128.
Two reasons for the limitations exist. First, a soft limit: the ring
protocol starts to have problems as the number of nodes in the ring grows.
I've seen rings with 50 nodes or so work OK (if the load is not too high, so
they get scheduled quickly), but the latency to cycle a token around 100
nodes will be pretty high. Second, a hard limit: the way membership messages
are formatted and sent limits the number of actual nodes in the system to a
little over 128. Removing this limit is possible but complex, which is why
it currently remains.
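
As a rough illustration of both limits, the sketch below checks a planned
layout against the 128 figure and estimates token rotation time. This is
not Spread code: check_layout, token_rotation_ms, and the dictionary layout
are hypothetical names, and the linear rotation model (one hop per daemon,
fixed per-hop delay) is a simplifying assumption.

```python
MAX_DAEMONS = 128  # limit quoted above, per segment and overall


def check_layout(segments):
    """Return a list of problems for a {segment_name: [daemon, ...]} layout."""
    problems = []
    total = 0
    for name, daemons in segments.items():
        total += len(daemons)
        if len(daemons) > MAX_DAEMONS:
            problems.append(
                f"segment {name}: {len(daemons)} daemons > {MAX_DAEMONS}"
            )
    if total > MAX_DAEMONS:
        problems.append(f"total: {total} daemons > {MAX_DAEMONS}")
    return problems


def token_rotation_ms(ring_size, per_hop_ms):
    """Assumed model: the token visits every daemon once per rotation."""
    return ring_size * per_hop_ms
```

Under this model the rotation time doubles when the ring goes from 50 to
100 daemons, whatever the actual per-hop delay turns out to be on a given
network.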

> 
> Last one for now...
> Is there a practical (or hardwired) limit on the number of groups an
> application can join?  Let's say that the app lives on the same box as the
> daemon.  With Rendezvous, we have apps with multiple thousands of subject
> subscriptions running with no problem....just curious how Spread compares
> from what you've seen.

Spread is designed to support a large, more or less unlimited number of
groups efficiently. Currently, if the membership does not change (i.e., you
run daemons on a fixed set of nodes and they do not crash), we have tested
it with 10,000+ groups. The basic data structure we use (a skip list) was
tested up to much larger numbers of groups, and lookups/inserts scale as
O(log n). The membership code does have a current limitation of a bit over
1000 groups because of the way it synchronizes state. This limitation is
not conceptually hard to fix; it just needs a bit of time to work on it.
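
For readers unfamiliar with skip lists, here is a minimal, generic sketch
of one keyed by group name. It is a textbook version written in Python for
illustration, not the implementation inside the Spread daemon; the class
name, MAX_LEVEL, and the 1/2 promotion probability are assumptions. Each
search drops down one level at a time, which is where the expected
O(log n) cost of lookups and inserts comes from.

```python
import random


class _Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # next node at each level


class SkipList:
    MAX_LEVEL = 16  # enough for roughly 2**16 keys at p = 1/2

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_LEVEL)
        self._level = 1  # number of levels currently in use
        self._size = 0

    def __len__(self):
        return self._size

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key, value):
        # update[i] is the last node before `key` at level i
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            nxt.value = value  # key already present: overwrite
            return
        level = self._random_level()
        self._level = max(self._level, level)
        new = _Node(key, value, level)
        for i in range(level):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new
        self._size += 1

    def lookup(self, key):
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        if node is not None and node.key == key:
            return node.value
        return None
```

Inserting 10,000 group names and looking each one up touches only a few
dozen nodes per operation on average, rather than walking the whole list.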

The cost of each group is entirely local (a record in memory) during normal
operation (sending messages). A join or leave of a member to/from a group
costs one ordered message among all of the daemons. When a network merge
occurs between a number of daemons, a synchronization message with the
state of all of the groups has to be sent. This is the only costly part of
handling large numbers of groups.
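
The three costs can be summarized in a toy accounting model. This is only
a sketch of the description above, not how the daemon stores or ships
group state: GroupCostModel and its method names are hypothetical, and the
"entries" count is a stand-in for the real size of the synchronization
message.

```python
class GroupCostModel:
    """Toy accounting of per-group costs (not Spread code)."""

    def __init__(self):
        self.groups = {}           # group name -> set of members (local record)
        self.ordered_messages = 0  # ordered messages sent among the daemons

    def join(self, group, member):
        self.groups.setdefault(group, set()).add(member)
        self.ordered_messages += 1  # one ordered message per join

    def leave(self, group, member):
        members = self.groups.get(group)
        if members is None or member not in members:
            return  # nothing to do in this model
        members.discard(member)
        if not members:
            del self.groups[group]  # drop the record for an empty group
        self.ordered_messages += 1  # one ordered message per leave

    def merge_sync_entries(self):
        """Entries to ship on a network merge: every group plus its members."""
        return sum(1 + len(members) for members in self.groups.values())
```

With 1,000 single-member groups, the model counts 1,000 ordered messages
for the joins and 2,000 entries to synchronize on a merge, which shows why
the merge is the step that grows with the number of groups.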

Jonathan
-- 
-------------------------------------------------------
Jonathan R. Stanton         jonathan at cs.jhu.edu
Dept. of Computer Science   
Johns Hopkins University    
-------------------------------------------------------