[Spread-users] Bug in spread - message counter

John Lane Schultz jschultz at spreadconcepts.com
Tue Apr 8 11:28:18 EDT 2008


For Spread 4.0, first uncomment and set the following parameter in
your configuration file:

DangerousMonitor = true

Then, run spmonitor on one of the machines running Spread, making
sure it uses the same configuration file as the Spread instantiation
in question.

Inside spmonitor, use option 1 to define a partition by assigning
each of the daemons to an arbitrary partition number.  If all of your
daemons are up and connected, then you can simply assign one of them
to partition 1 and the rest to partition 2.
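
As a rough illustration of that assignment rule, here is a small Python
sketch.  The daemon names and the assign_partitions helper are
hypothetical and are not part of Spread or spmonitor; in practice you
type the partition numbers into spmonitor by hand.

```python
def assign_partitions(daemons):
    """Put the first daemon in partition 1 and all others in partition 2.

    `daemons` is a list of daemon names (hypothetical here); the result
    maps each name to the partition number you would enter in spmonitor.
    """
    if len(daemons) < 2:
        raise ValueError("need at least two daemons to form a partition")
    return {name: (1 if i == 0 else 2) for i, name in enumerate(daemons)}


if __name__ == "__main__":
    # Hypothetical daemon names, as they might appear in a config file.
    plan = assign_partitions(["machine1", "machine2", "machine3"])
    for name, part in plan.items():
        print(f"{name}: partition {part}")
```

Any split that separates at least one daemon from the others should
work equally well; the one-versus-rest split is just the simplest to
enter.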

Then use option 2 to send the partition to the daemons.  Wait until
the daemons have finished handling the partition and have installed a
new daemon membership.  You can see this either by watching the
output of a daemon or by using spmonitor to query the daemons and
noting when their membership ID changes.  Alternatively, just wait:
they should have handled the partition within 60 seconds or so at
the longest.
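
If you want to script the waiting step, the logic is roughly "poll the
membership ID until it changes, and give up after about 60 seconds".
The sketch below assumes you supply your own get_membership_id
callable (for example, something that parses a daemon's output);
spmonitor itself is interactive, and no programmatic interface is
implied here.

```python
import time


def wait_for_membership_change(get_membership_id, old_id, timeout=60.0,
                               interval=1.0, sleep=time.sleep,
                               clock=time.monotonic):
    """Poll until the membership ID differs from old_id, or time out.

    get_membership_id is a hypothetical, caller-supplied function that
    returns the daemon's current membership ID.  Returns the new ID, or
    None if nothing changed within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        current = get_membership_id()
        if current != old_id:
            return current          # a new membership was installed
        if clock() >= deadline:
            return None             # gave up; proceed or investigate
        sleep(interval)
```

The 60 second default simply mirrors the upper bound mentioned above.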

Finally, use option 4 to remove the artificial partition.  If user
traffic is being sent, then the daemons will discover one another
again quickly, reconnect and install a new membership.  If the
network is quiet, then they may take several minutes to discover that
the partition is gone.
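
As for when to run this procedure: my earlier message (quoted below)
suggested acting when the counters approach roughly 2^30, or 2^31 at
the riskier end.  A minimal sketch of that check follows.  How you
obtain the counter value and the message rate is left to you, and the
function names are made up for illustration.

```python
WARN_THRESHOLD = 2 ** 30   # 1,073,741,824: the conservative trigger point
HARD_THRESHOLD = 2 ** 31   # 2,147,483,648: the "live dangerously" point


def needs_membership_change(counter, threshold=WARN_THRESHOLD):
    """Return True once the message counter has reached the threshold."""
    return counter >= threshold


def seconds_until(counter, rate_per_sec, threshold=WARN_THRESHOLD):
    """Estimate seconds left before the counter reaches the threshold.

    rate_per_sec is the observed message rate (an input you must
    measure yourself); it must be positive.
    """
    if rate_per_sec <= 0:
        raise ValueError("rate_per_sec must be positive")
    return max(0, threshold - counter) / rate_per_sec


if __name__ == "__main__":
    # Example: a fresh counter at a steady 1000 messages per second.
    days = seconds_until(0, 1000) / 86400
    print(f"about {days:.1f} days until 2^30")
```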

Cheers!
John

---
John Lane Schultz
Spread Concepts LLC
Phn: 443 838 2200 
Fax: 301 560 8875

Tuesday, April 8, 2008, 8:31:16 AM, you wrote:

> On Monday 07 April 2008 17:05:28, John Lane Schultz wrote:
>> I can think of two short-term workarounds that are less invasive
>> than changing the counters to 64 bits.  The first, and easiest, would
>> be to simply force a network membership change, administratively
>> using spmonitor, if/when the counters approach approximately 2^30,
>> or if you like to live dangerously, approximately 2^31.  Doing this
>> should be far less work than waiting for the lockup and then
>> administratively having to reboot your entire system.
> And how do I do this using spmonitor?
> -- 
> WK




