[Spread-users] Clock skew and spread

Vsevolod Vlaskin vlaskine at yahoo.com
Sun May 13 08:12:50 EDT 2007


Hi,

Good considerations. Did I understand correctly: you are
saying that the problem was not in Spread itself, but in
the OS-specific event scheduling?

(I have not checked the Spread source code, but it
would be strange to use absolute time or time
durations to schedule events, if only FIFO ordering
(or even virtual synchrony) is used, as only the order
of the messages counts, not their timing.

I guess NTP makes time adjustments when the computer's
clock goes out of sync with the time on the LAN. These
jumps are probably on the order of milliseconds or
seconds, but in our case that proved to be enough to
make Spread stop delivering messages a couple of
times. Our system runs every day, so there was ample
opportunity for us to hit the problem. And the effect
was BAD.)
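
To check that I follow the "jump forward, then back" case
Nilo describes below, here is a minimal sketch of it. It is
Python, not Spread's code: FakeClock, EventQueue and
simulate_jump_and_return are names I made up for
illustration, and the one-hour jump is an arbitrary figure.
The only thing taken from Nilo's mail is the idea that an
event is stored with an absolute deadline of "current
system time + delay".

```python
import heapq


class FakeClock:
    """Settable stand-in for the system (wall-clock) time, in seconds."""

    def __init__(self, now=0.0):
        self.now = now

    def __call__(self):
        return self.now


class EventQueue:
    """Toy timer queue: an event fires once clock() >= its absolute deadline."""

    def __init__(self, clock):
        self.clock = clock
        self._heap = []
        self._seq = 0  # tie-breaker so equal deadlines fire in FIFO order

    def schedule(self, delay, name):
        # Deadline is computed from whatever the clock says right now.
        heapq.heappush(self._heap, (self.clock() + delay, self._seq, name))
        self._seq += 1

    def run_due(self):
        """Pop and return the names of all events whose deadline has passed."""
        fired = []
        while self._heap and self._heap[0][0] <= self.clock():
            fired.append(heapq.heappop(self._heap)[2])
        return fired

    def seconds_until_next(self):
        """Seconds until the earliest deadline, or None if the queue is empty."""
        if not self._heap:
            return None
        return self._heap[0][0] - self.clock()


def simulate_jump_and_return():
    wall = FakeClock(1000.0)
    queue = EventQueue(wall)

    wall.now += 3600.0               # clock is stepped one hour ahead
    queue.schedule(5.0, "heartbeat")  # deadline = 1000 + 3600 + 5 = 4605
    wall.now -= 3600.0               # clock is stepped back to the right time

    wall.now += 10.0                 # 10 real seconds pass
    fired = queue.run_due()          # a 5-second timer should have fired by now
    return fired, queue.seconds_until_next()
```

In this sketch the 5-second "heartbeat" has not fired after
10 seconds, and the queue still thinks it has almost an
hour to wait. If Spread's timers behave like this toy
queue, that would match the stall we saw.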

Thank you very much,

Vsevolod Vlaskine



--- Nilo Rivera <nrivera at cs.jhu.edu> wrote:

> Hi,
> 
> I had similar problems with another piece of software that uses the
> same event system.
> 
> In general, when you need to schedule a callback function 5 seconds
> in the future, you schedule the event at the current time + 5 seconds
> (look at E_queue in events.c).  You may have a lot of events in
> between, but when you reach that time (based on the system time), the
> event system will call that event before any other that was scheduled
> for a later time.
> 
> If your clock jumps to some future time and stays there, the problem
> is that a lot of events will look expired and will start firing.  The
> protocols may not behave properly.
> 
> But the bad case happens when the clock jumps into the future and
> then comes back into the past.  In this case, you may schedule events
> at the (jumped) future time + 5 seconds.  When the clock comes back
> to the real current time, it will not hit the event until it reaches
> the expected time.  In that case, your program may be stuck for quite
> a while.
> 
> I avoided the problem by blocking the NTP port on my network, and
> allowing NTP to set clocks only when I knew it was safe (when my
> program was not running).  Then again: (1) why are some NTP daemons
> making clock jumps (my pure guess at the time was that it was setting
> the system time to GMT and then back to EST, but I never looked at
> the NTP code), and (2) is there any easy/pretty solution to avoid
> this problem in an event system?
> 
> 
> Cheers,
> Nilo
> 
> 
> 
> Vsevolod Vlaskin wrote:
> > Hi,
> >
> > A while ago, we seemed to consistently see a similar problem in our
> > configuration, which was only one Spread daemon with a number of
> > clients, all on the same LAN, on Linux. We used just FIFO ordering
> > for all our Spread clients.
> >
> > A few times, the Spread communication failed altogether: messages
> > stopped being delivered (which was quite tragic), and we noticed
> > that our NTP service made noticeable clock jumps at the time of
> > failure.
> >
> > We posted the question on this list, but there was no reply. Maybe
> > now there will be more response.
> >
> > Best regards,
> >
> > Vsevolod Vlaskine
> >
> >
> >
> > --- John Robinson <jr at vertica.com> wrote:
> >
> >   
> >> We lost our T1 connection to the world for a while today, and I
> >> think some of our servers' clocks may have drifted (no internal
> >> NTP source...).
> >>
> >> Can this cause oddities among a subnet of spread daemons?  Do they
> >> have to drop connections to their clients for reasons related to
> >> clock drift amongst the host machines?  If so, is there some
> >> logging I can enable to track this?
> >>
> >> I think I have seen similar things happen when we try to run
> >> spread daemons on a "cluster" under VMWare, which is known to
> >> introduce clock issues.
> >>
> >> thanks,
> >> /jr
> >>
> >>
> >> _______________________________________________
> >> Spread-users mailing list
> >> Spread-users at lists.spread.org
> >> http://lists.spread.org/mailman/listinfo/spread-users
> >
> >
> 
> 
> 
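
On Nilo's question (2), one common approach is to compute
timer deadlines from a monotonic clock rather than from
the system time, since a monotonic clock is not affected
when NTP or an administrator steps the wall clock. Below
is a minimal sketch in Python using time.monotonic(). The
class MonotonicTimerQueue and its methods are hypothetical
names for illustration, not anything in Spread; I have not
checked how events.c reads the time. In C on POSIX systems
the corresponding call would be
clock_gettime(CLOCK_MONOTONIC, ...).

```python
import heapq
import time


class MonotonicTimerQueue:
    """Timer queue whose deadlines are measured on a monotonic clock.

    The clock is injectable so the queue can be tested with a fake
    clock; by default it uses time.monotonic(), which never goes
    backwards and is unaffected by system clock updates.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._heap = []
        self._seq = 0  # tie-breaker so equal deadlines fire in FIFO order

    def schedule(self, delay, callback):
        """Arrange for callback() to become due `delay` seconds from now."""
        deadline = self._clock() + delay
        heapq.heappush(self._heap, (deadline, self._seq, callback))
        self._seq += 1

    def next_timeout(self):
        """Seconds until the earliest deadline (never negative), or None."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - self._clock())

    def run_due(self):
        """Call every callback whose deadline has passed; return the count."""
        count = 0
        while self._heap and self._heap[0][0] <= self._clock():
            _, _, callback = heapq.heappop(self._heap)
            callback()
            count += 1
        return count
```

The value from next_timeout() is the kind of number one
would pass as the timeout to select() or poll() in an event
loop. This only helps with timers measured as durations; if
a program really needs to act at a specific wall-clock time,
it still has to consult the system clock.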



       




More information about the Spread-users mailing list