[Spread-users] spread as a shared library?
Theo Schlossnagle
jesus at omniti.com
Wed Jun 29 23:53:34 EDT 2005
Neil Conway wrote:
> I'm working on an application that will use Spread. However, I don't
> really need a separate Spread daemon process AFAICS: each node in the
> group will only have a single connection to Spread. I'd also rather
> not burden my users with the need to install and configure the Spread
> daemon.
Going through the effort of building a library that restricts users to
a single "connection to spread" would be disappointing.
There are many legitimate situations in which it can be advantageous to
have multiple sessions from within a single app -- you may find yourself
in such a situation, so I'd be careful about limiting that possibility.
> I'm wondering if it would be possible to refactor Spread to provide a
> shared library that implements the Spread protocol. An application
> linking against the Spread library would need to call into Spread's
> event loop regularly (or perhaps just dedicate a thread to this task
> and have it block inside the Spread event loop). The spread daemon
> could then be implemented using this library.
One of the more acute pains of Spread is its lack of dynamic
configurability combined with its general configuration fragility.
There are many simple configuration file "mistakes" (like leaving one
server out of the list on a single machine) that will cause Spread to
not work, with a non-obvious error message. Also, many common network
configuration errors that don't cause problems for other protocols can
wreak havoc on a Spread cluster.
The fact that libraries, in general, allow multiple simultaneous uses of
their functions poses a serious problem w.r.t. Spread's coding style
(with static variables and the like). To be a "proper" library, the
whole she-bang would have to be contextualized. That way you could
start two Spread rings within a single process and enforce fine grained
quality of service in very busy systems (where token loss is chronic due
to network saturation). You could tune one ring that handles important
messages with very, very aggressive retry settings and leave the other
"bulk ring" at normal settings. Thus accidentally saturating your network
fabric on one ring will not (or is less likely to) collapse the "vital" ring.
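
To make "contextualized" concrete, here is a minimal sketch of the idea,
not anything taken from the Spread source. Every name in it (ring_ctx,
ring_create, token_retry_ms and so on) is hypothetical. The point is only
that state which would otherwise sit in file-scope static variables moves
into a structure that each ring owns, so two rings with different tuning
can coexist in one process.

```c
#include <stdlib.h>

/* Hypothetical per-ring context. Anything that would otherwise be a
 * file-scope static variable lives here, one copy per ring. */
typedef struct ring_ctx {
    const char   *name;            /* label for this ring, not copied */
    int           token_retry_ms;  /* assumed knob: wait before token retransmit */
    int           max_retries;     /* assumed knob: retransmits before giving up */
    unsigned long tokens_lost;     /* per-ring counter instead of a global */
} ring_ctx;

/* Allocate an independent ring context; returns NULL on allocation failure. */
ring_ctx *ring_create(const char *name, int token_retry_ms, int max_retries)
{
    ring_ctx *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->name = name;
    r->token_retry_ms = token_retry_ms;
    r->max_retries = max_retries;
    r->tokens_lost = 0;
    return r;
}

/* Every operation takes the context explicitly, so it touches only
 * the ring it was handed. */
void ring_note_token_loss(ring_ctx *r)
{
    r->tokens_lost++;
}

void ring_destroy(ring_ctx *r)
{
    free(r);
}
```

With something like this, a process could call ring_create("vital", 5, 20)
and ring_create("bulk", 100, 3) and tune them separately. Token loss
recorded on the bulk ring would leave the vital ring's state untouched.
The numbers are placeholders, not recommended settings.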
The Spread guys can correct me if I'm wrong here, but as far as I know
no effort has gone into making any of the functions in there
thread-safe, reentrant, or async-signal-safe. So you couldn't guarantee
that any of the stuff in those functions would actually complete correctly
if the host program was doing its own threading and signal management.
You'd need to completely rewrite the SESSION layer to not "connect to
itself". You'd want a session context that pushed directly into the
messaging stack (thread-safely). Go digging in the Spread source, and
you'll see the overhaul would be pretty tremendous.
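
As a rough illustration of a session context that pushes straight into
an in-process messaging stack, here is a sketch using a mutex-protected
bounded queue. It is an assumption about how one might structure it, not
a description of Spread's internals. The names msg_queue, session_ctx,
mq_push, mq_pop and session_send are all made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

#define MQ_CAPACITY 64

/* Hypothetical inbound queue of the in-process messaging stack. */
typedef struct msg_queue {
    pthread_mutex_t lock;
    const void     *slots[MQ_CAPACITY];  /* message pointers, not copies */
    size_t          head;                /* index of the oldest message */
    size_t          count;               /* number of queued messages */
} msg_queue;

/* Hypothetical session: holds a pointer to the stack's queue instead of
 * a socket connected back to the same process. */
typedef struct session_ctx {
    msg_queue *stack_in;
} session_ctx;

/* Returns 0 on success, or a pthread error number. */
int mq_init(msg_queue *q)
{
    q->head = 0;
    q->count = 0;
    return pthread_mutex_init(&q->lock, NULL);
}

/* Enqueue a message pointer. Returns 0 on success, -1 if the queue is full. */
int mq_push(msg_queue *q, const void *msg)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < MQ_CAPACITY) {
        q->slots[(q->head + q->count) % MQ_CAPACITY] = msg;
        q->count++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Dequeue the oldest message pointer, or NULL if the queue is empty. */
const void *mq_pop(msg_queue *q)
{
    const void *msg = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        msg = q->slots[q->head];
        q->head = (q->head + 1) % MQ_CAPACITY;
        q->count--;
    }
    pthread_mutex_unlock(&q->lock);
    return msg;
}

/* A session hands the message directly to the stack's queue. */
int session_send(session_ctx *s, const void *msg)
{
    return mq_push(s->stack_in, msg);
}
```

Any number of threads can call session_send on sessions that share one
queue, since the mutex serializes access. Note that this is thread-safe
but not async-signal-safe, because pthread_mutex_lock must not be called
from a signal handler. Passing pointers rather than copying buffers is
also the simplest form of the zero-copy idea; the caller has to keep the
message alive until the stack has consumed it.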
> Does this sound feasible? It would obviously require some fairly major
> surgery, I'm just wondering if there's some reason it's not possible.
If you do surgery that dramatic, be prepared for a Frankenstein. I'd
recommend a rewrite based on the protocols and concepts. It would allow
a different messaging API to be used and allow much higher performance
as you could effectively do zero-copy messaging in many cases.
I would figure it would require a few days to hack up what you
described. However, I think it would be prone to problems. On the
other hand, a senior software engineer fluent in C could do a
complete rewrite in one month -- easily. Spread's only 26k lines of
code after all :-) If you wanted to just read the academic papers and
build a new implementation from scratch, I think you're looking at 2-3
months for one person or 1 month for 4 people.
This is on our todo list here at OmniTI (along with an epic laundry list
of enterprise enhancements).
--
// Theo Schlossnagle
// Principal Engineer -- http://www.omniti.com/~jesus/
// Ecelerity: Run with it. -- http://www.omniti.com/