[Spread-users] scalability question involving groups...

Yair Amir yairamir at cnds.jhu.edu
Fri Jan 30 14:21:05 EST 2004


Hi Greg,

This is actually interesting.

The basic group search algorithms in Spread are very scalable.
We tested the data structures with millions of groups and got
very good results for lookup, which is the costly operation
performed for every message. This means that the cost of processing
a message stayed almost the same whether you had 2 groups or a million
groups. (We use skip lists, so the complexity is logarithmic.)
This assumes that you have the needed memory on the machine.
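
To illustrate why lookup cost barely grows with the number of groups,
here is a minimal skip list sketch in Python. It is not Spread's actual
implementation (Spread is written in C); the class and method names
(SkipList, insert, lookup, steps) are made up for this example.

```python
import random


class _Node:
    def __init__(self, key, value, height):
        self.key = key
        self.value = value
        # forward[i] is the next node at level i (None at the end).
        self.forward = [None] * height


class SkipList:
    """Illustrative skip list keyed by group name (hypothetical, not Spread's code)."""

    MAX_HEIGHT = 24  # enough levels for millions of keys at p = 0.5

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_HEIGHT)
        self._height = 1
        self.steps = 0  # forward moves made by the most recent lookup

    def _random_height(self):
        # Each extra level is kept with probability 1/2.
        h = 1
        while h < self.MAX_HEIGHT and self._rng.random() < 0.5:
            h += 1
        return h

    def insert(self, key, value):
        # update[i] is the last node at level i whose key is < key.
        update = [self._head] * self.MAX_HEIGHT
        node = self._head
        for level in range(self._height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
            update[level] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            nxt.value = value  # key already present: overwrite
            return
        h = self._random_height()
        self._height = max(self._height, h)
        new = _Node(key, value, h)
        for level in range(h):
            new.forward[level] = update[level].forward[level]
            update[level].forward[level] = new

    def lookup(self, key):
        # Start at the top level and drop down whenever the next key is too large.
        self.steps = 0
        node = self._head
        for level in range(self._height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
                self.steps += 1
        node = node.forward[0]
        if node is not None and node.key == key:
            return node.value
        return None
```

After inserting N group names, a lookup is expected to make a number of
forward moves proportional to log N, which you can observe by reading
the steps attribute after each call to lookup.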

However, if you have 700K groups, then you will need 700K messages
just to have each group joined by one member. That's a lot of overhead
to set up the system (it will probably take about 2 minutes to
set up!!!).
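
As a back-of-the-envelope check, the estimate above implies a join rate
of a few thousand per second. The sketch below only restates that
arithmetic; the rate is derived from the rough 2 minute figure, not
measured, and the function names are made up for this example.

```python
def implied_join_rate(groups, setup_seconds):
    """Joins per second implied by a total setup time (one join message per group)."""
    return groups / setup_seconds


def estimated_setup_seconds(groups, joins_per_second):
    """Setup time if every group needs one join message at a fixed rate."""
    return groups / joins_per_second


# 700K groups in roughly 2 minutes (120 s), per the estimate above.
rate = implied_join_rate(700_000, 120)

# Assumption: the same rate holds at other sizes, which is not guaranteed.
one_million = estimated_setup_seconds(1_000_000, rate)

print("implied rate: %.0f joins/s" % rate)
print("1M groups at that rate: %.0f s" % one_million)
```

The real figure depends on hardware, network, and daemon configuration,
so treat these numbers as an order-of-magnitude guide only.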

Also, if your network configuration is not stable, then every time
a daemon membership change adds daemons or merges with other
daemons, the group table is re-sent (so that the merged daemons
have the same group table state). Assuming every group has at
least one member, that costs you another 2 minutes each time.
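
To get a feel for how an unstable network amplifies this cost, the
sketch below computes the fraction of time spent re-sending the group
table for a given rate of daemon membership changes. The 120 second
default comes from the rough estimate above; the function name and the
assumption that re-sends do not overlap are mine.

```python
def resend_overhead(changes_per_hour, resend_seconds=120.0):
    """Fraction of each hour spent re-sending the group table (capped at 1.0).

    Assumes each daemon membership change triggers one full re-send and
    that re-sends never overlap.
    """
    return min(1.0, changes_per_hour * resend_seconds / 3600.0)


for changes in (1, 6, 30):
    print("%2d changes/hour -> %.1f%% of the time re-sending"
          % (changes, 100.0 * resend_overhead(changes)))
```

Under these assumptions, one membership change per hour costs about 3%
of the time, while 30 changes per hour would leave no time for anything
else.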

Cheers,

       :) Yair.


On Friday, January 30, 2004 1:12 PM
Greg Shebert gshebert at efs-us.com wrote:

Greg> reading through the documentation on spread, i noticed that it mentions
Greg> that spread should be able to handle lots of different groups...
Greg> thousands even...

Greg> my question is how far that will scale?

Greg> can spread handle hundreds of thousands of groups?

Greg> what issues would one face in attempting to handle this many groups? 

Greg> i'm considering creating an event notification system using spread that
Greg> is driven off of a database (or several)... i plan to use the key values
Greg> of each record in the database to define the group that updates will be
Greg> sent on...

Greg> in this manner, clients can get updates only for records that they are
Greg> interested in by joining a group using the key of the record in question
Greg> as the group to join...

Greg> the table could have upwards of 700k records...

Greg> i know there are LOTS of other ways to do this... i was just wondering
Greg> if i can get some filtering for free by using spread's groups as my
Greg> event filtering mechanism?

Greg> thanks
Greg> -greg-



Greg> _______________________________________________
Greg> Spread-users mailing list
Greg> Spread-users at lists.spread.org
Greg> http://lists.spread.org/mailman/listinfo/spread-users





