[ejabberd] Cluster check problem
zbyszek at toliman.pl
Mon May 15 03:31:34 MSD 2006
There is a problem with clustered setups. I did not file a bug report
for it because I think we first need to discuss how to resolve it.
Problem: if you run an ejabberd cluster of two or more nodes and the
network that interconnects them (for example an internal network)
suffers a physical or other failure, each node acts as if the other
nodes had failed. But those nodes have not failed, and this leads to a
very strange situation.

For example, take a cluster of 2 nodes, A and B. A and B are connected
via a private network, and both have independent internet access. When
the private network fails, node A "thinks" that node B has failed and
removes B from the cluster. The same thing happens on B, which removes
A. Users can still connect to both nodes, but the nodes are no longer
in "sync". That can result in strange situations such as s2s issues, or
2 users connected to different nodes not seeing each other...

When the private network starts working again, the nodes still do not
"see" each other (you need to restart one of the nodes). I think that
must be fixed (maybe by checking at some interval whether a failed node
has come back to life?).
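
To make the A/B scenario concrete, here is a small toy model in Python. It is only a sketch of the behaviour described above, not ejabberd code: the Node class, its methods and the user names are all made up for illustration, and real session replication is certainly more involved.

```python
class Node:
    """Toy model of a cluster node (hypothetical, not ejabberd code)."""

    def __init__(self, name):
        self.name = name
        self.peers = set()      # nodes this node believes are alive
        self.sessions = {}      # user -> name of the node hosting the session

    def join(self, other):
        self.peers.add(other.name)
        other.peers.add(self.name)

    def peer_unreachable(self, peer_name):
        # The node cannot tell a dead peer from a dead link, so it drops
        # the peer and forgets every session hosted there.
        self.peers.discard(peer_name)
        self.sessions = {u: n for u, n in self.sessions.items()
                         if n != peer_name}

    def login(self, user, cluster):
        # The session is recorded locally and copied only to peers that
        # this node still considers part of the cluster.
        self.sessions[user] = self.name
        for node in cluster:
            if node.name in self.peers:
                node.sessions[user] = self.name

    def can_route_to(self, user):
        return user in self.sessions


a, b = Node("A"), Node("B")
a.join(b)
cluster = [a, b]

# The private network fails: each side sees the other as down.
a.peer_unreachable("B")
b.peer_unreachable("A")

# Both nodes still accept clients over their own internet links.
a.login("alice", cluster)
b.login("bob", cluster)

print(a.can_route_to("bob"))    # False
print(b.can_route_to("alice"))  # False
```

Both nodes keep serving users, but neither knows about the sessions on the other side, which is the "not in sync" state described above.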
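So there are 2 problems:
1) interconnection checks (a link failure is treated as a node failure,
   and the link is never checked again afterwards)
2) nodes that keep working but do not exchange information with each
   other (?)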
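
For the first point, the "check interval" idea could look roughly like the sketch below. It is written in Python only to show the logic. The function recheck_peers, the ping callable and the link_up table are hypothetical names, and how a recovered peer would actually be merged back into the cluster is left open here.

```python
def recheck_peers(configured, connected, ping):
    """Try to reach every configured peer that is currently considered down.

    Adds peers that answer back into `connected` and returns them as a set.
    """
    recovered = set()
    for peer in sorted(configured - connected):
        if ping(peer):
            connected.add(peer)
            recovered.add(peer)
    return recovered


# Simulated link state: the private network is down, then comes back.
link_up = {"B": False}

def ping(peer):
    # Stand-in for a real reachability check (assumption for this sketch).
    return link_up.get(peer, False)

configured = {"B"}   # peers node A is supposed to be clustered with
connected = set()    # peers node A currently sees

first = recheck_peers(configured, connected, ping)    # link still down
link_up["B"] = True
second = recheck_peers(configured, connected, ping)   # link restored
print(first, second, connected)
```

In a real setup this check would run on a timer on every node, so that a peer dropped because of a link failure is retried without restarting anything.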