[ejabberd] node interconnection - very rare issues, bug/not bug?

Zbyszek Żółkiewski zbyszek at toliman.pl
Thu Sep 14 02:13:06 MSD 2006


This is the second time I have experienced a strange situation with ejabberd
running on multiple nodes (3). I described it on bugtraq a few months ago.
If we have 3 nodes (ejd) that work as a cluster and are interconnected, and we
break the connection to, let's say, the first node for some period of time and
then restore it, the nodes will not reconnect. That leads to very
unpredictable behaviour because, for example, the load balancer "sees" all
nodes as working correctly and keeps directing connections to them, which
leads to strange situations such as wrong s2s negotiations and many others.

So my question is: is that a bug in ejabberd or in Erlang itself?

If it is in Erlang, is there any way to "force" nodes to periodically check
whether nodes that look offline (but are not) have come back to life?
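
To illustrate the kind of periodic check I mean, here is a minimal sketch. It
relies only on the standard net_adm:ping/1, nodes/0 and timer:sleep/1 calls;
the module name node_keeper and the node names in the usage line are made up
for illustration, and I have not tested it against ejabberd.

```erlang
-module(node_keeper).
-export([start/2, missing/2]).

%% Return the nodes from Expected that are not in Connected
%% (the local node is skipped, since nodes/0 never lists it).
missing(Expected, Connected) ->
    [N || N <- Expected, N =/= node(), not lists:member(N, Connected)].

%% Spawn a process that re-pings missing nodes every IntervalMs milliseconds.
start(Expected, IntervalMs) ->
    spawn(fun() -> loop(Expected, IntervalMs) end).

loop(Expected, IntervalMs) ->
    %% net_adm:ping/1 returns pong on success and pang on failure;
    %% a successful ping re-establishes the distribution connection.
    lists:foreach(fun(N) -> net_adm:ping(N) end,
                  missing(Expected, nodes())),
    timer:sleep(IntervalMs),
    loop(Expected, IntervalMs).
```

It would be started on each node with something like
node_keeper:start(['ejabberd@node1', 'ejabberd@node2', 'ejabberd@node3'], 30000).
This only restores the Erlang-level connection; whether ejabberd itself
recovers cleanly afterwards is a separate question.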

Zbyszek Żółkiewski
