[ejabberd] Clustered servers not reconnecting

Matthew Harrell lists-sender-c29da0 at bittwiddlers.com
Sun Dec 10 06:27:10 MSK 2006


Hi. I have two ejabberd 1.1.2 servers clustered together, following the
documentation I found on the website and in the manual. Things work fine
for a while, but it seems that every night (possibly when they're doing
nightly maintenance) the servers disconnect from each other. The main
one logs this

=ERROR REPORT==== 2006-12-05 00:16:45 ===
** Node ejabberd@server2 not responding **
** Removing (timedout) connection **

and the backup logs this

=ERROR REPORT==== 2006-12-05 00:16:54 ===
** Node ejabberd@server1 not responding **
** Removing (timedout) connection **

While it's possible the VPN between them is having a hiccup at that
point, I know it's not going down for long because no other processes
complain at all. When I connect to the erl session on server2, it is able
to ping server1 without a problem (and vice versa), and when I restart
server2 everything is fine again for a while, although I do get this message

=ERROR REPORT==== 2006-12-05 10:29:30 ===
Mnesia(ejabberd@server1): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, ejabberd@server2}

So, is there a way I can find out what's causing the problem in the
first place?

And, is there a way I can get mnesia / ejabberd to continue to attempt
the connection until it reconnects?
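
To make the second question concrete, here is a minimal sketch of the kind
of retry loop meant, assuming that net_adm:ping/1 is enough to re-establish
the Erlang distribution link. The module name reconnect_sketch, the node
name 'ejabberd@server1' and the 30 second interval are hypothetical and
only for illustration. This is not something ejabberd ships, and it would
not by itself clear the inconsistent_database state reported by Mnesia.

```erlang
%% Hypothetical helper, not part of ejabberd: keeps pinging a peer node
%% so that the distribution connection is retried after a timeout.
-module(reconnect_sketch).
-export([start/2, loop/2]).

%% Spawn the retry loop; Node is an atom such as 'ejabberd@server1',
%% IntervalMs is the pause between attempts in milliseconds.
start(Node, IntervalMs) ->
    spawn(?MODULE, loop, [Node, IntervalMs]).

loop(Node, IntervalMs) ->
    case net_adm:ping(Node) of
        pong ->
            %% Peer is reachable and connected; nothing to do.
            ok;
        pang ->
            %% Peer is unreachable; log it and try again next round.
            error_logger:warning_msg("Node ~p not reachable, will retry~n",
                                     [Node])
    end,
    timer:sleep(IntervalMs),
    ?MODULE:loop(Node, IntervalMs).
```

From the erl session on server2 it could be started with
reconnect_sketch:start('ejabberd@server1', 30000). Whether Mnesia then
resynchronizes without restarting one of the nodes is exactly what I am
unsure about.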

-- 
  Matthew Harrell                          Nondeterminism means never
  Bit Twiddlers, Inc.                       having to say you are wrong.
  mharrell at bittwiddlers.com     


