[ejabberd] Clustering question...

Felix GV felix at mate1inc.com
Tue May 1 23:10:00 MSK 2012


Hello :)

I have a question about clustering. Here are the details of my current
setup:

   - I use ejabberd 2.1.10 on Debian squeeze, 64-bit.
   - I have two nodes (jabber1.dev and jabber2.dev) serving the jabber
   domain called jabber.dev.
   - jabber.dev is set up with a simple round-robin DNS that distributes
   traffic evenly between jabber1.dev and jabber2.dev.
      - When I ping jabber.dev repeatedly, I do get the two IP addresses
      half of the time each, so that seems to work.
      - I can connect to my jabber domain and I can chat.
   - I pretty much followed this guide to set up the mnesia table
   replication:
   http://lists.jabber.ru/pipermail/ejabberd/2009-December/005535.html
      - Basically, I started off with every table on jabber2.dev set to
      Remote copy, which I then changed in the following way:
         - Every table on jabber1 that was RAM only became RAM only on
         jabber2.
         - Every table on jabber1 that was RAM and disc became RAM and disc
         on jabber2.
         - Every table on jabber1 that was disc only remained as a Remote
         copy on jabber2.
         - I left all of the muc tables in Remote copy.
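
For reference, the storage rule described in the list above can be written down as a small lookup. This is only a sketch of the rule as stated in this message, not of anything ejabberd or Mnesia does by itself. The storage names ram_copies, disc_copies and disc_only_copies are Mnesia's own; the "remote" label, the function name, the "muc" prefix test and the table names in the usage note are assumptions made for illustration.

```python
# Mnesia storage types, as Mnesia itself names them.
RAM_COPIES = "ram_copies"              # shown as "RAM copy" in the web admin
DISC_COPIES = "disc_copies"            # shown as "RAM and disc copy"
DISC_ONLY_COPIES = "disc_only_copies"  # shown as "Disc only copy"

# Hypothetical label meaning "no local copy on this node" (Remote copy).
REMOTE = "remote"


def jabber2_storage(table, jabber1_storage):
    """Return the storage type chosen on jabber2 for one table.

    Assumes MUC tables can be recognised by a "muc" name prefix.
    """
    if table.startswith("muc"):
        return REMOTE                  # all muc tables were left as Remote copy
    if jabber1_storage == RAM_COPIES:
        return RAM_COPIES              # RAM only stays RAM only
    if jabber1_storage == DISC_COPIES:
        return DISC_COPIES             # RAM and disc stays RAM and disc
    if jabber1_storage == DISC_ONLY_COPIES:
        return REMOTE                  # disc only stays a Remote copy
    raise ValueError("unknown storage type: %r" % (jabber1_storage,))
```

For example, jabber2_storage("some_table", DISC_ONLY_COPIES) gives "remote", which means that under this rule jabber2 holds no local copy of any table that is disc-only on jabber1.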

Now, my problem is that I do not really get any additional fault-tolerance
from this setup.

When I bring down jabber2.dev, I can still connect and chat to the
jabber.dev domain. But when I bring down jabber1.dev, I get disconnected
and cannot reconnect (even though jabber2.dev is still up).

I see two things that are weird:

   - I can bring up the web admin interface using: jabber.dev:5280/admin as
   well as jabber1.dev:5280/admin but not jabber2.dev:5280/admin
      - I tried to connect repeatedly (and always successfully) on
      jabber.dev, and I know that I should end up on jabber2.dev half of the
      time thanks to the DNS round robin, so I assume that the admin web
      interface must be working on jabber2.dev. But when I access it
      directly it doesn't work (it's not an authentication problem, the page
      doesn't come up at all)...
   - On jabber2.dev, I can see the jabber process when I do "ps aux | grep
   jabber", but when I do sudo ./ejabberdctl status I get:

The node ejabberd@jabber2 is started with status: started
ejabberd is not running in that node
Check for error messages: /opt/ejabberd-2.1.10/logs/ejabberd.log
or other files in that directory.

Commands to start an ejabberd node:
  start  Start an ejabberd node in server mode
  debug  Attach an interactive Erlang shell to a running ejabberd node
  live   Start an ejabberd node in live (interactive) mode

Whereas on jabber1.dev, the same command gives me:

The node ejabberd@jabber1 is started with status: started
ejabberd 2.1.10 is running in that node
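
One way to separate the DNS question from the listener question is to probe the admin port on each node directly instead of going through jabber.dev. The following is a minimal sketch, assuming only that port 5280 is the web admin port as used above; the function names port_open and check_hosts are made up for this example.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_hosts(hosts, port):
    """Map each host name to whether its port accepts TCP connections."""
    return {host: port_open(host, port) for host in hosts}
```

Calling check_hosts(["jabber1.dev", "jabber2.dev"], 5280) would show whether anything is listening on 5280 on each node. A False for jabber2.dev would fit with ejabberdctl reporting that ejabberd is not running in that node, even though the Erlang process shows up in ps.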

Is there someone that can help me debug this problem...?

I can provide more information if the above is not sufficient...!

Thanks a lot :) !!

--
Felix