[ejabberd] Notes on ejabberd learning

comain chen comain at gmail.com
Sun Jan 15 17:28:03 MSK 2012

I am investigating how ejabberd works, and I think the documentation is
rather limited. Here are my learning notes; I hope that someone can
help clarify any misunderstandings.

#1. How does ejabberd distribute incoming client connections?

Typically we use DNS balancing to choose one ejabberd node for an
incoming client. There is a cluster module that takes care of the
whole cluster of nodes, and when a new node joins it redistributes
existing c2s connections to the new node (ejabberd_c2s:migrate).
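
To make the redistribution idea concrete, here is a minimal sketch in
Python (not ejabberd code). It assumes each session is owned by the
node picked by a stable hash of its jid, so that adding a node changes
the owner of some sessions. The names owner_node and plan_migration
are hypothetical, and the real ejabberd_c2s:migrate logic may well
differ.

```python
import hashlib


def owner_node(jid, nodes):
    """Pick the owning node for a jid with a stable hash.

    Illustrative only: this is NOT ejabberd's hash function.
    """
    digest = hashlib.sha1(jid.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]


def plan_migration(jids, old_nodes, new_nodes):
    """Return {jid: (from_node, to_node)} for sessions whose owner changes."""
    moves = {}
    for jid in jids:
        src = owner_node(jid, old_nodes)
        dst = owner_node(jid, new_nodes)
        if src != dst:
            moves[jid] = (src, dst)
    return moves


if __name__ == "__main__":
    jids = ["user%d@example.com" % i for i in range(10)]
    old = ["node1", "node2"]
    new = ["node1", "node2", "node3"]
    for jid, (src, dst) in sorted(plan_migration(jids, old, new).items()):
        print(jid, src, "->", dst)
```

With plain modulo hashing like this, sessions can also move between
the old nodes when the node count changes, not only onto the new one.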

What happens when an existing node goes down?
I believe all clients connected to that node will be disconnected.
There seems to be some basic infrastructure work trying to fix this
(i.e. the frontend/backend node work), but it is not finished yet.

#2. How can a client connected to node1 talk to a client connected to node2?
The process is as follows:
1) c2sA queries global_route and route to find the sm node smA'
handling this *virtual host*
2) smA' finds the sm node smB handling client B by hashing the jid
(ejabberd_sm:do_hash, because clients are distributed as described in #1)
3) smB routes the message to c2sB by querying its local session table
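
The three steps above can be sketched roughly as follows. This is a
toy model in Python under my own assumptions, not ejabberd code: the
route table is a plain dict from virtual host to node, the jid hash is
crc32 modulo the node count (ejabberd_sm:do_hash may work
differently), and Node, connect and route are hypothetical names.

```python
import zlib


class Node:
    def __init__(self, name):
        self.name = name
        self.sessions = {}  # local session table: jid -> c2s handle


def sm_node_for(jid, nodes):
    # Stable hash of the jid picks the sm node (illustrative hash only).
    return nodes[zlib.crc32(jid.encode("utf-8")) % len(nodes)]


def connect(nodes, jid):
    # Register the session on the node that the jid hash selects.
    node = sm_node_for(jid, nodes)
    node.sessions[jid] = "c2s:" + jid
    return node


def route(routes, nodes, to_jid):
    host = to_jid.split("@", 1)[1].split("/", 1)[0]
    entry = routes[host]                # step 1: sm node for the virtual host
    owner = sm_node_for(to_jid, nodes)  # step 2: sm node owning the target jid
    c2s = owner.sessions.get(to_jid)    # step 3: local session table lookup
    return entry.name, owner.name, c2s


if __name__ == "__main__":
    nodes = [Node("node1"), Node("node2")]
    routes = {"example.com": nodes[0]}
    connect(nodes, "b@example.com")
    print(route(routes, nodes, "b@example.com"))
```

A lookup for a jid with no session returns None as the last element,
which is where offline handling would start in a real server.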

Why do we need global route?
I do not see the point of having both route and global route. I can
see the need for a route module to handle the jabberd component
protocol, but what does global route add?

#3. The storage layer
ets tables: These tables are cluster unaware and in memory (so they
must hold volatile, non-global data).

mnesia tables: Some tables are ram backed and some are disc backed.
Some even have local-content turned on (and are thus cluster unaware).
Since mnesia tables have a maximum size limit (without sharding), data
stored in these tables must be of limited size.

gen_storage tables: Some modules access data via the gen_storage API,
and the underlying database can be configured to use either external
SQL servers or the internal mnesia system. Data in these modules tends
to be long-lived and large, such as roster-group.

1. What's the difference between an ets table and a mnesia ram-backed
table with local-content on?

2. How can mnesia nodes make up a cluster?
From what I have learned, we first have to set up one node as the
mnesia master, and all other nodes start up as slaves by default (they
don't store data on their own but query the master node instead). After
this we have to manually configure which tables need to be replicated
on which nodes.

3. How can mnesia data be sharded?
I know mnesia has a 2G max data file limit per table, so to support a
large amount of data we have to shard the data across tables. Mnesia
supports this, but I don't see the ejabberd code using it. Does this
mean ejabberd with mnesia currently cannot support large data? In a
production service, do we have to configure gen_storage to use an
external SQL service instead?
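
To illustrate what sharding across tables means for the 2G limit, here
is a small Python sketch. It only shows the arithmetic and the idea of
hashing a key to a fragment. The names fragments_needed and
fragment_for are hypothetical, and the crc32-modulo choice is my
assumption; mnesia's own fragmentation uses its own hashing scheme.

```python
import zlib

# The 2G per-table data file limit mentioned above.
FRAGMENT_LIMIT_BYTES = 2 * 1024 ** 3


def fragments_needed(total_bytes, limit=FRAGMENT_LIMIT_BYTES):
    """Smallest number of fragments so that an even split fits the limit."""
    return max(1, -(-total_bytes // limit))  # ceiling division


def fragment_for(key, n_fragments):
    """Map a record key to a fragment index (illustrative hash only)."""
    return zlib.crc32(key.encode("utf-8")) % n_fragments


if __name__ == "__main__":
    n = fragments_needed(5 * 1024 ** 3)  # 5G of data
    print("fragments:", n)
    print("fragment for a@example.com:", fragment_for("a@example.com", n))
```

This assumes keys spread evenly; a skewed key distribution could still
push a single fragment past the limit.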

Thank you!
Best regards,
