[ejabberd] ejabberd_router blocking

Badlop badlop at gmail.com
Thu Dec 3 23:00:35 MSK 2009


2009/12/1 Andy Skelton <skeltoac at gmail.com>:

I've investigated the ejabberd code with Alexey Shchepin. Here is the summary:

ejabberd_router.erl is the main stanza router, and there are two
methods to send it a stanza:
A) call the function ejabberd_router:route/3
B) send an Erlang message to the Erlang process 'ejabberd_router'.

> When the memory problem occurs there is
> only one process that is eating RAM: ejabberd_router. It builds up a
> huge message queue which requires gigabytes of RAM.

> the
> filter_packet hook blocks ejabberd_router. If anything on that hook
> ever gets slow the entire router queue will wait. It must be one of my
> packet filters blocking the router and causing the pile-up.
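
To confirm this symptom on a running node, the length of the router's
message queue can be read from an Erlang shell. The following is only a
minimal sketch: the module name queue_check and the function queue_len/1
are hypothetical helpers, not part of ejabberd. It relies only on the
standard whereis/1 and erlang:process_info/2 functions.

```erlang
-module(queue_check).
-export([queue_len/1]).

%% Return the number of messages waiting in the mailbox of a
%% registered process, or 'undefined' if no such process exists.
%% (Hypothetical helper, not an ejabberd function.)
queue_len(Name) when is_atom(Name) ->
    case whereis(Name) of
        undefined ->
            undefined;
        Pid ->
            case erlang:process_info(Pid, message_queue_len) of
                {message_queue_len, Len} -> Len;
                undefined -> undefined   % process exited in the meantime
            end
    end.
```

For example, queue_check:queue_len(ejabberd_router) evaluated on the
ejabberd node should return a number that keeps growing while the router
is blocked.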

As you noticed, method B funnels every stanza through a single process
for the whole ejabberd node, so anything slow there blocks all routing.
That is a bad idea. Your initial solution was to parallelize method B.

Method A blocks only the calling process, which may be a c2s process
associated with a client session, an s2s process associated with a
remote server, and so on. That is parallelized by design.

Consequently, using method A is preferable to using method B.
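
The difference can be illustrated with a toy model. This is a sketch
under stated assumptions, not ejabberd code: the module router_demo, the
function slow_filter/1 (a stand-in for a slow filter_packet hook) and
the message shapes are all made up for the illustration. via_process/2
mimics method B (one process does all the work), via_call/2 mimics
method A (each sender does the work itself). Both return the elapsed
time in milliseconds.

```erlang
-module(router_demo).
-export([via_process/2, via_call/2]).

%% Hypothetical stand-in for a slow filter_packet hook.
slow_filter(DelayMs) ->
    timer:sleep(DelayMs),
    ok.

%% Method B style: all N stanzas are posted to one router process,
%% so the slow filter runs N times one after another.
via_process(N, DelayMs) ->
    Parent = self(),
    Router = spawn(fun() -> router_loop(Parent, DelayMs) end),
    Elapsed = time_ms(fun() ->
        lists:foreach(fun(I) -> Router ! {route, I} end, lists:seq(1, N)),
        wait_done(N)
    end),
    Router ! stop,
    Elapsed.

router_loop(Parent, DelayMs) ->
    receive
        {route, I} ->
            slow_filter(DelayMs),
            Parent ! {done, I},
            router_loop(Parent, DelayMs);
        stop ->
            ok
    end.

%% Method A style: each of the N sender processes runs the filter
%% itself, so the delays overlap instead of adding up.
via_call(N, DelayMs) ->
    Parent = self(),
    time_ms(fun() ->
        lists:foreach(
            fun(I) ->
                spawn(fun() ->
                    slow_filter(DelayMs),
                    Parent ! {done, I}
                end)
            end,
            lists:seq(1, N)),
        wait_done(N)
    end).

%% Wait until N 'done' notifications have arrived.
wait_done(0) -> ok;
wait_done(N) ->
    receive {done, _} -> wait_done(N - 1) end.

%% Run Fun and return how long it took, in milliseconds.
time_ms(Fun) ->
    T0 = erlang:monotonic_time(millisecond),
    Fun(),
    erlang:monotonic_time(millisecond) - T0.
```

On an otherwise idle machine, router_demo:via_process(N, DelayMs) should
take roughly N * DelayMs, while router_demo:via_call(N, DelayMs) should
take close to DelayMs, which is the pile-up described above in
miniature.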


> Our throughput is almost all pubsub events.

Method A is commonly used, but there are still some instances of method B:
$ git grep "ejabberd_router:route(" | wc -l
235
$ git grep "ejabberd_router \! {route" | wc -l
19

There are 18 in mod_pubsub, and one in mod_vcard.
It's very easy to change those to method A: just replace
  ejabberd_router ! {route, From, To, Stanza}
with:
  ejabberd_router:route(From, To, Stanza)

I'll check with Christophe Romain whether it's completely safe
to make those changes in mod_pubsub.

Tracked in: https://support.process-one.net/browse/EJAB-1114

For testing, you can make those changes yourself, or wait for a patch.


---
Badlop
ProcessOne

