[ejabberd] Really bad performance with ejabberd cluster

Badlop badlop at gmail.com
Fri Oct 19 14:34:05 MSD 2007


2007/10/19, Greg <greg at easyflirt.com>:
> OS: Debian Etch 4.0, ejabberd 1.1.3 patched with flash hack, TLS
> disabled, mysql native driver.

Could the flash hack be the problem? Did you try running your
experiment with the stock 1.1.3?

Could the problem be in MySQL, the MySQL driver, or ejabberd's
implementation of MySQL storage? You could rule this out by repeating
the experiment with the internal database.


> /usr/bin/erl -mnesia extra_db_nodes "['ejabberd@node2']" -pa
> /var/lib/ejabberd/ebin -s ejabberd -sname ejabberd -ejabberd config
> \"/etc/ejabberd/ejabberd.cfg\" log_path \"/dev/null\" -kernel inetrc
> \"/etc/ejabberd/inetrc\" -env ERL_MAX_PORTS 64000 +K true +P 256000 -smp
> +A 256

I guess you know that ERL_MAX_PORTS only raises Erlang's own limit on
ports; you still need to raise the operating system's limit on open
file descriptors (with ulimit -n or similar).
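
As a rough way to compare the two limits, here is a minimal Python
sketch. The helper name fd_limit_covers is made up for illustration,
and the script only reports the limit of the process it runs in, so it
would have to be started from the same shell and user that start erl.

```python
import resource  # Unix only

def fd_limit_covers(wanted_ports, soft_limit):
    """Return True if the soft open-file limit allows wanted_ports descriptors."""
    # RLIM_INFINITY means the operating system imposes no limit.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= wanted_ports

if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    wanted = 64000  # the ERL_MAX_PORTS value from the quoted command line
    print("soft limit:", soft, "hard limit:", hard)
    print("enough for", wanted, "ports:", fd_limit_covers(wanted, soft))
```

If the last line prints False, the OS limit rather than ERL_MAX_PORTS
is what caps the number of connections.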

This is the first time I have seen the option: +A 256
The Erlang documentation says:
''Sets the number of threads in async thread pool, valid range is
0-1024. Default is 0.''
Are you sure it is really required for a high-performance server?

The -smp Erlang feature is quite recent; did you try disabling it? You
use Erlang/OTP R11B-2, which is almost a year old. Maybe some features
(like SMP) were not yet well optimized at that time.


> node1 reach 3500 connected users max.
> node2 reach 2000 connected users max.

Just out of curiosity: what software do you use to generate the load?
Jabsimul?


> node1 have only connected users, no messages or really a few.
> With 4500 users, both nodes use 250% CPU (of 400).

Which system process is using that CPU: 'beam'?

Does that process's CPU usage remain stable over several minutes?
If 5000 clients try to log in at almost the same time, the server will
obviously become a bottleneck, and will eventually return to a stable
state. In the real world, clients do not all connect at the same
instant: did you try connecting clients in small batches?
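
To make the idea of small batches concrete, here is a minimal Python
sketch of a login schedule. The function name login_batches and the
batch size and pause used below are hypothetical; this is not a
feature of Jabsimul or of any other load generator.

```python
def login_batches(total_clients, batch_size, pause_seconds):
    """Yield (start_time_in_seconds, client_ids) for each batch of logins."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for n, first in enumerate(range(0, total_clients, batch_size)):
        last = min(first + batch_size, total_clients)
        # Batch n starts n pauses after the first one.
        yield n * pause_seconds, range(first, last)

if __name__ == "__main__":
    # Hypothetical numbers: 5000 clients, 100 per batch, 2 seconds apart.
    for start, clients in login_batches(5000, 100, 2):
        print("at t=%ds log in clients %d..%d" % (start, clients[0], clients[-1]))
```

A load generator would wait until each start time before opening that
batch's connections, so the logins are spread over the whole run
instead of arriving together.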

Are the rosters filled with items, or are they empty?

Did you check memory consumption? Maybe a bug in ejabberd or
Erlang/OTP consumes all of it, however much you have. Does memory
consumption grow linearly with the number of logged-in users in your
experiment?
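
One simple way to answer the linearity question is to look at the
memory cost of each additional user between consecutive measurements.
The sketch below is only an illustration: the function name
bytes_per_user is made up, and the samples are whatever pairs of
(connected users, memory in bytes) you record during the experiment.

```python
def bytes_per_user(samples):
    """Return the memory cost per extra user between consecutive samples.

    samples is a list of (connected_users, memory_bytes) pairs,
    ordered by increasing number of connected users.
    """
    costs = []
    for (u0, m0), (u1, m1) in zip(samples, samples[1:]):
        if u1 == u0:
            continue  # no new users between these two samples
        costs.append((m1 - m0) / (u1 - u0))
    return costs
```

If the returned values stay roughly constant, memory grows linearly
with the number of users; values that keep rising point to something
worse than linear.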

Did you check disk activity? I guess there shouldn't be any at all,
since all the important data is available in memory.


> What's the hell ??!!!
> I need to be able to reach at least 20000 users, benchmarks says that
> with only 1 node i will be able to connect 50000 users !

Which benchmarks are you referring to?
I only remember http://www.ejabberd.im/benchmark
and it only mentions 3,000 concurrent users with high activity on a
low-power machine.


Did you check the ejabberd log files (ejabberd.log and sasl.log)?
Maybe they report errors or warnings that explain ejabberd's behavior
in your experiment.
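
For a quick overview of what the logs contain, here is a minimal
Python sketch that counts report headers by kind. It assumes the logs
use the usual Erlang report header lines such as "=ERROR REPORT====";
the function name count_reports is made up, and the pattern may need
adjusting if your files look different.

```python
import re
from collections import Counter

# Assumed header format: "=ERROR REPORT==== <date> ===" at the start of a line.
REPORT_HEADER = re.compile(r"^=([A-Z ]+) REPORT====")

def count_reports(lines):
    """Count report headers by kind (ERROR, INFO, CRASH, ...)."""
    counts = Counter()
    for line in lines:
        match = REPORT_HEADER.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file to count_reports gives the totals; a large
number of ERROR or CRASH reports during the load test would be the
first thing to read in detail.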

