[ejabberd] ejabberd crashed with emfile; how to diagnose

Jonas Ådahl jadahl at gmail.com
Fri Apr 20 12:08:41 MSK 2012


On Tue, Apr 17, 2012 at 12:07 AM, Daniel Dormont
<dan at greywallsoftware.com> wrote:
> Hello all,
>
> Today my ejabberd crashed quite suddenly with the following error:
>
> =ERROR REPORT==== 2012-04-16 15:59:31 ===
> Mnesia('ejabberd@10.0.0.100'): ** ERROR ** (could not write core file: emfile)
>  ** FATAL ** Cannot open log file "/var/lib/ejabberd/roster.DCL":
> {file_error,"/var/lib/ejabberd/roster.DCL",emfile}
>
> =ERROR REPORT==== 2012-04-16 15:59:32 ===
> ** Generic server ejabberd_mod_muc_sostest terminating
> ** Last message in was {mnesia_system_event,
>                            {mnesia_down,'ejabberd@10.0.0.100'}}
> ** When Server state == {state,"conference.sostest","sostest",
>                                {muc,muc_admin,muc_admin,muc},
>                                20,[],none}
> ** Reason for termination ==
> ** {badarg,[{ets,lookup,
>                  [local_config,
>                   {domain_balancing_component_number,"conference.sostest"}]},
>             {ejabberd_config,get_local_option,1},
>             {ejabberd_router,get_component_number,1},
>             {ejabberd_router,unregister_route,1},
>             {mod_muc,terminate,2},
>             {gen_server,terminate,6},
>             {proc_lib,init_p_do_apply,3}]}
>
> =INFO REPORT==== 2012-04-16 15:59:32 ===
>     application: mnesia
>     exited: shutdown
>     type: permanent
>
> =ERROR REPORT==== 2012-04-16 15:59:32 ===
>     application_master: shutdown_error
>     ejabberd_app: {prep_stop,[[]]}
>     error_info: {badarg,[{ets,lookup,[config,hosts]},
>                          {ejabberd_config,get_global_option,1},
>                          {ejabberd_app,stop_modules,0},
>                          {ejabberd_app,prep_stop,1},
>                          {application_master,prep_stop,2},
>                          {application_master,loop_it,4}]}
>
>
> Followed by a whole bunch of knock-on errors in various sessions,
> connections and other processes due to being unable to read various mnesia
> tables.
>
> It's hard for me to guess what could have caused this. The node was running
> in a two-node cluster on Linux, and should have had at most 20-30 actual
> user sessions, a couple of hundred mostly idle MUC processes, and two
> ejabberd_service connections. Each MUC should have been running an
> instance of mod_muc_log, but that doesn't keep any filehandles open as far
> as I can see.
>
> So I have three questions:
>
> 1) Has anyone else encountered this crash type with ejabberd, and what was
> the root cause?
> 2) Is there any other place I should look for more information about the
> crash? I don't see anything that looks like a dump file in
> /var/log/ejabberd, but I can look further.
> 3) The node restarted just fine, but is there any way I can monitor it to
> see what the open files are looking like in a live manner so I can prevent
> this in the future?
>

Hi

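The error "emfile" is the POSIX error EMFILE, which means "Too many open
files"[0]: the ejabberd process hit its limit on open file descriptors
(on Linux, sockets count against that limit as well as regular files).
You can raise the limit for the user running ejabberd by editing the
file /etc/security/limits.conf[1]. See the "nofile" item.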
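
As for watching the open files on a live node: on Linux you can compare
the number of entries in /proc/<pid>/fd with the "Max open files" row of
/proc/<pid>/limits. Below is a minimal sketch of that idea in Python,
not a tested monitoring tool. The function names and the 80% warning
threshold are my own assumptions, and you have to supply the pid of the
Erlang VM yourself (the process is usually called beam or beam.smp). It
must be run as root or as the user that owns the process.

```python
import os
import sys


def count_open_fds(pid):
    """Count the file descriptors a process has open (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def parse_soft_nofile(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns an int, or None if the row is missing or the limit is unlimited.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max" "open" "files" <soft> <hard> <units>
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def soft_nofile_limit(pid):
    """Read the soft open-files limit that applies to a running process."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_soft_nofile(f.read())


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: fdwatch.py <pid>")  # hypothetical script name
    pid = int(sys.argv[1])
    used = count_open_fds(pid)
    limit = soft_nofile_limit(pid)
    print(f"pid {pid}: {used} open file descriptors, soft limit {limit}")
    # 80% is an arbitrary threshold chosen for illustration.
    if limit is not None and used > 0.8 * limit:
        print("WARNING: more than 80% of the open-files limit is in use")
```

Running this from cron and logging the output should show whether the
count creeps up over time (a leak) or whether the limit is simply too
low for the normal load.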

Jonas

[0] http://www.erlang.org/doc/man/file.html
[1] http://linux.die.net/man/5/limits.conf

