[ejabberd] ejabberd mod_muc bottleneck

Evgeniy Khramtsov xramtsov at gmail.com
Thu Jul 1 07:54:50 MSD 2010


01.07.2010 12:40, Karthik Kailash wrote:
> *Finally I found the source of ejabberd_router:route's occasional extreme
> slowness is a single mnesia call in ejabberd_sm:do_route.*  The line is:
> mnesia:dirty_index_read(session, USR,  #session.usr).  Sampling using
> timer:tc every few thousand messages, I found that 80-90% of the time, this
> call is fast (<  100 us), but the remaining 10-20% it takes tens of ms to
> complete.  This is exactly the same as the ejabberd_router:route behavior.
>
> So I'm pretty certain I've narrowed the bottleneck problem down to this
> line.  However, the session table is in RAM and there is an index on the usr
> field, so I can't figure out why this problem is happening!
>
> Any thoughts or ideas?

Hello, Karthik. First of all, your investigation is interesting :)
However, I cannot reproduce the issue. This is how I tried to reproduce it:
1) Connect with 2 clients and join a MUC conference.
2) Start eprof on the room pid.
3) Route messages from one of the connected users to the room from the
Erlang shell:
 > lists:foreach(
       fun(_) ->
           ejabberd_router:route(
               jlib:make_jid("user", "domain.com", "resource"),
               jlib:make_jid("room", "conference.domain.com", ""),
               {xmlelement, "message", [{"type", "groupchat"}],
                [{xmlelement, "body", [], [{xmlcdata, "a"}]}]})
       end,
       lists:seq(1, 1000)).
4) Analyze eprof data.
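
For anyone repeating steps 2) to 4), here is a minimal sketch of how they could be wrapped into one call. The module name eprof_sketch and the functions profile_pid/2 and wait_idle/1 are hypothetical helpers, not part of ejabberd. It assumes a recent OTP release where eprof:analyze/1 is available, and it treats an empty mailbox as a rough sign that the profiled process has finished, which is only an approximation.

```erlang
-module(eprof_sketch).
-export([profile_pid/2]).

%% Profile process Pid while Work() runs, then print eprof's totals.
%% In the steps above, Pid would be the MUC room pid and Work the fun
%% that calls ejabberd_router:route/3 in a loop.
profile_pid(Pid, Work) when is_pid(Pid), is_function(Work, 0) ->
    {ok, _} = eprof:start(),
    profiling = eprof:start_profiling([Pid]),
    try
        Work(),
        %% Routing is asynchronous, so give Pid time to drain its mailbox.
        wait_idle(Pid)
    after
        eprof:stop_profiling(),
        eprof:analyze(total),
        eprof:stop()
    end.

%% Poll until Pid's message queue is empty (assumption: an empty queue
%% is good enough as an "idle" signal for this experiment).
wait_idle(Pid) ->
    case erlang:process_info(Pid, message_queue_len) of
        {message_queue_len, N} when N > 0 ->
            timer:sleep(10),
            wait_idle(Pid);
        _ ->
            %% Small grace period for the message currently being handled.
            timer:sleep(50),
            ok
    end.
```

With RoomPid bound to the room process and Work bound to a fun wrapping the lists:foreach call from step 3, this would be invoked as eprof_sketch:profile_pid(RoomPid, Work).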

According to your report, mod_muc_room:process_groupchat_message (and
maybe some mnesia functions) should be at the top of the eprof output,
but I don't see this.
On the other hand, given your symptoms, I looked into the mnesia
sources and found that the dirty functions call mnesia:do_dirty_rpc/5,
which in turn calls rpc:call. Here is a snippet:
do_dirty_rpc(Tab, Node, M, F, Args) ->
    case rpc:call(Node, M, F, Args) of
        {badrpc, Reason} ->
            timer:sleep(20), %% Do not be too eager, and can't use yield on SMP
            ...
            do_dirty_rpc(Tab, NewNode, M, F, Args)

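This might be the problem in your case: every {badrpc, Reason} result
costs a 20 ms sleep before the retry, which would match the tens of
milliseconds you measured. Try adding debug output there to check
whether you are getting {badrpc, Reason} in this function.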
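
If patching mnesia itself is inconvenient, a rough stand-alone probe along these lines may help as a first check. This is only a sketch: the module rpc_probe and its call/4 function are hypothetical, not part of mnesia or ejabberd. It wraps rpc:call/4 in timer:tc/3 and logs any {badrpc, Reason} result, which is the same condition that triggers the sleep in the snippet above.

```erlang
-module(rpc_probe).
-export([call/4]).

%% Hypothetical helper: time a single rpc:call/4 and report failures.
%% Returns {Microseconds, Result} exactly as timer:tc/3 does.
call(Node, M, F, Args) ->
    {Micros, Result} = timer:tc(rpc, call, [Node, M, F, Args]),
    case Result of
        {badrpc, Reason} ->
            %% This is the case that makes do_dirty_rpc sleep and retry.
            error_logger:warning_msg("rpc to ~p failed: ~p (~p us)~n",
                                     [Node, Reason, Micros]);
        _ ->
            ok
    end,
    {Micros, Result}.
```

For example, rpc_probe:call(node(), erlang, node, []) should return the local node name together with the elapsed time, while a call to an unreachable node returns a {badrpc, Reason} tuple and writes a warning.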

-- 
Regards,
Evgeniy Khramtsov, ProcessOne.
xmpp:xram at jabber.ru.


