[ejabberd] MySQL delay/lock

Andy Skelton skeltoac at gmail.com
Mon Sep 28 21:35:43 MSD 2009

> Below is the code I've narrowed our issue down to.
> ejabberd_odbc.erl (lines 69-71):
>        sql_query(Host, Query) ->
>                gen_server:call(ejabberd_odbc_sup:get_random_pid(Host),
>                        {sql_query, Query}, 60000).
> I am curious if anyone has an idea on how to remedy this situation. I'd like
> to minimize collisions as well as lower the delay time. I am curious how
> gen_server:call() affects queued items. Do newer queued items go to the
> front of the line, thus pushing items that have been waiting back further?
> And would lowering the timeout (60000) time help long delayed queries?

gen_server:call() uses a standard erlang message queue, so newer items
go to the back of the line.

No matter how many connections you have open, the get_random_pid
method can't avoid sending queries to busy connections. In theory you
could have 199 idle connections and one connection with a long queue
and get_random_pid could still send it queries.

To avoid this problem you can write a process to manage the
connections. I've written something similar to that [1]. Here's the
basic gen_server logic:

#state{idle=[pid()], busy=[], waiting=[]}

When a connection is requested by gen_server:call(con_man, get_pid)
  if the idle list is not empty, take the head of the idle list (Pid)
and put it in the busy list, {reply, Pid,
  else append the requester to the waiting list, {noreply,

When a connection (Pid) is released by gen_server:cast(con_man,
{release_pid, Pid})
  if the waiting list is empty, add Pid to the idle list;
  else send the released Pid to the head of the waiting list with

The last issue to solve is connection death. This is tricky because
currently the supervisor, not the manager, is the one that restarts
connections. I solved this by wrapping my volatile connections in
non-volatile processes that trap exits so they can restart crashed
connections. I could have linked them to the manager or programmed the
manager to check their health but it turned out to be simpler (shorter
code and no race conditions) to use wrappers.

The problem I solved was similar but different. My service [2] uses
PHP to retrieve and process data from caches and clustered databases.
(Replicating the db and cache access logic in erlang, and then
synchronizing it, would have taken ages.) The startup overhead for my
PHP codebase is huge, sometimes seconds, so I created a pool of
persistent, reusable PHP processes that wait for commands in the form
of PHP code. The pool manager works pretty much as above.

I'd like to see this implemented in ejabberd_odbc but as I'm already
committed to querying MySQL via PHP it's not very valuable to me. Feel
free to borrow from php_app if you decide to patch ejabberd.

Andy Skelton

[1] http://github.com/skeltoac/php_app
[2] http://im.wordpress.com/

More information about the ejabberd mailing list