[ejabberd] mnesia lock, dead-lock or what?

Igor Goryachev igor at goryachev.org
Fri Nov 14 15:26:24 MSK 2008


Hello, everybody!

We run a heavily loaded cluster of several ejabberd nodes (with some
proprietary modules which do not use mnesia). Half of the nodes are
placed in one data center and the rest in another. Sometimes
(possibly during network failures) the cluster suffers from locks inside
the mnesia subsystem, which degrades the service as a whole. I have
already checked the code inside the mnesia transactions and, as far as I
can tell, it is fine. We use R12B-3 (amd64, Debian GNU/Linux; the package
was rebuilt without SMP and async-threads) on all our machines. Here is
the output of mnesia:info/0 during the incident ("myserver" and
"otherserver\d+" replace the real host names for security purposes):

(ejabberd@viper)2> mnesia:info().
---> Processes holding locks <---
Lock: {{s2s,{"myserver","otherserver1"}},
       write, 
       {tid,370791,<4079.5909.10>}}
Lock: {{s2s,{"myserver","otherserver2"}},
       write, 
       {tid,370784,<4077.8264.8>}}
Lock: {{s2s,{"myserver","otherserver3"}},
       read,  
       {tid,370787,<3910.21053.1>}}
Lock: {{s2s,{"myserver","otherserver3"}},
       write, 
       {tid,370787,<3910.21053.1>}}
Lock: {{s2s,{"myserver","otherserver4"}},write,{tid,370791,<4077.4651.0>}}
Lock: {{s2s,{"myserver","otherserver5"}},
       write, 
       {tid,370780,<4080.959.9>}}
Lock: {{s2s,{"myserver","otherserver6"}},
       write, 
       {tid,370790,<4080.27328.7>}}
Lock: {{s2s,{"myserver","otherserver7"}},
       write, 
       {tid,370790,<4078.29634.7>}}
Lock: {{s2s,{"myserver","otherserver8"}},
       write, 
       {tid,370790,<4077.5830.9>}}
Lock: {{s2s,{"myserver","otherserver9"}},
       write, 
       {tid,370790,<4079.25514.8>}}
---> Processes waiting for locks <---
---> Participant transactions <---
Tid: 370780 (owned by <4080.959.9>)
with participant objects {commit,ejabberd@viper,presume_commit,
                             [{{s2s,{"myserver","otherserver5"}},
                               {s2s,
                                   {"myserver","otherserver5"},
                                   <4080.959.9>,"551643360"},
                               write}],
                             [],[],[],[]}
Tid: 370784 (owned by <4077.8264.8>)
with participant objects {commit,ejabberd@viper,presume_commit,
                             [{{s2s,{"myserver","otherserver2"}},
                               {s2s,
                                   {"myserver","otherserver2"},
                                   <4077.8264.8>,"2852318864"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4080.27328.7>)
with participant objects {commit,ejabberd@viper,presume_commit,
                             [{{s2s,{"myserver","otherserver6"}},
                               {s2s,
                                   {"myserver","otherserver6"},
                                   <4080.27328.7>,"860186058"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4078.29634.7>)
with participant objects {commit,ejabberd@viper,presume_commit,
                             [{{s2s,{"myserver","otherserver7"}},
                               {s2s,
                                   {"myserver","otherserver7"},
                                   <4078.29634.7>,"1948941028"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4079.25514.8>)
with participant objects {commit,ejabberd@viper,presume_commit,
                             [{{s2s,{"myserver","otherserver9"}},
                               {s2s,
                                   {"myserver","otherserver9"},
                                   <4079.25514.8>,"325351070"},
                               delete_object}],
                             [],[],[],[]}
Tid: 370790 (owned by <4077.5830.9>)
with participant objects {commit,ejabberd@viper,presume_commit,
                             [{{s2s,{"myserver","otherserver8"}},
                               {s2s,
                                   {"myserver","otherserver8"},
                                   <4077.5830.9>,"1699221599"},
                               delete_object}],
                             [],[],[],[]}
---> Coordinator transactions <---
Tid: 370787 (owned by <3910.21053.1>)
Tid: 370788 (owned by <3910.32551.1>)
.......

What does this output mean? Why does it occur? How can we resolve this
behaviour? Is this enough information to investigate?

Thank you very much for your attention.


-- 
    Igor Goryachev              E-Mail/Jabber: igor at goryachev.org

