[ejabberd] Problem: IQ Query for MUC rooms sometimes does not receive full list

Armando Di Cianno armando.dicianno at gmail.com
Thu Mar 17 19:43:55 MSK 2011


I have an issue with ejabberd 2.1.6 and MUC / room enumeration that is
turning out to be frustrating to nail down.

Under some load (though nowhere near overloaded / high loadavg / etc),
an IQ query to the MUC component on a specific virtual host / domain
doesn't return the entire room list on an ejabberd cluster of 4
machines.

 * Because it only shows up under load, I first thought that the query
wasn't enumerating rooms that were at max capacity -- this doesn't
seem to be the case.
 * I cannot recreate the issue locally with a similar number of users
and rooms -- i.e. 5-6 permanent rooms, ~30 temporary rooms, a handful
of the permanent rooms full at max capacity of 100 users.
 * I believe I've traced it down to iq_disco_items/4 (or
get_vh_rooms/2) in mod_muc.erl, where the
gen_fsm:sync_send_all_state_event call has a timeout of 100ms -- this
might cause odd behavior on a cluster where, even on a local network,
a reply may take longer than 100ms. I can't confirm this locally,
because even if I set the timeout to 1ms, it still works.
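
To make the suspicion in the last bullet concrete, here is a rough
model of the failure mode. It is plain Python, not ejabberd's Erlang,
and everything in it is an assumption: Room, reply_latency_ms,
query_room and disco_items are made-up names, latencies are simulated
numbers rather than real network calls, and the key guess is that a
room whose reply misses the timeout is silently skipped instead of
failing the whole query.

```python
from typing import List, NamedTuple, Optional, Tuple


class Room(NamedTuple):
    name: str
    reply_latency_ms: float  # hypothetical: time the room process needs to answer


def query_room(room: Room, timeout_ms: float) -> Optional[str]:
    """Simulate a synchronous call with a timeout.

    Returns the room description, or None when the reply would arrive
    after the timeout has expired.
    """
    if room.reply_latency_ms > timeout_ms:
        return None
    return f"{room.name} description"


def disco_items(rooms: List[Room], timeout_ms: float = 100) -> List[Tuple[str, str]]:
    """Build the room list, dropping any room that did not answer in time.

    This filtering is the assumed behavior: a timeout is swallowed and
    the room is left out, so the caller sees a shorter list, not an error.
    """
    items = []
    for room in rooms:
        desc = query_room(room, timeout_ms)
        if desc is None:
            continue  # timed-out room is silently omitted
        items.append((room.name, desc))
    return items


rooms = [
    Room("lobby", 5),
    Room("support", 40),
    Room("remote-node-room", 180),  # slower than the 100ms timeout
]

listed = [name for name, _ in disco_items(rooms, timeout_ms=100)]
print(listed)
```

If that guess holds, the slow room is missing from the 100ms result
and comes back once the timeout is raised above its latency, which
would be a cheap way to test the theory on the cluster.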

If this timeout were the culprit, would /some/ of the rooms be
returned, but not all? (That is the behavior I'm seeing.) Or is it
more likely that the mnesia:dirty_select in get_vh_rooms/2 is acting
quite /dirty/ in this situation?

I can easily increase the timeout, if someone can confirm that this is
the problem. For the application I'm building, a "half right" response
isn't acceptable. Any thoughts or help would be appreciated.

Cheers,
__armando

