[ejabberd] problem: ejabberdctl restore. solution: ejabberdctl install_fallback

Badlop badlop at gmail.com
Mon Dec 28 15:54:16 MSK 2009


2009/12/22 Jan Koum <jan.koum at gmail.com>:
> so next thing we do is 'ejabberdctl restore' and this is where everything
> breaks:
>
> what actually happens is beam will eat up all available RAM (7GB), eat up
> all available swap (2GB) and get killed by the OS.
>
> my guess is that this is because ejabberd/erlang/mnesia is trying to load
> everything into memory first before writing it into the DCD/DAT/DCL files,
> correct?  is there any way to modify this behavior or work around it somehow?


Thanks, that's interesting. Let's see the Mnesia documentation:
http://www.erlang.org/doc/apps/mnesia/Mnesia_chap7.html#id2277384

It seems 'restore' is only useful when the admin wants Mnesia to
continue online after the backup push:

> Tables can be restored on-line from a backup without restarting Mnesia.
> ...
> If the database is very large, it may not be possible to restore it online.
> In such a case the old database must be restored by installing a fallback,
> and then restart.


> AHA! there is install_fallback command which says:
>
> install_fallback ejabberd.backup The binary backup file is installed as
> fallback: it will be used to restore the database at the next ejabberd
> start. Similar to restore, but requires less memory.
>
> perfect -- just tried it and seems to have worked
>
> i guess the one real question i have is: why does 'restore' not act like
> 'install_fallback' when it comes to memory consumption?

It seems they are completely different methods of pushing a backup,
each with its specific benefits and drawbacks.

install_fallback is designed for what we want to do in ejabberd:
> A fallback is typically used when a system upgrade is performed.

In the case of ejabberd, reducing memory consumption is far more important
than allowing online restoration.

As you pointed out, the preferable method for pushing a full DB backup in
Mnesia in the case of ejabberd is 'install_fallback', not 'restore'.
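
For illustration, a minimal shell sketch of that recommendation. The
function name and the EJABBERDCTL variable are hypothetical helpers, not
part of ejabberd; only the install_fallback subcommand comes from this
thread. The readability check runs as the calling user, which may differ
from the user the ejabberd node runs as, so treat it as a sanity check only.

```shell
# Hypothetical helper: EJABBERDCTL can be overridden, e.g. for a dry run.
EJABBERDCTL="${EJABBERDCTL:-ejabberdctl}"

# Install a binary Mnesia backup as fallback instead of using 'restore'.
# The database is only replaced at the next ejabberd start.
install_backup_as_fallback() {
    backup_file="$1"
    # Sanity check (assumption: the caller can see the same file as the node).
    if [ ! -r "$backup_file" ]; then
        echo "backup file not readable: $backup_file" >&2
        return 1
    fi
    "$EJABBERDCTL" install_fallback "$backup_file"
}
```

Running it with EJABBERDCTL=echo prints the command that would be executed
without touching the node.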


> =ERROR REPORT==== 2009-12-22 01:25:44 ===
> Mnesia('ejabberd@master.xmpp.example.net'): ** ERROR ** (ignoring core) **
> FATAL ** A fallback is installed and Mnesia must be restarted. Forcing
> shutdown after mnesia_down from 'ejabberd@master.xmpp.example.net'...
>
> [this fatal error comes with either 'ejabberdctl restart' or 'ejabberdctl
> stop' commands after install_fallback command -- is this scary fatal error
> expected?]

I get that too, even with a small 1-user DB.

I think that's an acceptable message, because an additional
ejabberdctl stop + start is needed for the DB to be fully restored:
> The fallback is used to restore the database the next time the system is started.
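
A rough sketch of that stop + start cycle, to be run once a fallback has
been installed. The function name, the EJABBERDCTL variable and the
STOP_WAIT pause are assumptions for illustration; in particular, I have not
checked how long the node really needs to shut down, so adjust the pause.

```shell
# Hypothetical helper: EJABBERDCTL can be overridden, e.g. for a dry run.
EJABBERDCTL="${EJABBERDCTL:-ejabberdctl}"

# Stop and start ejabberd so that an installed fallback gets loaded.
restart_after_fallback() {
    # The "FATAL ... A fallback is installed" report may be logged at this step.
    "$EJABBERDCTL" stop
    # Assumption: give the node a few seconds to go down before starting again.
    sleep "${STOP_WAIT:-5}"
    # The fallback is used to restore the database during this start.
    "$EJABBERDCTL" start
}
```

The exit status of 'stop' is deliberately not checked here, since I don't
know what it returns when the fatal report above is printed.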


> and more
> importantly:  maybe it makes sense to modify the documentation guide to
> recommend people use install_fallback when doing cluster renames in
> production.

Right. So I've improved those steps in the Guide to use install_fallback:

http://svn.process-one.net/ejabberd/branches/ejabberd-2.1.x/doc/guide.html#changeerlangnodename

and also the command explanations:

> restore ejabberd.backup
> Restore immediately from a binary backup file the internal Mnesia database.
> This will consume a lot of memory if you have a large database,
> so better use install_fallback.

> install_fallback ejabberd.backup
> The binary backup file is installed as fallback: it will be used to restore
> the database at the next ejabberd start.
> This means that, after running this command, you have to restart ejabberd.
> This command requires less memory than restore.


---
Badlop
ProcessOne

