[ejabberd] ejabberd crash

Matthew Reilly matthew.reilly at sipphone.com
Wed Aug 31 22:08:48 MSD 2005


My ejabberd server crashed due to an out of memory error.

Software:
ejabberd: 0.9.8
otp: R10B-6

OS:
Linux: 2.6.9

Hardware:
RAM: 4GB
CPU: Dual 3.4GHz Xeon

Command line starting ejabberd:
erl -pa /var/lib/ejabberd/ebin +P 250000 -env ERL_MAX_ETS_TABLES 20000 -env ERL_MAX_PORTS 15000 -name ejabberd -s ejabberd


Yesterday, I was running jab_simul with about 3000 simulated users
(logging in, logging out, sending messages, adding/removing roster
items, and changing presence).

Today I re-ran the same test without restarting ejabberd. About 30 min
into the test, ejabberd dumped core. The top of the core file is below.

It appears that I ran into the 2GB process space limit of my kernel.
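
As a rough check of that reading, here is a minimal Python sketch that converts the figures from the =memory section of the dump (quoted further down) into GiB. It assumes those values are byte counts; the names MEMORY, to_gib and summarize are only illustrative.

```python
# Figures copied from the =memory section of the crash dump in this post.
# Assumption: the values are byte counts.
MEMORY = {
    "total": 2263662103,
    "processes": 1929538497,
    "system": 334123606,
    "ets": 316854504,
    "binary": 8672908,
}

GIB = 2 ** 30


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def summarize(memory):
    """Return {name: (size in GiB, share of total)} for each entry."""
    total = memory["total"]
    return {name: (to_gib(value), value / total) for name, value in memory.items()}


if __name__ == "__main__":
    for name, (gib, share) in summarize(MEMORY).items():
        print(f"{name:10s} {gib:6.3f} GiB  {share:6.1%}")
```

Under that assumption, "processes" plus "system" adds up exactly to "total", and the total comes out a little over 2 GiB.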


The tuning page gives two options for reduced memory consumption:
    erl -s ejabberd -shared ...
    erl -s ejabberd -env ERL_FULLSWEEP_AFTER 0 ...

I'm putting the server under pretty constant high load. The tuning page
states that ERL_FULLSWEEP_AFTER is only useful if not under high load.

Q1) The tuning page states that the -shared option is experimental.
Does anyone have experience with how stable it is?


Q2) Erlang is a garbage-collected language. Could the dump have been
caused by memory that was free but that Erlang had not yet garbage
collected? Or will Erlang dump on out-of-memory only if it cannot
allocate memory even after garbage collecting? (I.e., was ejabberd
really using 2GB of RAM, or was it just not garbage collecting fast
enough due to the load?)



The dump file stated that there were 1929538497 processes,
but then it listed only 6366 explicitly in the dump file:

  [ejabberd at xml04 ~]$ grep ^=proc: *dump  | wc -l
  6366

Q3) Did I really have 1929538497 processes? (Or is this maybe the
total number of processes created over the lifetime of the server?)
If so, why did the dump file only have 6366 processes listed?
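
The grep count above can be extended to tally every section type in the dump at once. This is a minimal Python sketch, assuming section headers are lines that start with "=" followed by a tag and an optional ":"-separated identifier, as in the excerpt below. The function name count_sections and the SAMPLE text are illustrative, and the second =proc entry in SAMPLE is made up.

```python
from collections import Counter


def count_sections(lines):
    """Count crash-dump section headers by tag, e.g. '=proc:<0.0.0>' -> 'proc'."""
    counts = Counter()
    for line in lines:
        if line.startswith("="):
            # Drop the leading '=' and anything after the first ':'.
            tag = line[1:].split(":", 1)[0].strip()
            counts[tag] += 1
    return counts


# Tiny stand-in for a real dump; '=proc:<0.1.0>' is a hypothetical entry.
SAMPLE = """\
=erl_crash_dump:0.1
=memory
total: 2263662103
processes: 1929538497
=proc:<0.0.0>
State: Waiting
Name: init
=proc:<0.1.0>
State: Waiting
"""

if __name__ == "__main__":
    for tag, n in count_sections(SAMPLE.splitlines()).items():
        print(tag, n)
```

For a real dump, passing the open file object (for example `count_sections(open(path))`, where path is wherever the dump was written) should give a "proc" count that matches the grep result.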

Q4) Is this memory problem inherent in the number of active users,
or can further tuning eliminate this problem?

Thank you,
Matt Reilly

******** TOP of ejabberd DUMP file ***********

=erl_crash_dump:0.1
Wed Aug 31 09:58:25 2005
Slogan: eheap_alloc: Cannot allocate 97953500 bytes of memory (of type "old_heap").
System version: Erlang (BEAM) emulator version 5.4.8 [source] [hipe]
Compiled: Mon Aug 22 13:18:33 2005
Atoms: 8938
=memory
total: 2263662103
processes: 1929538497
processes_used: 1929425009
system: 334123606
atom: 409233
atom_used: 385436
binary: 8672908
code: 3498146
ets: 316854504

... a bunch of hash tables. All had less used than max size ...

=allocated_areas
processes: 1929425009 1929538497
ets: 316854504
sys_misc: 1440050
static: 3244153
atom_space: 131088 107339
binary: 8672908
atom_table: 61773
module_table: 1640
export_table: 30904
register_table: 252
fun_table: 3250
module_refs: 2048
loaded_code: 3186996
dist_table: 2243
node_table: 131
bits_bufs_size: 2
bif_timer: 13596
link_lh: 1080
dist_buf: 0
proc: 2205496 2140312
atom_entry: 216372 216324
export_entry: 214032 213264
module_entry: 8064 7304
reg_proc: 1960 1840
monitor_sh: 20928 1088
nlink_sh: 644228 615884
proc_list: 24 24
fun_entry: 52852 52740
db_tab: 280724 260724
=allocator:sys_alloc
option e: true
option m: libc
option tt: 131072
option tp: 0
=allocator:temp_alloc
versions: 0.9 2.1
option e: true
option sbct: 524288
option asbcst: 4145152
option rsbcst: 90
option rsbcmt: 80
option mmbcs: 65536
option mmsbc: 256
option mmmbc: 10
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option as: af
mbcs blocks: 0 9 9
mbcs blocks size: 0 392360 392360
mbcs carriers: 1 2 2
mbcs mseg carriers: 0
mbcs sys_alloc carriers: 1
mbcs carriers size: 65568 1318944 1318944
mbcs mseg carriers size: 0
mbcs sys_alloc carriers size: 65568
sbcs blocks: 0 0 0
sbcs blocks size: 0 0 0
sbcs carriers: 0 0 0
sbcs mseg carriers: 0
sbcs sys_alloc carriers: 0
sbcs carriers size: 0 0 0
sbcs mseg carriers size: 0
sbcs sys_alloc carriers size: 0
temp_alloc calls: 6605397
temp_free calls: 6605397
temp_realloc calls: 15056688
mseg_alloc calls: 40865
mseg_dealloc calls: 40865
mseg_realloc calls: 0
sys_alloc calls: 1
sys_free calls: 0
sys_realloc calls: 0
=allocator:sl_alloc
versions: 2.1 2.1
option e: true
option sbct: 524288
option asbcst: 4145152
option rsbcst: 80
option rsbcmt: 80
option mmbcs: 131072
option mmsbc: 256
option mmmbc: 10
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option mbsd: 3
option as: gf
mbcs blocks: 444552 444552 444552
mbcs blocks size: 7120768 7120768 7120768
mbcs carriers: 6 6 6
mbcs mseg carriers: 5
mbcs sys_alloc carriers: 1
mbcs carriers size: 9576480 9576480 9576480
mbcs mseg carriers size: 9445376
mbcs sys_alloc carriers size: 131104
sbcs blocks: 0 0 0
sbcs blocks size: 0 0 0
sbcs carriers: 0 0 0
sbcs mseg carriers: 0
sbcs sys_alloc carriers: 0
sbcs carriers size: 0 0 0
sbcs mseg carriers size: 0
sbcs sys_alloc carriers size: 0
sl_alloc calls: 16377177
sl_free calls: 15932625
sl_realloc calls: 0
mseg_alloc calls: 10
mseg_dealloc calls: 5
mseg_realloc calls: 0
sys_alloc calls: 1
sys_free calls: 0
sys_realloc calls: 0
=allocator:std_alloc
versions: 0.9 2.1
option e: true
option sbct: 524288
option asbcst: 4145152
option rsbcst: 20
option rsbcmt: 80
option mmbcs: 131072
option mmsbc: 256
option mmmbc: 10
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option as: bf
mbcs blocks: 26002 26331 26331
mbcs blocks size: 1150128 1169904 1169904
mbcs carriers: 2 2 2
mbcs mseg carriers: 1
mbcs sys_alloc carriers: 1
mbcs carriers size: 1179680 1179680 1179680
mbcs mseg carriers size: 1048576
mbcs sys_alloc carriers size: 131104
sbcs blocks: 0 0 0
sbcs blocks size: 0 0 0
sbcs carriers: 0 0 0
sbcs mseg carriers: 0
sbcs sys_alloc carriers: 0
sbcs carriers size: 0 0 0
sbcs mseg carriers size: 0
sbcs sys_alloc carriers size: 0
std_alloc calls: 147725
std_free calls: 121723
std_realloc calls: 722
mseg_alloc calls: 1
mseg_dealloc calls: 0
mseg_realloc calls: 0
sys_alloc calls: 1
sys_free calls: 0
sys_realloc calls: 0
=allocator:ll_alloc
versions: 0.9 2.1
option e: true
option sbct: 4294967295
option asbcst: 0
option rsbcst: 0
option rsbcmt: 0
option mmbcs: 2097152
option mmsbc: 0
option mmmbc: 0
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option as: aobf
mbcs blocks: 2797 2797 2797
mbcs blocks size: 12958984 12958984 12958984
mbcs carriers: 9 9 9
mbcs mseg carriers: 0
mbcs sys_alloc carriers: 9
mbcs carriers size: 13631520 13631520 13631520
mbcs mseg carriers size: 0
mbcs sys_alloc carriers size: 13631520
sbcs blocks: 0 0 0
sbcs blocks size: 0 0 0
sbcs carriers: 0 0 0
sbcs mseg carriers: 0
sbcs sys_alloc carriers: 0
sbcs carriers size: 0 0 0
sbcs mseg carriers size: 0
sbcs sys_alloc carriers size: 0
ll_alloc calls: 2797
ll_free calls: 0
ll_realloc calls: 387
mseg_alloc calls: 0
mseg_dealloc calls: 0
mseg_realloc calls: 0
sys_alloc calls: 9
sys_free calls: 0
sys_realloc calls: 0
=allocator:eheap_alloc
versions: 2.1 2.1
option e: true
option sbct: 524288
option asbcst: 4145152
option rsbcst: 50
option rsbcmt: 80
option mmbcs: 524288
option mmsbc: 256
option mmmbc: 10
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option mbsd: 3
option as: gf
mbcs blocks: 11780 13607 13607
mbcs blocks size: 546294344 546617152 546617152
mbcs carriers: 600 602 602
mbcs mseg carriers: 10
mbcs sys_alloc carriers: 590
mbcs carriers size: 660086816 662183968 662183968
mbcs mseg carriers size: 41951232
mbcs sys_alloc carriers size: 618135584
sbcs blocks: 781 783 783
sbcs blocks size: 1274986808 1341749880 1341749880
sbcs carriers: 781 783 783
sbcs mseg carriers: 256
sbcs sys_alloc carriers: 525
sbcs carriers size: 1536073728 1600516096 1600516096
sbcs mseg carriers size: 530489344
sbcs sys_alloc carriers size: 1005584384
eheap_alloc calls: 14530749
eheap_free calls: 14518187
eheap_realloc calls: 1146997
mseg_alloc calls: 106413
mseg_dealloc calls: 106146
mseg_realloc calls: 59576
sys_alloc calls: 4912
sys_free calls: 3795
sys_realloc calls: 1039
=allocator:binary_alloc
versions: 0.9 2.1
option e: true
option sbct: 524288
option asbcst: 4145152
option rsbcst: 20
option rsbcmt: 80
option mmbcs: 131072
option mmsbc: 256
option mmmbc: 10
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option as: bf
mbcs blocks: 24844 28886 28886
mbcs blocks size: 8947840 10007848 10007848
mbcs carriers: 7 7 7
mbcs mseg carriers: 6
mbcs sys_alloc carriers: 1
mbcs carriers size: 12722208 12722208 12722208
mbcs mseg carriers size: 12591104
mbcs sys_alloc carriers size: 131104
sbcs blocks: 0 3 3
sbcs blocks size: 0 18502680 18502680
sbcs carriers: 0 3 3
sbcs mseg carriers: 0
sbcs sys_alloc carriers: 0
sbcs carriers size: 0 18644992 18644992
sbcs mseg carriers size: 0
sbcs sys_alloc carriers size: 0
binary_alloc calls: 18481822
binary_free calls: 18456978
binary_realloc calls: 3996096
mseg_alloc calls: 1471
mseg_dealloc calls: 1465
mseg_realloc calls: 5
sys_alloc calls: 1
sys_free calls: 0
sys_realloc calls: 0
=allocator:ets_alloc
versions: 0.9 2.1
option e: true
option sbct: 524288
option asbcst: 4145152
option rsbcst: 20
option rsbcmt: 80
option mmbcs: 131072
option mmsbc: 256
option mmmbc: 10
option lmbcs: 5242880
option smbcs: 1048576
option mbcgs: 10
option as: bf
mbcs blocks: 364559 367015 367015
mbcs blocks size: 319406624 322389144 322389144
mbcs carriers: 292 293 293
mbcs mseg carriers: 10
mbcs sys_alloc carriers: 282
mbcs carriers size: 324157472 325206048 325206048
mbcs mseg carriers size: 29376512
mbcs sys_alloc carriers size: 294780960
sbcs blocks: 0 0 0
sbcs blocks size: 0 0 0
sbcs carriers: 0 0 0
sbcs mseg carriers: 0
sbcs sys_alloc carriers: 0
sbcs carriers size: 0 0 0
sbcs mseg carriers size: 0
sbcs sys_alloc carriers size: 0
ets_alloc calls: 12548290
ets_free calls: 12183731
ets_realloc calls: 41048
mseg_alloc calls: 10
mseg_dealloc calls: 0
mseg_realloc calls: 0
sys_alloc calls: 302
sys_free calls: 20
sys_realloc calls: 0
=allocator:fix_alloc
option e: true
proc: 2205496 2140312
atom_entry: 216372 216324
export_entry: 214032 213264
module_entry: 8064 7304
reg_proc: 1960 1840
monitor_sh: 20928 1088
nlink_sh: 644228 615884
proc_list: 24 24
fun_entry: 52852 52740
db_tab: 280724 260724
=allocator:mseg_alloc
version: 0.9
option amcbf: 4194304
option rmcbf: 20
option mcs: 5
option cci: 1000
cached_segments: 0
cache_hits: 68668
segments: 289
segments_watermark: 289
mseg_alloc calls: 148770
mseg_dealloc calls: 148481
mseg_realloc calls: 59581
mseg_create calls: 80103
mseg_destroy calls: 113696
mseg_recreate calls: 4
mseg_clear_cache calls: 4
mseg_check_cache calls: 6966
=allocator:alloc_util
option mmc: 1024
option ycs: 1048576
=allocator:instr
option m: false
option s: false
option t: false

... A bunch of procs ...
=proc:<0.0.0>
State: Waiting
Name: init
...

and so on



-- 
Matthew Reilly
matthew.reilly at sipphone.com
Gizmo Project name: matt 



