[ejabberd] BOSH requests occasionally silently fail in ejabberd_http; connection stays open

Daniel Dormont dan at greywallsoftware.com
Fri Jul 29 01:00:32 MSD 2011


Just an update. After doing some further testing including adding a bunch
more debug messages to ejabberd_http (thank God for hot code loading!) I now
suspect that the problem is not internal to Ejabberd after all. Rather, it
looks like if the user's machine is under sufficiently heavy load, the
particular BOSH client library we're using (JsJac) doesn't send a new
http-bind/ request after the previous one completes, or at any rate doesn't
send one within max_inactivity, which is 30 seconds by default.

So I owe an apology for spamming you guys with a premature conclusion. But I
am still facing a dilemma and could use some advice.

I could increase max_inactivity, but then the problem is it's harder to
detect when people really *do* disconnect.

So let me ask the group a different question: for those of you who work with
http_bind, have you ever experienced a problem like this? Namely, sessions
timing out prematurely due to posts from the client either not being sent or
not arriving in time. Also, out of curiosity, do you generally use JSJac,
strophe, or something else?

thanks,
Dan

On Thu, Jul 21, 2011 at 6:09 PM, Daniel Dormont <dan at greywallsoftware.com>wrote:

> Hi folks,
>
> I'm running Ejabberd 2.1.5 since that's the latest available on Ubuntu. I
> apologize if this was a known issue that was later fixed, but I didn't see a
> reference to it.
>
> Here's the problem: from time to time, http-bind requests that come in seem
> to disappear without getting processed. For example, here is a snippet of a
> successful http-bind request from sid
> 289ce391663d9cdd0805a71c63d9ec78ba95d6f1
>
> -------
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> I(<0.472.0>:ejabberd_listener:232) : (#Port<0.97636>) Accepted connection
> {{10,0,0,16},53683} -> {{10,0,0,100},5280}
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:ejabberd_http:136) : S: [{["rest"],mod_rest},
>                                        {["admin"],ejabberd_web_admin},
>                                        {["http-bind"],mod_http_bind},
>                                        {["http-poll"],ejabberd_http_poll}]
>
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> I(<0.10341.1>:ejabberd_http:138) : started: {gen_tcp,#Port<0.97636>}
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:ejabberd_http:255) : (#Port<0.97636>) http query: 'POST'
> /http-bind/
>
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:ejabberd_http:421) : client data: "<body rid='230203'
> sid='289ce391663d9cdd0805a71c63d9ec78ba95d6f1' xmlns='
> http://jabber.org/protocol/httpbind'
> key='36bc2bd0cb4bdcae4dc9745dc2905316c679e61f' />"
>
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:ejabberd_http:324) : ["http-bind"] matches ["http-bind"]
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:mod_http_bind:67) : Incoming data: <body rid='230203'
> sid='289ce391663d9cdd0805a71c63d9ec78ba95d6f1' xmlns='
> http://jabber.org/protocol/httpbind'
> key='36bc2bd0cb4bdcae4dc9745dc2905316c679e61f' />
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:ejabberd_http_bind:1123) : --- incoming data ---
> <body rid='230203' sid='289ce391663d9cdd0805a71c63d9ec78ba95d6f1' xmlns='
> http://jabber.org/protocol/httpbind'
> key='36bc2bd0cb4bdcae4dc9745dc2905316c679e61f' />
>  --- END ---
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10341.1>:ejabberd_http_bind:806) : Looking for session:
> "289ce391663d9cdd0805a71c63d9ec78ba95d6f1"
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:428) : New request: {http_put,230203,
>                                                       [{"rid","230203"},
>                                                        {"sid",
>
>  "289ce391663d9cdd0805a71c63d9ec78ba95d6f1"},
>                                                        {"xmlns",
>                                                         "
> http://jabber.org/protocol/httpbind"},
>                                                        {"key",
>
>  "36bc2bd0cb4bdcae4dc9745dc2905316c679e61f"}],
>                                                       [],159,1,[],
>                                                       {{10,0,0,16},53683}}
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:shaper:61) : State: {maxrate,1000,3.0481885112989318,
>                                       1311280274884429}, Size=159
> M=79.62135044283698, I=60220.02
>
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:568) : New request: {http_put,230203,
>                                                       [{"rid","230203"},
>                                                        {"sid",
>
>  "289ce391663d9cdd0805a71c63d9ec78ba95d6f1"},
>                                                        {"xmlns",
>                                                         "
> http://jabber.org/protocol/httpbind"},
>                                                        {"key",
>
>  "36bc2bd0cb4bdcae4dc9745dc2905316c679e61f"}],
>                                                       [],159,1,[],
>                                                       {{10,0,0,16},53683}}
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:869) : Previous rid / New rid:
> 230202/230203
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:605) : Actually processing request:
> {http_put,
>
>  230203,
>
>  [{"rid",
>
>  "230203"},
>
> {"sid",
>
>  "289ce391663d9cdd0805a71c63d9ec78ba95d6f1"},
>
> {"xmlns",
>                                                                         "
> http://jabber.org/protocol/httpbind"},
>
> {"key",
>
>  "36bc2bd0cb4bdcae4dc9745dc2905316c679e61f"}],
>
>  [],159,
>                                                                       1,[],
>
>  {{10,0,
>
>  0,16},
>
> 53683}}
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:621) : Key/OldKey/NextKey:
> 36bc2bd0cb4bdcae4dc9745dc2905316c679e61f/bf98e8c332e3d098bb0227d3bad081c63852b3ad/bf98e8c332e3d098bb0227d3bad081c63852b3ad
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:668) :  -- SaveKey:
> 36bc2bd0cb4bdcae4dc9745dc2905316c679e61f
>
>
> =INFO REPORT==== 2011-07-21 16:32:15 ===
> D(<0.10177.1>:ejabberd_http_bind:686) : reqlist: [{hbr,230203,
>
> "bf98e8c332e3d098bb0227d3bad081c63852b3ad",
>                                                    []},
>                                                   {hbr,230202,
>
> "bf98e8c332e3d098bb0227d3bad081c63852b3ad",
>                                                    []}]
>
>
> ----------
>
> Other similar requests preceded this one. But then, there's one last
> request that has a very different behavior. The request headers as reported
> by the client are:
>
> POST /http-bind/ HTTP/1.1
> Accept */*
> Content-Type text/xml; charset=utf-8
> Referer  [hidden]
> Accept-Language en-us
> Accept-Encoding gzip, deflate
> User-Agent Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64;
> Trident/5.0)
> Host 10.0.2.2
> Content-Length 159
> Connection Keep-Alive
> Cache-Control no-cache
>
> And the body is:
>
> <body rid='230204' sid='289ce391663d9cdd0805a71c63d9ec78ba95d6f1' xmlns='
> http://jabber.org/protocol/httpbind'
> key='89cfd62cf24f72deffe5f0409a4d52611b71335e' />
>
> I know it's the same session because the IP address and timestamp matches
> exactly, and the rid is the next in the sequence. Here's the corresponding
> log from the ejabberd side:
>
> --------
>
>  =INFO REPORT==== 2011-07-21 16:33:15 ===
> I(<0.472.0>:ejabberd_listener:232) : (#Port<0.97650>) Accepted connection
> {{10,0,0,16},53685} -> {{10,0,0,100},5280}
>
> =INFO REPORT==== 2011-07-21 16:33:15 ===
> D(<0.10349.1>:ejabberd_http:136) : S: [{["rest"],mod_rest},
>                                        {["admin"],ejabberd_web_admin},
>                                        {["http-bind"],mod_http_bind},
>                                        {["http-poll"],ejabberd_http_poll}]
>
>
> =INFO REPORT==== 2011-07-21 16:33:15 ===
> I(<0.10349.1>:ejabberd_http:138) : started: {gen_tcp,#Port<0.97650>}
>
> ---------
>
> That's it. As you can see, it seems that ejabberd_http just stops
> processing the connection at this point. 30 seconds later,
> ejabberd_http_bind times out the session.
>
> This request is not actually coming directly from a client; it's being
> proxied by Apache. But the BOSH "wait" parameter I'm using is 60 seconds,
> and the Apache proxy timeout is configured at 300, so I don't think that's
> the issue. Interestingly though, it seems at least from Apache's standpoint
> that ejabberd never closes the connection and in fact five minutes later
> reports an error like:
>
> [Thu Jul 21 16:38:14 2011] [error] (70007)The timeout specified has
> expired: proxy: prefetch request body failed to 10.0.0.100:5280(10.0.0.100) from 127.0.0.1 ()
>
> Any advice or  pointers would be much appreciated. Also, I checked
> erlang.log for a crash report and there was nothing there even remotely
> close to the relevant timestamp.
>
> thanks,
> Dan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.jabber.ru/pipermail/ejabberd/attachments/20110728/d6869db9/attachment-0001.html>


More information about the ejabberd mailing list