[ejabberd] Weird s2s problems in 2.0.1

Ben Lavender blavender at gmail.com
Tue Aug 26 17:18:12 MSD 2008


Hi all.  I'm having weird s2s problems in 2.0.1.

Basically, after a restart, everything s2s misbehaves.  Our users
connect to my older 1.1.4 server, or to Google, and neither works
properly.  Sometimes you get messages but not presence, sometimes
presence but no messages, sometimes both, sometimes neither.  Any of
these situations can also occur one-way: one client can send presence
and messages, and the other end receives them, but replies are not seen.
Other functions work sporadically.  Sometimes messages can be sent
and received but MUC access doesn't work.  The two primary remote
systems do not always behave the same way: sometimes one is fully
functional while the other is only partially functional.

Sometimes messages sent to the 2.0.1 server by a client on the 1.1.4
server come back as 404 errors about 10 minutes later.

Our 1.1.4 server has had similar problems with Gmail in the past, but
they usually cleared up within about two hours of a reboot.  These
2.0.1 problems have persisted for about 7 hours since our last restart.

The log files don't show anything particularly earth-shattering.  We
get a lot of stuff like this:

I(<0.1565.0>:ejabberd_s2s_out:1010) : Trying to open s2s connection: example.com -> example.net
I(<0.1535.0>:ejabberd_s2s:362) : New s2s connection started <0.1566.0>
I(<0.1566.0>:ejabberd_s2s_out:1010) : Trying to open s2s connection: example.com -> gmail.com
I(<0.262.0>:ejabberd_listener:112) : (#Port<0.5302>) Accepted connection {{193,159,163,131},30850} -> {{10,0,20,155},5269}
I(<0.1565.0>:ejabberd_s2s_out:319) : Connection established: example.com -> example.net
I(<0.1566.0>:ejabberd_s2s_out:319) : Connection established: example.com -> gmail.com

Note that in the middle we can see this:
I(<0.262.0>:ejabberd_listener:112) : (#Port<0.5302>) Accepted connection {{193,159,163,131},30850} -> {{10,0,20,155},5269}

That is my old 1.1.4 server connecting (staged site connecting to live).

Whether the server logs a connection as established does not seem to
correlate with whether things actually work.

Something that definitely doesn't work is:
I(<0.1556.0>:ejabberd_s2s_out:1010) : Trying to open s2s connection: example.com -> example.net
I(<0.1556.0>:ejabberd_s2s_out:306) : Closing s2s connection: example.com -> example.net (close in wait_for_stream)

I get this from gmail as well.
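
To see whether the failures favor one remote domain, the outgoing s2s
entries can be tallied per host pair.  What follows is only a minimal
sketch in Python, assuming the log uses exactly the three message texts
shown in the excerpts above; tally_s2s_out is a hypothetical helper, not
part of ejabberd, and the regular expression is an assumption based on
those few lines.

```python
import re
from collections import Counter

# Message texts copied from the 2.0.1 log excerpts in this post.
# Assumption: no other spellings of these events appear in the log.
EVENT_RE = re.compile(
    r"ejabberd_s2s_out:\d+\) : "
    r"(Trying to open s2s connection|Connection established|Closing s2s connection)"
    r": (\S+) -> (\S+)"
)


def tally_s2s_out(log_text):
    """Count outgoing s2s events keyed by (local host, remote host, event)."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return Counter(
        (m.group(2), m.group(3), m.group(1)) for m in EVENT_RE.finditer(flat)
    )


if __name__ == "__main__":
    sample = """
    I(<0.1556.0>:ejabberd_s2s_out:1010) : Trying to open s2s connection:
    example.com -> example.net
    I(<0.1556.0>:ejabberd_s2s_out:306) : Closing s2s connection:
    example.com -> example.net (close in
    wait_for_stream)
    """
    for (local, remote, event), count in sorted(tally_s2s_out(sample).items()):
        print(f"{local} -> {remote}: {event} x{count}")
```

A host pair with many "Trying to open" entries but few "Connection
established" entries would be the first place to look.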

The 1.1.4 server (with a more or less identical config, differing only
in virtual hosts) has more information:

I(<0.229.0>:ejabberd_http:76): started: {gen_tcp,#Port<0.57915>}
I(<0.6848.1>:ejabberd_s2s_out:106): started: {"example.net",
                                             "example.com",
                                             {new,"xxxxx"}}
I(<0.6848.1>:ejabberd_s2s_out:466): starttls: {"example.net",
                                              "example.com"}
I(<0.234.0>:ejabberd_listener:90): (#Port<0.57921>) Accepted connection {{193,159,163,157},8349} -> {{10,0,6,74},5269}
I(<0.6850.1>:ejabberd_s2s_in:105): started: {gen_tcp,#Port<0.57921>}
I(<0.6850.1>:ejabberd_s2s_in:222): starttls
I(<0.6850.1>:ejabberd_s2s_in:317): GET KEY: {"example.net",
                                            "example.com",
                                            [],
                                            "xxxxx"}
I(<0.6852.1>:ejabberd_s2s_out:106): started: {"example.net",
                                             "example.com",
                                             {verify,
                                              <0.6850.1>,
                                              "xxxx",
                                              "xxxxx"}}
I(<0.6852.1>:ejabberd_s2s_out:466): starttls: {"example.net",
                                              "example.com"}
I(<0.6852.1>:ejabberd_s2s_out:260): recv verify: {"example.com",
                                                 "example.net",
                                                 "xxxx5",
                                                 "valid"}

This log snippet may or may not be followed by correct behavior.  No
real rhyme or reason to it.

My SRV records are correct.
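
Correct SRV records only say where a peer should connect, so it can
also help to confirm that the target host accepts TCP connections on
the s2s port, to separate DNS issues from connectivity issues.  A
minimal sketch in Python, assuming the host name passed in is the SRV
target (or the bare domain when falling back to port 5269); can_connect
is a hypothetical helper, not part of ejabberd, and it does not perform
the SRV lookup itself.

```python
import socket


def can_connect(host, port=5269, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling can_connect("example.net") from each server toward the other
checks both directions.  A True result only shows that the port is
reachable, not that the XMPP stream negotiation will succeed.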

Is there something obvious I'm missing here?  Relevant configuration
bits follow.

Any help appreciated,
Ben



Config snippets:

% Used modules:
{modules,
 [
  {mod_register,   [{access, register}]},
  {mod_roster,     []},
  {mod_privacy,    []},
  {mod_adhoc,      []},
  {mod_configure,  []}, % Depends on mod_adhoc
  {mod_configure2, []},
  {mod_disco,      []},
  {mod_stats,      []},
  {mod_vcard,      []},
  {mod_offline,    []},
  {mod_announce,   [{access, announce}]}, % Depends on mod_adhoc
  {mod_echo,       [{host, "echo.jabber2.cp"}]},
  {mod_private,    []},
  {mod_irc,        []},
  {mod_http_bind,  []},
  {mod_archive,    []},
  {mod_presence_redirect, [
                           {server, "website.com"},
                           {port, 80},
                           {uri, "/xmlrpc.php"},
                           {method, xmpp_relationships.update_presence}
                          ]}, % redirect the users presence

  {mod_muc_msg_redirect, [
                           {server, "website.com"},
                           {port, 80},
                           {uri, "/xmlrpc.php"},
                           {method, xmpp_node_muc.room_log}
                          ]},
% Default options for mod_muc:
%   host: "conference." ++ ?MYNAME
%   access: all
%   access_create: all
%   access_admin: none (only room creator has owner privileges)
  {mod_muc,        [{access, muc}, {access_persistent, muc},
                    {access_create, muc}, {access_admin, muc_admin},
                    {default_room_options, [
                      {allow_change_subj, true},
                      {public, true},
                      {public_list, true},
                      {persistent, true}
                    ]}]},
%  {mod_muc_log,    []},
%  {mod_shared_roster, []},
  {mod_pubsub,     [{access_createnode, pubsub_createnode}]},
  {mod_time,       []},
  {mod_last,       []},
  {mod_xmlrpc,[{port, 4560},{timeout, 5000}]},
  {mod_version,    []}
 ]}.

% Listened ports:
{listen, [
	  %% Use those lines instead for TLS support:
	  {5222, ejabberd_c2s,     [{access, c2s}, {shaper, c2s_shaper},
	                            starttls, {certfile, "/opt/ejabberd/etc/ejabberd/server.pem"}]},

	  %% Remove this line if you want to prevent s2s connections:
	  {5269, ejabberd_s2s_in,  [{shaper, s2s_shaper},
	                            {max_stanza_size, 131072}]},
	
	  %% remove http_poll to remove support for http polling
	  %% remove web_admin to disable admin interface:
	  {5280, ejabberd_http,    [http_bind,http_poll, web_admin]}
	  %% This is an example on how to define an external service/transport:
	  %%{8888, ejabberd_service, [{access, all},
	  %%        {hosts, ["icq.jabber2.cp", "sms.jabber2.cp"],
	  %%        [{password, "secret"}]}]}
         ]}.

% If SRV lookup fails, then port 5269 is used to communicate with remote server
{outgoing_s2s_port, 5269}.

%% testing
{s2s_use_starttls,true}.
{s2s_certfile,"/opt/ejabberd/etc/ejabberd/server.pem"}.

{hosts,["example.com","example.net"]}.


{host_config, "example.com", [{auth_method, ldap},
                              {ldap_servers, ["ldap"]},
                              {ldap_uidattr, "uid"},
                              {ldap_base, "ou=example.com,dc=domain,dc=com"},
                              {ldap_rootdn, "cn=admin,dc=domain,dc=com"},
                              {ldap_password, "passitywordity"}]}.


{host_config, "example.net", [{auth_method, ldap},
                              {ldap_servers, ["ldap"]},
                              {ldap_uidattr, "uid"},
                              {ldap_base, "ou=example.net,dc=domain,dc=com"},
                              {ldap_rootdn, "cn=admin,dc=domain,dc=com"},
                              {ldap_password, "passitywordity"}]}.

