I am calling ::connect() on a special application port in my application. It works fine in general, but between two particular machines, in one direction only, it fails with EHOSTUNREACH ("No route to host").

Since I can ssh between them on port 22 with no problems, what could cause ::connect() to fail every time for this particular pair of machines?

Running ssh in verbose mode yields:

[localMachine ~] ssh -v -p 22 remoteMachine
OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to remoteMachine [10.34.49.107] port 22.
debug1: Connection established.
debug1: identity file /home/WilliamKF/.ssh/identity type -1
debug1: identity file /home/WilliamKF/.ssh/id_rsa type 1
debug1: identity file /home/WilliamKF/.ssh/id_dsa type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_4.3
debug1: match: OpenSSH_4.3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_3.9p1
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-cbc hmac-md5 none
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'remoteMachine' is known and matches the RSA host key.
debug1: Found key in /home/WilliamKF/.ssh/known_hosts:47
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Next authentication method: gssapi-with-mic
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Next authentication method: publickey
debug1: Trying private key: /home/WilliamKF/.ssh/identity
debug1: Offering public key: /home/WilliamKF/.ssh/id_rsa
debug1: Server accepts key: pkalg ssh-rsa blen 149
debug1: read PEM private key done: type RSA
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug1: Entering interactive session.

Here is the function on the client side:

void // virtual
Sender::connectTCP()
{
  // First build the feedback channel's socket & make it reusable
  // so we don't get the nasty message.
  if (0 > (setFbSocket(socket(AF_INET, SOCK_STREAM, 0)))) {
    THROW_ERR("failed to create the command socket: ");
  }

  setSocketOptions();

  // Build the localIp address and bind it to the feedback socket.
  // Although it's not traditional for a client to bind the sending socket
  // to the local address, we do it to prevent connect() from using an
  // ephemeral port (which our site's firewall may block). Also build the
  // remoteIp address.
  buildAddr(getTCPcommandLocalAddr(), getLocalHost().c_str(),
            getLocFbPort());
  deepBind(getFbSocket(), getTCPcommandLocalAddr());
  buildAddr(getTCPcommandRemoteAddr(), getRemoteHost().c_str(),
            getRemFbPort());

  // Connect to the receiver at the remote addr.  Make multiple attempts
  // when we get a connection refused errno (ECONNREFUSED).  ECONNREFUSED
  // means no one is listening at the other end ... which may be the result
  // of a race condition (i.e., we're calling connect before the server has
  // begun to listen).
  const int timeoutMinutes = 5;
  const int timeoutSeconds = timeoutMinutes * 60;
  int conCount = timeoutSeconds;

  while ((conCount > 0) &&
         (0 > ::connect(getFbSocket(),
                        (sockaddr*)&getTCPcommandRemoteAddr(),
                        sizeof(sockaddr)))) {
    switch (errno) {
      case ECONNREFUSED: {
        ::sleep(1);
        --conCount;
        // Warn every 15 seconds, but don't warn at 5 minutes exactly.
        if ((conCount % 15) == 0 && conCount) {
          clog << "Warning: The server connection"
               << " has been refused for "
               << timeFromSeconds(timeoutSeconds - conCount)
               << ", will continue to retry for up to "
               << timeoutMinutes << " minutes.\n"
               << "Perhaps ports " << getRemFbPort() << " and "
               << getRemDataPort()
               <<
            " are not being routed properly to the server or alternatively "
            "perhaps nothing on the server is listening to those ports?\n";
        }
        continue;
      }
      case EHOSTUNREACH: {
        clog << "Error: Command connect failed: No route to host '"
             << getRemoteHost() << "'." << endl;
        THROW_ERR("command connect failed: ");
      }
      default: {
        clog << "Error: Command connect failed: "
             << strerror(errno) << endl;
        THROW_ERR("command connect failed: ");
      }
    }
  }
  if (conCount == 0) {
    clog << "Error: Command connect"
         << " continually refused after retrying for " << timeoutMinutes
         << " minutes: "
         << strerror(errno) << endl;

    THROW_ERR("command connect continually refused: ");
  }

  setCmdBlocking();
  setDataBlocking();
  setFbIsConn(true);

  clog << "Application has connected to "
       << getRemoteHost() << ":" << getRemFbPort() << endl;
}
+2  A: 

Could you be running into a firewall that filters on the destination port? Is the failing connection going to port 22 (or whichever port your sshd listens on), or to some other port?
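
To make the firewall possibility visible in the client's own log, one option is to attach a hint to each errno that connect() can leave behind. The following is only a sketch: describeConnectErrno is a hypothetical helper, not part of the asker's Sender class, and the EHOSTUNREACH hint assumes Linux-like behaviour, where a firewall that rejects a packet with an ICMP "host prohibited" reply typically surfaces to connect() as EHOSTUNREACH.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not from the original code): turn the errno left
// behind by a failed connect() into a short troubleshooting hint.
std::string describeConnectErrno(int err)
{
  std::string msg = std::strerror(err);
  switch (err) {
    case EHOSTUNREACH:
      // Assumption: Linux-like stack, where ICMP "host unreachable" and
      // "administratively prohibited" replies both map to EHOSTUNREACH.
      msg += " (no route, or a firewall rejected the destination port)";
      break;
    case ECONNREFUSED:
      msg += " (host reached, but usually nothing is listening on that port)";
      break;
    case ETIMEDOUT:
      msg += " (no reply at all; packets may be silently dropped)";
      break;
    default:
      break;  // fall back to the plain strerror() text
  }
  return msg;
}
```

In the asker's switch statement, the EHOSTUNREACH branch could log describeConnectErrno(errno) in place of the fixed "No route to host" text.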

Heath Hunnicutt
First it connects via ssh on port 22, where it negotiates the port for which ::connect() is called.
WilliamKF
The destination port was being blocked by a firewall for the remote server.
WilliamKF
Sweet; glad you found it. :) I guessed that because I associate "host unreachable" and "network unreachable" with firewalls these days.
Heath Hunnicutt
A: 

Something strange with name resolution? (Might you be getting tripped up by a hosts file? by an AAAA (IPv6) record in the DNS which ssh is ignoring? ssh_config?) Might be worthwhile running ssh in verbose mode to see what host it's connecting to.
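
One way to check the name-resolution theory from inside the application is to print every address the resolver returns for the remote host and compare that list with the address ssh reports. This is a minimal sketch, assuming a POSIX system with getaddrinfo; resolveAll is a hypothetical name and nothing here is taken from the asker's buildAddr.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>
#include <vector>

// Hypothetical helper: list every address a host name resolves to,
// IPv4 and IPv6 alike, in the order the resolver returns them.
std::vector<std::string> resolveAll(const std::string& host)
{
  std::vector<std::string> out;
  addrinfo hints = {};
  hints.ai_family = AF_UNSPEC;      // ask for both A and AAAA results
  hints.ai_socktype = SOCK_STREAM;  // avoid one duplicate per socket type

  addrinfo* res = nullptr;
  if (getaddrinfo(host.c_str(), nullptr, &hints, &res) != 0) {
    return out;  // resolution failed; the caller sees an empty list
  }
  for (addrinfo* p = res; p != nullptr; p = p->ai_next) {
    char buf[INET6_ADDRSTRLEN] = {};
    const void* addr = nullptr;
    if (p->ai_family == AF_INET) {
      addr = &reinterpret_cast<sockaddr_in*>(p->ai_addr)->sin_addr;
    } else if (p->ai_family == AF_INET6) {
      addr = &reinterpret_cast<sockaddr_in6*>(p->ai_addr)->sin6_addr;
    }
    if (addr != nullptr && inet_ntop(p->ai_family, addr, buf, sizeof(buf))) {
      out.push_back(buf);
    }
  }
  freeaddrinfo(res);
  return out;
}
```

If the list for remoteMachine contains anything other than the address shown in the ssh -v output, the application and ssh may be talking to different hosts.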

crazyscot
+1  A: 

It appears that you're explicitly binding the client's end of the socket. See if the address you're binding to happens to be unreachable from the server box (e.g. due to a routing problem). To tell this apart from a potential firewall problem (i.e. the firewall allowing port 22 but denying your application port), try to telnet to the target host:port rather than ssh-ing to port 22.
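
The telnet test can also be approximated in code with a bare connect() that skips the explicit bind. The sketch below is an assumption-laden illustration: probeTcp is a hypothetical helper, it handles only dotted-quad IPv4 addresses, and it blocks for as long as the operating system's own connect timeout.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
#include <cstdint>
#include <cstring>

// Hypothetical helper: try one TCP connect to ip:port without binding the
// local end. Returns 0 on success, otherwise the errno of the failing call.
int probeTcp(const char* ip, std::uint16_t port)
{
  sockaddr_in remote;
  std::memset(&remote, 0, sizeof(remote));
  remote.sin_family = AF_INET;
  remote.sin_port = htons(port);
  if (inet_pton(AF_INET, ip, &remote.sin_addr) != 1) {
    return EINVAL;  // not a dotted-quad IPv4 address
  }

  int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return errno;
  }
  int result = 0;
  if (::connect(fd, reinterpret_cast<sockaddr*>(&remote),
                sizeof(remote)) < 0) {
    result = errno;  // save before close() can overwrite it
  }
  ::close(fd);
  return result;
}
```

If the unbound probe to the application port succeeds while the bound socket fails, the local bind is the likelier suspect. If both fail with EHOSTUNREACH while port 22 connects, that points toward filtering on the destination port.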

caseq
Telnet to the port upon which connect fails also gets the "No route to host." error.
WilliamKF