Hi all, I'm using Apache HttpClient 4.x to make HEAD requests to URIs, only to get the final URL that a link resolves to after its 302 redirects. E.g., http://bit.ly/test1231 really points to cnn.com or something. What would be the best and most efficient way to achieve this with HttpClient, in a server that could run for months without leaking? Right now I'm running into the issue that every x minutes all the threads freeze while trying to pull a connection out of the pool, and they all time out.

I'm planning on having 100 worker threads doing the fetching, so I'm using the thread-safe connection manager (ThreadSafeClientConnManager).
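For the redirect-resolution part of the question, here is a minimal sketch of the chain-following logic using only the JDK. It is not HttpClient code: the `HeadFetcher` interface, the `RedirectResolver` class, and the example URLs are hypothetical names made up for illustration, and the actual HEAD request is assumed to happen behind `locationOf`. It mainly shows two details that are easy to miss: a relative `Location` value has to be resolved against the current URI, and the number of requests should be capped so a redirect loop cannot tie up a worker forever.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Follows a redirect chain by hand; the HTTP call itself is abstracted away. */
class RedirectResolver {

    /** Hypothetical hook: returns the Location header for a URI, or null if the response is not a redirect. */
    interface HeadFetcher {
        String locationOf(URI uri);
    }

    /** Issues at most maxRequests lookups and returns the first URI that does not redirect. */
    static URI resolveFinal(URI start, HeadFetcher fetcher, int maxRequests) {
        URI current = start;
        for (int i = 0; i < maxRequests; i++) {
            String location = fetcher.locationOf(current);
            if (location == null) {
                return current; // not a redirect: this is the final URL
            }
            // Location may be relative, so resolve it against the current URI.
            current = current.resolve(location);
        }
        throw new IllegalStateException("No final URL after " + maxRequests + " requests from " + start);
    }

    public static void main(String[] args) {
        // Made-up redirect table standing in for real HEAD responses.
        Map<String, String> fake = new HashMap<>();
        fake.put("http://bit.ly/test1231", "http://example.com/a");
        fake.put("http://example.com/a", "/b");
        URI result = resolveFinal(URI.create("http://bit.ly/test1231"),
                                  u -> fake.get(u.toString()), 5);
        System.out.println(result);
    }
}
```

In a real worker, `locationOf` would wrap the HEAD request and must always give its connection back to the pool, including on error paths.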

UPDATE: Here is the code I'm using to get an HttpClient object:

HttpParams httpParams = new BasicHttpParams();
HttpConnectionParams.setConnectionTimeout(httpParams, 5000);
HttpConnectionParams.setSoTimeout(httpParams, 5000);
ConnManagerParams.setMaxTotalConnections(httpParams, 5000);
HttpProtocolParams.setVersion(httpParams, HttpVersion.HTTP_1_1);

ConnManagerParams.setMaxConnectionsPerRoute(httpParams, new ConnPerRoute() {
  @Override
  public int getMaxForRoute(HttpRoute route) {
    return 35;
  }
});

// cookie store that never keeps anything
emptyCookieStore = new CookieStore() {
  ArrayList<Cookie> emptyList = new ArrayList<Cookie>();

  @Override
  public void addCookie(Cookie cookie) {
  }

  @Override
  public List<Cookie> getCookies() {
    return emptyList;
  }

  @Override
  public boolean clearExpired(Date date) {
    return false;
  }

  @Override
  public void clear() {
  }
};



// set request params
httpParams.setParameter("http.protocol.cookie-policy", CookiePolicy.BROWSER_COMPATIBILITY);
httpParams.setParameter("http.useragent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
httpParams.setParameter("http.language.Accept-Language", "en-us");
httpParams.setParameter("http.protocol.content-charset", "UTF-8");
httpParams.setParameter("Accept", "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5");
httpParams.setParameter("Cache-Control", "max-age=0");

SchemeRegistry schemeRegistry = new SchemeRegistry();
schemeRegistry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
// https needs an SSL socket factory, not the plain one
schemeRegistry.register(new Scheme("https", SSLSocketFactory.getSocketFactory(), 443));

final ClientConnectionManager cm = new ThreadSafeClientConnManager(httpParams, schemeRegistry);

DefaultHttpClient httpClient = new DefaultHttpClient(cm, httpParams);
httpClient.getParams().setParameter("http.conn-manager.timeout", 120000L);
httpClient.getParams().setParameter("http.protocol.wait-for-continue", 10000L);
httpClient.getParams().setParameter("http.tcp.nodelay", true);
A: 

Most likely you have too many worker threads contending for very few connections. Please make sure the maximum-connections-per-route limit is set to a reasonable value. (By default the limit is two concurrent connections per route, as recommended by the HTTP specification.)

oleg
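To make the contention concrete, here is a toy model of a per-route limit built on a JDK `Semaphore`. It is only a sketch under assumptions: `RouteLimit`, `lease`, and `release` are hypothetical names, and this is not how HttpClient's connection manager is actually implemented. It shows the behaviour described above: once every slot for a route is leased, the next caller waits for the timeout and then gives up.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a per-route connection limit; not HttpClient's actual pool. */
class RouteLimit {
    private final Semaphore slots;

    RouteLimit(int maxPerRoute) {
        this.slots = new Semaphore(maxPerRoute, true); // fair: waiters are served in arrival order
    }

    /** Waits up to timeoutMs for a free slot; false means the wait timed out or was interrupted. */
    boolean lease(long timeoutMs) {
        try {
            return slots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Hands a slot back; a worker that never calls this permanently shrinks the limit. */
    void release() {
        slots.release();
    }

    public static void main(String[] args) {
        RouteLimit route = new RouteLimit(2);
        System.out.println(route.lease(50)); // takes slot 1
        System.out.println(route.lease(50)); // takes slot 2
        System.out.println(route.lease(50)); // no slot left: waits 50 ms, then gives up
        route.release();
        System.out.println(route.lease(50)); // a slot is free again
    }
}
```

With 100 workers and 35 slots for one host, up to 65 workers can be sitting in that wait at once. The same model also suggests why a single code path that skips `release()` could produce a freeze that only shows up after some minutes.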
Thanks oleg, I have updated the question to show my code. I currently have it set to 35 per host; however, if I have 100 threads all trying to hit bit.ly at the same time, perhaps I should use 100?
It is usually recommended to have the same number of worker threads as maximum connections per route.
oleg
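The sizing rule can be checked with a small JDK-only experiment. This is a sketch, not a benchmark: `WorkerSizing` and `countStarved` are hypothetical names, the `Semaphore` again stands in for the per-route limit, and the 1 ms sleep stands in for a request. Each job tries to take a slot without waiting, and the method counts how many jobs found none free.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerSizing {

    /** Runs `tasks` jobs on `workers` threads against `maxPerRoute` slots; returns how many jobs found no free slot. */
    static int countStarved(int workers, int maxPerRoute, int tasks) {
        Semaphore slots = new Semaphore(maxPerRoute);
        AtomicInteger starved = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                if (!slots.tryAcquire()) {      // non-blocking: just record the miss
                    starved.incrementAndGet();
                    return;
                }
                try {
                    Thread.sleep(1);            // pretend to do a request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    slots.release();            // always hand the slot back
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return starved.get();
    }

    public static void main(String[] args) {
        System.out.println("100 workers, 100 slots: " + countStarved(100, 100, 1000));
        System.out.println("100 workers,  35 slots: " + countStarved(100, 35, 1000));
    }
}
```

When the number of workers equals the number of slots, no job can ever find the limit exhausted, because at most `workers - 1` other jobs hold a slot at any moment. With more workers than slots, misses become possible, and how many occur depends on timing.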
Should I set setConnectionStaleCheckingEnabled to false? If so, how is that done using HttpClient 4.x?
Think I found it: httpParams.setParameter("http.connection.stalecheck", false);