Apache Commons HTTPClient: "Response content length is not known" Warning from HttpMethodBase.readResponseBody()

This post is an attempt to document a strange occurrence when using the Apache Commons HTTPClient to establish an HTTPS connection through a web-proxy.

In my travels at work, I recently came across an interesting situation using the Apache Commons HTTPClient library.  My employer forces all outgoing HTTP traffic through a web-proxy.  So whenever my project needs to reach a secure (HTTPS) web-server outside of the corporate firewall, the HTTPClient first has to open a TCP tunnel through that proxy.

In doing so, I see a ton of these casual INFO messages from the HTTPClient library in my log files:

INFO: Response content length is not known Apr 10, 2009 1:12:26 AM
org.apache.commons.httpclient.HttpMethodBase readResponseBody
INFO: Response content length is not known Apr 10, 2009 1:12:29 AM
org.apache.commons.httpclient.HttpMethodBase readResponseBody
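
If the messages turn out to be nothing but noise, the logger name in the output above is enough to turn them down.  Here's a minimal sketch, assuming Commons Logging is handing these records off to java.util.logging (which the record format above suggests).  The class name QuietHttpClientLogging and the silenceInfo() helper are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class QuietHttpClientLogging {

    // Assumption: the logger is named after the class that appears in the log record.
    // Keep a strong reference, because java.util.logging only holds loggers weakly
    // and a collected logger would lose the level set below.
    private static final Logger HTTP_METHOD_BASE =
            Logger.getLogger("org.apache.commons.httpclient.HttpMethodBase");

    /** Raises the threshold so INFO records from this one logger are dropped. */
    static void silenceInfo() {
        HTTP_METHOD_BASE.setLevel(Level.WARNING);
    }

    public static void main(String[] args) {
        silenceInfo();
        System.out.println("INFO loggable:    " + HTTP_METHOD_BASE.isLoggable(Level.INFO));
        System.out.println("WARNING loggable: " + HTTP_METHOD_BASE.isLoggable(Level.WARNING));
    }
}
```

This only hides the message.  It doesn't change whatever the HTTPClient does with the connection afterwards.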

It looks like the message is coming from the readResponseBody() method of the HttpMethodBase class.  What gives, man?
From what I understand, the HTTPClient is whining that the HTTP response doesn't contain a Content-Length header.  Initially, I thought that the web-server I was communicating with via HTTPS wasn't obeying some standard HTTP rules.  However, it wasn't the web-server.  Rather, the message is generated when the HTTPClient opens a TCP tunnel through the proxy using the CONNECT method.  Based on my analysis, here's what happens when you use HTTPS through a web-proxy:

:: browser sends a CONNECT request to the configured web-proxy

CONNECT server.example.com:443 HTTP/1.1
User-Agent: Mozilla/5.0

:: web-proxy establishes TCP tunnel to server.example.com:443
:: web-proxy returns Connection established to browser

HTTP/1.1 200 Connection established
Content-Length: 0
Connection: Keep-Alive

:: the Content-Length header shown above is missing from
:: the response my proxy actually sends. Plus, the proxy
:: returns a Connection: Keep-Alive header, which supposedly
:: violates the protocol here.

:: browser starts sending data through tunnel to secure server
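
The exchange above can be sketched with plain sockets.  This is only an illustration of the protocol steps, not how the HTTPClient does it internally.  The class and method names are hypothetical, and sending a Host header on the CONNECT request is my assumption of what a well-behaved HTTP/1.1 client would do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ConnectTunnelSketch {

    /** Builds the CONNECT request head that is sent to the proxy. */
    static String buildConnectRequest(String host, int port) {
        String target = host + ":" + port;
        return "CONNECT " + target + " HTTP/1.1\r\n"
             + "Host: " + target + "\r\n"
             + "\r\n";
    }

    /**
     * Reads the response head one byte at a time and stops right after the
     * blank line, so no tunnel bytes are consumed. Status line comes first.
     */
    static List<String> readResponseHead(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                continue; // tolerate both CRLF and bare LF line endings
            }
            if (b == '\n') {
                if (line.size() == 0) {
                    return lines; // blank line: end of the head
                }
                lines.add(new String(line.toByteArray(), StandardCharsets.ISO_8859_1));
                line.reset();
            } else {
                line.write(b);
            }
        }
        throw new IOException("Connection closed before end of response head");
    }

    /** True if the status line carries a 2xx status code. */
    static boolean isTunnelEstablished(List<String> head) {
        if (head.isEmpty()) {
            return false;
        }
        String[] parts = head.get(0).split(" ", 3);
        return parts.length >= 2 && parts[1].length() == 3 && parts[1].startsWith("2");
    }

    /** Case-insensitive check for a header name among the header lines. */
    static boolean hasHeader(List<String> head, String name) {
        String prefix = name.toLowerCase(Locale.ROOT) + ":";
        for (int i = 1; i < head.size(); i++) {
            if (head.get(i).toLowerCase(Locale.ROOT).startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Opens the tunnel; on success the socket carries raw bytes to host:port. */
    static Socket openTunnel(String proxyHost, int proxyPort, String host, int port)
            throws IOException {
        Socket socket = new Socket(proxyHost, proxyPort);
        try {
            OutputStream out = socket.getOutputStream();
            out.write(buildConnectRequest(host, port).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            List<String> head = readResponseHead(socket.getInputStream());
            if (!isTunnelEstablished(head)) {
                throw new IOException("Proxy refused CONNECT: "
                        + (head.isEmpty() ? "<empty response>" : head.get(0)));
            }
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Canned copy of what my proxy sends back (no Content-Length), so no network is needed.
        String canned = "HTTP/1.1 200 Connection established\r\n"
                      + "Connection: Keep-Alive\r\n\r\n";
        List<String> head = readResponseHead(
                new ByteArrayInputStream(canned.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(head.get(0));
        System.out.println("tunnel established: " + isTunnelEstablished(head));
        System.out.println("has Content-Length: " + hasHeader(head, "Content-Length"));
    }
}
```

Notice that the sketch never needs a Content-Length to know where the proxy's reply ends: the blank line after the headers is the boundary, and everything after it belongs to the tunnel.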

So, the problem here appears to be a missing "Content-Length" header and a possibly invalid "Connection: Keep-Alive" header on the HTTP/1.1 response.  The corporate web-proxy I'm using does NOT return a Content-Length header when it establishes a connection.  My guess is that when the HTTPClient processes the response (tries to read the response body) it doesn't know how many bytes to read, because the Content-Length header is missing.  Or, it sees the Connection: Keep-Alive header (expecting a Connection: close instead?) and kinda freaks out.

In this case, it's possible that one of two things is happening:

  1. The web-proxy is violating HTTP by not returning a Content-Length header on the response when it should.  However, I couldn't find any official specs or documentation on opening TCP tunnels through a web-proxy other than this draft spec dated August 1998.  This spec does not state whether an HTTP/1.1 200 Connection established response must include a Content-Length header.

  2. The HTTPClient is wrong, and lazily checks every response for a Content-Length header, including a successful HTTP/1.1 200 Connection established.  If the Content-Length header doesn't exist, then it logs the message.
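
To make the two possibilities a bit more concrete, here's a sketch of how a generic HTTP/1.1 client might decide where a response body ends.  This is NOT the HTTPClient's source; the class, enum, and method names are hypothetical, and header names are assumed to be lower-cased already.  The CONNECT special case follows later revisions of the HTTP/1.1 spec (RFC 7230, section 3.3.3), which treat a 2xx reply to CONNECT as having no body at all.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class BodyFramingSketch {

    enum Framing { NO_BODY, CHUNKED, FIXED_LENGTH, UNTIL_CLOSE }

    /**
     * Decides how the end of the response body is found.
     * Assumption: header names in the map are already lower-cased.
     */
    static Framing decide(boolean connectRequest, int status, Map<String, String> headers) {
        // A successful CONNECT turns the connection into a tunnel: nothing to read.
        if (connectRequest && status / 100 == 2) {
            return Framing.NO_BODY;
        }
        // 1xx, 204 and 304 responses never carry a body.
        if (status / 100 == 1 || status == 204 || status == 304) {
            return Framing.NO_BODY;
        }
        String te = headers.get("transfer-encoding");
        if (te != null && te.trim().toLowerCase(Locale.ROOT).endsWith("chunked")) {
            return Framing.CHUNKED;
        }
        if (headers.containsKey("content-length")) {
            return Framing.FIXED_LENGTH;
        }
        // No length information: the body runs until the peer closes the connection.
        return Framing.UNTIL_CLOSE;
    }

    /** The awkward case: length unknown, yet the peer did not announce a close. */
    static boolean lengthUnknownButKeptAlive(Framing framing, Map<String, String> headers) {
        String connection = headers.get("connection");
        boolean closing = connection != null && "close".equalsIgnoreCase(connection.trim());
        return framing == Framing.UNTIL_CLOSE && !closing;
    }

    public static void main(String[] args) {
        // Headers as my proxy sends them: Keep-Alive, no Content-Length.
        Map<String, String> headers = new HashMap<>();
        headers.put("connection", "Keep-Alive");

        Framing asPlainResponse = decide(false, 200, headers);
        Framing asConnectReply = decide(true, 200, headers);

        System.out.println("treated as ordinary 200: " + asPlainResponse
                + ", awkward: " + lengthUnknownButKeptAlive(asPlainResponse, headers));
        System.out.println("treated as CONNECT reply: " + asConnectReply
                + ", awkward: " + lengthUnknownButKeptAlive(asConnectReply, headers));
    }
}
```

A client that handles the proxy's reply like any ordinary 200 lands in the last branch of decide(), which is the situation the log message seems to describe.  A client that special-cases CONNECT never gets that far.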

That's about all I know on this problem, and I'm not immediately sure who is wrong: the proxy, or the HTTPClient?  If you know more about this than I do, please let me know.

Cheers.

About Mark

A Silicon Valley native, Mark Kolich is a full-time Software Engineer, a casual entrepreneur, and a consultant for hire. A web technologies expert, his current focus is on building powerful and robust cloud-driven web-applications using Java, PHP, Perl, AJAX, DHTML, CSS, and JavaScript. His favorite programming languages are PHP, Java and JavaScript. He uses Linux, enjoys biking to work, loves building great software, and always writes elegant, readable, and maintainable code.

About this Entry

This page contains a single entry by Mark Kolich published on July 9, 2009 10:30 PM.
