Content tagged: apache

Configuring Apache to Tunnel SSH Through an HTTP Web-Proxy with Proxytunnel

c46fbf62b310fb6baa264336e9c4815199d3284a

Sat Dec 31 13:00:00 2011 -0800

Here’s the situation: I’m often on a network that does not allow outbound traffic on port 22, meaning I cannot directly “SSH out” from that network to my Linux box at home. Fair enough. However, this network does allow outbound traffic on ports 80, 443, and 8443 via a web-proxy. So, if I want to “SSH out” from this network to my Linux box at home, I can do so with a little tweaking of my remote Apache server and my local SSH client.

Here’s how …

Overview

First, you’ll need to configure your Apache web-server to accept traffic on a port that’s acceptable to the web-proxy. In my case, I don’t have anything running on port 8443, and the web-proxy allows traffic through port 8443, so that’s perfect. Apache will be configured to listen on 8443, and act as a “proxy” between an SSH client and an SSH server (usually the SSH server running on the box you’re trying to connect to).

Second, on the client side, you’ll be using something like Proxytunnel to punch a hole through the web-proxy allowing your SSH client to connect to an SSH server of your choice.

Putting it all together, the basic flow is …

  1. Your local SSH client uses Proxytunnel to connect to web-proxy.corp.example.com:3128
  2. Web-proxy.corp.example.com connects to Apache running at yourwebserver:8443
  3. Your Apache server, acting as yet another proxy, connects to yoursshserver:22
  4. It works!

Install and Configure Proxytunnel

If you’re on Ubuntu, you can install Proxytunnel with the following command:

#/> sudo apt-get install proxytunnel

Once installed, edit your ~/.ssh/config file to instruct your SSH client to use Proxytunnel when connecting to the destination host:

## ~/.ssh/config

Host kolich.com
  Hostname kolich.com
  ServerAliveInterval 30
  ## ProxyCommand must stay on a single line; ssh_config has no
  ## backslash line continuation.
  ProxyCommand /usr/bin/proxytunnel -p web-proxy.corp.example.com:3128 -r kolich.com:8443 -d %h:%p -H "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Win32)"

In this example, when my SSH client makes an attempt to connect to kolich.com:22, it will spawn Proxytunnel which will then route my connection through web-proxy.corp.example.com:3128 and on to kolich.com:8443. This seems really convoluted, but it actually works quite well.

Note that I’m spoofing a somewhat real User-Agent to prevent suspicion from the system administrators running web-proxy.corp.example.com. If you’re a system administrator that runs such a web-proxy, please accept my apologies for making your life even more difficult.

Configure mod_proxy on Apache

Now that you’ve got the client part figured out, you’ll need to configure Apache’s mod_proxy module to proxy traffic between yourwebserver:8443 and yoursshserver:22. In all likelihood, your web-server and SSH server are the same box. At least, in my home, they are.

Oh, and I assume you already have Apache and mod_proxy installed, and working. There are a ton of other tutorials and nice blog posts online about how to install and set up Apache if you don’t already have it installed and functional.

In my Apache virtual host configuration, I’ve added another V-host listening on port 8443 that will only accept CONNECT requests bound for kolich.com on port 22:

## Load the required modules.
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so

## Listen on port 8443 (in addition to other ports like 80 or 443)
Listen 8443

<VirtualHost *:8443>

  ServerName yourwebserver:8443
  DocumentRoot /some/path/maybe/not/required
  ServerAdmin admin@example.com

  ## Only ever allow incoming HTTP CONNECT requests.
  ## Explicitly deny other request types like GET, POST, etc.
  ## This tells Apache to return a 403 Forbidden if this virtual
  ## host receives anything other than an HTTP CONNECT.
  RewriteEngine On
  RewriteCond %{REQUEST_METHOD} !^CONNECT [NC]
  RewriteRule ^/(.*)$ - [F,L]

  ## Set up proxying between yourwebserver:8443 and yoursshserver:22

  ProxyRequests On
  ProxyBadHeader Ignore
  ProxyVia Full

  ## IMPORTANT: The AllowCONNECT directive specifies a list
  ## of port numbers to which the proxy CONNECT method may
  ## connect.  For security, only allow CONNECT requests
  ## bound for port 22.
  AllowCONNECT 22

  ## IMPORTANT: By default, deny everyone.  If you don't do this
  ## others will be able to connect to port 22 on any host.
  <Proxy *>
    Order deny,allow
    Deny from all
  </Proxy>

  ## Now, only allow CONNECT requests bound for kolich.com
  ## Should be replaced with yoursshserver.com or the hostname
  ## of whatever SSH server you're trying to connect to.  Note
  ## that ProxyMatch takes a regular expression, so you can do
  ## things like (kolich\.com|anothersshserver\.com) if you want
  ## to allow connections to multiple destinations.
  <ProxyMatch (kolich\.com)>
    Order allow,deny
    Allow from all
  </ProxyMatch>

  ## Logging, always a good idea.
  LogLevel warn
  ErrorLog logs/yourwebserver-proxy_error_log
  CustomLog logs/yourwebserver-proxy_request_log combined

</VirtualHost>

Once you get everything integrated, restart Apache and you should be golden.
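To make the two access checks above a bit more concrete, here is a minimal Python sketch of the decision they describe: the port must be in the AllowCONNECT list, and the target must match the ProxyMatch pattern. This is only an illustration, not how Apache implements it. The names connect_allowed, ALLOWED_PORTS and ALLOWED_HOSTS are made up for this example, and I'm assuming the pattern is tested (unanchored) against the host:port string of the CONNECT request.

```python
import re

# Assumption: mirrors "AllowCONNECT 22" from the virtual host above.
ALLOWED_PORTS = {22}
# Assumption: mirrors "<ProxyMatch (kolich\.com)>" from the virtual host above.
ALLOWED_HOSTS = re.compile(r"(kolich\.com)")


def connect_allowed(target: str) -> bool:
    """Return True if a CONNECT to 'host:port' passes both checks."""
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit():
        return False
    if int(port) not in ALLOWED_PORTS:
        return False
    # re.search is unanchored, so any target containing the pattern matches.
    return ALLOWED_HOSTS.search(target) is not None


print(connect_allowed("kolich.com:22"))     # True
print(connect_allowed("kolich.com:25"))     # False, port not allowed
print(connect_allowed("example.org:22"))    # False, host not allowed
print(connect_allowed("notkolich.com:22"))  # True, the pattern is unanchored
```

The last line is worth a second look: under these assumptions an unanchored pattern also lets through hostnames that merely contain kolich.com, so a tighter expression may be worth considering if that matters for your setup.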

Under the Hood

To prove that everything works, let’s try a few things.

First I’m going to telnet to web-proxy.corp.example.com:3128. Then, I’m going to tell it to connect to kolich.com:8443. Finally, I’m going to tell Apache on kolich.com:8443 to connect to kolich.com:22. This is exactly the same flow used by Proxytunnel under the hood.

(mark@ubuntu)~> telnet web-proxy.corp.example.com 3128
Trying 10.10.10.10...
Connected to web-proxy.corp.example.com (10.10.10.10).
Escape character is '^]'.
CONNECT kolich.com:8443 HTTP/1.1
Host: kolich.com

HTTP/1.0 200 Connection Established

CONNECT kolich.com:22 HTTP/1.1
Host: kolich.com

HTTP/1.0 200 Connection Established
Proxy-agent: Apache

SSH-2.0-OpenSSH_4.3

Sweet! Notice the raw “SSH-2.0-OpenSSH_4.3” response from the SSH server, indicating a successful connection. Now, if I were a real SSH client, I’d continue the handshake and away we go.
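The two requests typed by hand in the telnet session can also be sketched in a few lines of Python. This is a rough illustration only, not Proxytunnel's actual code; build_connect and tunnel_established are hypothetical helper names, and nothing here touches the network.

```python
def build_connect(host: str, port: int) -> bytes:
    """Build the raw CONNECT request typed by hand in the telnet session."""
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(status_line: str) -> bool:
    """Return True if a proxy's status line is a 2xx reply to CONNECT."""
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return False
    return parts[1].isdigit() and 200 <= int(parts[1]) < 300


# Hop 1 goes to the corporate web-proxy, hop 2 goes to Apache on port 8443.
first_hop = build_connect("kolich.com", 8443)
second_hop = build_connect("kolich.com", 22)
print(first_hop.decode("ascii"))
print(tunnel_established("HTTP/1.0 200 Connection Established"))  # True
```

A real client would write first_hop to the socket, wait for a 2xx status line, then write second_hop over the same connection before handing the socket to SSH.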

So, from a real SSH client with Proxytunnel enabled …

(mark@ubuntu)~> ssh mark@kolich.com
Via web-proxy.corp.example.com:3128 -> kolich.com:8443 -> kolich.com:22
mark@kolich.com's password:

Last login: Sat Dec 31 12:53:22 2011 from gateway.kolich.local
(mark@server)~>

It works! Notice the intermediate “Via web-proxy.corp.example.com:3128 -> kolich.com:8443 -> kolich.com:22” output from Proxytunnel telling me what it’s doing to connect. And of course, look at that beautiful shell prompt.

SSH through a web-proxy, I love it.

Enjoy.

Set the Cache-Control and Expires Headers on a Redirect with mod_rewrite

f13c4dfa40a5e39b6a36d8627d67ff6148580096

Sat Dec 11 12:55:00 2010 -0800

In many web-service infrastructures, it’s often desirable to disable the caching of redirects. Specifically, you might want to set the Expires or Cache-Control headers so that your 301 or 302 redirects from Apache’s mod_rewrite are never cached upstream. Off the top of my head, I can think of a number of reasons why you might want to prevent the caching of a redirect:

  • Your redirect may change from one request to the next. Disable caching so the client (the browser) isn’t redirected to the same destination every time.
  • Your web-application is behind a reverse caching proxy, and you don’t want the caching proxy to cache the redirect.
  • In development, you’re sitting behind a corporate web-proxy that is notorious for caching content when it really shouldn’t. Disable caching on the redirects so you can verify that your web-application is working as expected during testing (assuming the web-proxy obeys your Cache-Control and Expires headers).
  • Your web-application counts how many times someone is redirected. Disable caching so your click-through statistics are a bit more accurate.

Surprisingly, this seemingly common need isn’t well documented in the official Apache docs. So, here’s how to do it.

In this example, I’m redirecting based on the Host. If the incoming request does not match the Host I require, mod_rewrite triggers a 301 redirect to the correct Host. Of course, your RewriteCond’s might be different.

RewriteCond %{HTTP_HOST} !^mark\.koli\.ch [NC]
RewriteRule ^/(.*)$ http://mark.koli.ch/$1 [R=301,L,E=nocache:1]

## Set the response header if the "nocache" environment variable is set
## in the RewriteRule above.
Header always set Cache-Control "no-store, no-cache, must-revalidate" env=nocache

## Set Expires too ...
Header always set Expires "Thu, 01 Jan 1970 00:00:00 GMT" env=nocache

In this example, when the RewriteRule is fired the “nocache” environment variable is set. Note the E=nocache:1 rewrite flag in the RewriteRule. Subsequently, mod_headers will set the Cache-Control and Expires headers only if this “nocache” environment variable is set. In other words, “nocache” is only set on a 301 redirect from the RewriteRule.

This works nicely.

GET /wombat HTTP/1.1
Host: koli.ch

HTTP/1.1 301 Moved Permanently
Date: Sat, 11 Dec 2010 19:36:09 GMT
Location: http://mark.koli.ch/wombat
Server: Apache
Cache-Control: no-store, no-cache, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Length: 230
Content-Type: text/html; charset=utf-8
Connection: close
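If you want to check a response like the one above from a script instead of by eye, here is a small Python sketch. It parses a raw response head and reports whether it is a redirect carrying the no-cache headers. The function names are made up for this example, and the check (no-store plus an Expires header) is my own assumption about what counts as "not cacheable" here.

```python
def parse_response_head(raw: str):
    """Split a raw HTTP response head into (status_code, headers dict)."""
    lines = raw.strip().splitlines()
    status = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def is_uncached_redirect(raw: str) -> bool:
    """True for a 301/302 that carries no-store and an Expires header."""
    status, headers = parse_response_head(raw)
    cache_control = headers.get("cache-control", "")
    directives = {d.strip().lower() for d in cache_control.split(",")}
    return status in (301, 302) and "no-store" in directives and "expires" in headers


sample = """HTTP/1.1 301 Moved Permanently
Date: Sat, 11 Dec 2010 19:36:09 GMT
Location: http://mark.koli.ch/wombat
Cache-Control: no-store, no-cache, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
"""
print(is_uncached_redirect(sample))  # True
```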

Yay for HTTP.

Understanding the HTTP Vary Header and Caching Proxies (Squid, etc.)

49849b2ace9d855308d505b5fca86d6d9006542d

Sat Sep 25 01:10:00 2010 -0700

I never paid much attention to the HTTP Vary header. In fact, I’ve been fortunate enough to avoid it for this long and never really had to care much about it. Well, it turns out when you’re configuring a high-performance reverse proxy, understanding the Vary header and what it means to your reverse proxy caching policies is absolutely crucial.

Here’s an interesting problem I recently solved that dealt with Squid, Apache, and that elusive Vary response header.

The Vary Basics

Popular caching proxies, like Squid, usually generate a hash of the request from a number of inputs including the URI and the values of the request headers named in the Vary response header. When a caching proxy receives a request for a resource, it gathers these inputs, generates a hash, then checks its cache to see if it already has a resource sitting on disk, or in memory, that matches the computed hash. This is how Squid, and other caching proxies, fundamentally know if they have a cache HIT or MISS (e.g., can Squid return the content it has cached or does it need to fetch it from the destination server).

That in mind, you can probably see how the Vary header is quite important when a caching proxy is looking for a cache HIT or MISS. The Vary header is a way for the web-server to tell any intermediaries (caching proxies) which request headers, in addition to the URI, they must compare to figure out if a cached response can be reused for a new request. Sample Vary headers include:

Vary: Accept-Encoding
Vary: Accept-Encoding,User-Agent
Vary: X-Some-Custom-Header,Host
Vary: *

According to the HTTP spec, “the Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation.” Yep, that’s pretty important (I discovered this the hard way).

The Caching Problem

I configured Squid to act as a round-robin load balancer and caching proxy, sitting in front of about four Apache web-servers. Each Apache web-server was running a copy of my web-application, which I intended to have Squid cache where possible. Certain requests were for large JSON objects, and I explicitly configured Squid to cache requests ending in .json for 24 hours.

I opened a web-browser and visited a URL I expected to be cached (should have already been in the cache from a previous request, notice the HIT):

GET /path/big.json HTTP/1.1
Host: app.kolich.local
User-Agent: Firefox

HTTP/1.0 200 OK
Date: Fri, 24 Sep 2010 23:09:32 GMT
Content-Type: application/json;charset=UTF-8
Content-Language: en-US
Vary: Accept-Encoding,User-Agent
Age: 1235
X-Cache: HIT from cache.kolich.local
X-Cache-Lookup: HIT from cache.kolich.local:80
Content-Length: 25090
Connection: close

Ok, looks good! I opened a 2nd web-browser on a different machine (hint: with a different User-Agent) and tried again. This time, notice the X-Cache: MISS:

GET /path/big.json HTTP/1.1
Host: app.kolich.local
User-Agent: Chrome

HTTP/1.0 200 OK
Date: Fri, 24 Sep 2010 23:11:45 GMT
Content-Type: application/json;charset=UTF-8
Content-Language: en-US
Vary: Accept-Encoding,User-Agent
Age: 4
X-Cache: MISS from cache.kolich.local
X-Cache-Lookup: MISS from cache.kolich.local:80
Content-Length: 25090
Connection: close

Wow, look at that. I requested exactly the same resource, just from a different browser, and I saw a cache MISS. This is obviously not what I want. I need the same cached resource to be served up from the cache regardless of who’s making the request. If left alone, this is only caching a response per User-Agent, not globally per resource.

Solution: Check Your Vary Headers

Remember how I said the contents of the Vary header are important for caching proxies?

In both requests above, note the User-Agent request headers and the contents of the Vary response headers. Although each request was for exactly the same resource, Squid determined that they were very different as far as its cache was concerned. How did this happen? Well, take a peek at a Vary response header:

Vary: Accept-Encoding,User-Agent

This tells Squid that the request URI, the Accept-Encoding request header, and the User-Agent request header should be included in a hash when determining if an object is available in its cache, or not. Obviously, any reasonable hash of (URI, Accept-Encoding, “Firefox”) should not match the hash of (URI, Accept-Encoding, “Chrome”). Hence why Squid seemed to think the request was for different objects.
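Here is a rough Python sketch of that idea. It is not Squid's real algorithm (the hash function, key layout and the cache_key name are all assumptions made for illustration), but it shows why the two browsers land in different cache slots.

```python
import hashlib


def cache_key(uri: str, request_headers: dict, vary: str) -> str:
    """Hash the URI plus the request headers named in the Vary response header.

    request_headers is assumed to use lower-case header names.
    """
    parts = [uri]
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    for name in names:
        parts.append(f"{name}={request_headers.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


firefox = {"accept-encoding": "gzip", "user-agent": "Firefox"}
chrome = {"accept-encoding": "gzip", "user-agent": "Chrome"}
uri = "/path/big.json"

# With User-Agent in Vary, each browser gets its own cache entry.
print(cache_key(uri, firefox, "Accept-Encoding,User-Agent")
      == cache_key(uri, chrome, "Accept-Encoding,User-Agent"))  # False

# Without it, both browsers share one entry.
print(cache_key(uri, firefox, "Accept-Encoding")
      == cache_key(uri, chrome, "Accept-Encoding"))  # True
```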

To fix this, I located the source of the annoying User-Agent addition to my Vary response header, which happened to come from Apache’s very own mod_deflate configuration. The recommended mod_deflate configuration appends User-Agent to the Vary response header on every response that is not flagged dont-vary (in the sample configuration, everything except images). I don’t really see why this is necessary, but the Apache folks seemed to think this was important. Here are the relevant lines from the Apache suggested mod_deflate configuration:

SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|ico)$ no-gzip dont-vary
Header append Vary User-Agent env=!dont-vary

In any event, I removed the 2nd line above, restarted Apache and Squid began caching beautifully regardless of which client issued the request. Essentially, I told Squid to stop caring about the User-Agent by removing User-Agent from my Vary response header, and problem solved!

The joys of HTTP.

HTTP Digest Access Authentication using MD5 and HttpClient 4

83e158023b85f1d9bec507a18516b1a6552e8b3b

Tue May 04 14:30:00 2010 -0700

Dealing with HTTP’s Digest authentication mechanism isn’t too bad once you have the basic building blocks in place. Luckily HttpClient 4 can automatically solve many types of authentication challenges for you, if used correctly. Using HttpClient 4, I built an app that authenticates against a SOAP-based web-service requiring WWW-Authenticate Digest authentication. In a nutshell, the fundamental principle behind HTTP Digest authentication is simple:

  • The client asks for a page that requires authentication.
  • The server responds with an HTTP 401 response code, providing the authentication realm and a randomly-generated, single-use value called a “nonce”. The authentication “challenge” itself is encapsulated inside of the WWW-Authenticate HTTP response header.
  • The client “solves” the authentication challenge and a solution is sent back to the web-server via the HTTP Authorization header on a subsequent request. The solution usually contains some type of MD5 hashed mess of your username, password, and “nonce”.
  • Assuming the solution is acceptable the server responds with a successful type response, usually an HTTP 200 OK.

Here’s a sample with a bit of pseudo code mixed in (so, you get the idea):

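// Imports, assuming HttpClient 4.0.x is on the classpath.
import java.net.URL;
import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.HttpStatus;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.auth.DigestScheme;
import org.apache.http.message.BasicHttpRequest;

// doPost(...) below is pseudo code for a helper that issues the
// POST and returns the HttpResponse; it is not part of HttpClient.

// An org.apache.http.impl.auth.DigestScheme instance is
// what will process the challenge from the web-server
final DigestScheme md5Auth = new DigestScheme();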

// This should return an HTTP 401 Unauthorized with
// a challenge to solve.
final HttpResponse authResponse = doPost(url, postBody, contentType);

// Validate that we got an HTTP 401 back
if(authResponse.getStatusLine().getStatusCode() == HttpStatus.SC_UNAUTHORIZED) {
  if(authResponse.containsHeader("WWW-Authenticate")) {
    // Get the challenge.
    final Header challenge = authResponse.getHeaders("WWW-Authenticate")[0];
    // Solve it.
    md5Auth.processChallenge(challenge);
    // Generate a solution Authentication header using your
    // username and password.
    final Header solution = md5Auth.authenticate(
      new UsernamePasswordCredentials(username, password),
      new BasicHttpRequest(HttpPost.METHOD_NAME,
          new URL(url).getPath()));
    // Do another POST, but this time include the solution
    // Authentication header as generated by HttpClient.
    final HttpResponse goodResponse =
      doPost(url, postBody, contentType, solution);
    // ... do something useful with goodResponse, which assuming
    // your credentials were valid, should contain the data you
    // requested.
  } else {
    throw new IllegalStateException("Web-service responded with Http 401, " +
      "but didn't send us a usable WWW-Authenticate header.");
  }
} else {
  throw new IllegalStateException("Didn't get an Http 401 " +
    "like we were expecting.");
}
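For the curious, the “MD5 hashed mess” that DigestScheme produces can be reproduced by hand. The Python sketch below follows the RFC 2617 formula for the common qop=auth case with the plain MD5 algorithm. The function names are my own, and the sample values are the ones from the worked example in RFC 2617, not from any real service.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth (RFC 2617)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who you are
    ha2 = md5_hex(f"{method}:{uri}")                 # what you are asking for
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Values taken from the example in RFC 2617, section 3.5.
print(digest_response(
    username="Mufasa",
    realm="testrealm@host.com",
    password="Circle Of Life",
    method="GET",
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
))  # 6629fae49393a05397450978507c4ef1
```

The password itself never goes over the wire. Only the final hash does, which is the whole point of Digest over Basic.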

Enjoy.

Setting Up Your Own SVN Server (Using Apache and mod_dav_svn)

4d093beb39aeb2a75968339b6bf7f02eabe2dc6d

Tue Mar 16 15:30:00 2010 -0700

Setting up your own SVN source control server is surprisingly easy. At home, I recently set up an SVN server in a CentOS 5.4 virtual machine with Apache 2.2 and mod_dav_svn. With a little work, I had a secure and fully functional SVN server up and running in about 20 minutes.

Note that this HOWTO is specific to CentOS/RHEL/Fedora. The location of configuration files, and other tools, might be different depending on your Linux distro. For the most part though, everything should be pretty similar and you should be able to figure it out.

Install Apache, Subversion, mod_dav_svn and mod_ssl

On CentOS, installing the Apache web-server, Subversion, and the Apache mod_dav_svn and mod_ssl modules is a snap with yum:

#(root)/> yum -y install httpd subversion mod_dav_svn mod_ssl openssl

If you’re on Ubuntu you can probably install the required packages using apt-get install. Note that you need to install mod_ssl if you plan on securing your SVN server with HTTPS. If you don’t care about HTTPS, then you can ignore mod_ssl and skip the “Configure mod_ssl and Setup HTTPS” section below.

Create your SVN Root Directory Structure

On my SVN server, I created a new SVN root at /svn. From here on out, all of my SVN repositories will live under /svn/repos:

#(root)/> mkdir -p /svn/repos
#(root)/> chown -R apache:apache /svn

Once done, you’re ready to create your first SVN repository.

Create your First Repository

Using the svnadmin command, create a repository under /svn/repos. For the sake of this example, the repository I’m creating is named myproject. Of course, you can name your own repository whatever you’d like. Oh, and you can create as many repositories as you’d like under /svn/repos.

#(root)/> cd /svn/repos
#(root)/svn/repos> svnadmin create --fs-type fsfs myproject
#(root)/svn/repos> chown -R apache:apache myproject
#(root)/svn/repos> chmod -R g+w myproject
#(root)/svn/repos> chmod g+s myproject/db

You’ll notice that the svnadmin command created a new directory named myproject/. If you look inside myproject/ you’ll see a bunch of SVN repository data and configuration files.

#(root)/svn/repos> ll myproject
total 28
drwxrwxr-x 2 apache apache 4096 Mar 13 11:49 conf
drwxrwxr-x 2 apache apache 4096 Mar 13 12:01 dav
drwxrwsr-x 5 apache apache 4096 Mar 13 12:23 db
-r--rw-r-- 1 apache apache 2    Mar 13 11:49 format
drwxrwxr-x 2 apache apache 4096 Mar 13 11:49 hooks
drwxrwxr-x 2 apache apache 4096 Mar 13 11:49 locks
-rw-rw-r-- 1 apache apache 229  Mar 13 11:49 README.txt

Great, looks like our new SVN repository was set up correctly!

Configure mod_dav_svn

Now that you have an SVN repository set up and ready to go, let’s configure Apache and the mod_dav_svn module. Open /etc/httpd/conf.d/subversion.conf in your favorite text editor, and tweak the configuration to match your installation. My subversion.conf file looks like this:

LoadModule dav_svn_module     modules/mod_dav_svn.so
LoadModule authz_svn_module   modules/mod_authz_svn.so

<Location /svn>

   DAV svn
   SVNParentPath /svn/repos

   # Require SSL connection for password protection.
   SSLRequireSSL

   AuthType Basic
   AuthName "Marks SVN Server"
   AuthUserFile /svn/repos/users
   Require valid-user

</Location>

First, note that when you install mod_dav_svn using yum, the installation process will create a standard cookie cutter template /etc/httpd/conf.d/subversion.conf for you. This template has a LimitExcept directive in it, and a few other things. For security, I think it’s best to require a user to authenticate before they are able to issue any request. Hence, why I removed the LimitExcept directive and did my own thing. If you want your SVN server to be read-only for anonymous users, and read-write for authenticated users, then my subversion.conf file is not for you. My subversion.conf file shown above allows no anonymous access; all users must authenticate (enter a valid username and password) before they can do anything with the SVN server.

Second, note that I have enabled the SSLRequireSSL directive. This is a mod_ssl directive that tells Apache to reject all non-HTTPS requests to this location. This ensures that any communication between the server and my SVN client will be sent via HTTPS; usernames, passwords, and source code will be reasonably secured. I’ll show you how to set up HTTPS in a moment. Note that if you don’t want to enable HTTPS on your SVN server, then you can comment out or remove the SSLRequireSSL line in your subversion.conf configuration file.

Finally, note that my AuthUserFile is /svn/repos/users. This is a standard Apache htpasswd file that we’ll create in the next step.

Create your SVN Users File

Create your SVN users file using the htpasswd command. This is the file that stores a list of usernames and passwords declaring who is allowed to access your SVN server.

#(root)/> htpasswd -c /svn/repos/users mark

Replace “mark” above with your desired username. To add more users, run the command again without the -c flag, since -c creates the file from scratch and would wipe out the users already in it.

Configure mod_ssl and Setup HTTPS

If you have decided to make your SVN server an HTTPS only server, we’ll need to set up Apache’s HTTPS configuration. This involves tweaking /etc/httpd/conf.d/ssl.conf and creating a new self-signed SSL certificate. For your convenience, I’ve included the instructions below. Note that my SVN server is named svn.kolich.local; yours will obviously be different. Whatever it is, make sure that you enter the correct server name when openssl prompts you for a “Common Name” in your certificate. The “Common Name” in your SSL certificate should match the fully qualified name of your SVN server.

Note that if you have an SSL certificate signed by a legitimate Certificate Authority (Network Solutions, Verisign, Thawte) you shouldn’t need to generate a new SSL key and self-signed certificate. You can simply use the one issued to you by your CA.

First, create a new SSL key with the openssl command:

#(root)/> mkdir /etc/httpd/ssl
#(root)/> cd /etc/httpd/ssl
#(root)/etc/httpd/ssl> openssl genrsa 4096 > svn.kolich.local.key

Now that you have a private key, create a self-signed certificate:

#(root)/etc/httpd/ssl> openssl req -new -key svn.kolich.local.key -x509 \
    -days 1095 -out svn.kolich.local.crt

You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [GB]:US
State or Province Name (full name) [Berkshire]:California
Locality Name (eg, city) [Newbury]:My Town
Organization Name (eg, company) [My Company Ltd]:Mark Kolich
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:svn.kolich.local
Email Address []:

Finally, edit /etc/httpd/conf.d/ssl.conf to point to your newly generated SSL key and SSL certificate. This involves updating the SSLCertificateFile and SSLCertificateKeyFile directives accordingly:

##
## SSL Virtual Host Context
##

<VirtualHost _default_:443>
 ## Required, see http://serverfault.com/a/440452
 SSLEngine On
 ...
 SSLCertificateFile /etc/httpd/ssl/svn.kolich.local.crt
 SSLCertificateKeyFile /etc/httpd/ssl/svn.kolich.local.key
 ...
</VirtualHost>

Note that you should not place your SSL private key and certificate in a location that the web-server serves to clients. Usually placing them under /etc/httpd is sufficient. It would be quite insecure to place them under /var/www/html, for example.

Configure HTTP to HTTPS Redirection

If you’ve bothered to set up HTTPS in the previous step, you probably want Apache to gracefully redirect clients from HTTP to HTTPS. If you don’t automatically redirect, and you have SSLRequireSSL enabled in your subversion.conf file, when clients try to communicate with your SVN server via HTTP they’ll see a 403 Forbidden error. Instead, let’s 301 Moved Permanently redirect them to HTTPS. Open /etc/httpd/conf/httpd.conf in your favorite text editor, jump to the bottom of the file, and edit your VirtualHost configuration. Mine is as follows:

NameVirtualHost *:80

<VirtualHost *:80>

  ServerAdmin example@example.com
  DocumentRoot /var/www/html
  ServerName svn.kolich.local
  ServerAlias svn

  RewriteEngine On
  RewriteCond %{HTTPS} !=on
  RewriteRule ^/(.*)$ https://svn.kolich.local/$1 [R=301,L]

  ErrorLog logs/svn.kolich.local-error_log
  CustomLog logs/svn.kolich.local-access_log common

</VirtualHost>

Save it, and you’re done. Now when a client tries to communicate with my SVN server via HTTP, it’ll see a 301 Moved Permanently redirect to HTTPS. If my SVN client is smart enough, it will gracefully follow this redirect to HTTPS, and all is well. Of course, you’ll need to change the HTTPS URL shown above in the RewriteRule directive to match your server hostname (your SVN server is not svn.kolich.local).
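If the RewriteRule pattern looks cryptic, the Python sketch below applies the same substitution to a request path. It is only an illustration of the regular expression (https_redirect is a made-up name) and it ignores query strings, which mod_rewrite carries over on its own.

```python
import re


def https_redirect(path: str, host: str = "svn.kolich.local") -> str:
    """Mimic: RewriteRule ^/(.*)$ https://<host>/$1"""
    match = re.match(r"^/(.*)$", path)
    if match is None:
        raise ValueError("request path must start with '/'")
    # match.group(1) plays the role of $1 in the RewriteRule.
    return f"https://{host}/{match.group(1)}"


print(https_redirect("/svn/myproject"))  # https://svn.kolich.local/svn/myproject
```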

Start Apache, and Enjoy

That’s it! Start Apache and check out your new repository.

#(root)/> /etc/init.d/httpd start

On another machine, try to check out the repository:

#(mark)~> svn co http://svn.kolich.local/svn/myproject
svn: PROPFIND request failed on '/svn/myproject'
svn: PROPFIND of '/svn/myproject': 301 Moved Permanently (http://svn.kolich.local)

Yep, HTTP to HTTPS redirection is working as expected. Unfortunately my SVN client isn’t smart enough to follow the redirect on its own. Oh well, change that repository URL to HTTPS, and try again:

#(mark)~> svn co https://svn.kolich.local/svn/myproject
Error validating server certificate for 'https://svn.kolich.local:443':
 - The certificate is not issued by a trusted authority. Use the
   fingerprint to validate the certificate manually!
Certificate information:
 - Hostname: svn.kolich.local
 - Valid: from Mar 15 20:17:38 2010 GMT until Mar 14 20:17:38 2013 GMT
 - Issuer: Mark Kolich, California, US
 - Fingerprint: ff:ee:b6:9c:d8:d7:78:3b:ce:9e:09:dd:4a:99:93:11:3e:12:07:85
(R)eject, accept (t)emporarily or accept (p)ermanently? p
Authentication realm: <https://svn.kolich.local:443> Marks SVN Server
Password for 'mark': ...
A    myproject
Checked out revision 0.

It worked! Note that the “Error validating server certificate” warning is because I’m using a self-signed SSL certificate. When SVN asks if you want to accept the certificate, if you permanently accept it you will not be prompted about this again. If you use an SSL certificate issued by a real Certificate Authority like Network Solutions, Verisign, or Thawte, you shouldn’t see this warning.

Time to start hacking — cheers!

Apache Tip: Deny TRACE and TRACK Requests with mod_rewrite

a5cc3e41966c6f947263ab05d3e3866eace62490

Sat Nov 14 10:41:46 2009 -0800

It’s long been rumored that exposing the HTTP TRACE and TRACK methods on your web-server can open the door to a number of miscellaneous vulnerabilities, including cookie thefts and other cross-site tracing attacks. Many resources out there claim you should configure your web-server to flat-out reject TRACE and TRACK requests, and I agree with them. Generally speaking, there’s really no good need (that I’ve found) that would require or make use of TRACE or TRACK. With that said, if you’re running Apache, it’s fairly easy to reject TRACE and TRACK using mod_rewrite:

RewriteCond %{REQUEST_METHOD} ^TRACE [NC,OR]
RewriteCond %{REQUEST_METHOD} ^TRACK [NC]
RewriteRule ^/(.*)$ - [F,L]

You can prove to yourself that this works, by using a tool like curl to issue an HTTP TRACE and TRACK to your newly secured web-server. Use the -X option with curl to specify the HTTP request type:

#/> curl -v -X TRACE mark.koli.ch
* About to connect() to mark.koli.ch port 80 (#0)
*   Trying 24.130.215.240... connected
* Connected to mark.koli.ch (24.130.215.240) port 80 (#0)
> TRACE / HTTP/1.1
> User-Agent: Curl
> Host: mark.koli.ch
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Date: Sat, 14 Nov 2009 18:53:06 GMT
< Server: Apache
< Content-Length: 202
< Connection: close
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access / on this server.</p>
</body></html>
* Closing connection #0

Yep, works nicely. One thing that slightly annoys me, however, is that the HTTP OPTIONS method still reports that my server supports TRACE, even though I clearly don’t anymore. A quick Google search reports that many other folks have had the same concern, with no clear resolution.
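For reference, here is a small Python sketch of what the two RewriteCond patterns above actually match. It only mimics the regular expressions (is_forbidden is a hypothetical name) and is not a stand-in for Apache itself.

```python
import re

# Same patterns as the RewriteCond lines; [NC] becomes re.IGNORECASE.
BLOCKED = [
    re.compile(r"^TRACE", re.IGNORECASE),
    re.compile(r"^TRACK", re.IGNORECASE),
]


def is_forbidden(method: str) -> bool:
    """True if either pattern matches, i.e. the request would get a 403."""
    return any(pattern.search(method) for pattern in BLOCKED)


for method in ("TRACE", "track", "GET", "OPTIONS"):
    print(method, is_forbidden(method))
# TRACE True
# track True
# GET False
# OPTIONS False
```

Note that the patterns are only anchored at the start, so any method name that merely begins with TRACE or TRACK is rejected too. That is harmless here, since no standard method does.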

Apache: Setting the Content-Disposition Header with mod_rewrite

8da99cc51ed0d5c71a549b214bcf44644bb7deba

Thu Sep 24 09:34:00 2009 -0700

The title of this post is slightly misleading, since the Content-Disposition header cannot be set directly using mod_rewrite. As I understand it, there are a set of headers that mod_rewrite understands, and Content-Disposition is not one of them.

To accomplish this, I used an interesting combination of mod_rewrite and mod_headers:

## Internally redirect, then..
RewriteRule ^/build/latest /build/dist/build-v1.0.jar [L]
SetEnvIfNoCase Request_URI build\/latest$ build-latest
## ..set the header.
Header set Content-Disposition "attachment; filename=build-latest.jar" env=build-latest

Using this method, a request for /build/latest comes back with the header Content-Disposition: attachment; filename=build-latest.jar, which I confirmed with HttpFox.

Cheers.

Apache, Don't Log Yourself (Don't Log Specific IP Address and User-Agents)

8aee85aff4ac561ab7ca25c26863cc90fa29c3b4

Sun Jun 07 11:30:00 2009 -0700

In some Apache web-server configurations, it might be useful to avoid logging requests from specific IP addresses or User-Agents. For example, if you regularly check your own site from your home network you probably don’t want to record your own visit in your Apache access logs. For mark.koli.ch, I stopped logging requests from my home network since I was filling up my own log files with redundant junk.

httpd.conf Specific Configuration

Here’s how you can stop logging requests from a specific IP address, or a request for a specific resource (goes in your httpd.conf file):

## Don't log myself (requests from my own network) or requests
## for my robots.txt file.
SetEnvIf Remote_Addr "^192\.168\.1\." dontlog
SetEnvIf Request_URI "^/robots\.txt$" dontlog

## I don't use this, but I have it here as an example.  Why
## I avoid using this is explained below.
##SetEnvIfNoCase User-Agent "(msnbot|googlebot|slurp)" dontlog

## Normal logging directives.  Note the env=!dontlog at the
## end of CustomLog.
ErrorLog logs/mark.koli.ch-error_log
CustomLog logs/mark.koli.ch-access_log combined env=!dontlog

Why Should I NOT Use SetEnvIfNoCase User-Agent to Avoid Logging Requests From Specific User-Agents?

You can use SetEnvIfNoCase User-Agent if you want to stop logging requests from specific User-Agents. However, I would recommend that you avoid using this feature because it is extremely easy to forge/fake the User-Agent header of an HTTP request. If a hacker tries to probe or attack your site disguised as the “GoogleBot”, and your Apache server is configured to not log requests from clients that claim they are the “GoogleBot”, you won’t see the probe attacks in your log files. In short, Apache will think the request is from the GoogleBot, when in fact, it could be a hacker or malicious user masquerading as a web-crawler.
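
To show just how little effort the forgery takes, here is a rough sketch, assuming curl is installed. The function name fetch_as_googlebot is made up for illustration, and the User-Agent string is the one Googlebot publicly advertises:

```shell
# Request a URL while claiming to be Googlebot and print the HTTP status code.
# -A sets the User-Agent header; nothing on the wire verifies that the claim is true.
fetch_as_googlebot() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -A 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' "$1"
}
```

With the commented-out SetEnvIfNoCase User-Agent line enabled, a request made this way against your own server would be answered normally but never show up in the access log.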

Cheers.

Use Apache mod_deflate To Compress Web Content (Accept-Encoding: gzip)

ae709913dc44233d6fbcbda48f58d19a9a24601b

Sat Apr 04 21:45:00 2009 -0700

After setting up my mobile blog at http://mobi.koli.ch, I started looking for ways to improve the content delivery speed of this service to mobile devices. On the server-side, I’m fairly sure I implemented just about every hack imaginable to squeeze every bit of performance I could from the PHP engine. At this point, my only bottleneck was the actual content delivery chain. I can’t control how fast my data is transferred over a wireless (2G/3G?) network, but I can help grease the skids. To do so, I activated Apache’s mod_deflate module, which compresses the content of an HTTP response before it’s delivered to the client.

The mod_deflate module uses gzip to deflate the HTTP response body. Since the response is compressed, there’s less data to transfer, so it takes fewer “cycles” (packets, octets, whatever) to get the data to the client. In short, using compression within the confines of HTTP can often dramatically improve the “speed” of your site or mobile portal on a wireless network. Of course, wireless clients aren’t the only ones that benefit, but wireless devices typically see more of an improvement than an average broadband user.
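
To get a rough feel for the savings before touching Apache, you can compress a saved copy of a page locally. This is only a sketch: the gzip command-line tool stands in for mod_deflate, so the numbers will not match the server’s exactly, and the function name gzip_savings is made up for illustration:

```shell
# Print the original size, the gzip -9 size, and the percentage saved for a file.
# Assumes the file is not empty (an empty file would divide by zero).
gzip_savings() {
  orig=$(( $(wc -c < "$1") ))
  comp=$(( $(gzip -9 -c "$1" | wc -c) ))
  echo "$orig $comp $(( (orig - comp) * 100 / orig ))%"
}
```

Running gzip_savings against a saved HTML page prints three fields: bytes before, bytes after, and the percentage saved. Markup-heavy text usually compresses well, while images that are already compressed barely shrink at all.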

First, I verified that mod_deflate is installed and pre-activated with Apache on my CentOS 5 box:

#/~> cat /etc/httpd/conf/httpd.conf | grep mod_deflate
LoadModule deflate_module modules/mod_deflate.so

Second, I hacked my http://mobi.koli.ch VirtualHost configuration a bit to activate the mod_deflate output filter. Using the instructions on the mod_deflate homepage, I configured my mobi.koli.ch VirtualHost to use the highest compression level possible on every response except for binary image data. I also defined a custom “deflate” log so I can check the compression ratio on any responses:

<VirtualHost *:80>

  DocumentRoot /my/server/root/kolich.mobi/
  ServerName www.kolich.mobi
  ServerAlias kolich.mobi
  ServerAdmin support@example.com

  DeflateCompressionLevel 9
  SetOutputFilter DEFLATE
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
  SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|ico)$ no-gzip dont-vary
  Header append Vary User-Agent env=!dont-vary

  DeflateFilterNote ratio
  LogFormat '"%r" %b (%{ratio}n%%) "%{User-agent}i"' deflate
  CustomLog logs/kolich.mobi-deflate_log deflate

  ErrorLog logs/kolich.mobi-error_log
  CustomLog logs/kolich.mobi-access_log combined

</VirtualHost>

This works quite nicely. Looking at the custom deflate log, my mobi.koli.ch content compression ratio is about 40-50% on average. Not bad for less than 5 minutes of work!
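
Besides the deflate log, you can confirm from the client side that a response really is compressed. This is a minimal sketch, assuming curl is installed; the function name show_encoding is hypothetical:

```shell
# Ask for a gzip-compressed response and print the Content-Encoding header, if any.
# -D - writes the response headers to stdout; -o /dev/null discards the body.
show_encoding() {
  curl -s -D - -o /dev/null -H 'Accept-Encoding: gzip' "$1" \
    | tr -d '\r' | grep -i '^Content-Encoding:'
}
```

A page served through the DEFLATE filter should report Content-Encoding: gzip, while a request for one of the excluded image types should print nothing.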

Some things to consider when using mod_deflate:

  • It wasn’t immediately clear to me if mod_deflate looked for the Accept-Encoding HTTP request header to verify that the client supports gzip compression. At least, there was no mention of this on the mod_deflate homepage — but as it turns out, yes, mod_deflate does look through the HTTP request headers for Accept-Encoding: gzip and will not compress the response if the client does not support it.
  • Of course, not all clients support compression.
  • Compression will use additional CPU cycles on the client and the server. If a large majority of your target audience is using slow mobile devices, then you might want to think twice about HTTP compression. Generally speaking though, today’s mobile clients should have plenty of CPU power to handle simple gzip compression.
  • High-traffic sites will want to think twice about compression. You wouldn’t want the added CPU consumption from compression to impact your overall site experience.

Cheers.

Configure Apache to Return a HTTP 204 (No Content) for AJAX

9ac8bd83b15f2d5fe2da14ef4b5ffecd4b2eb5a7

Sun Mar 22 11:52:32 2009 -0700

When dealing with AJAX, you might need to configure Apache to return an HTTP 204 No Content. This is useful when your AJAX scripts need to “ping” the server, but you don’t want the server to actually return any data in the response (e.g., just acknowledge the request and return an empty response body). The server might do something behind the scenes though (like log the request) before it returns a 204. As I understand it, the only difference between a 200 and a 204 is that a 204 response means that “the server has fulfilled the request but does not need to return an entity-body”.

I initially tried to figure out how to configure Apache to return a 204 No Content using one of the built-in modules, like mod_actions or mod_headers.

As it turns out, Andrew Grangaard alerted me that you can configure RedirectMatch (part of mod_alias) to return an HTTP 204 No Content.

So there are a few options:

Use RedirectMatch

As explained by Andrew Grangaard, you can use Apache’s RedirectMatch directive to trigger on a specific URL pattern and reply with an HTTP 204 No Content response:

RedirectMatch 204 tracker(.*)$

This triggers on any URI containing tracker, since the pattern is not anchored to the start of the path.
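
To see what a RedirectMatch pattern will and won’t catch before deploying it, you can try it against a few paths locally. This is a rough sketch that uses grep -E as a stand-in for Apache’s regex engine (an assumption; the two agree for simple patterns like these). The paths and the ^/tracker alternative are made up for illustration:

```shell
# Succeed if the path (first argument) matches the pattern (second argument).
matches() {
  printf '%s\n' "$1" | grep -Eq "$2"
}

# Compare the pattern from the directive above with an anchored alternative.
for path in /tracker /tracker/foobar /blog/bug-tracker-review; do
  if matches "$path" 'tracker(.*)$'; then unanchored=yes; else unanchored=no; fi
  if matches "$path" '^/tracker'; then anchored=yes; else anchored=no; fi
  echo "$path unanchored=$unanchored anchored=$anchored"
done
```

Only the last path differs between the two patterns: the unanchored one matches it, the anchored one does not. If you only want URIs that begin with /tracker, the anchored form is the safer choice.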

Use a Perl Script (Hack)

You could also use Apache’s SetHandler and Action directives to intercept the request and then call a Perl/CGI script to return a 204 No Content. For example, in your VirtualHost configuration, use the Location directive to define the URI you want Apache to trigger on:

<Location /tracker>
  SetHandler nocontent-handler
  Action nocontent-handler /gen_204.cgi virtual
</Location>

So this tells Apache to call gen_204.cgi each time a request comes in for a URI starting with /tracker. Note the virtual modifier at the end of the Action directive. Defining the action as virtual is important because it tells Apache not to check if the requested file/resource actually exists. For example, /tracker/foobar doesn’t actually exist (it’s not a real resource on the server), so without the virtual modifier Apache would return a 404 Not Found instead of calling the script.

Here’s a hacky gen_204.cgi Perl/CGI script I once used:

#!/usr/bin/perl -w

use strict;
use warnings;
use CGI;

my $cgi = CGI->new();
print $cgi->header('text/html','204 No Content');

exit;

This seems to work nicely, but is quite kludgy.

Better to use RedirectMatch instead.
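
Whichever approach you pick, it’s worth confirming that the server really answers with a 204 and an empty body. This is a minimal sketch, assuming curl is installed; the function name ping_status is hypothetical:

```shell
# Print the status code and the number of body bytes returned for a URL.
# -o /dev/null discards the body; -w prints the values curl recorded for the transfer.
ping_status() {
  curl -s -o /dev/null -w '%{http_code} %{size_download}\n' "$1"
}
```

Against a working setup, a request for a URI under /tracker should print 204 followed by 0.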

Generate Your Own Self-Signed SSL Certificates for Apache HTTPS

c2aa9feeaf2f9abd60560031b2f9ea05eb782546

Sat Mar 21 23:01:56 2009 -0700

If you’d like to generate your own self-signed SSL certificates for use with Apache, the openssl command makes it easy.

At home, I run a few HTTPS dev Apache instances that use my own self-signed SSL certificates. Granted, these certificates are not signed by a legitimate Certificate Authority (like Verisign, Thawte, or Network Solutions), but they get the job done if you want quick and cheap SSL security. Keep in mind that if you use a self-signed certificate, a web-browser will complain. You shouldn’t use these instructions to set up SSL in a real production environment; for development stuff at home, however, this is perfect.

Generate your own self-signed SSL certificates using the openssl command:

openssl genrsa 4096 > example.com.key
openssl req -new -key example.com.key -x509 -days 365 -out example.com.crt

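The first command will generate a new RSA private key with a specified size of 4096 bits.

The second command will prompt you for the certificate’s subject fields (country, organization, common name, and so on) and produce a self-signed certificate, valid for 365 days, worthy of inclusion into Apache.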
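
Before wiring the files into Apache, it can help to sanity-check them. This is a rough sketch using standard openssl subcommands; the function names key_matches_cert and show_cert are made up for illustration:

```shell
# Succeed if the certificate was generated from the given RSA private key,
# by comparing the public modulus embedded in each file.
key_matches_cert() {
  key_mod=$(openssl rsa -noout -modulus -in "$1")
  crt_mod=$(openssl x509 -noout -modulus -in "$2")
  [ -n "$key_mod" ] && [ "$key_mod" = "$crt_mod" ]
}

# Print the subject and validity window of a certificate.
show_cert() {
  openssl x509 -noout -subject -dates -in "$1"
}
```

For example, key_matches_cert example.com.key example.com.crt && show_cert example.com.crt prints the subject and dates only when the pair belongs together. A mismatched key and certificate is a common reason for Apache refusing to start after an SSL change.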

Now that you’ve got a key (a .key file) and certificate (a .crt file), you can integrate them into Apache. This involves using the SSLCertificateFile and SSLCertificateKeyFile directives in your Apache configuration file that defines an HTTPS VirtualHost. You need to configure these directives to point to your certificate and key files, respectively. In my environment, this configuration goes into /etc/httpd/conf.d/ssl.conf:

##
## SSL Virtual Host Context
##

<VirtualHost _default_:443>
 ...
 SSLCertificateFile /path/to/crt/file/example.com.crt
 SSLCertificateKeyFile /path/to/key/file/example.com.key
 ...
</VirtualHost>

Remember, your private key (your .key file) is important. You should keep it in a secure/private place on your server, and certainly not in a publicly readable directory.
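
One simple way to lock the key down is with file permissions. This is a minimal sketch, assuming Apache is started as root (as is typical) so it can still read a key that only its owner may access; the function name lock_down_key is hypothetical:

```shell
# Restrict a private key so only its owner can read or write it,
# then list the file so the resulting mode can be eyeballed.
lock_down_key() {
  chmod 600 "$1" && ls -l "$1"
}
```

After running lock_down_key on the .key file, the listing should start with -rw------- to show that group and other users have no access.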

Hide Apache Server Version for Security using ServerTokens and ServerSignature

e027f0e442649646fce91f2179a46bec0d616981

Tue Oct 28 22:11:14 2008 -0700

On the web, malicious hackers typically try to exploit bugs or holes in un-patched versions of public web-servers. The Apache web-server is an obvious target, given that as of June 2008 Apache served 49.12% of all websites on the Internet. In fact, the Apache web-server is powering this blog and my network of other domains.

When a client (most often a browser) makes an HTTP request to a web-server, the server responds with an HTTP response. The response contains a status line with a status code (e.g., HTTP/1.1 200 OK) and a set of response headers. Surprisingly, the Apache web-server embeds version information about itself in these HTTP response headers. If you are concerned about exposing the version of Apache you are running to the world, you may want to disable this. Hackers often look for specific versions of Apache with known bugs to pick on, then target the site with various attack methods. Blocking this Apache version information in the HTTP response headers can make it more difficult for hackers to identify the version of Apache you are running and compromise your system(s).

The trick is to adjust or add a few Apache directives (a.k.a. options) to your httpd.conf file. On a standard Fedora/Red Hat/CentOS install, the httpd.conf file can be found at /etc/httpd/conf/httpd.conf. Set ServerSignature Off and ServerTokens Prod in your httpd.conf file:

ServerSignature Off
ServerTokens Prod

From the Apache documentation:

“The ServerSignature directive allows the configuration of a trailing footer line under server-generated documents (error messages, mod_proxy ftp directory listings, mod_info output, …). The ServerTokens directive controls whether the Server response header field which is sent back to clients includes a description of the generic OS-type of the server as well as information about compiled-in modules.”

I have yet to encounter a need to actually enable a Server Signature or provide information about the Apache version in the HTTP response headers.
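
After restarting Apache, you can confirm the change from any client. This is a minimal sketch, assuming curl is installed; the function name show_server_header is made up for illustration:

```shell
# Print only the Server response header for a URL.
# -s silences progress output; -I sends a HEAD request so no body is downloaded.
show_server_header() {
  curl -sI "$1" | tr -d '\r' | grep -i '^Server:'
}
```

With ServerTokens Prod in effect, the output should be just Server: Apache, with no version number, OS name, or module list after it.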

Oh, if you want to read up on why most admins hate the Apache web-server, take a look at Why I Hate The Apache Web Server.

Cheers.