
At work, I sit behind a cluster of BlueCoat 800-2 Web Cache proxy appliances. For security and efficiency, all of our outbound web traffic is sent through this cluster. This is fine, except that the BlueCoat appliances in this cluster do NOT obey the configuration guidelines in my robots.txt, and they unnecessarily re-request many of the resources fetched through them. In other words, if you load a web page through this proxy cluster, the cluster will keep "pinging" the site and its resources over and over again at a fixed interval. I assume this is the cluster checking the site to see if anything has changed. From a system administrator's standpoint, this is slightly irritating because the extra "pings" add to the load on the server and unnecessarily consume bandwidth (the cluster appears to repeatedly request the same resources with a GET).
So, I took matters into my own hands and tweaked my root .htaccess file to block these pings with a 403 Forbidden. Strangely enough, BlueCoat appliances appear to use an HTTP user-agent of "Mozilla/4.0 (compatible;)". This is definitely not a legit user-agent string; the user-agent should identify the client software that's making the request. Lucky for me, this simplifies my .htaccess configuration a bit:
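# Flag requests whose User-Agent is exactly "Mozilla/4.0 (compatible;)"
SetEnvIf User-Agent "^Mozilla\/4\.0 \(compatible\;\)$" block=1
# Allow everyone by default, then deny any request that was flagged above
Order allow,deny
Allow from all
Deny from env=block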
And that takes care of that.
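One caveat: Order, Allow, and Deny are the Apache 2.2-style access directives. On Apache 2.4 they are only available through mod_access_compat and are deprecated in favor of Require. A rough equivalent for 2.4 might look like the sketch below. It assumes mod_setenvif and mod_authz_core are loaded and that AuthConfig overrides are permitted for .htaccess files; treat it as a starting point, not the configuration described above.

```apache
# Flag requests whose User-Agent is exactly "Mozilla/4.0 (compatible;)"
SetEnvIf User-Agent "^Mozilla/4\.0 \(compatible;\)$" block=1

# Apache 2.4 style: grant access to everyone except flagged requests.
# A negated Require cannot authorize a request on its own, so it is
# paired with "Require all granted" inside a RequireAll block.
<RequireAll>
    Require all granted
    Require not env block
</RequireAll>
```

Either way, the ^ and $ anchors matter: without them, the pattern would also match real browsers whose user-agent merely starts with the same text, such as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)".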