The removal process went fine (thank you Movable Type), but after analyzing my server logs a bit, I noticed that some clever folks were using Google to look up a cached copy of the post in question. Unfortunately, there was nothing I could do about cached copies of the post floating out there on the web. The "damage" (if you would call it that) was already done.
- You can add a "Noarchive: /" record to your robots.txt file. Ideally, bots that honor robots.txt (most well-established bots do) will see this and won't archive or cache your content. Keep in mind that Noarchive: is not part of the original robots.txt standard, so support varies from bot to bot. Noarchive: also lets you be as specific or generic as you want: if you only want to block a specific page or section of your site, you can say "Noarchive: /block/this/relative/from/root/".
- If you don't like the Noarchive: robots.txt option, you can add a "noarchive" <meta> tag to each page you'd like to exclude from caching. This is the approach I took for this blog (kolich.com):
<meta name="googlebot" content="noarchive" />
<meta name="robots" content="noarchive" />
In Movable Type, I added these <meta> tags to my HTML Head template. Any new post that I publish (like this one) will automatically contain these tags. This works nicely. With these meta tags in place, Google will no longer keep cached copies of my content.
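If you want to confirm that a published page really carries the tags, a quick check along these lines may help. This is only a minimal sketch using Python's standard html.parser module; the names RobotsMetaParser and has_noarchive are my own for illustration, and it assumes you already have the page's HTML as a string (fetching it is left out).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> style tags.

    Hypothetical helper for illustration; not part of Movable Type
    or any search engine tooling.
    """

    def __init__(self, bot_names=("robots", "googlebot")):
        super().__init__()
        self.bot_names = {name.lower() for name in bot_names}
        self.directives = {}  # meta name -> set of directives found

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <meta ... /> are also routed here.
        if tag != "meta":
            return
        attr_map = {key.lower(): (value or "") for key, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        if name not in self.bot_names:
            return
        content = attr_map.get("content", "")
        found = {part.strip().lower() for part in content.split(",") if part.strip()}
        self.directives.setdefault(name, set()).update(found)


def has_noarchive(html_text):
    """Return True if any robots/googlebot meta tag lists 'noarchive'."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return any("noarchive" in found for found in parser.directives.values())


if __name__ == "__main__":
    sample = (
        "<html><head>"
        '<meta name="googlebot" content="noarchive" />'
        '<meta name="robots" content="noarchive" />'
        "</head><body>Hello</body></html>"
    )
    print(has_noarchive(sample))
```

The check only tells you the tag is present in the HTML you feed it. Whether a given bot actually honors it is up to that bot.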
On another note, I question the usefulness of search engines, like Google, keeping cached copies of websites and other content found on the web. In general, I suspect most bloggers and webmasters don't want cached copies of their content floating around the net. With cached copies, you lose control over who views your content and in what context. In my opinion, a better approach is to let bloggers and webmasters "opt-in" to site caching and archiving. Instead of assuming everyone wants their content archived on Google, it would be nice if Google assumed the opposite: that we don't want our stuff cached. Anyone who does could then opt-in individually by adding a <meta content="archive"> tag, or a similar directive, to their pages or robots.txt file.
Oh well, you live, you learn.

