First, let's count how many unique referrers generated traffic into kolich.com:
(root@skull)/var/log/httpd> cat access_log* | \
awk '{print $11}' | \
grep -v "\"-\"" | \
grep -v "mark\.kolich\.com" | \
grep http | sort -u | wc -l
837
So, the logs contain 837 unique referrers. This command chain is relatively simple. First, I'm using awk to capture only the referrer field of the Apache access_log files (field 11 in the combined log format). Second, I'm using grep -v to strip out any empty referrer strings, and any requests originating from my own domain. For example, a click from mark.kolich.com to mark.kolich.com/page2.html does not count. Third, I'm using grep http to keep only referrers that contain a URL, and sort -u to show me the set of unique strings. And finally, I'm using wc -l to count the number of lines in the result.
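The same filtering can also be done in a single awk pass. The sketch below is one possible way to write it, not the command used above: the function name unique_referrers is made up for illustration, it assumes the combined log format (referrer in field 11), and it only keeps referrers that start with http:// or https://, which is slightly stricter than grep http.

```shell
# Hypothetical helper (not part of the original commands): print the unique
# external referrers found in combined-format log lines read from stdin.
# $1 is the site's own host name; any referrer containing it is skipped.
unique_referrers() {
  awk -v self="$1" '
    {
      ref = $11
      gsub(/"/, "", ref)                   # strip the surrounding quotes
      if (ref !~ /^https?:\/\//) next       # drop "-" and anything that is not a URL
      if (index(ref, self) > 0) next        # drop clicks from my own domain
      if (!(ref in seen)) {                 # print each referrer only once
        seen[ref] = 1
        print ref
      }
    }'
}

# Example usage (the log path is the one from the transcript above):
#   cat /var/log/httpd/access_log* | unique_referrers mark.kolich.com | wc -l
```

Unlike sort -u, this prints referrers in the order they first appear, which makes no difference once the lines are counted with wc -l.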
Now that I have an aggregate total, let's find out how many referrers were NOT from Google:
(root@skull)/var/log/httpd> cat access_log* | \
awk '{print $11}' | \
grep -v "\"-\"" | \
grep -v "mark\.kolich\.com" | \
grep http | sort -u | \
grep -v google | wc -l
70
So, only 70 of the unique referrers were non-Google sources. This is basically the same command as before, but I added a grep -v google near the end of the chain to ignore any referrers containing the word "google".
Now, if we do the math on the numbers we have so far, it's clear that Google accounts for 91.6% of all unique referrers sending traffic into kolich.com:
(root@skull)/var/log/httpd> bc -lq
837-70
767
(767/837)*100
91.63679808841099163600
quit
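If you'd rather not type into an interactive bc session, the same arithmetic fits in a one-liner. This is just a sketch: pct is a made-up helper name, and it uses awk instead of bc so the result can be rounded to one decimal place with printf.

```shell
# Hypothetical helper: print PART as a percentage of TOTAL, one decimal place.
pct() {
  awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", (part / total) * 100 }'
}

# Google's share: 837 unique referrers minus the 70 non-Google ones.
pct "$((837 - 70))" 837   # prints 91.6
```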
Just for grins, let's see how much traffic originates from Yahoo!:
(root@skull)/var/log/httpd> cat access_log* | \
awk '{print $11}' | \
grep -v "\"-\"" | \
grep -v "mark\.kolich\.com" | \
grep http | sort -u | \
grep -i yahoo | wc -l
7
Wow, a huge 7 unique referrers! Let's do the math on that:
(root@skull)/var/log/httpd> bc -lq
(7/837)*100
.83632019115890083600
quit
So, only 0.8% of all traffic into kolich.com, based on unique referrers, originates from Yahoo!. Google is over 91%. The remaining 63 referrers (the 70 non-Google ones minus the 7 from Yahoo!), or about 7.5%, were other non-Google and non-Yahoo sources.
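To see who those other sources are, the unique referrers can be grouped by host name. The function below is only a sketch of one way to do that: referrers_by_host is a made-up name, and it assumes its input is one referrer URL per line with the quotes already stripped.

```shell
# Hypothetical helper: read referrer URLs (one per line) from stdin and
# print how many of them belong to each host, most common host first.
referrers_by_host() {
  # Splitting on "/" puts the host of http://host/path into field 3.
  awk -F/ '/^https?:\/\// { print tolower($3) }' | sort | uniq -c | sort -rn
}

# Example usage with three sample URLs (not real log data):
printf '%s\n' \
  'http://www.google.com/search?q=a' \
  'http://www.google.com/search?q=b' \
  'http://search.yahoo.com/search?p=c' | referrers_by_host
```

Each output line is a count followed by a host, so the top of the list shows at a glance which sites send the most distinct referrers.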
Now you see how dominant Google really is.

