11.17.06
SEO Tip – Using Error Pages and the Robots.txt file
A customer called me today and asked this question: “Why is the error page consistently the #1 ranked page in my web site?” This is a great question. Before I answer it, I should point out some important facts about this customer’s web site.
- He designed his site, we didn’t. It’s a pretty basic, small site.
- He does not use a high-end analytics tool to monitor traffic.
First you need decent traffic reporting
Every web site needs an analytics program (traffic reporting software) to track how many visitors it receives. Most hosting companies, including mine (Delaware.Net), provide this service for free. If you don’t mind giving your web site’s traffic statistics to Google, you can use their Google Analytics service. This free service gives you simple traffic reporting, such as page views and referring sites. For most small web site owners, like the one who asked me the question above, this is a good choice. Google Analytics is very limited, though: if you plan to track many campaigns and you want to see the IP addresses of your web site visitors, you will need something with more power. Google Analytics is based on Urchin, a great product that Google purchased in March of 2005 for around $30 million. But what you get from Google for free is just part of Urchin.
Your top visitor to your web site isn’t a person at all: it’s a search engine spider
So, your site has an analytics program. When you start looking at the top pages that are viewed, you might see what this customer sees: your error page comes up as your number one page. This shouldn’t happen all the time, but if it does, you should look for patterns in WHO is visiting your site. This is where Google Analytics comes up short. In more detailed traffic reports, you can see the browser type that is accessing your site (also known as the User Agent). You will probably see spikes in visits from search engine bots, like MSNBot and Googlebot. These are the programs that come to your web site to crawl it and rank it. You will also see MANY other user agents visiting your web site. I have noticed that MSN’s bot never stops looking for pages that don’t exist. It is a real problem with MSN. There are MANY nasty user agents out there as well, and many of them are used to scour web sites for email addresses for spammers. There is no sure-fire way to block these bots from hitting your site, but adding a robots.txt file to your site is a good start.
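To look for those patterns yourself, you can count which user agents are triggering the 404 errors. The sketch below is only an illustration: it assumes your host writes logs in the common Apache "combined" format (yours may differ), and the sample lines, addresses, and the "ExampleEmailHarvester" agent are made up.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log format. Adjust the pattern if your
# host uses a different layout.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_404s_by_agent(lines):
    """Return a Counter mapping user agent -> number of 404 responses."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [17/Nov/2006:10:00:00 -0500] "GET /old-page.html HTTP/1.1" '
    '404 512 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.1 - - [17/Nov/2006:10:05:00 -0500] "GET /gone.html HTTP/1.1" '
    '404 512 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.7 - - [17/Nov/2006:10:06:00 -0500] "GET /contact.htm HTTP/1.0" '
    '404 512 "-" "ExampleEmailHarvester/0.1"',
    '192.0.2.9 - - [17/Nov/2006:10:07:00 -0500] "GET /index.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
]

if __name__ == "__main__":
    # List the agents causing the most 404s, busiest first.
    for agent, hits in count_404s_by_agent(SAMPLE_LOG).most_common():
        print(hits, agent)
```

To run it against a real log, pass an open file instead of SAMPLE_LOG, since the function only needs something it can loop over line by line.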
Look for and block bad bots
A robots.txt file is a file in the root of your site that good robots read before they crawl it. You can read about them here.
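As a rough illustration of what such a file says, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser module and asks which requests a rule-following robot would make. The rules, the /private/ folder, and the "ExampleEmailHarvester" name are made up for the example. Keep in mind that the file is only a request: a bad bot can simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: shut one named bot out entirely, and keep
# every other robot out of a /private/ folder.
ROBOTS_TXT = """\
User-agent: ExampleEmailHarvester
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent, path):
    """Return True if a robot that obeys robots.txt may request this path."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    checks = [
        ("ExampleEmailHarvester", "/index.html"),
        ("msnbot", "/index.html"),
        ("msnbot", "/private/report.html"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(agent, path))
```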
It is also possible to block bad bots from scouring your site by creating a script that looks at the user agent and blocks the request if the agent is not legitimate. Since it is possible to impersonate a user agent, this is not foolproof. But it does work well. I’ve done this, and after monitoring the bad bots that hit one of my sites, I ended up blocking about 35 or so of them. There are a lot more than 35, but this reduced the number of bad-bot hits in my traffic reports. Even though MSN’s bot is broken, I wouldn’t block it.
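Here is a minimal sketch of that kind of script, written as a Python WSGI wrapper. This is not the script I use on my own sites, just one way it could look. The two blocked names are hypothetical placeholders; you would build your own list from your traffic reports.

```python
# Hypothetical substrings to match, compared in lower case. Replace these
# with the bad agents you actually see in your own reports.
BLOCKED_AGENT_SUBSTRINGS = ("exampleemailharvester", "examplescraper")


def is_blocked(user_agent):
    """Return True if the user agent contains a blocked substring."""
    agent = (user_agent or "").lower()
    return any(bad in agent for bad in BLOCKED_AGENT_SUBSTRINGS)


def block_bad_bots(app):
    """Wrap a WSGI app so blocked agents get 403 Forbidden instead of a page."""
    def wrapper(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        # Everyone else, including legitimate search engines, passes through.
        return app(environ, start_response)
    return wrapper
```

Because the check only reads the User-Agent header, a bot that lies about who it is will still get through, which is why this is a filter and not a guarantee.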
Make your own custom error messages
This is a hot tip that should get you more sales from your web site. Look at the requested pages that are causing the 404 errors. If you deleted an old page, and MSN and other legitimate engines keep trying to request it, you can either recreate that old page or make a custom 404 page that points people to your home page. If a visitor types the address of a page within your web site incorrectly, should they see a bare 404 error message? Or should they be redirected to your home page? It IS possible to design custom error pages for your web site, and they don’t have to be redirects (which can hurt you with Google). You could simply make a static 404 page that has links to the main sections of your web site. Remember that if ALL of your bad pages are redirected automatically, bots won’t know a good page from a bad one. So the page can still say “404 Error - Page Not Found”, but you should put links and your logo on it so that people aren’t left at a dead end.
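To make the idea concrete, here is a small sketch of a site that answers unknown addresses with a real 404 status while still showing a helpful page. It is written as a Python WSGI app purely for illustration. The page addresses, the logo file name, and the section links are all made up, and most hosts let you get the same result through a server setting instead of code.

```python
# Hypothetical pages that exist on the site.
KNOWN_PAGES = {
    "/": "<h1>Home</h1>",
    "/products.html": "<h1>Products</h1>",
}

# The custom error page: it still says "Page Not Found", but it carries the
# logo and links so a visitor has somewhere to go next.
NOT_FOUND_PAGE = """<html>
<head><title>404 Error - Page Not Found</title></head>
<body>
<img src="/logo.gif" alt="Company logo">
<h1>404 Error - Page Not Found</h1>
<p>Sorry, that page does not exist. Try one of these instead:</p>
<ul>
<li><a href="/">Home</a></li>
<li><a href="/products.html">Products</a></li>
</ul>
</body>
</html>"""

HTML_HEADERS = [("Content-Type", "text/html; charset=utf-8")]


def site_app(environ, start_response):
    """Serve known pages with 200; serve the custom page with a true 404."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PAGES:
        start_response("200 OK", HTML_HEADERS)
        return [KNOWN_PAGES[path].encode("utf-8")]
    # No redirect here: the 404 status tells bots the page is gone,
    # while the body gives people a way back into the site.
    start_response("404 Not Found", HTML_HEADERS)
    return [NOT_FOUND_PAGE.encode("utf-8")]
```

The point of the design is that the status line and the page body do two different jobs: the status is for the bots, and the links and logo are for your visitors.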
Don’t assume something is wrong with your site or your web site host because you get high page views on your error pages. Instead, you can actually use this information to get more sales.
If you have questions or comments, post them and I will give you more info.


