
Last week, I learned that WordPress doesn’t ship with a default robots.txt.
robots.txt is the file that search engine crawlers parse to see which resources and URL patterns they are allowed, and not allowed, to crawl; it’s step 1 in every search engine optimization (SEO) guide.
I guess I just stupidly assumed that it was included in WP. Anyway, I thought it only fair to tell everyone: if you are using WordPress and you care how your site shows up in search results, you should generate a robots.txt and a sitemap.xml.
Robots.txt?
Know that it’s important for search engines. Read this:
** NOTE: Not all web crawlers are guaranteed to read example.com/robots.txt; it serves as a guideline.
I Feel Dumb…
I feel like an idiot, and I should. The other day I just happened to search for “engfers” on Google, and the result that came back was my site with an indented sub-result that was some error from a file in the WP-Super-Cache plugin. I thought to myself, why is the plugins/ directory being crawled?
Needless to say, I shortly thereafter found Google’s Webmaster Tools to help rectify my situation. It’s a pretty nice web-app that allows you to remove content from Google’s search (which I then used).
I also noticed that the webmaster tools had sections for analyzing your robots.txt and sitemap.xml. Well, I was surprised to find out that this site didn’t have a robots.txt.
Most of you are probably thinking that I’m an idiot because that’s SEO 101. Well yes, it is; however, I didn’t realize that WordPress doesn’t ship with a default robots.txt! Don’t ask me why I didn’t see that before, because I don’t know. Nevertheless, I think WP should ship with a robots.txt that AT LEAST keeps plugins/ and wp-includes/ from being crawled.
Our Shiny, New robots.txt
There seem to be a billion and one SEO blogs out there; however, I was looking specifically for resources on a robots.txt optimized for WordPress.
I found a couple of articles and examples at askapache.com and an example from the WordPress.org Codex.
The final version of our robots.txt (http://www.engfers.com/robots.txt) was pulled from the WordPress Codex page.
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

# Internet Archiver Wayback Machine
User-agent: ia_archiver
Disallow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Sitemap
Sitemap: http://www.engfers.com/sitemap.xml
**NOTE: This file must be at the ROOT of your web server!
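If you want to sanity-check rules like the ones above before uploading them, here is a minimal sketch using Python's standard-library urllib.robotparser. It is only an illustration, not how any particular search engine evaluates the file: the rules are a trimmed-down subset, example.com is a placeholder host, and the plugin and post paths are made up. The wildcard lines are left out on purpose, because this parser only does simple prefix matching on paths.

```python
from urllib.robotparser import RobotFileParser

# Trimmed-down subset of the rules above. Wildcard patterns such as
# "/*?" are omitted: the standard-library parser matches by path prefix only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Allow: /wp-content/uploads

User-agent: ia_archiver
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` (example.com is a placeholder)."""
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    checks = [
        ("Googlebot", "/wp-content/plugins/wp-super-cache/example.php"),  # hypothetical path
        ("Googlebot", "/wp-content/uploads/photo.png"),                   # hypothetical path
        ("Googlebot", "/2009/hello-world/"),                              # hypothetical path
        ("ia_archiver", "/2009/hello-world/"),
    ]
    for agent, path in checks:
        verdict = "allowed" if is_allowed(parser, agent, path) else "blocked"
        print(f"{agent:12} {path:50} {verdict}")
```

With these rules, the plugin file should come back as blocked while uploads and regular posts stay crawlable for everyone except ia_archiver. Remember that this only tells you what a well-behaved crawler would do; it does not stop a crawler that ignores robots.txt.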
Final Note: sitemap.xml
The big-daddy search engines like Google, Yahoo, Microsoft, etc. use your site’s sitemap.xml (example.com/sitemap.xml) to make it easier to crawl your website. It’s also a very important point of SEO; just do a bit of searching on it.
The final line in our robots.txt points to the sitemap:
Sitemap: http://www.engfers.com/sitemap.xml
For WordPress, use a plugin like the Google Sitemap Generator to have it automatically generate the sitemap for you.
+1: it will also automatically regenerate the sitemap.xml whenever you publish or edit an article or page. =)
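In case you are curious what such a plugin actually produces, here is a rough Python sketch that builds a bare-bones sitemap.xml by hand. It is not the plugin's code, just an illustration of the sitemaps.org format; the function name build_sitemap, the example.com URLs, and the dates are all made up.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a minimal sitemap from (url, lastmod) pairs.

    `lastmod` is a date string such as "2009-06-01". Returns the XML as text.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder pages; a real plugin would read these from the WordPress database.
    pages = [
        ("http://example.com/", "2009-06-01"),
        ("http://example.com/2009/hello-world/", "2009-05-20"),
    ]
    print(build_sitemap(pages))
```

A real sitemap file would normally start with an XML declaration and may carry optional fields such as changefreq and priority; the plugin handles all of that, plus the regeneration on publish.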

Great article engfer! Most people and bloggers have never heard about robots.txt files, and that isn’t good for anyone.
The newer WordPress versions show a default robots.txt “file” by using internal rewrites, which IMHO is not nearly as good as using an actual file. Keep it up..
Great post, thank you. I really had no idea about how I should be using robots.txt files with WordPress, just assumed they had it done the best way for me.
Hi. Your site displays incorrectly in Firefox, but content excellent! Thanks for your wise words
Finally someone who can write a good blog! This is the kind of information that is useful to those who want to increase their SERPs. I loved your post and will be telling others about it. Subscribing to your RSS feed now. Thanks
I have been using WordPress for 2 years but I still don’t know how to do SEO using WordPress. Is there an SEO plugin for WordPress?
Nice looking blog, might I ask you what template you are running and how much it costs? I’ve been using cheap ones but can’t find one that I actually like.
3column2k — it’s free. but it’s old and probably needs to be updated for WP 2.8