Website: http://www.connectedinternet.co.uk/
Original page: http://www.connectedinternet.co.uk/2007/03/25/creating-a-wordpress-robotstxt-to-improve-seo/
Why would you want to exclude the monthly archives, for instance?
Plus, there are some unclear attributes that might end up messing up your indexing.
I would rather stick to a simpler and clearer robots.txt. You can see the one I am using at http://www.dailyblogtips.com/robots.txt
@Daniel: I don't see why you'd want to block your /feed/ page from Google, though. Google is doing more and more with blog posts, and RSS feed parsing is one of those things.
Rather than saying the list is 'too big', it'd be more useful if you said what should or shouldn't be included.
The example at http://www.dailyblogtips.com/robots.txt is good, you don't need more than that.
For a blog directory, disallowing the search engine bots from visiting the three main folders (wp-admin, wp-content, wp-includes) and the wp- files in the main directory will be sufficient, as all the .php, .css, .js, etc. files are in these directories.
So a robots.txt file like Daniel's or mine will be good.
Anyway, waiting for more comments or asking an expert is the best way to minimize your robots file!
Putting <meta name="robots" content="noindex,follow"> at the top of your archive and tag templates will allow the links to be followed but should prevent the actual archive page from being indexed.
In your robots.txt you should probably only disallow anything that you want a well-behaved robot to completely ignore. If files or directories aren't specifically linked to (and your wp-includes, wp-admin, and wp-content directories are among these) then you can leave them out altogether.
A well-behaved spider will only follow links. Any spider, well-behaved or not, won't go anywhere that isn't specifically linked to. If there's no link to a directory, it doesn't know that directory exists.
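One way to check that the noindex,follow tag suggested above really ends up in a rendered archive page is to parse the HTML and read the robots meta tag back. This is only a rough sketch using Python's standard html.parser module. The names RobotsMetaFinder and robots_directives, and the sample markup, are made up for illustration and do not come from any WordPress theme.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def robots_directives(html):
    """Return the robots meta directives found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Hypothetical archive page markup, for illustration only
sample_archive_page = """
<html><head>
<title>Archive for March 2007</title>
<meta name="robots" content="noindex,follow">
</head><body>...</body></html>
"""

directives = robots_directives(sample_archive_page)
print("noindex" in directives, "follow" in directives)
```

In practice you would feed it the HTML of one of your own archive or tag pages rather than the sample string.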
thx u thx u
What does this have to do with the robots file?!
And why should it be a waste of time to crawl the feeds?
Thanks
WordPress robots.txt optimized for SEO
Because you don't want the same posts appearing twice and Google thinking it's duplicate content, or junk appearing in the results, as Google will then downgrade your real pages.
I'm guessing it's best to start with the defaults shown in the linked file on Daily Blog Tips and go from there adding misc pages I have such as pages full of bookmarks etc.
Cheers!
I found out I had my robots.txt file in the wrong place anyway!
Paula
http://www.searchenginepromotionhelp.com/m/robo...
Sue
I use all of it except:
Disallow: /page/
Disallow: /date/
Disallow: /comments/
I installed the KB robots.txt plugin. This is what I entered in the robots.txt plugin window:
User-agent: *
Disallow:
But when I go to www.babygeartoday.com/robots.txt this is what I get, and I also get it in the plugin when it says to check the robots.txt file after I submit:
# BEGIN XML-SITEMAP-PLUGIN
Sitemap: http://www.babygeartoday.com/sitemap.xml.gz
# END XML-SITEMAP-PLUGIN
So I uninstalled the Google sitemap and analytics plugins, and I still get the same thing. I was wondering, could you help me solve this problem?
Thanks,
Stacey
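For what it's worth, the two versions quoted above can be compared with Python's standard urllib.robotparser. The sketch below does not explain why the plugin's rules are not being served. It only illustrates that, at least under this parser, an empty Disallow and a file containing nothing but a Sitemap line both leave every URL crawlable. The helper parse_robots and the test URL are assumptions made up for the example, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(text):
    """Parse robots.txt content given as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# What was typed into the plugin window
entered = "User-agent: *\nDisallow:\n"

# What the site actually serves
served = """# BEGIN XML-SITEMAP-PLUGIN
Sitemap: http://www.babygeartoday.com/sitemap.xml.gz
# END XML-SITEMAP-PLUGIN
"""

# Hypothetical post URL, used only to have something to test against
test_url = "http://www.babygeartoday.com/some-post/"

entered_allows = parse_robots(entered).can_fetch("Googlebot", test_url)
served_allows = parse_robots(served).can_fetch("Googlebot", test_url)
served_sitemaps = parse_robots(served).site_maps()

print(entered_allows, served_allows)
print(served_sitemaps)
```

If both checks come back the same, the missing lines change nothing about what a crawler may fetch, although it is still worth finding out which plugin or file is generating the response.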
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /cgi-bin/
User-agent: Googlebot-Image
Disallow:
Allow: /*
http://codex.wordpress.org/Search_Engine_Optimi...
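Before uploading a rule set like the one above, it can help to run a few sample paths through a parser and see which ones are blocked. The following is a minimal sketch with Python's standard urllib.robotparser. The host example.com, the helper is_allowed, and the sample paths are placeholders chosen for illustration. Note that Python's parser applies the first matching rule and does not interpret wildcards, so real crawlers may resolve some edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, copied as-is
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /cgi-bin/
User-agent: Googlebot-Image
Disallow:
Allow: /*
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent, path):
    """Check a path on a placeholder host against the parsed rules."""
    return parser.can_fetch(agent, "http://example.com" + path)


# Hypothetical paths on a typical WordPress install
checks = [
    ("Googlebot", "/2007/03/25/some-post/"),
    ("Googlebot", "/wp-content/uploads/photo.jpg"),
    ("Googlebot", "/wp-content/themes/style.css"),
    ("Googlebot", "/wp-login.php"),
    ("Googlebot", "/feed/"),
    # Rules are prefix matches, so a per-post feed is not covered by /feed/
    ("Googlebot", "/2007/03/25/some-post/feed/"),
    ("Googlebot-Image", "/wp-content/themes/header.png"),
]

for agent, path in checks:
    verdict = "allowed" if is_allowed(agent, path) else "blocked"
    print(f"{agent:16} {path:32} {verdict}")
```

The per-post feed line is the one to look at: if the goal is to keep all feeds out, `Disallow: /feed/` on its own only covers the site-wide feed.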