-
Website
http://www.connectedinternet.co.uk/ -
Original page
http://www.connectedinternet.co.uk/2007/04/12/adding-a-robotstxt-file-has-increased-my-google-traffic-by-16-in-4-days/ -
This is my robots.txt http://alpesh.nakars.com/robots.txt
Any suggestions on that? Or is it forum material?
Cheers!
A
Correct
@Ashwin
Creating a file is simple. Just create a new text file in Notepad containing your entries and then upload it to the root directory of your site.
Yes. The directory with your wp-admin etc folders is your root directory.
No question is a newbie question - always feel free to fire off anything on this site. We're all friends here!
I'm confused about what robots.txt does ... does this mean that the supplementary pages become main pages?
Thanks for bringing this up - we've been running with an old bare-bones robots.txt file which paid no attention to WordPress feeds or other such features.
I wonder if it's the robots.txt file change that increased your traffic, or coincidentally the upcoming pagerank change.
We've noticed a very significant increase in Google traffic the past two weeks, which is very similar in magnitude to the traffic increase right before the last pagerank upgrade.
Have an awesome day!
Dan
Anyway, here's the new file I came up with, WAY longer than I'd initially hoped for.
But it does cover some important items, like the "trackback" and comment feed for each post as far as Googlebot is concerned.
Does anyone see ANYTHING wrong with this file?
Thanks!
Dan
-----
# Robots.txt file
# All robots will spider the domain
User-agent: *
Disallow: /Openads/
Disallow: /wp-
Disallow: /feed/
Disallow: /rss/
Disallow: /trackback/
Disallow: /comments/feed/
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /cgi-bin/
Disallow: /styles/
Disallow: /dnld/
Allow: /wp-content/uploads/
Allow: /wp-content/themes/mistylook/img
# GoogleBot
User-agent: Googlebot
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /Openads/
Disallow: /wp-
Disallow: /feed/
Disallow: /rss/
Disallow: /trackback/
Disallow: /comments/feed/
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /cgi-bin/
Disallow: /styles/
Disallow: /dnld/
Allow: /wp-content/uploads/
Allow: /wp-content/themes/mistylook/img
# allow adsense bot on entire site
User-agent: Mediapartners-Google
Allow: /*
# allow google image bot to search all images
User-agent: Googlebot-Image
Allow: /*
# allow AdWords PPC bot on entire site
User-agent: Adsbot-Google
Allow: /*
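Whether the Allow lines in a file like this take effect depends on how a crawler resolves overlapping rules. As a rough sketch, assuming the longest-match precedence that Google documents (the most specific matching path wins, and Allow wins a tie), the check below runs a few rules from the `User-agent: *` group against sample paths. The function name `is_allowed` and the sample paths are made up for illustration, and wildcards are not handled here.

```python
def is_allowed(path, rules):
    """Decide access for one path using longest-match precedence.

    rules is a list of (directive, prefix) pairs from a single
    User-agent group. This is an illustrative helper, not a real API.
    """
    best_len = -1
    best_allow = True  # a path that no rule matches is allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        allow = directive.lower() == "allow"
        # The longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len = len(prefix)
            best_allow = allow
    return best_allow


# A subset of the rules posted above.
rules = [
    ("Disallow", "/wp-"),
    ("Disallow", "/wp-content/"),
    ("Allow", "/wp-content/uploads/"),
    ("Allow", "/wp-content/themes/mistylook/img"),
]

# Hypothetical paths, chosen only to exercise the rules.
for path in [
    "/wp-content/uploads/photo.jpg",
    "/wp-content/plugins/example.php",
    "/2007/04/12/some-post/",
]:
    print(path, "allowed" if is_allowed(path, rules) else "blocked")
```

A parser that stops at the first matching line would hit `Disallow: /wp-` first and block the uploads folder as well, so the result may differ from one crawler to another.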
I guess it would be useful
How has it affected overall traffic? That is, how much did the 16% increase in Google traffic raise overall site traffic?
The 16% increase raised overall traffic by around 10%.
My site is still young and I have only 31 supplemental results (I was shocked that one of my main articles was among them), so I added the robots.txt a moment ago.
Thank you for that great tip!
Eddie
Disallow: /wp-
Does it disallow all directories that start with wp- in the WordPress blog?
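A plain Disallow value is matched as a prefix of the URL path, so one way to check this question is to feed the rule to Python's standard `urllib.robotparser` and try a few paths. This is only a sketch: `example.com` and the sample paths are placeholders, not taken from anyone's site.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /wp-
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# example.com and these paths are placeholders for illustration.
for path in ["/wp-admin/", "/wp-content/themes/", "/wp-login.php", "/about/"]:
    url = "http://example.com" + path
    verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
    print(path, verdict)
```

Anything whose path begins with `/wp-` is caught, files as well as directories, while a path that merely contains `wp-` further along is not.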
I too got bitten by Google around 5/1/07, and nearly all of my blog posts went into the Supplemental results, decreasing my traffic and sales overall.
Hope it works.
Please reply to vicky316[at]gmail[dot]com
- paste the text you want to use
- save the file as robots.txt
- upload it to your blog's root directory
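For anyone who would rather script the first two steps, here is a minimal sketch that writes a robots.txt into a local folder. The two Disallow lines are borrowed from files posted in this thread, the temporary folder stands in for a local copy of the site, and the upload itself is still done by hand (FTP or the host's file manager).

```python
import tempfile
from pathlib import Path

# Example entries only; replace with the rules you actually want.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def write_robots(site_root: Path) -> Path:
    """Write robots.txt into site_root and return the file's path."""
    target = site_root / "robots.txt"
    target.write_text(ROBOTS_TXT, encoding="utf-8")
    return target


# A temporary folder stands in for the local copy of the blog's root.
local_root = Path(tempfile.mkdtemp())
written = write_robots(local_root)
print("Saved", written)
```

The file has to end up next to the wp-admin folder on the server so that it is served at /robots.txt.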
Could this have anything to do with the robots.txt, or is it just a big coincidence?
For the time being I've removed the robots.txt to see if my traffic levels go back up.
Anyone have any ideas? Thanks.
User-agent: Googlebot
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
User-agent: *
Disallow: /blog/wp-
Disallow: /blog/feed/
Disallow: /blog/trackback/
Disallow: /blog/rss/
Disallow: /blog/comments/feed/
Disallow: /blog/page/
Disallow: /blog/date/
Disallow: /blog/comments/
Disallow: /rsscb
Nothing in the robots.txt file I posted should decrease your Google traffic as it is pretty basic.
@Glenn
Can't see any problems. Maybe you need to wait a bit longer
I'll see how it goes in the next couple of weeks. Cheers!
User-agent: Googlebot
Disallow: /blog/index.php/feed/$
Disallow: /blog/index.php/feed/rss/$
Disallow: /blog/index.php/trackback/$
User-agent: *
Disallow: /blog/index.php/wp-
Disallow: /blog/index.php/feed/
Disallow: /blog/index.php/trackback/
Disallow: /blog/index.php/rss/
Disallow: /blog/index.php/comments/feed/
Disallow: /blog/index.php/page/
Disallow: /blog/index.php/date/
Disallow: /blog/index.php/comments/
Disallow: /rsscb
Disallow: /index.php
Disallow: /category/
Here is the post about why these are included:
http://www.earnersblog.com/removing-supplementa...
I'm sure you guys will spot numerous errors with my first site, so be kind! The main problem at the moment is that I have several pages in the main index but a hell of a lot in the supplementary results, including some of my main pages!
I need a basic robots.txt to get the stuff out the supplementary into the main index...
I currently have no robots.txt at all
www.jdelectricalgrantham.co.uk
Help would be very much appreciated.
cheers
I am a newbie to blogging. I blog from the new blogger platform. Is it possible to add the robots.txt in that and if so, can you please tell me how to do so? As of now, I think I have nil traffic. I would like to build it up and monetise it.
Eternalsoul
I've never used blogger, so I wouldn't know.
@tom
Adding a robots.txt file should be useful for all sites. What you need to do is go through your site and think through which pages you want appearing and which you don't. You can also use ‘site:www.YOURDOMAIN.COM -view ***‘ to see what pages are currently appearing in the index, and so spot the things Google is picking up that you'd rather it didn't.
I love this article, I've been giving the robots issue some thought but have been mostly confused till now. I will probably give your robots file a try on my blog soon.
I've got a few questions - hopefully you'll have the time to see them:
1. I have currently 732 supp pages (on my other blog) and most of them are Share-this pages generated by the plugin Share this. I'm thinking it'd be a good idea to unindex these. What's your opinion, and how do I include this piece of instruction in robots.txt?
2. I've read that unindexing feeds will prevent my blog from being included in Google Blog Search. Do you know if that's true?
3. What kind of Wordpress page has /date/ in the permalink? Is it daily archive pages?
Thank you, and I really appreciate the way you share your discoveries
Ana
1. Looking at the Share This URL, it looks like the structure is sitename/?p=1923&akst_action=share-this, so
Disallow: /?p=$
should do the trick, I think.
2. Not sure - anyone else?
3. Yes
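A suggested pattern like this can be tried out before it goes live. The sketch below assumes Google-style wildcards, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper `pattern_matches` is made up for illustration, the second and third patterns are alternatives added for comparison rather than anything proposed in the thread, and the parameter name `akst_action` is read from the URL quoted above with the stray `;` dropped, which is an assumption.

```python
import re


def pattern_matches(pattern: str, path: str) -> bool:
    """Match a robots.txt path pattern with * and trailing $ support."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each * match any characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    regex = "^" + body + ("$" if anchored else "")
    return re.match(regex, path) is not None


share_url = "/?p=1923&akst_action=share-this"  # structure quoted above
post_url = "/?p=1923"  # hypothetical post on default permalinks

for pattern in ["/?p=$", "/?p=", "/*akst_action=share-this"]:
    print(
        pattern,
        "| share page:", pattern_matches(pattern, share_url),
        "| post:", pattern_matches(pattern, post_url),
    )
```

Under these assumptions `/?p=$` only matches a URL that is exactly `/?p=`, and `/?p=` without the `$` would also catch every post on default permalinks, so a pattern keyed on the share parameter may be the closer fit.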
I ended up putting the robots file back, and now the supplemental links have been updated: mine dropped from around 2000 to 250! Whether that will make a big difference to my ranking and traffic, we'll see, but it sure sounds good in terms of supplemental links, for what it's worth.
What command should I use:
1. Disallow: /news/welcome/
2. Disallow: /*/welcome/
3. Disallow: /welcome
I know number 1 should work 100%. But sometimes I don't want to type the exact path.
As for number 2, I think it works for the Google and Yahoo bots only. Those are the only two I know of that accept "*".
As for number 3, I don't know whether it works or not. As far as I understand, the trailing "/" makes a big difference. With "/", you are referring to that directory only. Without "/", you are referring to anything that starts with /welcome. Please correct me if I am wrong.
The subdirectory news/welcome also has /welcome in its path, so I can't be sure whether Disallow: /welcome will block this subdirectory or not.
User-agent: Googlebot
Disallow: /*/welcome/$
User-agent: *
Disallow: /news/welcome/
Hopefully someone else will confirm
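Until someone confirms, the three options and the wildcard rule in the reply can be compared with a small script. This is a sketch under the assumption of Google-style matching: rules are compared from the start of the path, `*` matches any characters, and a trailing `$` anchors the end. The function `rule_blocks` and the sample paths are hypothetical.

```python
import re


def rule_blocks(rule: str, path: str) -> bool:
    """Return True if a Disallow rule would match the given path."""
    anchored = rule.endswith("$")
    core = rule[:-1] if anchored else rule
    regex = ".*".join(re.escape(piece) for piece in core.split("*"))
    return re.match(regex + ("$" if anchored else ""), path) is not None


rules = ["/news/welcome/", "/*/welcome/", "/welcome", "/*/welcome/$"]
# Hypothetical paths used only to compare the rules.
paths = [
    "/news/welcome/",
    "/news/welcome/page2.html",
    "/welcome/",
    "/welcome.html",
]

for rule in rules:
    blocked = [p for p in paths if rule_blocks(rule, p)]
    print(f"Disallow: {rule:<16} blocks {blocked}")
```

Read this way, `Disallow: /welcome` does not reach /news/welcome/ because matching starts at the beginning of the path, and the `$` in `/*/welcome/$` limits that rule to the directory URL itself, leaving pages inside it unblocked.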
In three weeks, the number of links in Google's index increased by 5 times!