Robots Dot Text plugin
Install the plugin:
ruby script/plugin install git://github.com/GavinM/robots_dot_text.git
Setup Instructions
==================
1. script/plugin install http://github.com/GavinM/robots_dot_text.git
2. remove robots.txt from your /public directory
3. create a controller for your robots with an action called index:
    script/generate controller robots index
4. add a route in routes.rb to your robots index action:
    map.connect "robots.txt", :controller => "robots"
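Step 2 matters because a static file under public/ is typically served before the request ever reaches the Rails router, so a leftover public/robots.txt would hide the controller. The following is a minimal sketch of a check for that situation. The helper name `shadowing_robots_file` is hypothetical and not part of the plugin, and the script assumes it is run from the application root.

```ruby
# Hypothetical helper (not part of the plugin): returns the path of a static
# robots.txt that would shadow the dynamic route, or nil if there is none.
def shadowing_robots_file(app_root)
  path = File.join(app_root, "public", "robots.txt")
  File.file?(path) ? path : nil
end

# Assumption: the current directory is the application root.
if (path = shadowing_robots_file(Dir.pwd))
  puts "Remove #{path} so requests for /robots.txt reach RobotsController"
else
  puts "No static robots.txt found under public/"
end
```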
Examples:
==================
Simple example
---------------------
    class RobotsController < ActionController::Base
      def index
        respond_to do |format|
          format.text do
            log_user_agent # adds the crawler's user_agent to user_agents.log
            @page_content = robots_dot_text do |rules|
              rules.comment "Tell all crawlers to keep out of these pages"
              rules.add :all, admin_path, customers_path, log_path
              rules.br
              rules.sitemap sitemap_url
            end
            render :text => @page_content, :layout => false
          end
        end
      end
    end
will render:
    # Tell all crawlers to keep out of these pages
    User-agent: *
    Disallow: /admin
    Disallow: /customers
    Disallow: /log

    Sitemap: http://handyrailstips.com/sitemap.xml
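To make the mapping from builder calls to output lines concrete, here is a small plain-Ruby sketch that reproduces the output of the simple example. It is not the plugin's implementation: `SketchRules` and `sketch_robots_text` are hypothetical stand-ins, only the four calls used in the simple example are modelled, and literal paths and URLs replace the Rails route helpers so the script runs without Rails.

```ruby
# Hypothetical stand-in for the plugin's rules object; NOT the plugin's code.
class SketchRules
  def initialize
    @lines = []
  end

  # Emit a "# ..." line for each line of the comment text.
  def comment(text)
    text.to_s.each_line { |line| @lines << "# #{line.strip}" }
  end

  # Emit a User-agent line followed by one Disallow line per path.
  # Only :all is special-cased here; it maps to the "*" wildcard.
  def add(agent, *paths)
    @lines << "User-agent: #{agent == :all ? '*' : agent}"
    paths.each { |path| @lines << "Disallow: #{path}" }
  end

  # Emit a blank separator line.
  def br
    @lines << ""
  end

  # Emit one Sitemap line per URL.
  def sitemap(*urls)
    urls.each { |url| @lines << "Sitemap: #{url}" }
  end

  def to_s
    @lines.join("\n") + "\n"
  end
end

# Yields a fresh rules object to the block and returns the rendered text.
def sketch_robots_text
  rules = SketchRules.new
  yield rules
  rules.to_s
end

output = sketch_robots_text do |rules|
  rules.comment "Tell all crawlers to keep out of these pages"
  rules.add :all, "/admin", "/customers", "/log"
  rules.br
  rules.sitemap "http://handyrailstips.com/sitemap.xml"
end

puts output
```

Collecting lines in an array and joining them at the end keeps each directive method trivial; the real plugin may build its output differently.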
Complex Example
----------------------
    class RobotsController < ActionController::Base
      def index
        respond_to do |format|
          format.txt do
            # :short is the datetime format; logger tells it to write to
            # Rails.logger instead of user_agents.log
            log_user_agent(:short, logger)
            @page_content = robots_dot_text do |rules|
              rules.add :all
              rules.sitemap sitemap_url, google_news_sitemap_url
              rules.br
              rules.comment "Google ignores most directives so here are some rules for Google"
              rules.add [:google, :google_image, :google_mobile]
              rules.allow article_path("*")
              rules.block articles_path
              rules.line_break
              rules.comment "These crawlers respect the Crawl-delay directive"
              rules.add [:yahoo, :msn, :cuil, :ask], private_path, admin_path
              rules.rate "1/500s"
              rules.delay 10
              rules.comment <<-COMMENT
                Request robots only crawl between 2am and 8am.
                (Those are our quiet times)
              COMMENT
              # ... (the call that emits "Visit-time: 0200-0800" is not shown)
            end
            render :text => @page_content, :layout => false
          end
        end
      end
    end
will render:
    User-agent: *
    Sitemap: http://handyrailstips.com/sitemap.xml
    Sitemap: http://handyrailstips.com/google_news_sitemap.xml

    # Google ignores most directives so here are some rules for Google
    User-agent: Googlebot
    User-agent: Googlebot-Image
    User-agent: Googlebot-Mobile
    Allow: /articles/*
    Disallow: /articles

    # These crawlers respect the Crawl-delay directive
    User-agent: Slurp
    User-agent: MSNBot
    User-agent: Twiceler
    User-agent: Teoma
    Disallow: /private
    Disallow: /admin
    Request-rate: 1/500s
    Crawl-delay: 10
    # Request robots only crawl between 2am and 8am.
    # (Those are our quiet times)
    Visit-time: 0200-0800
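The complex example passes crawler names as symbols and gets real user-agent strings back. The sketch below collects the symbol-to-string pairs that can be read off the example and its output. The constant `CRAWLER_USER_AGENTS` and the method `user_agent_lines` are hypothetical names for illustration; the plugin may support more crawlers and may store the mapping differently.

```ruby
# Mapping inferred from the complex example and its rendered output.
# Hypothetical constant; the plugin may define more names than these.
CRAWLER_USER_AGENTS = {
  :all           => "*",
  :google        => "Googlebot",
  :google_image  => "Googlebot-Image",
  :google_mobile => "Googlebot-Mobile",
  :yahoo         => "Slurp",
  :msn           => "MSNBot",
  :cuil          => "Twiceler",
  :ask           => "Teoma"
}.freeze

# Returns one "User-agent: ..." line per name, accepting either a single
# symbol or an array of symbols. Unknown names raise KeyError, which is a
# choice made for this sketch rather than documented plugin behaviour.
def user_agent_lines(names)
  Array(names).map { |name| "User-agent: #{CRAWLER_USER_AGENTS.fetch(name)}" }
end

puts user_agent_lines([:yahoo, :msn, :cuil, :ask])
```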
Further Documentation
There is currently no advanced documentation for this plugin.
Last edited by: hardway, about 1 year ago


