Esri Geoportal Server
1.2.9

Package com.esri.gpt.framework.robots

Provides support for reading and parsing robots.txt


Interface Summary
Access Represents a single access rule within a robots.txt policy.
Bots Represents access policy from a single "robots.txt" file.
MatchingStrategy Strategy for matching a path against robots.txt access rules.
WinningStrategy Strategy for selecting the winning rule when multiple rules match.
 

Class Summary
BotsParser Parser of "robots.txt" file.
BotsUtils Utility class providing shortcut methods for working with robots.txt.
 

Enum Summary
BotsMode Robots.txt processing mode.
Directive Robots.txt file directive.
 

Package com.esri.gpt.framework.robots Description

Provides support for reading and parsing robots.txt

The Robots Exclusion Standard is a mechanism that allows servers to communicate their access policy to web crawlers. This implementation follows the recommendations found in the following sources:

http://www.robotstxt.org/orig.html
http://www.robotstxt.org/norobots-rfc.txt
https://en.wikipedia.org/wiki/Robots_exclusion_standard
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
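A central part of interpreting robots.txt is deciding which rule wins when several Allow/Disallow rules match the same path. The sketch below is not the package's actual API; it is a minimal, self-contained illustration of the longest-match precedence recommended in the Google documentation above, where the most specific (longest) matching prefix wins and an unmatched path is allowed:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch (hypothetical names, not the com.esri.gpt.framework.robots
// API) of a matching/winning strategy for robots.txt rules: each rule is an
// Allow or Disallow path prefix, and the longest matching prefix wins.
public class RobotsSketch {

  static final class Rule {
    final boolean allow;      // true for Allow, false for Disallow
    final String pathPrefix;  // path prefix the rule applies to
    Rule(boolean allow, String pathPrefix) {
      this.allow = allow;
      this.pathPrefix = pathPrefix;
    }
  }

  // Returns true if the crawler may access the given path.
  static boolean hasAccess(List<Rule> rules, String path) {
    Rule winner = null;
    for (Rule r : rules) {
      if (path.startsWith(r.pathPrefix)) {
        // Prefer the most specific (longest) matching prefix.
        if (winner == null || r.pathPrefix.length() > winner.pathPrefix.length()) {
          winner = r;
        }
      }
    }
    // No matching rule means the path is allowed.
    return winner == null || winner.allow;
  }

  public static void main(String[] args) {
    List<Rule> rules = new ArrayList<>();
    rules.add(new Rule(false, "/private"));        // Disallow: /private
    rules.add(new Rule(true,  "/private/public")); // Allow: /private/public
    System.out.println(hasAccess(rules, "/private/data"));     // false
    System.out.println(hasAccess(rules, "/private/public/x")); // true
    System.out.println(hasAccess(rules, "/other"));            // true
  }
}
```

In the package itself this decision is split between the MatchingStrategy and WinningStrategy interfaces, so alternative precedence schemes can be plugged in.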

Behavior of the "robots.txt" mechanism can be configured through the following parameters in the gpt.xml configuration file:

bot.robotstxt.enabled: enables use of robots.txt during harvesting. Default: true
bot.robotstxt.override: allows the user to override bot.robotstxt.enabled. Default: true
bot.agent: name of the user agent used when interpreting the content of robots.txt. Default: "GeoportalServer"
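For illustration, the three parameters above might appear in gpt.xml as key/value parameter entries; the exact enclosing elements depend on your gpt.xml layout, so treat this fragment as a sketch rather than a complete file:

```xml
<!-- Hypothetical gpt.xml fragment: parameter names are from this package,
     but placement within the file depends on your configuration. -->
<parameter key="bot.robotstxt.enabled" value="true"/>
<parameter key="bot.robotstxt.override" value="true"/>
<parameter key="bot.agent" value="GeoportalServer"/>
```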

See Also:
Bots, BotsParser, BotsUtils


Copyright 2011 Environmental Systems Research Institute. All rights reserved. Use is subject to license terms.