# robots.txt -- robot exclusion control file
#
# This is a robot exclusion file. It is here to prevent automatic
# agents from visiting certain parts of the web site.
#
# See for the robot exclusion standard.
#
# 2003-07-18 RB Created.
# 2003-07-24 RB Excluded import directories after analysis of web logs.
#               Excluded picsearch.com after further analysis.
# 2009-08-11 RB Excluded MSNBot from the issue script.
#
# $Id: //info.ravenbrook.com/project/www.ravenbrook.com/version/4.0/page/robots.txt#3 $

# Stop crawlers from looking in the "import" directories of our projects.
# We've noticed a lot of people downloading stuff we've imported for our
# own use, presumably because they found it on Google or whatever.

User-agent: *
Disallow: /project/p4dti/import/
Disallow: /project/mps/import/

# Picsearch.com hits our site a lot, and there aren't any pictures to be had.
# (At least, none that anyone looking there might want.)

User-agent: psbot/0.1
Disallow: /

# MSNBot crawls the issue script CGI, causing a lot of expensive Perforce
# queries, and also filling Raven's process table if Perforce is down.
# See http://www.robotstxt.org/db/msnbot.html
# The project list was taken from /usr/local/etc/apache/vhosts/ravenbrook.com
# on 2009-08-11 and may need updating occasionally.

User-agent: MSNBOT/0.1
Disallow: /project/p4dti/issue/
Disallow: /project/mps/issue/
Disallow: /project/alu/issue/
Disallow: /project/scan/issue/
Disallow: /project/snas/issue/
Disallow: /project/ssa-lib/issue/
Disallow: /project/worldview/issue/
Disallow: /project/symclu/issue/
Disallow: /project/pplus/issue/
Disallow: /project/warp/issue/
Disallow: /project/ccc/issue/
Disallow: /project/olb/issue/