Ubuntu Documentation

Comment 11 for bug 122297

Dean Sas (dsas) wrote on 2008-03-02: Re: [Bug 122297] Re: Server Guide draft has higher Google rank than released version

Matthew East wrote:
> Hi,
>
> On Fri, Feb 29, 2008 at 10:15 PM, Jim Campbell <email address hidden> wrote:
>> You can do this through a robots.txt file, through the meta tags on your
>> site... I think you can even do it through modifications to htaccess.
>>
>> ubuntu.com already has a robots.txt file in place, but I'm not sure how
>> robots.txt files apply to subdomains. I also do not know what kind of
>> control we have over the meta tags in the draft documentation. Are the meta
>> tags auto-generated as part of the page creation process?
>
> Yes, although no doubt it is possible to customise them if necessary.
> http://www.sagehill.net/docbookxsl/HtmlHead.html looks like it has the
> relevant instructions and I could take care of that aspect of it. But
> I'm not familiar with robots.txt files.

Either:
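doc.ubuntu.com should serve its own file called 'robots.txt' from the site
root (robots.txt applies per hostname, so the one on ubuntu.com does not
cover this subdomain), containing the following two lines: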
User-agent: *
Disallow: /

(this asks all well-behaved bots not to crawl any page on the site)
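
To sanity-check those two lines before deploying them, something like the
following Python sketch could be used. It relies only on the standard
library's urllib.robotparser; the helper name is_allowed, the bot names and
the draft page path are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # access is needed to test a rule set.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on the draft site; both should be refused.
for url in ("http://doc.ubuntu.com/",
            "http://doc.ubuntu.com/some/draft/page.html"):
    print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Because the rule is "Disallow: /" under "User-agent: *", the check should
come back negative for any bot name and any path on that host.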

Or:
The <head> element of the HTML needs to contain a meta tag like so:
<meta name="robots" content="noindex, nofollow">
(noindex means don't index this page, and nofollow means don't follow any
links on this page)

This tag has to be added to every HTML page, since it only applies to the
page that carries it.
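
Since the tag has to be present on every generated page, a small check can
help catch pages that were missed. The following is a rough Python sketch
using only the standard library's html.parser; the class and function names
(RobotsMetaFinder, robots_directives) are hypothetical, and how the pages
are collected is left out.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # content looks like "noindex, nofollow": split on commas.
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in html."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.directives


page = ('<html><head><meta name="robots" content="noindex, nofollow">'
        '</head><body></body></html>')
missing = {"noindex", "nofollow"} - robots_directives(page)
print("missing directives:", sorted(missing))
```

A page that lacks the tag would report both directives as missing, which
makes it easy to loop over the built HTML files and list the offenders.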

http://www.robotstxt.org is a good resource.