I am about to create a robots.txt file.

I am using Notepad.

How should I save the file? As UTF-8, ANSI, or something else?

Also, should the filename start with a capital R?

And in the file, I am specifying a sitemap location. Should this keyword be written with a capital S?

  User-agent: *
  Sitemap: http://www.domain.se/sitemap.xml

Thanks

A: 

I think you're overthinking this. I always use lowercase, just because it's easier.

You can look at SO's robots.txt as an example: http://stackoverflow.com/robots.txt

Robert
OK. Also, do you know whether a sitemap placed inside a directory on the server can list URLs at higher levels, such as the root? Or does the sitemap have to be at the top level?
Camran
@Camran that's an entirely separate question. I'd suggest posting it as one.
Pekka
@Pekka, okay, I will post another question...
Camran
+2  A: 

Since the file should consist of only ASCII characters, it normally doesn't matter whether you save it as ANSI or UTF-8. You should choose ANSI, though, because when you save a file as UTF-8, Notepad adds the Unicode byte order mark (BOM) to the front of the file, which may make the file unreadable for parsers that only know ASCII.
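To check whether an editor has added the byte order mark, you can look at the first bytes of the file. The following is a minimal sketch, not a definitive tool: the helper names `has_utf8_bom` and `strip_utf8_bom` are made up for illustration, and the sample content is the one from the question.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


robots = "User-agent: *\nSitemap: http://www.domain.se/sitemap.xml\n"

# "utf-8-sig" prepends the BOM, which mimics a "UTF-8" save that adds one.
with_bom = robots.encode("utf-8-sig")

# Plain ASCII encoding fails loudly if a non-ASCII character sneaks in.
plain = robots.encode("ascii")
```

For a real file, the same check could be applied to `open("robots.txt", "rb").read()`.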

Roland Illig
+1  A: 

As for the encoding: @Roland already nailed it. Apart from the keywords, the file should contain only URLs. Non-ASCII characters in URLs are illegal, so saving the file as ASCII should be just fine.

If you need to serve UTF-8 for some reason, make sure this is specified correctly in the content-type header of the text file. You will have to set this in your web server's settings.
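To verify what the server actually announces, you can inspect the charset parameter of the Content-Type header. This is only a sketch: `charset_of` is a hypothetical helper name, and the header values below are examples rather than output from any real server.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value.

    Returns None when the header does not declare a charset.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


declared = charset_of("text/plain; charset=UTF-8")
undeclared = charset_of("text/plain")
```

The header value itself could come from something like `urllib.request.urlopen(url).headers["Content-Type"]`, where `url` points at your own robots.txt.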

As to case sensitivity:

  • According to robotstxt.org, the robots.txt file needs to be lowercase:

    Remember to use all lower case for the filename: "robots.txt", not "Robots.TXT".

  • The keywords are probably case-insensitive - I can't find a reference on that - but I would tend to do what everyone else does: use the capitalized versions (Sitemap).
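If the keywords are indeed treated case-insensitively, a reader of the file would normalize them before comparing. The sketch below shows that idea under exactly this assumption; `parse_fields` is a made-up helper, not the parser of any particular crawler.

```python
from typing import List, Tuple


def parse_fields(text: str) -> List[Tuple[str, str]]:
    """Split robots.txt text into (keyword, value) pairs.

    Keywords are lowercased, on the assumption that they are
    case-insensitive. Values are kept as written.
    """
    fields = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split at the first colon only, so URLs in the value stay intact.
        name, value = line.split(":", 1)
        fields.append((name.strip().lower(), value.strip()))
    return fields


sample = "User-agent: *\nSITEMAP: http://www.domain.se/sitemap.xml\n"
fields = parse_fields(sample)
```

With this normalization, `Sitemap`, `sitemap`, and `SITEMAP` all end up as the same keyword, while the URL keeps its original case.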

Pekka