Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


User agent - order of records in robots.txt: do they override each other or complement each other?

User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /

Now, what is disallowed for Googlebot: only /privatedir/, or the whole website (/)?



1 Reply


According to the original robots.txt specification:

  1. A bot must follow the first record that matches its user-agent name.

  2. If such a record doesn’t exist, it must follow the record with User-agent: * (this line may not appear in more than one record).

  3. If that record doesn’t exist either, no record applies to the bot, so it doesn’t have to follow any record.

So a bot never follows more than one record.
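The selection steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real crawler is implemented: the names parse_records, select_record and is_allowed are hypothetical, only User-agent and Disallow lines are handled, and name matching is a case-insensitive substring test.

```python
from typing import List, Optional, Tuple

# A record is (user-agent names, disallowed path prefixes).
Record = Tuple[List[str], List[str]]

EXAMPLE = """\
User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /
"""

def parse_records(text: str) -> List[Record]:
    """Split robots.txt text into records; a blank line ends a record."""
    records: List[Record] = []
    agents: List[str] = []
    rules: List[str] = []
    for raw in text.splitlines():
        if not raw.strip():
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # comment-only line: not a record boundary
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow" and value:
            rules.append(value)  # an empty Disallow restricts nothing
    if agents:
        records.append((agents, rules))
    return records

def select_record(records: List[Record], bot_name: str) -> Optional[Record]:
    """Pick the single record a bot follows, per the three steps above."""
    name = bot_name.lower()
    # Step 1: the first record that names this bot.
    for agents, rules in records:
        if any(a and a != "*" and a.lower() in name for a in agents):
            return agents, rules
    # Step 2: otherwise the record with User-agent: *.
    for agents, rules in records:
        if "*" in agents:
            return agents, rules
    # Step 3: no record applies.
    return None

def is_allowed(records: List[Record], bot_name: str, path: str) -> bool:
    record = select_record(records, bot_name)
    if record is None:
        return True
    return not any(path.startswith(prefix) for prefix in record[1])

records = parse_records(EXAMPLE)
print(select_record(records, "Googlebot"))
print(select_record(records, "SomeOtherBot"))
```

Because select_record returns at most one record, the rules of the User-agent: * record are never combined with the rules of a named record.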


For your example this means:

  • A bot that matches the name "Googlebot" is not allowed to crawl URLs with a path that starts with /privatedir/.
  • A bot that doesn’t match the name "Googlebot" is not allowed to crawl any URL.
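If you want to check this yourself, Python's standard library ships a robots.txt parser (urllib.robotparser) that you can feed the example. This shows how that parser reads the file, which is not necessarily how Googlebot itself behaves; the can_crawl helper and the example.com URLs are placeholders of mine.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /
"""

def can_crawl(agent: str, url: str) -> bool:
    """Ask the stdlib parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

urls = (
    "https://example.com/privatedir/a.html",
    "https://example.com/index.html",
)
for agent in ("Googlebot", "SomeOtherBot"):
    for url in urls:
        print(agent, url, can_crawl(agent, url))
```

With this parser, Googlebot is blocked only under /privatedir/, while any other bot is blocked everywhere.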


...