Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


seo - <noindex> tag for Google

I would like to tell Google not to index certain parts of a page. Yandex (a Russian search engine) has a very useful tag for this called <noindex>. How can the same thing be done with Google?


1 Reply


According to Wikipedia [1], there are some rules that some spiders follow:

<!--googleoff: all-->
This should not be indexed by Google. Though its main spider, Googlebot,
might ignore that hint.
<!--googleon: all-->

<div class="robots-nocontent">Yahoo bots won't index this.</div>

<noindex>Yandex bots ignore this text.</noindex>
<!--noindex-->They will ignore this, too.<!--/noindex-->

Unfortunately, it seems the search engines could not agree on a single standard, and to my knowledge there is no single markup that keeps all spiders away from part of a page.

The googleoff: comment seems to support several options, though I'm not sure where to find a complete list. There's at least:

  • all: completely ignore the block
  • index: content doesn't go into Google's index
  • anchor: anchor text for links will not be associated with the target page
  • snippet: text will not be used to create snippets for search results

Note as well that (at least for Google) this only affects the search index, not the page ranking. Furthermore, as Stephen Ostermiller correctly pointed out in his comment below, googleon and googleoff only work with the Google Search Appliance and unfortunately have no effect on the normal Googlebot.

There's also an article on the Yahoo part [2] (and an article describing that Yandex also honors <noindex> [6]). On the googleoff: part, also see this answer, and the article I took most of the related information from [3].


Additionally, Google Webmaster Tools recommends using the rel=nofollow attribute [4] for specific links (e.g. ads, or links to pages that are not accessible or useful to bots, such as login/signup). That means the rel attribute of the HTML a element should be honored by the Google bots, though that mainly relates to page rank, not to the search index itself. Unfortunately, there seems to be no rel=noindex [5][7]. I'm also not sure whether this attribute could be used on other elements as well (e.g. <DIV REL="noindex">); but unless crawlers honor "noindex", that wouldn't make sense either.
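For illustration, a minimal sketch of rel=nofollow on the kind of links mentioned above. The paths are hypothetical, and this only marks individual links; it does not keep any text out of the index:

```html
<!-- Hypothetical login/signup links that are of no use to a bot. -->
<a href="/login" rel="nofollow">Log in</a>
<a href="/signup" rel="nofollow">Sign up</a>

<!-- A link without rel="nofollow" is treated as usual. -->
<a href="/articles/">Articles</a>
```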


Further references:


[1] Wikipedia: Noindex
[2] Which Sections of Your Web Pages Might Search Engines Ignore?
[3] Tell Google to Not Index Certain Parts of Your Page
[4] Use rel="nofollow" for specific links
[5] Is it a good idea to use <a href="http://name.com" rel="noindex, nofollow">name</a>?
[6] Using HTML tags (Yandex.Help, Webmaster)
[7] Existing REL values



...