seo - Preventing indexing of a subdirectory of a parent domain

Question

Suppose my site children.com (which I want indexed) is also reachable at http://mother.com/children/ (which I do not want indexed).

Example hierarchy:

/home/username/mother : http://mother.com
  |_ children : http://www.children.com

What should I put in the mother.com/robots.txt file so that the content of children.com, including all of its subdirectories, is not indexed as belonging to mother.com?

Thanks for any suggestions.

score 0 · Accepted Answer

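Actually, you probably don't want to use robots.txt at all. Instead, use a combination of the robots meta tag and the canonical tag.

On every mother.com/children page, add a robots meta tag with the value "noindex". Search engines can still crawl these pages, but they will not add them to their index. On its own, though, this can leave search engines unsure which location is the authoritative one for the content.

So, to tell the major search engines where the authoritative content lives, you need to use the cross-domain canonical tag. Add a canonical tag to each mother.com/children page and point its value at children.com. Each page should be canonicalized to the page with the same content on children.com, because the canonical tag is really only meant for identical content.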
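As a rough illustration of the two tags described above, here is a minimal Python sketch that builds the `<head>` tags for a mirrored page. The names `MIRROR_PREFIX`, `CANONICAL_ORIGIN`, and `head_tags_for`, as well as the page `about.html`, are hypothetical and used only for this example; it also assumes that a path under mother.com/children/ maps one-to-one onto the same path on www.children.com.

```python
from html import escape
from urllib.parse import urlsplit

# Assumed values, taken from the example domains in the question.
MIRROR_PREFIX = "/children/"
CANONICAL_ORIGIN = "http://www.children.com"


def head_tags_for(mirror_url: str) -> str:
    """Return the robots meta tag and canonical tag for a mother.com/children/ page."""
    path = urlsplit(mirror_url).path
    if not path.startswith(MIRROR_PREFIX):
        raise ValueError("not a mirrored children page: " + mirror_url)
    # Map /children/about.html on mother.com to /about.html on children.com.
    # Query strings are ignored in this sketch.
    canonical = CANONICAL_ORIGIN + "/" + path[len(MIRROR_PREFIX):]
    return (
        '<meta name="robots" content="noindex">\n'
        f'<link rel="canonical" href="{escape(canonical, quote=True)}">'
    )


print(head_tags_for("http://mother.com/children/about.html"))
```

For the example URL this prints a `noindex` robots meta tag followed by a canonical link pointing at http://www.children.com/about.html. How the tags actually get into each page depends on how the site is generated, which the question does not say.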

score 0 · Accepted Answer

I solved my own question and verified it with the phpwebby robots.txt analyzer... I put the following code in the mother.com/robots.txt file:

User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow: /

User-agent: Adsbot-Google
Disallow: /

User-agent: Jeeves
Disallow: /

User-agent: Slurp
Disallow: /

User-agent: Yahoo-MMCrawler
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: psbot
Disallow: /

User-agent: *
Disallow: /

And I added the following to the children.com robots.txt file:

User-agent: *
#block indexing of email and print pages -------
Disallow: /*~email.shtml
Disallow: /*~print.shtml
Sitemap: http://www.children.com/sitemap_index.xml

Of course, I triple-checked (using the robots.txt file analyzer) that the various subdirectories cannot be crawled through the mother.com domain and can be indexed through the children.com domain.

Note: mother.com and children.com are used here only as example domains.
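A similar check can be sketched locally with Python's standard `urllib.robotparser`, as a stand-in for the online analyzer (it is not the tool used above). The helper `can_crawl` and the page `page.html` are hypothetical, and the mother.com rules are abridged to two groups. Note that this only tests whether a crawler may fetch a URL, which is what robots.txt controls; it says nothing about whether a URL ends up in an index.

```python
from urllib.robotparser import RobotFileParser

# Abridged copy of the mother.com rules from this answer.
MOTHER_ROBOTS = """\
User-agent: Googlebot
Disallow: /
User-agent: *
Disallow: /
"""

# The children.com rules from this answer.
CHILDREN_ROBOTS = """\
User-agent: *
#block indexing of email and print pages -------
Disallow: /*~email.shtml
Disallow: /*~print.shtml
Sitemap: http://www.children.com/sitemap_index.xml
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# page.html is a made-up page used only for illustration.
print(can_crawl(MOTHER_ROBOTS, "Googlebot", "http://mother.com/children/page.html"))
print(can_crawl(CHILDREN_ROBOTS, "Googlebot", "http://www.children.com/page.html"))
```

This should print `False` for the mother.com URL and `True` for the children.com URL. The standard-library parser matches rule paths as plain prefixes and does not expand the `*` wildcard, so it is not a reliable way to test the `~email.shtml` and `~print.shtml` rules.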
