indexing - Sitecore 8 XP ContentSearch: Exclude paths from the index

Question

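I'm having trouble with Sitecore indexing for the general indexes "sitecore_master_index" and "sitecore_web_index". They take forever because the crawler/indexer checks every item in the database.

We imported thousands of products, each with a lot of specifications, and the product repository literally contains hundreds of thousands of items.

If we could exclude paths from indexing, the crawler would not have to check millions of items against the template exclusions.

Follow-up

I implemented a custom crawler that excludes a list of paths from the index.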

<index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
  <param desc="name">$(id)</param>
  <param desc="core">sitecore_web_index</param>
  <param desc="rebuildcore">sitecore_web_index_sec</param>
  <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
  <configuration ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration" />
  <strategies hint="list:AddStrategy">
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
  </strategies>
  <locations hint="list:AddCrawler">
    <crawler type="Sitecore.ContentSearch.Utilities.Crawler.ExcludePathsItemCrawler, Sitecore.ContentSearch.Utilities">
      <Database>web</Database>
      <Root>/sitecore</Root>
      <ExcludeItemsList hint="list">
        <ProductRepository>/sitecore/content/Product Repository</ProductRepository>
      </ExcludeItemsList>
    </crawler>
  </locations>
</index>

In addition, I enabled SwitchOnRebuildSolrSearchIndex, since it is a great out-of-the-box feature.

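using System;
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.Diagnostics;

namespace Sitecore.ContentSearch.Utilities.Crawler
{
  // Crawler that skips every item whose path starts with one of the configured paths.
  public class ExcludePathsItemCrawler : SitecoreItemCrawler
  {
    // Filled from the <ExcludeItemsList hint="list"> node of the crawler configuration.
    private readonly List<string> excludeItemsList = new List<string>();
    public List<string> ExcludeItemsList
    {
      get
      {
        return excludeItemsList;
      }
    }

    protected override bool IsExcludedFromIndex(SitecoreIndexableItem indexable, bool checkLocation = false)
    {
      Assert.ArgumentNotNull(indexable, "indexable");
      // Case-insensitive prefix match on the item path. Note that a plain prefix match
      // also excludes siblings whose names merely begin with an excluded path.
      if (ExcludeItemsList.Any(path => indexable.AbsolutePath.StartsWith(path, StringComparison.OrdinalIgnoreCase)))
      {
        return true;
      }
      // Otherwise fall back to the default template and location exclusions.
      return base.IsExcludedFromIndex(indexable, checkLocation);
    }
  }
}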

score 2 · Accepted Answer

You can override the SitecoreItemCrawler class that is used by the index you want to change.

<locations hint="list:AddCrawler">
  <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
    <Database>master</Database>
    <Root>/sitecore</Root>
  </crawler>
</locations>

You can then add your own parameters, for example ExcludeTree or a list of ExcludedBranches.

In the class implementation, you only need to override the method

public override bool IsExcludedFromIndex(IIndexable indexable)

and check whether the item is located under an excluded node.
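The "is it under an excluded node" check can be sketched as a small helper that has no Sitecore dependency. This is only an illustration: the `PathExclusion` class and its method names are hypothetical and not part of Sitecore. It compares paths case-insensitively and only matches on a path segment boundary, so an excluded "/sitecore/content/Product Repository" does not also exclude "/sitecore/content/Product Repository Archive".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not a Sitecore API): decides whether an item path
// is one of the excluded roots or lies below one of them.
public static class PathExclusion
{
    public static bool IsUnderAny(string itemPath, IEnumerable<string> excludedRoots)
    {
        if (string.IsNullOrEmpty(itemPath) || excludedRoots == null)
        {
            return false;
        }
        return excludedRoots.Any(root => IsSameOrDescendant(itemPath, root));
    }

    private static bool IsSameOrDescendant(string itemPath, string root)
    {
        if (string.IsNullOrWhiteSpace(root))
        {
            return false;
        }

        // Ignore a trailing slash so "/a/b/" and "/a/b" behave the same.
        string normalizedRoot = root.TrimEnd('/');
        if (normalizedRoot.Length == 0)
        {
            return false;
        }

        if (!itemPath.StartsWith(normalizedRoot, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        // Either the paths are identical, or the next character starts a child segment.
        return itemPath.Length == normalizedRoot.Length
            || itemPath[normalizedRoot.Length] == '/';
    }
}
```

Inside a crawler override such as the one in the question, the call would look something like `PathExclusion.IsUnderAny(indexable.AbsolutePath, ExcludeItemsList)`, falling back to the base implementation when it returns false.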

score 0 · Accepted Answer

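If you are importing large amounts of data, try temporarily disabling indexing of the data; otherwise you will run into problems with the crawler not being able to keep up.

There is a great post here about disabling indexes while data is being imported. It is written for Lucene, but I'm sure you can do the same with Solr.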

http://intothecore.cassidy.dk/2010/09/disabling-lucene-indexes.html
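For ContentSearch indexes, a rough sketch of the same idea is to pause indexing around the import and resume it afterwards. This is not the approach from the linked post. It assumes that `IndexCustodian.PauseIndexing()` and `IndexCustodian.ResumeIndexing()` from `Sitecore.ContentSearch.Maintenance` are available in your Sitecore version, and `ImportIndexingGuard` is a hypothetical name used only for illustration.

```csharp
using System;
using Sitecore.ContentSearch.Maintenance;

// Hypothetical wrapper (not a Sitecore class): runs an import with
// ContentSearch indexing paused and always resumes it afterwards.
public static class ImportIndexingGuard
{
    public static void RunWithIndexingPaused(Action import)
    {
        if (import == null)
        {
            throw new ArgumentNullException("import");
        }

        // Assumption: pauses indexing jobs for the ContentSearch indexes.
        IndexCustodian.PauseIndexing();
        try
        {
            import();
        }
        finally
        {
            // Resume even if the import throws, so indexing is never left paused.
            IndexCustodian.ResumeIndexing();
        }
    }
}
```

Whether changes made during the pause are picked up automatically depends on the index update strategies in use, so an index rebuild after the import may still be necessary.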

Another option is to store the products in a separate Sitecore database rather than in the master database.

Another post from Into the Core:

http://intothecore.cassidy.dk/2009/05/working-with-multiple-content-databases.html
