c# - Getting the OpenXmlElements between CommentRangeStart and CommentRangeEnd

Question

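What I am trying to do is find the OpenXmlElements that sit between a CommentRangeStart and its corresponding CommentRangeEnd.

I have tried two approaches, but the problem is that the CommentRangeEnd does not have to be at the same level as the start; it can be nested inside a child element. See the very simple structure below (note that this is not valid Open XML, it is only meant to show the general idea).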

<w:commentstart/>
<w:paragraph>
  <w:run />
  <w:commentend />
</w:paragraph>

The two things I tried are as follows. First: I created an enumerator that returns items until the end is reached.

public static IEnumerable<OpenXmlElement> SiblingsUntilCommentRangeEnd(CommentRangeStart commentStart)
{
    OpenXmlElement element = commentStart.NextSibling();

    // Stop when there are no more siblings or the matching comment end is reached.
    // Only siblings are inspected, so a nested CommentRangeEnd is never found.
    while (element != null && !IsMatchingCommentEnd(element, commentStart.Id.Value))
    {
        yield return element;
        element = element.NextSibling();
    }
}

public static bool IsMatchingCommentEnd(OpenXmlElement element, string commentId)
{
    CommentRangeEnd commentEnd = element as CommentRangeEnd;
    if (commentEnd != null)
    {
        return commentEnd.Id == commentId;
    }
    return false;
}

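Second: having realized the problem that the start and end are not at the same level, I kept looking. I found Eric White's answer on processing the elements between bookmark elements. I retrofitted it to my example, but the end not having the same parent as the start (i.e. not being at the same level) was still a problem, and I could not use it.

I need to process the text that is being commented on, so is there a better way to look at this? I am looking for a way to process the elements.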

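Edit: Clarification of what I am trying to achieve: I am taking a document that was edited in Word and, for the comments in the document, trying to get the commented text between the start and end ranges for a specific comment ID.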

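Edit 2: I have put together a working version of what I currently have in mind, but my concern is that it could be quite fragile given the different combinations that users can produce in Word. It also works on the raw XML, which is not really a problem, but I may want to change it to the Open XML SDK. At the moment it looks like I need to parse the whole document to get the items I need, instead of processing one specific comment. https://github.com/mhbuck/DocumentCommentParser/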

The main problem encountered: the CommentRangeStart and CommentRangeEnd can be at different nesting levels in the XML document. The root node is potentially their only common ancestor element.

score 3 · Accepted Answer

You can use the Descendants<T>() method to enumerate all descendants of a node that are of a particular type. So the code could look like this (written without yield to make it more readable ;)):

public static IEnumerable<OpenXmlElement> SiblingsUntilCommentRangeEnd(CommentRangeStart commentStart)
{
    List<OpenXmlElement> commentedNodes = new List<OpenXmlElement>();

    OpenXmlElement element = commentStart;

    while (true)
    {
        element = element.NextSibling();

        // check that the item exists
        if (element == null)
        {
            break;
        }

        //check that the item is matching comment end
        if (IsMatchingCommentEnd(element, commentStart.Id.Value))
        {
            break;
        }

        // check whether the matching comment end is nested in the current element's descendants
        // (Descendants<T>() returns an empty sequence rather than null, so no null check is needed)
        bool endFoundInDescendants = false;
        foreach (CommentRangeEnd rangeEndNode in element.Descendants<CommentRangeEnd>())
        {
            if (IsMatchingCommentEnd(rangeEndNode, commentStart.Id.Value))
            {
                // matching range end element found in current element's descendants
                // an improvement could be made here to manually select descendants before CommentRangeEnd node
                endFoundInDescendants = true;
                break;
            }
        }

        if (endFoundInDescendants)
        {
            // stop the outer loop too: a break inside the foreach only exits the foreach
            break;
        }

        commentedNodes.Add(element);
    }

    return commentedNodes;
}

As noted in one of the code comments, the loop ends when the matching CommentRangeEnd element is found among the current element's descendants.

I have not tested this code yet, so please let me know in the comments if you run into any problems.

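Note that this approach will not work if the start element is deeper in the document hierarchy than the end element. In some cases, part of the commented content may also not be returned. If needed, I can update the answer later with an alternative solution that handles this case. Please also explain why you need to find those comments, since you may be able to use a different approach.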
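For the case where the start and end markers sit at different depths, one possible alternative is to walk the whole tree in document order instead of following siblings. The following is only a minimal sketch, not a tested solution: the class `CommentRangeSketch` and the method `GetCommentedText` are hypothetical names introduced here for illustration, and the sketch assumes that only the text (the `Text` leaf elements) between the two markers is needed, which is what the question's edit describes.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class CommentRangeSketch
{
    // Hypothetical helper (not part of the Open XML SDK).
    // Returns the text found between the CommentRangeStart and CommentRangeEnd
    // with the given id, wherever the two markers are nested under 'root'.
    public static string GetCommentedText(OpenXmlElement root, string commentId)
    {
        var parts = new List<string>();
        bool insideRange = false;

        // Descendants() enumerates every nested element in document order,
        // so the relative depth of the start and end markers does not matter.
        foreach (OpenXmlElement element in root.Descendants())
        {
            if (element is CommentRangeStart rangeStart && rangeStart.Id?.Value == commentId)
            {
                insideRange = true;
            }
            else if (element is CommentRangeEnd rangeEnd && rangeEnd.Id?.Value == commentId)
            {
                break; // matching end reached, nothing after it belongs to the comment
            }
            else if (insideRange && element is Text textElement)
            {
                parts.Add(textElement.Text);
            }
        }

        return string.Concat(parts);
    }
}
```

It could be called with the document body as the root, for example `CommentRangeSketch.GetCommentedText(wordDocument.MainDocumentPart.Document.Body, "0")`, where `wordDocument` is assumed to be an already opened `WordprocessingDocument`. The trade-off is the one mentioned in the question: the whole subtree is scanned for each comment rather than only the siblings of the start marker.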
