c# - Extract all strings between two strings

Question

I'm trying to develop a method that matches all strings between two strings.

I tried this, but it only returns the first match:

string ExtractString(string s, string start, string end)
{
    // You should check for errors in real-world code, omitted for brevity

    int startIndex = s.IndexOf(start) + start.Length;
    int endIndex = s.IndexOf(end, startIndex);
    return s.Substring(startIndex, endIndex - startIndex);
}

Suppose we have this string:

String Text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";

I would like a C# function that does the following:

public List<string> ExtractFromString(String Text,String Start, String End)
{
    List<string> Matched = new List<string>();
    .
    .
    .
    return Matched; 
}
// Example of use 

ExtractFromString("A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2","A1","A2")

    // Will return :
    // FIRSTSTRING
    // SECONDSTRING
    // THIRDSTRING

Thanks for your help!

score 35 · Accepted Answer

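    // Requires: using System; using System.Collections.Generic;
    private static List<string> ExtractFromBody(string body, string start, string end)
    {
        List<string> matched = new List<string>();

        int searchFrom = 0;
        while (true)
        {
            // Find the next start delimiter at or after the current position.
            int indexStart = body.IndexOf(start, searchFrom, StringComparison.Ordinal);
            if (indexStart == -1)
            {
                break;
            }

            // Look for the end delimiter only after the start delimiter.
            int contentStart = indexStart + start.Length;
            int indexEnd = body.IndexOf(end, contentStart, StringComparison.Ordinal);
            if (indexEnd == -1)
            {
                break; // Unterminated start delimiter: stop instead of throwing.
            }

            matched.Add(body.Substring(contentStart, indexEnd - contentStart));

            // Continue searching after the end delimiter.
            searchFrom = indexEnd + end.Length;
        }

        return matched;
    }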

score 15 · Answer

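Here is a solution that uses Regex. Don't forget to include the following using directive:

using System.Text.RegularExpressions;

It correctly returns only the text between the specified start and end strings.

Not returned:

akslakhflkshdflhksdf

Returned:

FIRSTSTRING
SECONDSTRING
THIRDSTRING

It uses the regular expression pattern [start string].+?[end string]

The start and end strings are escaped in case they contain regular expression special characters.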

    private static List<string> ExtractFromString(string source, string start, string end)
    {
        var results = new List<string>();

        string pattern = string.Format(
            "{0}({1}){2}", 
            Regex.Escape(start), 
            ".+?", 
             Regex.Escape(end));

        foreach (Match m in Regex.Matches(source, pattern))
        {
            results.Add(m.Groups[1].Value);
        }

        return results;
    }

You can make it an extension method of String like this:

public static class StringExtensionMethods
{
    public static List<string> EverythingBetween(this string source, string start, string end)
    {
        var results = new List<string>();

        string pattern = string.Format(
            "{0}({1}){2}",
            Regex.Escape(start),
            ".+?",
             Regex.Escape(end));

        foreach (Match m in Regex.Matches(source, pattern))
        {
            results.Add(m.Groups[1].Value);
        }

        return results;
    }
}

Usage:

string source = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
string start = "A1";
string end = "A2";

List<string> results = source.EverythingBetween(start, end);
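
To illustrate why the delimiters are passed through Regex.Escape, here is a minimal sketch. The delimiters "(" and ")", the input string, and the names EscapeDemo and Between are made up for this example and do not come from the question. Unescaped, the parentheses would be read as a capturing group rather than as literal characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Same pattern shape as the answer above: escaped start, lazy capture, escaped end.
    public static List<string> Between(string source, string start, string end)
    {
        string pattern = Regex.Escape(start) + "(.+?)" + Regex.Escape(end);
        var results = new List<string>();
        foreach (Match m in Regex.Matches(source, pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical input; "(" and ")" are regex metacharacters.
        foreach (string s in Between("f(a)g(bc)", "(", ")"))
        {
            Console.WriteLine(s);
        }
        // prints "a" then "bc"
    }
}
```

Note that .+? needs at least one character between the delimiters and, by default, does not match across newlines. Whether .*? or RegexOptions.Singleline would suit better depends on the input you expect.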

score 3 · Answer

text.Split(new[] {"A1", "A2"}, StringSplitOptions.RemoveEmptyEntries);
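
As a quick check of what this call produces, the sketch below runs it on the sample string from the question. The names SplitDemo and SplitOnDelimiters are made up for illustration. Because Split only cuts at the delimiters, the text between an A2 and the next A1 is kept as well.

```csharp
using System;

public static class SplitDemo
{
    // Wraps the one-liner above so its result can be inspected.
    public static string[] SplitOnDelimiters(string text)
    {
        return text.Split(new[] { "A1", "A2" }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
        foreach (string part in SplitOnDelimiters(text))
        {
            Console.WriteLine(part);
        }
        // prints FIRSTSTRING, SECONDSTRING, akslakhflkshdflhksdf, THIRDSTRING (one per line)
    }
}
```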

Answered 2012-12-08T19:00:16.657

score 2 · Answer

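You can split the string into an array on the start identifier with the following code.

String str = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";

String[] arr = str.Split(new[] { "A1" }, StringSplitOptions.None);

Then iterate over the array and remove the last two characters of each string (to remove the A2). You also need to discard the first array element, since it will be empty, assuming the string starts with A1.

The code is untested; I'm on mobile at the moment.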
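
The approach described above can be sketched as follows. This is one possible reading of the answer, not its author's code: rather than always trimming the last two characters, it cuts each piece at the first end delimiter, an assumption made here so that text following A2 (such as akslakhflkshdflhksdf) is dropped. The names SplitOnStartDemo and ExtractBySplitting are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SplitOnStartDemo
{
    public static List<string> ExtractBySplitting(string text, string start, string end)
    {
        var results = new List<string>();
        string[] pieces = text.Split(new[] { start }, StringSplitOptions.None);

        // pieces[0] is whatever precedes the first start delimiter, so skip it.
        for (int i = 1; i < pieces.Length; i++)
        {
            int endIndex = pieces[i].IndexOf(end, StringComparison.Ordinal);
            if (endIndex >= 0)
            {
                // Keep only the text before the end delimiter.
                results.Add(pieces[i].Substring(0, endIndex));
            }
        }
        return results;
    }

    public static void Main()
    {
        string str = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
        foreach (string s in ExtractBySplitting(str, "A1", "A2"))
        {
            Console.WriteLine(s);
        }
        // prints FIRSTSTRING, SECONDSTRING, THIRDSTRING (one per line)
    }
}
```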

score 0 · Answer

I think this is a general solution with more readable code. It is untested, so be careful.

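// Requires: using System; using System.Collections.Generic; using System.Linq;
public static IEnumerable<IList<T>> SplitBy<T>(this IEnumerable<T> source, 
                                               Func<T, bool> startPredicate,
                                               Func<T, bool> endPredicate, 
                                               bool includeDelimiter)
{
    var l = new List<T>();
    foreach (var s in source)
    {
        if (startPredicate(s))
        {
            // A start delimiter always begins a fresh group.
            l = new List<T> { s };
        }
        else if (l.Any())
        {
            l.Add(s);
        }

        // Yield only when a group is open, i.e. the list holds both
        // the start delimiter and the end delimiter just added.
        if (l.Count >= 2 && endPredicate(s))
        {
            if (includeDelimiter)
                yield return l;
            else
                yield return l.GetRange(1, l.Count - 2);

            l = new List<T>();
        }
    }
}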

In your case, you could call it like this:

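var text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
// SplitBy needs a token sequence; the capturing group makes Regex.Split keep "A1" and "A2" as tokens.
var tokens = Regex.Split(text, "(A1|A2)");
var splits = tokens.SplitBy(x => x == "A1", x => x == "A2", false);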

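This is not the most efficient approach when you don't want the delimiters included in the result (as in your case), but it is efficient in the opposite case. To speed up your case, you could call GetEnumerator directly and make use of MoveNext.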
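
The GetEnumerator/MoveNext idea mentioned above might look roughly like the sketch below. It is an illustration under stated assumptions, not the answer author's code: the names EnumeratorSketch and Between are hypothetical, the input is tokenized with Regex.Split (an assumption made here so the delimiters arrive as separate items), and a start delimiter seen inside an open group is treated as ordinary content.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EnumeratorSketch
{
    // Hypothetical variant: yields only the items strictly between the delimiters,
    // so no GetRange copy is needed to strip them afterwards.
    public static IEnumerable<IList<T>> Between<T>(this IEnumerable<T> source,
                                                   Func<T, bool> startPredicate,
                                                   Func<T, bool> endPredicate)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (!startPredicate(e.Current))
                {
                    continue; // skip everything outside a group
                }

                var group = new List<T>();
                bool closed = false;
                while (e.MoveNext())
                {
                    if (endPredicate(e.Current))
                    {
                        closed = true;
                        break;
                    }
                    group.Add(e.Current);
                }

                // An unterminated group at the end of the input is dropped.
                if (closed)
                {
                    yield return group;
                }
            }
        }
    }

    public static void Main()
    {
        string text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
        // The capturing group keeps "A1" and "A2" as separate tokens.
        string[] tokens = Regex.Split(text, "(A1|A2)");
        foreach (IList<string> group in tokens.Between(t => t == "A1", t => t == "A2"))
        {
            Console.WriteLine(string.Join("", group));
        }
        // prints FIRSTSTRING, SECONDSTRING, THIRDSTRING (one per line)
    }
}
```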
