c# - How to extract the common part from a list of strings using C#

Question

Here is my scenario:

List<String> list = new List<String>();
list.Add("E9215001");
list.Add("E9215045");
list.Add("E1115001");
list.Add("E1115022");
list.Add("E1115003");
list.Add("E2115041");
list.Add("E2115042");
list.Add("E4115021");
list.Add("E5115062");

Using C# and LINQ, I need to extract the following common parts from the list above:

E92150 -> extracted from {E9215001, E9215045}

E11150 -> extracted from {E1115001, E1115022, E1115003}

E21150 -> extracted from {E2115041, E2115042}

E41150 -> extracted from {E4115021}

E51150 -> extracted from {E5115062}

Update: Thanks, everyone! With the help of @mlorbetske and @shelleybutterfly, I figured it out.

Solution:

list.Select((item, index) => new {
        Index = index,
        // I'm ignoring the last 2 characters
        Length = Enumerable.Range(1, item.Length - 2)
                           .Reverse()
                           .First(proposedLength =>
                               list.Count(innerItem => innerItem.StartsWith(item.Substring(0, proposedLength))) > 1)
    })
    .Select(n => list[n.Index].Substring(0, n.Length))
    .Distinct()

score 5 · Accepted Answer

I doubt this is what you're looking for, but:

var result = list.Select(s => s.Substring(0, 6))
                 .Distinct();
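
If the common part really is always the first six characters, a grouping makes the prefix-to-members mapping from the question explicit. The following is only a minimal sketch under that assumption; `FixedPrefixSketch`, `GroupByPrefix` and `prefixLength` are illustrative names that do not appear in the answers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FixedPrefixSketch
{
    // Assumption: the "common part" is simply the first prefixLength characters.
    public static Dictionary<string, List<string>> GroupByPrefix(
        IEnumerable<string> items, int prefixLength)
    {
        return items
            .Where(s => s.Length >= prefixLength)        // skip entries too short to carry the prefix
            .GroupBy(s => s.Substring(0, prefixLength))  // one group per distinct prefix
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var list = new List<string>
        {
            "E9215001", "E9215045", "E1115001", "E1115022", "E1115003",
            "E2115041", "E2115042", "E4115021", "E5115062"
        };

        foreach (var pair in GroupByPrefix(list, 6))
        {
            string members = string.Join(", ", pair.Value);
            Console.WriteLine(pair.Key + " -> {" + members + "}");
        }
    }
}
```

Unlike a search for a shared prefix, this keeps prefixes that only one entry has (E41150 and E51150 in the question's data), because it never compares the entries with each other.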

score 1 · Accepted Answer

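I wasn't sure what your criteria for determining a match are, so I wrote this. It is entirely novel, and I'm 99.9999% certain it isn't actually what you want.

Basically, the outer select takes a substring of the determined length from every item.

The first inner select determines, for each string, the maximum length of its prefix that is also found in at least one other string in the list.

The group by (following the first inner select) groups the found lengths by themselves.

That grouping is then converted into a dictionary of length versus number of times found.

Next, we order that set of groupings by how often each length was found (Value), ascending.

Then we take the actual length (the least frequently occurring one, from Key) and feed it back as the second parameter of Substring, which takes the substring from 0 to that length. Of course, we're now back in the outer select, so we're actually getting values (hooray!).

Finally, we take the distinct set of values from that result, and voilà!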

list.Select(
    item => item.Substring(0, 
        list.Select(
            innerItem => Enumerable.Range(1, innerItem.Length)
                           .Reverse()
                           .First(proposedlength => list.Count(innerInnerItem => innerInnerItem.StartsWith(innerItem.Substring(0, proposedlength))) > 1)
                   )
            .GroupBy(length => length)
            .ToDictionary(grouping => grouping.Key, grouping => grouping.Count())
            .OrderBy(pair => pair.Value)
            .Select(pair => pair.Key)
            .First())
        ).Distinct()
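
The steps described above can also be unrolled into named pieces, which makes each stage easier to inspect. This is a sketch rather than the answer's exact code: `RarestLengthSketch`, `SharedPrefixLength` and `RarestLength` are hypothetical names, the dictionary step is folded into the grouping, and `FirstOrDefault` replaces `First` so that an item sharing no prefix yields 0 instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RarestLengthSketch
{
    // Length of the longest prefix of item that at least two entries start with; 0 if none.
    public static int SharedPrefixLength(List<string> list, string item)
    {
        return Enumerable.Range(1, item.Length)
            .Reverse()
            .FirstOrDefault(n => list.Count(other =>
                other.StartsWith(item.Substring(0, n), StringComparison.Ordinal)) > 1);
    }

    // The shared-prefix length that occurs least often across the list.
    // Throws InvalidOperationException on an empty list, like First() in the original.
    public static int RarestLength(List<string> list)
    {
        return list.Select(item => SharedPrefixLength(list, item))
            .GroupBy(length => length)           // group equal lengths together
            .OrderBy(group => group.Count())     // least frequent length first
            .Select(group => group.Key)
            .First();
    }

    public static void Main()
    {
        var list = new List<string>
        {
            "E9215001", "E9215045", "E1115001", "E1115022", "E1115003",
            "E2115041", "E2115042", "E4115021", "E5115062"
        };

        int cut = RarestLength(list); // computed once instead of once per item
        var result = list.Select(item => item.Substring(0, Math.Min(cut, item.Length)))
                         .Distinct();
        Console.WriteLine(cut + ": " + string.Join(", ", result));
    }
}
```

For the list in the question the per-item lengths are 6, 6, 7, 6, 7, 7, 7, 1, 1, so the rarest length is 1 and every item is cut down to "E", which matches the answer's own warning that this is probably not the desired result.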

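After reading the comments above, I see there is also interest in finding, for each term, the distinct longest substring that is present in any of the others. Here is some more novel code for that: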

list.Select((item, index) => new {
    Index=index, 
    Length=Enumerable.Range(1, item.Length)
                     .Reverse()
                     .First(proposedLength => list.Count(innerItem => innerItem.StartsWith(item.Substring(0, proposedLength))) > 1)
}).Select(n => list[n.Index].Substring(0, n.Length))
  .Distinct()

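In other words, iterate over each item in the list and collect the entry's index together with the length of the longest substring, starting from the beginning of that element, that can be found in at least one other entry in the list. After that, collect the substring for each index/length pair and keep only the distinct set of strings.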
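
One thing to watch with this approach is that `First` throws an `InvalidOperationException` when no prefix of an item is shared with another entry. A possible way to guard against that, sketched here with the hypothetical names `SharedPrefixSketch`, `LongestSharedPrefix` and `maxLength` (none of which come from the thread), is to use `FirstOrDefault` and treat a length of 0 as "no shared prefix":

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SharedPrefixSketch
{
    // Longest prefix of item (at most maxLength characters) that at least two
    // entries of list start with. Returns "" when no prefix is shared.
    public static string LongestSharedPrefix(List<string> list, string item, int maxLength)
    {
        int limit = Math.Max(0, Math.Min(maxLength, item.Length));
        int length = Enumerable.Range(1, limit)
            .Reverse() // try the longest candidate first
            .FirstOrDefault(n => list.Count(other =>
                other.StartsWith(item.Substring(0, n), StringComparison.Ordinal)) > 1);
        return item.Substring(0, length); // length is 0 when nothing matched
    }

    public static void Main()
    {
        var list = new List<string>
        {
            "E9215001", "E9215045", "E1115001", "E1115022", "E1115003",
            "E2115041", "E2115042", "E4115021", "E5115062"
        };

        var prefixes = list.Select(item => LongestSharedPrefix(list, item, 6)).Distinct();
        Console.WriteLine(string.Join(", ", prefixes));
    }
}
```

With `maxLength` set to 6, the question's list yields E92150, E11150 and E21150, while E4115021 and E5115062 fall back to "E" because that is the only prefix they share with another entry.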

score 1 · Accepted Answer

Does it have to be inline query syntax? If so, how about:

var result =
    from item in list
    select item.Substring(0,6);

Or, with the Distinct requirement:

var result =
    (
        from item in list
        select item.Substring(0,6)
    )
    .Distinct();

score 0 · Accepted Answer

Solved! Thanks to @mlorbetske and @shelleybutterfly.

list.Select((item, index) => new {
        Index = index,
        // I don't need the last 2 chars, so I'm ignoring them
        Length = Enumerable.Range(1, item.Length - 2)
                           .Reverse()
                           .First(proposedLength =>
                               list.Count(innerItem => innerItem.StartsWith(item.Substring(0, proposedLength))) > 1)
    })
    .Select(n => list[n.Index].Substring(0, n.Length))
    .Distinct()
