c# - Regular expression for a file name

Question

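I have a dwg file named SMITH 3H FINAL 03-26-2012.dwg, and I am trying to find the right regular expression to validate it. We get hundreds of files every week, so I need to check that each file name follows the correct format. I know very little about regular expressions. I found the code below, but my file name does not pass as valid. If I am reading the first line correctly, the pattern requires a comma in the file name; is that why it fails?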

string filenamePattern = String.Concat("^",
    "([a-z',-.]+\\s+)+",  // HARRIS, SMITH
    "(\\d{1,2}-\\d{1,2}){1}\\s+",  // 09-06
    "([a-z]+\\s)*",  //
    "((\\#?\\s*(\\d(\\s*|,))*\\d*-\\d+-?H?D?\\d*?),*\\s+(&\\s)*)+",  // #5,6-11H & #4,7,8-11H2, etc
    "([a-z()-]+\\s)*",  // CLIP-OUT (FINAL)
    "(\\d{1,2}-\\d{1,2}(-\\d{2}|-\\d{4})){1}",  // 05-11-2009
    "\\.dwg", // .dwg
    "$");
RegexOptions options = (RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline | RegexOptions.IgnoreCase);
Regex reg = new Regex(filenamePattern, options);
if (reg.IsMatch(filename))
{
    valid = true;
}

score 3 · Accepted Answer

Based on your comments on the other answer, try:

^[a-z]+(?:[ -][a-z]+)*\s+\d+H\s+[a-z]+\s+\d{2}-\d{2}-\d{4}\.dwg$

Explanation:

The regular expression:

(?-imsx:^[a-z]+(?:[ -][a-z]+)*\s+\d+H\s+[a-z]+\s+\d{2}-\d{2}-\d{4}\.dwg$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  [a-z]+                   any character of: 'a' to 'z' (1 or more
                           times (matching the most amount possible))
----------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------
    [ -]                     any character of: ' ', '-'
----------------------------------------------------------------------
    [a-z]+                   any character of: 'a' to 'z' (1 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )*                       end of grouping
----------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
----------------------------------------------------------------------
  H                        'H'
----------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  [a-z]+                   any character of: 'a' to 'z' (1 or more
                           times (matching the most amount possible))
----------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  \d{2}                    digits (0-9) (2 times)
----------------------------------------------------------------------
  -                        '-'
----------------------------------------------------------------------
  \d{2}                    digits (0-9) (2 times)
----------------------------------------------------------------------
  -                        '-'
----------------------------------------------------------------------
  \d{4}                    digits (0-9) (4 times)
----------------------------------------------------------------------
  \.                       '.'
----------------------------------------------------------------------
  dwg                      'dwg'
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
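
A minimal sketch of how this pattern might be used from C#. The pattern is written with lowercase `[a-z]` while the sample file name is uppercase, so the sketch assumes `RegexOptions.IgnoreCase`, as in the question's code. The class and method names (`FileNameValidator`, `IsValid`) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameValidator
{
    // Pattern from the answer above. IgnoreCase is an assumption added here
    // so that the lowercase [a-z] classes also accept SMITH, FINAL, etc.
    private static readonly Regex Pattern = new Regex(
        @"^[a-z]+(?:[ -][a-z]+)*\s+\d+H\s+[a-z]+\s+\d{2}-\d{2}-\d{4}\.dwg$",
        RegexOptions.IgnoreCase);

    // Returns true when the whole file name has the expected shape.
    public static bool IsValid(string filename)
    {
        return Pattern.IsMatch(filename);
    }
}
```

With this setup, `FileNameValidator.IsValid("SMITH 3H FINAL 03-26-2012.dwg")` returns true, while a name with a one-digit month such as `SMITH 3H FINAL 3-26-2012.dwg` is rejected because the pattern requires exactly two digits there.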

score 1 · Accepted Answer

This is how I would do it:

// Matches the name (\w*), then a space, then 3H (\w{2}, exactly two
// word characters), then a space, then a word such as FINAL (\w*), then a space,
// then a date in the form mm-dd-yyyy or dd-mm-yyyy (\d{2}-\d{2}-\d{4}).
Regex reg = new Regex(@"(\w*)\s(\w{2})\s(\w*)\s(\d{2}-\d{2}-\d{4})\.dwg");
if (reg.IsMatch(filename))
{
    valid = true;
}

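You can also retrieve each group. Note that I did not include a regex to validate the class period (or whatever the "#5,6-11H & #4,7,8-11H2, etc." part represents). This gives you a basic framework: you can pull that group out and check it in code, which keeps the regex cleaner.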
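
The "pull the group out and check it in code" idea could be sketched as follows: the regex checks only the overall shape, and the date group is then validated separately with `DateTime.TryParseExact`. The `^`/`$` anchors, the named groups, the use of `+` instead of `*`, the assumed `MM-dd-yyyy` order, and the names `DwgFileName` and `TryParse` are additions for illustration, not part of the answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DwgFileName
{
    // Same shape as the regex above, with anchors and named groups added
    // (assumptions for this sketch).
    private static readonly Regex Pattern = new Regex(
        @"^(?<name>\w+)\s(?<period>\w{2})\s(?<status>\w+)\s(?<date>\d{2}-\d{2}-\d{4})\.dwg$");

    // Returns true only when the name has the expected shape AND the date
    // group is a real calendar date in MM-dd-yyyy order (assumed).
    public static bool TryParse(string filename, out string name, out DateTime date)
    {
        name = string.Empty;
        date = default(DateTime);

        Match m = Pattern.Match(filename);
        if (!m.Success)
        {
            return false;
        }

        name = m.Groups["name"].Value;
        return DateTime.TryParseExact(
            m.Groups["date"].Value, "MM-dd-yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out date);
    }
}
```

Checking the date in code catches names like `SMITH 3H FINAL 13-45-2012.dwg`, which fit `\d{2}-\d{2}-\d{4}` but are not valid dates.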

Edit:

Based on @DaBears's needs, I came up with the following.

Regex reg = new Regex(@"(\w*|\w*-\w*|\w*\s\w*)\s(\w{2})\s(\w*)\s(\d{2}-\d{2}-\d{4})\.dwg");
if (reg.IsMatch(filename))
{
    valid = true;
}

This matches a single last name, a hyphenated name, or a last name containing a space, and gives you whatever is in each group.
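
As a rough check of the three name shapes, the sketch below reads group 1 from the edited pattern. The wrapper names (`NameGroupDemo`, `TryExtractName`) are hypothetical; the pattern itself is copied unchanged from the edit above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameGroupDemo
{
    // Pattern copied from the edit above (note: no ^ or $ anchors).
    private static readonly Regex Pattern = new Regex(
        @"(\w*|\w*-\w*|\w*\s\w*)\s(\w{2})\s(\w*)\s(\d{2}-\d{2}-\d{4})\.dwg");

    // On a match, sets name to the text captured by group 1 and returns true.
    public static bool TryExtractName(string filename, out string name)
    {
        Match m = Pattern.Match(filename);
        name = m.Success ? m.Groups[1].Value : string.Empty;
        return m.Success;
    }
}
```

For `SMITH 3H FINAL 03-26-2012.dwg`, `HARRIS-SMITH 3H FINAL 03-26-2012.dwg`, and `VAN DYKE 3H FINAL 03-26-2012.dwg`, group 1 comes back as `SMITH`, `HARRIS-SMITH`, and `VAN DYKE` respectively. Because the pattern has no anchors, it also succeeds on a name with trailing text such as `SMITH 3H FINAL 03-26-2012.dwg.bak`; wrapping it in `^` and `$` is one way to reject that if strict validation is needed.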
