regex - Comparing two lists in Excel with VBA regex

Question

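I want to use regular expressions in VBA to compare two lists (columns) in Excel and find the matches. This is a fairly involved manipulation. I have done it in the past with several different (non-VBA) Excel functions, but found that awkward at best, so I wanted to try an all-in-one VBA solution, if possible.

The first column has names with irregularities (quoted nicknames, suffixes such as "jr" or "sr", parentheses around a "preferred" version of the first name, and so on). In addition, when middle names are present, they may be either a name or an initial.

The order in the first column is: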

 <first name or initial>
 <space>
 <any parenthetical 'preferred' names - if they exist>
 <space>
 <middle name or initial - if it exists>
 <space>
 <quoted nickname or initial - if it exists>
 <space>
 <last name>
 <comma - if necessary><space - if necessary><suffix - if it exists>

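The order in the second column is:

 `<lastname><space><suffix>,<firstname><space><middle name, if it exists>`

with none of the "irregularities" found in the first column.

My main objective is to "clean up" the first column into this order:

 `lastname-space-suffix,firstname-space-preferred name-space-middle name-space-nickname`

I am keeping the "irregularities" here, but I may use some kind of "flags" in my comparison code to alert me to them on a case-by-case basis.

I have been trying several patterns, and this is my most recent: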

["]?([A-Za-z]?)[.]?["]?[.]?[\s]?[,]?[\s]?

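However, I would like to allow for the last name and the suffix (if it exists). I have tested it with "Global" set, but I can't figure out how to separate the last name from the suffix, for example by using back references.

I would then like to compare last name, first name, and middle initial between the two lists (since most of the middle names in the first list are only initials).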

 An example would be:
 (1st list)
 John (Johnny) B. "Abe" Smith, Jr.
 turned into:
 Smith Jr,John (Johnny) B "Abe"
 or
 Smith Jr,John B

 and
 (2nd list)
 Smith Jr,John Bertrand
 turned into:
 Smith Jr,John B

 Then run a comparison between the two columns.

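What would be a starting, or continuing, point for this list comparison?

Addendum, April 10, 2012:

As a side note, I will need to eliminate the quotes from the nicknames and the parentheses from the preferred names. Could I break the grouped references down into further sub-groups (as in the example below)?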

 (?:  ([ ] \( [^)]* \)))?  # (2) parenthetical 'preferred' name (optional) 
 (?:  ([ ] (["'] ) .*?) \6 )? # (5,6) quoted nickname or initial (optional)

Can they be grouped like this?

 (?:(([ ])(\()([^)]*)(\))))? # (2) parenthetical 'preferred' name (optional) 
 not sure how to do this one -  # (5,6) quoted nickname or initial (optional)

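I tried them in "RegexCoach" and "RegExr" and they worked fine, but in VBA, when I wanted the back references returned, such as \11,\5, all that came back was the first name, a number, and a comma (e.g. "Carl1"). I'll go back and check for typos. Thanks for any help.

Addendum, April 17, 2012:

There was one name "situation" that I overlooked: last names made up of two or more words, e.g. "St Cyr" or "Von Wilhelm".
Would the following addition

 ((St|Von)[ ])?

work in this regex that you provided?

 ((St|Von)[ ])?([^\,()"']+)

My testing in Regex Coach and RegExr has not quite worked, in that the replacement returns "St" with a preceding space.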

score 2 · Accepted Answer

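Here is a regex that might help. It gives you six capture groups in this order: first name, preferred name, middle name, nickname, last name, suffix. The character classes are lowercase only, so it needs case-insensitive matching (IgnoreCase = True on the VBScript RegExp object).

([a-z]+)\.?\s(?:(\([a-z]+\))\s)?(?:([a-z]+)\.?\s)?(?:("[a-z]+")\s)?([a-z]+)(?:,\s([a-z]+))?

Here is the explanation: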

([a-z]+)\.?\s          # First name, followed by optional '.' (required)
(?:(\([a-z]+\))\s)?    # Preferred name, optional
(?:([a-z]+)\.?\s)?     # Middle name, optional
(?:("[a-z]+")\s)?      # Nickname, optional
([a-z]+)               # Last name, required
(?:,\s([a-z]+))?       # Suffix, optional

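So, for example, you can convert John (Johnny) B. "Abe" Smith, Jr. into Smith Jr,John (Johnny) B "Abe" by combining the groups as \5 \6,\1 \2 \3 \4, or into Smith Jr,John B by using \5 \6,\1 \3. (In VBA's RegExp.Replace the references are written with $ instead, e.g. $5 $6,$1 $3.)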
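As a quick check of the group numbering, here is a minimal sketch in Python rather than VBA, so that it can be run outside Excel. Python's re module stands in for VBScript.RegExp, re.IGNORECASE plays the role of IgnoreCase = True, and the helper name rearrange is made up for this illustration.

```python
import re

# The six-group pattern from the answer above; [a-z] needs case-insensitive matching.
NAME_RX = re.compile(
    r'([a-z]+)\.?\s(?:(\([a-z]+\))\s)?(?:([a-z]+)\.?\s)?'
    r'(?:("[a-z]+")\s)?([a-z]+)(?:,\s([a-z]+))?',
    re.IGNORECASE,
)

def rearrange(name, template):
    """Hypothetical helper: rebuild a name from the captured groups.

    Groups: 1 first, 2 (preferred), 3 middle, 4 "nickname", 5 last, 6 suffix.
    Optional groups that did not take part in the match expand to ''.
    """
    m = NAME_RX.match(name)
    if m is None:
        return None          # the name does not fit this layout
    return m.expand(template)

raw = 'John (Johnny) B. "Abe" Smith, Jr.'
print(rearrange(raw, r'\5 \6,\1 \2 \3 \4'))   # keeps preferred name and nickname
print(rearrange(raw, r'\5 \6,\1 \3'))         # last, suffix, first, middle only
```

When an optional part is missing, the fixed spaces in the template remain in the result, so a whitespace clean-up afterwards is probably still needed.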

score 2 · Accepted Answer

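Redo -

This is a different approach. It might work in VBA, but it is only an example. I tested it in Perl and it worked well. I won't show the Perl code, though,
just the regexes and some explanation.

This is a two-step process:

1. Normalize the column text
2. Do the main parsing

Normalization process

1. Get a column value
2. Strip out all dots: globally search for \. and replace with nothing ('')
3. Turn whitespace into spaces: globally search for \s+ and replace with a single space ([ ])

(Note that if the text can't be normalized, I don't see much chance of success no matter what is tried.)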
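The normalization steps can be sketched as follows. This is an illustration in Python rather than VBA, so that it can be run outside Excel; the function name normalize is an assumption of this sketch, not something from the answer.

```python
import re

def normalize(value):
    """Strip every dot, then collapse each whitespace run to one space."""
    without_dots = value.replace('.', '')      # global replace of \. with ''
    return re.sub(r'\s+', ' ', without_dots)   # global replace of \s+ with ' '

print(repr(normalize('John (Johnny)   Bertrand "Abe"   Smith, Jr.  ')))
print(repr(normalize('Smith Jr.,John Bertrand  ')))
```

Dots are removed first, so that a dot standing alone between two spaces cannot leave a double space behind. A single leading or trailing space can survive normalization; the parsing regexes allow for that.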

Main parsing process

After normalizing the column value (do this for both columns), run it through these regexes.

Column 1 regex

^
  [ ]?
  ([^\ ,()"']+)                        # (1)     first name or initial          (required)
  (?:  ([ ] \( [^)]* \))    )?         # (2)     parenthetical 'preferred' name (optional)
  (?:
       ([ ] [^\ ,()"'] )               # (3,4)   middle initial OR name         (optional)
       ([^\ ,()"']*)                   #         name and initial are both captured
  )?
  (?:  ([ ] (["'] ) .*?) \6 )?         # (5,6)   quoted nickname or initial     (optional)
  [ ]  ([^\ ,()"']+)                   # (7)     last name                      (required)
  (?:
        [, ]* ([ ].+?) [ ]?            # (8)     suffix                         (optional)
      | .*?
  )?
$

The replacement depends on what you want.
Three types are defined (replace $ with \ as needed):

Type 1a full middle - $7$8,$1$2$3$4$5$6
Type 1b middle initial - $7$8,$1$2$3$5$6
Type 2 middle initial - $7$8,$1$3

Conversion example:

Input (raw)               = 'John (Johnny) Bertrand "Abe" Smith, Jr.  '
Out type 1 full middle    = 'Smith Jr,John (Johnny) Bertrand "Abe"'
Out type 1 middle initial = 'Smith Jr,John (Johnny) B "Abe"'
Out type 2 middle initial = 'Smith Jr,John B'
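To see the three replacement types side by side, here is a small sketch in Python rather than VBA. The pattern is the column 1 regex above written on one line; the names COL1_RX, TYPES and convert_col1 are assumptions of this sketch.

```python
import re

# Column 1 pattern, compact form (no free-spacing, so the space needs no backslash).
COL1_RX = re.compile(
    r'''^[ ]?([^ ,()"']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()"'])([^ ,()"']*))?'''
    r'''(?:([ ](["']).*?)\6)?[ ]([^ ,()"']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$'''
)

# Python writes \7 where the VBA replacement string writes $7.
TYPES = {
    '1a': r'\7\8,\1\2\3\4\5\6',   # full middle name
    '1b': r'\7\8,\1\2\3\5\6',     # middle initial, irregularities kept
    '2':  r'\7\8,\1\3',           # middle initial only
}

def convert_col1(raw, kind):
    """Normalize a column 1 value, then rearrange it with one replacement type."""
    normal = re.sub(r'\s+', ' ', raw.replace('.', ''))
    # If the pattern does not match, the normalized text comes back unchanged.
    return COL1_RX.sub(TYPES[kind], normal, count=1)

raw = 'John (Johnny) Bertrand "Abe" Smith, Jr.  '
for kind in ('1a', '1b', '2'):
    print(kind, repr(convert_col1(raw, kind)))
```

The captured groups carry their own leading space, which is why the templates contain no spaces of their own.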

Column 2 regex

^
  [ ]?
  ([^\ ,()"']+)                  # (1)     last name                      (required)
  (?: ([ ] [^\ ,()"']+) )?       # (2)     suffix                         (optional)
  ,
  ([^\ ,()"']+)                  # (3)     first name or initial          (required)
  (?:
      ([ ] [^\ ,()"'])           # (4,5)   middle initial OR name         (optional)
      ([^\ ,()"']*)
  )?
  .*
$

The replacement depends on what you want. Two types are defined
(replace $ with \ as needed):

Type 1a full middle - $1$2,$3$4$5
Type 1b middle initial - $1$2,$3$4

Conversion example:

Input                     = 'Smith Jr.,John Bertrand  '
Out type 1 full middle    = 'Smith Jr,John Bertrand'
Out type 1 middle initial = 'Smith Jr,John B'

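VBA replacement help

This works in a very old copy of Excel, after creating a VBA project.
These are two modules created to show an example.
They both do the same thing.

The first is a verbose example of all the possible replacement types.
The second is a trimmed-down version using just a type 2 comparison.

As you can tell, I haven't done VB before, but it should be simple enough for you to glean how the replacements work and how to tie them into the Excel
columns.

If you are doing a simple compare, you would convert a column 1 value
once, check each value in column 2 against it, then
move on to the next value in column 1 and repeat.

Probably the fastest way to do this is to create two extra columns, convert the respective
column values to type 2 (variables strC1_2 and strC2_2, see the example), and
copy them into the new columns.
After that you don't need regex at all: just compare the columns, find the matching rows, and then
delete the type 2 columns.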
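The helper-column idea can be sketched like this, in Python rather than VBA so that it can be run outside Excel. Each value is converted to its type 2 key once, and rows are then matched by key. The function names and the sample rows are made up for this illustration.

```python
import re

COL1_RX = re.compile(
    r'''^[ ]?([^ ,()"']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()"'])([^ ,()"']*))?'''
    r'''(?:([ ](["']).*?)\6)?[ ]([^ ,()"']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$'''
)
COL2_RX = re.compile(
    r'''^[ ]?([^ ,()"']+)(?:([ ][^ ,()"']+))?,([^ ,()"']+)'''
    r'''(?:([ ][^ ,()"'])([^ ,()"']*))?.*$'''
)

def normalize(value):
    return re.sub(r'\s+', ' ', value.replace('.', ''))

def key_col1(value):
    return COL1_RX.sub(r'\7\8,\1\3', normalize(value), count=1)   # $7$8,$1$3

def key_col2(value):
    return COL2_RX.sub(r'\1\2,\3\4', normalize(value), count=1)   # $1$2,$3$4

def find_matches(col1, col2):
    """Return (row in col1, row in col2) pairs whose type 2 keys are equal."""
    rows_by_key = {}
    for j, value in enumerate(col2):               # the "extra column" for list 2
        rows_by_key.setdefault(key_col2(value), []).append(j)
    return [(i, j)
            for i, value in enumerate(col1)
            for j in rows_by_key.get(key_col1(value), [])]

col1 = ['John (Johnny) B. "Abe" Smith, Jr.', 'Mary Jones']
col2 = ['Doe,Jane', 'Smith Jr.,John Bertrand  ']
print(find_matches(col1, col2))
```

Looking keys up in a dictionary avoids comparing every column 1 value against every column 2 value. A value that matches neither pattern keeps its normalized text as its key, so it will simply fail to pair up.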

Verbose -

Sub RegexColumnValueComparison()

' Column 1 and 2 , Sample values
' These should probably be passed in values
' ============================================
strC1 = "John (Johnny)   Bertrand ""Abe""   Smith, Jr.  "
strC2 = "Smith Jr.,John Bertrand  "

' Normalization regexes for whitespace and periods
' (use for both column values)
' =============================================
Set rxDot = CreateObject("vbscript.regexp")
rxDot.Global = True
rxDot.Pattern = "\."
Set rxWSp = CreateObject("vbscript.regexp")
rxWSp.Global = True
rxWSp.Pattern = "\s+"

' Column 1 Regex
' ==================
Set rxC1 = CreateObject("vbscript.regexp")
rxC1.Global = False
rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ]([""']).*?)\6)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

' Column 2 Regex
' ==================
Set rxC2 = CreateObject("vbscript.regexp")
rxC2.Global = False
rxC2.Pattern = "^[ ]?([^ ,()""']+)(?:([ ][^ ,()""']+))?,([^ ,()""']+)(?:([ ][^ ,()""'])([^ ,()""']*))?.*$"

' Normalize column 1 and 2, Copy to new var
' ============================================
strC1_Normal = rxDot.Replace(rxWSp.Replace(strC1, " "), "")
strC2_Normal = rxDot.Replace(rxWSp.Replace(strC2, " "), "")


' ------------------------------------------------------
' This section is informational
' Shows some sample replacements before comparison
' Just pick 1 replacement from each column, discard the rest
' ------------------------------------------------------

' Create Some Replacement Types for Column 1
' =====================================================
strC1_1a = rxC1.Replace(strC1_Normal, "$7$8,$1$2$3$4$5$6")
strC1_1b = rxC1.Replace(strC1_Normal, "$7$8,$1$2$3$5$6")
strC1_2 = rxC1.Replace(strC1_Normal, "$7$8,$1$3")

' Create Some Replacement Types for Column 2
' =====================================================
strC2_1b = rxC2.Replace(strC2_Normal, "$1$2,$3$4$5")
strC2_2 = rxC2.Replace(strC2_Normal, "$1$2,$3$4")

' Show Types in Message Box
' =====================================================
c1_t1a = "Column1 Types:" & Chr(13) & "type 1a full middle    - " & strC1_1a
c1_t1b = "type 1b middle initial - " & strC1_1b
c1_t2 = "type 2 middle initial - " & strC1_2
c2_t1b = "Column2 Types:" & Chr(13) & "type 1 full middle     - " & strC2_1b   ' $1$2,$3$4$5 keeps the full middle name
c2_t2 = "type 2 middle initial - " & strC2_2

MsgBox (c1_t1a & Chr(13) & c1_t1b & Chr(13) & c1_t2 & Chr(13) & Chr(13) & c2_t1b & Chr(13) & c2_t2)

' ------------------------------------------------------
' Compare a Value from Column 1 vs Column 2
' For this we will compare Type 2 values
' ------------------------------------------------------
If strC1_2 = strC2_2 Then
   MsgBox ("Type 2 values are EQUAL: " & Chr(13) & strC1_2)
Else
   MsgBox ("Type 2 values are NOT Equal:" & Chr(13) & strC1_2 & " != " & strC2_2)
End If

' ------------------------------------------------------
' Same comparison (Type 2) of Normalized column 1,2 values
' In essence, this is all you need
' ------------------------------------------------------
If rxC1.Replace(strC1_Normal, "$7$8,$1$3") = rxC2.Replace(strC2_Normal, "$1$2,$3$4") Then
   MsgBox ("Type 2 values are EQUAL")
Else
   MsgBox ("Type 2 values are NOT Equal")
End If

End Sub

Type 2 only -

Sub RegexColumnValueComparison()

' Column 1 and 2 , Sample values
' These should probably be passed in values
' ============================================
strC1 = "John (Johnny)   Bertrand ""Abe""   Smith, Jr.  "
strC2 = "Smith Jr.,John Bertrand  "

' Normalization regexes for whitespace and periods
' (use for both column values)
' =============================================
Set rxDot = CreateObject("vbscript.regexp")
rxDot.Global = True
rxDot.Pattern = "\."
Set rxWSp = CreateObject("vbscript.regexp")
rxWSp.Global = True
rxWSp.Pattern = "\s+"

' Column 1 Regex
' ==================
Set rxC1 = CreateObject("vbscript.regexp")
rxC1.Global = False
rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ]([""']).*?)\6)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

' Column 2 Regex
' ==================
Set rxC2 = CreateObject("vbscript.regexp")
rxC2.Global = False
rxC2.Pattern = "^[ ]?([^ ,()""']+)(?:([ ][^ ,()""']+))?,([^ ,()""']+)(?:([ ][^ ,()""'])([^ ,()""']*))?.*$"

' Normalize column 1 and 2, Copy to new var
' ============================================
strC1_Normal = rxDot.Replace(rxWSp.Replace(strC1, " "), "")
strC2_Normal = rxDot.Replace(rxWSp.Replace(strC2, " "), "")

' Comparison (Type 2) of Normalized column 1,2 values
' ============================================
strC1_2 = rxC1.Replace(strC1_Normal, "$7$8,$1$3")
strC2_2 = rxC2.Replace(strC2_Normal, "$1$2,$3$4")

If strC1_2 = strC2_2 Then
   MsgBox ("Type 2 values are EQUAL")
Else
   MsgBox ("Type 2 values are NOT Equal")
End If

End Sub

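Parentheses/quotes response

As a side note, I will need to eliminate the quotes from the nicknames and the parentheses from the preferred names.

If I understand correctly..

Yes, you can capture the content inside the quotes and the parentheses separately.
It just requires a few modifications. The regex below has the
ability to formulate replacements with or without the quotes and/or parentheses,
or in other forms.

The samples below show ways to form the replacements.

A very important note here

If you are talking about removing the quotes "" and parentheses () from the matching regex
itself, that can be done as well. It would require a new regex.

The only problem is that all distinction between preferred/middle/nick gets
thrown out the window, because those fields are positional and
delimited (i.e. (preferred) middle "nick").

To account for that, you would need regex subexpressions like this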

(?:[ ]([^ ,]+))?   # optional preferred
(?:[ ]([^ ,]+))?   # optional middle
(?:[ ]([^ ,]+))?   # optional nick

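Also, since they are all optional, every positional reference is lost, which invalidates the
middle-initial expression.

End note

Regex template (used to build the replacement strings)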

^
  [ ]?

# (required) 
   # First
   #  $1  name
   # -----------------------------------------
    ([^\ ,()"']+)                 # (1) name     

# (optional)
   # Parenthetical 'preferred'
   #  $2    all
   #  $3$4  name
   # -----------------------------------------
    (?: (                         #  (2)   all  
           ([ ]) \( ([^)]*) \)    #  (3,4) space and name
        )
    )?  

# (optional)
  # Middle
  #   $5    initial
  #   $5$6  name
  # -----------------------------------------
    (?:  ([ ] [^\ ,()"'] )       #  (5) first character
         ([^\ ,()"']*)           #  (6) remaining characters

    )?                                   

# (optional)
   # Quoted nick                       
   #  $7$8$9$8  all
   #  $7$9      name
   # -----------------------------------------
    (?: ([ ])                    # (7) space
        (["'])                   # (8) quote
        (.*?)                    # (9) name
        \8 
    )?

# (required)
   #  Last
   #  $10  name
   # -----------------------------------------
    [ ] ([^\ ,()"']+)            # (10) name

# (optional)
   # Suffix 
   #  $11  suffix
   # -----------------------------------------
    (?:    [, ]* ([ ].+?) [ ]?   # (11) suffix
        |  .*?
    )?
$

VBA regex (second edition, tested in the VBA project above)

rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:(([ ])\(([^)]*)\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ])([""'])(.*?)\8)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"


strC1_1a  = rxC1.Replace( strC1_Normal, "$10$11,$1$2$5$6$7$8$9$8" )
strC1_1aa = rxC1.Replace( strC1_Normal, "$10$11,$1$3$4$5$6$7$9" )
strC1_1b  = rxC1.Replace( strC1_Normal, "$10$11,$1$2$5$7$8$9$8" )
strC1_1bb = rxC1.Replace( strC1_Normal, "$10$11,$1$3$4$5$7$9" )
strC1_2   = rxC1.Replace( strC1_Normal, "$10$11,$1$5" )

Sample input/output possibilities

Input (raw)                 = 'John (Johnny) Bertrand "Abe" Smith, Jr.  '

Out type 1a  full middle    = 'Smith Jr,John (Johnny) Bertrand "Abe"'
Out type 1aa full middle    = 'Smith Jr,John Johnny Bertrand Abe'
Out type 1b  middle initial = 'Smith Jr,John (Johnny) B "Abe"'
Out type 1bb middle initial = 'Smith Jr,John Johnny B Abe'
Out type 2   middle initial = 'Smith Jr,John B'

Input (raw)                 = 'John  (Johnny)  Smith, Jr.'

Out type 1a  full middle    = 'Smith Jr,John (Johnny)'
Out type 1aa full middle    = 'Smith Jr,John Johnny'
Out type 1b  middle initial = 'Smith Jr,John (Johnny)'
Out type 1bb middle initial = 'Smith Jr,John Johnny'
Out type 2   middle initial = 'Smith Jr,John'


Input (raw)                 = 'John  (Johnny)  "Abe" Smith, Jr.'

Out type 1a  full middle    = 'Smith Jr,John (Johnny) "Abe"'
Out type 1aa full middle    = 'Smith Jr,John Johnny Abe'
Out type 1b  middle initial = 'Smith Jr,John (Johnny) "Abe"'
Out type 1bb middle initial = 'Smith Jr,John Johnny Abe'
Out type 2   middle initial = 'Smith Jr,John'


Input (raw)                 = 'John   "Abe" Smith, Jr.'

Out type 1a  full middle    = 'Smith Jr,John "Abe"'
Out type 1aa full middle    = 'Smith Jr,John Abe'
Out type 1b  middle initial = 'Smith Jr,John "Abe"'
Out type 1bb middle initial = 'Smith Jr,John Abe'
Out type 2   middle initial = 'Smith Jr,John'
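For anyone who wants to try the second-edition pattern outside Excel, here is a minimal sketch in Python. The pattern is the same as the VBA string above; \g<10> is Python's spelling of $10, and the names COL1_V2_RX, TYPES_V2 and convert_v2 are assumptions of this sketch.

```python
import re

COL1_V2_RX = re.compile(
    r'''^[ ]?([^ ,()"']+)(?:(([ ])\(([^)]*)\)))?(?:([ ][^ ,()"'])([^ ,()"']*))?'''
    r'''(?:([ ])(["'])(.*?)\8)?[ ]([^ ,()"']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$'''
)

TYPES_V2 = {
    '1a':  r'\g<10>\g<11>,\1\2\5\6\7\8\9\8',   # parentheses and quotes kept
    '1aa': r'\g<10>\g<11>,\1\3\4\5\6\7\9',     # parentheses and quotes dropped
    '2':   r'\g<10>\g<11>,\1\5',               # first name and middle initial only
}

def convert_v2(raw, kind):
    """Normalize a column 1 value, then apply one second-edition replacement."""
    normal = re.sub(r'\s+', ' ', raw.replace('.', ''))
    return COL1_V2_RX.sub(TYPES_V2[kind], normal, count=1)

for raw in ('John (Johnny) Bertrand "Abe" Smith, Jr.  ', 'John   "Abe" Smith, Jr.'):
    for kind in ('1a', '1aa', '2'):
        print(kind, repr(convert_v2(raw, kind)))
```

Capturing the space, the delimiter and the name in separate groups is what makes both the "with" and the "without" forms possible from a single match.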

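Re: 4/17 concern

last names that have 2 or more words. Would the allowance for certain literal names, rather than generic word patterns, be the solution?

Actually, no, it wouldn't. In this case, allowing multiple words in the last name for your form
injects the space field delimiter into the last name field.

However, for your particular form it can be done, since the only handicap is
when the "nick" field is missing. In that case, given that there is only one word in the middle name position,
two permutations are presented.

Hopefully you can get the solution from the three regexes and the test case output below. The regexes have dropped the space delimiters from the captures.
So you can either compose the replacements with the Replace method, or just store the capture buffers
and compare them with the results of the other column's capture scenario.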

Nick_rx.Pattern (template)

* This pattern is multi-word last name, NICK is required 

^
  [ ]?

   # First (req'd)
    ([^\ ,()"']+)              # (1) first name

   # Preferred first
    (?: [ ]
       (                       # (2) (preferred), -or-
         \( ([^)]*?) \)        # (3) preferred
       )
    )?  

   # Middle
    (?: [ ]
        (                      # (4) full middle, -or-
          ([^\ ,()"'])         # (5) initial
          [^\ ,()"']*
        )
    )?

   # Quoted nick (req'd)
     [ ]
     (                         # (6) "nick",
       (["'])   # (7) n/a        -or-
       (.*?)                   # (8)  nick
       \7
     )

   # Single/Multi Last (req'd)
    [ ]
    (                          # (9) multi/single word last name
      [^\ ,()"']+
      (?:[ ][^\ ,()"']+)*
    )

   # Suffix 
    (?: [ ]? , [ ]? (.*?) )?   # (10) suffix

  [ ]?
$

-----------------------------------

FLs_rx.Pattern (template)

* This pattern has no MIDDLE/NICK, is single-word last name,
* and has no permutations.

^
  [ ]?

   # First (req'd)
    ([^\ ,()"']+)              # (1) first name

   # Preferred first
    (?: [ ]
       (                       # (2) (preferred), -or-
         \( ([^)]*?) \)        # (3)  preferred
       )
    )?  

   # Single Last (req'd)
     [ ]
     ([^\ ,()"']+)             # (4) single word last name

   # Suffix 
    (?: [ ]? , [ ]? (.*?) )?   # (5) suffix

  [ ]?
$

-----------------------------------

FLm_rx.Pattern (template)

* This pattern has no NICK, is multi-word last name,
* and has 2 permutations.
* 1. Middle as part of Last name.
* 2. Middle is separate from Last name.

^
  [ ]?

   # First (req'd)
    ([^\ ,()"']+)              # (1) first name

   # Preferred first
    (?: [ ]
       (                       # (2) (preferred), -or-
         \( ([^)]*?) \)        # (3)  preferred
       )
    )?  

   # Multi Last (req'd)
    [ ]
    (                         # (4) Multi, as Middle + Last,
                              # -or-
       (?:                         # Middle
          (                        # (5) full middle, -or-
             ([^\ ,()"'])          # (6) initial
             [^\ ,()"']*
          )
          [ ]
       )
                                   # Last (req'd)
       (                           # (7) multi/single word last name
          [^\ ,()"']+
          (?:[ ][^\ ,()"']+)* 
       )
    )

   # Suffix 
    (?: [ ]? , [ ]? (.*?) )?   # (8) suffix

  [ ]?
$

-----------------------------------

These regexes are mutually exclusive and should be checked
in an If-ElseIf chain like this (pseudo code):

str_Normal = rxDot.Replace(rxWSp.Replace(str, " "), "")

If Nick_rx.Test(str_Normal) Then

     N_1a  = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $2 $4 $6 "), " ")
     N_1aa = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $3 $4 $8 "), " ")
     N_1b  = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $2 $5 $6 "), " ")
     N_1bb = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $3 $5 $8 "), " ")
     N_2   = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $5 "), " ")

     ' see test case results in output below

ElseIf FLs_rx.Test(str_Normal) Then

     FLs_1a  = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 $2 "), " ")
     FLs_1aa = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 $3 "), " ")
     FLs_2   = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 "), " ")

ElseIf FLm_rx.Test(str_Normal) Then

   ' Permutation 1: the middle word is part of the last name
     FLm1_1a  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 $2 "), " ")
     FLm1_1aa = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 $3 "), " ")
     FLm1_2   = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 "), " ")

   ' Permutation 2: the middle word is separate from the last name
     FLm2_1a  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $2 $5 "), " ")
     FLm2_1aa = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $3 $5 "), " ")
     FLm2_1b  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $2 $6 "), " ")
     FLm2_1bb = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $3 $6 "), " ")
     FLm2_2   = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $6 "), " ")

   ' At this point, the odds are that only one of these permutations will match
   ' a value in the other column.

Else

     ' The data could not be matched against a valid form

End If
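The same three-way check can be tried outside Excel with the sketch below, written in Python rather than VBA. The three patterns are compact forms of the Nick_rx, FLs_rx and FLm_rx templates above, assembled from shared fragments; the fragment names and the function type2_candidates are assumptions of this sketch, and only the type 2 replacements are shown.

```python
import re

W = '[^ ,()"\']'                            # one character of a name word
FIRST = f'^[ ]?({W}+)'                      # (1) first name
PREF = r'(?:[ ](\(([^)]*?)\)))?'            # (2) "(preferred)", (3) preferred
LAST_MULTI = f'({W}+(?:[ ]{W}+)*)'          # single- or multi-word last name
SUFFIX = r'(?:[ ]?,[ ]?(.*?))?[ ]?$'        # optional suffix, then end of text

# Nick: groups 4/5 middle/initial, 6/7/8 "nick"/quote/nick, 9 last, 10 suffix.
NICK_RX = re.compile(FIRST + PREF + f'(?:[ ](({W}){W}*))?'
                     + '[ ]((["\'])(.*?)\\7)[ ]' + LAST_MULTI + SUFFIX)
# FLs: group 4 single-word last name, 5 suffix.
FLS_RX = re.compile(FIRST + PREF + f'[ ]({W}+)' + SUFFIX)
# FLm: group 4 middle+last, 5/6 middle/initial, 7 last, 8 suffix.
FLM_RX = re.compile(FIRST + PREF + f'[ ]((?:(({W}){W}*)[ ])' + LAST_MULTI + ')' + SUFFIX)

def tidy(match, template):
    """Expand a template, then collapse whitespace runs left by empty groups."""
    return re.sub(r'\s+', ' ', match.expand(template))

def type2_candidates(raw):
    """Return (form name, list of type 2 strings); the multi form yields two."""
    s = re.sub(r'\s+', ' ', raw.replace('.', ''))
    m = NICK_RX.match(s)
    if m:
        return 'Nick', [tidy(m, r'\9 \g<10> , \1 \5 ')]
    m = FLS_RX.match(s)
    if m:
        return 'First-Last (single)', [tidy(m, r'\4 \5 , \1 ')]
    m = FLM_RX.match(s)
    if m:
        return 'First-Last (multi)', [tidy(m, r'\4 \8 , \1 '),      # permutation 1
                                      tidy(m, r'\7 \8 , \1 \6 ')]   # permutation 2
    return None, []                          # no valid form

print(type2_candidates('John7 Bert St Van Helsing ,Jr '))
```

For the multi-word form, both candidates would have to be looked up in the other column, since the regex alone cannot tell whether the first extra word is a middle name or part of the last name.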

-----------------------------

Test Cases

Found form 'Nick'
Input (raw)                 = 'John1 (JJ) Bert "nick" St Van Helsing ,Jr '
Normal                      = 'John1 (JJ) Bert "nick" St Van Helsing ,Jr '

Out type 1a  full middle    = 'St Van Helsing Jr , John1 (JJ) Bert "nick" '
Out type 1aa full middle    = 'St Van Helsing Jr , John1 JJ Bert nick '
Out type 1b  middle initial = 'St Van Helsing Jr , John1 (JJ) B "nick" '
Out type 1bb middle initial = 'St Van Helsing Jr , John1 JJ B nick '
Out type 2   middle initial = 'St Van Helsing Jr , John1 B '

=======================================================

Found form 'Nick'
Input (raw)                 = 'John2 Bert "nick" Helsing ,Jr '
Normal                      = 'John2 Bert "nick" Helsing ,Jr '

Out type 1a  full middle    = 'Helsing Jr , John2 Bert "nick" '
Out type 1aa full middle    = 'Helsing Jr , John2 Bert nick '
Out type 1b  middle initial = 'Helsing Jr , John2 B "nick" '
Out type 1bb middle initial = 'Helsing Jr , John2 B nick '
Out type 2   middle initial = 'Helsing Jr , John2 B '

=======================================================

Found form 'Nick'
Input (raw)                 = 'John3 Bert "nick" St Van Helsing ,Jr '
Normal                      = 'John3 Bert "nick" St Van Helsing ,Jr '

Out type 1a  full middle    = 'St Van Helsing Jr , John3 Bert "nick" '
Out type 1aa full middle    = 'St Van Helsing Jr , John3 Bert nick '
Out type 1b  middle initial = 'St Van Helsing Jr , John3 B "nick" '
Out type 1bb middle initial = 'St Van Helsing Jr , John3 B nick '
Out type 2   middle initial = 'St Van Helsing Jr , John3 B '

=======================================================

Found form 'First-Last (single)'
Input (raw)                 = 'John4 Helsing '
Normal                      = 'John4 Helsing '

Out type 1a  no middle      = 'Helsing  , John4  '
Out type 1aa no middle      = 'Helsing  , John4  '
Out type 2                  = 'Helsing  , John4 '

=======================================================

Found form 'First-Last (single)'
Input (raw)                 = 'John5 (JJ) Helsing '
Normal                      = 'John5 (JJ) Helsing '

Out type 1a  no middle      = 'Helsing  , John5 (JJ) '
Out type 1aa no middle      = 'Helsing  , John5 JJ '
Out type 2                  = 'Helsing  , John5 '

=======================================================

Found form 'First-Last (multi)'
Input (raw)                 = 'John6 (JJ) Bert St Van Helsing ,Jr '
Normal                      = 'John6 (JJ) Bert St Van Helsing ,Jr '

Permutation 1:
Out type 1a  no middle      = 'Bert St Van Helsing Jr , John6 (JJ) '
Out type 1aa no middle      = 'Bert St Van Helsing Jr , John6 JJ '
Out type 2                  = 'Bert St Van Helsing Jr , John6 '
Permutation 2:
Out type 1a  full middle    = 'St Van Helsing Jr , John6 (JJ) Bert '
Out type 1aa full middle    = 'St Van Helsing Jr , John6 JJ Bert '
Out type 1b  middle initial = 'St Van Helsing Jr , John6 (JJ) B '
Out type 1bb middle initial = 'St Van Helsing Jr , John6 JJ B '
Out type 2   middle initial = 'St Van Helsing Jr , John6 B '

=======================================================

Found form 'First-Last (multi)'
Input (raw)                 = 'John7 Bert St Van Helsing ,Jr '
Normal                      = 'John7 Bert St Van Helsing ,Jr '

Permutation 1:
Out type 1a  no middle      = 'Bert St Van Helsing Jr , John7 '
Out type 1aa no middle      = 'Bert St Van Helsing Jr , John7 '
Out type 2                  = 'Bert St Van Helsing Jr , John7 '
Permutation 2:
Out type 1a  full middle    = 'St Van Helsing Jr , John7 Bert '
Out type 1aa full middle    = 'St Van Helsing Jr , John7 Bert '
Out type 1b  middle initial = 'St Van Helsing Jr , John7 B '
Out type 1bb middle initial = 'St Van Helsing Jr , John7 B '
Out type 2   middle initial = 'St Van Helsing Jr , John7 B '

=======================================================

Form  ***  (unknown)
Input (raw)                 = ' do(e)s not. match ,'
Normal                      = ' do(e)s not match ,'

=======================================================
