python - Python: Splitting a mixed string

Question

I am reading lines from a file in the following format:

line = a   b  c  d,e,f    g   h  i,j,k,l   m   n

What I want are lines that contain no comma-separated elements, like this:

a   b  c  d    g   h  i   m   n 
a   b  c  d    g   h  j   m   n
a   b  c  d    g   h  k   m   n
a   b  c  d    g   h  l   m   n
a   b  c  e    g   h  i   m   n
a   b  c  e    g   h  j   m   n
a   b  c  e    g   h  k   m   n
a   b  c  e    g   h  l   m   n
.   .  .  .    .   .  .   .   .
.   .  .  .    .   .  .   .   .

First I split line:

sline = line.split()

Now I iterate over sline and look for elements that can be split using "," as the separator. The problem is that I don't always know in advance how many such elements to expect. Any ideas?

score 3 · Accepted Answer

Using regex, itertools.product, and some string formatting:

This solution also preserves the original spacing.

>>> import re
>>> from itertools import product
>>> line = 'a   b  c  d,e,f    g   h  i,j,k,l   m   n'
>>> items = [x[0].split(',') for x in re.findall(r'((\w+,)+\w+)',line)]
>>> strs = re.sub(r'((\w+,)+\w+)','{}',line)
>>> for prod in product(*items):
...     print (strs.format(*prod))
...     
a   b  c  d    g   h  i   m   n
a   b  c  d    g   h  j   m   n
a   b  c  d    g   h  k   m   n
a   b  c  d    g   h  l   m   n
a   b  c  e    g   h  i   m   n
a   b  c  e    g   h  j   m   n
a   b  c  e    g   h  k   m   n
a   b  c  e    g   h  l   m   n
a   b  c  f    g   h  i   m   n
a   b  c  f    g   h  j   m   n
a   b  c  f    g   h  k   m   n
a   b  c  f    g   h  l   m   n
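The same idea can be wrapped in a reusable function. The following is only a sketch and not part of the original answer: the name `expand_line` is hypothetical, a non-capturing group is used so that `re.findall` returns whole matches, and literal braces are escaped first on the assumption that input lines might contain `{` or `}`, which `str.format` would otherwise misread.

```python
import re
from itertools import product

# One or more word tokens joined by commas, e.g. 'd,e,f'
GROUP = re.compile(r'\w+(?:,\w+)+')

def expand_line(line):
    """Yield one line per combination of the comma-separated alternatives."""
    # Each match such as 'd,e,f' becomes a list of alternatives ['d', 'e', 'f'].
    items = [m.split(',') for m in GROUP.findall(line)]
    # Escape literal braces so str.format leaves them alone, then
    # replace every comma group with a '{}' placeholder.
    template = GROUP.sub('{}', line.replace('{', '{{').replace('}', '}}'))
    for combo in product(*items):
        yield template.format(*combo)

for out in expand_line('a   b  c  d,e,f    g   h  i,j   m'):
    print(out)
```

Because the function is a generator, the combinations are produced one at a time, which may help when a line has many groups.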

Another example:

>>> line = 'a   b  c  d,e,f    g   h  i,j,k,l   m   n q,w,e,r  f o   o'
>>> items = [x[0].split(',') for x in re.findall(r'((\w+,)+\w+)',line)]
>>> strs = re.sub(r'((\w+,)+\w+)','{}',line)
>>> for prod in product(*items):
...     print (strs.format(*prod))
...     
a   b  c  d    g   h  i   m   n q  f o   o
a   b  c  d    g   h  i   m   n w  f o   o
a   b  c  d    g   h  i   m   n e  f o   o
a   b  c  d    g   h  i   m   n r  f o   o
a   b  c  d    g   h  j   m   n q  f o   o
a   b  c  d    g   h  j   m   n w  f o   o
a   b  c  d    g   h  j   m   n e  f o   o
a   b  c  d    g   h  j   m   n r  f o   o
a   b  c  d    g   h  k   m   n q  f o   o
a   b  c  d    g   h  k   m   n w  f o   o
a   b  c  d    g   h  k   m   n e  f o   o
a   b  c  d    g   h  k   m   n r  f o   o
a   b  c  d    g   h  l   m   n q  f o   o
a   b  c  d    g   h  l   m   n w  f o   o
a   b  c  d    g   h  l   m   n e  f o   o
a   b  c  d    g   h  l   m   n r  f o   o
a   b  c  e    g   h  i   m   n q  f o   o
a   b  c  e    g   h  i   m   n w  f o   o
a   b  c  e    g   h  i   m   n e  f o   o
a   b  c  e    g   h  i   m   n r  f o   o
a   b  c  e    g   h  j   m   n q  f o   o
a   b  c  e    g   h  j   m   n w  f o   o
a   b  c  e    g   h  j   m   n e  f o   o
a   b  c  e    g   h  j   m   n r  f o   o
a   b  c  e    g   h  k   m   n q  f o   o
a   b  c  e    g   h  k   m   n w  f o   o
a   b  c  e    g   h  k   m   n e  f o   o
a   b  c  e    g   h  k   m   n r  f o   o
a   b  c  e    g   h  l   m   n q  f o   o
a   b  c  e    g   h  l   m   n w  f o   o
a   b  c  e    g   h  l   m   n e  f o   o
a   b  c  e    g   h  l   m   n r  f o   o
a   b  c  f    g   h  i   m   n q  f o   o
a   b  c  f    g   h  i   m   n w  f o   o
a   b  c  f    g   h  i   m   n e  f o   o
a   b  c  f    g   h  i   m   n r  f o   o
a   b  c  f    g   h  j   m   n q  f o   o
a   b  c  f    g   h  j   m   n w  f o   o
a   b  c  f    g   h  j   m   n e  f o   o
a   b  c  f    g   h  j   m   n r  f o   o
a   b  c  f    g   h  k   m   n q  f o   o
a   b  c  f    g   h  k   m   n w  f o   o
a   b  c  f    g   h  k   m   n e  f o   o
a   b  c  f    g   h  k   m   n r  f o   o
a   b  c  f    g   h  l   m   n q  f o   o
a   b  c  f    g   h  l   m   n w  f o   o
a   b  c  f    g   h  l   m   n e  f o   o
a   b  c  f    g   h  l   m   n r  f o   o
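As a quick sanity check on the output above, the number of generated lines should equal the product of the group sizes. This is a minimal sketch, assuming Python 3.8 or later for `math.prod`; the variable name `sizes` is illustrative.

```python
import re
from math import prod

line = 'a   b  c  d,e,f    g   h  i,j,k,l   m   n q,w,e,r  f o   o'
# Number of alternatives in each comma-separated group
sizes = [len(g.split(',')) for g in re.findall(r'\w+(?:,\w+)+', line)]
print(sizes)        # [3, 4, 4]
print(prod(sizes))  # 48
```

The count grows multiplicatively with each additional group, so lines with many groups can produce a very large output.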

score 0 · Accepted Answer

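import itertools

line_data = 'a   b  c  d,e,f    g   h  i,j,k,l   m   n'
fields = line_data.split()
# Positions of the fields that contain comma-separated alternatives
comma_fields_indices = [i for i, val in enumerate(fields) if "," in val]
# The alternatives themselves, e.g. ['d', 'e', 'f']
comma_fields = [fields[i].split(",") for i in comma_fields_indices]
all_comb = []
for val in itertools.product(*comma_fields):
    sline_data = list(fields)  # fresh copy for each combination
    for index, word in enumerate(val):
        sline_data[comma_fields_indices[index]] = word
    # Joining with a single space does not preserve the original spacing
    all_comb.append(" ".join(sline_data))
print(all_comb)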
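The snippet above joins the fields with a single space, so the original spacing is lost. If the spacing matters, one possible variation (a sketch, not the answerer's code) is to split with a capturing group so that `re.split` keeps the whitespace runs as tokens; the names `tokens`, `comma_positions`, and `choices` are illustrative.

```python
import re
from itertools import product

line_data = 'a   b  c  d,e,f    g   h  i,j,k,l   m   n'
# A capturing group makes re.split keep the whitespace runs as tokens.
tokens = re.split(r'(\s+)', line_data)
comma_positions = [i for i, tok in enumerate(tokens) if ',' in tok]
choices = [tokens[i].split(',') for i in comma_positions]

all_comb = []
for combo in product(*choices):
    row = list(tokens)  # fresh copy for each combination
    for pos, word in zip(comma_positions, combo):
        row[pos] = word
    # Whitespace tokens are still in place, so a plain join restores the layout.
    all_comb.append(''.join(row))

print(all_comb[0])    # a   b  c  d    g   h  i   m   n
print(len(all_comb))  # 12
```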

score 0 · Accepted Answer

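# Remove each comma and the character after it; this keeps only the first
# alternative of each group and does not generate the combinations.
while ',' in line:
    i = line.index(',')
    line = line[:i] + line[i + 2:]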
