python - Put bar at the end of every line that contains foo

Question

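I have a list with a large number of lines, each taking the subject-verb-object form, e.g.:

Jane likes Fred
Chris dislikes Joe
Nate knows Jill

To plot a network graph that expresses the different relationships between nodes with color-coded directed edges, I need to replace the verb with an arrow and place a color code at the end of each line:

Jane -> Fred red;
Chris -> Joe blue;
Nate -> Jill black;

There are only a few verbs, so replacing them with an arrow is just a matter of a few search-and-replace commands. Before doing that, however, I need to put a color code at the end of every line, corresponding to the verb on that line. I'd like to do this using Python.

These are my baby steps in programming, so please be explicit and include the code that reads in the text file.

Thanks for your help!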

score 5 · Accepted Answer

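It sounds like you will want to look into dictionaries and string formatting. In general, if you need help programming, break your problem down into very small, discrete chunks and search for those chunks individually; you should then be able to assemble them into a larger answer. Stack Overflow is a great resource for this kind of searching.

Also, if you have general curiosities about Python, search or browse the official Python documentation. If you find yourself constantly not knowing where to begin, read the Python tutorial or find a book to work through. Investing a week or two to get a good foundation in what you're doing will pay off many times over as you complete work.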

verb_color_map = {
    'likes': 'red',
    'dislikes': 'blue',
    'knows': 'black',
}

with open('infile.txt') as infile: # assuming you've stored your data in 'infile.txt'
    for line in infile:
        # 'object' is the name of a Python built-in, so I use object_ instead
        subject, verb, object_ = line.split()
        print("%s -> %s %s;" % (subject, object_, verb_color_map[verb]))

score 3 · Accepted Answer

Simple enough. Assuming the list of verbs is fixed and small, this is easy to do with a dictionary and a for loop.

VERBS = {
    "likes": "red"
  , "dislikes": "blue"
  , "knows": "black"
  }

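def replace_verb (line):
    # drop the trailing newline so the color ends up on the same line
    line = line.strip ()
    for verb, color in VERBS.items():
        # compare whole words so "likes" does not match inside "dislikes"
        if verb in line.split ():
            return "%s %s;" % (
                  line.replace (" %s " % verb, " -> ")
                , color
                )
    return line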

def main ():
    filename = "my_file.txt"
    with open (filename, "r") as fp:
        for line in fp:
            print (replace_verb (line))

# Allow the module to be executed directly on the command line
if __name__ == "__main__":
    main ()

score 2 · Accepted Answer

verbs = {"dislikes":"blue", "knows":"black", "likes":"red"}
for s in open("/tmp/infile"):
  s = s.strip()
  for verb in verbs.keys():
    # compare whole words so "likes" cannot match inside "dislikes"
    if verb in s.split():
      print(s.replace(verb,"->")+" "+verbs[verb]+";")
      break

Edit: use "for s in open" instead.

score 1 · Accepted Answer

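Is this a bit of a homework assignment? :) If so, it's OK to fess up. Without going into too much detail, think about the tasks you're trying to do.

For each line:

read it
split it into words (on whitespace: .split())
convert the middle word into a color (based on a mapping; cf. Python dict())
print the first word, an arrow, the third word, and the color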
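
The steps for a single line can be sketched roughly as follows. This is only an illustration: the names COLOR_FOR_VERB and line_to_edge are made up for the example, and it assumes every line is exactly three whitespace-separated words.

```python
# Hypothetical mapping from verb to color; adjust it to your own data.
COLOR_FOR_VERB = {"likes": "red", "dislikes": "blue", "knows": "black"}

def line_to_edge(line):
    """Turn 'Jane likes Fred' into 'Jane -> Fred red;' (assumes exactly three words)."""
    subject, verb, obj = line.split()   # split the line into words on whitespace
    color = COLOR_FOR_VERB[verb]        # convert the middle word into a color
    return "%s -> %s %s;" % (subject, obj, color)

if __name__ == "__main__":
    for line in ["Jane likes Fred", "Chris dislikes Joe", "Nate knows Jill"]:
        print(line_to_edge(line))
```

A line with more or fewer than three words raises ValueError at the unpacking step, which is why the longer code in this answer looks for the verb instead of counting words.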

Code using NetworkX (networkx.lanl.gov/):

'''
plot relationships in a social network
'''

import networkx
## make a fake file 'ex.txt' in this directory
## then write fake relationships to it.
example_relationships = open('ex.txt','w')
example_relationships.write('''\
Jane Doe likes Fred
Chris dislikes Joe
Nate knows Jill
''')
example_relationships.close()

rel_colors = {
    'likes':  'blue',
    'dislikes' : 'black',
    'knows'   : 'green',
}

def split_on_verb(sentence):
    ''' we know the verb is the only lower cased word

    >>> split_on_verb("Jane Doe likes Fred")
    ('Jane Doe', 'Fred', 'likes')

    '''
    words = sentence.strip().split()  # take off any outside whitespace, then split
                                       # on whitespace
    if not words:
        return None  # if there aren't any words, just return nothing

    verbs = [x for x in words if x.islower()]
    verb = verbs[0]  # we want the '1st' one (python numbers from 0,1,2...)
    verb_index = words.index(verb) # where is the verb?
    subject = ' '.join(words[:verb_index])
    obj =  ' '.join(words[(verb_index+1):])  # 'object' is already used in python
    return (subject, obj, verb)


def graph_from_relationships(fh,color_dict):
    '''
    fh:  a filehandle, i.e., an opened file, from which we can read lines
        and loop over
    '''
    G = networkx.DiGraph()

    for line in fh:
        if not line.strip():  continue # move on to the next line,
                                         # if our line is empty-ish
        (subj,obj,verb) = split_on_verb(line)
        color = color_dict[verb]
        # cf: python 'string templates', there are other solutions here
        # this is the 
        print("'%s' -> '%s' [color='%s'];" % (subj,obj,color))
        G.add_edge(subj,obj,color=color)  # store the color as an edge attribute
        # 

    return G

G = graph_from_relationships(open('ex.txt'),rel_colors)
print(G.edges())
# from here you can use the various networkx plotting tools on G, as you're inclined.

score 0 · Accepted Answer

In addition to the question, Karasu also said (in a comment on one of the answers):

Okay, here is how I would solve this.

color_map = \
{
    "likes" : "red",
    "dislikes" : "blue",
    "knows" : "black",
}

def is_verb(word):
    return word in color_map

def make_noun(lst):
    if not lst:
        return "--NONE--"
    elif len(lst) == 1:
        return lst[0]
    else:
        return "_".join(lst)


for line in open("filename").readlines():
    words = line.split()
    # subject could be one or two words
    if is_verb(words[1]):
        # subject was one word
        s = words[0]
        v = words[1]
        o = make_noun(words[2:])
    else:
        # subject was two words
        assert is_verb(words[2])
        s = make_noun(words[0:2])
        v = words[2]
        o = make_noun(words[3:])
    color = color_map[v]
    print("%s -> %s %s;" % (s, o, color))

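A few notes:

0) "with" isn't really needed for this problem. Written this way, the program is more portable to older versions of Python. It should work with Python 2.2 and later (I only tested it with Python 2.6).

1) You can change make_noun() to use whatever strategy you find useful for handling multiple words. I showed simply chaining them together with underscores, but you could build a dictionary of adjectives and throw those away, or a dictionary of nouns and pick those out.

2) You could also use regular expressions for fuzzy matching. Instead of simply using a dictionary for color_map, you could make a list of tuples, each pairing a regular expression with a replacement color, and substitute the color whenever the regular expression matches.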
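
As one illustration of note 1, here is a sketch of a make_noun() that throws away any word found in a small set of adjectives and joins whatever is left. The ADJECTIVES set is invented for the example and you would have to fill in your own; the set literal also needs Python 2.7 or later, unlike the code above.

```python
# Hypothetical set of words to discard; fill in your own.
ADJECTIVES = {"big", "old", "little", "grumpy"}

def make_noun(lst):
    # Keep only the words that are not known adjectives (case-insensitive).
    kept = [w for w in lst if w.lower() not in ADJECTIVES]
    if not kept:
        return "--NONE--"
    # Joining a single remaining word returns it unchanged.
    return "_".join(kept)

print(make_noun(["Grumpy", "Old", "Fred"]))   # Fred
print(make_noun(["Jane", "Doe"]))             # Jane_Doe
print(make_noun([]))                          # --NONE--
```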

score 0 · Accepted Answer

This is an improved version of my previous answer. It uses regular expression matching to do a fuzzy match on the verb. These all work:

Steve loves Denise
Bears love honey
Maria interested Anders
Maria interests Anders

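The regular expression pattern "loves?" matches "love" plus an optional "s". The pattern "interest.*" matches "interest" plus anything after it. A pattern with multiple alternatives separated by vertical bars matches if any one of the alternatives matches.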
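
To see these patterns in action, here is a small check. The matches() helper is just a name made up for this example. Note that re.match() anchors only at the start of the word, so "loves?" also matches longer words that begin with "love", such as "lovely".

```python
import re

def matches(pattern, word):
    # re.match() anchors at the start of the string only, not at the end.
    return re.match(pattern, word) is not None

print(matches("loves?", "love"))             # True
print(matches("loves?", "loves"))            # True
print(matches("interest.*", "interested"))   # True
print(matches("likes?|loves?", "hates"))     # False
print(matches("loves?", "lovely"))           # True: only the start is anchored
```

If that last behavior is a problem for your data, appending "$" to a pattern requires the whole word to match.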

import re

re_map = \
[
    ("likes?|loves?|interest.*", "red"),
    ("dislikes?|hates?", "blue"),
    ("knows?|tolerates?|ignores?", "black"),
]

# compile the regular expressions one time, then use many times
pat_map = [(re.compile(s), color) for s, color in re_map]

# We don't use is_verb() in this version, but here it is.
# A word is a verb if any of the patterns match.
def is_verb(word):
    return any(pat.match(word) for pat, color in pat_map)

# Return color from matched verb, or None if no match.
# This detects whether a word is a verb, and looks up the color, at the same time.
def color_from_verb(word):
    for pat, color in pat_map:
        if pat.match(word):
            return color
    return None

def make_noun(lst):
    if not lst:
        return "--NONE--"
    elif len(lst) == 1:
        return lst[0]
    else:
        return "_".join(lst)


for line in open("filename"):
    words = line.split()
    # subject could be one or two words
    color = color_from_verb(words[1])
    if color:
        # subject was one word
        s = words[0]
        o = make_noun(words[2:])
    else:
        # subject was two words
        color = color_from_verb(words[2])
        assert color
        s = make_noun(words[0:2])
        o = make_noun(words[3:])
    print("%s -> %s %s;" % (s, o, color))

I hope it's clear how to take this answer and extend it. You can easily add more patterns to match more verbs. You could add logic to detect "is" and "in" and discard them, so that "Anders is interested in Maria" would match. And so on.
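
A rough sketch of the "is"/"in" idea: filter out a small set of filler words before looking for the verb. The FILLER set and the strip_filler() name are assumptions made up for this example, not part of the code above.

```python
# Hypothetical set of filler words to drop before looking for the verb.
FILLER = {"is", "in", "are", "of"}

def strip_filler(words):
    # Exact lowercase comparison, so a capitalized name is never discarded.
    return [w for w in words if w not in FILLER]

words = strip_filler("Anders is interested in Maria".split())
print(words)   # ['Anders', 'interested', 'Maria']
```

After this step, words[1] is "interested", which the "interest.*" pattern matches.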

If you have any questions, I'd be happy to explain this further. Good luck.

score 0 · Accepted Answer

Python 2.5:

import sys
from collections import defaultdict

codes = defaultdict(lambda: ("---", "Missing action!"))
codes["likes"] =    ("-->", "red")
codes["dislikes"] = ("-/>", "green")
codes["loves"] =    ("==>", "blue")

for line in sys.stdin:
    subject, verb, object_ = line.strip().split(" ")
    arrow, color = codes[verb]
    print subject, arrow, object_, color, ";"
