python - Matching one set of values to another in a text file

Question

I have a text file with this information:

1961 - Roger (Male)
1962 - Roger (Male)
1963 - Roger (Male)
1963 - Jessica (Female)
1964 - Jessica (Female)
1965 - Jessica (Female)
1966 - Jessica (Female)

If I search for the name "Roger" in the file, I want to print the corresponding years for that name, that is, 1961, 1962, 1963. What would be the best way to approach this?

I was doing it with a dictionary, but then realized that dictionaries can't have duplicate keys, and 1963 appears twice in the text file, so it didn't work.

I'm using Python 3, thanks.

score 2 · Accepted Answer

Use a dictionary with the name as the key and store the years in a list:

In [1]: with open("data1.txt") as f:
   ...:     dic = {}
   ...:     for line in f:
   ...:         spl = line.split()  # e.g. ['1961', '-', 'Roger', '(Male)']
   ...:         dic.setdefault(spl[2], []).append(int(spl[0]))
   ...:     for name in dic:
   ...:         print(name, dic[name])
   ...:

Roger [1961, 1962, 1963]
Jessica [1963, 1964, 1965, 1966]

Alternatively, use collections.defaultdict:

In [2]: from collections import defaultdict

In [3]: with open("data1.txt") as f:
   ...:     dic = defaultdict(list)
   ...:     for line in f:
   ...:         spl = line.split()
   ...:         dic[spl[2]].append(int(spl[0]))
   ...:     for name in dic:
   ...:         print(name, dic[name])
   ...:

Roger [1961, 1962, 1963]
Jessica [1963, 1964, 1965, 1966]
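
To answer the original question for a single name, the grouped dictionary can be wrapped in a small lookup. This is only a sketch: build_index and years_for are made-up helper names, and the sample lines stand in for the contents of data1.txt.

```python
from collections import defaultdict

# Sample lines standing in for the file contents
# (assumption: same format as in the question).
SAMPLE_LINES = [
    "1961 - Roger (Male)",
    "1962 - Roger (Male)",
    "1963 - Roger (Male)",
    "1963 - Jessica (Female)",
    "1964 - Jessica (Female)",
]


def build_index(lines):
    """Group years by name; `lines` can be an open file or any iterable of strings."""
    index = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        index[parts[2]].append(int(parts[0]))
    return index


def years_for(index, name):
    """Return the years recorded for `name`, or an empty list if it is absent."""
    # .get avoids creating an empty entry for unknown names in the defaultdict
    return index.get(name, [])


index = build_index(SAMPLE_LINES)
print(years_for(index, "Roger"))  # [1961, 1962, 1963]
print(years_for(index, "Alice"))  # []
```

Passing an open file object to build_index works the same way, since iterating over a file yields its lines.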

score 0 · Answer

Why not use a dict with the name (e.g. Roger) as the key and a list of years (here [1961, 1962, 1963]) as the value? Would that not work for you?

At the end of the loop, each name appears exactly once as a key, with its years as the value, which is what you seem to want.
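
A minimal sketch of that idea, with the lines hard-coded instead of read from a file (the variable names are just for illustration):

```python
lines = [
    "1961 - Roger (Male)",
    "1963 - Roger (Male)",
    "1963 - Jessica (Female)",
]

years_by_name = {}
for line in lines:
    # Each line splits into four whitespace-separated pieces
    year, _dash, name, _gender = line.split()
    if name not in years_by_name:
        years_by_name[name] = []  # first time this name is seen
    years_by_name[name].append(int(year))

# 1963 is stored under both names; each name appears once as a key
print(years_by_name)  # {'Roger': [1961, 1963], 'Jessica': [1963]}
```

Because the year is a list element rather than a key, the same year under two names is not a problem.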

score 0 · Answer

Use tuples. They can be stored in lists, and iterated over.

Say your list looks like this:

data = [(1961, 'Roger', 'Male'),
        (1962, 'Roger', 'Male'),
        (1963, 'Roger', 'Male'),
        (1963, 'Jessica', 'Female')]

You can run queries on it like this:

# Just items where the name is Roger
[(y, n, s) for y, n, s in data if n == "Roger"]

# Just the year 1963
[(y, n, s) for y, n, s in data if y == 1963]

Or loop over the tuples directly:

for year, name, sex in data:
    if year >= 1962:
        print("In {}, {} was {}".format(year, name, sex))

In 1962, Roger was Male
In 1963, Roger was Male
In 1963, Jessica was Female
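
The list above is typed in by hand. One possible way to build it from lines in the question's format is sketched below; parse_line is a hypothetical helper, and it assumes every line looks like "1961 - Roger (Male)".

```python
def parse_line(line):
    """Turn '1961 - Roger (Male)' into (1961, 'Roger', 'Male')."""
    year, rest = line.split(" - ", 1)
    name, sex = rest.strip().split(" ", 1)
    return int(year), name, sex.strip("()")


# Hard-coded sample; with a real file, iterate over the open file instead
lines = ["1961 - Roger (Male)\n", "1963 - Jessica (Female)\n"]
data = [parse_line(line) for line in lines]
print(data)  # [(1961, 'Roger', 'Male'), (1963, 'Jessica', 'Female')]

# Years for one name, as the question asks
print([y for y, n, s in data if n == "Roger"])  # [1961]
```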

score 0 · Answer

You can always use a regular expression.

import re

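name = 'Roger'

# re.escape keeps any special characters in the name from being read as regex syntax
pattern = re.compile(r'([0-9]+) - %s' % re.escape(name))

# The with block closes the file when the loop is done
with open('names.txt') as f:
    for line in f:
        match = pattern.search(line)
        if match:
            print(match.group(1))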
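
A variation on the same idea, as a sketch: matching the whole line avoids partial hits such as "Roger" matching "Rogerson". The function name find_years and the sample lines are assumptions for illustration, and the pattern assumes the "YEAR - NAME (GENDER)" layout from the question.

```python
import re


def find_years(lines, name):
    """Return the years (as ints) on lines whose name is exactly `name`."""
    # re.escape guards against regex metacharacters in the name;
    # the pattern expects 'YEAR - NAME (GENDER)' and nothing else on the line.
    pattern = re.compile(r"^(?P<year>\d+) - %s \(\w+\)\s*$" % re.escape(name))
    years = []
    for line in lines:
        match = pattern.match(line)
        if match:
            years.append(int(match.group("year")))
    return years


sample = [
    "1961 - Roger (Male)\n",
    "1962 - Rogerson (Male)\n",
    "1963 - Roger (Male)\n",
]
print(find_years(sample, "Roger"))  # [1961, 1963]
```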

score 0 · Answer

As I suggested in the comments:

from collections import defaultdict

result = defaultdict(list)
with open('data.txt', 'rt') as infile:  # avoid shadowing the built-in input()
    for line in infile:
        year, person = [item.strip() for item in line.split('-')]
        result[person].append(year)

for person, years in result.items():
    print(person, years, sep=': ')

Output:

Roger (Male): ['1961', '1962', '1963']
Jessica (Female): ['1963', '1964', '1965', '1966']
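
Note that the keys in this output include the gender suffix, so a lookup by bare name such as "Roger" finds nothing. If lookup by bare name is wanted, one option (a sketch, not part of the code above) is to cut the person field at the first space with str.partition and convert the years to int:

```python
from collections import defaultdict

# Hard-coded sample lines in the question's format
lines = [
    "1961 - Roger (Male)",
    "1962 - Roger (Male)",
    "1963 - Jessica (Female)",
]

result = defaultdict(list)
for line in lines:
    year, _, person = line.partition("-")
    # 'Roger (Male)' -> 'Roger': keep everything before the first space
    name = person.strip().partition(" ")[0]
    result[name].append(int(year))  # int() tolerates the surrounding spaces

print(result["Roger"])    # [1961, 1962]
print(result["Jessica"])  # [1963]
```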
