I'm having trouble finding the number of unique words in a speech text file (actually three files). I'm posting my full code so there is no confusion.
# This program analyzes text files for the number of words, characters,
# sentences, and unique words, and for the longest word in the text file.
# It will also provide the frequency of unique words. In particular, the
# texts are three political speeches, which we analyze by building on
# searching techniques in Python.
def main():
    harper = readFile("Harper's Speech.txt")
    newWords = cleanUpWords(harper)
    print(numCharacters(harper), "Characters.")
    print(numSentances(harper), "Sentences.")
    print(numWords(newWords), "Words.")
    print(uniqueWords(newWords), "Unique Words.")
    print("The longest word is: ", longestWord(newWords))

    obama1 = readFile("Obama's 2009 Speech.txt")
    newWords = cleanUpWords(obama1)
    print(numCharacters(obama1), "Characters.")
    print(numSentances(obama1), "Sentences.")
    # Count words on the cleaned text, as in the first speech.
    print(numWords(newWords), "Words.")
    print(uniqueWords(newWords), "Unique Words.")
    print("The longest word is: ", longestWord(newWords))

    obama2 = readFile("Obama's 2008 Speech.txt")
    newWords = cleanUpWords(obama2)
    print(numCharacters(obama2), "Characters.")
    print(numSentances(obama2), "Sentences.")
    # Count words on the cleaned text, as in the first speech.
    print(numWords(newWords), "Words.")
    print(uniqueWords(newWords), "Unique Words.")
    print("The longest word is: ", longestWord(newWords))

def readFile(filename):
    '''Read a text file and return its contents for main() to use. Also
    print the file's name without '.txt' so the user knows which file
    was read.'''
    with open(filename, "r") as inFile1:
        fileContentsList = inFile1.read()
    print("\n", filename.replace(".txt", "") + ":")
    return fileContentsList

def numCharacters(file):
    '''Return the length of the text obtained with read() (not readlines(),
    whose length would be the number of lines rather than characters),
    which is the total number of characters in the text file.'''
    return len(file)

def numSentances(file):
    '''Return the number of occurrences of a period, exclamation point, or
    question mark, thus counting the full sentences in the text file.'''
    return file.count(".") + file.count("!") + file.count("?")

def cleanUpWords(file):
    words = (file.replace("-", " ").replace(" ", " ").replace("\n", " "))
    onlyAlpha = ""
    for i in words:
        if i.isalpha() or i == " ":
            onlyAlpha += i
    return onlyAlpha.replace(" ", " ")

def numWords(newWords):
    '''Function finds the amount of words in the text file by returning
    the length of the cleaned up version of words from cleanUpWords().'''
    return len(newWords.split())

def uniqueWords(newWords):
    unique = sorted(newWords.split())
    unique = set(unique)
    return str(len(unique))

def longestWord(file):
    max(file.split())

main()
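So my last two functions, uniqueWords and longestWord, don't work properly, or at least give the wrong output. For unique words I should get 527, but for some odd reason I get 567. Also, my longest-word function always prints None, no matter what I do. I've tried many ways to get the longest word, and the above is just one of them, but they all return None. Please help me with my two sad functions!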
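
For reference, here is a minimal sketch of how the two functions might be corrected. It rests on two observations: longestWord has no return statement (so it always yields None), and max() without key=len picks the alphabetically last word rather than the longest one. The use of lower() is only an assumption: it guesses that the gap between 567 and 527 comes from words that differ only in capitalization (such as "The" and "the") being counted separately, which the speech files would need to confirm. The sample string is made up for illustration.

```python
def uniqueWords(newWords):
    # Assumption: words differing only in case count as the same word,
    # so every word is lowercased before being placed in the set.
    return len({word.lower() for word in newWords.split()})


def longestWord(newWords):
    # max() needs key=len to compare by length, and the result must be
    # returned; otherwise the function implicitly returns None.
    words = newWords.split()
    return max(words, key=len) if words else ""


# Hypothetical sample text, standing in for a cleaned-up speech.
sample = "The cat and the dog chased THE butterfly"
print(uniqueWords(sample), "Unique Words.")
print("The longest word is:", longestWord(sample))
```

If several words share the maximum length, max() returns the first one it encounters.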