Random.choice Only Choosing from One Line of File - python

Python novice here, very good at copy and pasting. I'm trying to sample one random word from my text file. It's working, but it's only sampling from a single line. How can I make it sample the entire file?
lines = open("myfilehere").readlines()
line = lines[0]
words = line.split()
print(random.choice(words))
Labely = random.choice(words)
Label1 = Label(root, text=Labely)
Label1.pack()

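You split and call random.choice() on the variable words, which contains only your first line of text (lines[0]). If you want all the text, you don't need the variable line at all: read the whole file as one string and split that, for example words = open("myfilehere").read().split(). Note that lines.split() would not work, because lines is a list and a list has no split method.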

First put all words into a list, then choose words from that list:
import random

words = []  # Define an empty list
with open("myfilehere") as fi:  # Open the file
    for line in fi:  # Iterate over each line in the file
        for word in line.split():  # Iterate over each word in the line
            words.append(word)  # Add the word to the list
print(words)
Labely = random.choice(words)
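For a file small enough to read in one go, the same idea fits in a few lines. This is a minimal sketch, assuming a whitespace-separated text file; the helper name random_word is hypothetical and not part of the original code:

```python
import random


def random_word(path):
    """Return one random word drawn from the whole file at `path`."""
    # read() returns the entire file as one string; split() with no
    # arguments splits on any whitespace, including newlines.
    with open(path) as fi:
        words = fi.read().split()
    if not words:
        raise ValueError("file contains no words")
    return random.choice(words)
```

With this in place, the label text could be set with Labely = random_word("myfilehere").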

Related

syntax errors on creating wordDictionary of word and occurrences

I'm getting an AttributeError on line 32. I'd appreciate some help figuring out how to display each word and its number of occurrences.
import re
file_object = open('dialog.txt')
# read the file content
fileContents = file_object.read()
# convert fileContents to lowercase
final_dialog = fileContents.lower()
# print(final_dialog)
# replace a-z and spaces with cleanText variable
a_string = final_dialog
cleanText = re.sub("[^0-9a-zA-Z]+", "1", a_string)
# print(cleanText)
# wordlist that contains all words found in cleanText
text_string = cleanText
wordList = re.sub("1"," ", text_string)
# print(wordList)
#wordDictionary to count occurrence of each word to list in wordList
wordDictionary = dict()
#loop through .txt
for line in list(wordList):
    # remove spaces and newline characters
    line = line.strip()
    # split the line into words
    words = line.split()
    #iterate over each word in line
    for word in words.split():
        if word not in wordDictionary:
            wordDictionary[word] = 1
        else:
            wordDictionary[word] += 1
# print contents of dictionary
print(word)
# print file content
# print(fileContents)
# close file
# file_object.close()
I think the error is
for word in words.split():
and should be replaced with
for word in words:
Explanation: words is already a list. A list has no split method, so you'll get an AttributeError when trying to call that method.
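Fixing the AttributeError alone may not give the intended counts: wordList is a string, so list(wordList) yields single characters rather than lines. Below is a minimal sketch of one way to count and display words, assuming a word is a run of letters and digits (as in the question's regular expression). The function name count_words is hypothetical:

```python
import re
from collections import Counter


def count_words(text):
    """Count occurrences of each lowercase alphanumeric word in `text`."""
    # findall returns every run of letters/digits; everything else
    # (spaces, punctuation, newlines) acts as a separator.
    words = re.findall(r"[0-9a-z]+", text.lower())
    return Counter(words)


if __name__ == "__main__":
    counts = count_words("The cat saw the other cat.")
    # Display each word next to its number of occurrences.
    for word, occurrences in counts.items():
        print(word, occurrences)
```

For the file in the question, the text would come from open('dialog.txt').read().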

Replacing lines with a part of that line in Python3

I am working on files with transcripts. I have lines of text and every few lines there is a statement similar to this 'Play video starting at 16 seconds and follow transcript0:16' (there might be more words when minutes are showing). I was able to isolate the text I want to replace the whole sentence with. So the end goal is to leave all the text from the file but replace the sentences with my shorter text - in my case it will be "transcript0:16"
with open("transcript.txt", "r") as fhandle:
    newline=[]
    for line in fhandle.readlines():
        if line.startswith("Play video"):
            words = line.split()
            word = words[::-1]
            wordfinal = word[0]
            newline.append(line.replace(line,wordfinal))
with open("transcript.txt", "w") as fhandle:
    for line in newline:
        fhandle.writelines(line)
Thanks
Append every line of your document to newline: apply your rule when the line starts with "Play video", and otherwise append the line unchanged:
newline = []
with open("transcript.txt", "r") as fhandle:
    for line in fhandle:
        if line.startswith("Play video"):
            # Keep only the last word, e.g. "transcript0:16"
            wordfinal = line.split()[-1]
            # split() drops the trailing newline, so add it back
            newline.append(wordfinal + "\n")
        else:
            newline.append(line)
with open("transcript.txt", "w") as fhandle:
    fhandle.writelines(newline)

Replace words of a long document in Python

I have a dictionary dict with some words (2000) and I have a huge text, like Wikipedia corpus, in text format. For each word that is both in the dictionary and in the text file, I would like to replace it with word_1.
with open("wiki.txt",'r') as original, open("new.txt",'w') as mod:
    for line in original:
        new_line = line
        for word in line.split():
            if (dict.get(word.lower()) is not None):
                new_line = new_line.replace(word,word+"_1")
        mod.write(new_line)
This code creates a new file called new.txt with the words that appear in the dictionary replaced as I want.
This works for short files, but for the longer file that I am using as input, it "freezes" my computer.
Is there a more efficient way to do that?
Edit for Adi219:
Your code seems working, but there is a problem:
If a line looks like this: Albert is a friend of Albert, and my dictionary contains Albert, then after the for loop the line ends up like this: Albert_1_1 is a friend of Albert_1_1, because each call to replace rewrites every occurrence in the line. How can I replace only the exact word that I want, to avoid repetitions like _1_1_1_1?
Edit2:
To solve the previous problem, I changed your code:
with open("wiki.txt", "r") as original, open("new.txt", "w") as mod:
    for line in original:
        words = line.split()
        for word in words:
            if dict.get(word.lower()) is not None:
                mod.write(word+"_1 ")
            else:
                mod.write(word+" ")
        mod.write("\n")
Now everything should work
A few things:
You could remove the declaration of new_line. Then replace the new_line = new_line.replace(...) line with line = line.replace(...). You would also have to call mod.write(line) afterwards.
You could add words = line.split() and use for word in words: for the for loop. This is mainly a readability change, since line.split() in the for statement is already evaluated only once per line.
You could (manually(?)) split your large .txt file into multiple smaller files and have multiple instances of your program running on each file, and then you could combine the multiple outputs into one file. Note: You would have to remember to change the filename for each file you're reading/writing to.
So, your code would look like:
with open("wiki.txt", "r") as original, open("new.txt", "w") as mod:
    for line in original:
        words = line.split()
        for word in words:
            if dict.get(word.lower()) is not None:
                line = line.replace(word, word + "_1")
        mod.write(line)
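The repeated _1_1 suffixes come from calling str.replace once per word, which rewrites every occurrence in the line each time. One possible alternative is a single regular-expression pass per line, which visits each word exactly once. This is a minimal sketch with some assumptions: the lookup words are lowercase and stored in a set called vocabulary (a hypothetical name; the keys of the question's dict would serve the same purpose), and a word is a run of \w characters. The function names tag_line and tag_file are also hypothetical:

```python
import re

# A "word" is assumed to be a run of letters, digits, or underscores.
WORD = re.compile(r"\w+")


def tag_line(line, vocabulary):
    """Append "_1" to every word of `line` found in `vocabulary`."""
    def tag(match):
        word = match.group(0)
        # Each match is visited once, so a word is never tagged twice.
        return word + "_1" if word.lower() in vocabulary else word
    return WORD.sub(tag, line)


def tag_file(src_path, dst_path, vocabulary):
    """Stream src_path line by line, writing the tagged text to dst_path."""
    with open(src_path) as original, open(dst_path, "w") as mod:
        for line in original:
            mod.write(tag_line(line, vocabulary))
```

Unlike the split-and-rejoin version in Edit2, this keeps the original spacing and punctuation of each line, and the file is still processed one line at a time.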

how to assign a word from a text file to a variable in python

I have a text file and I need to assign a random word from this text file (each word is on a separate line) to a variable in Python. Then I need to remove this word from the text file.
This is what I have so far.
import random

with open("words.txt") as f: #Open the text file
    wordlist = [x.rstrip() for x in f]
variable = random.sample(wordlist,1) #Assigning the random word
print(variable)
Use random.choice to pick a single word:
variable = random.choice(wordlist)
You can then remove it from the word list by another comprehension:
new_wordlist = [word for word in wordlist if word != variable]
(You can also use filter for this part)
You can then save that word list to a file by using:
with open("words.txt", 'w') as f: # Open file for writing
    f.write('\n'.join(new_wordlist))
If you want to remove just a single instance of the word you should choose an index to use. See this answer.
If you need to handle duplicates, and it's not acceptable to reshuffle the list every time, there's a simple solution: Instead of just randomly picking a word, randomly pick an index. Like this:
index = random.randrange(len(wordlist))
word = wordlist.pop(index)
with open("words.txt", 'w') as f:
    f.write('\n'.join(wordlist))
Or, alternatively, use enumerate to pick both at once:
index, word = random.choice(list(enumerate(wordlist)))
del wordlist[index]
with open("words.txt", 'w') as f:
    f.write('\n'.join(wordlist))
Rather than random.choice as Reut suggested, I would do this because it keeps duplicates:
random.shuffle(wordlist) # shuffle the word list
theword = wordlist.pop() # pop the last element
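Putting the pieces together, here is a minimal sketch of the full round trip (read, pick, remove one instance, write back). It assumes one word per line, and the helper name take_random_word is hypothetical:

```python
import random


def take_random_word(path):
    """Remove one randomly chosen word from the file and return it."""
    with open(path) as f:
        # Skip blank lines and drop surrounding whitespace/newlines.
        wordlist = [line.strip() for line in f if line.strip()]
    if not wordlist:
        raise ValueError("no words left in file")
    # Popping by index removes a single entry, so duplicates survive.
    index = random.randrange(len(wordlist))
    word = wordlist.pop(index)
    with open(path, "w") as f:
        f.write("\n".join(wordlist))
    return word
```

Calling variable = take_random_word("words.txt") then both assigns the word and shortens the file by one entry.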

Store words of file in dictionary

I want to store the words of a text file in a dictionary.
My code is
word=0
char=0
i=0
a=0
d={}
with open("m.txt","r") as f:
    for line in f:
        w=line.split()
        d[i]=w[a]
        i=i+1
        a=a+1
        word=word+len(w)
        char=char+len(line)
print(word,char)
print(d)
My text file is
jdfjdnv dj g gjv,kjvbm
but the problem is that the dictionary stores only the first word of the text file. How can I store the rest of the words? Please help.
How many lines does your text file have? If it has only one line, your loop executes only once: it splits the whole line into separate words, then saves just one of them in the dict. If you want to save all the words from a one-line text file, you need to add another loop, like this:
for word in line.split():
    d[i] = word
    i += 1
You only store the first word because you only have one line in the file, and your only for loop is over the lines.
Generally, if you are going to key the dictionary by index, you can just use the list you are already making:
w = []
char = 0
with open("m.txt", "r") as f:
    for line in f:
        char += len(line)
        w.extend(line.split())
word = len(w)  # total number of words, matching word in the question
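If a dictionary keyed by position is still wanted, it can be built from such a list in one step. A minimal sketch, using the hypothetical helper name index_words:

```python
def index_words(path):
    """Map each word's position in the file (0, 1, 2, ...) to the word."""
    with open(path) as f:
        # split() with no arguments splits on any whitespace,
        # so words from every line end up in one list.
        words = f.read().split()
    # enumerate pairs each word with its running index.
    return dict(enumerate(words))
```

For the one-line file in the question, d = index_words("m.txt") would hold four entries, one per whitespace-separated word.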
