Finding occurrences of a word in multiple sentences - Python

I'm new to Python, and after scouring the internet and going back over my study material, I cannot work out how to count occurrences of a word across multiple sentences. My aim is to determine how many times the word "python" occurs within these strings. I have tried the split() method and count("python"), and I even tried to make a dictionary with a word counter, which I was initially taught to do as part of the basics, but nothing in my study material has shown me anything similar to this before. I need to be able to display the frequency of the word, e.g. "python occurs 4 times". Any help would be much appreciated.
python_occurs = ["welcome to our Python program", "Python is my favorite language!", "I am afraid of Pythons", "I love Python"]

A straightforward approach is to iterate over every word using split(). Each word is converted to lowercase, and the number of times "python" occurs in it is counted with count().
I suspect the reason your approach is not working is that you forgot to convert the letters to lowercase first.
python_occurs = ["welcome to our Python program", "Python is my favorite language!", "I am afraid of Pythons", "I love Python"]
count = 0
for sentence in python_occurs:
    for word in sentence.split():
        # lower() is necessary because we want to be case-insensitive
        count += word.lower().count("python")
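The same count can also be written more compactly with sum() and a generator expression; this is a minimal sketch of that alternative, lowercasing each whole sentence once instead of each word:

```python
python_occurs = ["welcome to our Python program", "Python is my favorite language!", "I am afraid of Pythons", "I love Python"]

# Lowercase each sentence, then count substring occurrences of "python" in it.
count = sum(sentence.lower().count("python") for sentence in python_occurs)
print(f"python occurs {count} times")  # python occurs 4 times
```

Note that count() matches substrings, so "Pythons" is counted as well, which is what produces 4 here.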


PyDictionary Printing Issue

I have a small issue with PyDictionary: when I enter a list of words, printing the words does NOT keep the order of the word list.
For example:
from PyDictionary import PyDictionary
dictionary = PyDictionary(
    "bad ",
    "omen",
    "azure ",
    "sky",
    "icy ",
    "smile")
print(dictionary.printMeanings())
This list prints first "omen", then "sky", and so on. What I need is to print the word list in its original order. I searched on Google but found nothing related, and I searched the posts in this forum and found nothing either. I hope you can help me. Thank you in advance.
I found a workaround that fully solves my initial printing issue.
The main problem is that I am using an OLD laptop (more than 12 years old), so I have not been able to use Python 3+, and using PyDictionary with Python 2.7 caused the initial word list to print in random order.
The solution is to print a single word per call, BUT I have to do this about 25,000 times! Using Notepad++ I made a macro that generates the code for each word, and I was even able to add the Spanish translation to each English word. Printing each word individually has the added benefit that each word's definitions are kept separate.
Using Notepad++ and regex I can do the final clean-up of each word and its meaning.
So I am happy with this workaround. Thank you for your help.
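The one-word-at-a-time idea can be sketched as follows. Here `lookup` is a made-up stand-in for whatever per-word lookup call is used (e.g. one PyDictionary call per word); the point is only that iterating over the original list yourself preserves its order:

```python
words = ["bad", "omen", "azure", "sky", "icy", "smile"]

# Stand-in for a real per-word dictionary lookup (one call per word).
def lookup(word):
    return f"definition of {word}"

# Looking the words up one at a time keeps the original list order.
for word in words:
    print(word, "->", lookup(word))
```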

Automate long lists in Python

Today I wrote my first program, which is essentially a vocabulary learning program! Naturally I have some pretty huge lists of vocabulary, and a couple of questions. I created a class whose parameters include the German vocab and the Spanish vocab. My first question is: is there any way to turn the plain-text vocabulary that I copy from an internet vocab list into separate strings, without adding the quotes and the commas manually?
And my second question:
I created another list to pair each German word with each Spanish word, and it looks a little bit like this:
vocabs = [
    Vocabulary(spanish_word[0], german_word[0]),
    Vocabulary(spanish_word[1], german_word[1]),
    etc.
]
Vocabulary is the class; spanish_word is the first word list and german_word is the other, obviously.
But with a lot of vocab that's a lot of work too. Is there any way to automate pairing each word from the Spanish word list with the corresponding word in the German list? I first tried this:
vocabs = [
    for spanish word in german word
    Vocabulary(spanish_word[0], german_word[0])
]
But that didn't work, and researching on the internet didn't help much either.
Please don't be rude if these are noob questions. I'm actually pretty happy that my program runs so well, and I would be thankful for any help to make it better.
Without knowing what it is you're looking to do with the result, it appears you're trying to do this:
vocabs = [Vocabulary(s, g) for s, g in zip(spanish_word, german_word)]
You didn't provide any code or example data for the "turn all the plain text vocabulary [..] into strings and separate them without adding the quotes and the commas manually" part. There is sure to be a way to do what you need, but you should probably ask a separate question, after first looking for a solution yourself and making an attempt. Ask a question if you can't get it to work.
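As a rough illustration of both points (the word lists and the pasted text below are made-up sample data): a triple-quoted string can be split into a list of strings without typing quotes and commas by hand, and zip() pairs the two lists position by position:

```python
# Pasted plain text, one word per line, becomes a list without manual quoting.
pasted = """hola
adios
gracias"""
spanish_word = pasted.splitlines()

german_word = ["hallo", "tschuess", "danke"]

# zip() pairs the lists element by element; plain tuples stand in for the
# Vocabulary class here.
vocabs = [(s, g) for s, g in zip(spanish_word, german_word)]
print(vocabs)  # [('hola', 'hallo'), ('adios', 'tschuess'), ('gracias', 'danke')]
```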

how to save 5 words in one word using python

For example, I want to save inevitable, unavoidable, certain, sure = "necessary": if any of the mentioned words is used in my given sentence, my program should automatically change these words into "necessary" and give me the sentence.
For example:
"it is inevitable or unavoidable or certain or sure, that person age should be 18"
My Python program should automatically detect these words and convert the sentence into:
"it is necessary that person age should be 18"
Your issue isn't very clear; tell us what you want to do and what you can't figure out.
I think you should split your sentence to get a list of all its words. Then check whether each word belongs to your list of "changeable" words (inevitable, unavoidable, certain, sure); if so, replace it with the word you want ("necessary" in your example).
But I'm not sure I understood your problem.
sen = "this is unavoidable that the kids must be 18"
words = sen.split()
new_words = []
for word in words:
    if word in ['inevitable', 'unavoidable', 'certain', 'sure']:
        word = 'necessary'
    new_words.append(word)
new_sen = " ".join(new_words)
print(new_sen)
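The same split-and-replace idea can be written more compactly with a dictionary lookup, which also scales if you later need several synonym groups; a minimal sketch:

```python
# Each synonym maps to its replacement word.
replacements = {
    "inevitable": "necessary", "unavoidable": "necessary",
    "certain": "necessary", "sure": "necessary",
}

sen = "this is unavoidable that the kids must be 18"
# dict.get(word, word) keeps any word that has no replacement.
new_sen = " ".join(replacements.get(word, word) for word in sen.split())
print(new_sen)  # this is necessary that the kids must be 18
```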

Print something when a word is in a word list

So I am currently trying to build a Caesar decrypter that automatically tries all the possibilities and compares them to a big list of words to see whether the result contains real words; some sort of dictionary attack, I guess.
I found a list with a lot of German words, and they are even split so that each word is on a new line. Currently I am struggling with comparing my candidate sentence against the whole word list, so that when the program sees that a word in my sentence is also in the word list, it prints out that this is a real word and possibly the right sentence.
This is how far I currently am; I have not included the code that tries all 26 shifts, only my way of looking through the word list and comparing it to a sentence. Maybe someone can tell me what I am doing wrong and why it doesn't work.
I have no idea why it doesn't work. I have also tried regular expressions, but nothing works. The list is really long (166k words).
There is a \n at the end of each word in the list you created from the file, so the entries will never be equal to the words they are compared to.
Remove the newline character before appending, for example: wordlist.append(line.rstrip())
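A minimal sketch of the comparison step, assuming the lines have already been read from the file (a few sample words stand in for the real 166k-word list); storing the words in a set makes the membership test fast:

```python
# Sample lines as they would come from the file, trailing newline included.
lines = ["haus\n", "katze\n", "hund\n"]

# rstrip() removes the trailing newline before the word is stored.
wordlist = set(line.rstrip() for line in lines)

candidate = "die katze ist da"
for word in candidate.split():
    if word in wordlist:
        print(word, "is a real word")  # katze is a real word
```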

Extract non-content English language words string - python [duplicate]

This question already has answers here:
How to remove stop words using nltk or python
(13 answers)
Closed 8 years ago.
I am working on a Python script in which I want to remove common English words like "the", "an", "and", "for", and many more from a string. Currently I have made a local list of all such words, and I just call remove() to remove them from the string. But I would like some more Pythonic way to achieve this. I have read about NLTK and WordNet but am totally clueless about which I should use and how to use it.
Edit
Well, I don't understand why this was marked as a duplicate, as my question does not in any way imply that I already knew about stop words and just wanted to know how to use them. The question is about what I can use in my scenario, and the answer to that was stop words; but when I posted this question I didn't know anything about stop words.
Do this:
vocabular = set(english_dictionary)
unique_words = [word for word in source_text.split() if word not in vocabular]
It is as simple and efficient as can be. If you don't need the positions of the unique words, make them a set too! The operator in is extremely fast on sets (and slow on lists and other containers).
This will also work:
yourString = "an elevator is made for five people and it's fast"
wordsToRemove = ["the ", "an ", "and ", "for "]
for word in wordsToRemove:
    yourString = yourString.replace(word, "")
I have found that what I was looking for is this:
from nltk.corpus import stopwords
my_stop_words = stopwords.words('english')
Now I can remove or replace the words from my list/string wherever I find a match in my_stop_words, which is a list.
For this to work I had to download NLTK for Python and then, using its downloader, download the stopwords package.
It also contains many other packages that can be used in different NLP situations, like words, brown, wordnet, etc.
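The filtering step can be sketched like this; a small hard-coded set stands in here for the full list returned by stopwords.words('english'), so the example runs without downloading the NLTK corpus:

```python
# Small stand-in for the NLTK English stop-word list.
my_stop_words = {"the", "an", "and", "for", "is"}

text = "the script removes common words for the reader"
# Keep only the words that are not stop words (case-insensitive).
filtered = [word for word in text.split() if word.lower() not in my_stop_words]
print(" ".join(filtered))  # script removes common words reader
```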
