Im having an issue limiting characters not words running different solutions as shown below..any help is appreciated!
test_list = ['Running the boy gladly jumped']
test_words = test_list[0].split()
suffix_list = ['ed', 'ly', 'ing']
final_list = []
for word in test_words:
if suffix_list[0] == word[-len(suffix_list[0]):]:
final_list.append(word[0:-len(suffix_list[0])])
elif suffix_list[1] == word[-len(suffix_list[1]):]:
final_list.append(word[0:-len(suffix_list[1])])
elif suffix_list[2] == word[-len(suffix_list[2]):]:
final_list.append(word[0:-len(suffix_list[2])])
else:
final_list.append(word)
final_list = [' '.join(final_list)]
print (final_list)
If you mean to include only the first 8 characters of each word, you can do this with a list comprehension over final_list like so:
final_list = [word[:min(len(word), 8)] for word in final_list
Removes suffixes and limits each result word to limit characters
limit = 8
test_list = ['Running the boy gladly jumped continuously']
test_words = test_list[0].split()
suffix_list = ['ed', 'ly', 'ing']
final_list = []
for word in test_words:
for suffix in suffix_list:
if word.endswith(suffix):
final_list.append(word[:-len(suffix)][:limit])
break
else:
final_list.append(word[:limit])
print(' '.join(final_list))
Prints:
Runn the boy glad jump continuo
You could use splicing to get the first 8 words
final_list = [' '.join(final_list)][:8]
Related
I have a list:
List1 = ['name','is','JOHN','My']
I want to append the pronoun as the first item in a new list and append the names at last. Other items should be in the middle and their positions can change.
So far I have written:
my_list = ['name','is','JOHN','My']
new_list = []
for i in my_list:
if i.isupper():
my_list.remove(i)
new_list.append(i)
print(new_list)
Here, I can't check if an item is completely upper case or only its first letter is upper case.
Output I get:
['name','is','JOHN','My']
Output I want:
['My','name','is','JOHN']
or:
['My','is','name','JOHN']
EDIT: I have seen this post and it doesn’t have answers to my question.
i.isupper() will tell you if it's all uppercase.
To test if just the first character is uppercase and the rest lowercase, you can use i.istitle()
To make your final result, you can append to different lists based on the conditions.
all_cap = []
init_cap = []
non_cap = []
for i in my_list:
if i.isupper():
all_cap.append(i)
elif i.istitle():
init_cap.append(i)
else:
non_cap.append(i)
new_list = init_cap + non_cap + all_cap
print(new_list)
DEMO
How about this:
s = ['name', 'is', 'JOHN', 'My']
pronoun = ''
name = ''
for i in s:
if i.isupper():
name = i
if i.istitle():
pronoun = i
result = [pronoun, s[0], s[1], name]
print(result)
Don't # me pls XD. Try this.
my_list = ['name','is','JOHN','My']
new_list = ['']
for i in range(len(my_list)):
if my_list[i][0].isupper() and my_list[i][1].islower():
new_list[0] = my_list[i]
elif my_list[i].islower():
new_list.append(my_list[i])
elif my_list[i].isupper():
new_list.append(my_list[i])
print(new_list)
I am working on a project and I want to write a code, that would find words containing only certain letters in a sentence and then return them (print them out).
sentence = "I am asking a question on Stack Overflow"
lst = []
# this gives me a list of all words in a sentence
change = sentence.split()
# NOTE: I know this isn't correct syntax, but that's basically what I want to do.
lst.append(only words containing "a")
print(lst)
Now the part I am struggeling with is, how do I append only words containig letter "a" for example?
you can act like this:
words = sentence.split()
lst = [word for word in words if 'a' in word]
print(lst)
# ['am', 'asking', 'a', 'Stack']
Try this! I hope it's well understood!
sentence = "I am asking a question on Stack Overflow"
lst = []
change = sentence.split()
#we are going to check in every word of the sentence, if letter 'a' is in it.
for a in change:
if 'a' in a:
print(a+" has an a! ")
lst.append(a)
print(lst)
This will output:
['am', 'asking', 'a', 'Stack']
Write a program that asks a user for a file name, then reads in the file. The program should then determine how frequently each word in the file is used. The words should be counted regardless of case, for example Spam and spam would both be counted as the same word. You should disregard punctuation. The program should then output the the words and how frequently each word is used. The output should be sorted by the most frequent word to the least frequent word.
Only problem I am having is getting the code to count "The" and "the" as the same thing. The code counts them as different words.
userinput = input("Enter a file to open:")
if len(userinput) < 1 : userinput = 'ran.txt'
f = open(userinput)
di = dict()
for lin in f:
lin = lin.rstrip()
wds = lin.split()
for w in wds:
di[w] = di.get(w,0) + 1
lst = list()
for k,v in di.items():
newtup = (v, k)
lst.append(newtup)
lst = sorted(lst, reverse=True)
print(lst)
Need to count "the" and "The" as on single word.
We start by getting the words in a list, updating the list so that all words are in lowercase. You can disregard punctuation by replacing them from the string with an empty character
punctuations = '!"#$%&\'()*+,-./:;<=>?#[\\]^_`{|}~'
s = "I want to count how many Words are there.i Want to Count how Many words are There"
for punc in punctuations:
s = s.replace(punc,' ')
words = s.split(' ')
words = [word.lower() for word in words]
We then iterate through the list, and update a frequency map.
freq = {}
for word in words:
if word in freq:
freq[word] += 1
else:
freq[word] = 1
print(freq)
#{'i': 2, 'want': 2, 'to': 2, 'count': 2, 'how': 2, 'many': 2,
#'words': 2, 'are': #2, 'there': 2}
You can use counter and re like this,
from collections import Counter
import re
sentence = 'Egg ? egg Bird, Goat afterDoubleSpace\nnewline'
# some punctuations (you can add more here)
punctuationsToBeremoved = ",|\n|\?"
#to make all of them in lower case
sentence = sentence.lower()
#to clean up the punctuations
sentence = re.sub(punctuationsToBeremoved, " ", sentence)
# getting the word list
words = sentence.split()
# printing the frequency of each word
print(Counter(words))
I'm trying to process a list of words and return a new list
containing only unique word. My definite loop works, however it will only print the words all together, instead of one per line. Can anyone help me out? This is probably a simple question but I am very new to Python. Thank you!
uniqueWords = [ ]
for word in allWords:
if word not in uniqueWords:
uniqueWords.append(word)
else:
uniqueWords.remove(word)
return uniqueWords
You can use str.join:
>>> all_words = ['two', 'two', 'one', 'uno']
>>> print('\n'.join(get_unique_words(all_words)))
one
uno
Or plain for loop:
>>> for word in get_unique_words(all_words):
... print(word)
...
one
uno
However, your method won't work for odd counts:
>>> get_unique_words(['three', 'three', 'three'])
['three']
If your goal is to get all words that appear exactly once, here's a shorter method that works using collections.Counter:
from collections import Counter
def get_unique_words(all_words):
return [word for word, count in Counter(all_words).items() if count == 1]
This code may help, it prints unique words line by line, is what I understood in your question:
allWords = ['hola', 'hello', 'distance', 'hello', 'hola', 'yes']
uniqueWords = [ ]
for word in allWords:
if word not in uniqueWords:
uniqueWords.append(word)
else:
uniqueWords.remove(word)
for i in uniqueWords:
print i
If the order of the words is not important I recommend you to create a set to store the unique words:
uniqueWords = set(allWords)
As you can see running the code below, it can be much faster, but it may depend on the original list of words:
import timeit
setup="""
word_list = [str(x) for x in range(1000, 2000)]
allWords = []
for word in word_list:
allWords.append(word)
allWords.append(word)
"""
smt1 = "unique = set(allWords)"
smt2 = """
uniqueWords = [ ]
for word in allWords:
if word not in uniqueWords:
uniqueWords.append(word)
else:
uniqueWords.remove(word)
"""
print("SET:", timeit.timeit(smt1, setup, number=1000))
print("LOOP:", timeit.timeit(smt2, setup, number=1000))
OUTPUT:
SET: 0.03147706200002176
LOOP: 0.12346845000001849
maybe this fits your idea:
allWords=['hola', 'hello', 'distance', 'hello', 'hola', 'yes']
uniqueWords=dict()
for word in allWords:
if word not in uniqueWords:
uniqueWords.update({word:1})
else:
uniqueWords[word]+=1
for k, v in uniqueWords.items():
if v==1:
print(k)
Prints:
distance
yes
The following code allow me to check if there is only one element of the lists that is in ttext.
from itertools import product, chain
from string import punctuation
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = [list1, list2, list3]
def test(l, tt):
counts = {word.strip(punctuation):0 for word in tt.split()}
for word in chain(*product(*l)):
if word in counts:
counts[word] += 1
if sum(v > 1 for v in counts.values()) > 1:
return False
return True
Output:
In [16]: ttext = 'hello my name is brian'
In [17]: test(l,ttext)
Out[17]: True
In [18]: ttext = 'hello how are you?'
In [19]: test(l,ttext)
Out[19]: False
Now, how can i do the same if i have space in the elements of the lists, "I have", "you are" and "he is"?
You could add a list comprehension that goes through and splits all the words:
def test(l, tt):
counts = {word.strip(punctuation):0 for word in tt.split()}
splitl = [[word for item in sublist for word in item.split(' ')] for sublist in l]
for word in chain(*product(*splitl)):
if word in counts:
counts[word] += 1
if sum(v > 1 for v in counts.values()) > 1:
return False
return True
You can simplify a lot by just concatenating the lists using '+' rather than having a list of lists. This code also words if the string has spaces in it.
import string
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = list1 + list2 + list3
def test(l, tt):
count = 0
for word in l:
#set of all punctuation to exclude
exclude = set(string.punctuation)
#remove punctuation from word
word = ''.join(ch for ch in word if ch not in exclude)
if word in tt:
count += 1
if count > 1:
return False
else:
return True
You could just split all the list input by iterating through it. Something like:
words=[]
for list in l:
for word in list:
string=word.split()
words.append(string)
You may consider using sets for this kind of processing.
Here is a quick implementation :
from itertools import chain
from string import punctuation
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = list(chain(list1, list2, list3))
words = set(w.strip(punctuation) for word in l for w in word.split()) # 1
def test(words, text):
text_words = set(word.strip(punctuation) for word in text.split()) # 2
return len(words & text_words) == 1 # 3
Few comments:
Double for-loop on intentions works, you get a list of the words. The set make sure each word is unique.
Same thing on the input sentence
Using set intersection to get all words in the sentence that are also in your search set. Then using the length of this set to see if there is only one.
Well, first, lets rewrite the function to be more natural:
from itertools import chain
def only_one_of(lists, sentence):
found = None
for item in chain(*lists):
if item in sentence:
if found: return False
else: found = item
return True if found not is None else False
This already works with your constrains as it is only looking for some string item being a substring of sentence. It does not matter if it includes spaces or not. But it may lead to unexpected results. Imagine:
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = [list1, list2, list3]
only_one_of(l, 'Cadabra')
This returns True because abra is a substring of Cadabra. If this is what you want, then you're done. But if not, you need to redefine what item in sentence really means. So, let's redefine our function:
def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
found = None
for item in chain(*lists):
if is_in(item, sentence):
if found: return False
else: found = item
return True if found not is None else False
Now the last parameter expects to be a function to be applied to two strings that return True if the first is found in the second or False, elsewhere.
You usually want to check if the item is inside the sentence as a word (but a word that can contain spaces in the middle) so let's use regular expressions to do that:
import re
def inside(string, sentence):
return re.search(r'\b%s\b' % string, sentence)
This function returns True when string is in sentence but considering string as a word (the special sequence \b in regular expression stands for word boundary).
So, the following code should pass your constrains:
import re
from itertools import chain
def inside(string, sentence):
return re.search(r'\b%s\b' % string, sentence)
def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
found = None
for item in chain(*lists):
if is_in(item, sentence):
if found: return False
else: found = item
return True if found not is None else False
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
list4 = ['I have', 'you are', 'he is']
l = [list1, list2, list3, list4]
only_one_of(l, 'hello my name is brian', inside) # True
only_one_of(l, 'hello how are you?', inside) # False
only_one_of(l, 'Cadabra', inside) # False
only_one_of(l, 'I have a sister', inside) # True
only_one_of(l, 'he is my ex-boyfriend', inside) # False, ex and boyfriend are two words
only_one_of(l, 'he is my exboyfriend', inside) # True, exboyfriend is only one word