I'm new to Python and wanted to try a list comprehension, but the result I get is a list of None values.
wordlist = ['cat', 'dog', 'rabbit']
letterlist = []
letterlist = [letterlist.append(letter) for word in wordlist for letter in word if letter not in letterlist]
print(letterlist)
# output i get: [None, None, None, None, None, None, None, None, None]
# expected output: ['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
Why is that? It seems to work partly, because I get the expected number of items (9), but all of them are None.
list.append(element) returns None; it appends the element to the list in place, so your comprehension collects those None return values.
Your code could be rewritten as:
wordlist = ['cat', 'dog', 'rabbit']
letterlist = [letter for word in wordlist for letter in word]
letterlist = list(set(letterlist))
print(letterlist)
… if you really want to use a list comprehension, or:
wordlist = ['cat', 'dog', 'rabbit']
letterset = set()
for word in wordlist:
    letterset.update(word)
print(letterset)
… which is arguably clearer. Both of these assume order doesn’t matter. If it does, you could use OrderedDict:
from collections import OrderedDict
letterlist = list(OrderedDict.fromkeys("".join(wordlist)).keys())
print(letterlist)
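To see why the original comprehension collected None values, and as one order-preserving alternative, here is a small sketch. It assumes Python 3.7 or later, where a plain dict preserves insertion order; the helper name unique_letters is made up for illustration.

```python
def unique_letters(words):
    """Return the distinct letters of all words, in first-seen order."""
    # dict.fromkeys keeps only the first occurrence of each key, and
    # dicts preserve insertion order on Python 3.7+ (assumption).
    return list(dict.fromkeys("".join(words)))


wordlist = ['cat', 'dog', 'rabbit']

# list.append mutates the list and returns None, which is what the
# original comprehension ended up collecting.
scratch = []
print(scratch.append('c'))   # None
print(scratch)               # ['c']

print(unique_letters(wordlist))
# ['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
```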
list.append returns None. You need to adjust the expression in the list comprehension to return letters.
wordlist = ['cat', 'dog', 'rabbit']
letterset = set()
letterlist = [(letterset.add(letter), letter)[1]
              for word in wordlist
              for letter in word
              if letter not in letterset]
print(letterlist)
If order doesn't matter, do this:
resultlist = list({i for word in wordlist for i in word})
Related
I have been looking for an answer to this for a while but keep finding answers about stripping a specific string from a list.
Let's say this is my list of strings
stringList = ["cat\n","dog\n","bird\n","rat\n","snake\n"]
But all list items contain a new line character (\n)
How can I remove this from all the strings within the list?
Use a list comprehension with rstrip():
stringList = ["cat\n","dog\n","bird\n","rat\n","snake\n"]
output = [x.rstrip() for x in stringList]
print(output) # ['cat', 'dog', 'bird', 'rat', 'snake']
If you really want to target a single newline character only at the end of each string, then we can get more precise with re.sub:
import re
stringList = ["cat\n","dog\n","bird\n","rat\n","snake\n"]
# \Z matches only at the very end of the string, so exactly one newline is removed
output = [re.sub(r'\n\Z', '', x) for x in stringList]
print(output) # ['cat', 'dog', 'bird', 'rat', 'snake']
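The approaches differ on inputs with extra trailing whitespace or more than one newline. The following sketch uses made-up sample strings (not from the question) to show what each variant keeps:

```python
samples = ["cat \n", "dog\n\n", "bird"]

# rstrip() with no argument removes all trailing whitespace.
all_whitespace = [s.rstrip() for s in samples]

# rstrip("\n") removes trailing newlines only, but all of them.
newlines_only = [s.rstrip("\n") for s in samples]

# Slicing after an endswith check removes at most one newline.
one_newline = [s[:-1] if s.endswith("\n") else s for s in samples]

print(all_whitespace)  # ['cat', 'dog', 'bird']
print(newlines_only)   # ['cat ', 'dog', 'bird']
print(one_newline)     # ['cat ', 'dog\n', 'bird']
```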
Apply str.strip (or str.rstrip) to every item of the list with map:
out = list(map(str.strip, stringList))
print(out)
or with a more rudimentary check and slice:
strip_char = '\n'
out = [s[:-len(strip_char)] if s.endswith(strip_char) else s for s in stringList]
print(out)
Since you can use an if to check whether a newline character exists in a string, you can use the code below to find the strings that contain one and replace it with an empty string. Strings without a newline are kept unchanged:
stringList = ["cat\n","dog\n","bird\n","rat\n","snake\n"]
nlist = []
for string in stringList:
    if "\n" in string:
        nlist.append(string.replace("\n", ""))
    else:
        nlist.append(string)
print(nlist)
You could also use map() along with str.rstrip:
>>> string_list = ['cat\n', 'dog\n', 'bird\n', 'rat\n', 'snake\n']
>>> new_string_list = list(map(str.rstrip, string_list))
>>> new_string_list
['cat', 'dog', 'bird', 'rat', 'snake']
I have a list and would like to print all words after the 4th position using Python, with ".com" appended to each of them.
Example
my_list = ['apple', 'ball', 'cat', 'dog', 'egg', 'fish', 'rat']
From the above I would like to print the value from 'egg' onwards, i.e: egg.com, fish.com, rat.com
Just do this:
for i in my_list[4:]:  # index 4 is 'egg'
    print(i + '.com')
That is it.
Code
def get_words(lst, word):
    """Returns string of words starting from a particular word in list lst"""
    # Use lst.index to find index of word in list
    # slice (i.e. lst[lst.index(word):]) for sublist of words from word in list
    # list comprehension to add '.com' to each word starting at index
    # join to concatenate words
    if word in lst:
        return ', '.join([x + '.com' for x in lst[lst.index(word):]])
Usage
my_list = ['apple', 'ball', 'cat', 'dog', 'egg', 'fish', 'rat']
print(get_words(my_list, 'egg')) # egg.com, fish.com, rat.com
print(get_words(my_list, 'dog')) # dog.com, egg.com, fish.com, rat.com
print(get_words(my_list, 'pig')) # None
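If the starting point is known by position rather than by value, slicing alone is enough. A minimal sketch, assuming a zero-based start index; the function name suffix_from and its parameters are hypothetical:

```python
def suffix_from(items, start, suffix='.com'):
    """Return items[start:] with suffix appended to each element."""
    # Slicing past the end of a list yields an empty list, so an
    # out-of-range start simply produces no output.
    return [item + suffix for item in items[start:]]


my_list = ['apple', 'ball', 'cat', 'dog', 'egg', 'fish', 'rat']
print(', '.join(suffix_from(my_list, 4)))   # egg.com, fish.com, rat.com
print(suffix_from(my_list, 10))             # []
```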
I'm supposed to:
"Write a function search(words, start_chs) that returns a list of all words in the list words whose first letter is in the list start_chs.
If either words or start_chs is empty, return an empty list.
Since the returned list can have many words in it, the words should be arranged in the order that they originally appeared in."
e.g.
words = ['green', 'grasses', 'frolicking', 'cats', '', 'kittens', 'playful']
start_chs = ['a', 'c', 'g']
new_words = search(words, start_chs) # ['green', 'grasses', 'cats']
My code thus far is below, however, it doesn't return all the correct outputs, only some of them.
def search(words, start_chs):
    k=0
    p=[]
    if (len(words)== 0) or (len(start_chs)== 0):
        return ([])
    while k < len(start_chs):
        if start_chs[k][0] == words[k][0]:
            p.append(words[k])
        k+=1
    return(p)
You can try this:-
def search(words, start_chs):
    res = []
    for word in words:
        if word!='' and word[0] in start_chs:
            res.append(word)
    return res
search(words, start_chs)
output:
['green', 'grasses', 'cats']
It should look like the code below. Iterating over a list by index is usually unnecessary in Python; use for element in list instead.
As for why your code doesn't work: you need to iterate over the whole words list. In your version you only check whether 'a' equals the first letter of 'green', 'c' that of 'grasses', and 'g' that of 'frolicking'.
words = ['green', 'grasses', 'frolicking', 'cats', '', 'kittens', 'playful']
start_chs = ['a', 'c', 'g']
def search(words, start_chs):
    result = []
    if len(words)== 0 or len(start_chs)== 0:
        return []
    for word in words:
        if word and word[0] in start_chs:
            result.append(word)
    return result
print(search(words, start_chs))
The problem with your code is that you only compare each word with the character at the same index in start_chs. Use the in operator to check the first letter against the whole list instead, and skip empty strings so that i[0] cannot raise an IndexError:
def search(words, start_chs):
    p=[]
    if (len(words)== 0) or (len(start_chs)== 0):
        return []
    for i in words:
        if i and i[0] in start_chs:
            p.append(i)
    return p
Try this:
def search(words, start_chs):
    start_chs = tuple(start_chs)
    return [word for word in words if word.startswith(start_chs)]
words = ['green', 'grasses', 'frolicking', 'cats', '', 'kittens', 'playful']
start_chs = ['a', 'c', 'g']
new_words = search(words, start_chs)
#['green', 'grasses', 'cats']
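For long start_chs lists, each membership test against a list is linear in its length. One possible variation, shown as a sketch rather than a required change, converts start_chs to a set first. The name search_with_set is made up here, and the code assumes start_chs holds single characters.

```python
def search_with_set(words, start_chs):
    """Return the words whose first letter is in start_chs, in order."""
    allowed = set(start_chs)  # average O(1) membership test
    # word[:1] is '' for an empty word, so it does not match any
    # single-character entry and never raises IndexError.
    return [word for word in words if word[:1] in allowed]


words = ['green', 'grasses', 'frolicking', 'cats', '', 'kittens', 'playful']
start_chs = ['a', 'c', 'g']
print(search_with_set(words, start_chs))   # ['green', 'grasses', 'cats']
print(search_with_set(words, []))          # []
```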
Right now I have a list of for example
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
I want to remove the words with repeated letters; in this case, that means removing
'aa','aac','bbb','bcca','ffffff'
Maybe import re?
Thanks to this thread: Regex to determine if string is a single repeating character
Here is the re version, but I would stick to PM2 ring and Tameem's solutions if the task was as simple as this:
import re
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
[i for i in data if not re.search(r'^(.)\1+$', i)]
Output
['dog', 'cat', 'a', 'aac', 'bcca']
And the other:
import re
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
[i for i in data if not re.search(r'((\w)\2{1,})', i)]
Output
['dog', 'cat', 'a']
A loop is the way to go. Forget about sets for now, as they do not work for words with repeated letters.
Here is a function you can use to determine, in a single loop, whether a word is valid (has no two identical adjacent letters):
def is_valid(word):
    last_char = None
    for i in word:
        if i == last_char:
            return False
        last_char = i
    return True
Example
In [28]: is_valid('dogo')
Out[28]: True
In [29]: is_valid('doo')
Out[29]: False
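Applied to the list from the question, a filter built on this check could look like the following sketch. The function is repeated so the snippet runs on its own.

```python
def is_valid(word):
    """Return True if no two adjacent characters of word are equal."""
    last_char = None
    for ch in word:
        if ch == last_char:
            return False
        last_char = ch
    return True


data = ['dog', 'cat', 'a', 'aa', 'aac', 'bbb', 'bcca', 'ffffff']
print([w for w in data if is_valid(w)])   # ['dog', 'cat', 'a']
```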
The original version of this question wanted to drop words that consist entirely of repetitions of a single character. An efficient way to do this is to use sets. We convert each word to a set, and if it consists of only a single character the length of that set will be 1. If that's the case, we can drop that word, unless the original word consisted of a single character.
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
newdata = [s for s in data if len(s) == 1 or len(set(s)) != 1]
print(newdata)
output
['dog', 'cat', 'a', 'aac', 'bcca']
Here's code for the new version of your question, where you want to drop words that contain any repeated characters. This one's simpler, because we don't need to make a special test for one-character words.
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
newdata = [s for s in data if len(set(s)) == len(s)]
print(newdata)
output
['dog', 'cat', 'a']
If the repetitions have to be consecutive, we can handle that using groupby.
from itertools import groupby
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff', 'abab', 'wow']
newdata = [s for s in data if max(len(list(g)) for _, g in groupby(s)) == 1]
print(newdata)
output
['dog', 'cat', 'a', 'abab', 'wow']
Here's a way to check if there are consecutive repeated characters:
def has_consecutive_repeated_letters(word):
    return any(c1 == c2 for c1, c2 in zip(word, word[1:]))
You can then use a list comprehension to filter your list:
words = ['dog','cat','a','aa','aac','bbb','bcca','ffffff', 'abab', 'wow']
[word for word in words if not has_consecutive_repeated_letters(word)]
# ['dog', 'cat', 'a', 'abab', 'wow']
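To make the zip trick concrete, this sketch prints the adjacent pairs that get compared. The helper name adjacent_pairs exists only for illustration.

```python
def adjacent_pairs(word):
    """Return the list of (current, next) character pairs in word."""
    # zip stops at the shorter input, so word[1:] sets the length.
    return list(zip(word, word[1:]))


print(adjacent_pairs('bcca'))  # [('b', 'c'), ('c', 'c'), ('c', 'a')]
print(adjacent_pairs('a'))     # []
print(any(c1 == c2 for c1, c2 in adjacent_pairs('bcca')))  # True
print(any(c1 == c2 for c1, c2 in adjacent_pairs('wow')))   # False
```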
One line is all it takes :)
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
data = [value for value in data if(len(set(value))!=1 or len(value) ==1)]
print(data)
Output
['dog', 'cat', 'a', 'aac', 'bcca']
I have the following list of letters:
letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']
And following list of words:
words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']
How can I check in Python which of the words can be built from the list of letters? Just by looking at it, we can see that 'yutz' and 'tux' are the only two words that can be built from the letters we have.
I'm new to Python and I tried several different for loops, but I didn't get anywhere.
for word in words:
    for i in letters:
        if i in word:
            print(word)
        else:
            print('not in word')
And the result is a disaster, as you can imagine.
You need to look at your problem in terms of sets. Any word from your words list that is a subset of your set of letters can be formed by those letters. Put differently, letters needs to be a superset of the word:
letters = {'t', 'u', 'v', 'w', 'x', 'y', 'z'} # a set, not a list
for word in words:
    if letters.issuperset(word):
        print(word)
The set.issuperset() method returns True if all elements of the iterable argument are in the set.
If you wanted a list, just use a list comprehension:
[word for word in words if letters.issuperset(word)]
Demo:
>>> words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']
>>> letters = {'t', 'u', 'v', 'w', 'x', 'y', 'z'} # a set, not a list
>>> [word for word in words if letters.issuperset(word)]
['yutz', 'tux']
Note that this only looks at unique letters. apple is a subset of the letters set {'a', 'p', 'l', 'e'}. If you need to handle letter counts too, you need to use a multiset; Python has an implementation called collections.Counter(). This keeps track not only of the letters, but also of their counts.
Before Python 3.10, the Counter type didn't support testing for sub- or supersets with comparison operators, so you have to use subtraction instead (which works on every version); if an empty Counter() is produced, the whole word can be formed from the letter counts:
from collections import Counter
letters = Counter(['a', 'p', 'l', 'e', 'p', 'i'])
words = ['apple', 'applepie']
for word in words:
    if not Counter(word) - letters:
        print(word)
or as a list comprehension:
[word for word in words if not Counter(word) - letters]
which produces ['apple'], as there is only a single 'e' in the input letter multi-set, and only 2 'p's, not 3.
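The subtraction test can be wrapped in a small helper. This is only a sketch of the idea described above; the function name can_build and its parameters are made up for illustration.

```python
from collections import Counter


def can_build(word, available):
    """Return True if word can be spelled from the letters in available,
    using each available letter at most once."""
    # Counter subtraction drops zero and negative counts, so the
    # result is empty exactly when no letter is missing.
    missing = Counter(word) - Counter(available)
    return not missing


letters = ['a', 'p', 'l', 'e', 'p', 'i']
print(can_build('apple', letters))      # True
print(can_build('applepie', letters))   # False
```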
You can use set.difference here:
r = [w for w in words if not set(w).difference(letters)]
r
['yutz', 'tux']
set(w).difference(letters) returns the characters of w that are not in letters. If the result is an empty set, every character in w belongs to letters. An empty set is falsy, so not ... evaluates to True and the word is kept. This is equivalent to:
for w in words:
    if not set(w).difference(letters):
        print(w)
yutz
tux
This is similar to testing with set.issuperset, but approaches the problem from a different perspective.
You can use the all function with a generator expression to check whether every character of each word in words exists in letters:
letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']
words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']
final_words = [i for i in words if all(c in letters for c in i)]
Output:
['yutz', 'tux']
You can use itertools.permutations:
In one line (after import itertools, with words and letters defined as below):
print(set(["".join(permutation) for item in words for permutation in itertools.permutations(letters,len(item)) if "".join(permutation) in words ]))
Detailed solution:
The list comprehension above is equivalent to:
words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']
letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']
import itertools
final=[]
for i in words:
    for k in itertools.permutations(letters,len(i)):
        if "".join(k) in words and "".join(k) not in final:
            final.append("".join(k))
print(final)
output:
['yutz', 'tux']
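The permutation approach generates many candidate strings, and because each permutation uses every letter at most once, it also rejects words that repeat a letter. The sketch below counts the candidates for one word length and checks that, for this input, the result matches a set-based test with an added no-repeats condition. The variable names are made up for illustration.

```python
import itertools

letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']
words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']

# Number of candidate tuples generated for a 4-letter word: 7*6*5*4.
candidates = sum(1 for _ in itertools.permutations(letters, 4))
print(candidates)  # 840

# A word matches some permutation if its letters, as a tuple, appear
# among the permutations of the same length.
by_permutation = [w for w in words
                  if tuple(w) in set(itertools.permutations(letters, len(w)))]

# Same condition without enumeration, assuming letters has no
# duplicates: no repeated letter, and every letter is available.
letter_set = set(letters)
by_sets = [w for w in words
           if len(set(w)) == len(w) and letter_set.issuperset(w)]

print(by_permutation)  # ['yutz', 'tux']
print(by_sets)         # ['yutz', 'tux']
```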