Assuming a list as follows:
list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
and a sub string
to_find = 'seos'
I would like to find the string(s) in list_of_strings that:
Have the same length as to_find
Have the same characters as to_find (irrespective of the order of the characters)
The output for list_of_strings should be ['sseo', 'oess'] (since each has all the letters from to_find and a length of 4)
I have:
import itertools
list_of_strings = [string for string in list_of_strings if len(string) == len(to_find)]
result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]
To see how long the code takes to run, I did:
import timeit
timeit.timeit("[string for string in list_of_strings if any(''.join(perm) in string for perm in itertools.permutations(to_find))]",
              setup='from __main__ import list_of_strings, to_find, itertools', number=100000)
The process takes a while to give the output. I am guessing it is because of the use of itertools.permutations.
Is there a way I can make this code more efficient?
Thanks
If order doesn't matter, you can just sort the strings and compare the resulting lists:
list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
to_find = sorted('seos')
matches = [word for word in list_of_strings if sorted(word) == to_find]
This should work because Counter creates a dict-like object that counts how often each character occurs in a string, and the aim is to match the letters and their counts irrespective of their order.
from collections import Counter
to_find_counter = Counter(to_find)
# go through the list and check if the Counter is the same as the Counter of to_find
[x for x in list_of_strings if Counter(x)==to_find_counter]
['sseo', 'oess']
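As a minimal sketch, the Counter comparison above can be wrapped in a small helper. The name find_anagrams is hypothetical and not part of the original answers, and the explicit length check is only an assumed shortcut to skip building a Counter for words that cannot match.

```python
from collections import Counter

def find_anagrams(candidates, target):
    """Return the candidates that use exactly the same letters as target."""
    target_counter = Counter(target)
    return [
        word for word in candidates
        # Cheap length check first, so a Counter is only built for plausible matches.
        if len(word) == len(target) and Counter(word) == target_counter
    ]

list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
print(find_anagrams(list_of_strings, 'seos'))  # ['sseo', 'oess']
```

Unlike the itertools.permutations version, this does a fixed amount of work per candidate word and never enumerates the orderings of to_find.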
I have a string of comma-separated words. I want to find the longest word in it.
words = 'run,barn,abcdefghi,yellow,barracuda,shark,fish,swim'
What I did so far
print(max(words.split(','), key=len))
This prints abcdefghi, but abcdefghi and barracuda have the same length. Why am I getting only one of them instead of both?
Also
words = 'fishes,sam,gollum,sauron,frodo,balrog'
In the string above, many words have the same length. I want to return every one of them.
You can pair each word with its length using zip, group the words by length in a dict, and return the list stored under the largest length, like below:
>>> from collections import defaultdict
>>> words = 'run,barn,abcdefghi,yellow,barracuda,shark,fish,swim'
>>> dct = defaultdict(list)
>>> lstWrdSplt = words.split(',')
>>> for word, length in zip(lstWrdSplt, map(len, lstWrdSplt)):
...     dct[length].append(word)
...
>>> dct[max(dct)]
['abcdefghi', 'barracuda']
# for more explanation
>>> dct
defaultdict(list,
            {3: ['run'],
             4: ['barn', 'fish', 'swim'],
             9: ['abcdefghi', 'barracuda'],
             6: ['yellow'],
             5: ['shark']})
You can turn this into a function and use a regex to pick out only the words, like below:
from collections import defaultdict
import re
def mxLenWord(words):
    dct = defaultdict(list)
    # \w+ matches runs of word characters, so commas and spaces are skipped
    lstWrdSplt = re.findall(r'\w+', words)
    for word, length in zip(lstWrdSplt, map(len, lstWrdSplt)):
        dct[length].append(word)
    return dct[max(dct)]
words = 'rUnNiNg ,swimming, eating,biking, climbing'
mxLenWord(words)
Output:
['swimming', 'climbing']
Try the below
from collections import defaultdict
data = defaultdict(list)
words = 'run,barn,abcdefghi,yellow,barracuda,shark,fish,swim'
for w in words.split(','):
    data[len(w)].append(w)
word_len = sorted(data.keys(), reverse=True)
for wlen in word_len:
    print(f'{wlen} -> {data[wlen]}')
output
9 -> ['abcdefghi', 'barracuda']
6 -> ['yellow']
5 -> ['shark']
4 -> ['barn', 'fish', 'swim']
3 -> ['run']
Many of the suggested methods seem too complicated for such an easy task. You can solve it using a combination of sorted() and groupby():
from itertools import groupby
words = 'run,barn,abcdefghi,yellow,barracuda,shark,fish,swim'
_, (*longest,) = next(groupby(sorted(words.split(","), key=len, reverse=True), len))
print(longest)
To group all words by length, you can use the following one-liner:
from itertools import groupby
words = 'fishes,sam,gollum,sauron,frodo,balrog'
words_len = {l: list(w) for l, w in groupby(sorted(words.split(","), key=len), len)}
print(words_len)
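For comparison, here is a minimal two-pass sketch that avoids sorting and grouping: first find the maximum length, then keep every word of that length. The function name longest_words is hypothetical and is not taken from any of the answers above.

```python
def longest_words(text, sep=','):
    """Return every word in text whose length equals the maximum length."""
    parts = text.split(sep)
    max_len = max(map(len, parts))  # first pass: find the largest length
    return [w for w in parts if len(w) == max_len]  # second pass: keep ties

print(longest_words('run,barn,abcdefghi,yellow,barracuda,shark,fish,swim'))
# ['abcdefghi', 'barracuda']
print(longest_words('fishes,sam,gollum,sauron,frodo,balrog'))
# ['fishes', 'gollum', 'sauron', 'balrog']
```

This also shows why max() alone returned a single word: it returns the first maximal item it encounters, so ties have to be collected explicitly.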
How can I choose multiple random elements from a list? I searched the internet but couldn't find anything.
words=["ar","aba","oto","bus"]
You could achieve that with random.sample():
from random import sample
words = ["ar", "aba", "oto", "bus"]
selected = sample(words, 2)
That would select 2 words randomly from the words list.
You can check Python docs for more details.
I would suggest this:
import random as rd
words=["ar","aba","oto","bus"]
random_words = [word for word in words if rd.random()>1/2]
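You can replace 1/2 with any threshold t between 0 and 1; each word is then kept with probability 1 - t. Note that every word is kept or dropped independently, so the number of selected words varies from run to run.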
Use the random module.
Here is an example.
random.choice returns a single random element:
>>> import random
>>> words=["ar","aba","oto","bus"]
>>> print(random.choice(words))
ar
>>> print(random.choice(words))
ar
>>> print(random.choice(words))
oto
>>> print(random.choice(words))
aba
>>> print(random.choice(words))
ar
>>> print(random.choice(words))
bus
random.sample takes an extra argument, the number of elements to pick, and returns them as a list:
>>> print(random.sample(words, 3))
['bus', 'ar', 'oto']
>>> print(random.sample(words, 3))
['ar', 'oto', 'aba']
>>> print(random.sample(words, 2))
['aba', 'bus']
>>> print(random.sample(words, 2))
['ar', 'aba']
>>> print(random.sample(words, 1))
['ar']
>>> print(random.sample(words, 1))
['ar']
>>> print(random.sample(words, 1))
['oto']
>>> print(random.sample(words, 1))
['bus']
You can use the random library
Method 1 - random.choice()
from random import choice
words=["ar","aba","oto","bus"]
word = choice(words)
print(word)
Method 2 - Generate Random Index
from random import randint
words=["ar","aba","oto","bus"]
ind = randint(0, len(words)-1)
word = words[ind]
print(word)
Method 3 - Select Multiple Items
from random import choices
words=["ar","aba","oto","bus"]
selected = choices(words, k=2) # k is the number of elements to select; choices() picks with replacement, so the same word can appear twice
print(selected)
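Since both random.sample() and random.choices() appear in the answers above, a short side-by-side sketch may help when deciding between them. The variable names are illustrative only, and the printed output is random, so none is shown.

```python
import random

words = ["ar", "aba", "oto", "bus"]

# sample(): picks k distinct positions, so no element is chosen twice.
# k must not exceed len(words), otherwise ValueError is raised.
without_replacement = random.sample(words, k=2)

# choices(): every pick is independent, so repeats are possible
# and k may be larger than len(words).
with_replacement = random.choices(words, k=6)

print(without_replacement)
print(with_replacement)
```

If the goal is "several different words from the list", sample() is usually the better fit; choices() suits repeated independent draws.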
I'd like to split the string u'123K into 123 and K. I've tried re.match("u'123K", "\d+") to match the number and re.match("u'123K", "K") to match the letter but they don't work. What is a Pythonic way to do this?
Use re.findall() to find all numbers and characters:
>>> import re
>>> s = u'123K'
>>> re.findall(r'\d+|[a-zA-Z]+', s) # or use r'\d+|\D+' as mentioned in comment in order to match all numbers and non-numbers.
['123', 'K']
If you are just dealing with this string, or if you only want to split off the last character, you can simply use indexing:
num, character = s[:-1], s[-1:]
You can also use the itertools.groupby function, grouping consecutive characters by whether they are digits:
>>> import itertools as it
>>> for _, v in it.groupby(s, key=str.isdigit):
...     print(''.join(v))
...
123
K
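The attempts in the question fail because re.match() expects the pattern first and the string second, and they pass them the other way round. A minimal sketch with the arguments in the right order, using capture groups to get both parts in one call (the variable names number and letters are illustrative):

```python
import re

s = u'123K'

# Pattern comes first, the string to search comes second.
# fullmatch() requires the whole string to be digits followed by letters.
match = re.fullmatch(r'(\d+)([A-Za-z]+)', s)
if match:
    number, letters = match.groups()
    print(number, letters)  # 123 K
```

Checking the match object before calling groups() avoids an AttributeError on input that does not fit the digits-then-letters shape.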
Is there a pattern I can use to generate strings that are palindromes made up of 'X' and 'Y'?
Let's assume n is even. Generate every string of length n/2 that consists of x and y, and append its mirror image to get a palindrome.
Exercise 1: prove that this generates all palindromes of length n.
Exercise 2: figure out what to do when n is odd.
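The mirroring idea can be sketched as follows. This is one possible reading of the hint, not the answerer's own code: the function name palindromes is hypothetical, and the handling of odd n (inserting every possible middle character) is an assumption about what Exercise 2 is aiming at.

```python
from itertools import product

def palindromes(n, alphabet="xy"):
    """Build every palindrome of length n over alphabet by mirroring halves."""
    result = []
    # For odd n, any single character may sit in the middle; for even n, nothing does.
    middles = alphabet if n % 2 else [""]
    for left in product(alphabet, repeat=n // 2):
        left = "".join(left)
        for mid in middles:
            result.append(left + mid + left[::-1])
    return result

print(palindromes(4))  # ['xxxx', 'xyyx', 'yxxy', 'yyyy']
print(palindromes(5))
# ['xxxxx', 'xxyxx', 'xyxyx', 'xyyyx', 'yxxxy', 'yxyxy', 'yyxyy', 'yyyyy']
```

Only the palindromes themselves are built (2 to the power ceil(n/2) of them for a two-letter alphabet), instead of generating all strings of length n and discarding most of them.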
First generate all possible strings given a list of characters:
>>> from itertools import product
>>> characters = ['x','y']
>>> n = 5
>>> [''.join(i) for i in product(characters, repeat=n)]
['xxxxx', 'xxxxy', 'xxxyx', 'xxxyy', 'xxyxx', 'xxyxy', 'xxyyx', 'xxyyy', 'xyxxx', 'xyxxy', 'xyxyx', 'xyxyy', 'xyyxx', 'xyyxy', 'xyyyx', 'xyyyy', 'yxxxx', 'yxxxy', 'yxxyx', 'yxxyy', 'yxyxx', 'yxyxy', 'yxyyx', 'yxyyy', 'yyxxx', 'yyxxy', 'yyxyx', 'yyxyy', 'yyyxx', 'yyyxy', 'yyyyx', 'yyyyy']
Then filter out the non-palindromes (note the integer division n//2, which slicing requires in Python 3):
>>> n = 4
>>> [''.join(i) for i in product(characters, repeat=n) if i[:n//2] == i[::-1][:n//2]]
['xxxx', 'xyyx', 'yxxy', 'yyyy']
>>> n = 5
>>> [''.join(i) for i in product(characters, repeat=n) if i[:n//2] == i[::-1][:n//2]]
['xxxxx', 'xxyxx', 'xyxyx', 'xyyyx', 'yxxxy', 'yxyxy', 'yyxyy', 'yyyyy']
If you don't like if conditions in a list comprehension, you can use filter() (wrapped in list(), since filter() returns an iterator in Python 3):
>>> from itertools import product
>>> characters = ['x','y']
>>> n = 5
>>> def ispalindrome(x): return x[:n//2] == x[::-1][:n//2]
...
>>> list(filter(ispalindrome, [''.join(i) for i in product(characters, repeat=n)]))
['xxxxx', 'xxyxx', 'xyxyx', 'xyyyx', 'yxxxy', 'yxyxy', 'yyxyy', 'yyyyy']