How to enumerate / zip as lambda - Python

Is there a way to replace the for-loop in the groupList function with a lambda function, perhaps with map(), in Python 3?
def groupList(input_list, output_list=[]):
    for i, (v, w) in enumerate(zip(input_list[:-2], input_list[2:])):
        output_list.append(f'{input_list[i]} {input_list[i+1]} {input_list[i+2]}')
    return output_list
print(groupList(['A', 'B', 'C', 'D', 'E', 'F', 'G']))
(Output from the groupList function would be ['A B C', 'B C D', 'C D E', 'D E F', 'E F G'])

Solution 1:
def groupList(input_list):
    return [' '.join(input_list[i:i+3]) for i in range(len(input_list) - 2)]
Solution 2:
def groupList(input_list):
    return list(map(' '.join, (input_list[i:i+3] for i in range(len(input_list) - 2))))
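Since the question asks about zip specifically, here is a minimal sketch of the same sliding window built from offset slices and map(). The name group_list and the width parameter are illustrative and do not come from the original answers.

```python
def group_list(items, width=3):
    """Join every run of `width` consecutive items with a space."""
    # zip stops at the shortest slice, so exactly len(items) - width + 1
    # windows are produced (none if the list is shorter than `width`).
    windows = zip(*(items[offset:] for offset in range(width)))
    return list(map(' '.join, windows))


print(group_list(['A', 'B', 'C', 'D', 'E', 'F', 'G']))
# ['A B C', 'B C D', 'C D E', 'D E F', 'E F G']
```

No lambda is needed here because ' '.join is already a callable that takes one window.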

Besides the previous solutions, a potentially more efficient (but less concise) approach is to build the full concatenation once and then slice it, instead of joining every window separately.
from itertools import accumulate
def groupList(input_list):
    full_concat = ' '.join(input_list)
    idx = [0]
    idx.extend(accumulate(len(s) + 1 for s in input_list))
    return [full_concat[idx[i]:idx[i+3]-1] for i in range(len(idx) - 3)]
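To see why the slicing works, it can help to look at the offsets themselves. The sketch below (variable names are illustrative) records where each item starts in the joined string and checks the sliced result against the list-comprehension version.

```python
from itertools import accumulate

items = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
full_concat = ' '.join(items)

# starts[i] is the index in full_concat where items[i] begins; the final
# entry is len(full_concat) + 1, i.e. where an eighth item would begin.
starts = [0, *accumulate(len(s) + 1 for s in items)]
print(starts)  # [0, 2, 4, 6, 8, 10, 12, 14]

# Subtracting 1 from the end offset drops the space before the next item.
sliced = [full_concat[starts[i]:starts[i + 3] - 1] for i in range(len(starts) - 3)]
joined = [' '.join(items[i:i + 3]) for i in range(len(items) - 2)]
print(sliced == joined)  # True
```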

Related

Join elements of two lists in Python

Say I have two Python lists containing strings that may or may not be of the same length.
list1 = ['a','b']
list2 = ['c','d','e']
I want to get the following result:
l = ['a c','a d','a e','b c','b d','b e']
The final list should contain all possible combinations from the two lists, with a space in between them.
One method I've tried is with itertools:
import itertools
for p in itertools.permutations(, 2):
    print(zip(*p))
But unfortunately this was not what I needed, as it did not return any combinations at all.
First make all possible combinations of the two lists, then use list comprehension to achieve the desired result:
list1 = ['a', 'b']
list2 = ['c', 'd', 'e']
com = [(x,y) for x in list1 for y in list2]
print([a + ' ' + b for (a, b) in com]) # ['a c', 'a d', 'a e', 'b c', 'b d', 'b e']
What you want is a Cartesian product.
Code:
import itertools
list1 = ['a', 'b']
list2 = ['c', 'd', 'e']
l = ['%s %s' % (e[0], e[1]) for e in itertools.product(list1, list2)]
print(l)
result:
['a c', 'a d', 'a e', 'b c', 'b d', 'b e']
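Because every element here is a string, the formatting step can also be handed to str.join through map(). This is a small sketch of that variant, not a replacement for the answer above; the variable name pairs is illustrative.

```python
from itertools import product

list1 = ['a', 'b']
list2 = ['c', 'd', 'e']

# product yields tuples such as ('a', 'c'); ' '.join turns each into 'a c'.
pairs = list(map(' '.join, product(list1, list2)))
print(pairs)  # ['a c', 'a d', 'a e', 'b c', 'b d', 'b e']
```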
This is another possible method:
list1=['a','b']
list2=['c','d','e']
list3=[]
for i in list1:
    for j in list2:
        list3.append(i+" "+j)
print(list3)
One-liner solution: use a list comprehension and concatenate the items with a space between them.
list1 = ['a','b']
list2 = ['c','d','e']
print([i + ' ' + j for i in list1 for j in list2])

Find all combinations of each individual list element

Given the following list
myList = ['A' , 'B' , 'C, D' , 'E, F, G', 'H' , 'I']
How do I go about getting every possible combination for each element in the list that has more than 2 characters? I also do not want combinations of all of the elements together, if that makes sense.
An example output using the above list would look like below:
myList = ['A' , 'B' , 'C, D' , 'E, F' , 'E, G' , 'F, G' , 'H' , 'I']
Note: I only care about finding the combinations of each element that has more than two characters.
I have attempted this a few times using itertools, but that seems to find all possible combinations of ALL elements in the list, as opposed to combinations of the individual parts of each element.
for L in range(0, len(myList)+1):
    for subset in itertools.combinations(myList, L):
        print(subset)
Use itertools.combinations on only those elements that have more than 2 letters after splitting.
import itertools
myList = ['A' , 'B' , 'C, D' , 'E, F, G', 'H' , 'I']
result = []
for item in myList:
    item_split = item.split(', ')  # split each item on the ', ' separator
    if len(item_split) <= 2:
        result.append(item)
    else:  # more than 2 items after splitting: use combinations
        result.extend(", ".join(pair) for pair in itertools.combinations(item_split, 2))
print(result)
#Output:
['A', 'B', 'C, D', 'E, F', 'E, G', 'F, G', 'H', 'I']
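The same idea can be wrapped in a reusable function. This is only a sketch: the name expand_pairs and its sep parameter are hypothetical, and it assumes the parts are always separated by a comma followed by a space.

```python
from itertools import combinations


def expand_pairs(items, sep=', '):
    """Replace each element that has more than two parts by all 2-part combinations."""
    result = []
    for item in items:
        parts = item.split(sep)
        if len(parts) <= 2:
            result.append(item)  # nothing to expand
        else:
            result.extend(sep.join(pair) for pair in combinations(parts, 2))
    return result


print(expand_pairs(['A', 'B', 'C, D', 'E, F, G', 'H', 'I']))
# ['A', 'B', 'C, D', 'E, F', 'E, G', 'F, G', 'H', 'I']
```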
Similar to Paritosh Singh's answer, but with more parentheses :)
from operator import methodcaller
from itertools import chain, combinations
my_list = ['A', 'B', 'C, D', 'E, F, G', 'H', 'I']
sep = ', '
splitter = methodcaller('split', sep)
def pairs(x):
    return combinations(x, 2 if len(x) > 1 else 1)
joiner = sep.join
result = list(map(joiner,
                  chain.from_iterable(map(pairs,
                                          map(splitter,
                                              my_list)))))
[DIGRESSION ALERT]
... which arguably reads a little better if you use Coconut:
from itertools import chain, combinations
my_list = ['A' , 'B' , 'C, D' , 'E, F, G', 'H' , 'I']
my_result = (my_list
    |> split_each
    |> pairs
    |> chain.from_iterable
    |> join_each
    |> list
)
where:
split_each = map$(.split(", "))
pairs = map$((x) -> combinations(x, 2 if len(x) > 1 else 1))
join_each = map$(", ".join)

Grammatically correct human readable string from list (with Oxford comma)

I want a grammatically correct human-readable string representation of a list. For example, the list ['A', 2, None, 'B,B', 'C,C,C'] should return the string A, 2, None, B,B, and C,C,C. This contrived example is somewhat necessary. Note that the Oxford comma is relevant for this question.
I tried ', '.join(seq) but this doesn't produce the expected result for the aforementioned example.
Note the preexisting similar questions:
How to print a list in Python "nicely" is not concerned with producing a grammatically correct human-readable string.
Grammatical List Join in Python is without the Oxford comma. The example and answers there are correspondingly different and they do not work for my question.
This function works by handling small lists differently than larger lists.
from typing import Any, List
def readable_list(seq: List[Any]) -> str:
    """Return a grammatically correct human readable string (with an Oxford comma)."""
    # Ref: https://stackoverflow.com/a/53981846/
    seq = [str(s) for s in seq]
    if len(seq) < 3:
        return ' and '.join(seq)
    return ', '.join(seq[:-1]) + ', and ' + seq[-1]
Usage examples:
readable_list([])
''
readable_list(['A'])
'A'
readable_list(['A', 2])
'A and 2'
readable_list(['A', None, 'C'])
'A, None, and C'
readable_list(['A', 'B,B', 'C,C,C'])
'A, B,B, and C,C,C'
readable_list(['A', 'B', 'C', 'D'])
'A, B, C, and D'
You can also use unpacking for a slightly cleaner solution:
def readable_list(_s):
    if len(_s) < 3:
        return ' and '.join(map(str, _s))
    *a, b = _s
    return f"{', '.join(map(str, a))}, and {b}"
vals = [[], ['A'], ['A', 2], ['A', None, 'C'], ['A', 'B,B', 'C,C,C'], ['A', 'B', 'C', 'D']]
print([readable_list(i) for i in vals])
Output:
['', 'A', 'A and 2', 'A, None, and C', 'A, B,B, and C,C,C', 'A, B, C, and D']
Based on the accepted answer for the thread you linked to, here's a one-liner that takes an optional argument for whether to use an Oxford comma or not.
from typing import List
def list_items_in_english(l: List[str], oxford_comma: bool = True) -> str:
    """
    Produce a list of the items formatted as they would be in an English sentence.
    So one item returns just the item, passing two items returns "item1 and item2" and
    three returns "item1, item2, and item3" with an optional Oxford comma.
    """
    return ", ".join(l[:-2] + [((oxford_comma and len(l) != 2) * ',' + " and ").join(l[-2:])])
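The one-liner is dense, so here is a sketch of the same expression with the final separator pulled out into named variables (use_comma and last_sep are illustrative names), followed by a few calls showing the effect of the oxford_comma flag.

```python
from typing import List


def list_items_in_english(items: List[str], oxford_comma: bool = True) -> str:
    # The Oxford comma only applies when there are more than two items.
    use_comma = oxford_comma and len(items) != 2
    last_sep = (',' if use_comma else '') + ' and '
    # Everything but the last two items is comma-separated; the last two
    # items are glued together with ' and ' or ', and '.
    return ', '.join(items[:-2] + [last_sep.join(items[-2:])])


print(list_items_in_english(['A', 'B', 'C']))                      # A, B, and C
print(list_items_in_english(['A', 'B', 'C'], oxford_comma=False))  # A, B and C
print(list_items_in_english(['A', 'B']))                           # A and B
```

Unlike readable_list, this version expects every item to already be a string; passing 2 or None makes str.join raise a TypeError.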
I got really stubborn and I really wanted to figure out a one-liner solution.
"{} and {}".format(seq[0], seq[1]) if len(seq)==2 else ', '.join([str(x) if (y < len(seq)-1 or len(seq)<=1) else "and {}".format(str(x)) for x, y in zip(seq, range(len(seq)))])
I think this one does the trick, though the problem turned out to be more complicated than I expected for a non-ugly one-liner.

Read all possible sequential substrings in Python

If I have a list of letters, such as:
word = ['W','I','N','E']
and need to get every possible sequence of substrings, of length 3 or less, e.g.:
W I N E, WI N E, WI NE, W IN E, WIN E etc.
What is the most efficient way to go about this?
Right now, I have:
word = ['W','I','N','E']
for idx,phon in enumerate(word):
    phon_seq = ""
    for p_len in range(3):
        if idx-p_len >= 0:
            phon_seq = " ".join(word[idx-(p_len):idx+1])
            print(phon_seq)
This just gives me the below, rather than the sub-sequences:
W
I
W I
N
I N
W I N
E
N E
I N E
I just can't figure out how to create every possible sequence.
Try this recursive algorithm:
def segment(word):
    def sub(w):
        if len(w) == 0:
            yield []
        # take a first chunk of 1 to 3 letters, then segment the rest
        for i in range(1, min(4, len(w) + 1)):
            for s in sub(w[i:]):
                yield [''.join(w[:i])] + s
    return list(sub(word))
# And if you want a list of strings:
def str_segment(word):
    return [' '.join(w) for w in segment(word)]
Output:
>>> segment(word)
[['W', 'I', 'N', 'E'], ['W', 'I', 'NE'], ['W', 'IN', 'E'], ['W', 'INE'], ['WI', 'N', 'E'], ['WI', 'NE'], ['WIN', 'E']]
>>> str_segment(word)
['W I N E', 'W I NE', 'W IN E', 'W INE', 'WI N E', 'WI NE', 'WIN E']
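A generator-based sketch of the same recursion, with the maximum chunk length exposed as a parameter. The names segmentations and max_len are illustrative and not part of the answer above.

```python
def segmentations(letters, max_len=3):
    """Yield every way to split `letters` into chunks of 1..max_len items."""
    if not letters:
        yield []  # exactly one way to split nothing
        return
    for size in range(1, min(max_len, len(letters)) + 1):
        head = ''.join(letters[:size])
        # Combine this first chunk with every segmentation of the remainder.
        for rest in segmentations(letters[size:], max_len):
            yield [head] + rest


for parts in segmentations(['W', 'I', 'N', 'E']):
    print(' '.join(parts))
```

Raising max_len to 4 would also yield the unsplit word WINE, giving all 2**3 = 8 ways of placing or omitting a space in the three gaps.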
As there can either be a space or not in each of three positions (after W, after I and after N), you can think of this as similar to bits being 1 or 0 in a binary representation of a number ranging from 1 to 2^3 - 1.
input_word = "WINE"
for variation_number in range(1, 2 ** (len(input_word) - 1)):
    output = ''
    for position, letter in enumerate(input_word):
        output += letter
        if variation_number >> position & 1:
            output += ' '
    print(output)
Edit: To include only variations with sequences of 3 characters or less (in the general case where input_word may be longer than 4 characters), we can exclude cases where the binary representation contains 3 zeroes in a row. (We also start the range from a higher number in order to exclude the cases which would have 000 at the beginning.)
for variation_number in range(2 ** (len(input_word) - 4), 2 ** (len(input_word) - 1)):
    if '000' not in bin(variation_number):
        output = ''
        for position, letter in enumerate(input_word):
            output += letter
            if variation_number >> position & 1:
                output += ' '
        print(output)
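As a cross-check on the bit-mask idea, the sketch below builds the pieces explicitly and filters on their lengths instead of inspecting the binary string. The function name spaced_variations and the max_len parameter are hypothetical; this is one possible way to write it, not the answer's own code.

```python
def spaced_variations(word, max_len=3):
    """Return every split of `word` whose pieces are at most `max_len` long."""
    results = []
    gaps = max(len(word) - 1, 0)  # one possible space after each letter but the last
    for mask in range(2 ** gaps):
        pieces, start = [], 0
        for position in range(gaps):
            if mask >> position & 1:  # bit set: cut after this letter
                pieces.append(word[start:position + 1])
                start = position + 1
        pieces.append(word[start:])
        if all(len(piece) <= max_len for piece in pieces):
            results.append(' '.join(pieces))
    return results


print(spaced_variations('WINE'))
# ['W INE', 'WI NE', 'W I NE', 'WIN E', 'W IN E', 'WI N E', 'W I N E']
```

Checking the piece lengths directly costs a little more work per mask, but it makes the length limit explicit and adjustable.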
My implementation for this problem.
#!/usr/bin/env python
# this is a problem of fitting partitions in the word
# we'll use itertools to generate these partitions
import itertools
word = 'WINE'
# this loop generates all possible partitions COUNTS (up to word length)
for partitions_count in range(1, len(word)+1):
    # this loop generates all possible combinations based on count
    for partitions in itertools.combinations(range(1, len(word)), r=partitions_count):
        # because of the way python splits words, we only care about the
        # difference *between* partitions, and not their distance from the
        # word's beginning
        diffs = list(partitions)
        for i in range(len(partitions)-1):
            diffs[i+1] -= partitions[i]
        # first, the whole word is up for taking by partitions
        splits = [word]
        # partition the word's remainder (what was not already "taken")
        # with each partition
        for p in diffs:
            remainder = splits.pop()
            splits.append(remainder[:p])
            splits.append(remainder[p:])
        # print the result
        print(splits)
As an alternative, you can do it with the itertools module: groupby groups the letters, combinations builds the list of index pairs (i, j) used in the grouping key i <= word.index(x) <= j, and a set removes the duplicates at the end. Because word.index(x) returns the first occurrence of a letter, this relies on the letters being unique.
Also note that different index pairs can produce the same split (in the demo below, (0, 1) and (2, 3) both give WI NE), so one of them has to be removed; that is what the set is for.
All in one list comprehension:
from itertools import combinations, groupby
subs=[[''.join(i) for i in j] for j in [[list(g) for k,g in groupby(word,lambda x: i<=word.index(x)<=j)] for i,j in list(combinations(range(len(word)),2))]]
set([' '.join(j) for j in subs]) # {'WIN E', 'W IN E', 'W INE', 'WI NE', 'WINE'}
Demo in details :
>>> cl=list(combinations(range(len(word)),2))
>>> cl
[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
>>> new_l=[[list(g) for k,g in groupby(word,lambda x: i<=word.index(x)<=j)] for i,j in cl]
>>> new_l
[[['W', 'I'], ['N', 'E']], [['W', 'I', 'N'], ['E']], [['W', 'I', 'N', 'E']], [['W'], ['I', 'N'], ['E']], [['W'], ['I', 'N', 'E']], [['W', 'I'], ['N', 'E']]]
>>> last=[[''.join(i) for i in j] for j in new_l]
>>> last
[['WI', 'NE'], ['WIN', 'E'], ['WINE'], ['W', 'IN', 'E'], ['W', 'INE'], ['WI', 'NE']]
>>> set([' '.join(j) for j in last])
{'WIN E', 'W IN E', 'W INE', 'WI NE', 'WINE'}
>>> for i in set([' '.join(j) for j in last]):
...     print(i)
...
WIN E
W IN E
W INE
WI NE
WINE
>>>
I think it can be done like this:
word = "ABCDE"
myList = []
for i in range(1, len(word)+1):
    myList.append(word[:i])
    # the original start/stop expressions reduce to range(1, i)
    for j in range(1, i):
        myList.append(word[j:i])
print(myList)
print(sorted(set(myList), key=myList.index))

How to split a string into characters in python

I have a string 'ABCDEFG'
I want to be able to list each character sequentially followed by the next one.
Example
A B
B C
C D
D E
E F
F G
G
Can you tell me an efficient way of doing this? Thanks
In Python, a string is already an iterable sequence of characters, so you don't need to split it; it's already "split". You just need to build your list of substrings.
It's not clear what form you want the result in. If you just want substrings, this works:
s = 'ABCDEFG'
[s[i:i+2] for i in range(len(s))]
#=> ['AB', 'BC', 'CD', 'DE', 'EF', 'FG', 'G']
If you want the pairs to themselves be lists instead of strings, just call list on each one:
[list(s[i:i+2]) for i in range(len(s))]
#=> [['A', 'B'], ['B', 'C'], ['C', 'D'], ['D', 'E'], ['E', 'F'], ['F', 'G'], ['G']]
And if you want strings after all, but with something like a space between the letters, join them back together after the list call:
[' '.join(list(s[i:i+2])) for i in range(len(s))]
#=> ['A B', 'B C', 'C D', 'D E', 'E F', 'F G', 'G']
You need to keep the last character, so use zip_longest from itertools (it was called izip_longest in Python 2):
>>> import itertools
>>> s = 'ABCDEFG'
>>> for c, cnext in itertools.zip_longest(s, s[1:], fillvalue=''):
...     print(c, cnext)
...
A B
B C
C D
D E
E F
F G
G
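If the pairs should be collected rather than printed, the same zip_longest call can feed a list comprehension. A minimal sketch (the variable name pairs is illustrative):

```python
from itertools import zip_longest

s = 'ABCDEFG'
# fillvalue='' pads the shorter iterable, so the last letter is paired with ''.
# rstrip() removes the trailing space that this leaves on the final entry.
pairs = [f'{c} {nxt}'.rstrip() for c, nxt in zip_longest(s, s[1:], fillvalue='')]
print(pairs)  # ['A B', 'B C', 'C D', 'D E', 'E F', 'F G', 'G']
```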
def doit(text):
    for i in range(len(text)):
        print(text[i] + (text[i + 1] if i != len(text) - 1 else ''))
doit("ABCDEFG")
Which yields:
>>> doit("ABCDEFG")
AB
BC
CD
DE
EF
FG
G
There's an itertools pairwise recipe for exactly this use case:
import itertools
def pairwise(myStr):
    a,b = itertools.tee(myStr)
    next(b,None)
    for s1,s2 in zip(a,b):
        print(s1,s2)
Output:
In [121]: pairwise('ABCDEFG')
A B
B C
C D
D E
E F
F G
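Since Python 3.10 the recipe also ships as itertools.pairwise. The sketch below uses it when available and otherwise falls back to a tee-based version that returns the pairs instead of printing them.

```python
import itertools

try:
    pairwise = itertools.pairwise  # available in Python 3.10 and later
except AttributeError:
    def pairwise(iterable):
        # Fallback: the classic tee-based recipe for older versions.
        a, b = itertools.tee(iterable)
        next(b, None)
        return zip(a, b)

print([' '.join(pair) for pair in pairwise('ABCDEFG')])
# ['A B', 'B C', 'C D', 'D E', 'E F', 'F G']
```

Like the recipe above, this does not produce a final entry for the lone G.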
Your problem is that you have a list of strings, not a string:
with open('ref.txt') as f:
    f1 = f.read().splitlines()
f.read() returns a string. You call splitlines() on it, getting a list of strings (one per line). If your input is actually 'ABCDEFG', this will of course be a list of one string, ['ABCDEFG'].
l = list(f1)
Since f1 is already a list, this just makes l a duplicate copy of that list.
print(l, f1, len(l))
And this just prints the list of lines, and the copy of the list of lines, and the number of lines.
So, first, what happens if you drop the splitlines()? Then f1 will be the string 'ABCDEFG', instead of a list with that one string. That's a good start. And you can drop the l part entirely, because f1 is already an iterable of its characters; list(f1) will just be a different iterable of the same characters.
So, now you want to print each letter with the next letter. One way to do that is by zipping 'ABCDEFG' and 'BCDEFG '. But how do you get that 'BCDEFG '? Simple; it's just f1[1:] + ' '.
So:
with open('ref.txt') as f:
    f1 = f.read()
for left, right in zip(f1, f1[1:] + ' '):
    print(left, right)
Of course for something this simple, there are many other ways to do the same thing. You can iterate over range(len(f1)) and get 2-element slices, or you can use itertools.zip_longest, or you can write a general-purpose "overlapping adjacent groups of size N from any iterable" function out of itertools.tee and zip, etc.
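As a sketch of the last option mentioned, here is one way to build overlapping groups of size N from any iterable with itertools.tee and zip. The function name windows and its parameter n are illustrative.

```python
from itertools import islice, tee


def windows(iterable, n=2):
    """Return an iterator over overlapping tuples of `n` adjacent items."""
    iterators = tee(iterable, n)
    for offset, it in enumerate(iterators):
        # Advance the k-th copy by k items so the copies are staggered.
        next(islice(it, offset, offset), None)
    return zip(*iterators)


print([' '.join(group) for group in windows('ABCDEFG', 2)])
# ['A B', 'B C', 'C D', 'D E', 'E F', 'F G']
print(list(windows(iter(range(5)), 3)))
# [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```

Because zip stops at the shortest iterator, incomplete groups at the end (such as the lone G) are dropped; padding would need zip_longest instead.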
If you want a space between the characters, you can use zip and a list comprehension:
>>> s="ABCDEFG"
>>> l=[' '.join(i) for i in zip(s,s[1:])]
>>> l
['A B', 'B C', 'C D', 'D E', 'E F', 'F G']
>>> for i in l:
...     print(i)
...
A B
B C
C D
D E
E F
F G
If you don't want a space, just use a list comprehension:
>>> [s[i:i+2] for i in range(len(s))]
['AB', 'BC', 'CD', 'DE', 'EF', 'FG', 'G']
