Joining Lists of Lists of Strings - python

I've a list of lists, in which each element is a single character:
ngrams = [['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']]
From this, I want to generate a new single list with the content ['aa','ab','ac','ba','bb','bc','ca','cb','cc']. The individual elements of each list are appended to each other but in reverse order of the lists. I've come up with this (where np = 2):
for cnt in range(np-2,-1,-1):
    thisngrams[-1] = [a+b for (a,b) in zip(thisngrams[-1],thisngrams[cnt])]
My solution needs to handle np higher than just 2. I expect this is O(np), which isn't bad. Can someone suggest a more efficient and pythonic way to do what I want (or is this a good pythonic approach)?

You can try this:
ngrams = [['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']]
new = list(map(''.join, zip(*ngrams)))  # list() is needed in Python 3, where map() returns an iterator
Output:
['aa', 'ba', 'ca', 'ab', 'bb', 'cb', 'ac', 'bc', 'cc']
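For more than two lists (note the reversed(), which joins the lists in reverse order):
ngrams = [["a", "b", "c"], ["a", "c", "d"], ["e", "f", "g"]]
new = map(''.join, zip(*reversed(ngrams)))
# In Python 3, map() returns an iterator, so wrap it in list():
# new = list(map(''.join, zip(*reversed(ngrams))))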
Output:
['eaa', 'fcb', 'gdc']
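As a minimal sketch, applying the reversed() variant to the two-list input from the question appears to give exactly the order that was asked for. The helper name join_columns is hypothetical and not part of the original answer.

```python
def join_columns(lists):
    """Join the i-th elements of every inner list, last list first."""
    # reversed() puts the last inner list first; zip(*...) then groups the
    # elements position by position, and ''.join glues each group together.
    return [''.join(group) for group in zip(*reversed(lists))]


ngrams = [['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
          ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']]
print(join_columns(ngrams))
# ['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
```

Because zip() walks all inner lists in parallel, the same function works unchanged for any number of inner lists.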

Related

Iterating through a list for specific instances

I have the following code:
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'], ['E', 'D', 'B'], ['E', 'D', 'C', 'B'], ['E', 'B'], ['E', 'C', 'B']]
The inner lists represent node paths from start to end, generated with NetworkX, but that is only background; my question is more specific.
I am trying to extract only the lists that contain every letter from A to E, i.e. it would return only this list:
paths_desired = [['E', 'D', 'A', 'C', 'B']]
If I were to have another path:
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'], ['D', 'B', 'A','C','E'], ['A', 'D', 'C', 'B']]
It would return:
paths_desired = [['E', 'D', 'A', 'C', 'B'],['D', 'B', 'A', 'C', 'E']]
My idea is a for loop that iterates through each list:
letters = ['A', 'B', 'C', 'D', 'E']
desired_paths = []
for i in paths:
    counter = 0
    for j in letters:
        if j in i:
            counter = counter + 1
    if counter == 5:
        desired_paths.append(i)
print(desired_paths)
This works, however, I want to make the loop more specific, meaning I want only lists that have the following order: ['E','D','A','C','B'], even if all the letters are present in a different list, within the paths list.
Additionally, is there a way to improve my for loop so that, rather than counting, it checks that each letter is present exactly once? That is, no multiple Es, no multiple Ds, etc.
You can use a set and .issubset() like this:
def pathways(letters, paths):
    ret = []
    letters = set(letters)
    for path in paths:
        if letters.issubset(path):
            ret.append(path)
    return ret

letters = ['A', 'B', 'C', 'D', 'E']
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'],
         ['D', 'B', 'A','C','E'], ['A', 'D', 'C', 'B']]
print(pathways(letters, paths)) # => [['E', 'D', 'A', 'C', 'B'], ['D', 'B', 'A', 'C', 'E']]
Also, as a comment by ShadowRanger pointed out, the pathways() function could be shortened using filter(). Like this:
def pathways(letters, paths):
    return list(filter(set(letters).issubset, paths))

letters = ['A', 'B', 'C', 'D', 'E']
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'],
         ['D', 'B', 'A','C','E'], ['A', 'D', 'C', 'B']]
print(pathways(letters, paths))
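The subset test accepts a path even if it repeats a letter, which the question also wanted to rule out. One possible sketch, assuming a path should qualify only when it contains each required letter exactly once and nothing else, compares sorted copies. The name exactly_once and the extra path with a doubled 'E' are made up for illustration.

```python
def exactly_once(letters, paths):
    """Keep paths that are a rearrangement of letters (no extras, no repeats)."""
    wanted = sorted(letters)
    return [path for path in paths if sorted(path) == wanted]


letters = ['A', 'B', 'C', 'D', 'E']
paths = [['E', 'D', 'A', 'B'], ['E', 'D', 'A', 'C', 'B'],
         ['D', 'B', 'A', 'C', 'E'], ['E', 'E', 'D', 'A', 'C', 'B']]
print(exactly_once(letters, paths))
# [['E', 'D', 'A', 'C', 'B'], ['D', 'B', 'A', 'C', 'E']]

# If one specific order is required, a plain equality test is enough.
target = ['E', 'D', 'A', 'C', 'B']
print([path for path in paths if path == target])
# [['E', 'D', 'A', 'C', 'B']]
```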

How can I copy each string in a list?

I have a list
list1=['a','b','c']
I want to copy every string in the list
like this
list2=['a','a','b','b','c','c']
list3=['a','a','a','b','b','b','c','c','c']
but when I use this code
list2=[x*2 for x in list1]
I get
list2=['aa','bb','cc']
How can I change my code to accomplish my result?
I would use itertools.chain along with itertools.repeat:
from itertools import chain, repeat
chars = ['a', 'b', 'c']
repeat_count = 3
list(chain.from_iterable(repeat(char, repeat_count) for char in chars))
Output:
['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']
Without itertools, this can be done with a nested list comprehension, as below:
list1=['a','b','c']
print([y for x in list1 for y in [x]*2])
# ['a', 'a', 'b', 'b', 'c', 'c']
print([y for x in list1 for y in [x]*3])
# ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']
You can use the second for loop with the function range():
lst = ['a', 'b', 'c']
[i for i in lst for _ in range(2)]
# ['a', 'a', 'b', 'b', 'c', 'c']
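As a short sketch of why the original attempt produced 'aa': multiplying a string repeats its characters inside one longer string, whereas multiplying a one-element list repeats the element as separate list items. The function name repeat_each is hypothetical.

```python
def repeat_each(items, n):
    """Return a new list with every item repeated n times, keeping order."""
    return [item for item in items for _ in range(n)]


print('a' * 2)      # aa          -> one longer string
print(['a'] * 2)    # ['a', 'a']  -> two separate list items
print(repeat_each(['a', 'b', 'c'], 2))
# ['a', 'a', 'b', 'b', 'c', 'c']
```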

Why does array content get wiped and reset to the first result for a recursive function?

The issue stems from the output.append(a) on the third line. This program should output the 6 unique permutations of the input string, but instead returns 6 copies of the first result in the recursive loop. I realize exiting the recursion may have something to do with the array being modified, but how can I circumvent this issue so that I can return an array of solutions?
def permute(a, l, r, output):
    if l==r:
        output.append(a)
    else:
        for i in range(l,r+1):
            a[l], a[i] = a[i], a[l]
            permute(a, l+1, r,output)
            a[l], a[i] = a[i], a[l] # backtrack

# Driver program to test the above function
string = "ABC"
output = []
n = len(string)
a = list(string)
permute(a, 0, n-1,output)
print(output)
For reference, this is what the output looks like:
[['A', 'C', 'B']]
[['B', 'A', 'C'], ['B', 'A', 'C']]
[['B', 'C', 'A'], ['B', 'C', 'A'], ['B', 'C', 'A']]
[['C', 'B', 'A'], ['C', 'B', 'A'], ['C', 'B', 'A'], ['C', 'B', 'A']]
[['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B'], ['C', 'A', 'B']]
[['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C']]
When the output should be:
['A', 'B', 'C']
['A', 'C', 'B']
['B', 'A', 'C']
['B', 'C', 'A']
['C', 'B', 'A']
['C', 'A', 'B']
The problem is in the line
output.append(a)
It looks fine, but the list a keeps changing afterwards, and because output stores a reference to a rather than a copy, every entry you already appended changes with it.
To solve the problem, append a shallow copy instead:
output.append(a[:])
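Put together, a minimal sketch of the question's function with that one-line change might look like this. The variable name chars is introduced here for illustration; everything else follows the code in the question.

```python
def permute(a, l, r, output):
    """Collect every permutation of a[l..r] into output by swapping in place."""
    if l == r:
        # a is mutated again after this call returns, so store a snapshot
        # (a[:]) rather than a reference to the list itself.
        output.append(a[:])
    else:
        for i in range(l, r + 1):
            a[l], a[i] = a[i], a[l]
            permute(a, l + 1, r, output)
            a[l], a[i] = a[i], a[l]  # backtrack


output = []
chars = list("ABC")
permute(chars, 0, len(chars) - 1, output)
print(output)
# [['A', 'B', 'C'], ['A', 'C', 'B'], ['B', 'A', 'C'], ['B', 'C', 'A'], ['C', 'B', 'A'], ['C', 'A', 'B']]
```

A shallow copy is enough here because the elements are immutable strings; only the outer list is being rearranged.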
Did you know that Python already has a function for this?
import itertools
listA = ["A", "B", "C"]
perm = itertools.permutations(listA)
for i in list(perm):
    print(i)
Result:
('A', 'B', 'C')
('A', 'C', 'B')
('B', 'A', 'C')
('B', 'C', 'A')
('C', 'A', 'B')
('C', 'B', 'A')

Global list mutation inside function in Python

I am working on a backtracking solution for this problem - "Given a string s, partition s such that every substring of the partition is a palindrome."
I have written the code below, but I don't understand how the global 2D list strings is being updated. What exactly is happening here? I also tried the global keyword inside the palinBreak function, but it doesn't help. When should the global keyword be used?
Observation: Every element of global list strings changes to local list variable arr. For instance, strings = [x, y] and arr = [z], then strings becomes [z, z, z]; whereas I want it to be [x, y, z]. Why does this happen?
EDIT: Adding expected output vs. output I am getting (take notice from line 3 onwards).
Expected output is:
ans is ['a', 'b', 'a', 'a', 'b'] []
strings is [['a', 'b', 'a', 'a', 'b']]
ans is ['a', 'b', 'aa', 'b'] [['a', 'b', 'a', 'a', 'b']]
strings is [['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b']]
ans is ['a', 'baab'] [['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b']]
strings is [['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b'], ['a', 'baab']]
ans is ['aba', 'a', 'b'] [['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b'], ['a', 'baab']]
strings is [['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b'], ['a', 'baab'], ['aba', 'a', 'b']]
[['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b'], ['a', 'baab'], ['aba', 'a', 'b']]
Output:
ans is ['a', 'b', 'a', 'a', 'b'] []
strings is [['a', 'b', 'a', 'a', 'b']]
ans is ['a', 'b', 'aa', 'b'] [['a', 'b', 'aa', 'b']]
strings is [['a', 'b', 'aa', 'b'], ['a', 'b', 'aa', 'b']]
ans is ['a', 'baab'] [['a', 'baab'], ['a', 'baab']]
strings is [['a', 'baab'], ['a', 'baab'], ['a', 'baab']]
ans is ['aba', 'a', 'b'] [['aba', 'a', 'b'], ['aba', 'a', 'b'], ['aba', 'a', 'b']]
strings is [['aba', 'a', 'b'], ['aba', 'a', 'b'], ['aba', 'a', 'b'], ['aba', 'a', 'b']]
[[], [], [], []]
>>>
Code:
def isPalin(s):
    i = 0
    j = len(s)-1
    while(i<j):
        if(s[i]!=s[j]):
            return False
        i+=1
        j-=1
    return True

def palinBreak(s, start, arr):
    #print "Called", start, arr
    #global strings
    if(start==len(s)):
        print("ans is", arr, strings)
        strings.append(arr)
        print("strings is", strings)
        return 0
    flag = -1
    for i in range(1, len(s)-start+1):
        curr = s[start : start+i]
        #print "Testing curr and start and i", curr, start, i
        if(isPalin(curr)):
            arr.append(curr)
            #print arr, start, i
            #print "Next call from", start+i
            pb = palinBreak(s, start+i, arr)
            if(pb != -1):
                flag = 1
            arr.pop()
            #print "popped l", arr
    return flag

strings = []
palinBreak("abaab", 0, [])
print(strings)
The problem is that arr will be the same list in the inner recursive calls.
Try to replace
pb = palinBreak(s, start+i, arr)
with
pb = palinBreak(s, start+i, list(arr))
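An alternative sketch, under the assumption that copying only when a complete partition is found is acceptable: keep one shared working list and append a snapshot of it at the base case. The function and parameter names below are hypothetical and the palindrome test uses slicing instead of the question's isPalin. On the global question: the keyword is only needed when a function rebinds a module-level name; strings.append(...) mutates the existing list without rebinding it, so global makes no difference there.

```python
def palindrome_partitions(s, start=0, current=None, results=None):
    """Return every way to split s[start:] into palindromic pieces."""
    if current is None:
        current = []
    if results is None:
        results = []
    if start == len(s):
        # current keeps changing while backtracking, so store a copy.
        results.append(list(current))
        return results
    for end in range(start + 1, len(s) + 1):
        piece = s[start:end]
        if piece == piece[::-1]:  # piece reads the same both ways
            current.append(piece)
            palindrome_partitions(s, end, current, results)
            current.pop()  # backtrack
    return results


print(palindrome_partitions("abaab"))
# [['a', 'b', 'a', 'a', 'b'], ['a', 'b', 'aa', 'b'], ['a', 'baab'], ['aba', 'a', 'b']]
```

Passing results as a parameter instead of relying on a module-level list also means repeated calls do not accumulate old answers.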

How can I duplicate element of a list in python?

I wonder if there is a more elegant way to do the following. For example with list comprehension.
Consider a simple list :
l = ["a", "b", "c", "d", "e"]
I want to duplicate each element n times. Thus I did the following:
n = 3
duplic = list()
for li in l:
    duplic += [li for i in range(n)]
At the end duplic is :
['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'e', 'e', 'e']
You can use
duplic = [li for li in l for _ in range(n)]
This does the same as your code. It adds each element of l (li for li in l) n times (for _ in range(n)).
You can use:
l = ["a", "b", "c", "d", "e"]
n=3
duplic = [ li for li in l for i in range(n)]
Every time you write something like this in Python
duplic = list()
for li in l:
    duplic +=
there is a good chance that it can be done with a list comprehension.
Try this:
l = ["a", "b", "c", "d", "e"]
print(sorted(l * 3))  # only matches the desired output because l is already sorted
Output:
['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'e', 'e', 'e']
>>> from itertools import chain
>>> n = 4
>>> list(chain.from_iterable(map(lambda x: [x]*n, l)))
['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd', 'd', 'd', 'e', 'e', 'e', 'e']
In [12]: l
Out[12]: ['a', 'b', 'c', 'd', 'e']
In [13]: n
Out[13]: 3
In [14]: sum((n*[item] for item in l), [])
Out[14]: ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'e', 'e', 'e']
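The answers above agree on a sorted input, but the sorted(l * 3) approach reorders the elements when the list is not already sorted. A small sketch of the difference, using a made-up unsorted list and a hypothetical helper name duplicate_each:

```python
from itertools import chain, repeat


def duplicate_each(items, n):
    """Repeat every item n times while preserving the original order."""
    return list(chain.from_iterable(repeat(item, n) for item in items))


unsorted_items = ["b", "a", "c"]
print(duplicate_each(unsorted_items, 2))
# ['b', 'b', 'a', 'a', 'c', 'c']
print(sorted(unsorted_items * 2))
# ['a', 'a', 'b', 'b', 'c', 'c']
```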
