I have a function that is supposed to merge two sorted lists into a combined sorted list. I know there are other ways of accomplishing this, but can someone explain why this code doesn't work?
def merge_two(list1,list2):
    new=[]
    l1=list1[:]
    l2=list2[:]
    while l1 and l2:
        if l1[0]<l2[0]:
            new.append(l1.pop(0))
        else:
            new.append(l2.pop(0))
        print(new,l1,l2)
        return new+l1+l2
For some reason the while loop only seems to run once. For example, if I use list1=['a','x','z'] and list2=['b','c','f','g'], the print line at the end of the function outputs ['a'] ['x', 'z'] ['b', 'c', 'f', 'g']
From debugging, this seems to be because the while loop executes only once, but I'm not sure why that's happening. It should keep going until either l1 or l2 is empty.
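That's because a function exits as soon as it executes a return statement. Your return is inside the while loop, so the function returns at the end of the first iteration. You need to un-indent your return statement so that it runs after the loop has finished: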
def merge_two(list1,list2):
    new=[]
    l1=list1[:]
    l2=list2[:]
    while l1 and l2:
        if l1[0]<l2[0]:
            new.append(l1.pop(0))
        else:
            new.append(l2.pop(0))
    return new+l1+l2
Alternatively, if for some reason you would like a list of all the intermediate steps your function produced, you could use yield instead of return in the exact spot where you originally had your return. This turns the function into a generator:
def merge_two(list1,list2):
    new=[]
    l1=list1[:]
    l2=list2[:]
    while l1 and l2:
        if l1[0]<l2[0]:
            new.append(l1.pop(0))
        else:
            new.append(l2.pop(0))
        yield new+l1+l2
Then running it, with a and b bound to the two example lists:
>>> list(merge_two(a, b))
[['a', 'x', 'z', 'b', 'c', 'f', 'g'],
['a', 'b', 'x', 'z', 'c', 'f', 'g'],
['a', 'b', 'c', 'x', 'z', 'f', 'g'],
['a', 'b', 'c', 'f', 'x', 'z', 'g'],
['a', 'b', 'c', 'f', 'g', 'x', 'z']]
And of course you'll see that the last list yielded is the sorted list :).
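For comparison, here is a minimal sketch of the same merge written with two index pointers instead of pop(0), so the remaining elements are not shifted on every step. The name merge_sorted is hypothetical and not from the question. The standard library's heapq.merge does the same job lazily and is shown alongside it.

```python
import heapq

def merge_sorted(list1, list2):
    """Merge two already-sorted lists without mutating them."""
    merged = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    # The return sits outside the loop; at most one of these slices is non-empty.
    return merged + list1[i:] + list2[j:]

a = ['a', 'x', 'z']
b = ['b', 'c', 'f', 'g']
print(merge_sorted(a, b))
print(list(heapq.merge(a, b)))
```

Both calls should print the same fully merged list, and neither a nor b is modified.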
Related
Here is my list:
a = [['a','b','c','d'],['e','f','g','h'],['i','j','k','l'],['m','n','o','p']]
and here is my function:
def add(change,unchange):
    a = change
    b = unchange
    a[0].insert(a[0].index(a[0][2]),"high_range")
    a[0].insert(a[0].index(a[0][3]),"low_range")
    print(a)
    print(b)
When I try to execute this function,
add(a,a[0])
I'm getting this output,
[['a', 'b', 'high_range', 'low_range', 'c', 'd'], ['e', 'f', 'g', 'h'], ['i', 'j', 'k', 'l'], ['m', 'n', 'o', 'p']]
['a', 'b', 'high_range', 'low_range', 'c', 'd']
But my expected output is the following,
[['a', 'b', 'high_range', 'low_range', 'c', 'd'], ['e', 'f', 'g', 'h'], ['i', 'j', 'k', 'l'], ['m', 'n', 'o', 'p']]
['a', 'b', 'c', 'd']
How can I keep the second variable holding the original, unchanged first element of the list? Sorry, I'm a newbie.
Since a list is a mutable type, values inserted into a[0] also show up in b, because both names refer to the same list object. You can either print b before inserting values into the list, or make b a copy of unchange like this:
def add(change,unchange):
    a = change
    b = unchange[:]
    a[0].insert(2, "high_range")
    a[0].insert(3, "low_range")
    print(a)
    print(b)
Also, a[0].index(a[0][2]) is redundant: as long as that value does not occur earlier in the list, you already know that the index is 2.
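The aliasing described above can be checked directly with the is operator. This is a small sketch using a shortened version of the list from the question; the names alias, shallow and deep are made up for illustration.

```python
import copy

a = [['a', 'b', 'c', 'd'], ['e', 'f', 'g', 'h']]

alias = a[0]               # another name for the same list object
shallow = a[0][:]          # new list holding the same elements
deep = copy.deepcopy(a)    # independent copy of the outer and inner lists

a[0].insert(2, "high_range")

print(alias is a[0])       # True: both names point at one object
print(alias)               # shows the inserted value
print(shallow)             # unchanged
print(deep[0])             # unchanged
```

A slice copy is enough here because the inner list holds only strings. For lists nested more deeply, copy.deepcopy is the safer choice.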
The main problem is in the line:
add(a, a[0])
Because you mutate a inside the function, a[0] changes as well: both refer to the same list. You need to design your program accordingly. You can refer to this answer: How to clone or copy a list?
Depending on your requirements, you can either supply a copy when calling the function:
add(a, a[0][:])
or follow @alec's answer.
Your function is fine as written; just call it like this:
add(a,a[0][:])
This passes a copy of a[0] as the second argument, so that copy is left unchanged.
I have a list that I called lst, it is as follows:
lst = ['A', 'C', 'T', 'G', 'A', 'C', 'G', 'C', 'A', 'G']
What I want to know is how to split this into four-letter strings: the first made of the first, second, third, and fourth letters, the next of the second, third, fourth, and fifth letters, and so on, and then add them to a new list to be compared against a main list.
Thanks
To get the first sublist, use lst[0:4]. Use Python's str.join method to merge it into a single string. Use a for loop to get all the sublists.
sequences = []
sequence_size = 4
lst = ['A', 'C', 'T', 'G', 'A', 'C', 'G', 'C', 'A', 'G']
for i in range(len(lst) - sequence_size + 1):
    sequence = ''.join(lst[i : i + sequence_size])
    sequences.append(sequence)
print(sequences)
All 4-grams (without padding):
# window size:
ws = 4
lst2 = [
    ''.join(lst[i:i+ws])
    for i in range(0, len(lst))
    if len(lst[i:i+ws]) == ws
]
Non-overlapping 4-grams:
lst3 = [
    ''.join(lst[i:i+ws])
    for i in range(0, len(lst), ws)
    if len(lst[i:i+ws]) == ws
]
I think the other answers solve your problem, but if you are looking for a Pythonic way to do this, I used a list comprehension. It is often recommended for concise code, although it can sometimes reduce readability. It is also considerably shorter.
lst = ['A', 'C', 'T', 'G', 'A', 'C', 'G', 'C', 'A', 'G']
result = [''.join(lst[i:i+4]) for i in range(len(lst)-3)]
print(result)
Use:
lst = ['A', 'C', 'T', 'G', 'A', 'C', 'G', 'C', 'A', 'G']
i=0
New_list=[]
while i<(len(lst)-3):
    New_list.append(lst[i]+lst[i+1]+lst[i+2]+lst[i+3])
    i+=1
print(New_list)
Output:
['ACTG', 'CTGA', 'TGAC', 'GACG', 'ACGC', 'CGCA', 'GCAG']
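The question also mentions comparing the windows against a main list. Here is a rough sketch of that step, assuming the comparison is a simple membership test; the helper windows and the contents of main_list are hypothetical and only there for illustration.

```python
def windows(seq, size):
    """Yield each run of `size` consecutive items of seq, joined into a string."""
    for i in range(len(seq) - size + 1):
        yield ''.join(seq[i:i + size])

lst = ['A', 'C', 'T', 'G', 'A', 'C', 'G', 'C', 'A', 'G']
main_list = ['TGAC', 'GGGG', 'GCAG']   # hypothetical reference list

# A set makes each membership test constant time on average.
main_set = set(main_list)
matches = [w for w in windows(lst, 4) if w in main_set]
print(matches)
```

Keeping the window size as a parameter avoids hard-coding 4 in several places, and the generator produces nothing at all when the sequence is shorter than the window.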
I want to split a list sequence of items in Python or group them if they are similar.
I already found a solution but I would like to know if there is a better and more efficient way to do it (always up to learn more).
Here is the main goal
input = ['a','a', 'i', 'e', 'e', 'e', 'i', 'i', 'a', 'a']
desired_output = [['a','a'], ['i'], ['e','e', 'e'], ['i', 'i'], ['a', 'a']]
So basically I chose to group by similar neighbours. I tried to find a way to split them where they differ, but had no success doing it.
I'm also keen to hear a better way to state the problem.
#!/usr/bin/env python3
def group_seq(listA):
    listA = [[n] for n in listA]
    for i,l in enumerate(listA):
        _curr = l
        _prev = None
        _next= None
        if i+1 < len(listA):
            _next = listA[i+1]
        if i > 0:
            _prev = listA[i-1]
        if _next is not None and _curr[-1] == _next[0]:
            listA[i].extend(_next)
            listA.pop(i+1)
        if _prev is not None and _curr[0] == _prev[0]:
            listA[i].extend(_prev)
            listA.pop(i-1)
    return listA
listA = ['a','a', 'i', 'e', 'e', 'e', 'i', 'i', 'a', 'a']
output = group_seq(listA)
print(listA)
['a', 'a', 'i', 'e', 'e', 'e', 'i', 'i', 'a', 'a']
print(output)
[['a', 'a'], ['i'], ['e', 'e', 'e'], ['i', 'i'], ['a', 'a']]
I think itertools.groupby is probably the nicest way to do this. It's flexible and efficient enough that it's rarely to your advantage to re-implement it yourself:
from itertools import groupby
inp = ['a','a', 'i', 'e', 'e', 'e', 'i', 'i', 'a', 'a']
output = [list(g) for k,g in groupby(inp)]
print(output)
prints
[['a', 'a'], ['i'], ['e', 'e', 'e'], ['i', 'i'], ['a', 'a']]
If you do implement it yourself, it can probably be much simpler. Just keep track of the previous value and the current list you're appending to:
def group_seq(listA):
    prev = None
    cur = None
    ret = []
    for l in listA:
        if l == prev: # assumes list doesn't contain None
            cur.append(l)
        else:
            cur = [l]
            ret.append(cur)
        prev = l
    return ret
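The hand-written version above assumes the list contains no None. One possible way around that assumption, sketched here under the hypothetical name group_runs, is to compare each item with the last element of the most recent group instead of tracking a separate previous value.

```python
def group_runs(items):
    """Group consecutive equal items; None values are handled like any other."""
    groups = []
    for item in items:
        if groups and groups[-1][-1] == item:
            groups[-1].append(item)     # same as the previous item: extend the run
        else:
            groups.append([item])       # different (or first) item: start a new run
    return groups

print(group_runs(['a', 'a', 'i', 'e', 'e', 'e', 'i', 'i', 'a', 'a']))
print(group_runs([None, None, 1]))
```

The check on groups being non-empty replaces the sentinel, so the first item always starts a new group regardless of its value.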
When I write this code:
f=['a','b','c',['d','e','f']]
def j(f):
    p=f[:]
    for i in range(len(f)):
        if type(p[i]) == list:
            p[i].reverse()
    p.reverse()
    return p
print(j(f), f)
I expect that the result would be:
[['f', 'e', 'd'], 'c', 'b', 'a'] ['a', 'b', 'c', ['d', 'e', 'f']]
But the result I see is:
[['f', 'e', 'd'], 'c', 'b', 'a'] ['a', 'b', 'c', ['f', 'e', 'd']]
Why? And how can I write code that does what I expect?
f[:] makes only a shallow copy, so p[i] and f[i] are the same inner list, and reverse modifies that list in place. You actually want to create a new list instead of reversing the one you've got, something like this:
def j(f):
    p=f[:]
    for i in range(len(f)):
        if type(p[i]) == list:
            p[i] = p[i][::-1]
    p.reverse()
    return p
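Another option, sketched here as an alternative rather than as part of the answer above, is to start from copy.deepcopy so that in-place reversal only touches the copy. The name j_deepcopy is hypothetical.

```python
import copy

def j_deepcopy(f):
    """Reverse f and its nested lists without touching the original."""
    p = copy.deepcopy(f)          # nested lists are copied too
    for item in p:
        if isinstance(item, list):
            item.reverse()        # safe: this inner list belongs to p only
    p.reverse()
    return p

f = ['a', 'b', 'c', ['d', 'e', 'f']]
shallow = f[:]
print(shallow[3] is f[3])         # True: a slice copy shares the inner list
print(j_deepcopy(f), f)
```

The deep copy costs more than a slice for large structures, so the slicing fix is usually enough when only one level of nesting is involved.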
I have a dataset along the lines of:
data = [['a', 'b', 'c'], ['a', 'x', 'y', 'z'], ['a', 'x', 'e', 'f'], ['a']]
I've searched SO and found ways to return duplicates across all lists using intersection_update() (so, in this example, 'a'), but I actually want to return duplicates from any lists, i.e.,:
retVal = ['a', 'x']
Since 'a' and 'x' are duplicated at least once among all lists. Is there a built-in for Python 2.7 that can do this?
Use a Counter to determine the number of each item and chain.from_iterable to pass the items from the sublists to the Counter.
from itertools import chain
from collections import Counter
data=[['a', 'b', 'c'], ['a', 'x', 'y', 'z'], ['a', 'x', 'e', 'f'], ['a']]
c = Counter(chain.from_iterable(data))
retVal = [k for k, count in c.items() if count >= 2]
print(retVal)
# ['x', 'a'] (order may vary; Python 3.7+ gives ['a', 'x'])
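One caveat: the Counter above counts every occurrence, so an item repeated inside a single sublist would also be reported. If "duplicate" should mean "appears in at least two different sublists", a possible variation is to deduplicate each sublist first. The extra ['q', 'q'] sublist below is made up to show the difference.

```python
from collections import Counter
from itertools import chain

data = [['a', 'b', 'c'], ['a', 'x', 'y', 'z'], ['a', 'x', 'e', 'f'], ['a'],
        ['q', 'q']]   # hypothetical extra sublist with an internal repeat

# Count total occurrences: 'q' qualifies because it appears twice in one sublist.
total = Counter(chain.from_iterable(data))
by_occurrence = sorted(k for k, count in total.items() if count >= 2)

# Deduplicate each sublist first so an item counts once per sublist.
per_list = Counter(chain.from_iterable(set(sub) for sub in data))
by_sublist = sorted(k for k, count in per_list.items() if count >= 2)

print(by_occurrence)
print(by_sublist)
```

Sorting the result makes the output order stable across Python versions.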