For example, given the original list:
['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
We want to split the list into sublists that each start with 'a' and end with 'a', like the following:
['a','b','c','a']
['a','d','e','a']
['a','b','e','f','j','a']
['a','c','a']
The final output can also be a list of lists. I have tried a double for loop approach with 'a' as the condition, but this is inefficient and not Pythonic.
One possible solution uses re (regex). Because it joins the list into a string, it assumes every element is a single character:
import re
l = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
r = [list(f"a{_}a") for _ in re.findall("(?<=a)[^a]+(?=a)", "".join(l))]
print(r)
# [['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
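You can do this in one loop:
lst = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
out = [[]]
for i in lst:
    if i == 'a':
        out[-1].append(i)  # close the current sublist with this 'a'
        out.append([])     # and start a new one
    out[-1].append(i)
# The first sublist holds whatever preceded the first 'a' and the last one
# is never closed by an 'a', so drop both.
out = out[1:-1]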
Also using numpy.split:
import numpy as np
arr = np.array(lst)
out = [ary.tolist() + ['a'] for ary in np.split(arr, np.where(arr == 'a')[0])[1:-1]]
Output:
[['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
Firstly you can store the indices of 'a' from the list.
oList = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
idx_a = list()
for idx, char in enumerate(oList):
    if char == 'a':
        idx_a.append(idx)
Then, for every pair of consecutive indices, you can take the sub-list between them (inclusive) and store it in a list:
# Stop one short of the end so that idx_a[x + 1] always exists.
ans = [oList[idx_a[x]:idx_a[x + 1] + 1] for x in range(len(idx_a) - 1)]
You can also get more such lists if you pair up non-consecutive indices as well.
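The index-pairing idea above can also be sketched with zip, which pairs each position with the next one and so avoids indexing idx_a[x + 1] by hand. The function name split_between and its marker parameter are made up for this illustration. Unlike the string-joining answers, this version should also work when the elements are not single characters.

```python
def split_between(items, marker):
    """Return every slice of items that runs from one marker to the next."""
    # Positions of every occurrence of the marker.
    positions = [i for i, item in enumerate(items) if item == marker]
    # zip stops at the shorter input, so no index runs past the end.
    return [items[start:end + 1] for start, end in zip(positions, positions[1:])]


o_list = ['k', 'a', 'b', 'c', 'a', 'd', 'e', 'a', 'b', 'e', 'f', 'j', 'a', 'c', 'a', 'b']
print(split_between(o_list, 'a'))
# [['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
```

If the marker occurs fewer than two times, the result is simply an empty list.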
You can do this with a single iteration and a simple state machine:
original_list = list('kabcadeabefjacab')
multiple_lists = []
for c in original_list:
    if multiple_lists:
        multiple_lists[-1].append(c)
    if c == 'a':
        multiple_lists.append([c])
# The last sublist is always still open (no closing 'a' yet), so discard it.
if multiple_lists:
    multiple_lists.pop()
print(multiple_lists)
[['a', 'b', 'c', 'a'], ['a', 'd', 'e', 'a'], ['a', 'b', 'e', 'f', 'j', 'a'], ['a', 'c', 'a']]
We can use str.join() to turn the list into a string, str.split() to split it on "a", and then an f-string to add back the "a"s that the split removed. Note that even if the list starts or ends with an "a", the split result still contains an empty string for the (empty) substring before the first or after the last "a", so the unpacking logic that discards the first and last subsequences still works as intended.
def split(data):
    _, *subseqs, _ = "".join(data).split("a")
    return [list(f"a{seq}a") for seq in subseqs]
Output:
>>> from pprint import pprint
>>> testdata = ['k','a','b','c','a','d','e','a','b','e','f','j','a','c','a','b']
>>> pprint(split(testdata))
[['a', 'b', 'c', 'a'],
 ['a', 'd', 'e', 'a'],
 ['a', 'b', 'e', 'f', 'j', 'a'],
 ['a', 'c', 'a']]
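The claim about lists that start or end with "a" can be checked with a couple of small inputs. This is only a sketch: it repeats the split function from above so that it runs on its own, and the test inputs are invented. It also assumes at least one "a" is present, since otherwise the unpacking has too few values and raises ValueError.

```python
def split(data):
    # Joining assumes every element is a single-character string.
    _, *subseqs, _ = "".join(data).split("a")
    return [list(f"a{seq}a") for seq in subseqs]


# Starts and ends with 'a': the empty strings on both sides are discarded.
print(split(list("abca")))  # [['a', 'b', 'c', 'a']]

# Two adjacent 'a's produce an empty subsequence, which becomes ['a', 'a'].
print(split(list("xaay")))  # [['a', 'a']]
```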
I want to reorder my list in a given order,
For example I have a list of ['a', 'b', 'c', 'd', 'e', 'f', 'g']
This has indices [0,1,2,3,4,5,6], and let's say the new order is [3,5,6,1,2,4,0], which would result in ['d','f','g', 'b', 'c', 'e', 'a'].
How would you write code that does this?
I thought of using a for loop, like
for i in range(len(list)):
and after that I thought of using append, or maybe creating a new list, but I'm not sure if I'm approaching this right.
All you need to do is iterate the list of indexes, and use it to access the list of elements, like this:
elems = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
idx = [3,5,6,1,2,4,0]
result = [elems[i] for i in idx]
print(result)
Output:
['d', 'f', 'g', 'b', 'c', 'e', 'a']
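As a possible alternative to the list comprehension, operator.itemgetter from the standard library can fetch all positions in one call. This is a sketch that reuses the elems and idx names from above. Note the assumption that idx has at least two entries: with a single index itemgetter returns the bare item rather than a tuple, and with no indices it raises TypeError.

```python
from operator import itemgetter

elems = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
idx = [3, 5, 6, 1, 2, 4, 0]

# itemgetter(*idx) builds a callable that looks up every index in order.
# With two or more indices it returns a tuple, so convert it to a list.
result = list(itemgetter(*idx)(elems))
print(result)  # ['d', 'f', 'g', 'b', 'c', 'e', 'a']
```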
You can also use numpy.take (note that the result is a NumPy array, not a list):
import numpy as np
my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
def my_function(my_list, index):
    return np.take(my_list, index)
print(my_function(my_list, [3,5,6,1,2,4,0]))
Output: ['d' 'f' 'g' 'b' 'c' 'e' 'a']
I am using a for loop to remove items one at a time and build a new list (feature_combination) containing each intermediate combination.
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    feature_combination.append(feature_list)
feature_combination
The ideal output should be:
[['A', 'B', 'C', 'D', 'E', 'F'],['A', 'B', 'C', 'D', 'E'],['A', 'B', 'C', 'D'],['A', 'B', 'C'],['A', 'B'],['A'], []]
But the current output is:
[[], [], [], [], [], [], []]
When I print the progress step by step:
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    print(feature_list)
I get the following results:
['A', 'B', 'C', 'D', 'E', 'F']
['A', 'B', 'C', 'D', 'E']
['A', 'B', 'C', 'D']
['A', 'B', 'C']
['A', 'B']
['A']
[]
So why can't I append these results to an empty list? What is the problem?
It's because when you call feature_combination.append(feature_list), you are appending a reference to feature_list, not the actual value of feature_list. Since feature_list is empty at the end of the for loop, all of the references to it are empty as well.
You can fix it by changing feature_combination.append(feature_list) to feature_combination.append(feature_list.copy()), which makes a copy of the list to store.
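The difference between appending a reference and appending a copy can be seen side by side in a shortened version of the loop. This is a small sketch; the names aliases and copies are made up for the illustration.

```python
feature_list = ['A', 'B', 'C']
aliases = []
copies = []
for _ in range(3):
    feature_list.pop()
    aliases.append(feature_list)        # the same list object every time
    copies.append(feature_list.copy())  # an independent snapshot

print(aliases)  # [[], [], []]
print(copies)   # [['A', 'B'], ['A'], []]
# Every entry in aliases is literally feature_list itself.
print(all(item is feature_list for item in aliases))  # True
```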
Note that pop() without an argument already removes the last element, so you do not need to pass an index here. Still, I find pop unnecessary for this; you could use slicing instead, which creates a new list on every iteration.
Below is an example of how you could accomplish your goal. This code produces your desired output.
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list = feature_list[:-1]
    feature_combination.append(feature_list)
print(feature_combination)
Output:
[['A', 'B', 'C', 'D', 'E', 'F'], ['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C'], ['A', 'B'], ['A'], []]
A Python variable is a symbolic name that is a reference (or pointer) to an object. Once an object is assigned to a variable, you can refer to the object by that name, but the data itself is still contained within the object.
This happens because feature_list points to one specific object, which keeps changing as you pop from it. You are basically creating a list of the form [object, object, object, ...] in which every entry points to the same feature_list object. As you keep popping, every entry in the collecting list reflects the change, because they are all that one object.
Here is how you can see this happening:
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    feature_combination.append(feature_list)
    print('iteration', i)
    print(feature_combination)  # Print the primary list after each iteration
iteration 0
[['A', 'B', 'C', 'D', 'E', 'F']]
iteration 1
[['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'C', 'D', 'E']]
iteration 2
[['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
iteration 3
[['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C'], ['A', 'B', 'C']]
iteration 4
[['A', 'B'], ['A', 'B'], ['A', 'B'], ['A', 'B'], ['A', 'B']]
iteration 5
[['A'], ['A'], ['A'], ['A'], ['A'], ['A']]
iteration 6
[[], [], [], [], [], [], []]
Notice that after each iteration, every sublist in the main list reflects the latest pop, because each one is the same object.
A fix
A fix is to use a slice to get and store a copy.
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
feature_combination = []
for i in range(7):
    feature_list.pop()
    print(feature_list)
    feature_combination.append(feature_list[:])  # <---- store a copy, not the list itself
feature_combination
[['A', 'B', 'C', 'D', 'E', 'F'],
 ['A', 'B', 'C', 'D', 'E'],
 ['A', 'B', 'C', 'D'],
 ['A', 'B', 'C'],
 ['A', 'B'],
 ['A'],
 []]
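Since each slice is already a new list, the same result can probably be built without popping at all. The following is a sketch, not a replacement for the fix above: it takes progressively shorter prefixes of feature_list and leaves the original list untouched.

```python
feature_list = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# n counts down from len - 1 to 0, so the prefixes shrink by one item each time.
feature_combination = [feature_list[:n] for n in range(len(feature_list) - 1, -1, -1)]

print(feature_combination)
# [['A', 'B', 'C', 'D', 'E', 'F'], ['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'C', 'D'],
#  ['A', 'B', 'C'], ['A', 'B'], ['A'], []]
```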
I have a list
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
and I want to split it into new lists at each 'k' (dropping the 'k'), and turn the result into a tuple, so that I get
(['a'],['b', 'c'], ['d', 'e', 'g'])
I am thinking about first splitting them into different lists by using a for loop.
new_lst = []
for element in lst:
    if element != 'k':
        new_lst.append(element)
This does remove all the 'k's, but the remaining elements all end up together. I do not know how to split them into different lists. To turn a list into a tuple like this, I would need a list of lists:
a = [['a'],['b', 'c'], ['d', 'e', 'g']]
tuple(a) == (['a'], ['b', 'c'], ['d', 'e', 'g'])
True
So the question is how to split the list into a list of sublists.
You are close. You can append to another list called sublist, and whenever you find a 'k', append sublist to new_lst:
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
new_lst = []
sublist = []
for element in lst:
    if element != 'k':
        sublist.append(element)
    else:
        new_lst.append(sublist)
        sublist = []
if sublist:  # add the last sublist
    new_lst.append(sublist)
result = tuple(new_lst)
print(result)
# (['a'], ['b', 'c'], ['d', 'e', 'g'])
If you're feeling adventurous, you can also use groupby. The idea is to group elements as "k" or "non-k" and use groupby on that property:
from itertools import groupby
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
result = tuple(list(gp) for is_k, gp in groupby(lst, "k".__eq__) if not is_k)
print(result)
# (['a'], ['b', 'c'], ['d', 'e', 'g'])
Thanks @YakymPirozhenko for the simpler generator expression:
tuple(list(i) for i in ''.join(lst).split('k'))
Output:
(['a'], ['b', 'c'], ['d', 'e', 'g'])
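One caveat worth illustrating: the join/split one-liner relies on every element being a single character. The sketch below uses an invented list of longer strings to show where it diverges from the groupby approach, which compares whole elements.

```python
from itertools import groupby

words = ['ab', 'k', 'cd', 'ke', 'k', 'f']

# Joining loses the element boundaries, and the 'k' inside 'ke' also splits.
joined = tuple(list(part) for part in ''.join(words).split('k'))
print(joined)   # (['a', 'b'], ['c', 'd'], ['e'], ['f'])

# groupby only treats an element as a separator if it equals 'k' exactly.
grouped = tuple(list(gp) for is_k, gp in groupby(words, 'k'.__eq__) if not is_k)
print(grouped)  # (['ab'], ['cd', 'ke'], ['f'])
```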
Here's a different approach, using re.split from the re module, and map:
import re
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
tuple(map(list, re.split('k',''.join(lst))))
(['a'], ['b', 'c'], ['d', 'e', 'g'])
smallerlist = [l.split(',') for l in ','.join(lst).split('k')]
print(smallerlist)
Outputs
[['a', ''], ['', 'b', 'c', ''], ['', 'd', 'e', 'g']]
Then you can remove the empty strings from each sublist:
smallerlist = [' '.join(l).split() for l in smallerlist]
print(smallerlist)
Outputs
[['a'], ['b', 'c'], ['d', 'e', 'g']]
How about slicing, without appending or joining?
def isplit_list(lst, v):
    while True:
        try:
            end = lst.index(v)
        except ValueError:
            break
        yield lst[:end]
        lst = lst[end+1:]
    if len(lst):
        yield lst
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g', 'k']
results = tuple(isplit_list(lst, 'k'))
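For the input above, results should be (['a'], ['b', 'c'], ['d', 'e', 'g']). A possible variant, sketched below under the hypothetical name isplit_list_by_offset, keeps a start offset and uses the optional start argument of list.index instead of re-slicing the remainder of the list on every pass. It is meant to yield the same pieces as isplit_list.

```python
def isplit_list_by_offset(lst, v):
    start = 0
    while True:
        try:
            # Search for the next separator from the current offset onward.
            end = lst.index(v, start)
        except ValueError:
            break
        yield lst[start:end]
        start = end + 1
    # Yield whatever is left after the last separator, if anything.
    if start < len(lst):
        yield lst[start:]


lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g', 'k']
results = tuple(isplit_list_by_offset(lst, 'k'))
print(results)  # (['a'], ['b', 'c'], ['d', 'e', 'g'])
```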
Try this; it works and doesn't need any imports!
>>> l = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
>>> t = []
>>> for s in ''.join(l).split('k'):
...     t.append(list(s))
...
>>> t
[['a'], ['b', 'c'], ['d', 'e', 'g']]
>>> t = tuple(t)
>>> t
(['a'], ['b', 'c'], ['d', 'e', 'g'])
Why not write a function that takes a list as an argument and returns a tuple, like so:
>>> def list_to_tuple(l):
...     t = []
...     for s in l:
...         t.append(list(s))
...     return tuple(t)
...
>>> l = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
>>> l = ''.join(l).split('k')
>>> l = list_to_tuple(l)
>>> l
(['a'], ['b', 'c'], ['d', 'e', 'g'])
Another approach, using the third-party more_itertools package:
import more_itertools
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
print(tuple(more_itertools.split_at(lst, lambda x: x == 'k')))
gives
(['a'], ['b', 'c'], ['d', 'e', 'g'])
I have a list of characters:
Char_list = ['C', 'A', 'G']
and a list of lists:
List_List = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
I would like to remove each Char_list[i] from the list of corresponding index i in List_List.
Output must be as follows:
[['A','T'], ['C', 'T', 'G'], ['A', 'C']]
What I am trying is:
for i in range(len(Char_list)):
    for j in range(len(List_List)):
        if Char_list[i] in List_List[j]:
            List_List[j].remove(Char_list[i])
print(List_List)
But with the above code, each character is removed from all of the lists.
How can I remove Char_list[i] only from the corresponding list in List_List?
Instead of using explicit indices, zip your two lists together, then apply a list comprehension to filter out the unwanted character for each position.
>>> char_list = ['C', 'A', 'G']
>>> list_list = [['A', 'C', 'T'], ['C','A', 'T', 'G'], ['A', 'C', 'G']]
>>> [[x for x in l if x != y] for l, y in zip(list_list, char_list)]
[['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
You may use enumerate with nested list comprehension expression as:
>>> char_list = ['C', 'A', 'G']
>>> nested_list = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
>>> [[j for j in i if j!=char_list[n]] for n, i in enumerate(nested_list)]
[['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
I also suggest taking a look at PEP 8 - Naming Conventions: variable names should be lowercase rather than starting with a capital letter.
Char_list = ['C', 'A', 'G']
List_List = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
for i in range(len(Char_list)):
    List_List[i].remove(Char_list[i])
print(List_List)
Output:
[['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
If the characters repeat in the nested lists, use this:
Char_list = ['C', 'A', 'G']
List_List = [['A', 'C','C','C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
for i in range(len(Char_list)):
    # remove() deletes one occurrence at a time, so repeat it once per occurrence.
    for j in range(List_List[i].count(Char_list[i])):
        List_List[i].remove(Char_list[i])
print(List_List)
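One more case to be aware of: list.remove raises ValueError when the item is not in the list. The sketch below pairs the lists with zip and checks membership first. The input is invented so that the last sublist does not contain its character, and lowercase names are used as PEP 8 suggests.

```python
char_list = ['C', 'A', 'G']
list_list = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'T']]

for sub, ch in zip(list_list, char_list):
    # Only remove when the character is present, to avoid a ValueError.
    if ch in sub:
        sub.remove(ch)

print(list_list)  # [['A', 'T'], ['C', 'T', 'G'], ['A', 'T']]
```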