If I have a list
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
and I want to split it into new lists at each 'k' (dropping the 'k'), then turn the result into a tuple, so that I get
(['a'],['b', 'c'], ['d', 'e', 'g'])
I am thinking of first splitting it into separate lists with a for loop.
new_lst = []
for element in lst:
    if element != 'k':
        new_lst.append(element)
This does remove all the 'k's, but the remaining elements all end up together in one list. I do not know how to split them into separate lists. To turn the result into a tuple, I would need a list of lists:
a = [['a'],['b', 'c'], ['d', 'e', 'g']]
tuple(a) == (['a'], ['b', 'c'], ['d', 'e', 'g'])
True
So the question is how to split the list into a list of sublists.
You are close. Append each element to a second list called sublist, and whenever you find a 'k', append sublist to new_lst and start a fresh one:
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
new_lst = []
sublist = []
for element in lst:
    if element != 'k':
        sublist.append(element)
    else:
        new_lst.append(sublist)
        sublist = []
if sublist: # add the last sublist
    new_lst.append(sublist)
result = tuple(new_lst)
print(result)
# (['a'], ['b', 'c'], ['d', 'e', 'g'])
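One thing to watch for with the loop above: a leading 'k', or two 'k's in a row, will add empty sublists to the result. If that is not what you want, a small variation only appends non-empty sublists. This is just a sketch, and the name split_skip_empty is mine, not from the question:

```python
def split_skip_empty(lst, sep):
    """Split lst at each element equal to sep, skipping empty sublists."""
    new_lst = []
    sublist = []
    for element in lst:
        if element != sep:
            sublist.append(element)
        elif sublist:  # only store the sublist if it has something in it
            new_lst.append(sublist)
            sublist = []
    if sublist:  # add the last sublist
        new_lst.append(sublist)
    return tuple(new_lst)


print(split_skip_empty(['k', 'a', 'k', 'k', 'b'], 'k'))
# (['a'], ['b'])
```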
If you're feeling adventurous, you can also use groupby. The idea is to group elements as "k" or "non-k" and use groupby on that property:
from itertools import groupby
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
result = tuple(list(gp) for is_k, gp in groupby(lst, "k".__eq__) if not is_k)
print(result)
# (['a'], ['b', 'c'], ['d', 'e', 'g'])
Thanks #YakymPirozhenko for the simpler generator expression
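To see why filtering on is_k works, it may help to look at the raw groups before anything is discarded. A small sketch (it uses a lambda as the key instead of "k".__eq__, which should behave the same for these strings):

```python
from itertools import groupby

lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']

# groupby starts a new group every time the key value changes
groups = [(is_k, list(gp)) for is_k, gp in groupby(lst, key=lambda x: x == 'k')]
print(groups)
# [(False, ['a']), (True, ['k']), (False, ['b', 'c']), (True, ['k']), (False, ['d', 'e', 'g'])]
```

Keeping only the groups whose key is False leaves exactly the runs between the 'k's.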
tuple(list(i) for i in ''.join(lst).split('k'))
Output:
(['a'], ['b', 'c'], ['d', 'e', 'g'])
Here's a different approach, using re.split from the re module, and map:
import re
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
tuple(map(list, re.split('k',''.join(lst))))
(['a'], ['b', 'c'], ['d', 'e', 'g'])
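Keep in mind that the join/split variants (str.split and re.split alike) only work because every item here is a single character. A quick sketch with made-up multi-character items shows where they go wrong, while the groupby version still behaves:

```python
from itertools import groupby

words = ['ab', 'k', 'cd', 'ok', 'k', 'e']  # hypothetical data, not from the question

# Joining loses the item boundaries, and the 'k' inside 'ok' is treated as a separator too
joined = tuple(list(part) for part in ''.join(words).split('k'))
print(joined)
# (['a', 'b'], ['c', 'd', 'o'], [], ['e'])

# Comparing whole items keeps multi-character strings intact
grouped = tuple(list(gp) for is_k, gp in groupby(words, lambda x: x == 'k') if not is_k)
print(grouped)
# (['ab'], ['cd', 'ok'], ['e'])
```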
smallerlist = [l.split(',') for l in ','.join(lst).split('k')]
print(smallerlist)
Outputs
[['a', ''], ['', 'b', 'c', ''], ['', 'd', 'e', 'g']]
Then you can strip the empty strings ('') out of each sublist:
smallerlist = [' '.join(l).split() for l in smallerlist]
print(smallerlist)
Outputs
[['a'], ['b', 'c'], ['d', 'e', 'g']]
How about slicing, without appending or joining?
def isplit_list(lst, v):
    while True:
        try:
            end = lst.index(v)
        except ValueError:
            break
        yield lst[:end]
        lst = lst[end+1:]
    if len(lst):
        yield lst
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g', 'k']
results = tuple(isplit_list(lst, 'k'))
print(results)
# (['a'], ['b', 'c'], ['d', 'e', 'g'])
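Each lst = lst[end+1:] above copies the rest of the list, which can add up on long inputs. A possible variation keeps a start offset and passes it as the optional start argument of list.index instead. This is a sketch that assumes the same behaviour is wanted, and isplit_list_by_index is a made-up name:

```python
def isplit_list_by_index(lst, v):
    """Yield the slices of lst between occurrences of v, without re-copying the tail."""
    start = 0
    while True:
        try:
            # Search for the next separator from the current offset onwards
            end = lst.index(v, start)
        except ValueError:
            break
        yield lst[start:end]
        start = end + 1
    if start < len(lst):
        yield lst[start:]


lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g', 'k']
print(tuple(isplit_list_by_index(lst, 'k')))
# (['a'], ['b', 'c'], ['d', 'e', 'g'])
```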
Try this, works and doesn't need any imports!
>>> l = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
>>> t = []
>>> for s in ''.join(l).split('k'):
...     t.append(list(s))
...
>>> t
[['a'], ['b', 'c'], ['d', 'e', 'g']]
>>> t = tuple(t)
>>> t
(['a'], ['b', 'c'], ['d', 'e', 'g'])
Why don't you make a function that takes a list as an argument and returns a tuple, like so:
>>> def list_to_tuple(l):
...     t = []
...     for s in l:
...         t.append(list(s))
...     return tuple(t)
...
>>> l = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
>>> l = ''.join(l).split('k')
>>> l = list_to_tuple(l)
>>> l
(['a'], ['b', 'c'], ['d', 'e', 'g'])
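Another approach, using the third-party more_itertools package (it is not part of the standard library, so it has to be installed first):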
import more_itertools
lst = ['a', 'k', 'b', 'c', 'k', 'd', 'e', 'g']
print(tuple(more_itertools.split_at(lst, lambda x: x == 'k')))
gives
(['a'], ['b', 'c'], ['d', 'e', 'g'])
Related
I have some extremely large lists of character strings I need to parse. I need to break them into smaller lists based on a pre-defined character string, and I figured out a way to do it, but I worry that this will not be performant on my real data. Is there a better way to do this?
My goal is to turn this list:
['a', 'b', 'string_to_split_on', 'c', 'd', 'e', 'f', 'g', 'string_to_split_on', 'h', 'i', 'j', 'k', 'string_to_split_on']
Into this list:
[['a', 'b'], ['c', 'd', 'e', 'f', 'g'], ['h', 'i', 'j', 'k']]
What I tried:
# List that replicates my data. `string_to_split_on` is a fixed character string I want to break my list up on
my_list = ['a', 'b', 'string_to_split_on', 'c', 'd', 'e', 'f', 'g', 'string_to_split_on', 'h', 'i', 'j', 'k', 'string_to_split_on']
# Inspect List
print(my_list)
# Create empty lists to store data in
new_list = []
good_letters = []
# Iterate over each string in the list
for i in my_list:
    # If the string is the separator, append data to new_list, reset `good_letters` and move to the next string
    if i == 'string_to_split_on':
        new_list.append(good_letters)
        good_letters = []
        continue
    # Append letter to the list of good letters
    else:
        good_letters.append(i)
# I just like printing things this way because it's easy to read
for item in new_list:
    print(item)
    print('-'*100)
### Output
['a', 'b', 'string_to_split_on', 'c', 'd', 'e', 'f', 'g', 'string_to_split_on', 'h', 'i', 'j', 'k', 'string_to_split_on']
['a', 'b']
----------------------------------------------------------------------------------------------------
['c', 'd', 'e', 'f', 'g']
----------------------------------------------------------------------------------------------------
['h', 'i', 'j', 'k']
----------------------------------------------------------------------------------------------------
You can also use one line of code:
original_list = ['a', 'b', 'string_to_split_on', 'c', 'd', 'e', 'f', 'g', 'string_to_split_on', 'h', 'i', 'j', 'k', 'string_to_split_on']
split_string = 'string_to_split_on'
new_list = [sublist.split() for sublist in ' '.join(original_list).split(split_string) if sublist]
print(new_list)
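The one-liner above assumes that none of the strings contain whitespace or the separator text. This approach works directly on the list without building an intermediate string, so it may be a better fit for a large data set: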
import itertools
new_list = [list(j) for k, j in itertools.groupby(original_list, lambda x: x != split_string) if k]
print(new_list)
[['a', 'b'], ['c', 'd', 'e', 'f', 'g'], ['h', 'i', 'j', 'k']]
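If performance matters, it is probably worth timing the candidates on data shaped like yours rather than guessing. A rough sketch with timeit follows; the synthetic data, the sizes, and the function names with_loop and with_groupby are all my own assumptions:

```python
import itertools
import timeit

sep = 'string_to_split_on'
# Synthetic stand-in for the real data: 20,000 chunks of four items, each followed by the separator
data = ['a', 'b', 'c', 'd', sep] * 20_000


def with_loop(items, sep):
    """The plain loop from the question."""
    result, current = [], []
    for item in items:
        if item == sep:
            result.append(current)
            current = []
        else:
            current.append(item)
    return result


def with_groupby(items, sep):
    """The itertools.groupby version."""
    return [list(g) for keep, g in itertools.groupby(items, lambda x: x != sep) if keep]


# Both should agree on this data before their timings are compared
assert with_loop(data, sep) == with_groupby(data, sep)

print('loop   :', timeit.timeit(lambda: with_loop(data, sep), number=5))
print('groupby:', timeit.timeit(lambda: with_groupby(data, sep), number=5))
```

The numbers will vary by machine and Python version, so treat them as a relative comparison only.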
I want to replace the elements in a list of lists based on a dictionary mapping table, and tried the code below:
lists_before = [['A', 'B', 'C'], ['A', 'D'], ['D', 'E']]
mapped_dictionary = {'A': 'G', 'B': 'G', 'C':'F'}
Below is the code I use:
lists_after = []
for element in lists_before:
    new_element = []
    for letter in element:
        if letter in list(mapped_dictionary.values()):
            letter = repl_dic.get(letter)
        new_element.append(letter)
    lists_after.append(new_element)
The output expected for lists_after is:
[['G', 'G', 'F'],['G','D'],['D','E']]
However, the output I got is still the same as lists_before.
I cannot figure out what went wrong. Could someone help me?
You can do it like this:
Input:
l = [['A', 'B', 'C'], ['A', 'D'], ['D', 'E']]
m = {'A': 'G', 'B': 'G', 'C': 'F'}
Code:
l_new = list()
for lst in l:
    lst_new = list()
    for ele in lst:
        lst_new.append(m.get(ele, ele))
    l_new.append(lst_new)
Output:
[['G', 'G', 'F'], ['G', 'D'], ['D', 'E']]
Or use a 1-liner:
[[m.get(ele, ele) for ele in lst] for lst in l]
[['G', 'G', 'F'], ['G', 'D'], ['D', 'E']]
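As for what went wrong in the original loop: as far as I can tell, it checks each letter against the dictionary's values ('G' and 'F') rather than its keys, so the condition is never true and nothing gets replaced. It also refers to repl_dic, which is not defined in the snippet shown. A sketch of the same loop with just those two things changed:

```python
lists_before = [['A', 'B', 'C'], ['A', 'D'], ['D', 'E']]
mapped_dictionary = {'A': 'G', 'B': 'G', 'C': 'F'}

lists_after = []
for element in lists_before:
    new_element = []
    for letter in element:
        # Test against the keys ('A', 'B', 'C'), not the values ('G', 'F')
        if letter in mapped_dictionary:
            letter = mapped_dictionary[letter]
        new_element.append(letter)
    lists_after.append(new_element)

print(lists_after)
# [['G', 'G', 'F'], ['G', 'D'], ['D', 'E']]
```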
I'm not sure if this is possible, but is there a way to combine 3 lists into a dictionary so that the list name is the key and the list of items is the value?
Example:
Inputs:
list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']
list3 = ['g', 'h', 'i']
output:
dict = {
'list1': ['a', 'b', 'c'],
'list2': ['d', 'e', 'f'],
'list3': ['g', 'h', 'i']
}
thanks
If you're able to define your lists inside their own function
(so that they are the only variables local to that function),
you can do this:
def loc():
    list1 = ['a', 'b', 'c']
    list2 = ['d', 'e', 'f']
    list3 = ['g', 'h', 'i']
    list_dictionary = locals()
    print(list_dictionary)
loc()
{'list1': ['a', 'b', 'c'], 'list2': ['d', 'e', 'f'], 'list3': ['g', 'h', 'i']}
Otherwise, you may need to resort to something more like this:
list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']
list3 = ['g', 'h', 'i']
list_dictionary = {}
for i in ('list1', 'list2', 'list3'):
    list_dictionary[i] = locals()[i]
print(list_dictionary)
{'list1': ['a', 'b', 'c'], 'list2': ['d', 'e', 'f'], 'list3': ['g', 'h', 'i']}
source: https://stackoverflow.com/a/3972978/5411817
If your variable names are repetitive, as in the example, you can build the list of variable names as strings:
variable_names_as_strings = []
for i in range(1,4):
    variable_names_as_strings.append('list' + str(i))
Then create your dictionary:
for i in variable_names_as_strings:
    list_dictionary[i] = locals()[i]
print(list_dictionary)
{'list1': ['a', 'b', 'c'], 'list2': ['d', 'e', 'f'], 'list3': ['g', 'h', 'i']}
more info on locals() (also look up globals()):
https://www.programiz.com/python-programming/methods/built-in/locals
https://docs.python.org/3.3/library/functions.html#locals
list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']
list3 = ['g', 'h', 'i']
dictionary = dict()
dictionary['list1'] = list1
dictionary['list2'] = list2
dictionary['list3'] = list3
print(dictionary)
output:
{'list1': ['a', 'b', 'c'], 'list2': ['d', 'e', 'f'], 'list3': ['g', 'h', 'i']}
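If you would rather not repeat one assignment per list, another option is to pair the names with the lists using zip. This is only a sketch, and it still means typing the names once as strings, since a list does not know the name of the variable that refers to it:

```python
list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']
list3 = ['g', 'h', 'i']

names = ['list1', 'list2', 'list3']
lists = [list1, list2, list3]

# zip pairs each name with the list at the same position
dictionary = dict(zip(names, lists))
print(dictionary)
# {'list1': ['a', 'b', 'c'], 'list2': ['d', 'e', 'f'], 'list3': ['g', 'h', 'i']}
```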
I have a list of characters:
Char_list = ['C', 'A', 'G']
and a list of lists:
List_List = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
I would like to remove each Char_list[i] from the list of corresponding index i in List_List.
Output must be as follows:
[['A','T'], ['C', 'T', 'G'], ['A', 'C']]
What I am trying is:
for i in range(len(Char_list)):
    for j in range(len(List_List)):
        if Char_list[i] in List_List[j]:
            List_List[j].remove(Char_list[i])
print(List_List)
But with the above code, each character is removed from all of the lists.
How can I remove Char_list[i] only from the corresponding list in List_List?
Instead of using explicit indices, zip your two lists together, then apply a list comprehension to filter out the unwanted character for each position.
>>> char_list = ['C', 'A', 'G']
>>> list_list = [['A', 'C', 'T'], ['C','A', 'T', 'G'], ['A', 'C', 'G']]
>>> [[x for x in l if x != y] for l, y in zip(list_list, char_list)]
[['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
You can use enumerate with a nested list comprehension:
>>> char_list = ['C', 'A', 'G']
>>> nested_list = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
>>> [[j for j in i if j!=char_list[n]] for n, i in enumerate(nested_list)]
[['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
I also suggest taking a look at the naming conventions in PEP 8: variable names should be lowercase, not capitalized.
Char_list = ['C', 'A', 'G']
List_List = [['A', 'C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
for i in range(len(Char_list)):
    List_List[i].remove(Char_list[i])
print(List_List)
OUTPUT
[['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
If a character can occur more than once in its nested list, use this:
Char_list = ['C', 'A', 'G']
List_List = [['A', 'C','C','C', 'T'], ['C', 'A', 'T', 'G'], ['A', 'C', 'G']]
for i in range(len(Char_list)):
    for j in range(List_List[i].count(Char_list[i])):
        List_List[i].remove(Char_list[i])
print(List_List)
# [['A', 'T'], ['C', 'T', 'G'], ['A', 'C']]
In Python, I have two lists that either have equal number of elements (e.g. 8 and 8) or one less than the other (e.g. 7 and 8; 3 and 4):
list1 = ['A', 'B', 'C', 'D']
list2 = ['E', 'F', 'G', 'H']
or
list3 = ['A', 'B', 'C']
list4 = ['D', 'E', 'F', 'G']
I'm trying to figure out the best way to build an algorithm that will switch the last half of the first list with the first half of the last list, resulting in this, when both lists have an even number of elements:
switched_list1 = ['A', 'B', 'E', 'F']
switched_list2 = ['C', 'D', 'G', 'H']
…and this when one of the lists has an odd number of elements:
switched_list3 = ['A', 'D', 'E']
switched_list4 = ['B', 'C', 'F', 'G']
What's the most efficient way to build an algorithm that can switch list elements like this?
list1 = ['A', 'B', 'C']
list2 = ['D', 'E', 'F', 'G']
nlist1 = len(list1) // 2  # integer division, so the result can be used as a slice index
nlist2 = len(list2) // 2
new1 = list1[:nlist1] + list2[:nlist2]
new2 = list1[nlist1:] + list2[nlist2:]
print(new1)
print(new2)
produces
['A', 'D', 'E']
['B', 'C', 'F', 'G']
>>> def StrangeSwitch(list1,list2):
    return (list1[:len(list1)//2]+list2[:len(list2)//2],list1[len(list1)//2:]+list2[len(list2)//2:])
>>> list1 = ['A', 'B', 'C', 'D']
>>> list2 = ['E', 'F', 'G', 'H']
>>> (list1,list2)=StrangeSwitch(list1,list2)
>>> list1
['A', 'B', 'E', 'F']
>>> list2
['C', 'D', 'G', 'H']
>>> list3 = ['A', 'B', 'C']
>>> list4 = ['D', 'E', 'F', 'G']
>>> (list3,list4)=StrangeSwitch(list3,list4)
>>> list3
['A', 'D', 'E']
>>> list4
['B', 'C', 'F', 'G']
>>>
Having read the comments by the OP, I would take the privilege of proposing another approach:
>>> import itertools
>>> def StrangeSwitchFast(list1,list2):
    # Same result as StrangeSwitch, but chains islice iterators instead of concatenating slices
    return (list(itertools.chain(itertools.islice(list1,0,len(list1)//2),itertools.islice(list2,0,len(list2)//2))),
            list(itertools.chain(itertools.islice(list1,len(list1)//2,None),itertools.islice(list2,len(list2)//2,None))))
The above doesn't create any temporary slice lists. If the OP wants iterators rather than lists for the downstream processing, the list() calls can safely be dropped so that the function returns a tuple of iterators.
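A sketch of that iterator-returning variant, for illustration only (the name strange_switch_iter is mine, and it uses // so it also runs on Python 3):

```python
import itertools


def strange_switch_iter(list1, list2):
    """Return two lazy iterators instead of two lists."""
    half1 = len(list1) // 2
    half2 = len(list2) // 2
    # First halves of both lists, chained together
    first = itertools.chain(itertools.islice(list1, 0, half1),
                            itertools.islice(list2, 0, half2))
    # Second halves of both lists, chained together
    second = itertools.chain(itertools.islice(list1, half1, None),
                             itertools.islice(list2, half2, None))
    return first, second


first, second = strange_switch_iter(['A', 'B', 'C'], ['D', 'E', 'F', 'G'])
print(list(first))
# ['A', 'D', 'E']
print(list(second))
# ['B', 'C', 'F', 'G']
```

Each iterator can only be consumed once, so wrap it in list() if the result is needed more than once.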