Python merging sublist - python

I've got the following list :
[['a','b','c'],['d','e'],['f','g','h','i','j']]
I would like a list like this :
['abc','de','fghij']
How is it possible?
[Edit]: in fact, my list could contain both strings and numbers,
l = [[1,2,3],[4,5,6], [7], [8,'a']]
and the result would be:
l = ['123', '456', '7', '8a']
Thanks to all.

You can apply the ''.join method to each sublist.
This can be done either with the map function or with a list comprehension.
The map function applies the function passed as its first argument to every element of an iterable:
initial = [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h', 'i', 'j']]
result = list(map(''.join, initial))  # list() is needed in Python 3, where map returns an iterator
Alternatively, use a list comprehension:
initial = [['a', 'b', 'c'], ['d', 'e'], ['f', 'g', 'h', 'i', 'j']]
result = [''.join(sublist) for sublist in initial]

Try
>>> L = [['a','b','c'],['d','e'],['f','g','h','i','j']]
>>> [''.join(x) for x in L]
['abc', 'de', 'fghij']
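For the mixed case from the question's edit, ''.join raises a TypeError as soon as a sublist contains an integer, because it only accepts strings. A minimal sketch of one way around that, assuming string results are acceptable, is to convert every item with str() first (merge_sublists is a hypothetical name used only for illustration):

```python
def merge_sublists(nested):
    """Join each sublist into one string, converting every item with str() first."""
    return [''.join(map(str, sublist)) for sublist in nested]


l = [[1, 2, 3], [4, 5, 6], [7], [8, 'a']]
merged = merge_sublists(l)
print(merged)  # ['123', '456', '7', '8a']
```

If the purely numeric entries should end up as integers again, they would have to be converted back with int() in a separate step.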

Related

Split list of lines into 2d array

I have a set of sequences in a list which looks like this:
['agghd', 'gjg', 'tomt']
How can I split it so that my output looks like the following:
[['a','g','g','h','d'],['g','j','g'],['t','o','m','t']]
I have written the following code so far:
agghd
gjh
tomt
list2=[]
list2 = [str(sequences.seq).split() for sequences in family]
You can split a string into its characters by calling list() on it:
list1 = ['agghd', 'gjg', 'tomt']
list2 = [list(string) for string in list1]
# output: [['a', 'g', 'g', 'h', 'd'], ['g', 'j', 'g'], ['t', 'o', 'm', 't']]
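To see why the attempt with split() did not produce single characters, the two calls can be compared side by side. This is only an illustrative sketch on the sample data from the answer above; by_split and by_list are names made up for the comparison:

```python
list1 = ['agghd', 'gjg', 'tomt']

# str.split() splits on whitespace, so a string without spaces stays in one piece.
by_split = [s.split() for s in list1]
# list() iterates over the string and yields one element per character.
by_list = [list(s) for s in list1]

print(by_split)  # [['agghd'], ['gjg'], ['tomt']]
print(by_list)   # [['a', 'g', 'g', 'h', 'd'], ['g', 'j', 'g'], ['t', 'o', 'm', 't']]
```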
You can try
[[eval(n) for n in str(sequences.seq).split()] for sequences in family]

How can I split a list in two unique lists in Python?

Hi, I have a list like the following:
listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
It has 15 members.
I want to turn it into 3 lists. The code below runs, but the 3 lists it produces can share members, and I want lists that have no members in common.
import random
listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
print(random.sample(listt,5))
print(random.sample(listt,5))
print(random.sample(listt,5))
Try this:
from random import shuffle
def randomise():
    listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
    shuffle(listt)
    return listt[:5], listt[5:10], listt[10:]
print(randomise())
This will print (for example, since it is random):
(['i', 'k', 'c', 'b', 'a'], ['d', 'j', 'h', 'n', 'f'], ['e', 'l', 'o', 'g', 'm'])
If it doesn't matter to you which items go in each list, then you're better off partitioning the list into thirds:
In [23]: L = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
In [24]: size = len(L)
In [25]: L[:size//3]
Out[25]: ['a', 'b', 'c', 'd', 'e']
In [26]: L[size//3:2*size//3]
Out[26]: ['f', 'g', 'h', 'i', 'j']
In [27]: L[2*size//3:]
Out[27]: ['k', 'l', 'm', 'n', 'o']
If you want them to have random elements from the original list, you'll just need to shuffle the input first:
random.shuffle(L)
Instead of sampling your list three times, which will always give you three independent results where individual members may be selected for more than a single list, you could just shuffle the list once and then split it in three parts. That way, you get three random subsets that will not share any items:
>>> random.shuffle(listt)
>>> listt[0:5]
['b', 'a', 'f', 'e', 'h']
>>> listt[5:10]
['c', 'm', 'g', 'j', 'o']
>>> listt[10:15]
['d', 'l', 'i', 'n', 'k']
Note that random.shuffle will shuffle the list in place, so the original list is modified. If you don’t want to modify the original list, you should make a copy first.
If your list is larger than the desired result set, then of course you can also sample your list once with the combined result size and then split the result accordingly:
>>> sample = random.sample(listt, 5 * 3)
>>> sample[0:5]
['h', 'm', 'i', 'k', 'd']
>>> sample[5:10]
['a', 'b', 'o', 'j', 'n']
>>> sample[10:15]
['c', 'l', 'f', 'e', 'g']
This solution will also avoid modifying the original list, so you will not need a copy if you want to keep it as it is.
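The sample-once-then-slice idea can be wrapped in a small helper that works for any number of groups. This is a sketch under assumptions, not a definitive implementation: split_randomly is a hypothetical name, and it assumes that uneven lengths should be handled by giving the first groups one extra element.

```python
import random


def split_randomly(items, parts):
    """Return `parts` disjoint lists that together contain every element of `items`."""
    # random.sample with the full length gives a shuffled copy; `items` is untouched.
    shuffled = random.sample(items, len(items))
    size, extra = divmod(len(shuffled), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks receive one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(shuffled[start:end])
        start = end
    return chunks


listt = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']
first, second, third = split_randomly(listt, 3)
print(first, second, third)
```

Because every element of the shuffled copy lands in exactly one slice, the groups cannot share members.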
Use [:] for slicing all members out of the list which basically copies everything into a new object. Alternatively just use list(<list>) which copies too:
print(random.sample(listt[:],5))
In case you want to shuffle only once, store the shuffle result into a variable and copy later:
output = random.sample(listt,5)
first = output[:]
second = output[:]
print(first is second, first is output) # False, False
and then the original list can be modified without the first or second being modified.
For nested lists you might want to use copy.deepcopy().
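The difference between a slice copy and copy.deepcopy() only shows up once the inner lists are modified. A short sketch with made-up sample data:

```python
import copy

nested = [['a', 'b'], ['c', 'd']]
shallow = nested[:]            # new outer list, but the same inner list objects
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0].append('x')

print(shallow[0])  # ['a', 'b', 'x']  (shares the inner list with `nested`)
print(deep[0])     # ['a', 'b']       (fully independent)
```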

Python: Get all combinations of sequential elements of list

Given an array say x = ['A','I','R']
I would want output as an
[['A','I','R'],['A','I'],['I','R'],['A'],['I'],['R']]
What I don't want as output is :
[['A','I','R'],['A','I'],['I','R'],['A','R'],['A'],['I'],['R']] # extra ['A','R'] which is not in sequence .
Below is the code which gives the output I don't want:
letter_list = [a for a in str]
all_word = []
for i in xrange(0,len(letter_list)):
    all_word = all_word + (map(list, itertools.combinations(letter_list,i))) # dont use append. gives wrong result.
all_word = filter(None,all_word) # remove empty combination
all_word = all_word + [letter_list] # add original list
My point is I only want combinations of sequences. Is there any way to use itertools or should I write custom function ?
Yes, you can use itertools:
>>> x = ['A', 'I', 'R']
>>> xs = [x[i:j] for i, j in itertools.combinations(range(len(x)+1), 2)]
>>> xs
[['A'], ['A', 'I'], ['A', 'I', 'R'], ['I'], ['I', 'R'], ['R']]
>>> sorted(xs, key=len, reverse=True)
[['A', 'I', 'R'], ['A', 'I'], ['I', 'R'], ['A'], ['I'], ['R']]
Credit: answer by hochl
Try to use yield:
x = ['A','I','R']
def groupme(x):
    s = tuple(x)
    for size in range(1, len(s) + 1):
        for index in range(len(s) + 1 - size):
            yield list(x[index:index + size])
list(groupme(x))
# [['A'], ['I'], ['R'], ['A', 'I'], ['I', 'R'], ['A', 'I', 'R']]
Don't try to be so magical: two loops will do what you want, the outer one over the possible sequence starts and the inner one over the possible sequence lengths:
x = "AIR" # strings are iterables/sequences, too!
all_words = []
for begin in range(len(x)):
    for length in range(1, len(x) - begin + 1):
        all_words.append(x[begin:begin+length])
Using a list comprehension:
letters=['A', 'I', 'R']
[letters[start:end+1]
 for start in range(len(letters))
 for end in range(start, len(letters))]
[['A'], ['A', 'I'], ['A', 'I', 'R'], ['I'], ['I', 'R'], ['R']]
If it is important to have the order you proposed (from longest to shortest and, for equal lengths, by starting position), you can do this instead:
[letters[start:start+l+1]
 for l in range(len(letters))[::-1]
 for start in range(len(letters)-l)]
[['A', 'I', 'R'], ['A', 'I'], ['I', 'R'], ['A'], ['I'], ['R']]
Just to address Holroy's comment: if you use a generator expression instead of a list comprehension (simply replace the outer [] with ()), the code needs much less memory. In that case, though, you can iterate over the result only once, and list operations such as len(), indexing, or removing elements do not work on it.
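A small sketch of the generator-expression variant and of the single-use caveat. The names runs, first_pass and second_pass are made up for the example:

```python
letters = ['A', 'I', 'R']

# Generator expression: each slice is produced on demand instead of all at once.
runs = (letters[start:end + 1]
        for start in range(len(letters))
        for end in range(start, len(letters)))

first_pass = list(runs)   # consumes the generator
second_pass = list(runs)  # the generator is already exhausted

print(first_pass)   # [['A'], ['A', 'I'], ['A', 'I', 'R'], ['I'], ['I', 'R'], ['R']]
print(second_pass)  # []
```

If the result is needed more than once, materialising it with list() as above gives up the memory saving but restores normal list behaviour.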

search an item of sublist in another list of list by position

I have a list of list created like
biglist=[['A'], ['C', 'T'], ['A', 'T']]
and I will have another list like
smalllist=[['C'], ['T'], ['A', 'T']]
So I want to check whether each item in a sublist of smalllist is contained in the sublist at the same index of biglist and, if not, append it to that sublist,
so, making
biglist=[['A','C'], ['C', 'T'], ['A', 'T']]
So 'C' from the first sublist of smalllist was added to the first sublist of biglist, but nothing was added to the second and third.
I tried this:
dd=zip(biglist, smalllist)
for each in dd:
    ll=each[0].extend(each[1])
    templist.append(list(set(ll)))
but I get an error:
templist.append(list(set(ll)))
TypeError: 'NoneType' object is not iterable
How to do it?
Thank you
Probably, you should try this:
(This only works if smalllist is no longer than biglist.)
SCRIPT:
biglist = [['A'], ['C', 'T'], ['A', 'T']]
smalllist = [['C'], ['T'], ['A', 'T']]
for i, group in enumerate(smalllist):
    for item in group:
        if item not in biglist[i]:
            biglist[i].append(item)
DEMO:
print(biglist)
# [['A', 'C'], ['C', 'T'], ['A', 'T']]
[list(set(s+b)) for (s,b) in zip(smalllist,biglist)]
list.extend modifies the list in place and returns None rather than the extended list, so ll in your case is None. Call each[0].extend(each[1]) on its own line, then set ll=each[0] on the next line, and your solution should start working.
Still, I don't see why you don't keep your elements in sets in the first place. That would save you from converting from list to set and back again.
I would just take the union of the sets (the | operator) instead of appending to the list and then filtering out duplicates by converting to a set and back to a list.
>>> templist = []
>>> for els1, els2 in zip(biglist, smalllist):
...     joined = list(set(els1) | set(els2))
...     templist.append(joined)
...
>>> templist
[['A', 'C'], ['C', 'T'], ['A', 'T']]
Keeping the elements in sets in the first place seems to be the fastest option in Python 3, even for such a small number of elements in each set (see comments):
biglist=[set(['A']), set(['C', 'T']), set(['A', 'T'])]
smalllist=[set(['C']), set(['T']), set(['A', 'T'])]
for els1,els2 in zip(biglist,smalllist):
    els1.update(els2)
print(biglist)
Output:
[{'A', 'C'}, {'C', 'T'}, {'A', 'T'}]
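Sets do not keep insertion order, so the elements inside each result may come out in a different order than shown. If the order matters, an order-preserving merge that leaves the inputs untouched could look roughly like this. It is a sketch under assumptions: merge_by_position is a hypothetical name, and zip() silently ignores any sublists in the longer input that have no partner.

```python
def merge_by_position(big, small):
    """Return a new list of lists: each big[i] plus the items of small[i] it lacks."""
    merged = [list(group) for group in big]  # copy so the inputs stay unchanged
    for target, extra in zip(merged, small):
        for item in extra:
            if item not in target:
                target.append(item)
    return merged


biglist = [['A'], ['C', 'T'], ['A', 'T']]
smalllist = [['C'], ['T'], ['A', 'T']]
result = merge_by_position(biglist, smalllist)
print(result)  # [['A', 'C'], ['C', 'T'], ['A', 'T']]
```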

How to maintain consistency in list?

I have a list like
lst = ['a', 'b', 'c', 'd', 'e', 'f']
I have a pop position list
p_list = [0,3]
[lst.pop(i) for i in p_list] changed the list to ['b', 'c', 'd', 'f']: the list was modified after the first iteration, so the next pop worked on the already modified list.
But I want to pop the elements at indices [0,3] of the original list, so my new list should be
['b', 'c', 'e', 'f']
Lots of reasonable answers, here's another perfectly terrible one:
[item for index, item in enumerate(lst) if index not in p_list]
You could pop the elements in order from largest index to smallest, like so:
lst = ['a', 'b', 'c', 'd', 'e', 'f']
p_list = [0,3]
p_list.sort()
p_list.reverse()
[lst.pop(i) for i in p_list]
lst
#output: ['b', 'c', 'e', 'f']
Do the pops in reversed order:
>>> lst = ['a', 'b', 'c', 'd', 'e', 'f']
>>> p_list = [0, 3]
>>> [lst.pop(i) for i in reversed(p_list)][::-1]
['a', 'd']
>>> lst
['b', 'c', 'e', 'f']
The important part here is that inside of the list comprehension you should always call lst.pop() on later indices first, so this will only work if p_list is guaranteed to be in ascending order. If that is not the case, use the following instead:
[lst.pop(i) for i in sorted(p_list, reverse=True)]
Note that this method makes it more complicated to get the popped items in the correct order from p_list, if that is important.
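If the popped items are needed in the order given by p_list, one option is to read the values first and delete afterwards, from the highest index down. A minimal sketch, assuming the indices refer to positions in the original list (pop_indices is a hypothetical helper name):

```python
def pop_indices(lst, indices):
    """Remove the items at `indices` from `lst` in place.

    Returns the removed items in the order given by `indices`.
    """
    popped = [lst[i] for i in indices]            # read values before anything shifts
    for i in sorted(set(indices), reverse=True):  # delete from the back to the front
        del lst[i]
    return popped


lst = ['a', 'b', 'c', 'd', 'e', 'f']
p_list = [3, 0]
removed = pop_indices(lst, p_list)
print(removed)  # ['d', 'a']
print(lst)      # ['b', 'c', 'e', 'f']
```

Deleting from the back means earlier deletions never shift the positions that still have to be removed.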
Your method of modifying the list may be error-prone; why not use numpy to select only the elements that you want? That way the original data stays in place (in case you need it later) and it's a snap to build a new selection. Starting from your definitions of lst and p_list:
import numpy as np
lst = np.array(lst)
idx = np.ones(lst.shape, dtype=bool)  # boolean mask, True means keep
idx[p_list] = False                   # drop the positions listed in p_list
print(lst[idx])
Gives ['b' 'c' 'e' 'f'] as expected.
