Combining tuples in list of tuples
test_list = [([1, 2, 3], 'gfg'), ([5, 4, 3], 'cs')]
How to get this output:
[(1, 'gfg'), (2, 'gfg'), (3, 'gfg'), (5, 'cs'), (4, 'cs'), (3, 'cs')]
Just to go into a bit more detail about how to do this with list comprehensions and explain what they are and how they work...
To begin with, here's a fairly long-winded way of achieving what you want:
test_list = [([1, 2, 3], 'gfg'), ([5, 4, 3], 'cs')]
result = []  # set up empty list to hold the result
for group in test_list:  # loop through each 'group' in your list
    (numbers, text) = group  # unpack into the list of numbers and the text string
    for n in numbers:  # loop through the numbers
        result.append((n, text))  # add the (number, text) tuple to the result list
print(result)
# [(1, 'gfg'), (2, 'gfg'), (3, 'gfg'), (5, 'cs'), (4, 'cs'), (3, 'cs')]
So we've achieved the result using two for loops, one nested inside the other.
But there's a really neat Python construct called a list comprehension which lets you do this kind of loop in just one line.
Using an example with just a single loop:
numbers = [1, 2, 3]
doubles = []
for n in numbers:
    doubles.append(n * 2)
print(doubles)
# [2, 4, 6]
We can re-write this as the following list comprehension:
numbers = [1, 2, 3]
doubles = [n * 2 for n in numbers]
print(doubles)
# [2, 4, 6]
A list comprehension is of the form:
result = [<expression> for item in iterable]
which is equivalent to:
result = []
for item in iterable:
    result.append(<expression>)
where <expression> is something involving item.
You can also nest list comprehensions like you can nest for loops. Going back to our original problem, we need to first change it so that we 'unpack' group into numbers and text directly when we set up the for loop:
result = []
for (numbers, text) in test_list:
    for n in numbers:
        result.append((n, text))
Now imagine dragging the for loops off to the right until we can line them all up:
result = []
for (numbers, text) in test_list:
                                  for n in numbers:
                                                    result.append((n, text))
and then put the expression (i.e. (n, text)) at the left:
result = [(n, text) for (numbers, text) in test_list for n in numbers]
List comprehensions may seem strange at first (especially if you're jumping straight into a double list comprehension!), but once you've got your head around how they work, they are really neat and can be very powerful! There are also similar set comprehensions and dictionary comprehensions. Read more here: https://dbader.org/blog/list-dict-set-comprehensions-in-python
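To give a flavour of the set and dictionary variants mentioned above, here is a minimal sketch. The `words` list and the variable names are made up for illustration; only the syntax is the point.

```python
words = ["gfg", "cs", "gfg", "python"]

# Set comprehension: braces instead of brackets, so duplicates collapse.
unique_lengths = {len(w) for w in words}

# Dict comprehension: a key: value pair on the left of the for.
length_by_word = {w: len(w) for w in words}

print(unique_lengths)   # a set containing 2, 3 and 6
print(length_by_word)   # {'gfg': 3, 'cs': 2, 'python': 6}
```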
You can use nested list comprehensions:
test_list = [([1, 2, 3], 'gfg'), ([5, 4, 3], 'cs')]
result = [(z, y) for x, y in test_list for z in x]
# z = numbers in the lists inside the tuples
# x = the lists inside the tuples
# y = the strings inside the tuples
print(result)
Output:
[(1, 'gfg'), (2, 'gfg'), (3, 'gfg'), (5, 'cs'), (4, 'cs'), (3, 'cs')]
result = [(z, y) for x, y in test_list for z in x] is the list comprehension version for:
result = []
for x, y in test_list:
    for z in x:
        result.append((z, y))
My question is similar to this previous SO question.
I have two very large lists of data (almost 20 million data points) that contain numerous consecutive duplicates. I would like to remove the consecutive duplicates as follows:
list1 = [1,1,1,1,1,1,2,3,4,4,5,1,2] # This is 20M long!
list2 = ... # another list of size len(list1), also 20M long!
i = 0
while i < len(list1)-1:
    if list1[i] == list1[i+1]:
        del list1[i]
        del list2[i]
    else:
        i = i+1
And the output should be [1, 2, 3, 4, 5, 1, 2] for the first list.
Unfortunately, this is very slow, since deleting an element from a list is itself a slow operation. Is there any way I can speed up this process? Please note that, as shown in the above code snippet, I also need to keep track of the index i so that I can remove the corresponding element in list2.
Python has this groupby in the libraries for you:
>>> list1 = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [k for k,_ in groupby(list1)]
[1, 2, 3, 4, 5, 1, 2]
You can tweak it using the keyfunc argument, to also process the second list at the same time.
>>> list1 = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> list2 = [9,9,9,8,8,8,7,7,7,6,6,6,5]
>>> from operator import itemgetter
>>> keyfunc = itemgetter(0)
>>> [next(g) for k,g in groupby(zip(list1, list2), keyfunc)]
[(1, 9), (2, 7), (3, 7), (4, 7), (5, 6), (1, 6), (2, 5)]
If you want to split those pairs back into separate sequences again:
>>> list(zip(*_))  # "unzip" them (list() is needed on Python 3, where zip is lazy)
[(1, 2, 3, 4, 5, 1, 2), (9, 7, 7, 7, 6, 6, 5)]
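Put together as a function, the session above might look like the following sketch. The name `dedupe_parallel` is made up for illustration. Like the session, it keeps the first pair of every run, whereas the loop in the question keeps the last one, so the values kept from list2 can differ.

```python
from itertools import groupby
from operator import itemgetter

def dedupe_parallel(list1, list2):
    """Drop consecutive duplicates in list1 and the matching entries in list2."""
    # Group the (list1, list2) pairs by the list1 value; keep the first pair of each run.
    pairs = [next(group) for _, group in groupby(zip(list1, list2), key=itemgetter(0))]
    if not pairs:
        return [], []
    kept1, kept2 = zip(*pairs)
    return list(kept1), list(kept2)

list1 = [1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 5, 1, 2]
list2 = [9, 9, 9, 8, 8, 8, 7, 7, 7, 6, 6, 6, 5]
new1, new2 = dedupe_parallel(list1, list2)
print(new1)  # [1, 2, 3, 4, 5, 1, 2]
print(new2)  # [9, 7, 7, 7, 6, 6, 5]
```

This builds new lists in a single pass instead of deleting in place, which avoids the repeated shifting that makes del slow on long lists.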
You can use collections.deque and its maxlen argument to set a window size of 2. Then just compare the 2 entries in the window, and append the newer one to the results if they differ.
def remove_adj_dups(x):
    """
    :parameter x is something like '1, 1, 2, 3, 3'
    from an iterable such as a string or list or a generator
    :return 1,2,3, as list
    """
    result = []
    from collections import deque
    d = deque([object()], maxlen=2)  # 1st entry is object() which only matches with itself. Kudos to Trey Hunner -->object()
    for i in x:
        d.append(i)
        a, b = d
        if a != b:
            result.append(b)
    return result
I generated a random list with duplicates of 2 million numbers between 0 and 9 (see range_len below).
def random_nums_with_dups(number_range=None, range_len=None):
    """
    :parameter
    :param number_range: use the numbers between 0 and number_range. The smaller this is then the more dups
    :param range_len: max len of the results list used in the generator
    :return: a generator
    Note: If number_range = 2, then random binary is returned
    """
    import random
    return (random.choice(range(number_range)) for i in range(range_len))
I then tested with
range_len = 2000000
def mytest():
    for i in [1]:
        return [remove_adj_dups(random_nums_with_dups(number_range=10, range_len=range_len))]
big_result = mytest()[0]
print(len(big_result))
The len was 1800197 (read dups removed), in <5 secs, which includes the random list generator spinning up.
I lack the experience/know-how to say whether it is memory efficient as well. Could someone comment, please?
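On the memory question: the deque only ever holds two items, but remove_adj_dups collects every kept value in result, so memory grows with the size of the output. A minimal sketch of a lazy variant follows, assuming the caller can consume values one at a time. The name `iter_without_adjacent_dups` is made up for illustration.

```python
def iter_without_adjacent_dups(iterable):
    """Yield items from iterable, skipping any item equal to the one before it."""
    previous = object()  # sentinel that compares unequal to every real item
    for item in iterable:
        if item != previous:
            yield item
        previous = item

print(list(iter_without_adjacent_dups([1, 1, 2, 3, 3, 1])))  # [1, 2, 3, 1]

# Counting the survivors without ever building a list of them:
count = sum(1 for _ in iter_without_adjacent_dups(iter([0, 0, 1, 1, 0])))
print(count)  # 3
```

Only the previous item is stored between steps, so memory use no longer depends on the length of the input.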
I have a list of lists with 4 elements in each of them.
LoL=[[1,1,1,1],[4,2,3,[1,3]],[4,5,3,[0,4]]]
The 4th elements can be a list of two parts like [0,4] in [4,5,3,[0,4]].
I need to use its elements as keys for a dictionary,
Pseudo code:
dic = { [1,1,1,1]:'a',[4,2,3,[1,3]]:'b',[4,5,3,[0,4]]:'c' }
so I tried to change them to tuples.
It works for simple lists (like [1,1,1,1]), but for the ones containing another list (like [4,5,3,[0,4]]) it raises an error:
dic[tuple([1,1,1,1])]=['bla','blah']
print(dic)
{(1, 1, 1, 1): ['bla', 'blah']}
dic[tuple([4, 2, 3, [1, 3]])]=['blablah']
TypeError: unhashable type: 'list'
I need to reuse the keys as lists later. So trying to change elements of LoL to strings (e.g. using repr()) is not an option!
Edit:
I know why lists cannot be used as dictionary keys. Here they are not changed while in the dic. I just need some way to pass them to another module to extract them.
Just convert your nested lists to nested tuples. Here's a quick demo. It's not perfect, but it works.
#! /usr/bin/env python
LoL = [[1,1,1,1],[4,2,3,[1,3]],[4,5,3,[0,4]]]
def nested_list_to_tuple(nlist):
    # Recurse into inner lists so that every level becomes a hashable tuple
    return tuple([nested_list_to_tuple(i) if isinstance(i, list) else i for i in nlist])
ToT = nested_list_to_tuple(LoL)
print(ToT)
output
((1, 1, 1, 1), (4, 2, 3, (1, 3)), (4, 5, 3, (0, 4)))
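Since the question asks to get the keys back as lists later, here is a sketch of the round trip. The inverse helper `nested_tuple_to_list` and the labels 'a', 'b', 'c' are assumptions for illustration and not part of the demo above. Note that the inverse turns every tuple into a list, so it is only faithful if the original data contained no tuples.

```python
def nested_list_to_tuple(nlist):
    return tuple(nested_list_to_tuple(i) if isinstance(i, list) else i for i in nlist)

def nested_tuple_to_list(ntuple):
    # Hypothetical inverse helper: turns nested tuples back into nested lists.
    return [nested_tuple_to_list(i) if isinstance(i, tuple) else i for i in ntuple]

LoL = [[1, 1, 1, 1], [4, 2, 3, [1, 3]], [4, 5, 3, [0, 4]]]

# Use the tuple form of each row as a dictionary key.
dic = {nested_list_to_tuple(row): label for row, label in zip(LoL, "abc")}
print(dic[(4, 2, 3, (1, 3))])  # b

# Later, recover the keys as lists.
keys_as_lists = [nested_tuple_to_list(key) for key in dic]
print(keys_as_lists == LoL)  # True
```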
Just use tuples:
a = {}
a[(4, 2, 3, (1, 3))] = ['blablah']
print(a)
Output:
{(4, 2, 3, (1, 3)): ['blablah']}
I am fairly new to python and am trying to figure out how to duplicate items within a list. I have tried several different things and searched for the answer extensively but always come up with an answer of how to remove duplicate items, and I feel like I am missing something that should be fairly apparent.
I want to duplicate each item in a list, so that [1, 4, 7, 10] becomes [1, 1, 4, 4, 7, 7, 10, 10]
I know that
list = list(range(5))  # range() must be turned into a list on Python 3 before insert() works
for i in range(len(list)):
    list.insert(i+i, i)
print(list)
will return [0, 0, 1, 1, 2, 2, 3, 3, 4, 4] but this does not work if the items are not in order.
To provide more context I am working with audio as a list, attempting to make the audio slower.
I am working with:
def slower():
    left = Audio.getLeft()
    right = Audio.getRight()
    for i in range(len(left)):
        left.insert(????)
        right.insert(????)
Where "left" returns a list of items that are the "sounds" in the left headphone and "right" is a list of items that are sounds in the right headphone. Any help would be appreciated. Thanks.
Here is a simple way:
def slower(audio):
    return [audio[i//2] for i in range(0,len(audio)*2)]
Something like this works:
>>> list = [1, 32, -45, 12]
>>> for i in range(len(list)):
...     list.insert(2*i+1, list[2*i])
...
>>> list
[1, 1, 32, 32, -45, -45, 12, 12]
A few notes:
Don't use list as a variable name.
It's probably cleaner to flatten the list zipped with itself.
e.g.
>>> lst = [1, -1, 32, 42]
>>> list(zip(lst, lst))
[(1, 1), (-1, -1), (32, 32), (42, 42)]
>>> [x for y in zip(lst, lst) for x in y]
[1, 1, -1, -1, 32, 32, 42, 42]
Or, you can do this whole thing lazily with itertools:
from itertools import chain
for item in chain.from_iterable(zip(lst, lst)):
    print(item)
I actually like this method best of all. When I look at the code, it is the one that I immediately know what it is doing (although others may have different opinions on that).
I suppose while I'm at it, I'll just point out that we can do the same thing as above with a generator function:
def multiply_elements(iterable, ntimes=2):
    for item in iterable:
        for _ in range(ntimes):  # xrange on Python 2
            yield item
And let's face it -- generators are just a lot of fun. :-)
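A quick usage sketch for the generator function, written for Python 3 (so range rather than xrange). The sample inputs are just illustrative.

```python
def multiply_elements(iterable, ntimes=2):
    for item in iterable:
        for _ in range(ntimes):
            yield item

# The generator is lazy, so wrap it in list() to see all the values at once.
doubled = list(multiply_elements([1, 4, 7, 10]))
print(doubled)  # [1, 1, 4, 4, 7, 7, 10, 10]

tripled = list(multiply_elements("ab", ntimes=3))
print(tripled)  # ['a', 'a', 'a', 'b', 'b', 'b']
```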
listOld = [1,4,7,10]
listNew = []
for element in listOld:
    listNew.extend([element,element])
This might not be the fastest way but it is pretty compact
import operator
from functools import reduce
a = range(5)
list(reduce(operator.add, zip(a, a)))
The result is
[0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
from functools import reduce  # reduce is not a built-in on Python 3
a = [0,1,2,3]
list(reduce(lambda x,y: x + y, zip(a,a))) #=> [0,0,1,1,2,2,3,3]
I asked some similar questions [1, 2] yesterday and got great answers, but I am not yet technically skilled enough to write a generator of such sophistication myself.
How could I write a generator that would raise StopIteration if it's the last item, instead of yielding it?
I am thinking I should somehow ask two values at a time, and see if the 2nd value is StopIteration. If it is, then instead of yielding the first value, I should raise this StopIteration. But somehow I should also remember the 2nd value that I asked if it wasn't StopIteration.
I don't know how to write it myself. Please help.
For example, if the iterable is [1, 2, 3], then the generator should return 1 and 2.
Thanks, Boda Cydo.
[1] How do I modify a generator in Python?
[2] How to determine if the value is ONE-BUT-LAST in a Python generator?
This should do the trick:
def allbutlast(iterable):
    it = iter(iterable)
    try:
        current = next(it)  # it.next() on Python 2
    except StopIteration:
        return  # empty input: yield nothing
    for i in it:
        yield current
        current = i
>>> list(allbutlast([1,2,3]))
[1, 2]
This iterates through the entire iterable but always yields the previous item, so the last item is never yielded.
Note that calling the above on both [] and [1] will return an empty list.
First off, is a generator really needed? This sounds like the perfect job for Python's slice syntax:
result = my_range[:-1]
I.e.: take a range from the first item to the one before the last.
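One caveat worth illustrating: slicing needs a sequence. A short sketch follows, with variable names chosen for illustration, showing that the slice works on a list but not on a generator, which is where the generator-based answers come in.

```python
numbers = [1, 2, 3]
print(numbers[:-1])  # [1, 2]

# A generator has no length or indexing, so it cannot be sliced.
lazy = (n for n in numbers)
try:
    lazy[:-1]
    can_slice = True
except TypeError:
    can_slice = False
print(can_slice)  # False
```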
The itertools module shows a pairwise() function in its recipes. Adapting that recipe, you can get your generator:
from itertools import tee
def n_apart(iterable, n):
    a, b = tee(iterable)
    for count in range(n):
        next(b)  # advance the second iterator n steps ahead of the first
    return zip(a, b)
def all_but_n_last(iterable, n):
    return (value for value, dummy in n_apart(iterable, n))
The n_apart() function returns pairs of values which are n elements apart in the input iterable. Because zip stops as soon as the shorter iterator runs out, no pair starts at any of the last n elements. all_but_n_last() returns the first value of each pair, which incidentally drops the last n elements of the input.
>>> data = range(10)
>>> list(data)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(n_apart(data,3))
[(0, 3), (1, 4), (2, 5), (3, 6), (4, 7), (5, 8), (6, 9)]
>>> list(all_but_n_last(data,3))
[0, 1, 2, 3, 4, 5, 6]
>>>
>>> list(all_but_n_last(data,1))
[0, 1, 2, 3, 4, 5, 6, 7, 8]
The more_itertools project has a tool that emulates itertools.islice with support for negative indices:
import more_itertools as mit
list(mit.islice_extended([1, 2, 3], None, -1))
# [1, 2]
gen = (x for x in iterable[:-1])