I am trying to average the numbers in several lists of varying lengths, given in nested-list form as shown below:
mylist = [[1, 3, 7, 10], [3, 9, 9, 0], [5, 6]]
I want the result of
averaged_list = [3, 6, 8, 5]
I have tried,
averaged_list = [mean(x) for x in zip(*mylist)]
which only yields:
[3, 6]
mylist above is simplified just to demonstrate the purpose but it will be lengthier in practice.
Thank you for the help and advice!
zip stops at the shortest iterable and ignores the excess values in the longer ones. Use itertools.zip_longest instead, and take care to filter out the None fill values:
import itertools
from statistics import mean

averaged_list = [
    mean(x for x in xs if x is not None)  # ignore fill values
    for xs in itertools.zip_longest(*mylist)
]
You could do this:
from itertools import zip_longest
import numpy as np
averaged_list = [np.nanmean(x) for x in zip_longest(*mylist, fillvalue=np.nan)]
(See @michaeldel's answer for an explanation; it is the same idea.)
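The two answers above can be combined into a small standard-library sketch. The function name column_means and the private sentinel are illustrative assumptions, not part of either answer; the sentinel is used instead of None so that the padding can never be confused with real data.

```python
from itertools import zip_longest
from statistics import mean


def column_means(rows):
    """Average each column of a ragged list of lists, skipping missing cells."""
    missing = object()  # hypothetical sentinel marking padded positions
    return [
        mean(x for x in column if x is not missing)
        for column in zip_longest(*rows, fillvalue=missing)
    ]


mylist = [[1, 3, 7, 10], [3, 9, 9, 0], [5, 6]]
averaged_list = column_means(mylist)
print(averaged_list)
```

With plain zip the same comprehension would stop after two columns, because the shortest sublist has only two items.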
Related
How do I add elements of lists within a list component wise?
p=[[1,2,3],[1,0,-1]]
I have tried the following:
list(map(sum,zip(p[0],p[1])))
Will get me [2,2,2] which is what I need. But how to extend it for a variable number of lists? For example, p=[[1,2,3],[1,0,-1],[1,1,1]] should yield [3,3,3].
A solution I figured out is the following:
import pandas as pd
p=[[1,2,3],[1,0,-1],[1,1,1]]
list(pd.DataFrame(p).sum())
Is there a more "Pythonic" way to solve this problem?
Use * to unpack the lists:
a = list(map(sum, zip(*p)))
print(a)
[3, 3, 3]
The NumPy solution is similar to the pandas one:
import numpy as np
a = np.array(p).sum(axis=0).tolist()
print(a)
[3, 3, 3]
You can use * to unpack the list and sum to sum it up.
If you are uncomfortable with the map function you can do it like this:
p = [[1, 2, 3], [4, 5, 6], [-5,-7,-9]]
sum_list = [sum(elem) for elem in zip(*p)]
print(sum_list)
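To see why this works, it may help to look at the intermediate result. The sketch below only makes the transposition step visible; the variable name columns is an illustrative choice.

```python
p = [[1, 2, 3], [4, 5, 6], [-5, -7, -9]]

# zip(*p) is the same call as zip(p[0], p[1], p[2]): it pairs up the
# first items, then the second items, and so on (a transposition).
columns = list(zip(*p))
print(columns)   # [(1, 4, -5), (2, 5, -7), (3, 6, -9)]

sum_list = [sum(col) for col in columns]
print(sum_list)  # [0, 0, 0]
```

Because the sublists are unpacked with *, the same line works for any number of sublists.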
How do you generate a random list of lists? I can generate a single random list using random.sample(range(80), 20).
You can multiply that by an integer with the * operator to get a list of lists, but then every sublist contains the same numbers rather than a fresh random sample. I can find answers about creating random lists of lists with variable lengths, but nothing that produces independent random sublists that all have the same length.
Here's one way, assuming that you want several random samples of the same kind.
First, for neatness rather than necessity, partially apply random.sample using functools.partial so that, for the sake of this example, each call produces a sample of size 5 from the integers 0 through 9 inclusive.
Then simply use list comprehension to produce a collection of those samples.
>>> import random
>>> from functools import partial
>>> one_sample = partial(random.sample, range(10), 5)
>>> one_sample()
[2, 8, 0, 5, 6]
>>> sample = [one_sample() for _ in range(3)]
>>> sample
[[9, 6, 7, 2, 0], [5, 8, 9, 6, 7], [5, 0, 6, 7, 3]]
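The same idea also works without partial. The sketch below is one possible variant: the helper name random_rows and its seed parameter are assumptions added for illustration, and it also shows why multiplying a list does not give independent samples.

```python
import random


def random_rows(n_rows, row_len, population=range(80), seed=None):
    """Return n_rows independent samples, each of row_len distinct values."""
    rng = random.Random(seed)  # a private generator; a seed makes runs repeatable
    return [rng.sample(population, row_len) for _ in range(n_rows)]


rows = random_rows(3, 20, seed=0)

# Multiplying a one-element list repeats the SAME list object three times,
# so the "rows" are not independent samples.
repeated = [random.sample(range(80), 20)] * 3
print(repeated[0] is repeated[1])  # True
```

The comprehension calls sample once per row, whereas the multiplication calls it once in total.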
I have a nested list in the following form
inputlist = [[1,2,3],[4,5,6],[7,8,9],[1,2,3,4],[5,6,7,8],[1,2],[3,4]]
I would like to nest it further, based on changes in sublist length, as follows:
outputlist = [[[1,2,3],[4,5,6],[7,8,9]],[[1,2,3,4],[5,6,7,8]],[[1,2],[3,4]]]
The underlying logic is that I wish to group every change in list length into a new sublist. It is kind of difficult to explain but I hope the above two examples show what I am trying to do.
How can I achieve this simply and elegantly using python? Thanks.
>>> from itertools import groupby
>>> input_list = [[1,2,3],[4,5,6],[7,8,9],[1,2,3,4],[5,6,7,8],[1,2],[3,4]]
>>> [list(g) for k, g in groupby(input_list, key=len)]
[[[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 2, 3, 4], [5, 6, 7, 8]], [[1, 2], [3, 4]]]
Here's an approach.
Get a list of the lengths involved:
# solen: set of lengths
solen = set([len(subl) for subl in inputlist])  # portable
solen = {len(subl) for subl in inputlist}  # set comprehension, Python 2.7+
Then build the list of lists of a particular length:
#losubl: list of sublists, one for each item from solen
losubl = [[subl for subl in inputlist if len(subl) == ulen] for ulen in solen]
As jamylak points out, this solution is less efficient than the one based on itertools (more than one pass, sacrifices some order information). OTOH, it may avoid an import if you don't have other uses for itertools. If the lists you're working with are big and complicated, it's probably worth the extra import to use itertools.
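The difference between the two approaches shows up when sublists of the same length are not adjacent. The following is a sketch with hypothetical function names: groupby only merges consecutive items with equal keys, while a dict keyed by length collects all of them in one pass. It assumes Python 3.7+, where dicts preserve insertion order.

```python
from itertools import groupby


def group_runs_by_length(lists):
    """Group consecutive sublists of equal length (groupby behaviour)."""
    return [list(group) for _, group in groupby(lists, key=len)]


def group_all_by_length(lists):
    """Group every sublist of equal length, in order of first appearance."""
    groups = {}
    for sub in lists:
        groups.setdefault(len(sub), []).append(sub)
    return list(groups.values())


mixed = [[1, 2], [1, 2, 3], [3, 4]]
print(group_runs_by_length(mixed))  # [[[1, 2]], [[1, 2, 3]], [[3, 4]]]
print(group_all_by_length(mixed))   # [[[1, 2], [3, 4]], [[1, 2, 3]]]
```

For the input in the question, where equal lengths are already adjacent, both functions return the same result.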
I have a list containing lists, each representing a company, with EBITDA values inside. Now I want to take the log of the returns and create a new list that has all the log returns using:
ln(ebitda t / ebitda t-1)
The desired result would be a list inside a list that keeps each company's results together like so:
[[0.69314718055994529, 0.40546510810816438, 0.28768207245178085], [0.18232155679395459, 0.15415067982725836, 0.13353139262452257]]
but so far I am getting:
[0.69314718055994529, 0.40546510810816438, 0.28768207245178085, 0.18232155679395459, 0.15415067982725836, 0.13353139262452257]
I found a way on SO to loop through each list and make the calculations like below:
from itertools import zip_longest
import numpy as np
l = [[1, 2, 3, 4], [5, 6, 7, 8]]
logs = []
for num in l:
    for x, y in zip_longest(num[1:], num[0:-1]):
        logs.append(np.log((x/y)))
However, for the results to be usable I need to be able to get them back into their own lists and I am not sure how to do this.
Thank you for reading and any help is appreciated.
Try this:
import numpy as np
l = [[1, 2, 3, 4], [5, 6, 7, 8]]
logs = []
for enterprise in l:
    logs.append([])  # start a new sublist for this company
    for i in range(1, len(enterprise)):
        logs[-1].append(np.log(enterprise[i]/enterprise[i-1]))
This is a cleaner version of what you have, but I don't understand the statement about getting them back into their own lists. Maybe iterating over l2 is what you want?
import numpy as np
l = [[1, 2, 3, 4], [5, 6, 7, 8]]
def f(num):
    # log return between each value and the one before it
    return [np.log(num[i+1]/float(num[i])) for i in range(len(num)-1)]

l2 = [f(num) for num in l]
for logren in l2:
    print(logren)
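For comparison, here is a sketch of the same calculation using only the standard library. The function name log_returns is an illustrative assumption; it pairs each value with its predecessor via zip and keeps one result list per company.

```python
import math


def log_returns(series):
    """Return ln(value_t / value_{t-1}) for each consecutive pair."""
    return [math.log(curr / prev) for prev, curr in zip(series, series[1:])]


l = [[1, 2, 3, 4], [5, 6, 7, 8]]
nested = [log_returns(company) for company in l]
for company_returns in nested:
    print(company_returns)
```

Because the outer comprehension builds one list per company, the results stay grouped instead of being flattened into a single list.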
I have a list of lists named 'run'. I am creating an average of those lists using this section of my code:
ave = [0 for t in range(s)]
for t in range(s):
    z = 0
    for i in range(l):
        z = z + run[i][t]
    # Converted values to a string for output purposes
    # Added \n to output
    ave[t] = str(z / l) + "\n"
Much to my surprise, this code worked the first time that I wrote it. I'm now planning on working with much larger lists and many more values, and it's possible that performance issues will come into play. Is this method of writing an average inefficient in its use of computational resources, and how could I write code that was more efficient?
List comprehensions may be more efficient.
>>> run = [[1, 2, 3, 4, 5], [6, 7, 8, 9], [10, 11, 12, 13]]
>>> [sum(elem)/len(elem) for elem in zip(*run)]
[5.666666666666667, 6.666666666666667, 7.666666666666667, 8.666666666666666]
Alternatively, you could try map()
>>> list(map(lambda x: sum(x)/len(x), zip(*run)))
[5.666666666666667, 6.666666666666667, 7.666666666666667, 8.666666666666666]
You can improve efficiency by having Python do more of the work for you with efficient built-in functions and list comprehensions:
averages = [sum(items) / len(run) for items in zip(*run)]
import numpy as np
ave = [np.mean(col) for col in zip(*run)]
or
ave = [sum(col)/len(col) for col in zip(*run)]
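A standard-library variant of the same idea, sketched under the assumption that Python 3.8+ is available (for statistics.fmean). It keeps the numeric averages separate from the newline-terminated strings that the question builds for output; the name lines is illustrative.

```python
from statistics import fmean

run = [[1, 2, 3, 4, 5], [6, 7, 8, 9], [10, 11, 12, 13]]

# zip(*run) yields one tuple per position and stops at the shortest sublist,
# so the trailing 5 in the first sublist is not used.
averages = [fmean(col) for col in zip(*run)]

# Format only at output time, so the numbers stay usable for further maths.
lines = [f"{value}\n" for value in averages]
print("".join(lines), end="")
```

Keeping averages as floats avoids having to parse the strings again if the values are needed later.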
I came to this question looking for the following. It is a blunt approach, but because it does not use zip, it does not ignore any values.
If you have a numerical list of lists with different lengths and want the element-wise average list:
import numpy as np
def my_mean(list_of_lists):
    maxlen = max([len(l) for l in list_of_lists])
    # pad the shorter lists with NaN (note: this modifies the input lists)
    for i in range(len(list_of_lists)):
        while len(list_of_lists[i]) < maxlen:
            list_of_lists[i].append(np.nan)
    return np.nanmean(list_of_lists, axis=0)
aaa = [1, 2, 3]
bbb = [1, 2, 3, 5]
ccc = [4, 5, 6, 5, 10]
lofl = [aaa, bbb, ccc]
print(my_mean(lofl))
gives
[ 2. 3. 4. 5. 10.]
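One thing to be aware of is that my_mean appends NaN values to the lists passed in, so the caller's data is changed. The sketch below is a possible variant that pads copies instead; the name ragged_mean is hypothetical, it uses only the standard library, and it assumes at least one non-empty sublist.

```python
import math


def ragged_mean(list_of_lists):
    """Column-wise mean of ragged lists, leaving the inputs unmodified."""
    width = max(len(row) for row in list_of_lists)
    nan = float("nan")
    # Build padded copies rather than appending to the caller's lists.
    padded = [list(row) + [nan] * (width - len(row)) for row in list_of_lists]
    means = []
    for column in zip(*padded):
        present = [x for x in column if not math.isnan(x)]
        means.append(sum(present) / len(present))
    return means


aaa = [1, 2, 3]
bbb = [1, 2, 3, 5]
ccc = [4, 5, 6, 5, 10]
result = ragged_mean([aaa, bbb, ccc])
print(result)  # [2.0, 3.0, 4.0, 5.0, 10.0]
print(aaa)     # [1, 2, 3]
```

Since every row is padded to the same width first, plain zip is safe here and no values are dropped.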