I am using the following function to find the subsets of a list L. However, converting the output of powerset into a list takes far too long. Any suggestions?
For clarification, this powerset function does not output the empty subset and the subset L itself (it is intentional).
My list L:
L = [0, 3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]
The code:
def powerset(s):
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x) - 1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]
my_Subsets = list(powerset(L)) # <--- THIS TAKES WAY TOO LONG
Your set has 55 elements. Meaning 2^55=36028797018963968 subsets.
No algorithm in any language can make that fast. Each subset needs at least one allocation, and that single operation repeated 2^55 times takes an impractically long time. For example, at one allocation per nanosecond (in reality it is orders of magnitude slower), 2^55 ns is about 3.6 × 10^7 seconds, a little over a year. In Python, probably more like 100 years. :P
Not to mention that the final result is unlikely to fit in all the world's data storage (RAM + hard drives) currently available, and definitely not in a single machine's storage. So the final list(...) conversion will fail with 100% probability, even if you wait those years.
Whatever you are trying to achieve (this is likely an XY problem) you are doing it the wrong way.
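The time estimate above can be reproduced with a few lines of arithmetic. This is only a rough sketch: the rate of one subset per nanosecond is the optimistic assumption from the answer, and the constant names are illustrative.

```python
# Back-of-the-envelope cost of enumerating every subset of a 55-element list.
n = 55
subsets = 2 ** n  # 36028797018963968

# Assumption: one subset produced per nanosecond (far faster than real Python).
NS_PER_SECOND = 10 ** 9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

seconds = subsets / NS_PER_SECOND
years = seconds / SECONDS_PER_YEAR

print(subsets)
print(round(years, 2))  # a little over one year
```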
What you could do is create a class that will behave like a list but would only compute the items as needed and not actually store them:
class Powerset:
    def __init__(self, base):
        self.base = base

    def __len__(self):
        return 2**len(self.base) - 2  # - 2 because you're excluding the empty and full sets

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self.__getitem__(i) for i in range(len(self))[index]]
        else:
            return [ss for bit, ss in enumerate(self.base) if (1 << bit) & (index + 1)]
L = [0, 3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]
P = Powerset(L)
print(len(P)) # 36028797018963966
print(P[:10]) # [[0], [3], [0, 3], [5], [0, 5], [3, 5], [0, 3, 5], [6], [0, 6], [3, 6]]
print(P[3:6]) # [[5], [0, 5], [3, 5]]
print(P[-3:]) # [[5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107], [0, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107], [3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]]
Obviously, if the next thing you do is a sequential search or traversal of the powerset, it will still take forever.
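If only some of the subsets are actually needed, the generator from the question can also be consumed lazily instead of being converted to a list. A minimal sketch, assuming the first few subsets are enough for illustration; `itertools.islice` is from the standard library, and the list `L` here is stand-in data of the same length as the real list.

```python
from itertools import islice

def powerset(s):
    """Yield every subset of s except the empty set and s itself."""
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x) - 1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]

L = list(range(55))  # stand-in data, same length as the list in the question

# Take only the first five subsets; nothing beyond them is ever computed.
first_five = list(islice(powerset(L), 5))
print(first_five)  # [[0], [1], [0, 1], [2], [0, 2]]
```

This returns immediately because the generator is only advanced five times; a full traversal would still take as long as before.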
I need a list of integers that skips every ten numbers. I've tried to do it with brute force:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
This has been okay so far, but I need to go much farther and can't type in every number.
I've tried this as well, but I guess I don't understand how range works:
x = list(range(10))
y = list(range(20:29))
z = list(range(40:49))
final_list = x + y + z
Any help would be greatly appreciated.
Something like this will probably work. Just change the range to repeat however many times you need. Both limits increase by 20 on each pass: 10 for the numbers that are included and 10 for the ones that are skipped.
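my_list = []
lower = 0
high = 10
for _ in range(5):  # number of blocks to generate
    for n in range(lower, high):
        my_list.append(n)
    # move both limits past the 10 kept numbers and the 10 skipped ones
    lower = lower + 20
    high = high + 20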
y = list(range(20:29))
z = list(range(40:49))
is invalid syntax: range takes comma-separated arguments, not a colon.
The signature is range(start, stop, step), and stop is exclusive. In your case you need the numbers in [20, 30) and [40, 50).
So just use (as @jizhihaoSAMA said in a comment):
x = list(range(10))
y = list(range(20, 30))
z = list(range(40, 50))
final_list = x + y + z
print(final_list)
Result:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
x = list(range(10))  # stop is exclusive, so this gives 0 through 9
y = list(range(20, 30))
z = list(range(40, 50))
final_list = x + y + z
print(final_list)
You could also do this with a bit of math by adding the appropriate multiple of 10 to sequential numbers:
[ n + 10*(n//10) for n in range(100)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189]
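The same pattern can be wrapped in a small helper so the number of blocks is a parameter. This is a sketch only: the function name `keep_then_skip` and its parameters are hypothetical, and it assumes the kept and skipped runs always have the same width.

```python
from itertools import chain

def keep_then_skip(blocks, width=10):
    """Return `blocks` runs of `width` consecutive integers, skipping `width` after each run."""
    # Each run starts 2 * width after the previous one: width kept + width skipped.
    return list(chain.from_iterable(
        range(start, start + width)
        for start in range(0, 2 * width * blocks, 2 * width)
    ))

result = keep_then_skip(3)
print(result)  # 0..9, 20..29, 40..49
```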
I would like to create a program that outputs every even number from 0 to 100, including 0.
a = list(range(0, 101, 2))
print(a)
Output (even numbers)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100]
The main thing is to have each number in a tuple inside the list.
[(0, ), (2, ), ... , (100, )]
You can use a list comprehension as follows:
[(i,) for i in range(0, 101, 2)]
What about
a = list((i,) for i in range(0, 101, 2))
Not sure it's the most efficient way.
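One detail worth spelling out is the trailing comma: it is the comma, not the parentheses, that makes a one-element tuple. A short sketch to illustrate (the variable name `evens` is just for this example):

```python
evens = [(i,) for i in range(0, 101, 2)]

print(evens[:3])    # [(0,), (2,), (4,)]
print(len(evens))   # 51
print(type((4,)))   # <class 'tuple'>
print(type((4)))    # <class 'int'>: parentheses alone do not make a tuple
```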
I am completely stuck. In my original dataframe I have one column of interest (fluorescence), and I want to take a fixed number of elements (3, highlighted in yellow) at a fixed interval (5) and average them. The output should be saved into NewList.
fluorescence = df.iloc[1:20, 0]
fluorescence=pd.to_numeric(fluorescence)
## add a list to count
fluorescence['time']= list(range(1,20,1))
## create a list with interval
interval = list(range(1, 20, 5))
NewList=[]
for i in range(len(fluorescence)):
    if fluorescence['time'][i] == interval[i]:
        NewList.append(fluorescence[fluorescence.tail(3).mean()])
print(NewList)
Any input is welcome!!
Thank you in advance
Here, I take a subset of the dataframe for every 5 consecutive rows and compute the mean of the last 3 rows of each subset:
import pandas as pd
fluorescence=pd.DataFrame([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])
NewList=[]
j = 0
for i1 in range(4, len(fluorescence), 5):
    # .loc slicing is inclusive, so rows j..i1 form one block of 5; average its last 3
    NewList.append(fluorescence.loc[j:i1, 0].tail(3).mean())
    j = i1 + 1
print(NewList)
If you have a list of data and you want to grab 3 entries out of every 5 you can segment your list as follows:
from statistics import mean
data = [63, 64, 43, 91, 44, 84, 14, 43, 87, 53, 81, 98, 34, 33, 60, 82, 86, 6, 81, 96, 99, 10, 76, 73, 63, 89, 70, 29, 32, 3, 98, 52, 37, 8, 2, 80, 50, 99, 71, 5, 7, 35, 56, 47, 40, 2, 8, 56, 69, 15, 76, 52, 24, 56, 89, 52, 30, 70, 68, 71, 17, 4, 39, 39, 85, 29, 18, 71, 92, 8, 1, 95, 52, 94, 71, 88, 59, 64, 100, 96, 65, 15, 89, 19, 63, 38, 50, 65, 52, 26, 46, 79, 85, 32, 12, 67, 35, 22, 54, 81]
new_data = []
for i in range(0, len(data), 5):
    every_five = data[i:i+5]
    three_out_of_five = every_five[2:5]
    new_data.append(mean(three_out_of_five))
print(new_data)
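The loop above can be turned into a reusable function with the block size and the number of items to average as parameters. This is a sketch under assumptions: the name `block_tail_means` and its parameters are hypothetical, and it chooses to ignore a trailing incomplete block rather than average fewer items.

```python
from statistics import mean

def block_tail_means(values, block=5, keep=3):
    """Average the last `keep` items of each full `block`-sized chunk."""
    means = []
    # Stop before a trailing partial chunk so every mean uses exactly `keep` items.
    for start in range(0, len(values) - block + 1, block):
        chunk = values[start:start + block]
        means.append(mean(chunk[-keep:]))
    return means

sample = list(range(1, 16))        # 1..15, three full blocks of 5
print(block_tail_means(sample))    # means of (3,4,5), (8,9,10), (13,14,15)
```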
I don't know why I am getting an index error. I am quite new to Python, so I am not able to figure out what to do. I think I am initializing some wrong dimensions, but I can't work out where.
import numpy as np
import matplotlib as plt
x = np.array([45, 68, 41, 87, 61, 44, 67, 30, 54, 8, 39, 60, 37, 50, 19, 86, 42, 29, 32, 61, 25, 77, 62, 98, 47, 36, 15, 40, 9, 25, 34, 50, 61, 75, 51, 96, 20, 13, 18, 35, 43, 88, 25, 95, 68, 81, 29, 41, 45, 87,45, 68, 41, 87, 61, 44, 67, 30, 54, 8, 39, 60, 37, 50, 19, 86, 42, 29, 32, 61, 25, 77, 62, 98, 47, 36, 15, 40, 9, 25, 34, 50, 61, 75, 51, 96, 20, 13, 18, 35, 43, 88, 25, 95, 68, 81, 29, 41, 45, 87])
len_x = len(x)
mean = np.mean(x)
xup = np.zeros(shape=(1,120))
for i in range(len_x):
    xup[i] = (x[i] - mean) ** 2
xup_sum = np.sum(xup)
var = xup_sum / len_x
std_dev = var ** 0.5
z = np.zeros(shape = (1,120))
for i in range(len_x):
    z[i] = (x[i] - mean)/std_dev
print("Mean :", mean)
print("Standard_dev :",std_dev)
print("Variance : ",var)
You really should tell us where the error occurred. But I can guess:
xup = np.zeros(shape=(1,120))
for i in range(len_x):
    xup[i,:] = (x[i] - mean) ** 2  # <=====
(Similar z loop follows)
I added an implied ,:. Your xup[i] is indexing the first dimension. But that is only size 1. As created it's the 2nd dimension that is large. xup[0,i] is the right indexing.
Why is xup 2d with the (1,120) shape? Why not the same shape as x (which I assume is (120,))? xup = np.zeros(len_x).
Better yet use a proper numpy array calculation:
xup = (x-mean)**2
Note, however, that this xup has the shape (100,), the same as x (x actually holds 100 values, not 120).
You are already using np.mean(x) which operates on the whole of x. Operators like - and ** do so as well.
(Earlier I'd suggested using np.zeros_like(x), but then realized that it would create an integer array like x. Assigning float values from the calculation to that would give problems. When doing an assign and fill loop you need to pay attention to both the shape and dtype of target array.)
xup is 2 dimensional. So instead of xup[i] you would need xup[0][i]
Just fix these 2 places:
xup = np.zeros(shape=(1,120))
for i in range(len_x):
    xup[0, i] = (x[i] - mean) ** 2
And then again here:
z = np.zeros(shape = (1,120))
for i in range(len_x):
    z[0, i] = (x[i] - mean)/std_dev
This would be the file you posted above with the 2 changes:
import numpy as np
import matplotlib as plt
x = np.array([45, 68, 41, 87, 61, 44, 67, 30, 54, 8, 39, 60, 37, 50, 19, 86, 42, 29, 32, 61, 25, 77, 62, 98, 47, 36, 15, 40, 9, 25, 34, 50, 61, 75, 51, 96, 20, 13, 18, 35, 43, 88, 25, 95, 68, 81, 29, 41, 45, 87,45, 68, 41, 87, 61, 44, 67, 30, 54, 8, 39, 60, 37, 50, 19, 86, 42, 29, 32, 61, 25, 77, 62, 98, 47, 36, 15, 40, 9, 25, 34, 50, 61, 75, 51, 96, 20, 13, 18, 35, 43, 88, 25, 95, 68, 81, 29, 41, 45, 87])
len_x = len(x)
mean = np.mean(x)
xup = np.zeros(shape=(1,120))
for i in range(len_x):
    xup[0, i] = (x[i] - mean) ** 2
xup_sum = np.sum(xup)
var = xup_sum / len_x
std_dev = var ** 0.5
z = np.zeros(shape = (1,120))
for i in range(len_x):
    z[0, i] = (x[i] - mean)/std_dev
print("Mean :", mean)
print("Standard_dev :",std_dev)
print("Variance : ",var)
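The script divides the sum of squared deviations by the number of values, so it computes the population variance and standard deviation. As an independent check that does not depend on array shapes at all, the same formulas can be compared against the standard library. This is only a sketch: it uses the first ten values of x as illustrative data and plain lists instead of NumPy arrays.

```python
from statistics import fmean, pstdev, pvariance

x = [45, 68, 41, 87, 61, 44, 67, 30, 54, 8]  # short illustrative sample

mean = fmean(x)
# Population variance: mean of squared deviations, dividing by N (not N - 1).
var = sum((v - mean) ** 2 for v in x) / len(x)
std_dev = var ** 0.5
# z-scores: distance from the mean in units of standard deviation.
z = [(v - mean) / std_dev for v in x]

# The hand-rolled values should agree with the standard library.
print(abs(var - pvariance(x)) < 1e-9)
print(abs(std_dev - pstdev(x)) < 1e-9)
```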