loop over column and average elements at fixed interval


I am completely stuck. In my original dataframe I have one column of interest (fluorescence), and I want to take a fixed number of elements (3) at a fixed interval (every 5 rows) and average them. The output should be saved into NewList.
fluorescence = df.iloc[1:20, 0]
fluorescence = pd.to_numeric(fluorescence)
## add a list to count
fluorescence['time'] = list(range(1, 20, 1))
## create a list with interval
interval = list(range(1, 20, 5))
NewList = []
for i in range(len(fluorescence)):
    if fluorescence['time'][i] == interval[i]:
        NewList.append(fluorescence[fluorescence.tail(3).mean()])
print(NewList)
Any input is welcome!!
Thank you in advance

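Here I take each block of 5 consecutive rows of the dataframe and compute the mean of its last 3 rows:
import pandas as pd
fluorescence = pd.DataFrame([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
NewList = []
j = 0  # first row label of the current block
for i1 in range(4, len(fluorescence), 5):  # i1 is the last row label of each block
    # .loc slicing is label-based and includes both endpoints
    NewList.append(fluorescence.loc[j:i1, 0].tail(3).mean())
    j = i1 + 1  # the next block starts right after this one
print(NewList)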

If you have a list of data and you want to grab 3 entries out of every 5 you can segment your list as follows:
from statistics import mean
data = [63, 64, 43, 91, 44, 84, 14, 43, 87, 53, 81, 98, 34, 33, 60, 82, 86, 6, 81, 96, 99, 10, 76, 73, 63, 89, 70, 29, 32, 3, 98, 52, 37, 8, 2, 80, 50, 99, 71, 5, 7, 35, 56, 47, 40, 2, 8, 56, 69, 15, 76, 52, 24, 56, 89, 52, 30, 70, 68, 71, 17, 4, 39, 39, 85, 29, 18, 71, 92, 8, 1, 95, 52, 94, 71, 88, 59, 64, 100, 96, 65, 15, 89, 19, 63, 38, 50, 65, 52, 26, 46, 79, 85, 32, 12, 67, 35, 22, 54, 81]
new_data = []
for i in range(0, len(data), 5):
    every_five = data[i:i+5]
    three_out_of_five = every_five[2:5]
    new_data.append(mean(three_out_of_five))
print(new_data)
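The same idea can be wrapped in a reusable function. This is only a minimal sketch in plain Python: the name block_tail_means and its parameters block and keep are illustrative and do not come from the answers above. It assumes that a trailing partial block should still be averaged over whatever remains of its last `keep` entries.

```python
from statistics import mean

def block_tail_means(values, block=5, keep=3):
    """Average the last `keep` items of each consecutive `block`-sized chunk."""
    values = list(values)
    means = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        # chunk[-keep:] is the whole chunk when it holds fewer than `keep` items
        means.append(mean(chunk[-keep:]))
    return means

# Rows 1..15: averages of (3, 4, 5), (8, 9, 10) and (13, 14, 15)
result = block_tail_means(range(1, 16))
print(result)
```

With a pandas column, something like block_tail_means(df.iloc[:, 0]) should behave the same way, since the function only needs an iterable of numbers.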

Related

This is about the euler 11th python

nums = [8, 2, 22, 97, 38, 15, 00, 40, 00, 75, 4, 5, 7, 78, 52, 12, 50, 77, 91, 8,
49, 49, 99, 40, 17, 81, 18, 57, 60, 87, 17, 40, 98, 43, 69, 48, 4, 56, 62, 00,
81, 49, 31, 73, 55, 79, 14, 29, 93, 71, 40, 67, 53, 88, 30, 3, 49, 13, 36, 65,
52, 70, 95, 23, 4, 60, 11, 42, 69, 24, 68, 56, 1, 32, 56, 71, 37, 2, 36, 91,
22, 31, 16, 71, 51, 67, 63, 89, 41, 92, 36, 54, 22, 40, 40, 28, 66, 33, 13, 80,
24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50,
32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70,
67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21,
24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72,
21, 36, 23, 9, 75, 00, 76, 44, 20, 45, 35, 14, 00, 61, 33, 97, 34, 31, 33, 95,
78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92,
16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 00, 17, 54, 24, 36, 29, 85, 57,
86, 56, 00, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58,
19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40,
4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66,
88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69,
4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36,
20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16,
20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54,
1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48]
def row(n): #finds the row of the numbers
    number_row = n//20
    return number_row
def hor_mult(n):
    hor_final = 1
    num = 1
    for i in range(4):
        if n < 17+20*row(n): #finds if the number is 4 digit away from the end of the row
            num *= nums[n+i]
            if num > hor_final: #if the number is higher than final prints number
                print(num)
                hor_final = num
            else:
                hor_final = hor_final #else the final num stays the same
        else:
            return hor_final #
for n in range(400):
    print(hor_mult(n))
I am trying to find the largest product of 4 adjacent numbers, but my code prints the product of every group of 4 adjacent numbers.
The first part of the code (def row) finds the row of the 4 numbers, because all four numbers must be in the same row.
In the second part (hor_mult) I tried to find the largest product of 4 adjacent numbers in nums.

There are these issues:
if n < 17+20*row(n): is a condition that does not depend on the loop, so it should not appear inside the loop. It is also a quite complex way to say that the column index should be less than 17, so why not write a function col instead of row? You can use the % operator for that.
The check if num > hor_final should not be made while the product of four isn't complete yet, as there might still be a 0 to include, which would make the product less than what it currently is. So this check should not be in the loop, but appear after it. Moreover, you want to compare the products of multiple calls of hor_mult, so this check shouldn't be inside that function, nor should hor_final be a local name inside that function. The result is only final when the loop in the main code (over 400) has finished, so hor_final should be defined there.
print(num): the function shouldn't print anything: it cannot know by itself whether the maximum was achieved as that depends on other calls of hor_mult. Printing is not a job for this function. This function's job should be just to return a product of four. It is for the caller to decide whether it is great enough and to print. That printing can only happen when all products have been calculated -- not before.
else: return hor_final: no, you shouldn't return a partial product, not even 1. As indicated earlier, the corresponding if condition should be outside the loop, and its else case should return 0 (the least possible product when input is non-negative), not hor_final.
In the main program loop, there are 400 calls of print. It should be clear that this is wrong. You want to execute print only once. The loop should serve to find out which returned value is the greatest. That's the purpose of the loop. After the loop you should print, and only then.
Here is how the code could be fixed:
def col(n):
    return n % 20
def hor_mult(n):
    if col(n) < 17: # Only bother looping when there is room for 4 values
        num = 1
        for value in nums[n: n+4]: # pythonic way to get those 4 values
            num *= value
        return num
    else:
        return 0 # When not enough values to make the product of 4.
# pythonic way to make those 400 calls and get the maximum
hor_final = max(map(hor_mult, range(400)))
print(hor_final) # only print when you have full information
Note that the Project Euler challenge asks for more than just this. You also need to check the products in other directions, which will be more challenging. To be really honest with you, seeing the problems in your attempt, I think Project Euler challenges are going to get too difficult at this stage, and I would advise you to first practice on simpler challenges.
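As a rough illustration of how the other directions might be handled, here is a sketch that scans a flat, row-major grid in four directions. The function name max_line_product, its parameters, and the tiny 3x3 grid are all assumptions made for this example; it is not the answerer's code and it is not run on the 20x20 grid above.

```python
from math import prod  # available from Python 3.8

def max_line_product(flat, width, run=4):
    """Largest product of `run` adjacent cells in a row-major grid,
    checking right, down, and both downward diagonals."""
    height = len(flat) // width
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    best = 0  # least possible product for non-negative input
    for r in range(height):
        for c in range(width):
            for dr, dc in directions:
                end_r = r + dr * (run - 1)
                end_c = c + dc * (run - 1)
                # Skip lines that would leave the grid
                if 0 <= end_r < height and 0 <= end_c < width:
                    cells = [flat[(r + dr * k) * width + (c + dc * k)]
                             for k in range(run)]
                    best = max(best, prod(cells))
    return best

small = [1, 2, 3,
         4, 5, 6,
         7, 8, 9]
print(max_line_product(small, width=3, run=2))  # 72, from 8 * 9
```

Only four directions are needed because a product read left or up equals the same product read right or down.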

Power Sets too slow python

I am using the following function to find the subsets of a list L. However, when converting the output of the function powerset into a list it takes way too long. Any suggestion?
For clarification, this powerset function does not output the empty subset and the subset L itself (it is intentional).
My list L:
L = [0, 3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]
The code:
def powerset(s):
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x)-1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]
my_Subsets = list(powerset(L)) # <--- THIS TAKES WAY TOO LONG

Your list has 55 elements, which means 2^55 = 36028797018963968 subsets.
No language or algorithm can make that fast. Each subset needs at least one allocation, and repeating even that single operation 2^55 times takes far too long. For example, at one allocation per nanosecond (in reality it is orders of magnitude slower) we are looking at a bit over a year. In Python, probably 100 years. :P
Not to mention that the final result is unlikely to fit in the entire world's data storage (RAM + hard drives) currently available, and definitely not in a single machine's storage. So the final list(...) conversion will fail with 100% probability, even if you wait those years.
Whatever you are trying to achieve (this is likely an XY problem), you are doing it the wrong way.
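The arithmetic behind that estimate can be checked quickly. The one-nanosecond cost per subset is the optimistic assumption stated above, not a measurement.

```python
n = 55                      # number of elements in L
subsets = 2 ** n            # every element is either in or out of a subset
ns_per_subset = 1           # assumed cost: one nanosecond per subset

seconds = subsets * ns_per_subset / 1e9
days = seconds / 86400
years = days / 365.25

print(subsets)              # 36028797018963968
print(round(days), "days")  # roughly 417 days
print(round(years, 2), "years")
```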

What you could do is create a class that will behave like a list but would only compute the items as needed and not actually store them:
class Powerset:
    def __init__(self,base):
        self.base = base
    def __len__(self):
        return 2**len(self.base)-2 # - 2 you're excluding empty and full sets
    def __getitem__(self,index):
        if isinstance(index,slice):
            return [ self.__getitem__(i) for i in range(len(self))[index] ]
        else:
            return [ss for bit,ss in enumerate(self.base) if (1<<bit) & (index+1)]
L = [0, 3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]
P = Powerset(L)
print(len(P)) # 36028797018963966
print(P[:10]) # [[0], [3], [0, 3], [5], [0, 5], [3, 5], [0, 3, 5], [6], [0, 6], [3, 6]]
print(P[3:6]) # [[5], [0, 5], [3, 5]]
print(P[-3:]) # [[5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107], [0, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107], [3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]]
Obviously, if the next thing you do is a sequential search or traversal of the powerset, it will still take forever.
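If only the first few subsets are needed, the generator from the question can also be consumed lazily instead of being converted to a list. A minimal sketch, assuming a stand-in 55-element list and using itertools.islice:

```python
from itertools import islice

def powerset(s):
    """Generator from the question: skips the empty set and the full set."""
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x) - 1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]

items = list(range(55))  # stand-in for L, same length

# Only five subsets are ever built; the remaining ones are never computed.
first_five = list(islice(powerset(items), 5))
print(first_five)  # [[0], [1], [0, 1], [2], [0, 2]]
```

This returns immediately because range and the generator are both lazy, but any full traversal still hits the 2^55 wall.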

Split integer into equal chunks

What is the most efficient and reliable way in Python to split sectors up like this:
number: 101 (may vary of course)
chunk1: 1 to 30
chunk2: 31 to 61
chunk3: 62 to 92
chunk4: 93 to 101
Flow:
copy sectors 1 to 30
skip sectors in chunk 1 and copy 30 sectors starting from sector 31.
and so on...
I have this solved in a "manual" way using modulus and basic math, but there's got to be a function for this?
Thank you.

I assume your numbers are in a list. If you know exactly where the sequence should be split, slicing by index is the best approach, since each chunk is addressed directly rather than searched for. You can wrap this in a small function to reuse it. Something like this:
import numpy as np

def sectors(num_seq, chunk_size=30):
    n_sectors = int(np.ceil(len(num_seq) / float(chunk_size)))  # number of sectors
    for i in range(n_sectors):
        if i < (n_sectors - 1):
            # All chunks have equal size except the last one.
            print(list(num_seq[(chunk_size * i):(chunk_size * (i + 1))]))
        else:
            print(list(num_seq[(chunk_size * i):]))  # Takes the rest at the end.
You can now reuse this whenever you need the same kind of split. It is efficient because each chunk is addressed by index rather than found by searching the list.
Here is the output:
x = range(1, 101)
sectors(x)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
[31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
[61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
Please let me know if this meets your requirement.

Easy and fast (single iteration):
>>> input = list(range(1, 102))
>>> n = 30
>>> output = [input[i:i+n] for i in range(0, len(input), n)]
>>> output
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101]]
Another very simple way, as a reusable lambda built on the same list comprehension:
>>> f = lambda x,y: [ x[i:i+y] for i in range(0,len(x),y)]
>>> f(list(range(1, 102)), 30)
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101]]
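If only the sector boundaries are needed (as in the copy flow described in the question), the chunks do not have to be materialized at all. A minimal sketch; the name sector_ranges is illustrative, and it assumes 1-based, inclusive sector numbers:

```python
def sector_ranges(total, chunk_size):
    """Yield (first, last) sector numbers, 1-based and inclusive."""
    for start in range(1, total + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total)

ranges = list(sector_ranges(101, 30))
print(ranges)  # [(1, 30), (31, 60), (61, 90), (91, 101)]
```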

You can try numpy.histogram if you're looking to split a number into equal-sized bins (sectors).
It returns a tuple of (counts, bin_edges); the bin_edges array demarcates each bin boundary:
import numpy as np
number = 101
values = np.arange(number, dtype=int)
counts, bin_edges = np.histogram(values, bins='auto')
print(bin_edges)

Set community index as a vertex attribute with IGraph Python

When I detect communities on a graph with Igraph in Python, I get a result like this:
print(g.community_multilevel(return_levels=False))
Clustering with 100 elements and 4 clusters
[0] 16, 17, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 39, 40, 44
[1] 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 38, 92, 94, 96,
97, 98, 99
[2] 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 66, 67, 69
[3] 21, 41, 65, 68, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
84, 85, 86, 87, 88, 89, 90, 91, 93, 95
I'm adding the corresponding community number as an attribute to each vertex like this:
for v in g.vs():
    c = 0
    for i in g.community_multilevel(return_levels=False):
        if v.index in i:
            print(v.index, i, c)
            v["group"] = c
        c += 1
Is there a more elegant way to achieve this?

What you are doing is terribly inefficient because you are running the community detection algorithm for every single iteration of the outer loop even though its result should be the same no matter how many times you run it. A much simpler way to do it would be:
cl = g.community_multilevel(return_levels=False)
g.vs["group"] = cl.membership
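To see what a membership vector holds, here is a plain-Python illustration that does not use igraph. The clusters list is made-up data, and the loop only mimics the shape of the result; it is not how igraph computes it.

```python
# Hypothetical clustering of 5 vertices into 2 communities
clusters = [[2, 3], [0, 1, 4]]
n_vertices = 5

# membership[v] is the index of the community that contains vertex v
membership = [None] * n_vertices
for community, members in enumerate(clusters):
    for vertex in members:
        membership[vertex] = community

print(membership)  # [1, 1, 0, 0, 1]
```

Assigning such a list to a vertex attribute in one step sets the value for every vertex at once, which is why no explicit loop over vertices is needed.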

Percent list slicing

I'm using python 3.2.3 IDLE and this is my code:
originalList = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
newList = originalList[0.05:0.95] #<<<<I have no idea what I'm doing here
print (newList)
I have an original list of the numbers 1 to 100, and I want to make a new list that contains only the data in the 5% to 95% sub-range of the original list,
so the new list must look like [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, ..., 95]. How do I do that? I know my newList code is wrong.

originalList.sort()
newList = originalList[int(len(originalList) * .05) : int(len(originalList) * .95)]

sl = slice(4, 95)
print(originalList[sl])
Also see http://docs.python.org/2/library/functions.html#slice

size = len(originalList)
newList = originalList[int(0.05*size) - 1:int(0.95*size)]  # slice indices must be integers; the stop index is excluded

If you want to get part of a list, the syntax is
List = [1,2,3,4,5,6,7,8,9,10]
newList = List[start:stop]
The first number is the index where the sub-list starts, and the second number is the index at which the sub-list stops (that index is not included).
hope this helps!
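Putting the slice rule together with the percentages from the question, one possible helper looks like this. The name percent_slice and the rounding choice are assumptions for illustration; other answers on this page round differently and so include slightly different endpoints.

```python
def percent_slice(seq, lower=0.05, upper=0.95):
    """Return the part of seq between two fractions of its length.

    The element at the lower fraction is included (hence the -1), and the
    stop index is excluded as in any Python slice.
    """
    n = len(seq)
    start = max(round(n * lower) - 1, 0)  # never go below index 0
    stop = round(n * upper)
    return seq[start:stop]

original = list(range(1, 101))
trimmed = percent_slice(original)
print(trimmed[0], trimmed[-1], len(trimmed))  # 5 95 91
```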

I'd also use range to create the original list; it is less error-prone.
originalList = list(range(1, 101))
newList = originalList[int(len(originalList) * .05) - 1:int(len(originalList) * .95)]
print(newList)
Gives the desired result...
Edit: Changed range to be more concise per comment below.

For lists of arbitrary length, you could do:
>>> l = range(200)
>>> percentage = 5
>>> skip = int(len(l) * (float(percentage) / 100) / 2)
>>> len(l[skip:-skip])
190

You could use the fidx module, which allows percentages as indexes:
import fidx
originalList = fidx([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100])
# or better: originalList = fidx.list(range(1,101))
newList = originalList[0.05:0.95]
print (newList)
which returns
[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95]
