I am trying to find the index at which each run of positive values starts. My code only gives me the positions of all the positive values. It looks like this:
index = []
for i, x in enumerate(lst):
    if x > 0:
        index.append(i)
print(index)
For [-1.1, 2.0, 3.0, 4.0, 5.0, -2.0, -3.0, -4.0, 5.5, 6.6, 7.7, 8.8, 9.9], I expect the output to be [1, 8].
A list comprehension would be a more compact way to write your loop:
index = [i for i, x in enumerate(lst) if x > 0]
Currently you are selecting every index where the number is positive. Instead, you want to collect an index only where a positive value begins a new run, i.e. where the values switch from negative to positive.
Additionally, you can handle lists that contain only negative numbers, or that start with a positive number:
def get_pos_indexes(lst):
    index = []
    # Iterate over the list using indexes
    for i in range(len(lst)):
        # A run starts at a positive value that is either the first element
        # or directly follows a non-positive value
        if lst[i] > 0 and (i == 0 or lst[i-1] <= 0):
            index.append(i)
    # If the index list is empty, no positive value was encountered, hence return [-1]
    if len(index) == 0:
        index = [-1]
    return index
print(get_pos_indexes([-1.1, 2.0, 3.0, 4.0, 5.0, -2.0, -3.0, -4.0, 5.5, 6.6, 7.7, 8.8, 9.9]))
print(get_pos_indexes([2.0, 3.0, 4.0, 5.0, -2.0, -3.0, -4.0, 5.5, 6.6, 7.7, 8.8, 9.9]))
print(get_pos_indexes([2.0,1.0,4.0,5.0]))
print(get_pos_indexes([-2.0,-1.0,-4.0,-5.0]))
The output will be
[1, 8]
[0, 7]
[0]
[-1]
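If the length of each positive run is also useful, the same idea can be expressed with itertools.groupby. This is only a sketch: the function name positive_runs and the (start, length) return format are assumptions made for illustration, not part of the answer above, and zero is treated as non-positive.

```python
from itertools import groupby

def positive_runs(values):
    """Return (start_index, length) for each run of consecutive positive values."""
    runs = []
    position = 0
    # groupby yields one group per stretch of consecutive values with the same key
    for is_positive, group in groupby(values, key=lambda x: x > 0):
        length = sum(1 for _ in group)
        if is_positive:
            runs.append((position, length))
        position += length
    return runs

data = [-1.1, 2.0, 3.0, 4.0, 5.0, -2.0, -3.0, -4.0, 5.5, 6.6, 7.7, 8.8, 9.9]
print(positive_runs(data))  # [(1, 4), (8, 5)]
```

Taking the first element of each pair gives the start indices the question asks for, and an input without positive values yields an empty list rather than [-1].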
The question is simple.
Suppose we have a Series with these values:
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
How can I find the position (index) of the subseries 1.0, 2.0, 3.0?
Using a rolling window, we can find the first occurrence of a list a. The applied function puts a marker (e.g. 0; any non-NaN value will do) at the end (right border) of each window that matches. We then use first_valid_index to find the index of the first marker and correct it by the window size to get the start of the match:
import numpy as np
a = [1.0, 2.0, 3.0]
srs.rolling(len(a)).apply(lambda x: 0 if (x == a).all() else np.nan).first_valid_index() - len(a) + 1
Output:
2
The simplest solution might be to use list comprehension:
a = srs.tolist() # [7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0]
b = [1.0, 2.0, 3.0]
[x for x in range(len(a)) if a[x:x+len(b)] == b]
# [2]
A naive approach is to iterate over the series, take each slice of n elements, and compare it with the given list:
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub_list = [1.0, 2.0, 3.0]
n = len(sub_list)
index_matching = []
for i in range(srs.shape[0] - n + 1):
    sub_srs = srs.iloc[i: i+n]
    if (sub_srs == sub_list).all():
        index_matching.append(sub_srs.index)
print(index_matching)
# [RangeIndex(start=2, stop=5, step=1)]
Or in one line with list comprehension:
out = [srs.iloc[i:i+n].index for i in range(srs.shape[0] - n + 1) if (srs.iloc[i: i+n] == sub_list).all()]
print(out)
# [RangeIndex(start=2, stop=5, step=1)]
If you want an explicit list:
real_values = [[i for i in idx] for idx in out]
print(real_values)
# [[2, 3, 4]]
I am working with the following dictionary:
d = {'inds':[0, 3, 7, 3, 3, 5, 1], 'vals':[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]}
I want to create a new_list that takes the values in d['vals'] and places each one at the position given by the corresponding entry in d['inds']. The final result should be:
[1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
This is built from the following:
d['inds'] == [0, 3, 7, 3, 3, 5, 1]
d['vals'] == [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
For any index position not included in d['inds'], the corresponding value is 0.0.
For index positions that are repeated, the value for that position is the sum of the individual values. For example, 3 appears three times above, so new_list[3] should be 11.0, which is the sum 2.0 + 4.0 + 5.0.
First, allocate a list of the appropriate length, filled with zeros (0.0 rather than 0, so that untouched positions are floats as in the expected result):
result = [0.0] * (max(d['inds']) + 1)
Then loop over the indices and values and add each value to the corresponding position in the list:
for ind, value in zip(d['inds'], d['vals']):
    result[ind] += value
Output:
>>> result
[1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
After working through this with a co-worker, we arrived at the following more dynamic function (it allows for different lengths of the resulting list):
import numpy as np
d = {
    'inds': [0, 3, 7, 3, 3, 5, 1],
    'vals': list(range(1, 8))}
## this assumes the values in the list associated with the 'vals' key
## remain in numerical order due to range function.
def newlist(dictionary, length):  ## length must be at least max(dictionary['inds'])+1
    out = np.zeros(length)
    for i in range(len(dictionary['inds'])):
        out[dictionary['inds'][i]] += dictionary['vals'][i]
    return out
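For larger inputs, the explicit loop can probably be replaced by NumPy's unbuffered np.add.at, which accumulates repeated indices correctly (a plain out[inds] += vals would apply only one update per repeated index). The following is a minimal sketch; the function name scatter_add and its length parameter are assumptions for illustration.

```python
import numpy as np

def scatter_add(inds, vals, length=None):
    """Sum vals into an array of zeros at the positions given by inds."""
    inds = np.asarray(inds, dtype=int)
    vals = np.asarray(vals, dtype=float)
    if length is None:
        # Default to the shortest array that can hold every index
        length = int(inds.max()) + 1 if inds.size else 0
    out = np.zeros(length)
    # np.add.at performs unbuffered in-place addition, so repeated indices accumulate
    np.add.at(out, inds, vals)
    return out

d = {'inds': [0, 3, 7, 3, 3, 5, 1], 'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]}
print(scatter_add(d['inds'], d['vals']).tolist())
# [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
```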
I'm trying to add up certain elements of two related lists. I will give an example so you can see what I mean. At the end I include the code I have; it works, but I want to optimize it, because otherwise I have to write a lot of things by hand. Apologies if the question is not interesting.
list1 = [4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0]
list2 = [2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1]
list1 corresponds to time, so what I want to do is group everything into bins of 10 [units of time]. From list1 I can see that the first and second elements belong to the range 0-10, so I need to add up the corresponding elements of list2. The third and fourth elements of list1 belong to the range (10 < time <= 20), so I add up those same positions in list2. For the third range, I need to add up the next 4 elements of list2, and so on. In the end I would like to create 2 new lists:
list3 = [10., 20., 30., 40.]
list4 = [3.9, 14.5, 20.7, 35.6]
The code I wrote is the following:
import numpy
list1 = [4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0]
list2 = [2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1]
# upper edges of the bins: 10, 20, 30, 40
list3 = numpy.arange(10., 50., 10.)
a = [[] for i in range(4)]
for i, j in enumerate(list1):
    if 0. <= j <= 10.:
        a[0].append(list2[i])
    elif 10. < j <= 20.:
        a[1].append(list2[i])
    elif 20. < j <= 30.:
        a[2].append(list2[i])
    elif 30. < j <= 40.:
        a[3].append(list2[i])
list4 = [sum(i) for i in a]
It works; however, in reality list1 is much larger (by a few orders of magnitude), and I don't want to write all the ifs (or the sublists) by hand. Any suggestions would be appreciated.
First of all, if we are talking about huge data sets, I would use numpy, pandas, or another tool designed for this. In my experience, plain Python is not well suited to working with more than about 10M elements (unless there is structure in the data that you can exploit).
With numpy, we can do this as follows:
import numpy as np
# construct lists
l1 = np.array(list1)
l2 = np.array(list2)
# determine the "groups" of the values
g = (l1-0.00001)//10
# create a boolean mask that determines where the groups change
flag = np.concatenate(([True], g[1:] != g[:-1]))
# determine the indices of the swaps
inv_idx, = flag.nonzero()
# calculate the sum per subrange
result = np.add.reduceat(l2, inv_idx)
For your sample output, this gives:
>>> result
array([ 3.9, 14.5, 20.7, 35.6])
The 0.00001 nudges a boundary value such as 20.0 down to 19.99999, so that it is assigned to group 1 instead of group 2. The advantages of this approach are that (a) it works for an arbitrary number of "groups" and (b) it makes a fixed number of passes over the list, so it scales linearly with the number of elements. Note that it relies on list1 being sorted, since reduceat sums the contiguous slices between the change points.
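If the times are not guaranteed to be sorted, one alternative is to compute a bin number per element and let np.bincount do the summation. This is a sketch under assumptions: the function name binned_sums and the width parameter are made up for illustration, bins are taken to be (0, 10], (10, 20], and so on, and a time of exactly 0 is placed in the first bin.

```python
import numpy as np

def binned_sums(times, values, width=10.0):
    """Sum values per time bin (k*width, (k+1)*width]; times need not be sorted."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # ceil puts an exact multiple such as 20.0 into the lower bin;
    # the maximum keeps a time of 0 in bin 0 instead of bin -1
    bins = np.maximum(np.ceil(times / width).astype(int) - 1, 0)
    sums = np.bincount(bins, weights=values)
    # Upper edge of each bin: width, 2*width, ...
    edges = width * np.arange(1, len(sums) + 1)
    return edges, sums

list1 = [4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0]
list2 = [2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1]
edges, sums = binned_sums(list1, list2)
print(edges.tolist())  # [10.0, 20.0, 30.0, 40.0]
```

For the sample data, sums matches [3.9, 14.5, 20.7, 35.6] up to floating-point rounding. Unlike the mask-based approach, empty bins appear in the result with a sum of 0.0.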
If you convert your lists to numpy arrays, there is an easy way to select elements of one 1D array based on a condition on another:
import numpy
list1 = numpy.array([4.0, 8.0, 14.0, 20.0, 22.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0])
list2 = numpy.array([2.1, 1.8, 9.5, 5., 5.4, 6.7, 3.3, 5.3, 8.8, 9.4, 5., 9.3, 3.1])
step = 10
r, s = range(0, 50, step), []
for i in r:
    # sum the entries of list2 whose time falls in (i, i+step]
    s.append(float(list2[(list1 > i) & (list1 <= i + step)].sum()))
print(list(r[1:]), s[:-1])
# [10, 20, 30, 40] [3.9, 14.5, 20.7, 35.6] (up to floating-point rounding)
Edit
In one line:
s = [float(list2[(list1 > i) & (list1 <= i + step)].sum()) for i in r]
Let's say I have a list of floats. How would I loop through the list and, when the first negative value occurs, split the list into two separate lists?
The initial set of values:
[0.1,
0.5,
3.2,
8.2,
0.0,
19.7,
0.0,
-0.8,
-12.0,
-8.2,
-2.5,
-6.9,
-1.3,
0.0]
Example result I am looking for:
listA = [0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0]
listB = [-0.8, -12.0, -8.2, -2.5, -6.9, -1.3, 0.0]
The key here would be that the length of the list would vary, and the position at which the first negative value occurs is never the same.
So in short: wherever the first negative value occurs, split into two separate lists.
Any ideas? Any help would be greatly appreciated.
-Cheers
First, you can use a generator expression to find the index of the first negative value:
neg = next((i for i, v in enumerate(values) if v < 0), -1)
Then, slice your list (assuming neg != -1):
listA, listB = values[:neg], values[neg:]
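To avoid special-casing neg == -1, one option is to default to len(values), so that a list without negative values simply ends up entirely in the first part. A minimal sketch, where the function name split_on_first_negative is an assumption for illustration:

```python
def split_on_first_negative(values):
    """Split values into (before, from) the first negative entry."""
    # Fall back to len(values) so that the slices below stay valid
    # when there is no negative value at all
    neg = next((i for i, v in enumerate(values) if v < 0), len(values))
    return values[:neg], values[neg:]

values = [0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0, -0.8, -12.0, -8.2, -2.5, -6.9, -1.3, 0.0]
listA, listB = split_on_first_negative(values)
print(listA)  # [0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0]
print(listB)  # [-0.8, -12.0, -8.2, -2.5, -6.9, -1.3, 0.0]
```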
The idea is simple: loop through your list and, while the numbers are non-negative, add them to the first list. At the first negative number, set saw_negative = True, and from then on append everything to the second list.
li = [0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0, -0.8, -12.0, -8.2, -2.5, -6.9, -1.3, 0.0]
first_li = []
second_li = []
saw_negative = False
for item in li:
    if item >= 0 and not saw_negative:
        first_li.append(item)
    elif item < 0 or saw_negative:
        saw_negative = True
        second_li.append(item)
print(first_li)
print(second_li)
Output:
[0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0]
[-0.8, -12.0, -8.2, -2.5, -6.9, -1.3, 0.0]
Here is another approach: append each number to the first list until a negative number appears, then assign the rest of the list to the second list and break out of the loop.
li = [0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0, -0.8, -12.0, -8.2, -2.5, -6.9,
-1.3, 0.0]
first_li = []
second_li = []
for index, item in enumerate(li):
    if item < 0:
        second_li = li[index:]
        break
    first_li.append(item)
print(first_li)
print(second_li)
Output:
[0.1, 0.5, 3.2, 8.2, 0.0, 19.7, 0.0]
[-0.8, -12.0, -8.2, -2.5, -6.9, -1.3, 0.0]
This can also be done in functional style using the groupby and chain functions from the itertools standard library module:
from itertools import groupby, chain
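def split_at_first_negative(lst):
    """Split the list at the first occurrence of a negative value.
    >>> split_at_first_negative([1, 2, 3, -1, -5, -3, 5, -6, 1])
    ([1, 2, 3], [-1, -5, -3, 5, -6, 1])
    """
    # Consecutive values are grouped by sign: True for non-negative, False for negative
    groups = [(key, list(group)) for key, group in groupby(lst, lambda x: x >= 0)]
    # The leading group goes into the first list only if it is non-negative;
    # this also covers an empty list or a list that starts with a negative value
    split = 1 if groups and groups[0][0] else 0
    first = list(chain.from_iterable(g for _, g in groups[:split]))
    second = list(chain.from_iterable(g for _, g in groups[split:]))
    return first, second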
A Posn is a list of length two [x,y], where
x and y are both Float values, corresponding to
the x and y coordinates of the point, respectively.
make_posn: float float -> Posn
def make_posn(x_coord, y_coord):
    return [x_coord, y_coord]
How do I add all the x-values in a list of Posns?
Ex: [ [3.0, 4.0], [8.0, -1.0], [0.0, 2.0]] would be 11
Sum the first element of each Posn with a generator expression:
In [2]: sum(x[0] for x in [ [3.0, 4.0], [8.0, -1.0], [0.0, 2.0]])
Out[2]: 11.0
The following piece of code should work for you:
_sum = 0.0
for sublist in [[3.0, 4.0], [8.0, -1.0], [0.0, 2.0]]:
    _sum += sublist[0]
It initializes an accumulator to zero and then iterates over the sublists, adding the first element of each one to the running sum.