How to find part of series in some series - python

The question is simple.
Suppose we have a Series with these values:
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
How can I find the position (index) of the subseries 1.0, 2.0, 3.0?

Using a rolling window, we can find the first occurrence of a list a. The applied function puts a 'marker' (here 0; any non-NaN value will do) at the end (right border) of each window that matches. We then use first_valid_index to find the index of the first marker and correct this value by the window size to get the start of the match:
a = [1.0, 2.0, 3.0]
srs.rolling(len(a)).apply(lambda x: 0 if (x == a).all() else np.nan).first_valid_index()-len(a)+1
Output:
2

The simplest solution might be to use a list comprehension:
a = srs.tolist() # [7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0]
b = [1.0, 2.0, 3.0]
[x for x in range(len(a)) if a[x:x+len(b)] == b]
# [2]
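If this needs to be reused, the same comparison can be wrapped in a small helper. This is only a sketch: the name find_sublist is made up for illustration, and it assumes that exact equality is good enough for the float values involved.

```python
def find_sublist(seq, pattern):
    """Return every start position where pattern occurs in seq (hypothetical helper)."""
    seq, pattern = list(seq), list(pattern)
    n = len(pattern)
    if n == 0:
        return []  # assumption: an empty pattern matches nowhere
    # Stop at len(seq) - n so that every slice has the full pattern length.
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern]


a = [7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0]  # same values as srs.tolist()
print(find_sublist(a, [1.0, 2.0, 3.0]))  # [2]
print(find_sublist(a, [2.0]))            # [1, 3]
```

Unlike the rolling-window version, this returns all matches rather than only the first one.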

One naive way is to iterate over the series, take each slice of n elements, and check whether it equals the given list.
Here is the code:
srs = pd.Series([7.0, 2.0, 1.0, 2.0, 3.0, 5.0, 4.0])
sub_list = [1.0, 2.0, 3.0]
n = len(sub_list)
index_matching = []
for i in range(srs.shape[0] - n + 1):
    sub_srs = srs.iloc[i: i+n]
    if (sub_srs == sub_list).all():
        index_matching.append(sub_srs.index)
print(index_matching)
# [RangeIndex(start=2, stop=5, step=1)]
Or in one line with a list comprehension:
out = [srs.iloc[i:i+n].index for i in range(srs.shape[0] - n + 1) if (srs.iloc[i: i+n] == sub_list).all()]
print(out)
# [RangeIndex(start=2, stop=5, step=1)]
If you want an explicit list:
real_values = [[i for i in idx] for idx in out]
print(real_values)
# [[2, 3, 4]]

Related

Python How to Decompress a dictionary

I have a dictionary with:
inds = [0, 3, 7, 3, 3, 5, 1]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
d = {'inds': inds, 'vals': vals}
print(d) will get me: {'inds': [0, 3, 7, 3, 3, 5, 1], 'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]}
As you can see, inds (the keys) are not ordered, there are dupes, and some are missing: the range is 0 to 7, but only 0, 1, 3, 5, 7 appear as distinct integers. I want to write a function that takes the dictionary (d) and decompresses it into a full vector as shown below. For any repeated index (3 in this case), I'd like to sum the corresponding values, and for the missing indices, I want 0.0.
# ind: 0 1 2 3* 4 5 6 7
x == [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
I'm trying to write a function that returns a final list, something like this:
def decompressor (d, n=None):
    final_list=[]
    for i in final_list:
        final_list.append()
    return(final_list)
# final_list.index: 0 1 2 3* 4 5 6 7
# final_list = [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
Try this:
xyz = [0.0 for x in range(max(inds)+1)]
for i in range(len(inds)):  # visit every (index, value) pair
    if xyz[inds[i]] != 0.0:
        xyz[inds[i]] += vals[i]
    else:
        xyz[inds[i]] = vals[i]
Some things are still not clear to me, but supposing the length of the result is determined by the maximum index in your inds list, and you want a list as the result, you can do something like this:
inds = [0, 3, 7, 3, 3, 5, 1]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
# Initialize a list of zeroes with length max index + 1
res = [float(0)] * (max(inds) + 1)
# [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# Loop over indexes and values in pairs
for i, v in zip(inds, vals):
    # Add the value to the corresponding index
    res[i] += v
print(res)
# [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
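The same zip-based loop can be wrapped in the function shape the question asked for. This is a sketch, not a definitive answer: the question never says what n means, so it is assumed here to be the desired output length, defaulting to max index + 1.

```python
def decompress(d, n=None):
    """Expand {'inds': [...], 'vals': [...]} into a dense list, summing duplicates."""
    inds, vals = d['inds'], d['vals']
    needed = max(inds) + 1 if inds else 0
    # Assumption: n is the requested output length (not stated in the question).
    length = needed if n is None else n
    if length < needed:
        raise ValueError("n must be at least max(inds) + 1")
    result = [0.0] * length
    for i, v in zip(inds, vals):
        result[i] += v  # repeated indices accumulate
    return result


d = {'inds': [0, 3, 7, 3, 3, 5, 1],
     'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]}
print(decompress(d))
# [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
```

Passing a larger n, for example decompress(d, n=10), simply pads the result with extra 0.0 entries.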
inds = [0, 3, 7, 3, 3, 5, 1]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
First you have to initialise the dictionary, with keys ranging from the min to the max value in the inds list:
max_id = max(inds)
min_id = min(inds)
my_dict = {}
i = min_id
while i <= max_id:
    my_dict[i] = 0.0
    i = i+1
# Then add each value to the entry for its index
for i in range(len(inds)):
    my_dict[inds[i]] += vals[i]
# my_dict == {0: 1.0, 1: 7.0, 2: 0.0, 3: 11.0, 4: 0.0, 5: 6.0, 6: 0.0, 7: 3.0}
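As a variation on the dictionary approach, collections.defaultdict can stand in for the explicit initialisation loop. This is a sketch under the same assumption that the result should span min(inds) to max(inds); the names totals and full are made up for illustration.

```python
from collections import defaultdict

inds = [0, 3, 7, 3, 3, 5, 1]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

totals = defaultdict(float)  # unseen keys start at 0.0
for i, v in zip(inds, vals):
    totals[i] += v

# .get() avoids inserting new keys while reading the missing indices.
full = [totals.get(i, 0.0) for i in range(min(inds), max(inds) + 1)]
print(full)
# [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
```

Here totals only stores the indices that actually occur, and the zeros are filled in when the dense list is built.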

Creating List From Dictionary that Specifies Position by Index and Skips Some Index Positions

I am working with the following dictionary:
d = {'inds':[0, 3, 7, 3, 3, 5, 1], 'vals':[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]}
I want to create a new_list that takes the values in the list d['vals'] and places them in new_list at the corresponding index from the list d['inds']. The final result should be:
[1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
This takes the following:
d['inds'] == [0, 3, 7, 3, 3, 5, 1]
d['vals'] == [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
For any index position not included in d['inds'] the corresponding value is 0.0.
For index positions that are repeated, the true value for that position is the sum of the individual values. For example, above 3 is repeated 3 times, so new_list[3] should equal 11.0, which is the sum 2.0 + 4.0 + 5.0.
First, allocate a list of the appropriate length and full of zeroes:
result = [0] * (max(d['inds']) + 1)
Then loop over the indices and values and add them to the values in the list:
for ind, value in zip(d['inds'], d['vals']):
    result[ind] += value
Output:
>>> result
[1.0, 7.0, 0, 11.0, 0, 6.0, 0, 3.0]
After collaborating with a co-worker, who helped walk me through this, we arrived at the following more dynamic function (it allows for different lengths of the resulting list):
import numpy as np
d = {
    'inds': [0, 3, 7, 3, 3, 5, 1],
    'vals': list(range(1, 8))}
## this assumes the values in the list associated with the 'vals' key
## remain in numerical order due to range function.
def newlist(dictionary, length):  ## length must be at least max(dictionary['inds'])+1
    out = np.zeros(length)
    for i in range(len(dictionary['inds'])):
        out[dictionary['inds'][i]] += dictionary['vals'][i]
    return out

how to conditionally replace values in list of lists in python

I am a Python newbie.
I have two lists of lists like this:
A = List[List[float,float]]
B = List[List[float,float]]
The two lists have different sizes.
Ex :
A has [time, value]
B has [start_time, end_time]
A = [[0.0,10.0] , [1.0,10.0], [2.0,10.0], [3.0,10.0], [4.0,10.0], [5.0,10.0], [6.0,10.0]]
B = [[0.0,2.0], [5.0,6.0]]
What I am trying to do is :
if A has a time which is not in B, I should make the corresponding 'value' in A to zero.
So output would be :
[[0.0,10.0] , [1.0,10.0], [2.0,10.0], [3.0,0.0], [4.0,0.0], [5.0,10.0], [6.0,10.0]]
That is, if a time in A does not fall within any [start_time, end_time] segment in B, the value for that time should be set to zero. In this case, the times between 2 and 5 are not covered by any segment in B, so the values paired with 3 and 4 in A are set to zero.
Please tell me how to do it.
I have referred here: Python - How to change values in a list of lists? and List of lists into numpy array
One idea I had was to convert B into a single flat list and then compare the values in AA against A. However, I haven't made much progress.
AA = numpy.hstack(B) # for getting array of times
for i in 1: len(AA):
if (AA[i]==A[
For simple problems, a nested loop design can often get you to a quick solution without needing to worry about the specifics of list-flattening functions.
for i in range(len(A)):
    time = A[i][0]
    isValidTime = False
    for time_segment in B:
        if time_segment[0] <= time <= time_segment[1]:
            isValidTime = True
            break
    if not isValidTime:
        A[i][1] = 0.0
Edit: Just to be clear, including the 'break' statement isn't necessary to get to the solution, but it helps avoid unnecessary computation. If we've determined that an item in A does have a valid time, we can safely stop searching through the time segments in B and move on to the next item of A.
If you flatten B, then you can compare it:
A = [[0.0, 10.0], [1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0], [5.0, 10.0], [6.0, 10.0]]
B = [[0.0, 2.0], [5.0, 6.0]]
Bvals = [item for sublist in B for item in sublist]
print(Bvals)
newA = [x if x[0] in Bvals else [x[0], 0.0] for x in A]
print(newA)
Outputs:
[0.0, 2.0, 5.0, 6.0]
[[0.0, 10.0], [1.0, 0.0], [2.0, 10.0], [3.0, 0.0], [4.0, 0.0], [5.0, 10.0], [6.0, 10.0]]
I suppose the B list contains time intervals, right? In that case, you can do something like the following:
updated = [x if any(start <= x[0] <= end for start, end in B) else [x[0], 0] for x in A]
That may be a bit too compact for some people, but it essentially does the same as the following:
updated = []
for time, value in A:
    for start, end in B:
        if start <= time <= end:
            updated.append([time, value])
            break
    else:
        # no interval in B contains this time
        updated.append([time, 0])
On a side note, if you are doing interval checking, this is probably not the most efficient way to do it. Take a look at interval trees, a data structure for performing various interval-related queries (yours included).
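For a rough idea of how the lookup can be made faster without a full interval tree, here is a sketch that uses binary search from the standard bisect module instead. It is not an interval tree: it assumes the intervals in B do not overlap, and the function name zero_outside_intervals is made up for illustration.

```python
from bisect import bisect_right


def zero_outside_intervals(points, intervals):
    """Return a copy of points with values zeroed where the time is in no interval."""
    # Assumption: intervals do not overlap; sorting orders them by start time.
    intervals = sorted(intervals)
    starts = [start for start, _ in intervals]
    result = []
    for time, value in points:
        # Index of the last interval whose start is <= time (or -1 if none).
        pos = bisect_right(starts, time) - 1
        inside = pos >= 0 and time <= intervals[pos][1]
        result.append([time, value if inside else 0.0])
    return result


A = [[0.0, 10.0], [1.0, 10.0], [2.0, 10.0], [3.0, 10.0],
     [4.0, 10.0], [5.0, 10.0], [6.0, 10.0]]
B = [[0.0, 2.0], [5.0, 6.0]]
print(zero_outside_intervals(A, B))
# [[0.0, 10.0], [1.0, 10.0], [2.0, 10.0], [3.0, 0.0], [4.0, 0.0], [5.0, 10.0], [6.0, 10.0]]
```

Each lookup takes logarithmic time in the number of intervals, whereas the nested loop checks every interval in the worst case.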
A Pythonic one-liner:
[[t,v if any(s <= t and t <= e for s,e in B) else 0.0] for t,v in A]
which gives:
[[0.0, 10.0], [1.0, 10.0], [2.0, 10.0], [3.0, 0.0], [4.0, 0.0], [5.0, 10.0], [6.0, 10.0]]

Split list after repeating elements

I have this loop for creating a list of coefficients:
for i in N:
    p = 0
    for k in range(i+1):
        p += (x**k)/factorial(k)
        c.append(p)
For example N = [2, 3, 4] would give list c:
[1.0, 2.0, 2.5, 1.0, 2.0, 2.5, 2.6666666666666665, 1.0, 2.0, 2.5, 2.6666666666666665, 2.708333333333333]
I want a way of making separate lists after each 1.0 element. For example a nested list:
[[1.0, 2.0, 2.5], [1.0, 2.0, 2.5, 2.6666666666666665], [1.0, 2.0, 2.5, 2.6666666666666665, 2.708333333333333]]
I was thinking of using an if test, like
for c_ in c:
    if c_ == 1.0:
        anotherList.append(c_)
This only appends the 1.0's, though, and I don't know how to make it append everything after a 1.0 instead of just the 1.0 itself.
You can use itertools.groupby in a list comprehension:
>>> [[1.0]+list(g) for k,g in itertools.groupby(c,lambda x:x==1.0) if not k]
[[1.0, 2.0, 2.5], [1.0, 2.0, 2.5, 2.6666666666666665], [1.0, 2.0, 2.5, 2.6666666666666665, 2.708333333333333]]
Try something like
another_list = []
for c_ in c:
    if c_ == 1.0:
        another_list.append([])
    another_list[-1].append(c_)
Thanks for the suggestion, @James Jenkinson
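The loop above can be turned into a reusable function. This is a sketch only: the name split_at_marker is made up, and it adds one assumption that the original loop does not make, namely that values appearing before the first marker get a chunk of their own instead of raising an IndexError.

```python
def split_at_marker(values, marker=1.0):
    """Start a new sublist at every occurrence of marker (hypothetical helper)."""
    chunks = []
    for value in values:
        # Open a new chunk at each marker, and also for any leading values.
        if value == marker or not chunks:
            chunks.append([])
        chunks[-1].append(value)
    return chunks


c = [1.0, 2.0, 2.5,
     1.0, 2.0, 2.5, 2.6666666666666665,
     1.0, 2.0, 2.5, 2.6666666666666665, 2.708333333333333]
print(split_at_marker(c))
# [[1.0, 2.0, 2.5], [1.0, 2.0, 2.5, 2.6666666666666665], [1.0, 2.0, 2.5, 2.6666666666666665, 2.708333333333333]]
```

Comparing floats with == works here because the first partial sum is exactly 1.0; for computed markers a tolerance check may be safer.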

Python: how to add first value in each list

A Posn is a list of length two [x,y], where
x and y are both Float values, corresponding to
the x and y coordinates of the point, respectively.
# make_posn: float float -> Posn
def make_posn(x_coord, y_coord):
    return [x_coord, y_coord]
How do I add all the x-values in a list of Posns?
Ex: [ [3.0, 4.0], [8.0, -1.0], [0.0, 2.0]] would be 11
Sum them:
In [2]: sum(x[0] for x in [ [3.0, 4.0], [8.0, -1.0], [0.0, 2.0]])
Out[2]: 11.0
The following piece of code should work for you:
_sum = 0.0
for sublist in [ [3.0, 4.0], [8.0, -1.0], [0.0, 2.0]]:
    _sum += sublist[0]
It initializes a sum accumulator to zero and then iterates over the sublists, adding the first element of each one to the running sum.
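Either version can be packaged as a function that works on any list of Posns. A minimal sketch, assuming every Posn comes from make_posn and so has its x-coordinate at position 0; the name sum_x is made up for illustration.

```python
def make_posn(x_coord, y_coord):
    # make_posn: float float -> Posn (as defined in the question)
    return [x_coord, y_coord]


def sum_x(posns):
    """Add up the x-coordinates (first elements) of a list of Posns."""
    return sum(posn[0] for posn in posns)


points = [make_posn(3.0, 4.0), make_posn(8.0, -1.0), make_posn(0.0, 2.0)]
print(sum_x(points))  # 11.0
```

For an empty list, sum_x returns the integer 0, which is the default start value of the built-in sum.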
