I have a time-series A holding several values. I need to obtain a series B that is defined algebraically as follows:
B[t] = a * A[t] + b * B[t-1]
where we can assume B[0] = 0, and a and b are real numbers.
Is there any way to do this type of recursive computation in Pandas? Or do I have no choice but to loop in Python as suggested in this answer?
As an example of input:
> A = pd.Series(np.random.randn(10,))
0 -0.310354
1 -0.739515
2 -0.065390
3 0.214966
4 -0.605490
5 1.293448
6 -3.068725
7 -0.208818
8 0.930881
9 1.669210
As I noted in a comment, you can use scipy.signal.lfilter. In this case (assuming A is a one-dimensional numpy array), all you need is:
B = lfilter([a], [1.0, -b], A)
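For readers wondering where the two coefficient lists come from: lfilter evaluates the standard linear difference equation shown below, as described in the SciPy documentation. The symbols n_k (numerator) and d_k (denominator) are names chosen here only to avoid clashing with the question's a and b; x stands for the input A and y for the output B. The filter's default initial state is zero, which matches the assumption that B is 0 before the first sample.

```latex
\begin{aligned}
d_0\,y[t] &= n_0\,x[t] + n_1\,x[t-1] + \dots - d_1\,y[t-1] - d_2\,y[t-2] - \dots \\
\text{with } (n_0) = (a),\ (d_0, d_1) = (1, -b):\quad
y[t] &= a\,x[t] + b\,y[t-1]
\end{aligned}
```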
Here's a complete script:
import numpy as np
from scipy.signal import lfilter
np.random.seed(123)
A = np.random.randn(10)
a = 2.0
b = 3.0
# Compute the recursion using lfilter.
# [a] and [1, -b] are the coefficients of the numerator and
# denominator, resp., of the filter's transfer function.
B = lfilter([a], [1, -b], A)
print(B)
# Compare to a simple loop.
B2 = np.empty(len(A))
for k in range(len(B2)):
    if k == 0:
        B2[k] = a*A[k]
    else:
        B2[k] = a*A[k] + b*B2[k-1]
print(B2)
print("max difference:", np.max(np.abs(B2 - B)))
The output of the script is:
[ -2.17126121e+00 -4.51909273e+00 -1.29913212e+01 -4.19865530e+01
-1.27116859e+02 -3.78047705e+02 -1.13899647e+03 -3.41784725e+03
-1.02510099e+04 -3.07547631e+04]
[ -2.17126121e+00 -4.51909273e+00 -1.29913212e+01 -4.19865530e+01
-1.27116859e+02 -3.78047705e+02 -1.13899647e+03 -3.41784725e+03
-1.02510099e+04 -3.07547631e+04]
max difference: 0.0
Another example, in IPython, using a pandas DataFrame instead of a numpy array:
If you have
In [12]: df = pd.DataFrame([1, 7, 9, 5], columns=['A'])
In [13]: df
Out[13]:
A
0 1
1 7
2 9
3 5
and you want to create a new column, B, such that B[k] = A[k] + 2*B[k-1] (with B[k] == 0 for k < 0), you can write
In [14]: df['B'] = lfilter([1], [1, -2], df['A'].astype(float))
In [15]: df
Out[15]:
A B
0 1 1
1 7 9
2 9 27
3 5 59
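If SciPy is not available, or you just want an independent cross-check of the lfilter result, the same recursion can be sketched with the standard library alone. This is a minimal sketch, not a replacement for lfilter: the helper name recursive_filter is hypothetical, and it assumes Python 3.8+ for the initial argument of itertools.accumulate.

```python
from itertools import accumulate


def recursive_filter(values, a, b):
    """Return B with B[t] = a*A[t] + b*B[t-1], taking the value before B[0] as 0."""
    # accumulate() carries the previous output along as `prev`.
    running = accumulate(values, lambda prev, x: a * x + b * prev, initial=0.0)
    next(running)  # drop the initial 0.0, which is not part of B
    return list(running)


# Same numbers as the DataFrame example: B[k] = A[k] + 2*B[k-1]
print(recursive_filter([1, 7, 9, 5], a=1, b=2))  # [1.0, 9.0, 27.0, 59.0]
```

This is still a Python-level loop under the hood, so for long series lfilter should remain the faster option.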
Working with Pandas, I have to rewrite queries implemented as a dict:
query = {"height": 175}
The key is the attribute for the query and the value could be a scalar or iterable.
In the first part I check if the value is not NaN and scalar.
If this condition holds, I write the query expression with the == operator; otherwise, if the value is an Iterable, I need to write the expression with the in keyword.
This is the actual code, which I need to fix so that it also works with Iterables.
import numpy as np
from collections import Iterable

def query_dict_to_expr(query: dict) -> str:
    expr = " and ".join(["{} == {}"
                         .format(k, v) for k, v in query.items()
                         if (not np.isnan(v)
                             and np.isscalar(v))
                         else "{} in #v".format(k) if isinstance(v, Iterable)
                         ]
                        )
    return expr
but I get an "invalid syntax" error at the else clause.
If I understand correctly, you don't need to check the type:
In [47]: query
Out[47]: {'height': 175, 'lst_col': [1, 2, 3]}
In [48]: ' and '.join(['{} == {}'.format(k,v) for k,v in query.items()])
Out[48]: 'height == 175 and lst_col == [1, 2, 3]'
Demo:
In [53]: df = pd.DataFrame(np.random.randint(5, size=(5,3)), columns=list('abc'))
In [54]: df
Out[54]:
a b c
0 0 0 3
1 4 2 4
2 2 2 3
3 0 1 0
4 0 4 1
In [55]: query = {"a": 0, 'b':[0,4]}
In [56]: q = ' and '.join(['{} == {}'.format(k,v) for k,v in query.items()])
In [57]: q
Out[57]: 'a == 0 and b == [0, 4]'
In [58]: df.query(q)
Out[58]:
a b c
0 0 0 3
4 0 4 1
You misplaced the if/else in the comprehension. If you put the if after the for, as in f(x) for x in iterable if g(x), it filters the elements of the iterable (and cannot be combined with an else). You want to keep all the elements instead, i.e. use f(x) for x in iterable, where f(x) just happens to be a ternary expression of the form a(x) if c(x) else b(x).
Try it like this (simplified non-numpy example):
>>> query = {"foo": 42, "bar": [1,2,3]}
>>> " and ".join(["{} == {}".format(k, v)
...               if not isinstance(v, list)
...               else "{} in {}".format(k, v)
...               for k, v in query.items()])
'foo == 42 and bar in [1, 2, 3]'
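Putting this back into the shape of the original function, a hedged sketch could look like the following. It is written as a plain loop rather than a comprehension for readability. Two choices here are assumptions rather than something stated in the question: NaN values are skipped entirely (mirroring the asker's isnan check), and strings are treated as scalars even though they are technically iterable.

```python
import math
from collections.abc import Iterable


def query_dict_to_expr(query):
    """Build a DataFrame.query-style expression string from a dict."""
    parts = []
    for key, value in query.items():
        if isinstance(value, Iterable) and not isinstance(value, str):
            # Iterable (list, tuple, set, ...): membership test
            parts.append("{} in {}".format(key, list(value)))
        elif isinstance(value, float) and math.isnan(value):
            # Assumption: NaN means "no constraint on this key"
            continue
        else:
            # Scalar: equality test; !r keeps quotes around string values
            parts.append("{} == {!r}".format(key, value))
    return " and ".join(parts)


print(query_dict_to_expr({"height": 175, "lst_col": [1, 2, 3]}))
# height == 175 and lst_col in [1, 2, 3]
```

Note that collections.abc is the import location for Iterable in current Python versions.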
x = [1, 3, 2, 5, 7]
Starting from the first value of a list:
if the next value is greater, it prints "\nthe value x is greater than y"
if the next value is equal, it prints "\nthe value x is equal to y"
if the next value is smaller, it prints "\nthe value x is smaller than y"
How do I translate this into the exact Python code? I'm actually working with a pandas data frame, I just simplified it by using a list as an example.
x = [1, 3, 2, 5, 7]
With the given above, the output should be like this:
the value 3 is greater than 1
the value 2 is smaller than 3
the value 5 is greater than 2
the value 7 is greater than 5
Directly generate the output using str.join and a list comprehension zipping the list with a shifted version of itself for comparing inside the comprehension:
x = [1, 3, 2, 5, 7]
output = "\n".join(["the value {} is {} than {}".format(b,"greater" if b > a else "smaller",a) for a,b in zip(x,x[1:])])
print(output)
(note that "greater than" or "smaller than" isn't strict and applies to equal values even if it's confusing, so maybe a third alternative could be created to handle those cases as Benedict suggested, if the case can happen)
result:
the value 3 is greater than 1
the value 2 is smaller than 3
the value 5 is greater than 2
the value 7 is greater than 5
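If equal neighbours can occur, the same zip-with-shifted-list idea extends to three cases. The following is only a sketch; the function name describe_steps is hypothetical, and the wording of the messages follows the question.

```python
def describe_steps(values):
    """Compare each element with its predecessor and describe the relation."""
    lines = []
    # zip(values, values[1:]) pairs every element with the one after it
    for prev, cur in zip(values, values[1:]):
        if cur > prev:
            relation = "greater than"
        elif cur < prev:
            relation = "smaller than"
        else:
            relation = "equal to"
        lines.append("the value {} is {} {}".format(cur, relation, prev))
    return "\n".join(lines)


print(describe_steps([1, 3, 2, 5, 5, 7]))
```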
You can adjust the line breaks with these variants:
"".join(["the value {} is {} than {}\n" ...
or
"".join(["\nthe value {} is {} than {}" ...
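Python 3 one-liner (print is called as a function here, so this does not run under Python 2 without from __future__ import print_function):
[print(str(x[i+1]) + " is greater than " + str(x[i])) if x[i+1] > x[i] else print(str(x[i+1]) + " is smaller than " + str(x[i])) for i in range(len(x)-1)]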
Another Python 2 one-liner. This one handles equal items.
x = [1, 3, 2, 5, 5, 7]
print '\n'.join('the value %s is %s %s'%(u,['equal to','greater than','less than'][cmp(u,v)],v)for u,v in zip(x[1:],x))
output
the value 3 is greater than 1
the value 2 is less than 3
the value 5 is greater than 2
the value 5 is equal to 5
the value 7 is greater than 5
To run it under Python 3, call print as a function and define cmp as:
cmp = lambda x,y : 0 if x==y else -1 if x < y else 1
One could just use a for-loop and ternary operators, as follows:
x = [1, 3, 2, 5, 7]
for i in range(len(x)-1):
    comparison = "greater than" if x[i+1] > x[i] else ("equal to" if x[i+1] == x[i] else "less than")
    print("The value {0} is {1} {2}.".format(x[i+1], comparison, x[i]))
def cmp(item1, item2):
    if item2 == item1:
        return "{} is equal to {}".format(item2, item1)
    elif item2 >= item1:
        return "{} is greater than {}".format(item2, item1)
    elif item2 <= item1:
        return "{} is less than {}".format(item2, item1)
    else:
        return "Invalid item(s)."

x = [1, 3, 2, 5, 7]
for i in range(len(x)-1):
    print(cmp(x[i], x[i+1]))
x = [1, 4, 5, 3, 4]
for i in range(0, len(x) - 1):
    out = "is equal to"
    if x[i] < x[i + 1]:
        out = "is greater than"
    elif x[i] > x[i + 1]:
        out = "is less than"
    print("%s %s %s" % (x[i + 1], out, x[i]))
Do you want an explanation also?
Edit:
Oops, and it would output:
4 is greater than 1
5 is greater than 4
3 is less than 5
4 is greater than 3
Using recursion:
def foo(n, remaining):
    # Stop when there is nothing left to compare against.
    if not remaining:
        return
    if n < remaining[0]:
        print('the value {} is greater than {}'.format(remaining[0], n))
    else:
        print('the value {} is smaller than {}'.format(remaining[0], n))
    foo(remaining[0], remaining[1:])

def the_driver(num_list):
    foo(num_list[0], num_list[1:])

if __name__ == '__main__':
    x = [1, 3, 2, 5, 7]
    the_driver(x)
Example using lambda function
x = [1, 3, 2, 5, 7]
greater = lambda a, b: a > b
old_i = x[0]
for i in x[1:]:
    # Compare each value with the previous one, then remember it.
    print(i, "is",
          "greater" if greater(i, old_i) else "smaller", "than", old_i)
    old_i = i
Output
3 is greater than 1
2 is smaller than 3
5 is greater than 2
7 is greater than 5
Let's say I have several numbers and I want to keep reducing each of them by some given value. Each number must stop reducing at a predetermined value. My code is below.
a = b = c = 100
x = y = 1
print(a, b, x, y)
s = 1
while s:
    if a >= 11:
        a -= x
    if b >= 2:
        b -= y
    if c >= 21:
        c -= y
    print(a, b, c)
    if a == 10 and b == 1 and c == 20:
        s = 0
Can this be done in a more efficient way?
Why not use Python lists? That way you can define any number of values, each with its own decrease and its own stop limit. Adding a number is as simple as adding a value to each list, rather than having to type extra code:
numbers = [10, 100, 1000]
decrease = [1, 10, 100]
stop = [5, 50, 500]
b = True  # becomes False once no number changed in a full pass
while b:
    b = False
    for c, n in enumerate(numbers):
        if n <= stop[c]:
            continue
        numbers[c] = n - decrease[c]
        b = True
print(numbers)  # [5, 50, 500]
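The list-based loop above can also be wrapped in a function that records every intermediate state. This is a sketch under two assumptions that go beyond the original answer: each decrease is strictly positive (otherwise the loop would never end), and a step that would go below the stop value is clamped to it. The name reduce_to_limits is hypothetical.

```python
def reduce_to_limits(numbers, decrease, stop):
    """Step every number down by its decrease until it reaches its stop value.

    Returns the list of states, starting with the initial one.
    Assumes every entry of `decrease` is > 0.
    """
    current = list(numbers)
    history = [tuple(current)]
    while any(n > s for n, s in zip(current, stop)):
        # max() clamps a step that would overshoot the stop value
        current = [max(n - d, s) if n > s else n
                   for n, d, s in zip(current, decrease, stop)]
        history.append(tuple(current))
    return history


states = reduce_to_limits([10, 100, 1000], [1, 10, 100], [5, 50, 500])
print(states[-1])  # (5, 50, 500)
```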
I've got the following function which checks to see if any of the strings in b is present in a. This works fine.
a = "a b c d c"
b = ["a", "c", "e"]
if any(x in a for x in b):
    print(True)
else:
    print(False)
I would like to modify it to tell me how many of the strings in b were found in a, which in this case is 2 (a and c). Although c is found twice, it shouldn't make a difference.
How can I do this?
Just change any to sum:
print(sum(x in a for x in b)) # prints 2
Here's how it is working:
>>> [x in a for x in b]
[True, True, False]
>>> t = [x in a for x in b]
>>> sum(t) # sum() is summing the True values here
2
This can be done with sum(map(lambda x: 1 if x in a else 0, b)) or sum([1 if x in a else 0 for x in b])
This will do what you want:
def anycount(it):
    # Count the truthy items of the iterable.
    return len([e for e in it if e])

a = "a b c d c"
b = ["a", "c", "e"]
print(anycount(x in a for x in b))
2