I know this is a very simple question, but I couldn't find it. Basically what I'm trying to is have a=2, b=3, and I need a+b=b so when I add a and b what it equals needs to be b.
Either:
b = a + b
# make b equal to a + b
Or
b += a
# increase the value of b by a
declare variables
a = 2
b = 3
operates
b = a+b
You can use the += shorthand for summation on variable.
For instance, in order to add in the value of a in the b variable, and retaining the last value of b:
b += a
This is the same thing as saying: b = a + b. Both works fine. Choose as you feel is readable.
Here is a quick example:
First, initialize your variables:
a = 2
b = 3
print a, b
Output: 2 3
You can add a to b, this way:
b += a
print b
Output: 5
You can also add to the actual value of b 10
b = 10+b
print b
Output: 15
Related
This question already has answers here:
Multiple assignment and evaluation order in Python
(11 answers)
Closed 2 years ago.
Recently I was reading through the official Python documentation when I came across the example on how to code the Fibonacci series as follows:
a, b = 0, 1
while a < 10:
print (a)
a, b = b, a + b
which outputs to 0,1,1,2,3,5,8
Since I've never used multiple assignment myself, I decided to hop into Visual Studio to figure out how it worked. I noticed that if I changed the notation to...
a = 0
b = 1
while a < 10:
print (a)
a, b = b, a + b
... the output remains the same.
However, if I change the notation to...
a = 0
b = 1
while a < 10:
print(a)
a = b
b = a + b
... the output changes to 0, 1, 2, 4, 8
The way I understand multiple assignments is that it shrinks what can be done into two lines into one. But obviously, this reasoning must be flawed if I can't apply this logic to the variables under the print(a) command.
It would be much appreciated if someone could explain why this is/what is wrong with my reasoning.
a = 0
b = 1
while a < 10:
print(a)
a = b
b = a + b
In this case, a becomes b and then b becomes the changed a + b
a, b = 0, 1
while a < 10:
print (a)
a, b = b, a+b
In this case, a becomes b and at the same time b becomes the original a + b.
That's why, in your case b becomes the new a + b, which, since a = b, basically means b = b + b. That's why the value of b doubles everytime.
When you do a, b = d, e the order in which assignment happens in from right to left. That is, b is first given the value of e and then the other assignment happens. So when you do a, b = b, a + b what you are effectively writing is,
b = a + b
a = b
Hence the difference.
You can verify this by doing
a = 0
b = 1
while a < 10:
a, b = b, a + b
print(a, b)
the first output is 1 1. So first b becomes 0+1 and then a is given the value of b=a making it also 1.
If you want more details on how this works, you can check out this question.
In a multiple assignment, the right side is always computed first.
In effect,
a, b = b, a + b
is the same as:
b = a + b
a = b
Say I have this Python code
def fib2(n): # return Fibonacci series up to n
result = []
a, b = 0, 1
while b < n:
result.append(b)
a, b = b, a+b
return result
For n=1000 this prints:
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
But I don't understand why it's 1 1 2 3.
The issue is with this line:
a, b = b, a+b
What is the order of execution?
The two options I see are:
1:
a = b
b = a+b
2:
b = a+b
a = b
But neither gives me the correct result when I try it manually.
What am I missing?
None of the two options you shared actually describe the working of:
a, b = b, a+b
Above code assigns a with the value of b. And b with the older value of a+b (i.e. in a+b the older value of a). You may consider it as an equivalent of:
>>> temp_a, temp_b = a, b
>>> a = temp_b
>>> b = temp_a + temp_b
Example: Dual variable assignment in one line:
>>> a, b = 3, 5
>>> a, b = b, a+b
>>> a
5
>>> b
8
Equivalent Explicit Logic:
>>> a, b = 3, 5
>>> temp_a, temp_b = a, b
>>> a = temp_b
>>> b = temp_a + temp_b
>>> a
5
>>> b
8
The order of operations in a, b = b, a+b is that the tuple (b, a+b) is constructed, and then that tuple is assigned to the variables (a, b). In other words, the right side of the assignment is entirely evaluated before the left side.
(Actually, starting with Python 2.6, no tuple is actually constructed in cases like this with up to 3 variables - a more efficient series of bytecode operations gets substituted. But this is, by design, not a change that has any observable differences.)
It's python standard way to swap two variables, Here is a working example to clear your doubt,
Python evaluates expressions from left to right. Notice that while
evaluating an assignment, the right-hand side is evaluated before the
left-hand side.
http://docs.python.org/3/reference/expressions.html#evaluation-order
a=[1,2,3,4,5]
for i,j in enumerate(a):
if i==1:
a[i-1],a[i]=a[i],a[i-1]
print(a)
output:
[1, 2, 3, 4, 5]
For more info , read this tutorial
i have a df
a name
1 a/b/c
2 w/x/y/z
3 q/w/e/r/t
i want to split the name column on '/' to get this output
id name main sub leaf
1 a/b/c a b c
2 w/x/y/z w x z
3 q/w/e/r/t q w t
i.e first two slashes add as main and sub respectively,
and leaf should be filled with word after last slash
i tried using this, but result was incorrect
df['name'].str.split('/', expand=True).rename(columns={0:'main',1:'sub',2:'leaf'})
is there any way to assign columns
Use split with assign:
s = df['name'].str.split('/')
df = df.assign(main=s.str[0], sub=s.str[1], leaf=s.str[-1])
print (df)
a name leaf main sub
0 1 a/b/c c a b
1 2 w/x/y/z z w x
2 3 q/w/e/r/t t q w
For change order of columns:
s = df['name'].str.split('/')
df = df.assign(main=s.str[0], sub=s.str[1], leaf=s.str[-1])
df = df[df.columns[:-3].tolist() + ['main','sub','leaf']]
print (df)
a name main sub leaf
0 1 a/b/c a b c
1 2 w/x/y/z w x z
2 3 q/w/e/r/t q w t
Or:
s = df['name'].str.split('/')
df = (df.join(pd.DataFrame({'main':s.str[0], 'sub':s.str[1], 'leaf':s.str[-1]},
columns=['main','sub','leaf'])))
print (df)
a name main sub leaf
0 1 a/b/c a b c
1 2 w/x/y/z w x z
2 3 q/w/e/r/t q w t
Option 1
Using str.split, but don't expand the result. You should end up with a column of lists. Next, use df.assign, assign columns to return a new DataFrame object.
v = df['name'].str.split('/')
df.assign(
main=v.str[ 0],
sub=v.str[ 1],
leaf=v.str[-1]
)
name leaf main sub
a
1 a/b/c c a b
2 w/x/y/z z w x
3 q/w/e/r/t t q w
Details
This is what v looks like:
a
1 [a, b, c]
2 [w, x, y, z]
3 [q, w, e, r, t]
Name: name, dtype: object
This is actually a lot easier to handle, because you have greater control over elements with the .str accessor. If you expand the result, you have to snap your ragged data to a tabular format to fit into a new DataFrame object, thereby introducing Nones. At that point, indexing (finding the ith or ith-last element) becomes a chore.
Option 2
Using direct assignment (to maintain order) -
df['main'] = v.str[ 0]
df['sub' ] = v.str[ 1]
df['leaf'] = v.str[-1]
df
name main sub leaf
a
1 a/b/c a b c
2 w/x/y/z w x z
3 q/w/e/r/t q w t
Note that this modifies the original dataframe, instead of returning a new one, so it is cheaper. However, it is more intractable if you have a large number of columns.
You might instead consider this alternative which should generalise to many more columns:
for c, i in [('main', 0), ('sub', 1), ('leaf', -1)]:
df[c] = v[i]
df
name main sub leaf
a
1 a/b/c a b c
2 w/x/y/z w x z
3 q/w/e/r/t q w t
Iterate over a list of tuples. The first element in a tuple is the column name, and the second is the corresponding index to pick the result from v. You still have to assign each one separately, whether you like it or not. Using a loop would probably be a clean way of doing it.
I need to filter a data frame with a dict, constructed with the key being the column name and the value being the value that I want to filter:
filter_v = {'A':1, 'B':0, 'C':'This is right'}
# this would be the normal approach
df[(df['A'] == 1) & (df['B'] ==0)& (df['C'] == 'This is right')]
But I want to do something on the lines
for column, value in filter_v.items():
df[df[column] == value]
but this will filter the data frame several times, one value at a time, and not apply all filters at the same time. Is there a way to do it programmatically?
EDIT: an example:
df1 = pd.DataFrame({'A':[1,0,1,1, np.nan], 'B':[1,1,1,0,1], 'C':['right','right','wrong','right', 'right'],'D':[1,2,2,3,4]})
filter_v = {'A':1, 'B':0, 'C':'right'}
df1.loc[df1[filter_v.keys()].isin(filter_v.values()).all(axis=1), :]
gives
A B C D
0 1 1 right 1
1 0 1 right 2
3 1 0 right 3
but the expected result was
A B C D
3 1 0 right 3
only the last one should be selected.
IIUC, you should be able to do something like this:
>>> df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)]
A B C D
3 1 0 right 3
This works by making a Series to compare against:
>>> pd.Series(filter_v)
A 1
B 0
C right
dtype: object
Selecting the corresponding part of df1:
>>> df1[list(filter_v)]
A C B
0 1 right 1
1 0 right 1
2 1 wrong 1
3 1 right 0
4 NaN right 1
Finding where they match:
>>> df1[list(filter_v)] == pd.Series(filter_v)
A B C
0 True False True
1 False False True
2 True False False
3 True True True
4 False False True
Finding where they all match:
>>> (df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)
0 False
1 False
2 False
3 True
4 False
dtype: bool
And finally using this to index into df1:
>>> df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).all(axis=1)]
A B C D
3 1 0 right 3
Abstraction of the above for case of passing array of filter values rather than single value (analogous to pandas.core.series.Series.isin()). Using the same example:
df1 = pd.DataFrame({'A':[1,0,1,1, np.nan], 'B':[1,1,1,0,1], 'C':['right','right','wrong','right', 'right'],'D':[1,2,2,3,4]})
filter_v = {'A':[1], 'B':[1,0], 'C':['right']}
##Start with array of all True
ind = [True] * len(df1)
##Loop through filters, updating index
for col, vals in filter_v.items():
ind = ind & (df1[col].isin(vals))
##Return filtered dataframe
df1[ind]
##Returns
A B C D
0 1.0 1 right 1
3 1.0 0 right 3
Here is a way to do it:
df.loc[df[filter_v.keys()].isin(filter_v.values()).all(axis=1), :]
UPDATE:
With values being the same across columns you could then do something like this:
# Create your filtering function:
def filter_dict(df, dic):
return df[df[dic.keys()].apply(
lambda x: x.equals(pd.Series(dic.values(), index=x.index, name=x.name)), asix=1)]
# Use it on your DataFrame:
filter_dict(df1, filter_v)
Which yields:
A B C D
3 1 0 right 3
If it something that you do frequently you could go as far as to patch DataFrame for an easy access to this filter:
pd.DataFrame.filter_dict_ = filter_dict
And then use this filter like this:
df1.filter_dict_(filter_v)
Which would yield the same result.
BUT, it is not the right way to do it, clearly.
I would use DSM's approach.
For python2, that's OK in #primer's answer. But, you should be careful in Python3 because of dict_keys. For instance,
>> df.loc[df[filter_v.keys()].isin(filter_v.values()).all(axis=1), :]
>> TypeError: unhashable type: 'dict_keys'
The correct way to Python3:
df.loc[df[list(filter_v.keys())].isin(list(filter_v.values())).all(axis=1), :]
Here's another way:
filterSeries = pd.Series(np.ones(df.shape[0],dtype=bool))
for column, value in filter_v.items():
filterSeries = ((df[column] == value) & filterSeries)
This gives:
>>> df[filterSeries]
A B C D
3 1 0 right 3
To follow up on DSM's answer, you can also use any() to turn your query into an OR operation (instead of AND):
df1.loc[(df1[list(filter_v)] == pd.Series(filter_v)).any(axis=1)]
You can also create a query
query_string = ' and '.join(
[f'({key} == "{val}")' if type(val) == str else f'({key} == {val})' for key, val in filter_v.items()]
)
df1.query(query_string)
Combining previous answers, here's a function you can feed to df1.loc. Allows for AND/OR (using how='all'/'any'), plus it allows comparisons other than == using the op keyword, if desired.
import operator
def quick_mask(df, filters, how='all', op=operator.eq) -> pd.Series:
if how == 'all':
comb = pd.Series.all
elif how == 'any':
comb = pd.Series.any
return comb(op(df[[*filters]], pd.Series(filters)), axis=1)
# Usage
df1.loc[quick_mask(df1, filter_v)]
I had an issue due to my dictionary having multiple values for the same key.
I was able to change DSM's query to:
df1.loc[df1[list(filter_v)].isin(filter_v).all(axis=1), :]
Any suggestions on how to do that in Python?
if x():
a = 20
b = 10
else:
a = 10
b = 20
I can swap them as below, but it's not as clear (nor very pythonic IMO)
a = 10
b = 20
if x():
[a, b] = [b, a]
(a,b) = (20,10) if x() else (10,20)
Swapping values with a, b = b, a is considered idiomatic in Python.
a, b = 10, 20
if x(): a, b = b, a
One nice thing is about this is you do not repeat the 10 and 20, so it is a little DRY-er.