Meaning and implementation of loc[~*value*] [duplicate] - python

This question already has answers here:
Tilde sign in pandas DataFrame
(4 answers)
Logical operators for Boolean indexing in Pandas
(4 answers)
How can I obtain the element-wise logical NOT of a pandas Series?
(6 answers)
Closed 1 year ago.
I came across this piece of code in a book:
import hashlib
def split_train_test_by_id(data, test_ratio, id_column, hash=hashlib.md5):
    ids = data[id_column]
    # test_set_check is not shown in this excerpt
    in_test_set = ids.apply(lambda id_: test_set_check(id_, test_ratio, hash))
    return data.loc[~in_test_set], data.loc[in_test_set]
I have never seen loc[~<..>] before. I think I understand what it does, but I want to be sure. Also, does it work only in pandas or in Python in general?

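I saw some great comments above, but wanted to make sure that it's clear for a beginner. In plain Python, ~ is the bitwise NOT operator: it flips every bit of an integer, turning 1s into 0s and 0s into 1s. pandas (like NumPy) overloads it so that, applied to a Boolean Series, it returns the element-wise logical NOT. In your example, ~in_test_set is the element-wise equivalent of not in_test_set, so data.loc[~in_test_set] selects the rows that are not in the test set. The advantage of ~ is that it works on a whole set of values at once, whereas not accepts only a single truth value. Note that ~ is not a drop-in replacement for not on plain Python Booleans: ~True evaluates to -2, not False. See the Python wiki on bitwise operators.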
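As a minimal sketch of the difference, assuming pandas is installed: the DataFrame, its column names, and the even-id rule below are hypothetical stand-ins for the book's data and its test_set_check function.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
data = pd.DataFrame({"id": [1, 2, 3, 4], "value": ["a", "b", "c", "d"]})

# Hypothetical stand-in for test_set_check: even ids go to the test set.
in_test_set = data["id"] % 2 == 0

train_set = data.loc[~in_test_set]  # rows where the mask is False
test_set = data.loc[in_test_set]    # rows where the mask is True

# On plain Python integers, ~ is bitwise NOT rather than logical NOT.
plain_bitwise = ~5       # -6
plain_logical = not 5    # False
```

Here train_set holds ids 1 and 3, and test_set holds ids 2 and 4.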

How could I translate the command find of MATLAB to Python? [duplicate]

This question already has answers here:
MATLAB-style find() function in Python
(9 answers)
Is there a NumPy function to return the first index of something in an array?
(20 answers)
Replacement of matlab find function in python
(1 answer)
Converting find() in matlab to python
(3 answers)
Closed 1 year ago.
I have this iteration in a program in Matlab and want to translate it to Python, but my problem is in the parameters for 'n' and 'direction'.
for i=1:size(labels)
    idx_V=[idx_V;find(y(idxUnls(trial,:))==labels(i),l/length(labels),'first')]
end
There isn't a one-to-one swap for MATLAB's find function in Python. Taking inspiration from another answer here, I would propose the following solution:
% MATLAB
inds = find(myarray == condition, n, 'first')
# Python
import numpy as np
inds = [ind for (ind, val) in np.ndenumerate(myarray == condition) if val]
inds = inds[0:n]
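Be careful with the ordering and the indices: np.ndenumerate yields 0-based index tuples in row-major order, whereas MATLAB's find returns 1-based linear indices in column-major order. The Python expression could also be constructed as a generator, which avoids scanning the whole array when only the first n matches are needed.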
If you want a similar implementation, you'll have to write it yourself in Python.
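One possible way to write such a helper is sketched below, using np.flatnonzero instead of the np.ndenumerate approach above. The name find_first and the sample array are hypothetical; the helper returns 0-based indices, so add 1 if MATLAB-style numbering is needed.

```python
import numpy as np

def find_first(mask, n):
    """Return up to n 0-based linear indices of True entries.

    The mask is flattened in column-major order to mirror MATLAB's
    linear indexing.
    """
    flat = np.asarray(mask).ravel(order="F")
    return np.flatnonzero(flat)[:n]

y = np.array([3, 1, 3, 2, 3, 1])   # hypothetical label vector
inds = find_first(y == 3, 2)       # first two matches, 0-based
matlab_style = inds + 1            # the same matches, 1-based
```

For a 1-D array the flattening order makes no difference; it only matters for arrays with two or more dimensions.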

Operator :: in python to remove the group delay [duplicate]

This question already has answers here:
Understanding slicing
(38 answers)
Closed 2 years ago.
If a filter's group delay is N, the filtered signal using this filter is sig_filtered, what does sig_filtered[N::] mean in python?
I saw other usages of this python operator "::", e.g. A[::3] in another post (What is :: (double colon) in Python when subscripting sequences?), where A is a list. Can somebody give a summary of how to use this python operator "::"?
sig_filtered[N::] is the same as sig_filtered[N:] and the same as sig_filtered[N::1], or sig_filtered[N:len(sig_filtered):1], or sig_filtered[N:len(sig_filtered)].
There are three values which define a slice: start, stop and step, e.g. data[start:stop:step]
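You can omit start and it will default to 0.
You can omit stop and it will default to the full length.
You can omit step and it will default to 1.
These defaults apply when step is positive; with a negative step, an omitted start defaults to the last element and an omitted stop runs through the first element. For a positive step, the three values behave the same way as the arguments to the range function.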
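The equivalences above can be checked with a short sketch. The list stands in for a filtered signal and N = 3 is a hypothetical group delay; neither comes from the original question.

```python
sig_filtered = list(range(10))  # stand-in for a filtered signal
N = 3                           # hypothetical group delay in samples

trimmed = sig_filtered[N::]     # drop the first N samples

# All of these spellings select the same elements.
equivalents = [
    sig_filtered[N:],
    sig_filtered[N::1],
    sig_filtered[N:len(sig_filtered)],
    sig_filtered[N:len(sig_filtered):1],
]
all_equal = all(e == trimmed for e in equivalents)

every_third = sig_filtered[::3]     # start and stop omitted, step 3
reversed_copy = sig_filtered[::-1]  # negative step walks backwards
```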

Difference between two conditional queries on a pandas dataframe? [duplicate]

This question already has answers here:
pandas logical and operator with and without brackets produces different results [duplicate]
(2 answers)
Logical operators for Boolean indexing in Pandas
(4 answers)
Closed 3 years ago.
I was trying to find records based on two conditions on a data frame preg
First:
preg[preg.caseid==2298 & preg.pregordr==1]
This throws an error saying that the truth value of a Series is ambiguous.
Why?
Second:
But this one works!
preg[(preg.caseid==2298) & (preg.pregordr==1)]
So what exactly is the difference between the two?
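Because & binds more tightly than ==, Python reads the first expression as preg.caseid == (2298 & preg.pregordr) == 1. That is a chained comparison, which Python evaluates with an implicit and between two Series comparisons, and taking the truth value of a Series raises the error. The parentheses force each comparison to be evaluated before the &. If you want to avoid the parentheses, you can use: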
preg[preg.caseid.eq(2298) & preg.pregordr.eq(1)]
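A small sketch of all three spellings, assuming pandas is installed. The three-row preg frame is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical miniature of the preg data frame.
preg = pd.DataFrame({"caseid": [2298, 2298, 1], "pregordr": [1, 2, 1]})

# Parenthesised comparisons combined with &.
with_parens = preg[(preg.caseid == 2298) & (preg.pregordr == 1)]

# Method form, which needs no extra parentheses.
with_eq = preg[preg.caseid.eq(2298) & preg.pregordr.eq(1)]

# Without parentheses the expression becomes a chained comparison
# and pandas refuses to reduce a Series to a single truth value.
try:
    preg[preg.caseid == 2298 & preg.pregordr == 1]
    raised = False
except ValueError:
    raised = True
```

Both working forms select the single row with caseid 2298 and pregordr 1.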

Processing of Booleans in Python [duplicate]

This question already has answers here:
Does Python support short-circuiting?
(3 answers)
Closed 5 years ago.
I have a question about a logical expression of the following sort:
for i in range(k):  # k is large
    if a == b and test(c) == b:  # test() takes some time to calculate
        pass  # do something
Now I want to know how the logical expression is processed. Are the two simple expressions calculated first and then combined via and? Or is a==b calculated first and, if it is False, test(c)==b skipped?
Thanks.
a==b is evaluated first, and the second expression is evaluated only if a==b is true. If a==b is False, test(c) is never called. This is known as 'short-circuiting'; see the docs.
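Short-circuiting can be observed by recording each call. In this sketch, slow_test is a hypothetical stand-in for the asker's test() function, and the values of a, b, and c are made up.

```python
calls = []

def slow_test(c):
    """Hypothetical stand-in for test(); records every call."""
    calls.append(c)
    return c

a, b, c = 1, 2, 2

# a == b is False, so slow_test(c) is skipped entirely.
first = a == b and slow_test(c) == b
calls_after_first = len(calls)

a = 2
# a == b is now True, so slow_test(c) has to be evaluated.
second = a == b and slow_test(c) == b
calls_after_second = len(calls)
```

Putting the cheap comparison first therefore avoids the expensive call whenever the cheap one already fails.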

Python Comprehensions troubleshooting [duplicate]

This question already has answers here:
How to test multiple variables for equality against a single value?
(31 answers)
Closed 7 years ago.
I am having trouble setting up my if statement correctly.
This is my code:
def task_13():
    Main_meal=['Meat','Cheese','Fish']
    addons=['Potatoes','Rice','Salad']
    my_meal=[(x+y) for x in Main_meal for y in addons if (x+y)!= 'FishRice' and 'CheeseRice']
    print(my_meal)
My question is why Python does not filter out 'CheeseRice' when it is stated in the condition, but only filters out the 'FishRice' option.
This is my output:
['MeatPotatoes', 'MeatRice', 'MeatSalad', 'CheesePotatoes', 'CheeseRice', 'CheeseSalad', 'FishPotatoes', 'FishSalad']
Thank you for your advice.
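Here's the official reference on Python operator precedence. Note that and has lower precedence than !=, so the != is evaluated first and the condition is read as ((x+y) != 'FishRice') and 'CheeseRice'. A non-empty string such as 'CheeseRice' is always truthy, so the second operand never filters anything out and only 'FishRice' is excluded. The and operator simply combines the truth values of the expressions on either side; it does not distribute a comparison over several values.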
Instead of
if (x+y)!= 'FishRice' and 'CheeseRice'
you need:
if (x+y)!= 'FishRice' and (x+y) != 'CheeseRice'
or alternatively
if (x+y) not in ('FishRice', 'CheeseRice')
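Putting the not in variant into the full comprehension gives the sketch below. The variable names are lower-cased and the excluded tuple is introduced here for readability; neither is from the original post.

```python
main_meal = ["Meat", "Cheese", "Fish"]
addons = ["Potatoes", "Rice", "Salad"]
excluded = ("FishRice", "CheeseRice")

# Keep every combination that is not in the excluded tuple.
my_meal = [x + y for x in main_meal for y in addons if x + y not in excluded]

# Why the original condition failed: the right-hand operand of "and"
# is just a non-empty string, which is always truthy.
original_condition = ("CheeseRice" != "FishRice") and "CheeseRice"
```

With this filter, both 'CheeseRice' and 'FishRice' are dropped from my_meal.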
