Using logical operators in building a Pandas DataFrame - python

I have two snippets of pandas code which I think should be equivalent, but the second one doesn't do what I expect.
# snippet 1
data = all_data[[((np.isfinite(all_data[self.design_metric][i])
and all_data['Source'][i] == 2))
or ((np.isfinite(all_data[self.actual_metric][i])
and all_data['Source'][i] != 2))
for i in range(len(all_data))]]
# snippet 2
data = all_data[(all_data['Source'] == 2 &
np.isfinite(all_data[self.design_metric])) |
(all_data['Source'] != 2 &
np.isfinite(all_data[self.actual_metric]))]
Each section (e.g. all_data['Source'] == 2 ) does what I expect on its own but it seems that I'm doing something wrong with the logical operators as the final result is coming out with a different result to the list comprehension version.

The & operator binds more tightly than == (or any comparison operator). See the documentation. A simpler example is:
>>> 2 == 2 & 3 == 3
False
This is because it is grouped as 2 == (2 & 3) == 3, and then comparison chaining is invoked. This is what is happening in your case. You need to put parentheses around each comparison.
data = all_data[((all_data['Source'] == 2) &
np.isfinite(all_data[self.design_metric])) |
((all_data['Source'] != 2) &
np.isfinite(all_data[self.actual_metric]))]
Note the extra parentheses around the == and != comparisons.
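One way to confirm the grouping without loading any data is to ask Python's own parser. The sketch below uses the standard ast module on two expression strings; the names source and finite are placeholders standing in for the DataFrame columns, not names from the original code.

```python
import ast

def top_level(expression):
    """Return the class name of the outermost node in the parsed expression."""
    return type(ast.parse(expression, mode="eval").body).__name__

# Without parentheses the whole thing is a comparison whose right-hand
# side is the bitwise AND, i.e. source == (2 & finite).
unparenthesized = ast.parse("source == 2 & finite", mode="eval").body
right_operand = unparenthesized.comparators[0]

print(top_level("source == 2 & finite"))    # Compare
print(type(right_operand.op).__name__)      # BitAnd
print(top_level("(source == 2) & finite"))  # BinOp
```

With the parentheses in place, the outermost operation is the & that combines the two boolean masks, which is what the filter needs.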

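Along with precedence, there is a difference between the and and & operators: the first is the boolean (logical) operator, the latter is the binary bitwise operator. You should also be aware of how boolean expressions are evaluated: and and or return one of their operands rather than a strict True or False.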
See examples in the following snippet:
logical expressions
>>> 1 and 2
2
>>> '1' and '2'
'2'
>>> 0 == 1 and 2 == 0 or 0
0
bitwise operators
>>> 1 & 2
0
>>> '1' & '2'
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for &: 'str' and 'str'
>>> 0 == 1 & 2 == 0 | 0
True
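As a small illustration of the difference in evaluation: and and or hand back one of their operands and short-circuit, whereas & always evaluates both sides and computes a new value. The helper noisy below is purely illustrative.

```python
evaluated = []

def noisy(value):
    """Record that the operand was evaluated, then return it unchanged."""
    evaluated.append(value)
    return value

logical = noisy(0) and noisy(2)   # short-circuits: the right side never runs
after_logical = list(evaluated)

evaluated.clear()
bitwise = noisy(0) & noisy(2)     # both operands are always evaluated
after_bitwise = list(evaluated)

print(logical, after_logical)     # 0 [0]
print(bitwise, after_bitwise)     # 0 [0, 2]
print(1 and 2, 0 or "fallback")   # 2 fallback
```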

Related

bitwise AND in python for comparison

>>> i=5
>>> i>4 & i<5
True
>>> i>4 and i<5
False
I don't understand how bitwise AND is being applied here. The second statement makes sense: 5 is not less than 5, so it returns False. Can someone shed some light on the first statement?
I have done a bit of experimentation in the Python shell and I believe that I know what is going on here. I ran:
>>> 5>4 & 5<5
True
>>> 1 & 0
0
>>> True & False
False
>>> (5>4) & (5<5)
False
>>> (5>4 & 5)<5
True
So I believe what is happening is it is performing (5>4 & 5)<5 instead of (5>4) & (5<5)
& binds more tightly than >, which in turn binds more tightly than and:
a > b and a > c is parsed as (a > b) and (a > c)
a > b & a > c is parsed as a > (b & a) > c, which is a chained comparison
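A quick check of that parse, written as a sketch with plain integers. The case i = 0 is added here (it is not in the original posts) because it is one where the chained reading and the (i > 4 & i) < 5 reading give different answers, which suggests the expression really is treated as a chained comparison.

```python
def chained(i):
    return i > 4 & i < 5            # what Python actually evaluates

def spelled_out(i):
    masked = 4 & i                  # & is applied first
    return i > masked and masked < 5

def grouped_left(i):
    return (i > 4 & i) < 5          # alternative reading: a bool compared with 5

def intended(i):
    return (i > 4) & (i < 5)        # what the author probably meant

for i in (5, 0):
    print(i, chained(i), spelled_out(i), grouped_left(i), intended(i))
# 5 True True True False
# 0 False False True False
```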

How to make a string interpreted as a condition with Python?

I need to use the following syntax to filter the list operations:
a = [ope for ope in operations if ope[0] == 1]
The if statement condition is variable and may contain multiple conditions:
a = [ope for ope in operations if ope[0] == 1 and ope[1] == "test"]
I use a function to build the condition and return it as a string:
>>> c = makeCondition(**{"id": 1, "title": 'test'})
>>> c
"ope[0] == 1 and ope[1] == 'test'"
Is there a way to integrate the c variable into the list filtering? Something like this (of course, the c variable is evaluated as a string in the below example):
a = [ope for ope in operations if c]
Thanks for help!
eval is considered unsafe and is generally avoided.
You can use filter with functions. For this you should put your test conditions in a function.
Here's an example to create a list of numbers between 1 and 100 that are multiples of 3 and 7
def mult3(n):
    return n % 3 == 0
def mult7(n):
    return n % 7 == 0
def mult3_and_7(n):
    return mult3(n) and mult7(n)
list(filter(mult3_and_7, range(1, 101)))
A more concise way is to use lambdas:
list(filter(lambda n: (n % 3 == 0) and (n % 7 == 0), range(1, 101)))
The cool thing is you can chain filters like so:
list(filter(lambda n: n % 3 == 0, filter(lambda n: n % 7 == 0, range(1, 101))))
They all give [21, 42, 63, 84]
This approach should help you chain multiple conditions clearly.
As commented, if you want to change the string to be considered as an expression, you can use eval(string).
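If the conditions are built at runtime, one alternative to eval is to have the builder return a function instead of a string. The sketch below is one way this could look; make_condition, FIELD_INDEX and the sample operations are hypothetical, and the mapping of id to position 0 and title to position 1 is assumed from the string shown in the question.

```python
# Assumed mapping from field names to tuple positions (hypothetical).
FIELD_INDEX = {"id": 0, "title": 1}

def make_condition(**criteria):
    """Build a predicate that is true when every given field matches."""
    def predicate(ope):
        return all(ope[FIELD_INDEX[name]] == wanted
                   for name, wanted in criteria.items())
    return predicate

# Sample data invented for this example.
operations = [(1, "test"), (1, "other"), (2, "test")]

condition = make_condition(id=1, title="test")
matches = [ope for ope in operations if condition(ope)]
print(matches)   # [(1, 'test')]
```

Because the condition is an ordinary callable, it can also be passed straight to filter, and no user-supplied text is ever executed as code.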

Strange if statement

I found this strange if-statement in somebody else’s code:
if variable & 1 == 0:
I don't understand it. It should have two ==, right?
Can somebody explain this?
The conditional is a bitwise operator comparison:
>>> 1 & 1
1
>>> 0 & 1
0
>>> a = 1
>>> a & 1 == 0
False
>>> b = 0
>>> b & 1 == 0
True
As many of the comments say, for integers this conditional is True for evens and False for odds. The prevalent way to write this is if variable % 2 == 0: or if not variable % 2:
Using timeit we can see that there isn't much difference in performance.
n & 1 ("== 0" and "not"):
>>> timeit.Timer("bitwiseIsEven(1)", "def bitwiseIsEven(n): return n & 1 == 0").repeat(4, 10**6)
[0.2037370204925537, 0.20333600044250488, 0.2028651237487793, 0.20192503929138184]
>>> timeit.Timer("bitwiseIsEven(1)", "def bitwiseIsEven(n): return not n & 1").repeat(4, 10**6)
[0.18392395973205566, 0.18273091316223145, 0.1830739974975586, 0.18445897102355957]
n % 2 ("== 0" and "not"):
>>> timeit.Timer("modIsEven(1)", "def modIsEven(n): return n % 2 == 0").repeat(4, 10**6)
[0.22193098068237305, 0.22170782089233398, 0.21924591064453125, 0.21947598457336426]
>>> timeit.Timer("modIsEven(1)", "def modIsEven(n): return not n % 2").repeat(4, 10**6)
[0.20426011085510254, 0.2046220302581787, 0.2040550708770752, 0.2044820785522461]
Overloaded Operators:
Both the % and & operators are overloaded.
The bitwise and operator is overloaded for set. s.intersection(t) is equivalent to s & t and returns a "new set with elements common to s and t".
>>> {1} & {1}
set([1])
This doesn't affect our conditional:
>>> def bitwiseIsEven(n):
...     return n & 1 == 0
>>> bitwiseIsEven('1')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in bitwiseIsEven
TypeError: unsupported operand type(s) for &: 'str' and 'int'
>>> bitwiseIsEven({1})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in bitwiseIsEven
TypeError: unsupported operand type(s) for &: 'set' and 'int'
The modulo operator will also throw TypeError: unsupported operand type(s) for most non-ints.
>>> def modIsEven(n):
...     return n % 2 == 0
>>> modIsEven({1})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in modIsEven
TypeError: unsupported operand type(s) for %: 'set' and 'int'
It is overloaded as a string interpolation operator for the old %-formatting. It throws TypeError: not all arguments converted during string formatting if a string is used for the comparison.
>>> modIsEven('1')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in modIsEven
TypeError: not all arguments converted during string formatting
This won't throw if the string includes a valid conversion specifier.
>>> modIsEven('%d')
False
This code just checks if the lowest bit of variable is a 0. Based on operator precedence this is:
if (variable & 1) == 0:
First AND the lowest bit with one (extract just the lowest bit), then check if it is 0.
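To see that the bitwise and modulo spellings agree, here is a minimal check over a small range of integers, negative values included. The function names are made up for this sketch.

```python
def is_even_bitwise(n):
    return (n & 1) == 0    # explicit parentheses; same as n & 1 == 0

def is_even_modulo(n):
    return n % 2 == 0

values = range(-8, 9)
agree = all(is_even_bitwise(n) == is_even_modulo(n) for n in values)
evens = [n for n in values if is_even_bitwise(n)]

print(agree)   # True
print(evens)   # [-8, -6, -4, -2, 0, 2, 4, 6, 8]
```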
The & is a bitwise operator. It returns an integer that has a 1 bit in every position where both operands have a 1 bit, and a 0 bit in all other positions. For example:
a = 10 # 0b1010
b = 6 # 0b0110
a & b # 0b0010
Now, if you have variable & 1, you're masking variable with 0b1, which returns 1 only if the last digit in the binary representation is a 1, and 0 otherwise.
Your only concern is probably the operator &. It is a bitwise and, which takes the binary representation of the two operands and performs a "logical and" on each pair of bits.
For your example, consider the following:
variable = 2 #0b0010
if variable & 1 == 0:
    print("condition satisfied") # satisfied, 0b0010 & 0b0001 = 0
variable = 5 #0b0101
if variable & 1 == 0:
    print("condition satisfied") # not satisfied, 0b0101 & 0b0001 = 1
Note:
variable = 6 #0b0110
if variable & 2 == 0:
    print("condition satisfied") # not satisfied, 0b0110 & 0b0010 = 2 (0b0010)
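The same idea generalizes to any bit position by shifting the mask. This is a sketch; bit_is_clear is a name chosen here, not something from the original code.

```python
def bit_is_clear(value, position):
    """True when the bit at the given position (0 = lowest) is 0."""
    mask = 1 << position
    return (value & mask) == 0

print(bit_is_clear(2, 0))   # True:  0b0010 has bit 0 clear
print(bit_is_clear(5, 0))   # False: 0b0101 has bit 0 set
print(bit_is_clear(6, 1))   # False: 0b0110 has bit 1 set
print(bit_is_clear(6, 0))   # True:  0b0110 has bit 0 clear
```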

Usage of the "==" operator for three objects

Is there any computational difference between these two methods of checking equality between three objects?
I have two variables: x and y. Say I do this:
>>> x = 5
>>> y = 5
>>> x == y == 5
True
Is that different from:
>>> x = 5
>>> y = 5
>>> x == y and x == 5
True
What about if they are False?
>>> x = 5
>>> y = 5
>>> x == y == 4
False
And:
>>> x = 5
>>> y = 5
>>> x == y and x == 4
False
Is there any difference in how they are calculated?
In addition, how does x == y == z work?
Thanks in advance!
Python has chained comparisons, so these two forms are equivalent:
x == y == z
x == y and y == z
except that in the first, y is only evaluated once.
This means you can also write:
0 < x < 10
10 >= z >= 2
etc. You can also write confusing things like:
a < b == c is d # Don't do this
Beginners sometimes get tripped up on this:
a < 100 is True # Definitely don't do this!
which will always be false since it is the same as:
a < 100 and 100 is True # Now we see the violence inherent in the system!
Adding a little visual demonstration to already accepted answer.
To test equality of three values or variables, we can use either of:
>>> print(1) == print(2) == print(3)
1
2
3
True
>>> print(1) == print(2) and print(2) == print(3)
1
2
2
3
True
The two statements above give the same result but are not identical: in the chained form each operand is evaluated only once. Python chains relational operators naturally. See the docs:
Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).
If the functions called (whose return values you are comparing) have no side effects, then the two forms are the same.
In both examples, the second comparison will not be evaluated if the first one evaluates to false. However: beware of adding parentheses. For example:
>>> 1 == 2 == 0
False
>>> (1 == 2) == 0
True
See this answer.
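The evaluation order can be made visible by recording each operand as it is evaluated. A minimal sketch; traced and the labels are invented for this example.

```python
calls = []

def traced(label, value):
    """Record the label, then return the value so it can be compared."""
    calls.append(label)
    return value

chained = traced("x", 5) == traced("y", 5) == traced("z", 5)
chained_calls = list(calls)

calls.clear()
expanded = traced("x", 5) == traced("y", 5) and traced("y", 5) == traced("z", 5)
expanded_calls = list(calls)

calls.clear()
short = traced("x", 1) == traced("y", 2) == traced("z", 0)
short_calls = list(calls)

print(chained, chained_calls)     # True ['x', 'y', 'z']
print(expanded, expanded_calls)   # True ['x', 'y', 'y', 'z']
print(short, short_calls)         # False ['x', 'y']
```

The last case shows the short-circuit: once x == y is false, z is never evaluated.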

Python: Set Bits Count (popcount)

A few BLOBs have been duplicated in my database (Oracle 11g), and I performed XOR operations on the BLOBs using UTL_RAW.BIT_XOR. After that I wanted to count the number of set bits in the binary string, so I wrote the code below.
During a small experiment, I wanted to see the hex and integer values produced, so I wrote this procedure:
declare
    vblob1 blob;
BEGIN
    select leftiriscode INTO vblob1 FROM irisdata WHERE irisid=1;
    dbms_output.put_line(rawtohex(vblob1));
    dbms_output.put_line(UTL_RAW.CAST_TO_binary_integer(vblob1));
END;
/
OUTPUT: HEXVALUE:
0F0008020003030D030C1D1C3C383C330A3311373724764C54496C0A6B029B84840547A341BBA83D
BB5FB9DE4CDE5EFE96E1FC6169438344D604681D409F9F9F3BC07EE0C4E0C033A23B37791F59F84F
F94E4F664E3072B0229DA09D9F0F1FC600C2E380D6988C198B39517D157E7D66FE675237673D3D28
3A016C01411003343C76740F710F0F4F8FE976E1E882C186D316A63C0C7D7D7D7D397F016101B043
0176C37E767C7E0C7D010C8302C2D3E4F2ACE42F8D3F3F367A46F54285434ABB61BDB53CBF6C7CC0
F4C1C3F349B3F7BEB30E4A0CFE1C85180DC338C2C1C6E7A5CE3104303178724CCC5F451F573F3B24
7F24052000202003291F130F1B0E070C0E0D0F0E0F0B0B07070F1E1B330F27073F3F272E2F2F6F7B
2F2E1F2E4F7EFF7EDF3EBF253F3D2F39BF3D7F7FFED72FF39FE7773DBE9DBFBB3FE7A76E777DF55C
5F5F7ADF7FBD7F6AFE7B7D1FBE7F7F7DD7F63FBFBF2D3B7F7F5F2F7F3D7F7D3B3F3B7FFF4D676F7F
5D9FAD7DD17F7F6F6F0B6F7F3F767F1779364737370F7D3F5F377F2F3D3F7F1F2FE7709FB7BCB77B
0B77CF1DF5BF1F7F3D3E4E7F197F571F7D7E3F7F7F7D7F6F4F75FF6F7ECE2FFF793EFFEDB7BDDD1F
FF3BCE3F7F3FBF3D6C7FFF7F7F4FAF7F6FFFFF8D7777BF3AE30FAEEEEBCF5FEEFEE75FFEACFFDF0F
DFFFF77FFF677F4FFF7F7F1B5F1F5F146F1F1E1B3B1F3F273303170F370E250B
INTEGER VALUE: 15
There was a discrepancy between the hex code and the integer value produced, so I used the following Python code to check the actual integer value.
print int("0F0008020003030D030C1D1C3C383C330A3311373724764C54496C0A6B029B84840547A341BBA83D
BB5FB9DE4CDE5EFE96E1FC6169438344D604681D409F9F9F3BC07EE0C4E0C033A23B37791F59F84F
F94E4F664E3072B0229DA09D9F0F1FC600C2E380D6988C198B39517D157E7D66FE675237673D3D28
3A016C01411003343C76740F710F0F4F8FE976E1E882C186D316A63C0C7D7D7D7D397F016101B043
0176C37E767C7E0C7D010C8302C2D3E4F2ACE42F8D3F3F367A46F54285434ABB61BDB53CBF6C7CC0
F4C1C3F349B3F7BEB30E4A0CFE1C85180DC338C2C1C6E7A5CE3104303178724CCC5F451F573F3B24
7F24052000202003291F130F1B0E070C0E0D0F0E0F0B0B07070F1E1B330F27073F3F272E2F2F6F7B
2F2E1F2E4F7EFF7EDF3EBF253F3D2F39BF3D7F7FFED72FF39FE7773DBE9DBFBB3FE7A76E777DF55C
5F5F7ADF7FBD7F6AFE7B7D1FBE7F7F7DD7F63FBFBF2D3B7F7F5F2F7F3D7F7D3B3F3B7FFF4D676F7F
5D9FAD7DD17F7F6F6F0B6F7F3F767F1779364737370F7D3F5F377F2F3D3F7F1F2FE7709FB7BCB77B
0B77CF1DF5BF1F7F3D3E4E7F197F571F7D7E3F7F7F7D7F6F4F75FF6F7ECE2FFF793EFFEDB7BDDD1F
FF3BCE3F7F3FBF3D6C7FFF7F7F4FAF7F6FFFFF8D7777BF3AE30FAEEEEBCF5FEEFEE75FFEACFFDF0F
DFFFF77FFF677F4FFF7F7F1B5F1F5F146F1F1E1B3B1F3F273303170F370E250B",16)
Answer:
611951595100708231079693644541095422704525056339295086455197024065285448917042457
942011979060274412229909425184116963447100932992139876977824261789243946528467423
887840013630358158845039770703659333212332565531927875442166643379024991542726916
563271158141698128396823655639931773363878078933197184072343959630467756337300811
165816534945075483141582643531294791665590339000206551162697220540050652439977992
246472159627917169957822698172925680112854091876671868161705785698942483896808137
210721991100755736178634253569843464062494863175653771387230991126430841565373390
924951878267929443498220727531299945275045612499928105876210478958806304156695438
684335624641395635997624911334453040399012259638042898470872203581555352191122920
004010193837249388365999010692555403377045768493630826307316376698443166439386014
145858084176544890282148970436631175577000673079418699845203671050174181808397880
048734270748095682582556024378558289251964544327507321930196203199459115159756564
507340111030285226951393012863778670390172056906403480159339130447254293412506482
027099835944315172972281427649277354815211185293109925602315480350955479477144523
387689192243720928249121486221114300503766209279369960344185651810101969585926336
07333771272398091
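Because the hex dump is wrapped across several lines, the whitespace has to be removed before int() will accept it. A minimal sketch, using only the first 16 hex digits of the dump above as sample data (split over two lines here to mimic the wrapping):

```python
# First 16 hex digits of the dump, wrapped over two lines on purpose.
wrapped_hex = """0F000802
0003030D"""

hex_digits = "".join(wrapped_hex.split())   # drop newlines and spaces
value = int(hex_digits, 16)                 # Python ints have no size limit
set_bits = bin(value).count("1")

print(hex_digits)   # 0F0008020003030D
print(set_bits)     # 13
```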
To get the set-bit count I have written the following code in C:
int bitsoncount(unsigned x)
{
    unsigned int b=0;
    if(x > 1)
        b=1;
    while(x &= (x - 1))
        b++;
    return b;
}
When I tried the same code in Python it did not work. I am new to Python and experimenting out of curiosity, so excuse me if I am wrong.
def bitsoncount(x):
    b=0;
    if(x>1):
        b=1;
    while(x &= (x-1)):
I get an error at the last line and need some help resolving it and implementing the logic in Python :-)
I was interested in checking out the set-bits version in Python after what I have seen!
Related question: Best algorithm to count the number of set bits in a 32-bit integer?
In Python 3.10+, there is int.bit_count():
>>> 123 .bit_count()
6
Python 2.6 or 3.0:
def bitsoncount(x):
    return bin(x).count('1')
Example:
>>> x = 123
>>> bin(x)
'0b1111011'
>>> bitsoncount(x)
6
Or
Matt Howells's answer in Python:
def bitsoncount(i):
    assert 0 <= i < 0x100000000
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xffffffff) >> 24
Starting with Python 3.10 you can use int.bit_count():
x = 826151739
print(x.bit_count()) # 16
What you're looking for is called the Hamming weight.
In Python 2.6/3.0 it can be found rather easily with:
bits = sum( b == '1' for b in bin(x)[2:] )
What version of Python are you using?
First off, Python delimits blocks with whitespace (indentation) and does not need semicolons, so to start it should look something like this:
def bitsoncount(x):
    b=0
    while(x > 0):
        x &= x - 1
        b+=1
    return b
A translation of your C algorithm (testing x > 0 up front, so that x = 1 is also counted correctly) is as follows:
def bitsoncount(x):
    b = 0
    while x > 0:
        x &= x - 1
        b += 1
    return b
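The loop works because x & (x - 1) clears the lowest set bit, so the body runs once per set bit. A short trace, as a sketch (the helper name is illustrative):

```python
def clear_lowest_set_bit_steps(x):
    """Return each intermediate value of x as its lowest set bit is cleared."""
    steps = [x]
    while x > 0:
        x &= x - 1
        steps.append(x)
    return steps

steps = clear_lowest_set_bit_steps(0b1011000)
print([bin(s) for s in steps])
# ['0b1011000', '0b1010000', '0b1000000', '0b0']
print(len(steps) - 1)   # 3
```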
Maybe this is what you mean?
def bits_on_count(x):
    b = 0
    while x != 0:
        if x & 1: # Last bit is a 1
            b += 1
        x >>= 1 # Shift the bits of x right
    return b
There's also a way to do it simply in Python 3.0:
def bits_on_count(x):
    return sum(c=='1' for c in bin(x))
This uses the fact that bin(x) gives a binary representation of x.
Try this module:
import sys
# Python 2 only: sys.maxint was removed in Python 3
if sys.maxint < 2**32:
    msb2= 2**30
else:
    msb2= 2**62
BITS=[-msb2*2] # not converted into long
while msb2:
    BITS.append(msb2)
    msb2 >>= 1
def bitcount(n):
    return sum(1 for b in BITS if b&n)
This should work for machine integers (depending on your OS and the Python version). It won't work for any long.
How do you like this one:
def bitsoncount(x):
    b = 0
    bit = 1
    while bit <= x:
        b += int(x & bit > 0)
        bit = bit << 1
    return b
Basically, you use a test bit that starts right and gets shifted all the way through up to the bit length of your in parameter. For each position the bit & x yields a single bit which is on, or none. Check > 0 and turn the resulting True|False into 1|0 with int(), and add this to the accumulator. Seems to work nicely for longs :-) .
How to count the number of 1-bits starting with Python 3.10: https://docs.python.org/3/library/stdtypes.html#int.bit_count
# int.bit_count()
n = 19
bin(n)
# '0b10011'
n.bit_count() # <-- this is how
# 3
(-n).bit_count()
# 3
Is equivalent to (as per page linked above), but more efficient than:
def bit_count(self):
    return bin(self).count("1")
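As a sanity check, the different approaches can be compared against each other. The sketch below assumes non-negative inputs; the function names are chosen for this example, and int.bit_count is only used when the interpreter provides it.

```python
import random

def count_with_bin(x):
    return bin(x).count("1")

def count_with_kernighan(x):
    count = 0
    while x > 0:
        x &= x - 1      # clear the lowest set bit
        count += 1
    return count

def count_with_shift(x):
    count = 0
    while x != 0:
        count += x & 1  # add the lowest bit
        x >>= 1
    return count

random.seed(0)
samples = [0, 1, 2, 123, 826151739] + [random.getrandbits(200) for _ in range(50)]

for x in samples:
    expected = count_with_bin(x)
    assert count_with_kernighan(x) == expected
    assert count_with_shift(x) == expected
    if hasattr(int, "bit_count"):          # Python 3.10+
        assert x.bit_count() == expected

print(count_with_bin(123), count_with_bin(826151739))   # 6 16
```

The 200-bit samples are there to confirm that the loop-based versions are not limited to 32-bit values.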
