Theano: Why does indexing fail in this case? - python

I'm trying to get the max of the elements of a vector selected by a boolean condition.
With Numpy:
>>> this = np.arange(10)
>>> this[~(this>=5)].max()
4
But with Theano:
>>> that = T.arange(10, dtype='int32')
>>> that[~(that>=5)].max().eval()
9
>>> that[~(that>=5).nonzero()].max().eval()
Traceback (most recent call last):
File "<pyshell#146>", line 1, in <module>
that[~(that>=5).nonzero()].max().eval()
AttributeError: 'TensorVariable' object has no attribute 'nonzero'
Why does this happen? Is this a subtle nuance that I'm missing?

You are using a version of Theano that is too old. In fact, tensor_var.nonzero() isn't in any released version. You need to update to the development version.
With the development version I have this:
>>> that[~(that>=5).nonzero()].max().eval()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: bad operand type for unary ~: 'tuple'
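This is because of misplaced parentheses: the method call binds tighter than ~, so ~ is applied to the tuple returned by nonzero() rather than to the comparison. Here is the corrected line: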
>>> that[(~(that>=5)).nonzero()].max().eval()
array(9, dtype=int32)
But the result is still unexpected! The problem is that Theano does not support a bool dtype. The comparison returns int8, and applying ~ to int8 inverts all 8 bits, not a single bit. This gives:
>>> (that>=5).eval()
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=int8)
>>> (~(that>=5)).eval()
array([-1, -1, -1, -1, -1, -2, -2, -2, -2, -2], dtype=int8)
You can remove the ~ with this:
>>> that[(that<5).nonzero()].max().eval()
array(4, dtype=int32)
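The effect described above can be reproduced without Theano. The following is a minimal plain-Python sketch, assuming the mask is stored as integers 0/1 as in the int8 output shown; the variable names are illustrative and not part of the original answer.

```python
# Plain-Python illustration (not Theano): a comparison mask stored as
# integers 0/1, mirroring the int8 output shown above.
values = list(range(10))
mask = [int(v >= 5) for v in values]      # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Bitwise NOT flips every bit of the integer, so 0 -> -1 and 1 -> -2.
inverted = [~m for m in mask]

# Every inverted entry is nonzero, so a nonzero-based selection keeps all values.
selected_wrong = [v for v, m in zip(values, inverted) if m != 0]

# Logical negation of a 0/1 mask: 1 - m (or use the opposite comparison).
negated = [1 - m for m in mask]
selected_right = [v for v, m in zip(values, negated) if m != 0]

print(max(selected_wrong))   # 9
print(max(selected_right))   # 4
```

Writing the opposite comparison directly, as in that<5, avoids the negation step entirely.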

Related

Why do set operations work with iterables only when using methods?

Why do set operations work with arbitrary iterables when using set methods, but not operators? To show what I mean:
>>> {0, 1, 2, 3}.intersection([0, 1])
{0, 1}
>>> {0, 1, 2, 3} & [0, 1]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'set' and 'list'
>>>
>>> {0, 1, 2, 3}.union([4, 5])
{0, 1, 2, 3, 4, 5}
>>> {0, 1, 2, 3} | [4, 5]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'
From the docs:
Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like set('abc') & 'cbs' in favor of the more readable set('abc').intersection('cbs').
It was considered less error-prone this way.
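As a small sketch of the construction the docs warn about: the method form iterates over the string, while the operator form needs an explicit set() conversion. The variable names are illustrative only.

```python
letters = set('abc')

# Methods accept any iterable; the string is iterated character by character.
via_method = letters.intersection('cbs')

# Operators require a set on both sides, so convert explicitly.
via_operator = letters & set('cbs')

# Using the operator with a plain string raises TypeError.
try:
    letters & 'cbs'
    operator_failed = False
except TypeError:
    operator_failed = True

print(via_method == via_operator)   # True
print(operator_failed)              # True
```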

How to use qflll() in the PARI library?

I wanted to use the function qflll from the PARI library in Python, so I downloaded pari-python-cygwin-0.1.zip. However, when I attempted to use qflll in Python, i.e.
qflll([[1,0,0],[0,1,0],[0,0,1]])
I got this error message
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Too few parameters provided: 1
So how do I invoke the function qflll in Python properly, without any error?
As you can see in these docs, the qflll function takes a PARI matrix as input. Therefore, you have to do something like:
sage: M = Matrix([[1,0,0],[0,1,0],[0,0,1]])
sage: p = pari(M)
sage: p.qflll()
[1, 0, 0; 0, 1, 0; 0, 0, 1]
Or, if you prefer, in one line:
sage: pari(Matrix([[1,0,0],[0,1,0],[0,0,1]])).qflll()
[1, 0, 0; 0, 1, 0; 0, 0, 1]

pandas: bad argument to internal function (in iterators.c)

Why do I get an error here? Using Python 2.6 and pandas v.0.13.1
In [2]: df = pd.DataFrame({'x': [1, 1, 2, 2, 1, 1], 'y':[1, 2, 2, 2, 2, 1]})
In [3]: print pd.factorize(pd.lib.fast_zip([df.x, df.y]))[0]
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-3-d98d985f2794> in <module>()
----> 1 print pd.factorize(pd.lib.fast_zip([df.x, df.y]))[0]
/usr/lib64/python2.6/site-packages/pandas/lib.so in pandas.lib.fast_zip (pandas/lib.c:8026)()
SystemError: numpy/core/src/multiarray/iterators.c:370: bad argument to internal function
You have to use df.x.values and df.y.values instead, in order to access the np.ndarray objects needed in pd.lib.fast_zip():
print(pd.factorize(pd.lib.fast_zip([df.x.values, df.y.values]))[0])

How do I select components from a particular list of indices of a Python vector?

So here I have x1Vals:
>>> x1Vals
[-0.33042515829906227, -0.1085082739900165, 0.93708611747433213, -0.19289496973017362, -0.94365384912207761, 0.43385903975568652, -0.46061140566051262, 0.82767432358782367, -0.24257307936591843, -0.1182761514447952, -0.29794617763330011, -0.87410892638408, -0.34732294121174467, 0.40646145339571249, -0.64082861589870865, -0.45680189916940073, 0.4688889876175073, -0.89399689430691298, 0.53549621114138612]
And here is the list of x1Vals indices that I want to select
>>> np.where(np.dot(XValsOnly,newweights) > 0)
(array([ 1, 2, 4, 5, 6, 8, 9, 13, 15, 16]),)
But when I try to get the values of x1Vals the Matlab way, I get this error:
>>> x1Vals[np.where(np.dot(XValsOnly,newweights) > 0)]
Traceback (most recent call last):
File "<pyshell#69>", line 1, in <module>
x1Vals[np.where(np.dot(XValsOnly,newweights) > 0)]
TypeError: list indices must be integers, not tuple
Is there a way around this?
The problem is that your x1Vals is a list object, which does not support fancy indexing. You just have to build an array out of it:
x1Vals = np.array(x1Vals)
and your approach will work.
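An alternative, which may also be faster, is np.take, which accepts the plain list directly. Pass the first element of the tuple returned by np.where so that the result stays one-dimensional:
np.take(x1Vals, np.where(np.dot(XValsOnly,newweights) > 0)[0])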
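If converting to an array is not desired, a plain list can also be indexed element by element. This is a minimal sketch using only the standard library; the shortened data and index list are hypothetical stand-ins for x1Vals and the np.where result.

```python
from operator import itemgetter

# Hypothetical shortened stand-ins for x1Vals and the selected indices.
x_vals = [-0.33, -0.11, 0.94, -0.19, -0.94, 0.43]
indices = [1, 2, 4, 5]

# A list comprehension picks one element per index.
picked = [x_vals[i] for i in indices]

# itemgetter does the same; with two or more indices it returns a tuple.
picked_alt = list(itemgetter(*indices)(x_vals))

print(picked)       # [-0.11, 0.94, -0.94, 0.43]
print(picked_alt)   # [-0.11, 0.94, -0.94, 0.43]
```

Note that itemgetter called with a single index returns the bare element rather than a tuple, so the list comprehension is the safer choice when the number of indices can vary.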

How do I fix this error? TypeError: 'str' does not support the buffer interface

>>> import struct
>>> s = '\x00\x00\x00\x01\x00\x00\x00\xff\xff\x00\x00'
>>> struct.unpack('11B', s)
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
struct.unpack('11B', s)
TypeError: 'str' does not support the buffer interface
What is wrong with this? Please help.
On Python 3, struct.unpack() expects an object that implements the buffer protocol, such as a bytes value, not a Unicode str:
>>> import struct
>>> s = b'\x00\x00\x00\x01\x00\x00\x00\xff\xff\x00\x00'
>>> struct.unpack('11B', s)
(0, 0, 0, 1, 0, 0, 0, 255, 255, 0, 0)
If you are reading this data from a file, open the file in binary mode instead of text mode to get bytes.
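If the data already exists as a str whose characters all lie in the range \x00 to \xff, one way to recover the bytes is to encode it as Latin-1, which maps each such code point to the byte of the same value. This is a sketch under that assumption, not a general recipe for arbitrary text.

```python
import struct

# A str holding byte-like characters, as in the question.
text = '\x00\x00\x00\x01\x00\x00\x00\xff\xff\x00\x00'

# Latin-1 maps code points 0-255 one-to-one onto bytes 0-255.
raw = text.encode('latin-1')

fields = struct.unpack('11B', raw)
print(fields)   # (0, 0, 0, 1, 0, 0, 0, 255, 255, 0, 0)
```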
