PyTorch CUDA: "to(device)" vs "device=device" for tensors?

I found a somewhat similar question, "What is the difference between model.to(device) and model=model.to(device)?", but I would like to check whether the same applies to my example:
Using .to(self.device)
mask = torch.tril(torch.ones(len_q, len_k)).type(torch.BoolTensor).to(self.device)
and
Using device=self.device
mask = torch.tril(torch.ones((trg_len, trg_len), device = self.device)).bool()
Are they both accomplishing the same thing - ensuring that mask goes to the GPU?

The torch.Tensor.to function makes a copy of your tensor on the destination device (unless the tensor already lives there). Setting the device option at creation instead allocates the tensor directly on that device, so no copy is involved.
So in your case you would rather do:
>>> mask = torch.tril(torch.ones(len_q, len_k, device=self.device))
But to answer your question: both have the effect of placing mask on self.device. The only difference is that the former first creates the data on the default device (normally the CPU) and then copies it over, so for a moment it exists on both devices.
The same can be said for torch.Tensor.bool vs. initializing with dtype:
>>> torch.randint(0, 2, (10,)).bool()
This makes a copy, while the following doesn't:
>>> torch.randint(0, 2, (10,), dtype=torch.bool)
However, torch.tril doesn't provide a dtype option, so it is not relevant here.
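As a minimal sketch of the two variants discussed above (assuming a recent PyTorch; the names device, n, mask_a and mask_b are made up for illustration, and the code falls back to the CPU when no GPU is available), both should end up with an identical boolean mask on the same device:

```python
import torch

# Hypothetical setup: use the GPU if there is one, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
n = 4

# Variant 1: build on the default device, then copy to the target device.
mask_a = torch.tril(torch.ones(n, n)).bool().to(device)

# Variant 2: allocate directly on the target device, no extra transfer.
mask_b = torch.tril(torch.ones((n, n), device=device)).bool()

print(mask_a.device == mask_b.device)  # same device
print(torch.equal(mask_a, mask_b))     # same contents
```

The second variant avoids the intermediate tensor on the default device, which is why it is usually preferred when the mask is rebuilt on every forward pass.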

Related

Pytorch: torch.int32 to torch.long

I'm new to Stack Overflow; I hope this post respects all the requirements.
As in the title, I was wondering how to change the dtype of a tensor from torch.int32 to torch.long, as I get this error in my code:
ValueError: Argument edge_index needs to be of type torch.long but found type torch.int32.
Thank you in advance.

There are two easy ways to convert tensor data to torch.long and they do the same thing. Check the below snippet.
import torch
# Example tensor
a = torch.tensor([1, 2, 3], dtype=torch.int32)
# One Way
a = a.to(torch.long)
# Second Way
a = a.type(torch.long)
# Test it out (Should print long version of dtype)
print(a.dtype)
Sarthak Jain

How can I restore Tensors to a past value, without saving the value to disk?

I'm doing some experimentation with TensorFlow, and have run into a snag. I'm trying to use TF to evaluate a change in a model, then either retain or revert the model, based on the resultant change in the loss function. I've got the hard part (conditional control) figured out, but I'm stuck on something that should be fairly straightforward: I can't seem to store the values of tf.trainable_variables for an iteration, then restore them if needed.
Let's say I build an Op:
...
store_trainable_vars = []
for v in tf.trainable_variables():
    store_trainable_vars.append(v)
...
Then later, I want to restore tf.trainable_variables to the value it had when this Op was last run. I'd want to do something like:
def reject_move():
    revert_state = []
    for (v, s) in zip(tf.trainable_variables(), store_trainable_vars):
        revert_state.append(tf.assign(v, s, name="revert_state"))
    return revert_state
Obviously, this will re-evaluate store_trainable_vars, which in turn links to the present value of tf.trainable_variables(), obviating the revert_state Op. I need some way to store and retrieve the value of Tensors without calling back to the present value of those Tensors. Something like
...
store_trainable_vars = []
for v in tf.trainable_variables():
    store_trainable_vars.append(v.value_right_now())
...
where v.value_right_now() returns a constant that won't change until overwritten.
I know I could use Saver, but that solution writes to the disk, which is not acceptable for this application as it will run inside a training loop.
I'm probably missing something obvious - any guidance would be appreciated.

To restore a graph state manually, you need to use the tf.tuple or tf.group operation, which will modify the flow for a bulk change:
This creates a tuple of tensors with the same values as the tensors
argument, except that the value of each tensor is only returned after
the values of all tensors have been computed.
[Update] Here's how I would do it:
import numpy as np
import tensorflow as tf
x = tf.placeholder(shape=[None, 5], dtype=tf.float32, name='x')
W = tf.Variable(np.zeros([5, 5]), dtype=tf.float32, name='W')
b = tf.Variable(np.zeros([5]), dtype=tf.float32, name='b')
y = tf.add(tf.matmul(x, W), b)
with tf.Session() as session:
    batch = np.ones([2, 5])
    session.run(tf.global_variables_initializer())
    print(session.run(y, feed_dict={x: batch}))  # prints [2, 5] zeros
    # store the current value
    store = {v.name: v.eval(session) for v in tf.trainable_variables()}
    print(store)  # prints [5, 5] and [5] zeros
    # update
    new = {'W:0': np.ones([5, 5]), 'b:0': np.ones([5])}
    session.run(tf.tuple([tf.assign(var, new[var.name]) for var in tf.trainable_variables()]))
    print(session.run(y, feed_dict={x: batch}))  # prints [2, 5] sixes
    # restore
    session.run(tf.tuple([tf.assign(var, store[var.name]) for var in tf.trainable_variables()]))
    print(session.run(y, feed_dict={x: batch}))  # prints [2, 5] zeros again
But I really think you should reconsider your decision about Saver, because it was designed to be used inside a training loop as well. Internally, Saver does all the tricky work for you (in particular, its restore op calls tf.group and tf.control_dependencies if needed), which may otherwise become the source of pretty nasty bugs. Besides, the disk is (almost) always bigger than your GPU and main memory, so if you can afford to store the model in memory, you should be able to store it on disk as well.
Here are some parameters that help to control the proliferation of checkpoint files on disk:
max_to_keep indicates the maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If None or 0, all checkpoint files are kept. Defaults to 5 (that is, the 5 most recent checkpoint files are kept).
keep_checkpoint_every_n_hours: In addition to keeping the most recent max_to_keep checkpoint files, you might want to keep one checkpoint file for every N hours of training. This can be useful if you want to later analyze how a model progressed during a long training session. For example, passing keep_checkpoint_every_n_hours=2 ensures that you keep one checkpoint file for every 2 hours of training. The default value of 10,000 hours effectively disables the feature.
[Update] As clarified in the comments, the main concern is disk latency, which may slow down training if the disk is accessed too often. Linux caches frequently used disk pages, and Windows does as well. But if you want to be absolutely sure, consider using tmpfs.
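For readers on TensorFlow 2, the in-memory store/restore idea from the snippet above can be sketched in eager mode as well. This is only an illustration under the assumption that eager execution is enabled (the TF 2 default); the answer itself targets the TF 1 session API, and the names variables and stored are made up here:

```python
import numpy as np
import tensorflow as tf

# Two stand-in variables (hypothetical; any list of tf.Variable objects works).
W = tf.Variable(tf.zeros([5, 5]), name="W")
b = tf.Variable(tf.zeros([5]), name="b")
variables = [W, b]

# Store: copy the current values into plain NumPy arrays held in memory.
stored = [v.numpy().copy() for v in variables]

# Update: overwrite the variables, as a training step would.
W.assign(tf.ones([5, 5]))
b.assign(tf.ones([5]))

# Restore: assign the stored values back to the variables.
for v, s in zip(variables, stored):
    v.assign(s)

print(np.all(W.numpy() == 0), np.all(b.numpy() == 0))
```

The explicit .copy() is a precaution so the snapshot cannot alias memory that later assignments might touch.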

It wasn't my original intent to answer this question myself, but I've come up with a method that works fairly well, so I thought I'd share it. The key insight came from this very clever answer. The approach is to reuse the assignment nodes created for initial variable assignment. A complete class implementing that approach is given below.
import tensorflow as tf
class TensorFlowState(object):
    def __init__(self):
        # Get the graph.
        graph = tf.get_default_graph()
        # Extract the global variables from the graph.
        self.gvars = graph.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
        # Extract the Assign operations for later use.
        self.assign_ops = [graph.get_operation_by_name(v.op.name + "/Assign")
                           for v in self.gvars]
        # Extract the initial value ops from each Assign op for later use.
        self.init_values = [op.inputs[1] for op in self.assign_ops]
    def start(self, sess):
        self.sess = sess
    def store(self):
        # Record the current state of the TF global variables.
        self.state = self.sess.run(self.gvars)
    def restore(self):
        # Create a dictionary of the initializers and stored state of globals.
        feed_dict = {init_value: val
                     for init_value, val in zip(self.init_values, self.state)}
        # Use the initializer ops for each variable to load the stored values.
        return self.sess.run(self.assign_ops, feed_dict=feed_dict)
To use, simply instantiate the class, call the start method to pass a tf.Session, and call the store and restore methods as needed inside your imperative training loop. I've used this implementation to build an optimizer, which runs about as fast as the gradient descent optimizers included with TensorFlow.

How to reuse a pyfftw object?

Perhaps it's just my misunderstanding, but how do you reuse a pyfftw object?
When I run something like the following code, img1_fft and img2_fft are the same despite receiving different input. When I uncomment the line that reconstructs the fftwObj, I get the desired output though.
inArray = pyfftw.empty_aligned(optimalSize, dtype='complex64')
inArray[ 0:img1.shape[0] , 0:img1.shape[1] ] = img1;
fftwObj = pyfftw.builders.fft2(inArray)
img1_fft = fftwObj(inArray)
inArray = pyfftw.empty_aligned(optimalSize, dtype='complex64')
inArray[ 0:img2.shape[0] , 0:img2.shape[1] ] = img2;
# fftwObj = pyfftw.builders.fft2(inArray)
img2_fft = fftwObj(inArray)
Am I doing something wrong since the whole point of "planning" was to not have to reconstruct pyfftw objects? I would like to just use the same pyfftw object (since all of my images are the same size) and just change the input to the object.

This is by design. The FFTW object returns its own internal output array and never copies it unless you do so explicitly, so img2_fft is img1_fft returns True.
You can copy the output using .copy(), or you can explicitly set the output array from your own array.

Why does scan upcast?

This code to calculate the trace of a matrix (based on an example in the Theano "loop" tutorial) works fine:
import numpy as np
import theano as th
import theano.tensor as T
floatX = 'float32'
X = T.matrix()
results = th.scan(lambda i, j, t_f: T.cast(X[i, j] + t_f, floatX),
                  sequences=[T.arange(X.shape[0]), T.arange(X.shape[1])],
                  outputs_info=np.asarray(0., dtype=floatX))[0]
result = results[-1]
compute_trace = th.function([X], result)
x = np.eye(5, dtype=floatX)
x[0] = np.arange(5, dtype=floatX)
print(compute_trace(x))
But if I remove the cast operation from the lambda function like this:
lambda i,j,t_f : X[i,j] + t_f
The following error message is produced:
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (outputs_info in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 2) has dtype float32, while the result of the inner function (fn) has dtype float64. This can happen if the inner function of scan results in an upcast or downcast.
Why so? X and outputs_info are explicitly set to float32. How does the result of adding them get to be float64?

This is a very late answer, but we're working on a fork of Theano called Aesara, and, since people still run into problems like this, it seems worthwhile to provide a public explanation.
That said, the issue is X = T.matrix(). T.matrix creates a float64 matrix when theano.config.floatX == "float64" (the default), and the result is an upcast to float64 for the sum in the body of the scan's loop function.
If X = T.fmatrix() is used, a float32 matrix is created instead and the problem is no longer present; otherwise, as mentioned in the comments, one can also set theano.config.floatX to "float32".
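The promotion rule behind this can be illustrated without Theano. The sketch below uses NumPy as a stand-in (an analogy only, not Theano code; the variable names are made up): adding a float32 value to a float64 array yields float64, while float32 plus float32 stays float32.

```python
import numpy as np

x32 = np.eye(5, dtype="float32")        # plays the role of T.fmatrix()
x64 = np.eye(5, dtype="float64")        # plays the role of a float64 T.matrix()
acc = np.asarray(0.0, dtype="float32")  # float32 initial state, as in outputs_info

sum32 = x32 + acc
sum64 = x64 + acc

print(sum32.dtype)  # float32: no upcast
print(sum64.dtype)  # float64: the float32 state was upcast
```

In the scan from the question, the float64 result of the inner function then no longer matches the float32 initial state, which is what the error message reports.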

Using python binding for flycapture to retrieve color image

I am working with the CMLN-13S2C-CS CCD camera from PointGrey Systems. It uses FlyCapture API to grab images. I would like to grab these images and do some stuff in OpenCV with them using python.
I am aware of the following python binding: pyflycapture2. With this binding I am able to retrieve images. However, I cannot retrieve the images in color, which is what the camera should be able to do.
The videomode and framerate that the camera is able to handle are VIDEOMODE_1280x960Y8, and FRAMERATE_15, respectively. I think it has something to do with the pixel_format, which I think should be raw8.
Is anyone able to retrieve a color image using this or any existing python binding for flycapture? Note that I am working on Linux.

You don't need to use the predefined modes. The Context class has the set_format7_configuration(mode, x_offset, y_offset, width, height, pixel_format) method with which you can use your custom settings. Using this you can at least change the resolution of the grabbed image.
Usage example:
c.set_format7_configuration(fc2.MODE_0, 320, 240, 1280, 720, fc2.PIXEL_FORMAT_MONO8)
As for the coloring issue, I've so far managed to get a colored image using PIXEL_FORMAT_RGB8 and modifying the Image class in flycapture2.pyx as follows:
def __array__(self):
    cdef np.ndarray r
    cdef np.npy_intp shape[3]  # From 2 to 3
    cdef np.dtype dtype
    numberofdimensions = 2  # New variable
    if self.img.format == PIXEL_FORMAT_MONO8:
        dtype = np.dtype("uint8")
    elif self.img.format == PIXEL_FORMAT_MONO16:
        dtype = np.dtype("uint16")
    elif self.img.format == PIXEL_FORMAT_RGB8:  # New condition
        dtype = np.dtype("uint8")
        numberofdimensions = 3
        shape[2] = 3
    else:
        dtype = np.dtype("uint8")
    Py_INCREF(dtype)
    shape[0] = self.img.rows
    shape[1] = self.img.cols
    # nd value (numberofdimensions) was always 2; stride set to NULL
    r = PyArray_NewFromDescr(np.ndarray, dtype,
                             numberofdimensions, shape, NULL,
                             self.img.pData, np.NPY_DEFAULT, None)
    r.base = <PyObject *>self
    Py_INCREF(self)
    return r
This code is most likely not flawless (e.g., I removed the stride handling) for the simple reason that I have pretty much zero experience with C and Cython, but this way I at least managed to get a colored frame (I'm now trying to get PIXEL_FORMAT_RAW8 working).
And just as a reminder: the flycapture2.pyx is a Cython file so you need to recompile it before you can use it (I just run the pyflycap2 install script again).

I'm using the same camera with Matlab and also had issues with the "raw8" format. So I chose "rgb8", specifically "F7_RGB_644x482_Mode1", and everything started to work (I'm not sure how it should look in Python).
P.S. At the moment I'm trying to get started with Python and pyflycapture2; let's see if I can find a workaround.
UPD: Okay, now I know what's going on. :)
The cause of your (and my) issue is buried inside pyflycapture2 itself, specifically in the "Image" class definition. You can have a look here: https://github.com/jordens/pyflycapture2/blob/eec14acd761e89d8e63a0961174e7f5900180d54/src/flycapture2.pyx
if self.img.format == PIXEL_FORMAT_MONO8:
    dtype = np.dtype("uint8")
    stride[1] = 1
elif self.img.format == PIXEL_FORMAT_MONO16:
    dtype = np.dtype("uint16")
    stride[1] = 2
else:
    dtype = np.dtype("uint8")
    stride[1] = self.img.stride/self.img.cols
ANY image will be converted into grayscale, even if it was RGB initially. So, we need to update that file somehow.
