Python equivalent of R's save()?

In R I can save multiple objects to the hard drive using:
a = 3; b = "c"; c = 2
save(a, b, file = "filename.R")
I can then use load("filename.R") to get all objects back in the workspace. Is there an equivalent for Python?
I know I can use
import pickle
a = 3; b = "c"; c = 2
with open("filename.pkl", 'wb') as f:
    pickle.dump([a,b], f)
and load it back as:
with open("filename.pkl", 'rb') as f:
    a,b = pickle.load(f)
but this requires that I know what is inside filename.pkl in order to do the assignment a,b = pickle.load(f). Is there another way of doing it that is closer to what I did in R? If not, is there a reason for this that I currently fail to see?
--
edit: I don't agree that the linked question discusses the same issue. I am not asking for all variables, only specific ones. Might well be that there is no way to dump all variables (maybe since some variables in the global env cannot be exported or whatnot...) but still possible to export some.
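One possible way to get closer to the R workflow is to pickle a dictionary that maps names to objects, so the file itself records what it contains. The sketch below is only an illustration under that assumption; save_vars and load_vars are hypothetical helper names, not part of pickle.

```python
import os
import pickle
import tempfile

def save_vars(path, **named_objects):
    """Pickle the given keyword arguments as a name -> object dictionary."""
    with open(path, "wb") as f:
        pickle.dump(named_objects, f)

def load_vars(path, namespace=None):
    """Load the dictionary; optionally copy its entries into a namespace dict."""
    with open(path, "rb") as f:
        loaded = pickle.load(f)
    if namespace is not None:
        namespace.update(loaded)
    return loaded

a = 3; b = "c"; c = 2
path = os.path.join(tempfile.mkdtemp(), "filename.pkl")
save_vars(path, a=a, b=b)      # only the selected objects are stored

restored = load_vars(path)     # no need to know the contents in advance
print(sorted(restored))        # ['a', 'b']
```

Calling load_vars(path, globals()) at module level would put the names back into the workspace, much like R's load(); keeping them in the returned dictionary avoids overwriting existing variables by accident.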

Related

Python - function calling exec() does not see variable

I have the following script that works well on its own, but once I wrap it all in a function it does not return data.
The command changes based on input data structure. This is an example of the command I want to feed into the exec():
cross_data=pd.crosstab(src_data['result'],[src_data['c1'],src_data['c2']],normalize='index')
This is my function I want to wrap the code in and call:
def calcct(file_path='src_data.csv', separator = ",", res_col = 'result'):
    #define function
    src_data = csv_import(file_path, separator) #import data
    reorder_cols = reorder_columns(src_data, res_col) #work with data
    head_list=list(reorder_cols.columns.values) #get dataframe headers
    # create command based on headers and execute that. Should return dataframe called cross_data.
    exec(crosstabcmd(head_list))
    return cross_data
Results in:
NameError: name 'cross_data' is not defined
I cannot seem to find the correct syntax for calling exec inside a function.
I tried defining and passing the cross_data variable, but then I just get an error that it doesn't see pandas.
Or is there some better way? I need to compose the command from 2 to x column names; the count and names of the columns are variable.

First up
You probably don't mean to be using exec - that's a pretty low-level functionality! There isn't really enough context to understand how to fix this yet. Could you write out (in your question) what the crosstabcmd function looks like?
The error
NameError: name 'cross_data' is not defined
is because you've never defined a variable called cross_data in the scope of function calcct, i.e. you have never done cross_data = "something".
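To illustrate the point for Python 3: an assignment made by exec() inside a function does not become a local variable of that function, so a later reference to the name fails. The sketch below uses hypothetical function and variable names and a trivial command string in place of the crosstab command; it also shows the usual workaround of passing an explicit dictionary for exec() to write into.

```python
def run_without_namespace():
    exec("cross_total = 1 + 2")
    try:
        # The compiler saw no assignment to cross_total in this function,
        # so this is a global lookup, and the name does not exist there.
        return cross_total
    except NameError:
        return None

def run_with_namespace():
    namespace = {}
    # exec(source, globals, locals): assignments land in the locals mapping
    exec("cross_total = 1 + 2", {}, namespace)
    return namespace["cross_total"]

print(run_without_namespace())  # None
print(run_with_namespace())     # 3
```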
I'll give it a go
Assuming you have something like
import pandas as pd
def crosstabcmd(head_list):
    # ? I can only guess what your crosstabcmd does, this won't work though
    return pd.crosstab(*head_list, normalize='index')
then the solution would look like:
def calcct(file_path = 'src_data.csv', separator = ",", res_col = 'result'):
    src_data = csv_import(file_path, separator) #import data
    reorder_cols = reorder_columns(src_data, res_col) #work with data
    head_list=list(reorder_cols.columns.values) #get dataframe headers
    cross_data = crosstabcmd(head_list)
    return cross_data

In my case I had a main script that called a second script, and I needed to use the "c" variable within the second script. I therefore passed locals() and loc as arguments to exec().
loc = {}
a = 10
b = 5
def abc(a,b):
    qwerty = "c = %d + %d"%(a,b)
    # names assigned by the executed string end up in loc
    exec(qwerty, locals(), loc)
    c = loc['c']
    d = c+2
    print(d)
abc(a,b)

Unpacking data with h5py

I want to write numpy arrays to a file and easily load them in again.
I would like to have a function save() that preferably works in the following way:
data = [a, b, c, d]
save('data.h5', data)
which then does the following
h5f = h5py.File('data.h5', 'w')
h5f.create_dataset('a', data=a)
h5f.create_dataset('b', data=b)
h5f.create_dataset('c', data=c)
h5f.create_dataset('d', data=d)
h5f.close()
Then subsequently I would like to easily load this data with for example
a, b, c, d = load('data.h5')
which does the following:
h5f = h5py.File('data.h5', 'r')
a = h5f['a'][:]
b = h5f['b'][:]
c = h5f['c'][:]
d = h5f['d'][:]
h5f.close()
I can think of the following for saving the data:
h5f = h5py.File('data.h5', 'w')
data_str = ['a', 'b', 'c', 'd']
for name in data_str:
    h5f.create_dataset(name, data=eval(name))
h5f.close()
I can't think of a similar way of using data_str to then load the data again.

Rereading the question (was this edited or not?), I see load is supposed to function as:
a, b, c, d = load('data.h5')
This eliminates the global variable names issue that I worried about earlier. Just return the 4 arrays (as a tuple), and the calling expression takes care of assigning names. Of course this way, the global variable names do not have to match the names in the file, nor the names used inside the function.
def load(filename):
    h5f = h5py.File(filename, 'r')
    a = h5f['a'][:]
    b = h5f['b'][:]
    c = h5f['c'][:]
    d = h5f['d'][:]
    h5f.close()
    return a,b,c,d
Or using a data_str parameter:
def load(filename, data_str=['a','b','c','d']):
    h5f = h5py.File(filename, 'r')
    arrays = []
    for name in data_str:
        var = h5f[name][:]
        arrays.append(var)
    h5f.close()
    return arrays
For loading all the variables in the file, see Reading ALL variables in a .mat file with python h5py
An earlier answer that assumed you wanted to take the variable names from the file key names.
This isn't an h5py issue. It's about creating global (or local) variables using names from a dictionary (or other structure). In other words, how to create a variable using a string as its name.
This issue has come up often in connection with argparse, a command-line parser. It gives an object like args=Namespace(a=1, b='value'). It is easy to turn that into a dictionary (with vars(args)), {'a':1, 'b':'value'}. But you have to do something tricky, and not Pythonic, to create a and b variables.
It's even worse if you create that dictionary inside a function, and then want to create global variables (i.e. outside the function).
The trick involves assigning to locals() or globals(). But since it's un-pythonic I'm reluctant to be more specific.
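As a small illustration of the argparse point, the sketch below builds a Namespace by hand (in place of a real parse_args() call) and converts it with vars(). The name settings is only a placeholder; the values stay in the dictionary and are looked up by key instead of being turned into variables.

```python
import argparse

# Stand-in for the object that parser.parse_args() would return
args = argparse.Namespace(a=1, b="value")

settings = vars(args)    # the namespace's attributes as a dictionary
print(settings)          # {'a': 1, 'b': 'value'}
print(settings["b"])     # value
```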
In so many words I'm saying the same thing as the accepted answer in https://stackoverflow.com/a/4467517/901925
For loading variables from a file into an Ipython environment, see
https://stackoverflow.com/a/28258184/901925 ipython-loading-variables-to-workspace

I would use deepdish (deepdish.io):
import deepdish as dd
dd.io.save(filename, {'dict1': dict1, 'obj2': obj2}, compression=('blosc', 9))

Create variable name from two string in python [duplicate]

This question already has answers here:
How do I create variable variables?
(17 answers)
Closed last year.
I am looking to create a variable name from two strings in Python, e.g.:
a = "column_number"
b = "_url1"
and then be able to get a variable name "column_number_url1" that I can use.
I appreciate this is in general not a good idea - there are numerous posts which go into why it is a bad idea (e.g. How do I create a variable number of variables? , Creating multiple variables ). I mainly want to be able to do it because these are all variables which get defined elsewhere in the code, and I want an easy way of being able to re-access them (i.e. rather than creating thousands of unique variables, for which I agree a dictionary etc. would be better).
As far as I can tell, the answers in the other posts I have found are all alternative ways of doing this, rather than how to create a variable name from two strings.

>>> a = "column_number"
>>> b = "_url1"
>>> x = 1234
>>> globals()[a + b] = x
>>> column_number_url1
1234
The reason that there aren't many posts explaining how to do this is because (as you might have gathered) it's not a good idea. I guarantee that your use case is no exception.
In case you didn't notice, globals() is essentially a dictionary of global variables. Which implies that you should be using a dictionary for this all along ;)

You can use a dictionary:
a = "column_number"
b = "_url1"
obj = {}
obj[a+b] = None
print(obj) #{"column_number_url1": None}
Alternatively, you could use exec, but remember to always watch yourself around usage of eval/exec:
a = "column_number"
b = "_url1"
exec(a+b+" = 0")
print(column_number_url1) #0
eval is evil

As an alternative to Joel's answer, a dictionary would be much nicer:
a = "column_number"
b = "_url1"
data = {}
data[a+b] = 42

Export function that names the export file after the input variable

I'm looking to get a function to export Numpy arrays, but to use the name of the variable input to the function as the name of the exported file. Something like:
MyArray = [some numbers]
def export(Varb):
    return np.savetxt("%s.dat" %Varb.name, Varb)
export(MyArray)
that will output a file called 'MyArray.dat' filled with [some numbers]. I can't work out how to do the 'Varb.name' bit. Any suggestions?
I'm new to Python and programming so I hope there is something simple I've missed!
Thanks.

I don't recommend such a code style, but you can do it this way:
import copy
myarray = range(4)
# iterate over a copy, since the loop itself adds names to locals()
for k, v in iter(copy.copy(locals()).items()):
    if myarray == v:
        print(k)
This gives myarray as output. To do this in a function useful for exports use:
import copy
def export_with_name(arg):
    """ Export variable's string representation with its name as filename """
    for k, v in iter(copy.copy(globals()).items()):
        if arg == v:
            with open(k, 'w') as handle:
                handle.write(repr(arg))
locals() and globals() both give dictionaries holding the variable names as keys and the variable values as values.
Use the function the following way:
some_data = list(range(4))
export_with_name(some_data)
gives a file called some_data with
[0, 1, 2, 3]
as content.
Tested and compatible with Python 2.7 and 3.3

You can't. Python objects don't know what named variables happen to be referencing them at any particular time. By the time the variable has been dereferenced and sent to the function, you don't know where it came from. Consider this bit of code:
map(export, myvarbs)
Here, the varbs were in some sort of container and didn't even have a named variable referencing them.
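A short sketch of the same point, plus one hedged workaround: since an object cannot report which name refers to it, the caller can pass the name explicitly. The export signature below (name, values, directory) is an assumption for illustration, not the asker's original function, and it writes plain text instead of calling np.savetxt.

```python
import os
import tempfile

first = [8, 3, 90]
second = first            # a second name bound to the very same list object
print(first is second)    # True: the list has no single "own" name

def export(name, values, directory="."):
    """Write one value per line to <directory>/<name>.dat and return the path."""
    path = os.path.join(directory, "%s.dat" % name)
    with open(path, "w") as handle:
        for value in values:
            handle.write("%s\n" % value)
    return path

# Keeping the arrays in a dictionary makes the name available as data
out_dir = tempfile.mkdtemp()
arrays = {"MyArray": first}
paths = [export(name, values, out_dir) for name, values in arrays.items()]
print(os.path.basename(paths[0]))   # MyArray.dat
```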

import inspect
import re
number_array = [8,3,90]
def varname(p):
    # Look at the source line that called export(), two frames up from here
    caller = inspect.currentframe().f_back.f_back
    for line in inspect.getframeinfo(caller)[3] or []:
        m = re.search(r'\bexport\s*\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)', line)
        if m:
            return m.group(1)
def export(arg):
    file_name = "%s.dat" % varname(arg)
    # the with block closes the file even if writing fails
    with open(file_name, "w") as fd:
        fd.write(str(arg))
export(number_array)
See "How can you print a variable name in python?" for more details about varname().

Python 3.0 - Dynamic Class Instance Naming

I want to use a while loop to initialize class objects with a simple incremented naming convention. The goal is to be able to scale the number of class objects at will and have the program generate the names automatically. (ex. h1...h100...h1000...) Each h1,h2,h3... being its own instance.
Here is my first attempt... have been unable to find a good example.
class Korker(object):
    def __init__(self,ident,roo):
        self.ident = ident
        self.roo = roo
b = 1
hwinit = 'h'
hwstart = 0
while b <= 10:
    showit = 'h' + str(b)
    print(showit) #showit seems to generate just fine as demonstrated by print
    str(showit) == Korker("test",2) #this is the line that fails
    b += 1
The errors I get range from a string error to a cannot use function type error.... Any help would be greatly appreciated.

If you want to generate a number of objects, why not simply put them in a list or dictionary where they can be looked up later on:
objects = {}
for b in range(1,11):
    objects['h'+str(b)] = Korker("test", 2)
# then access like this:
objects['h3']
Of course there are ways to make the names available locally, but that's not a very good idea unless you know why you need it (via globals() and locals()).

Variables are names that point to objects that hold data. You are attempting to stick data into the variable names. That's the wrong way around.
Instead of h1 to h1000, just call the variable h, and make it a list. Then you get h[0] to h[999].

Slightly different solution to viraptor's: use a list.
h = []
for i in range(10):
    h.append(Korker("test",2))
In fact, you can even do it on one line with a list comprehension:
h = [Korker("test", 2) for i in range(10)]
Then you can get at them with h[0], h[1] etc.
