creating variables from external data in python script - python

I want to read an external data source (excel) and create variables containing the data. Suppose the data is in columns and each column has a header with the variable name.
My first idea is to write a function so I can easily reuse it. I could also give it some additional keyword arguments to make the function more versatile.
The problem I'm facing is that I want to refer to the data in Python (interactively) via the variable names. I don't know how to do that (with a function). The only solution I see is returning the variable names and the data from my function (e.g. as lists), and doing something like this:
def get_data(my_excel):
    (...)
    return names, values
names, values = get_data(my_excel)
for n, v in zip(names, values):
    exec(''.join([n, '= v']))
Can I get the same result directly?
Thanks,
Roel

Use a dictionary to store your mapping from name to value instead of creating local variables.
def get_data(excel_document):
    mapping = {}
    mapping['name1'] = 'value1'
    # ...
    return mapping
mapping = get_data(my_excel)
for name, value in mapping.items():
    print(name, value)  # use them
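If you really want to populate variables from the mapping, you can update globals(), but it is generally considered bad practice. (Updating locals() is not reliable: inside a function, changes to the dictionary it returns do not create or rebind local variables.)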
mapping = get_data(my_excel)
globals().update(mapping)
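As a rough illustration of the dictionary approach, here is a minimal sketch that turns a header row plus data rows into a name-to-column mapping. The helper name columns_from_rows and the sample rows are made up for this example; a plain list of lists stands in for whatever your Excel reader returns.

```python
def columns_from_rows(rows):
    """Turn a header row plus data rows into {header: [column values]}."""
    header, *data = rows
    # zip(*data) transposes the data rows into columns;
    # with no data rows, every header gets an empty column.
    columns = zip(*data) if data else ([] for _ in header)
    return {name: list(column) for name, column in zip(header, columns)}


# Hypothetical sheet contents: the first row holds the variable names.
rows = [["a", "b"], [1, 10], [2, 20], [3, 30]]
mapping = columns_from_rows(rows)
print(mapping["a"])
```

The data stays reachable by name (mapping["a"]) without touching the surrounding namespace.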

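If you just want to set a variable for each name in names at module level (where locals() is the same dictionary as globals()), use the following; inside a function, writing to locals() does not create local variables:
for n, v in zip(names, values):
    locals()[n] = v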
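To see why writing to locals() is fragile, here is a small sketch (function and variable names are invented for the demonstration) that tries the trick inside a function in CPython and compares it with a plain dictionary:

```python
def set_via_locals():
    # Writing into the dict returned by locals() does not create a local variable.
    locals()["injected_value"] = 42
    try:
        # The compiler treats this name as a global, and no such global exists.
        return injected_value
    except NameError:
        return None


def set_via_dict():
    # A plain dict behaves the same way everywhere.
    namespace = {}
    namespace["injected_value"] = 42
    return namespace["injected_value"]


print(set_via_locals())
print(set_via_dict())
```

The first call finds no variable at all, while the dictionary version simply returns the stored value.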
If you'd rather have a single object to access the data, which is much cleaner, simply use a dict and return that from your function.
def get_data():
    (...)
    return dict(zip(names, values))
To access the value of the name "a", simply use get_data()["a"].
Finally, if you want to access the data as attributes of an object, you can update the __dict__ of an object (unexpected behaviour may occur if any of your column names are equal to the names of special Python methods or of other attributes of the object).
class Data(object):
    def __init__(self, my_excel):
        (...)
        self.__dict__.update(zip(names, values))
data = Data("test.xls")
print(data.a)

The traditional approach would be to stuff the key/value pairs into a dict so that you can easily pass the whole structure around to other functions. If you really want to store them as attributes instead of dict keys, consider creating a class to hold them:
class Values(object): pass
store = Values()
for key, value in zip(names, values):
    setattr(store, key, value)
That keeps the variables in their own namespace, separate from your running code. That's almost always a Good Thing. What if you get a spreadsheet with a header called "my_excel"? Suddenly you've lost access to your original my_excel object, which would be very inconvenient if you needed it again.
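If you like this attribute style, the standard library's types.SimpleNamespace gives you such a holder without writing a class. A minimal sketch, with made-up column names and values, where one column is deliberately called my_excel:

```python
from types import SimpleNamespace

# Hypothetical column names and data, as they might come out of the spreadsheet.
names = ["a", "b", "my_excel"]
values = [[1, 2, 3], [10, 20, 30], ["x", "y", "z"]]

my_excel = "test.xls"  # the variable we do not want to lose

# Each name becomes an attribute of store, not a variable in our own namespace.
store = SimpleNamespace(**dict(zip(names, values)))

print(store.a)
print(store.my_excel)
print(my_excel)
```

The column named my_excel ends up as store.my_excel, and the original my_excel variable is left alone.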
But in any case, you should never use exec unless you know exactly what you're doing. And even then, don't use exec. For instance, suppose I know how your code works and send you a spreadsheet with "os.system('echo rm -rf *')" in a cell. You probably don't really want to execute that.

Related

Run for loop over subsets in Python

Can you run a for loop over the names of multiple subsets?
For instance, I now have subsets dfVC1 up until dfVC20 and I would like to do something like:
for x in range(20):
    print(dfVC[x])
I get this doesn't work... but wonder if there is a way to do this.
I'm going to assume your 'subsets' in this case are variables, named dfVC0, dfVC1, etc. Then, your problem is that you want to print all of them by number, but since they're separate variables, you can't index into them.
One way to solve this would be to change how the 'subsets' are declared. Instead of
dfVC0 = ...
dfVC1 = ...
you could make one dfVC variable that's a dict, that holds all the others at their proper indices.
dfVC = {}
dfVC[0] = ...
dfVC[1] = ...
which would then allow you to access the various dfVC subsets in the way you're currently trying to.
But changing such a large part of the program isn't always possible. What you might be able to do instead is to figure out which object the dfVCs are attached to, and grab them by string.
If they're in the local namespace (i.e. were declared in the same function as you're currently executing in), you can call the built-in locals() to get a dict that you can then try to find your key in:
for x in range(20):
    sname = f'dfVC{x}'
    print(locals()[sname])
globals() can be used similarly, if your 'subsets' are in the global scope (i.e. declared outside of the current function).
And if your dfVC variables are attached to a class or module (or something else that behaves like a namespace), you can retrieve them using the built-in getattr() function:
for x in range(20):
    sname = f'dfVC{x}'
    print(getattr(self, sname)) # replace self with whichever object has the dfVC attached to it
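Both ideas can be sketched in a few lines. Placeholder strings stand in for the real DataFrames here, and holder is an invented object used only to show the getattr() lookup with a default for missing names:

```python
from types import SimpleNamespace

# Option 1: keep the subsets in one dict from the start.
dfVC = {i: f"subset {i}" for i in range(1, 4)}
for i in sorted(dfVC):
    print(dfVC[i])

# Option 2: the subsets already exist as attributes of some object.
holder = SimpleNamespace(dfVC1="subset 1", dfVC2="subset 2")
# The third argument to getattr() is returned when the attribute is missing.
found = [getattr(holder, f"dfVC{i}", None) for i in range(1, 4)]
print(found)
```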

Can you call/use a function returned from a list in Python?

I'm trying to store a function in a list, retrieve the function from the list later, and then call on that function. This is basically what I want to do, without any specifics. It doesn't show my purpose, but it's the same issue.
elements: list = [] # List meant to contain a tuple with the name of the item and the function of the item.
def quit_code():
    exit()
elements.append(("quit", quit_code))
Now, somewhere else in the code, I want to be able to use an if statement to check the name of the item and, if it's the right one at that time, run the function.
user_input = "quit" # For brevity, I'm just writing this. Let's just imagine the user actually typed this.
if elements[0][0] == user_input:
    #This is the part I don't understand so I'm just going to make up some syntax.
    run_method(elements[0][1])
The method run_method that I arbitrarily made is the issue. I need a way to run the method returned by elements[0][1], which is the quit_code method. I don't need an alternative solution to this example because I just made it up to display what I want to do. If I have a function or object that contains a function, how can I run that function.
(In the most simplified way I can word it) If I have object_a (for me it's a tuple) that contains str_1 and fun_b, how can I run fun_b from the object.
To expand on this a little more, the reason I can't just directly call the function is because in my program, the function gets put into the tuple via user input and is created locally and then stored in the tuple.
__list_of_stuff: list = []
def add_to_list(name, function):
    __list_of_stuff.append((name, function))
And then somewhere else
def example_init_method():
    def stop_code():
        exit()
    add_to_list("QUIT", stop_code)  # pass the function itself; stop_code() would call it immediately
Now notice that I can't access the stop_code method anywhere else in the code unless I use it through the __list_of_stuff object.
Finally, it would be nice not to have to make a function for the input. By this, I mean directly inserting code into the parameter without creating a local function like stop_code. I don't know how to do this though.
Python treats functions as first-class citizens. As such, you can do things like:
def some_function():
    # do something
    pass
x = some_function
x()
Since you are storing functions and binding each function with a word (key), the best approach would be a dictionary. Your example could be like this:
def quit_code():
    exit()
operations = dict(quit=quit_code)
operations['quit']()
A dictionary relates a value with a key. The only rule is that the key must be hashable, which in practice means immutable objects such as numbers, strings and tuples of immutable objects.
To create a dictionary, you can use { and }. And to get a value by its key, use [ and ]:
my_dictionary = { 'a' : 1, 'b' : 10 }
print(my_dictionary['a']) # It will print 1
You can also create a dictionary with dict, like so:
my_dictionary = dict(a=1, b=10)
However, this only works for string keys that are valid Python identifiers.
But considering you are using quit_code to encapsulate the exit call, why not use exit directly?
operations = dict(quit=exit)
operations['quit']()
If dictionaries aren't an option, you could still use lists and tuples:
operations = [('quit', exit)]
for key, fun in operations:
    if key == 'quit':
        fun()
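For the last part of the question (passing code without defining a named function), a lambda may be enough when the body is a single expression. Here is a small sketch of a command registry; add_operation and run are names invented for this example, and harmless functions are used instead of exit so the snippet can be run safely:

```python
operations = {}


def add_operation(name, function):
    # Store the function object itself; it is not called here.
    operations[name] = function


# Lambdas let you register small pieces of code without a separate def.
add_operation("greet", lambda: "hello")
add_operation("double", lambda n: n * 2)


def run(name, *args):
    try:
        function = operations[name]
    except KeyError:
        raise ValueError(f"unknown command: {name!r}") from None
    # Calling the stored object is just a matter of adding parentheses.
    return function(*args)


print(run("greet"))
print(run("double", 21))
```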

How to set an argument as a dictionary or variable in python

I am trying to make a function in python that creates dictionaries with custom names. The code I am using so far looks like this:
def PCreate(P):
    P = {}
    print('Blank Party Created')
The problem that I am having is that whenever I use the function, no matter what I put down for P, for example:
PCreate('Party1')
It creates a blank dictionary with the name 'P'. Is there a way to make it create a dictionary with the name Party1?
It looks like you're confused about how variable names, strings, and objects interact within Python. When you define the function PCreate(P), you are saying that when the function is called, it will take one parameter, and within the function that parameter will be called P. This means that if you have the function,
def func(P):
    print(P)
and call it three times,
func('two words')
func(4)
func([3, 'word'])
you will get the output:
two words
4
[3, 'word']
This is because the parameter P has no explicit type in Python. So, when you called your function with the argument 'Party1' the values looked like this
def PCreate(P):
    # P is currently 'Party1'
    P = {}
    # P no longer is Party1, and now references {}
    ...
So you didn't assign {} to the variable with the name Party1, you overwrote the local variable P with a new empty dict.
I think you probably do not want to be doing what you're doing, but see this answer for more information on setting a variable using a string variable as its name.
What I recommend you do is create a function that returns your custom dictionaries, and assign the returned value to your custom name.
def new_custom_dict():
    my_dict = {} # Pretend this is somehow custom
    return my_dict
Party1 = new_custom_dict()
If you need the reference key to your new dictionary to be stored in a string, then you're in luck because that's what dictionaries are for!
You can first create a dictionary that will be used to store your custom named dictionaries:
dictionaries = {}
and when you want to add a new dictionary with a custom name, call this function
def insert_new_dictionary(dictionaries, dictionary_name):
    dictionaries[dictionary_name] = {}
e.g.
insert_new_dictionary(dictionaries, 'Party1')
insert_new_dictionary(dictionaries, 'Party2')
would leave you with two dictionaries accessible by dictionaries['Party1'] and dictionaries['Party2'].
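If you would rather not call a helper for every new name, collections.defaultdict can create the inner dictionary on first access. A minimal sketch, where the party names and the "leader" key are invented for illustration:

```python
from collections import defaultdict

# Each missing key gets a fresh empty dict the first time it is accessed.
parties = defaultdict(dict)

parties["Party1"]["leader"] = "Ann"
parties["Party2"]["leader"] = "Bob"

print(sorted(parties))
print(parties["Party1"])
```

One thing to keep in mind with this design: merely reading a missing key also creates it, so a typo in a name silently adds an empty dictionary.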

How do I use text from a file as a variable name?

I am pulling values out of an excel file.
I am using xlrd and xlutils with python 3.
class employee(object):
    def __init__(self, name):
        self.name = name
        emp_list.append(name)
    def bulk_hours(self,sunday=0,monday=0,tuesday=0,wednesday=0,thursday=0,friday=0,saturday=0):
        self.sunday = sunday
        self.monday = monday
        self.tuesday = tuesday
        self.wednesday = wednesday
        self.thursday = thursday
        self.friday = friday
        self.saturday = saturday
I'm pulling employees out of a spreadsheet.
I'm trying to use their actual names.
I would love to know any working solution.
Thanks!
Edit: Pardon my ignorance regarding programming and my horrible post.
I'm trying to make a simple program that allows me to load an employee's name and work schedule from Excel.
I will also make sure any edits are saved back into the spreadsheet.
The employees are labeled by their names. I'm trying to load their name as a variable so I can do:
John = employee('John')
John.bulk_hours(0,8,8,8,8,8,0)
Stacy = employee('Stacy')
print(John.monday)
I'm aiming to use their name as the variable I can use dot notation on.
Is this feasible? Is there some other way I should approach this?
def load(row):
    employee2 = employee(s.cell(row, 0).value)
    employee2.bulk_hours(s.cell(row, 1).value, s.cell(row, 2).value, s.cell(row, 3).value, s.cell(row, 4).value,
                         s.cell(row, 5).value, s.cell(row, 6).value, s.cell(row, 7).value)
    print(employee2.saturday)
I'm trying to use a function like this to load multiple employees and their hours.
Could I use a list like this somehow?
worker = ['Joe']
worker[0] = employee('Joe')
worker[0].bulk_hours(0,8,8,8,8,8,0)
print(worker[0].monday)
Thank you for your valuable time.
Override __getattr__ to transparently access an internal dictionary.
class employee(object):
    def __init__(self, ...):
        # replace extract_data with however you extract CSV values to a dictionary;
        # object.__setattr__ is used so this still works if __setattr__ is overridden as below
        object.__setattr__(self, '_internal_d', extract_data())
        ... # perform other initialization
    def __getattr__(self, name):
        try:
            return self._internal_d[name]
        except KeyError:
            raise AttributeError(name)
Optionally, you can implement __setattr__ to allow writing properties.
    def __setattr__(self, name, value):
        self._internal_d[name] = value
Explanation: when Python looks up an attribute on an object and can't find it "normally", it checks whether the object's class defines __getattr__. If it does, it calls __getattr__ with the attribute name to get the value. Thus, you can have dynamic attribute names. __setattr__ differs slightly: when defined, it is called for every attribute assignment on the object, not only for names that are missing.
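Here is a self-contained sketch of that pattern that can be run as-is. The class name Record and the sample data are made up, and the dictionary passed to the constructor stands in for whatever is extracted from the spreadsheet:

```python
class Record:
    def __init__(self, data):
        # Bypass our own __setattr__ so the backing dict is stored as a real attribute.
        object.__setattr__(self, "_data", dict(data))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name == "_data":
            # Guard against endless recursion if __init__ never ran.
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Called for every attribute assignment; redirect it to the dict.
        self._data[name] = value


john = Record({"monday": 8})
john.tuesday = 6
print(john.monday, john.tuesday)
print(hasattr(john, "friday"))
```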
You don't want to use variable names coming from the spreadsheet.
For one: variable names are internal to the running program, and are not meant to be exported again to an output file.
It is meaningless that the variable is named John to represent John's data when the program is running. For example, let's suppose it were possible to use a special markup to take the variable name from another variable, say a ? prefix. Your example would be something like this:
def row_data(emp_name, *args):
    ?emp_name = employee(emp_name)
    ?emp_name.bulk_hours(*args)
    print(?emp_name.monday)
So, even if at runtime ?emp_name were replaced by the contents of the variable emp_name, your program would still look the same to someone reading the code. So, it makes more sense to simply let the variable be named person or employee or anything, since it can represent any employee (and in fact will: as you loop through the spreadsheet contents, the variable will usually carry the data about one person at a time).
That said, there are times when we do want to have data in the program which has programmatic labeling. But even in those cases, that is what dictionaries are for: create an employees dict and fill it, and then you can have the names as the keys:
employees = dict()
def row_data(emp_name, *args):
    person = employee(emp_name)
    person.bulk_hours(*args)
    employees[emp_name] = person
def print_employees():
    for person_name, person_data in employees.items():
        print(person_name, person_data)
As you can see, it is possible to print all employees data without having to type in their names. Also, if it is an interactive program, it is possible to find the data related to a name that is input by the user, using it as the dictionary key.
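As a runnable sketch of the dictionary approach: a list of tuples stands in for the spreadsheet rows, and the Employee class below is a simplified stand-in for the one in the question, not the original code.

```python
DAYS = ("sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday")


class Employee:
    def __init__(self, name, hours):
        self.name = name
        # One attribute per weekday, in spreadsheet column order.
        for day, worked in zip(DAYS, hours):
            setattr(self, day, worked)


# Hypothetical rows: name in the first column, then seven daily hour values.
rows = [
    ("John", 0, 8, 8, 8, 8, 8, 0),
    ("Stacy", 0, 4, 4, 4, 4, 4, 0),
]

# The employee's name is the dictionary key, not a variable name.
employees = {name: Employee(name, hours) for name, *hours in rows}

print(employees["John"].monday)
for name, person in employees.items():
    print(name, person.friday)
```

Dot notation still works on each employee object; only the lookup by name goes through the dictionary.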
However, you may intend to have a program that generates other Python source files, ending up with a .py file for each employee. In that case, just make use of a templating engine, like Jinja2, or simply use str's format method. (It is still hard to imagine why you would need such a thing.)
And, just to have a complete answer, it is possible to create dynamic variable names in Python. However, you will be in the exact same situation I described in the first example here.
Global variables for any running code are kept in a regular Python dictionary, which is returned by a call to the globals() function. Similarly, values for local variables are kept in a dictionary that is returned by a call to locals(), although these do not behave nicely for variables known at compile time (in that case, the local variables are cached in the frame object, and the locals dict is only synchronized with them when it is read, but they can't be written via locals).
So:
def row_data(emp_name, *args):
    globals()[emp_name] = employee(emp_name)
    globals()[emp_name].bulk_hours(*args)
    print(globals()[emp_name].monday)
will work just as you asked - but it is easy to see it is useless.

Python - replace exec for dynamic variable creation

I have been told that using exec is a Very Bad Thing.
However, I'm new to python and trying to figure out how to dynamically create a bunch of global variables (I'm aware that this is also supposed to be a Bad Thing, but let's burn one bridge at a time, shall we?).
What this is doing: get a list of the current variables that need to be created (currently sitting in a CSV), get the unique ID's within that list, then create the necessary objects by appending the ID to the name and reading the content of another CSV into it.
import pandas as pd
def importtest():
    ilist = pd.read_csv('Z:/fakepath/ID.csv')
    for i in range(0, len(ilist['ID'].unique())):
        tempID = ilist['ID'].unique()[i]
        exec("variable%s = pd.read_csv('%s')" % (
            str(tempID), 'Z:/fakepath/'+str(tempID)+'.csv'), globals())
        i = i + 1
Is there another/better way to dynamically create/update the variables I need so they show up in the global scope?
String keys in the globals() dictionary correspond to variable names, so you don't need exec; you can write the variable into the globals dictionary directly:
globals()["variable" + str(tempID)] = pd.read_csv('Z:/fakepath/'+str(tempID)+'.csv')
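If the global variables themselves are not a hard requirement, one alternative is to keep the loaded tables in a dictionary keyed by ID. A minimal sketch: load_tables and its loader parameter are names invented here, and a fake loader replaces the real file reading so the snippet runs without pandas or the CSV files.

```python
def load_tables(ids, loader):
    """Return {id: loaded table} for each unique id, in first-seen order."""
    tables = {}
    # dict.fromkeys drops duplicate IDs while preserving their order.
    for table_id in dict.fromkeys(ids):
        tables[table_id] = loader(table_id)
    return tables


# Stand-in loader; it only builds a label instead of reading a file.
tables = load_tables([3, 1, 3, 2], lambda table_id: f"table-{table_id}")
print(tables)
```

With pandas, the loader could be something like lambda table_id: pd.read_csv('Z:/fakepath/' + str(table_id) + '.csv'), and the IDs could come from ilist['ID'].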
