Making a database with custom commands in python - python

I'm trying to make a simple, local database using Python where I can set values, get values, etc., but I keep getting an error. Here is my code:
#Simple Database
#Functions include Set[name][value]
# Get[name]
# Unset[name]
# Numequalto[value]
# End
# Begin, Rollback, Commit
varlist = []
ops = []
class item:
    def __init__(self,name,value):
        self.name = name
        self.value = value
class db:
    def __init__(self):
        self.varlist = []
        self.ops = []
    def Set(self,nm,val):
        changed = False #Bool for keeping track of update
        for item in varlist: #run through current list
            if item.name == nm: #If the name is found
                item.value = val #update the value
                changed = True #found it
                break #exit if found
        if not changed:
            newitem = item() #Create a new one and append it
            newitem.name = nm
            newitem.value = val
            varlist.append(newitem)
    def Get(key):
        for item in varlist:
            if item.name == key:
                return item.value
                break
    def Unset(key):
        for item in varlist:
            if item.name == key:
                item.value = -1
                break
    def Numequalto(key):
        count = 0
        for item in varlist:
            if item.value == key:
                count+=1
        return count
def main():
    newdb = db()
    varlist=[]
    comm = "" #prime it
    while comm.lower() != "end":
        comm = input("Command: ")
        if comm.lower() == "begin":
            print("----SESSION START---")
            while comm.lower() != "end":
                comm = input("Command: ")
                part = []
                for word in comm.split():
                    part.append(word.lower())
                print(part)
                if part[0].lower()=="set":
                    newdb.Set(part[1],part[2])
                    print(varlist)
                elif part[0].lower()=="get":
                    gotten = Get(part[1])
                    print(gotten)
                elif part[0].lower()=="unset":
                    Unset(part[1])
                elif part[0].lower()=="numequalto":
                    numequal = Numequalto(part[1])
                    print(numequal)
            print("Finished")
        else:
            print("--ERROR: Must BEGIN--")
if __name__ == "__main__":
    main()
When I run this, and try to create a new item in my list using the command
set a 25
I get this error:
Traceback (most recent call last):
File "/Volumes/CON/LIFE/SimpleDatabase.py", line 81, in <module>
main()
File "/Volumes/CON/LIFE/SimpleDatabase.py", line 65, in main
newdb.Set(part[1],part[2])
File "/Volumes/CON/LIFE/SimpleDatabase.py", line 27, in Set
newitem = item() #Create a new one and append it
UnboundLocalError: local variable 'item' referenced before assignment
Any help would be much appreciated. I'm pretty new to Python

The line
for item in varlist:
shadows the class item with a local variable, so when you reach item() Python thinks you are trying to call the local variable instead of the class. You can tell that your class item is never being constructed, because the call would fail anyway: you are passing no arguments to __init__, which requires name and value.
Also, you should really call your class Item. Once I did that, I got the constructor error as expected.
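The shadowing can be reproduced in a few lines. This is a minimal sketch rather than the asker's code; the function names broken and fixed are made up for the illustration.

```python
class item:
    """Stand-in for the class from the question."""
    pass


def broken():
    # Using `item` as a loop variable makes it a local name for the whole
    # function, even if the loop body never runs.
    for item in []:
        pass
    return item()  # UnboundLocalError: the local `item` was never assigned


def fixed():
    # A different loop variable leaves `item` referring to the class.
    for entry in []:
        pass
    return item()


try:
    broken()
except UnboundLocalError as exc:
    print("broken():", exc)

print("fixed():", type(fixed()).__name__)
```

Renaming either the loop variable or the class (to Item, as suggested above) removes the clash.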

You have a few issues with your code:
You are shadowing the class item with a local variable of the same name.
You are using varlist instead of self.varlist.
Some of your class methods don't receive self as their first argument.
Also, there is a strong convention in Python to name classes with a capitalized first letter.
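Putting those points together, the class might look roughly like the sketch below. The names Item and Database and the lowercase method names are assumptions chosen for the example, and unset here removes the entry instead of storing -1 as the question does.

```python
class Item:
    def __init__(self, name, value):
        self.name = name
        self.value = value


class Database:
    def __init__(self):
        self.varlist = []  # instance state, always reached through self

    def set(self, name, value):
        for entry in self.varlist:  # `entry` does not shadow the Item class
            if entry.name == name:
                entry.value = value
                return
        self.varlist.append(Item(name, value))

    def get(self, name):
        for entry in self.varlist:
            if entry.name == name:
                return entry.value
        return None  # name not found

    def unset(self, name):
        # Assumption: drop the entry entirely rather than marking it with -1.
        self.varlist = [e for e in self.varlist if e.name != name]

    def numequalto(self, value):
        return sum(1 for e in self.varlist if e.value == value)


db = Database()
db.set("a", "25")
db.set("b", "25")
print(db.get("a"), db.numequalto("25"))
db.unset("b")
print(db.numequalto("25"))
```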

Not trying to be impolite, just constructive here. I'm concerned that while there are comments questioning the intent to implement your own dictionary, no answer stated this forcefully. I say this only because part of Python (beyond the semantics, language, etc.) is the culture. We speak of things being 'Pythonic' for a reason - part of the value of this language is the culture. There are two aspects here to pay attention to - first, 'Batteries Included' and second, "Don't Reinvent the Wheel". You're reimplementing the most fundamental composite (oxymoron, I know) data type in Python.
>>> a = {}
>>> a['bob'] = 1
>>> a['frank'] = 2
>>> a
{'frank': 2, 'bob': 1}
>>> del a['frank']
>>> a
{'bob': 1}
>>> del a['bob']
>>> a
{}
>>> a['george'] = 2
>>> b = len([x for x in a.values() if x == 2])
>>> b
1
And there you have it - the pythonic way of handling the functionality you're after.
If you're trying to add functionality or limitations beyond that, you're better off starting from the dict class and extending rather than rolling your own. Since Python is "duck-typed" there's a HUGE benefit to using the existing structure as your basis because it all falls into the same patterns.
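As a rough illustration of extending dict, the sketch below adds only the question's Numequalto query. SimpleDB and numequalto are names invented for the example; set, get and unset come from dict itself.

```python
class SimpleDB(dict):
    """A dict with one extra query, mirroring the question's Numequalto."""

    def numequalto(self, value):
        # Count the keys whose stored value equals `value`.
        return sum(1 for stored in self.values() if stored == value)


db = SimpleDB()
db["a"] = 25              # Set
db["b"] = 25
print(db.get("a"))        # Get (returns None for a missing key)
del db["b"]               # Unset
print(db.numequalto(25))
```

Because SimpleDB is still a dict, anything that accepts a dict (len, iteration, json.dumps and so on) keeps working with it.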

How to know the name of a class loaded as a parameter in another class - Python [duplicate]

Is it possible to get the original variable name of a variable passed to a function? E.g.
foobar = "foo"
def func(var):
print var.origname
So that:
func(foobar)
Returns:
>>foobar
EDIT:
All I was trying to do was make a function like:
def log(soup):
f = open(varname+'.html', 'w')
print >>f, soup.prettify()
f.close()
.. and have the function generate the filename from the name of the variable passed to it.
I suppose if it's not possible I'll just have to pass the variable and the variable's name as a string each time.
EDIT: To make it clear, I don't recommend using this AT ALL, it will break, it's a mess, it won't help you in any way, but it's doable for entertainment/education purposes.
You can hack around with the inspect module, I don't recommend that, but you can do it...
import inspect
def foo(a, f, b):
    frame = inspect.currentframe()
    frame = inspect.getouterframes(frame)[1]
    string = inspect.getframeinfo(frame[0]).code_context[0].strip()
    args = string[string.find('(') + 1:-1].split(',')
    names = []
    for i in args:
        if i.find('=') != -1:
            names.append(i.split('=')[1].strip())
        else:
            names.append(i)
    print(names)
def main():
    e = 1
    c = 2
    foo(e, 1000, b = c)
main()
Output:
['e', '1000', 'c']
To add to Michael Mrozek's answer, you can extract the exact parameters versus the full code by:
import re
import traceback
def func(var):
    stack = traceback.extract_stack()
    filename, lineno, function_name, code = stack[-2]
    vars_name = re.compile(r'\((.*?)\).*$').search(code).groups()[0]
    print(vars_name)
    return
foobar = "foo"
func(foobar)
# PRINTS: foobar
Looks like Ivo beat me to inspect, but here's another implementation:
import inspect
def varName(var):
    lcls = inspect.stack()[2][0].f_locals
    for name in lcls:
        if id(var) == id(lcls[name]):
            return name
    return None
def foo(x=None):
    lcl='not me'
    return varName(x)
def bar():
    lcl = 'hi'
    return foo(lcl)
bar()
# 'lcl'
Of course, it can be fooled:
def baz():
    lcl = 'hi'
    x='hi'
    return foo(lcl)
baz()
# 'x'
Moral: don't do it.
Another way you can try if you know what the calling code will look like is to use traceback:
import traceback
def func(var):
    stack = traceback.extract_stack()
    filename, lineno, function_name, code = stack[-2]
Here, code will contain the line of code that was used to call func (in your example, it would be the string func(foobar)). You can parse that to pull out the argument.
You can't. It's evaluated before being passed to the function. All you can do is pass it as a string.
@Ivo Wetzel's answer works when the function call is made on one line, like:
e = 1 + 7
c = 3
foo(e, 100, b=c)
If the function call spans several lines, like:
e = 1 + 7
c = 3
foo(e,
    1000,
    b = c)
the code below works:
import inspect, ast
def foo(a, f, b):
    frame = inspect.currentframe()
    frame = inspect.getouterframes(frame)[1]
    string = inspect.findsource(frame[0])[0]
    nodes = ast.parse(''.join(string))
    i_expr = -1
    for (i, node) in enumerate(nodes.body):
        # 'foo' here is the name of the function whose call we are looking for
        if (hasattr(node, 'value') and isinstance(node.value, ast.Call)
                and hasattr(node.value.func, 'id') and node.value.func.id == 'foo'):
            i_expr = i
            break
    i_expr_next = min(i_expr + 1, len(nodes.body)-1)
    lineno_start = nodes.body[i_expr].lineno
    lineno_end = nodes.body[i_expr_next].lineno if i_expr_next != i_expr else len(string)
    str_func_call = ''.join([i.strip() for i in string[lineno_start - 1: lineno_end]])
    params = str_func_call[str_func_call.find('(') + 1:-1].split(',')
    print(params)
You will get:
[u'e', u'1000', u'b = c']
But still, this might break.
You can use python-varname package
from varname import nameof
s = 'Hey!'
print (nameof(s))
Output:
s
Package below:
https://github.com/pwwang/python-varname
For posterity, here's some code I wrote for this task, in general I think there is a missing module in Python to give everyone nice and robust inspection of the caller environment. Similar to what rlang eval framework provides for R.
import re, inspect, ast
#Convoluted frame stack walk and source scrape to get what the calling statement to a function looked like.
#Specifically return the name of the variable passed as parameter found at position pos in the parameter list.
def _caller_param_name(pos):
    #The parameter name to return
    param = None
    #Get the frame object for this function call
    thisframe = inspect.currentframe()
    try:
        #Get the parent calling frames details
        frames = inspect.getouterframes(thisframe)
        #Function this function was just called from that we wish to find the calling parameter name for
        function = frames[1][3]
        #Get all the details of where the calling statement was
        frame,filename,line_number,function_name,source,source_index = frames[2]
        #Read in the source file in the parent calling frame up to where the call was made
        #(the with block closes the file, so no explicit close() is needed)
        with open(filename) as source_file:
            head=[next(source_file) for x in range(line_number)]
        #Build all lines of the calling statement, this deals with when a function is called with parameters listed on each line
        lines = []
        #Compile a regex for matching the start of the function being called
        regex = re.compile(r'\.?\s*%s\s*\(' % (function))
        #Work backwards from the parent calling frame line number until we see the start of the calling statement (usually the same line!!!)
        for line in reversed(head):
            lines.append(line.strip())
            if re.search(regex, line):
                break
        #Put the lines we have grokked back into source file order rather than reverse order
        lines.reverse()
        #Join all the lines that were part of the calling statement
        call = "".join(lines)
        #Grab the parameter list from the calling statement for the function we were called from
        match = re.search(r'\.?\s*%s\s*\((.*)\)' % (function), call)
        paramlist = match.group(1)
        #If the function was called with no parameters raise an exception
        if paramlist == "":
            raise LookupError("Function called with no parameters.")
        #Use the Python abstract syntax tree parser to create a parsed form of the function parameter list 'Name' nodes are variable names
        parameter = ast.parse(paramlist).body[0].value
        #If there were multiple parameters get the positional requested
        if type(parameter).__name__ == 'Tuple':
            #If we asked for a parameter outside of what was passed complain
            if pos >= len(parameter.elts):
                raise LookupError("The function call did not have a parameter at position %s" % pos)
            parameter = parameter.elts[pos]
        #If there was only a single parameter and another was requested raise an exception
        elif pos != 0:
            raise LookupError("There was only a single calling parameter found. Parameter indices start at 0.")
        #If the parameter was the name of a variable we can use it otherwise pass back None
        if type(parameter).__name__ == 'Name':
            param = parameter.id
    finally:
        #Remove the frame reference to prevent cyclic references screwing the garbage collector
        del thisframe
    #Return the parameter name we found
    return param
If you want a Key Value Pair relationship, maybe using a Dictionary would be better?
...or if you're trying to create some auto-documentation from your code, perhaps something like Doxygen (http://www.doxygen.nl/) could do the job for you?
I wondered how IceCream solves this problem. So I looked into the source code and came up with the following (slightly simplified) solution. It might not be 100% bullet-proof (e.g. I dropped get_text_with_indentation and I assume exactly one function argument), but it works well for different test cases. It does not need to parse source code itself, so it should be more robust and simpler than previous solutions.
#!/usr/bin/env python3
import inspect
from executing import Source
def func(var):
    callFrame = inspect.currentframe().f_back
    callNode = Source.executing(callFrame).node
    source = Source.for_frame(callFrame)
    expression = source.asttokens().get_text(callNode.args[0])
    print(expression, '=', var)
i = 1
f = 2.0
s = 'string'
dct = {'key': 'value'}
obj = type('', (), {'value': 42})
func(i)
func(f)
func(s)
func(dct['key'])
func(obj.value)
Output:
i = 1
f = 2.0
s = string
dct['key'] = value
obj.value = 42
Update: If you want to move the "magic" into a separate function, you simply have to go one frame further back with an additional f_back.
def get_name_of_argument():
    callFrame = inspect.currentframe().f_back.f_back
    callNode = Source.executing(callFrame).node
    source = Source.for_frame(callFrame)
    return source.asttokens().get_text(callNode.args[0])
def func(var):
    print(get_name_of_argument(), '=', var)
If you want to get the caller params as in @Matt Oates's answer without using the source file (i.e. from a Jupyter Notebook), this code (combined from @Aeon's answer) will do the trick (at least in some simple cases):
import ast
import inspect
def get_caller_params():
    # get the frame object for this function call
    thisframe = inspect.currentframe()
    # get the parent calling frames details
    frames = inspect.getouterframes(thisframe)
    # frame 0 is the frame of this function
    # frame 1 is the frame of the caller function (the one we want to inspect)
    # frame 2 is the frame of the code that calls the caller
    caller_function_name = frames[1][3]
    code_that_calls_caller = inspect.findsource(frames[2][0])[0]
    # parse code to get nodes of abstract syntax tree of the call
    nodes = ast.parse(''.join(code_that_calls_caller))
    # find the node that calls the function
    i_expr = -1
    for (i, node) in enumerate(nodes.body):
        if _node_is_our_function_call(node, caller_function_name):
            i_expr = i
            break
    # line with the call start
    idx_start = nodes.body[i_expr].lineno - 1
    # line with the end of the call
    if i_expr < len(nodes.body) - 1:
        # next expression marks the end of the call
        idx_end = nodes.body[i_expr + 1].lineno - 1
    else:
        # end of the source marks the end of the call
        idx_end = len(code_that_calls_caller)
    call_lines = code_that_calls_caller[idx_start:idx_end]
    str_func_call = ''.join([line.strip() for line in call_lines])
    str_call_params = str_func_call[str_func_call.find('(') + 1:-1]
    params = [p.strip() for p in str_call_params.split(',')]
    return params
def _node_is_our_function_call(node, our_function_name):
    node_is_call = hasattr(node, 'value') and isinstance(node.value, ast.Call)
    if not node_is_call:
        return False
    function_name_correct = hasattr(node.value.func, 'id') and node.value.func.id == our_function_name
    return function_name_correct
You can then run it like this:
def test(*par_values):
    par_names = get_caller_params()
    for name, val in zip(par_names, par_values):
        print(name, val)
a = 1
b = 2
string = 'text'
test(a, b,
     string
     )
to get the desired output:
a 1
b 2
string text
Since you can have multiple variables with the same content, instead of passing the variable (content), it might be safer (and will be simpler) to pass its name as a string and get the variable's content from the locals dictionary in the caller's stack frame:
def displayvar(name):
    import sys
    return name+" = "+repr(sys._getframe(1).f_locals[name])
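A short usage sketch of this idea follows. The demo function and its variable are invented for the illustration, and sys._getframe is a CPython implementation detail, so this is not guaranteed to work on other interpreters.

```python
import sys


def displayvar(name):
    # Look the name up in the local variables of whoever called us.
    caller_locals = sys._getframe(1).f_locals
    return name + " = " + repr(caller_locals[name])


def demo():
    count = 3  # hypothetical variable, only here for the example
    return displayvar("count")


print(demo())
```

If the name does not exist in the caller's locals, the lookup raises KeyError, which at least fails loudly instead of guessing.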
If it just so happens that the variable is a callable (function), it will have a __name__ attribute.
E.g. a wrapper to log the execution time of a function:
from time import perf_counter
def time_it(func, *args, **kwargs):
    start = perf_counter()
    result = func(*args, **kwargs)
    duration = perf_counter() - start
    print(f'{func.__name__} ran in {duration * 1000}ms')
    return result

In python, is there a way to automatically log information any time you create a variable?

Not sure if this makes sense at all, but here's an example:
Let's say I have a script. In this script, I create a list
list = [1,2,3,4]
Maybe I just don't have the technical vocabulary to find what I'm looking for, but is there any way I could set some logging up so that any time I created a variable I could store information in a log file? Given the above example, maybe I'd want to see how many elements are in the list?
I understand that I could simply write a function and call that over and over again, but let's say I might want to know information about a ton of different data types, not just lists. It wouldn't be clean to call a function repeatedly.
This is hackery, but what the heck:
class _LoggeryType(type):
    def __setattr__(cls,attr,value):
        print("SET VAR: {0} = {1}".format(attr,value))
        globals().update({attr:value})
# Python 3
class Loggery(metaclass=_LoggeryType):
    pass
# Python 2 (use this definition instead of the one above)
# class Loggery:
#     __metaclass__=_LoggeryType
Loggery.x = 5
print("OK set X={0}".format(x))
Note: I wouldn't really recommend using this.
One method would be to use the powerful sys.settrace. I've written up a small (but somewhat incomplete) example:
tracer.py:
import inspect
import sys
import os
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('tracing-logger')
FILES_TO_TRACE = [os.path.basename(__file__), 'tracee.py']
print(FILES_TO_TRACE)
def new_var(name, value, context):
    logger.debug(f"New {context} variable called {name} = {value}")
    # do some analysis here, for example
    if type(value) == list:
        logger.debug(f"\tNumber of elements: {len(value)}")
def changed_var(name, value, context):
    logger.debug(f"{context} variable called {name} was changed to: {value}")
def make_tracing_func():
    current_locals = {}
    current_globals = {}
    first_line_executed = False
    def tracing_func(frame, event, arg):
        nonlocal first_line_executed
        frame_info = inspect.getframeinfo(frame)
        filename = os.path.basename(frame_info.filename)
        line_num = frame_info.lineno
        if event == 'line':
            # check for difference with locals
            for var_name in frame.f_code.co_varnames:
                if var_name in frame.f_locals:
                    var_value = frame.f_locals[var_name]
                    if var_name not in current_locals:
                        current_locals[var_name] = var_value
                        new_var(var_name, var_value, 'local')
                    elif current_locals[var_name] != var_value:
                        current_locals[var_name] = var_value
                        changed_var(var_name, var_value, 'local')
            for var_name, var_value in frame.f_globals.items():
                if var_name not in current_globals:
                    current_globals[var_name] = var_value
                    if first_line_executed:
                        new_var(var_name, var_value, 'global')
                elif current_globals[var_name] != var_value:
                    current_globals[var_name] = var_value
                    changed_var(var_name, var_value, 'global')
            first_line_executed = True
            return tracing_func
        elif event == 'call':
            if os.path.basename(filename) in FILES_TO_TRACE:
                return make_tracing_func()
        return None
    return tracing_func
sys.settrace(make_tracing_func())
import tracee
tracee.py
my_list = [1, 2, 3, 4]
a = 3
print("tracee: I have a list!", my_list)
c = a + sum(my_list)
print("tracee: A number:", c)
c = 12
print("tracee: I changed it:", c)
Output:
DEBUG:tracing-logger:New global variable called my_list = [1, 2, 3, 4]
DEBUG:tracing-logger: Number of elements: 4
DEBUG:tracing-logger:New global variable called a = 3
tracee: I have a list! [1, 2, 3, 4]
DEBUG:tracing-logger:New global variable called c = 13
tracee: A number: 13
DEBUG:tracing-logger:global variable called c was changed to: 12
tracee: I changed it: 12
There are some additional cases you may want to handle (duplicated changes to globals due to function calls, closure variables, etc.). You can also use linecache to find the contents of the lines, or use the line_num variable in the logging.
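For a smaller, self-contained view of the same mechanism, the sketch below traces a single function and records each local variable the first time it appears. The helper record_new_locals and the sample function are hypothetical names, and the sketch ignores globals, reassignments and nested calls.

```python
import sys


def record_new_locals(func, *args):
    """Run func(*args) and return (name, value) pairs for each new local."""
    seen = set()
    events = []

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # do not trace any other function
        if event in ("line", "return"):
            for name, value in frame.f_locals.items():
                if name not in seen:
                    seen.add(name)
                    events.append((name, value))
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace function
    return events


def sample():
    numbers = [1, 2, 3, 4]
    total = sum(numbers)
    return total


events = record_new_locals(sample)
for name, value in events:
    print("new local:", name, "=", value)
```

A 'line' event fires before the line runs, so a variable is reported when the following line (or the function's return) is reached.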

Create objects in Python based on file

I'm coding a game in Python 3 and I need to create an unknown number of objects, with each object's properties based on the contents of a file.
To explain, I'll dump some code here:
class attack(object):
    def __init__(self, name, power):
        self.name = name
        self.element = int(power)
from linecache import getline
Attacks = []
count = 1
while 1==1:
    line=getline("Attacks.txt", count)
    line = line.rstrip()
    if line == "":
        break
    else:
        linelist = line.split()
        #something involving "attack(linelist[1], linelist[2])"
        Attacks.append(item)
        count += 1
"Attacks.txt" contains this:
0 Punch 2
1 Kick 3
2 Throw 4
3 Dropkick 6
4 Uppercut 8
When the code is done, the list "Attacks" should contain 5 attack objects, one for each line of "Attacks.txt" with the listed name and power. The name is for the user only; in the code, each object will only be called for by its place in its list.
The idea is that the end user can change "Attacks.txt" (and other similar files) to add, remove or change entries; that way, they can modify my game without digging around in the actual code.
The issue is I have no idea how to create objects on the fly like this or if I even can. I already have working code that builds a list from a file; the only problem is the object creation.
My question, simply put, is how do I do this?
I had the same problem once:
How to call class constructor having its name in text variable? [Python]
You obviously have to define the classes whose names are in the file. I assume that is done. You also need to have them in the current module namespace, globals():
from somelib import Punch, Kick, Throw, Dropkick, Uppercut
globals()[class_name](x, y)
line = getline("Attacks.txt", count)
line = line.rstrip()
linelist = line.split()
class_name = linelist[1]
value = linelist[2]
class_object = globals()[class_name]
item = class_object(value)
# or shortly in one line:
# item = globals()[linelist[1]](linelist[2])
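A variation on the same idea is to look the class up in an explicit dictionary instead of globals(), so that only the classes you list can be created from the file. The sketch below assumes two illustrative classes, Punch and Kick, and a hypothetical helper make_attack; none of these names come from the question.

```python
class Punch:
    def __init__(self, power):
        self.power = int(power)


class Kick:
    def __init__(self, power):
        self.power = int(power)


# Explicit registry: file text -> class. Unknown names raise KeyError.
ATTACK_CLASSES = {"Punch": Punch, "Kick": Kick}


def make_attack(line):
    _index, class_name, power = line.split()
    return ATTACK_CLASSES[class_name](power)


attack = make_attack("1 Kick 3")
print(type(attack).__name__, attack.power)
```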
You could create a class like this, overloading operators to support the operations:
class Operation:
    def __init__(self, *header):
        self.__dict__ = dict(zip(['attack', 'power'], header))
class Attack:
    def __init__(self, *headers):
        self.__dict__ = {"attack{}".format(i):Operation(*a) for i, a in enumerate(headers, start=1)}
    def __setitem__(self, attack_type, new_power):
        self.__dict__ = {a:Operation(attack_type, new_power) if b.attack == attack_type else b for a, b in self.__dict__.items()}
    def __getitem__(self, attack):
        return [b.power for _, b in self.__dict__.items() if b.attack == attack]
    @property
    def power_listings(self):
        return '\n'.join(['{} {}'.format(*[b.attack, b.power]) for _, b in self.__dict__.items()])
with open('filename.txt') as f:
    f = [i.strip('\n').split() for i in f]
a = Attack(*f)
print(a.power_listings)
a['Throw'] = 6 #updating the power of any occurrence of Throw
Output:
Throw 6
Kick 3
Punch 2
Uppercut 8
Dropkick 6
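For the file format shown in the question (index, name, power on each line), a plain loop that calls the constructor once per line may already be enough. The sketch below is one way to do that under those assumptions; Attack and load_attacks are illustrative names, and io.StringIO stands in for the real "Attacks.txt" so the example is self-contained.

```python
import io


class Attack:
    def __init__(self, name, power):
        self.name = name
        self.power = int(power)


def load_attacks(lines):
    """Build one Attack per non-empty line of 'index name power'."""
    attacks = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        _index, name, power = fields
        attacks.append(Attack(name, power))
    return attacks


# Stand-in for open("Attacks.txt"); any iterable of lines works.
sample_file = io.StringIO(
    "0 Punch 2\n1 Kick 3\n2 Throw 4\n3 Dropkick 6\n4 Uppercut 8\n"
)
attacks = load_attacks(sample_file)
print([(a.name, a.power) for a in attacks])
```

With a real file, the call would be wrapped in a with open("Attacks.txt") as f: block and f passed to load_attacks.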

NameError in Python - not sure if the class or the function is causing the error

I have a text file describing resumes where each line looks like:
name university sex filename
So one line would say something like
John Texas M resume1.doc
The file has standard formatting and does not contain any errors. There are four possible names and four possible universities, randomized to create 64 resumes. I'm trying to write a program that reads through the text file, creates a resume object with attributes for the name, university, sex, and filename, and adds these objects to a list of resume objects. I have a lot of experience in C++, but this is my first Python program and I'm getting thrown off by an error:
File "mycode.py", line 142, in <module>
resumes()
File "mycode.py", line 65, in resumes
r = resume(name,uni,sex,filename)
NameError: global name "name" is not defined
My code looks like:
class resume:
    def __init__(self, name, uni, sex, filename)
        self.name = name
        self.uni = uni
        self.sex = sex
        self.filename = filename
mylist[]
def resumes():
    f = open("resumes.txt",'r')
    for line in f:
        for word in line.split():
            if word == ("John" or "Fred" or "Jim" or "Michael"):
                name = word
            elif word == ("Texas" or "Georgia" or "Florida" or "Montana"):
                uni = word
            elif word == "M":
                sex = word
            elif re.match(r'\w\.doc',word):
                filename = word
        r = resume(name,uni,sex,filename)
        mylist.insert(r)
I'm not sure if the error is in the class or the function. My computer isn't showing any syntax errors but I'm new to this so if there are, please feel free to tell me how to fix them.
I've tried defining name, uni, etc. outside the "for word in line.split()" loop but the program still had an issue with the line "r = resume(name,uni,sex,filename)" so I'm not sure what the issue is. I've read through other answers about NameError but I'm new to Python and couldn't figure out the equivalent problem in my code.
The NameError is caused by undefined variables in cases where no values are found in the text file. Define them within the function before you try to assign values from the text file to them:
def resumes():
    f = open("resumes.txt",'r')
    for line in f:
        name = ""
        uni = ""
        sex = ""
        filename = ""
        for word in line.split():
            ...
You can also pre-define the variables in your class initialization by using keyword arguments if you like (this isn't the cause of the NameError though):
class resume:
    def __init__(self, name="", uni="", sex="", filename=""):
        self.name = name
        self.uni = uni
        self.sex = sex
        self.filename = filename
Defining a list in python is done by typing mylist = [], not mylist[]. Also, at the moment, the list would be defined in the global namespace which is generally discouraged. Instead, you can make resumes return a list and assign this value to mylist:
def resumes():
    resume_list = []
    f = open("resumes.txt",'r')
    for line in f:
        for word in line.split():
            if word == ("John" or "Fred" or "Jim" or "Michael"):
                name = word
            elif word == ("Texas" or "Georgia" or "Florida" or "Montana"):
                uni = word
            elif word == "M":
                sex = word
            elif re.match(r'\w\.doc',word):
                filename = word
        r = resume(name,uni,sex,filename)
        resume_list.insert(r)
    return resume_list
Then you can do the following anywhere in your code:
mylist = resumes()
Remember to close files after opening them; in your case by calling f.close() after processing all the lines. Even better, have python manage it automatically by using the context manager with so you don't have to call f.close():
def resumes():
    with open("resumes.txt",'r') as f:
        for line in f:
            ...
Typically, you'd use append rather than insert when working with lists. insert takes two arguments (position/index, and the element to insert) so mylist.insert(r) should raise a TypeError: insert() takes exactly 2 arguments (1 given). Instead, do mylist.append(r) to insert r after the last element in the list.
As johnrsharpe pointed out in the comments, your word comparisons probably aren't doing what you expect. See this example:
>>> word = "John"
>>> word == ("John" or "Fred" or "Jim" or "Michael")
True
>>> word = "Fred"
>>> word == ("John" or "Fred" or "Jim" or "Michael")
False
>>>
Instead, use a tuple or a set and the keyword in to check if word equals any of the four names:
>>> word = "John"
>>> word in {"John", "Fred", "Jim", "Michael"}
True
>>> word = "Fred"
>>> word in {"John", "Fred", "Jim", "Michael"}
True
>>>
>>> type({"John", "Fred", "Jim", "Michael"})
<type 'set'>
>>>
Finally, as Daniel pointed out, remember the colon, :, after function definitions such as def __init__(...)
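Taken together, the fixes above could look something like the sketch below. It is one possible arrangement, not the asker's final code: the names Resume and parse_resumes are illustrative, "F" is an assumed second value for sex (the question only shows "M"), and the filename pattern \w+\.doc is an assumption that differs from the question's \w\.doc, which would not match a name like resume1.doc.

```python
import re

NAMES = {"John", "Fred", "Jim", "Michael"}
UNIVERSITIES = {"Texas", "Georgia", "Florida", "Montana"}
SEXES = {"M", "F"}  # assumption: the question only shows "M"


class Resume:
    def __init__(self, name="", uni="", sex="", filename=""):
        self.name = name
        self.uni = uni
        self.sex = sex
        self.filename = filename


def parse_resumes(lines):
    resume_list = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        # Defaults guarantee every name exists even if nothing matches.
        name = uni = sex = filename = ""
        for word in line.split():
            if word in NAMES:
                name = word
            elif word in UNIVERSITIES:
                uni = word
            elif word in SEXES:
                sex = word
            elif re.fullmatch(r"\w+\.doc", word):
                filename = word
        resume_list.append(Resume(name, uni, sex, filename))
    return resume_list


resumes = parse_resumes(
    ["John Texas M resume1.doc\n", "\n", "Fred Montana M resume2.doc\n"]
)
print([(r.name, r.uni, r.sex, r.filename) for r in resumes])
```

To read the real file, open it with a with block and pass the file object to parse_resumes, since a file iterates line by line.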
Your code is throwing a NameError because, at some point while iterating over your file, no word on a line satisfies the condition in this line of your function: if word == ("John" or "Fred" or "Jim" or "Michael"):, so name doesn't get defined.
The simplest way to work around this error is to assign default values to your variables outside the scopes of your class and function (or within the scope of your function):
name = "name"
uni = "uni"
sex = "sex"
filename = "filename"
class resume:
    # rest of your code
As an alternative, you could include conditional checks within your function for your variables; if the variable isn't yet defined, assign it a default value:
if "name" not in locals():
    name = "name"
r = resume(name,uni,sex,filename)
Finally, you'll want to append a colon to this line, from this:
def __init__(self, name, uni, sex, filename)
to this:
def __init__(self, name, uni, sex, filename):
Also change the line where you initialize mylist from this:
mylist[]
to this:
mylist = []
and change:
mylist.insert(r)
to:
mylist.append(r)

Python recursive setattr()-like function for working with nested dictionaries [duplicate]

There are a lot of good getattr()-like functions for parsing nested dictionary structures, such as:
Finding a key recursively in a dictionary
Suppose I have a python dictionary , many nests
https://gist.github.com/mittenchops/5664038
I would like to make a parallel setattr(). Essentially, given:
cmd = 'f[0].a'
val = 'whatever'
x = {"a":"stuff"}
I'd like to produce a function such that I can assign:
x['f'][0]['a'] = val
More or less, this would work the same way as:
setattr(x,'f[0].a',val)
to yield:
>>> x
{"a":"stuff","f":[{"a":"whatever"}]}
I'm currently calling it setByDot():
setByDot(x,'f[0].a',val)
One problem with this is that if a key in the middle doesn't exist, you need to check for and make an intermediate key if it doesn't exist---ie, for the above:
>>> x = {"a":"stuff"}
>>> x['f'][0]['a'] = val
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'f'
So, you first have to make:
>>> x['f']=[{}]
>>> x
{'a': 'stuff', 'f': [{}]}
>>> x['f'][0]['a']=val
>>> x
{'a': 'stuff', 'f': [{'a': 'whatever'}]}
Another is that keying when the next item is a list differs from keying when the next item is a string, i.e.:
>>> x = {"a":"stuff"}
>>> x['f']=['']
>>> x['f'][0]['a']=val
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
...fails because the assignment was for a null string instead of a null dict. The null dict will be the right assignment for every non-list in dict until the very last one---which may be a list, or a value.
A second problem, pointed out in the comments below by @TokenMacGuy, is that when you have to create a list that does not exist, you may have to create an awful lot of blank values. So,
setattr(x,'f[10].a',val)
---may mean the algorithm will have to make an intermediate like:
>>> x['f']=[{},{},{},{},{},{},{},{},{},{},{}]
>>> x['f'][10]['a']=val
to yield
>>> x
{"a":"stuff","f":[{},{},{},{},{},{},{},{},{},{},{"a":"whatever"}]}
such that this is the setter associated with the getter...
>>> getByDot(x,"f[10].a")
"whatever"
More importantly, the intermediates should /not/ overwrite values that already exist.
Below is the junky idea I have so far---I can identify the lists versus dicts and other data types, and create them where they do not exist. However, I don't see (a) where to put the recursive call, or (b) how to 'build' the deep object as I iterate through the list, and (c) how to distinguish the /probing/ I'm doing as I construct the deep object from the /setting/ I have to do when I reach the end of the stack.
def setByDot(obj,ref,newval):
ref = ref.replace("[",".[")
cmd = ref.split('.')
numkeys = len(cmd)
count = 0
for c in cmd:
count = count+1
while count < numkeys:
if c.find("["):
idstart = c.find("[")
numend = c.find("]")
try:
deep = obj[int(idstart+1:numend-1)]
except:
obj[int(idstart+1:numend-1)] = []
deep = obj[int(idstart+1:numend-1)]
else:
try:
deep = obj[c]
except:
if obj[c] isinstance(dict):
obj[c] = {}
else:
obj[c] = ''
deep = obj[c]
setByDot(deep,c,newval)
This seems very tricky because you kind of have to look-ahead to check the type of the /next/ object if you're making place-holders, and you have to look-behind to build a path up as you go.
UPDATE
I recently had this question answered, too, which might be relevant or helpful.
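To make the look-ahead concrete, here is a rough, non-recursive sketch of such a setter. It is only an illustration under assumptions: set_by_dot, parse_path and _pad are hypothetical names, paths are limited to the key[index].key form used above, missing list slots are padded with empty containers, and an existing non-container value in the middle of a path will raise a TypeError rather than be overwritten.

```python
import re

# Matches either a bracketed index like [10] or a bare key like f.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def parse_path(path):
    """Turn 'f[10].a' into ['f', 10, 'a']; ints are list indices."""
    return [int(idx) if idx else key for idx, key in _TOKEN.findall(path)]


def _pad(lst, index, make_filler):
    # Grow the list until lst[index] exists, without touching old items.
    while len(lst) <= index:
        lst.append(make_filler())


def set_by_dot(obj, path, value):
    keys = parse_path(path)
    current = obj
    for key, next_key in zip(keys, keys[1:]):
        # Look ahead: the next key decides which container is needed here.
        make = list if isinstance(next_key, int) else dict
        if isinstance(key, int):
            _pad(current, key, make)
        elif key not in current:
            current[key] = make()
        current = current[key]
    last = keys[-1]
    if isinstance(last, int):
        _pad(current, last, lambda: None)
    current[last] = value
    return obj


x = {"a": "stuff"}
set_by_dot(x, "f[0].a", "whatever")
print(x)
```

Because intermediates are only created when missing, a second call such as set_by_dot(x, "f[0].b", 1) adds a key next to the existing one instead of replacing it.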
I have separated this out into two steps. In the first step, the query string is broken down into a series of instructions. This way the problem is decoupled, we can view the instructions before running them, and there is no need for recursive calls.
def build_instructions(obj, q):
    """
    Breaks down a query string into a series of actionable instructions.
    Each instruction is a (_type, arg) tuple.
    arg -- The key used for the __getitem__ or __setitem__ call on
           the current object.
    _type -- Used to determine the data type for the value of
             obj.__getitem__(arg)
    If a key/index is missing, _type is used to initialize an empty value.
    In this way _type provides the ability to
    """
    arg = []
    _type = None
    instructions = []
    for i, ch in enumerate(q):
        if ch == "[":
            # Begin list query
            if _type is not None:
                arg = "".join(arg)
                if _type == list and arg.isalpha():
                    _type = dict
                instructions.append((_type, arg))
                _type, arg = None, []
            _type = list
        elif ch == ".":
            # Begin dict query
            if _type is not None:
                arg = "".join(arg)
                if _type == list and arg.isalpha():
                    _type = dict
                instructions.append((_type, arg))
                _type, arg = None, []
            _type = dict
        elif ch.isalnum():
            if i == 0:
                # Query begins with alphanum, assume dict access
                _type = type(obj)
            # Fill out args
            arg.append(ch)
        else:
            TypeError("Unrecognized character: {}".format(ch))
    if _type is not None:
        # Finish up last query
        instructions.append((_type, "".join(arg)))
    return instructions
For your example
>>> x = {"a": "stuff"}
>>> print(build_instructions(x, "f[0].a"))
[(<type 'dict'>, 'f'), (<type 'list'>, '0'), (<type 'dict'>, 'a')]
The expected return value is simply the _type (first item) of the next tuple in the instructions. This is very important because it allows us to correctly initialize/reconstruct missing keys.
This means that our first instruction operates on a dict, either sets or gets the key 'f', and is expected to return a list. Similarly, our second instruction operates on a list, either sets or gets the index 0 and is expected to return a dict.
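To make the "expected return value" idea concrete, here is a small sketch that pairs each key with the type of the next instruction. The helper name expected_types is hypothetical (it is not part of the answer's code), and the instruction list is simply the one shown for "f[0].a" above.

```python
# Instruction list as shown above for the query "f[0].a".
instructions = [(dict, "f"), (list, "0"), (dict, "a")]

def expected_types(instructions):
    """Pair each key with the container type its lookup should return.

    The last key has no successor, so its expected type is None:
    it holds the final value rather than an intermediate container.
    """
    next_types = [t for t, _ in instructions[1:]] + [None]
    return [(arg, nxt) for (_, arg), nxt in zip(instructions, next_types)]

pairs = expected_types(instructions)
for arg, nxt in pairs:
    print(arg, "->", nxt.__name__ if nxt else "final value")
```

So a missing 'f' would be initialized as a list and a missing index 0 as a dict, which is exactly the look-ahead the question was worried about.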
Now let's create our _setattr function. This gets the proper instructions and goes through them, creating key-value pairs as necessary. Finally, it also sets the val we give it.
def _setattr(obj, query, val):
    """
    This is a special setattr function that will take in a string query,
    interpret it, add the appropriate data structure to obj, and set val.
    We only define two actions that are available in our query string:
    .x -- dict.__setitem__(x, ...)
    [x] -- list.__setitem__(x, ...) OR dict.__setitem__(x, ...)
           the calling context determines how this is interpreted.
    """
    instructions = build_instructions(obj, query)
    for i, (_, arg) in enumerate(instructions[:-1]):
        _type = instructions[i + 1][0]
        obj = _set(obj, _type, arg)
    _type, arg = instructions[-1]
    _set(obj, _type, arg, val)
def _set(obj, _type, arg, val=None):
    """
    Helper function for calling obj.__setitem__(arg, val or _type()).
    """
    if val is not None:
        # Time to set our value
        _type = type(val)
    if isinstance(obj, dict):
        if val is not None:
            # Final step: store the value, replacing any previous one
            obj[arg] = val
        elif arg not in obj:
            # If key isn't in obj, initialize it with _type()
            obj[arg] = _type()
        obj = obj[arg]
    elif isinstance(obj, list):
        n = len(obj)
        arg = int(arg)
        if n <= arg:
            # Need to amplify our list, initialize empty values with _type()
            obj.extend([_type() for x in range(arg - n + 1)])
        if val is not None:
            # Final step: store the value; existing intermediates are kept
            obj[arg] = val
        obj = obj[arg]
    return obj
And just because we can, here's a _getattr function.
def _getattr(obj, query):
    """
    Very similar to _setattr. Instead of setting attributes they will be
    returned. As expected, an error will be raised if a __getitem__ call
    fails.
    """
    instructions = build_instructions(obj, query)
    for i, (_, arg) in enumerate(instructions[:-1]):
        _type = instructions[i + 1][0]
        obj = _get(obj, _type, arg)
    _type, arg = instructions[-1]
    return _get(obj, _type, arg)

def _get(obj, _type, arg):
    """
    Helper function for calling obj.__getitem__(arg).
    """
    if isinstance(obj, dict):
        obj = obj[arg]
    elif isinstance(obj, list):
        arg = int(arg)
        obj = obj[arg]
    return obj
In action:
>>> x = {"a": "stuff"}
>>> _setattr(x, "f[0].a", "test")
>>> print x
{'a': 'stuff', 'f': [{'a': 'test'}]}
>>> print _getattr(x, "f[0].a")
test
>>> x = ["one", "two"]
>>> _setattr(x, "3[0].a", "test")
>>> print x
['one', 'two', [], [{'a': 'test'}]]
>>> print _getattr(x, "3[0].a")
test
Now for some cool stuff. Unlike Python, our _setattr function can set unhashable dict keys.
>>> x = []
>>> _setattr(x, "1.4", "asdf")
>>> print x  # x is a list, which isn't hashable
[{}, {'4': 'asdf'}]
>>> y = {"a": "stuff"}
>>> _setattr(y, "f[1.4]", "test") # We're indexing f with 1.4, which is a list!
>>> print y
{'a': 'stuff', 'f': [{}, {'4': 'test'}]}
>>> print _getattr(y, "f[1.4]") # Works for _getattr too
test
We aren't really using unhashable dict keys, but it looks like we are in our query language, so who cares, right?
Finally, you can run multiple _setattr calls on the same object. Give it a try yourself.
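For comparison, here is a compact, self-contained sketch of the same two-step idea (tokenize first, then walk) written for Python 3. It is not the code from this answer: the names parse_path, set_by_dot and _ensure are made up for illustration, and it assumes keys are plain identifiers and indices are non-negative integers in brackets. Missing intermediates are created from the type of the next token, and intermediates that already exist are left alone.

```python
import re

# A token is either a dict key (optionally preceded by a dot) or "[<int>]".
_TOKEN = re.compile(r"\.?([A-Za-z_]\w*)|\[(\d+)\]")

def parse_path(path):
    """Turn 'f[10].a' into ['f', 10, 'a'] (str = dict key, int = list index)."""
    tokens, pos = [], 0
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError("cannot parse %r at position %d" % (path, pos))
        key, index = match.groups()
        tokens.append(key if key is not None else int(index))
        pos = match.end()
    if not tokens:
        raise ValueError("empty path")
    return tokens

def _ensure(container, token, factory):
    """Return container[token], creating it with factory() only if missing."""
    if isinstance(token, int):
        while len(container) <= token:   # pad the list up to the index
            container.append(factory())
    elif token not in container:
        container[token] = factory()
    return container[token]

def set_by_dot(obj, path, value):
    """Set value at path, building missing dicts/lists along the way."""
    tokens = parse_path(path)
    # Look ahead: the next token decides what a missing placeholder must be.
    for token, nxt in zip(tokens, tokens[1:]):
        factory = list if isinstance(nxt, int) else dict
        obj = _ensure(obj, token, factory)
    last = tokens[-1]
    if isinstance(last, int):
        while len(obj) <= last:
            obj.append(None)
    obj[last] = value

x = {"a": "stuff"}
set_by_dot(x, "f[10].a", "whatever")
set_by_dot(x, "f[10].b", 1)   # second call keeps the existing 'a'
print(x["f"][10])
```

This sketch does not try to repair an existing intermediate of the wrong type (for example a string where a dict is needed); in that case the underlying TypeError simply propagates.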
>>> class D(dict):
... def __missing__(self, k):
... ret = self[k] = D()
... return ret
...
>>> x=D()
>>> x['f'][0]['a'] = 'whatever'
>>> x
{'f': {0: {'a': 'whatever'}}}
You can hack something together by solving two problems:
A list that grows automatically when accessed out of bounds (PaddedList)
A way to delay the decision of what to create (list or dict) until it is accessed for the first time (DictOrList)
So the code will look like this:
import collections

class PaddedList(list):
    """ List that grows automatically up to the max index ever passed"""
    def __init__(self, padding):
        self.padding = padding

    def __getitem__(self, key):
        if isinstance(key, int) and len(self) <= key:
            self.extend(self.padding() for i in xrange(key + 1 - len(self)))
        return super(PaddedList, self).__getitem__(key)

class DictOrList(object):
    """ Object proxy that delays the decision of being a List or Dict """
    def __init__(self, parent):
        self.parent = parent

    def __getitem__(self, key):
        # Type of the structure depends on the type of the key
        if isinstance(key, int):
            obj = PaddedList(MyDict)
        else:
            obj = MyDict()
        # Update parent references with the selected object
        parent_seq = (self.parent if isinstance(self.parent, dict)
                      else xrange(len(self.parent)))
        for i in parent_seq:
            if self == parent_seq[i]:
                parent_seq[i] = obj
                break
        return obj[key]

class MyDict(collections.defaultdict):
    def __missing__(self, key):
        ret = self[key] = DictOrList(self)
        return ret

def pprint_mydict(d):
    """ Helper to print MyDict as dicts """
    print d.__str__().replace('defaultdict(None, {', '{').replace('})', '}')

x = MyDict()
x['f'][0]['a'] = 'whatever'
y = MyDict()
y['f'][10]['a'] = 'whatever'
pprint_mydict(x)
pprint_mydict(y)
And the output of x and y will be:
{'f': [{'a': 'whatever'}]}
{'f': [{}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {'a': 'whatever'}]}
The trick consists of creating a defaultdict of objects that can be either a dict or a list, depending on how you access them.
So when you have the assignment x['f'][10]['a'] = 'whatever', it works as follows:
Get x['f']. It won't exist, so a DictOrList object is returned for the key 'f'.
Get x['f'][10]. DictOrList.__getitem__ is called with an integer index, so the DictOrList object replaces itself in the parent collection with a PaddedList.
Accessing the 11th element of the PaddedList grows it by 11 elements and returns the MyDict element at that position.
Assign "whatever" to x['f'][10]['a'].
Both PaddedList and DictOrList are a bit hacky, but after all the assignments there is no more magic: you have a structure of dicts and lists.
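The padding step on its own can be sketched in Python 3 like this. It is a minimal, standalone version of the PaddedList idea only (no DictOrList), and the attribute name factory is my own choice rather than anything from the code above.

```python
class PaddedList(list):
    """List that pads itself with factory() values when read out of bounds."""

    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __getitem__(self, index):
        # Only grow for non-negative integer indices past the current end;
        # slices and negative indices fall through to normal list behaviour.
        if isinstance(index, int) and index >= len(self):
            self.extend(self.factory() for _ in range(index + 1 - len(self)))
        return super().__getitem__(index)

padded = PaddedList(dict)
padded[3]["a"] = "whatever"   # reading index 3 grows the list to 4 dicts
print(padded)
```

Note that the growth happens on reads, so merely looking at padded[100] leaves 101 elements behind. That side effect is part of what makes the approach hacky.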
It is possible to synthesize recursively setting items/attributes by overriding __getitem__ to return a proxy that can set a value in the original object.
I happen to be working on a library that does a few things similar to this, so I was working on a class that can dynamically assign its own subclasses at instantiation. It makes working with this sort of thing easier, but if that kind of hacking makes you squeamish, you can get similar behavior by creating a ProxyObject similar to the one I create and by creating the individual classes used by the ProxyObject dynamically in a function. Something like
class ProxyObject(object):
    ...  # see below

def instantiateProxyObject(val):
    class ProxyClassForVal(ProxyObject, val.__class__):
        pass
    return ProxyClassForVal(val)
If you implement it this way, caching the generated classes in a dictionary, as FlexibleObject does below, would make it significantly more efficient. The code I'm providing uses FlexibleObject, though. Right now it only supports classes that, like almost all of Python's builtin classes, can be constructed by passing an instance of themselves as the sole argument to their __init__/__new__. In the next week or two, I'll add support for anything pickleable and link to a GitHub repository that contains it. Here's the code:
class FlexibleObject(object):
    """ A FlexibleObject is a baseclass for allowing type to be declared
    at instantiation rather than in the declaration of the class.
    Usage:
    class DoubleAppender(FlexibleObject):
        def append(self,x):
            super(self.__class__,self).append(x)
            super(self.__class__,self).append(x)
    instance1 = DoubleAppender(list)
    instance2 = DoubleAppender(bytearray)
    """
    classes = {}

    def __new__(cls,supercls,*args,**kws):
        if isinstance(supercls,type):
            supercls = (supercls,)
        else:
            supercls = tuple(supercls)
        if (cls,supercls) in FlexibleObject.classes:
            return FlexibleObject.classes[(cls,supercls)](*args,**kws)
        superclsnames = tuple([c.__name__ for c in supercls])
        name = '%s%s' % (cls.__name__,superclsnames)
        d = dict(cls.__dict__)
        d['__class__'] = cls
        if cls == FlexibleObject:
            d.pop('__new__')
        try:
            d.pop('__weakref__')
        except:
            pass
        d['__dict__'] = {}
        newcls = type(name,supercls,d)
        FlexibleObject.classes[(cls,supercls)] = newcls
        return newcls(*args,**kws)
Then, to use this to synthesize looking up attributes and items of a dictionary-like object, you can do something like this:
class ProxyObject(FlexibleObject):
    @classmethod
    def new(cls,obj,quickrecdict,path,attribute_marker):
        self = ProxyObject(obj.__class__,obj)
        self.__dict__['reference'] = quickrecdict
        self.__dict__['path'] = path
        self.__dict__['attr_mark'] = attribute_marker
        return self

    def __getitem__(self,item):
        path = self.__dict__['path'] + [item]
        ref = self.__dict__['reference']
        return ref[tuple(path)]

    def __setitem__(self,item,val):
        path = self.__dict__['path'] + [item]
        ref = self.__dict__['reference']
        ref.dict[tuple(path)] = ProxyObject.new(val,ref,
                                                path,self.__dict__['attr_mark'])

    def __getattribute__(self,attr):
        if attr == '__dict__':
            return object.__getattribute__(self,'__dict__')
        path = self.__dict__['path'] + [self.__dict__['attr_mark'],attr]
        ref = self.__dict__['reference']
        return ref[tuple(path)]

    def __setattr__(self,attr,val):
        path = self.__dict__['path'] + [self.__dict__['attr_mark'],attr]
        ref = self.__dict__['reference']
        ref.dict[tuple(path)] = ProxyObject.new(val,ref,
                                                path,self.__dict__['attr_mark'])

class UniqueValue(object):
    pass

class QuickRecursiveDict(object):
    def __init__(self,dictionary={}):
        self.dict = dictionary
        self.internal_id = UniqueValue()
        self.attr_marker = UniqueValue()

    def __getitem__(self,item):
        if item in self.dict:
            val = self.dict[item]
            try:
                if val.__dict__['path'][0] == self.internal_id:
                    return val
                else:
                    raise TypeError
            except:
                return ProxyObject.new(val,self,[self.internal_id,item],
                                       self.attr_marker)
        try:
            if item[0] == self.internal_id:
                return ProxyObject.new(KeyError(),self,list(item),
                                       self.attr_marker)
        except TypeError:
            pass #Item isn't iterable
        return ProxyObject.new(KeyError(),self,[self.internal_id,item],
                               self.attr_marker)

    def __setitem__(self,item,val):
        self.dict[item] = val
The particulars of the implementation will vary depending on what you want. It's obviously significantly easier to just override __getitem__ in the proxy than it is to override both __getitem__ and __getattribute__ or __getattr__. The syntax you are using in setByDot makes it look like you would be happiest with some solution that overrides a mixture of the two.
If you are just using the dictionary to compare values (using ==, <=, >=, etc.), overriding __getattribute__ works really nicely. If you want to do something more sophisticated, you will probably be better off overriding __getattr__ and doing some checks in __setattr__ to determine whether you want to synthesize setting the attribute by setting a value in the dictionary or whether you want to actually set the attribute on the item you've obtained. Or you might want to handle it so that if your object has an attribute, __getattribute__ returns a proxy to that attribute and __setattr__ always just sets the attribute in the object (in which case, you can completely omit it). All of these things depend on exactly what you are trying to use the dictionary for.
You also may want to create __iter__ and the like. It takes a little bit of effort to make them, but the details should follow from the implementation of __getitem__ and __setitem__.
Finally, I'm going to briefly summarize the behavior of the QuickRecursiveDict in case it's not immediately clear from inspection. The try/excepts are just shorthand for checking whether the ifs can be performed. The one major defect of synthesizing the recursive setting rather than finding a way to really do it is that you can no longer raise KeyErrors when you try to access a key that hasn't been set. However, you can come pretty close by returning a subclass of KeyError, which is what I do in the example. I haven't tested it so I won't add it to the code, but you may want to pass in some human-readable representation of the key to KeyError.
But aside from all that it works rather nicely.
>>> qrd = QuickRecursiveDict()
>>> qrd[0][13] # returns an instance of a subclass of KeyError
>>> qrd[0][13] = 9
>>> qrd[0][13] # 9
>>> qrd[0][13]['forever'] = 'young'
>>> qrd[0][13] # 9
>>> qrd[0][13]['forever'] # 'young'
>>> qrd[0] # returns an instance of a subclass of KeyError
>>> qrd[0] = 0
>>> qrd[0] # 0
>>> qrd[0][13]['forever'] # 'young'
One more caveat: the thing being returned is not quite what it looks like. It's a proxy for what it looks like. If you want the int 9, you need int(qrd[0][13]), not qrd[0][13]. For ints this doesn't matter much, since +, -, = and all that bypass __getattribute__, but for lists you would lose attributes like append if you didn't recast them. (You'd keep len and other builtin methods, just not attributes of list. You lose __len__.)
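If the full machinery above is more than you need, the core trick (a proxy returned from __getitem__ that only writes on assignment) can be sketched much more compactly. This is a simplified variant, not the QuickRecursiveDict above: the class name PathProxy and its get method are hypothetical, it only handles nested dicts, and, unlike the flat tuple-keyed store above, a key cannot hold both a value and children at the same time.

```python
class PathProxy(object):
    """Remembers the keys used to reach it; data is only touched on writes."""

    def __init__(self, root, path=()):
        self._root = root
        self._path = path

    def __getitem__(self, key):
        # Reading never modifies the data; it only extends the recorded path.
        return PathProxy(self._root, self._path + (key,))

    def __setitem__(self, key, value):
        # Walk the recorded path, creating missing dicts, then store the value.
        node = self._root
        for step in self._path:
            node = node.setdefault(step, {})
        node[key] = value

    def get(self):
        """Resolve the recorded path; raises KeyError if any key is missing."""
        node = self._root
        for step in self._path:
            node = node[step]
        return node

data = {}
proxy = PathProxy(data)
proxy[0][13]["forever"] = "young"
print(data)
print(proxy[0][13]["forever"].get())
```

Because lookups are deferred until get() is called, a missing key still raises a real KeyError here, at the cost of an explicit call to unwrap the value.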
So that's it. The code's a little bit convoluted, so let me know if you have any questions. I probably can't answer them until tonight unless the answer's really brief. I wish I saw this question sooner, it's a really cool question, and I'll try to update a cleaner solution soon. I had fun trying to code a solution into the wee hours of last night. :)
