If I have a program running on a server, which of these will use more memory:
a = operation1()
b = operation2()
c = doOperation(a, b)
or directly:
a = doOperation(operation1(), operation2())
Edit:
1: I'm using CPython.
2: I'm asking this question because I care about readability in my code, so instead of writing long sequences of operations, I sometimes split them into variables.
Edit2:
here is the full code:
class Reset(BaseHandler):
    @tornado.web.asynchronous
    @tornado.gen.engine
    def get(self, uri):
        uri = self.request.uri
        try:
            debut = time.time()
            tim = uri[7:]
            print tim
            cod = yield tornado.gen.Task(db.users.find_one, ({"reset.timr":tim})) # this is temporary variable
            code = cod[0]["reset"][-1]["code"] # this one too
            dat = simpleencode.decode(tim, code)
            now = datetime.datetime.now() # this one too
            temps = datetime.datetime.strptime(dat[:19], "%Y-%m-%d %H:%M:%S") # this one too
            valid = now - temps # what if i put them all here
            if valid.days < 2:
                print time.time() - debut # here time.time() has not been set to another variable, used directly
                self.render("reset.html")
            else:
                self.write("hohohohoo")
                self.finish()
        except (ValueError, TypeError, UnboundLocalError):
            self.write("pirate")
            self.finish()
As you can see, there are variables that are only temporarily useful.
Provided doOperation() does not clear its own references to the arguments passed in, or create more references to the arguments, the two approaches use exactly the same amount of memory until doOperation() completes.
The second form will use less memory once doOperation() completes, because by then the function's local variables have been cleaned up and nothing else refers to the intermediate results. In the first form, a and b still hold references, so the reference counts do not drop to 0.
CPython uses reference counting to clean up any objects that are no longer used; once the reference count drops to 0 objects are automatically cleaned up.
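To see this for yourself, here is a small sketch that watches an intermediate object through a weak reference. Payload, do_operation and with_names are made-up stand-ins (not from the question), and the immediate cleanup it shows is CPython behaviour; other implementations may free the object later.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a large intermediate result."""


def do_operation(a, b):
    # Assumed not to keep any reference to a or b after it returns.
    return 1


def with_names():
    a = Payload()
    b = Payload()
    watcher = weakref.ref(a)   # observes a without keeping it alive
    c = do_operation(a, b)
    alive_before_del = watcher() is not None
    del a, b                   # drop the last strong references
    alive_after_del = watcher() is not None
    return alive_before_del, alive_after_del


print(with_names())
```

On CPython the first flag is True and the second is False: the object stays alive while the name a refers to it and is reclaimed as soon as that reference is deleted.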
If memory and readability are a concern, you can delete references explicitly:
a = operation1()
b = operation2()
c = doOperation(a, b)
del a, b
but remember that local variables inside a function are cleaned up automatically when the function returns, so the following would also result in the a and b references being removed:
def foo():
    a = operation1()
    b = operation2()
    c = doOperation(a, b)
Memory occupied by values will only be reclaimed when the values are no longer referenced. Just looking at the examples you gave, it's impossible to tell when those values are no longer referenced, because we don't know what doOperation does.
One thing to keep in mind: assignment never copies values, so merely assigning a value to a name won't increase the memory use.
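A quick way to convince yourself that assignment only binds another name to the same object (the variable names here are just for illustration):

```python
data = list(range(1000))

alias = data                 # a second name for the same list; nothing is copied
same_object = alias is data

alias.append(-1)             # the change is visible through both names
last_via_original = data[-1]

real_copy = list(data)       # an explicit copy is a separate object
copy_is_separate = real_copy is not data

print(same_object, last_via_original, copy_is_separate)
```

Only the explicit list(data) call allocates a second list; the plain assignment costs one extra reference and nothing more.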
Also, unless you have an actual memory problem, don't worry about it. :)
Related
Why does CPython (no clue about other Python implementations) have the following behavior?
tuple1 = ()
tuple2 = ()
dict1 = {}
dict2 = {}
list1 = []
list2 = []
# makes sense, tuples are immutable
assert(id(tuple1) == id(tuple2))
# also makes sense dicts are mutable
assert(id(dict1) != id(dict2))
# lists are mutable too
assert(id(list1) != id(list2))
assert(id(()) == id(()))
# why no assertion error on this?
assert(id({}) == id({}))
# or this?
assert(id([]) == id([]))
I have a few ideas why it may, but can't find a concrete reason why.
EDIT
To further prove Glenn's and Thomas' point:
[1] id([])
4330909912
[2] x = []
[3] id(x)
4330909912
[4] id([])
4334243440
When you call id({}), Python creates a dict and passes it to the id function. The id function takes its id (its memory location), and throws away the dict. The dict is destroyed. When you do it twice in quick succession (without any other dicts being created in the mean time), the dict Python creates the second time happens to use the same block of memory as the first time. (CPython's memory allocator makes that a lot more likely than it sounds.) Since (in CPython) id uses the memory location as the object id, the id of the two objects is the same. This obviously doesn't happen if you assign the dict to a variable and then get its id(), because the dicts are alive at the same time, so their id has to be different.
Mutability does not directly come into play, but code objects caching tuples and strings do. In the same code object (function or class body or module body) the same literals (integers, strings and certain tuples) will be re-used. Mutable objects can never be re-used, they're always created at runtime.
In short, an object's id is only unique for the lifetime of the object. After the object is destroyed, or before it is created, something else can have the same id.
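A small sketch of the part that is actually guaranteed: objects that are alive at the same time always have different ids. The comparison of two temporaries is printed without a stated result, because whether the memory gets reused is an implementation detail.

```python
# Three empty lists that are all alive at once.
kept = [[] for _ in range(3)]
ids_alive = {id(obj) for obj in kept}
distinct_while_alive = len(ids_alive) == len(kept)
print(distinct_while_alive)

# Two temporaries whose lifetimes do not overlap: the ids may or may not match.
print(id([]) == id([]))
```

The first print is True on any Python implementation; the second is frequently True on CPython but nothing promises it.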
CPython destroys objects as soon as their reference count drops to zero, so the second [] is created after the first [] has already been freed. As a result, most of the time it ends up in the same memory location.
This shows what's happening very clearly (the output is likely to be different in other implementations of Python):
class A:
    def __init__(self): print("a")
    def __del__(self): print("b")

# a a b b False
print(A() is A())
# a b a b True
print(id(A()) == id(A()))
I have a bit of python code that's set to run on a schedule. (I think my problem is the same as if it were within a loop.)
Let's say that in its most basic form, the code snippet looks something like this:
A = 1
B = 2
renameMe = A + B
Let's say the scheduler runs the same snippet of code every 5 minutes. The values of variables A & B are different each time the code is run, but the operation renameMe = A + B is always the same.
The values for A & B are grabbed out of a dataframe that's updated every 5 minutes, so I don't know what they are in advance, but if I need to do something with them beforehand instead of assigning them to A & B right away, I can.
I recently found out that for other things to work, I need to be able to rename the variable renameMe every time that snippet of code runs. In other words, I want the variable's name to be renameMe1 the first time the code snippet runs, then renameMe2 when it runs 5 minutes later, and so on.
It doesn't really matter in which way the variable's name changes (ints, strs, whatever) as long as I'm able to find out what the new variable name is, and use it elsewhere.
Do NOT use variable variable names; you will run into problems. Use a container instead:
a list:
# first time
container = []
# each loop/run
container.append(A+B)
## last value
container[-1]
a dictionary:
# first time
container = {}
# each loop/run
container['new_id'] = A+B
# access arbitrary value
container['my_previous_id']
If you need persistence, use a flat file or a database.
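For a scheduled job, where each run is a fresh process, the flat-file option could look roughly like this. It is only a sketch: record_run, the "runN" key scheme and the JSON file are my own assumptions, and it assumes the values are plain Python numbers (values pulled from a dataframe may need converting with int() or float() first).

```python
import json
import os
import tempfile


def record_run(path, a, b):
    """Store a + b in the JSON file at path and return the key it was saved under."""
    if os.path.exists(path):
        with open(path) as fh:
            results = json.load(fh)
    else:
        results = {}
    key = "run%d" % (len(results) + 1)   # run1, run2, ... plays the role of the "new name"
    results[key] = a + b
    with open(path, "w") as fh:
        json.dump(results, fh)
    return key


# Demo in a throwaway directory; a real job would use a fixed path.
demo_path = os.path.join(tempfile.mkdtemp(), "results.json")
first_key = record_run(demo_path, 1, 2)
second_key = record_run(demo_path, 10, 20)
with open(demo_path) as fh:
    stored = json.load(fh)
print(first_key, second_key, stored)
```

Each run gets its own key, the key is returned so other code can use it, and earlier results survive between runs.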
I think it is suitable to use a class so that setattr can be used:
class newVal:
    def __init__(self):
        self.n = 1
    def addVal(self, a, b):
        setattr(self, f"val{self.n}", a+b)
        self.n += 1

Values = newVal()
Values.addVal(a, b)
Values.val1 would now be assigned
I agree with Mozway that dynamic variable names are likely to cause problems, but it is also something you can manage strictly.
globals() returns a dictionary that maps every global variable name to its value; its items() view is a collection of 2-tuples, like this one:
dict_items([('__name__', '__main__'), ..., ('thisName', 'renaMe1'), ('renaMe18', 10)])
So you should register your new variable name, but not forget to delete the previous one, so that stale names do not pile up.
If you follow a natural law of equal births and deaths, you will avoid overpopulation.
I suggest this piece of code (with comments inside):
basename = 'renaMe'

def varUpdate():
    # Get previous variable name
    thisName = [i for i, j in globals().items() if i[:len(basename)] == basename][0]
    # Define the new variable name
    newName = basename + '%d'%sum([int(thisName[len(basename):]), 1])
    # Register the new variable name
    globals()[newName] = globals()[thisName]
    # Delete previous variable name from global
    del globals()[thisName]

def process(i):
    # Isolate from process content for readability
    varUpdate()
    # PROCESS BELOW
    # ....
    newVar = [i for i, j in globals().items() if i[:len(basename)] == basename][0]
    print(newVar, " : ", globals()[newVar])

# With this for loop we simulate 4 entries in process
for i in range(4):
    ### we enter in the process
    process(i)
Test in the shell
First restart your shell and let's suppose we have at the beginning renaMe12 = 12 :
>>> renaMe12 = 12
>>> Proposed Script ...
Result
The variable's name is incremented at each iteration.
renaMe13 : 12
renaMe14 : 12
renaMe15 : 12
renaMe16 : 12
If you check in the shell now, you could see at the end of iteration, renaMe12 to renaMe15 no longer exist.
Only the variable renaMe16 exists with value 12.
>>> renaMe16
12
>>>> renaMe15
Traceback (most recent call last):
  Python Shell, prompt 4, line 1
builtins.NameError: name 'renaMe15' is not defined
Conclusion
This discussion is just for the sake of experimentation; if I were you, I would do my best to avoid this kind of added complexity unless it's necessary.
I agree with Mozway that you should spare yourself the headaches...
I am trying to build some code and I have defined a function as this to test how a counter works inside of the function:
def errorPrinting(x):
    x += 1
    return x
I then use the function in some conditional logic where I want the counter to increase if the conditions are met.
x = 1
for row in arcpy.SearchCursor(fc):
    if not row.INCLUSION_TYPE or len(row.TYPE.strip()) == 0:
        errorPrinting(x)
        print x
    elif len(row.TYPE) not in range(2,5):
        errorPrinting(x)
        print x
    elif row.INCLUSION_TYPE.upper() not in [y.upper() for y in TableList]:
        errorPrinting(x)
        print x
I'm still fairly new with using functions, so maybe I am not understanding how to return the value back outside of the function to be used in the next iteration of the for loop. It keeps returning 1 on me. Can anyone show me how to return the x outside of the function after it has been increased by one x+= 1?
Thanks,
Mike
You're not incrementing your global x, you're incrementing the local parameter that also happens to be named x! (Your parameter to errorPrinting could have been named anything. I'm calling it xLocal.)
As you can see here, x isn't incremented by the function.
>>> def inc(xLocal):
...     xLocal += 1
...     return xLocal
...
>>> x = 4
>>> inc(x)
5
>>> x
4
You need to reassign the value of x to the return value of the function each time. Like this
x = 1
for row in arcpy.SearchCursor(fc):
    if not row.INCLUSION_TYPE or len(row.TYPE.strip()) == 0:
        x = errorPrinting(x) # <=== here
        print x
    elif len(row.TYPE) not in range(2,5):
        x = errorPrinting(x) # <=== here
        print x
    elif row.INCLUSION_TYPE.upper() not in [y.upper() for y in TableList]:
        x = errorPrinting(x) # <=== here
        print x
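Integers and other immutable values can't be changed in place, so a function can't modify the caller's variable by reassigning its parameter. (Lists, dicts, and other mutable objects can be modified in place, and the caller sees the change. Modifying lists unintentionally is actually a very common mistake in Python.)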
Edit: passing by "reference" and "value" isn't really correct to talk about in Python. See this nice question for more details.
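Here is a small sketch of the difference between rebinding a parameter and mutating the object it refers to. The function names rebind, mutate and rebind_list are made up for the illustration.

```python
def rebind(n):
    n += 1                 # ints are immutable: this binds the local name n to a new int
    return n


def mutate(items):
    items.append(99)       # changes the list object the caller also refers to


def rebind_list(items):
    items = items + [99]   # builds a new list and rebinds only the local name
    return items


x = 4
rebind(x)                  # return value ignored, so x is unchanged
nums = [1]
mutate(nums)               # caller sees the appended element
other = [1]
rebind_list(other)         # caller's list is untouched

print(x, nums, other)      # 4 [1, 99] [1]
```

What matters is not the type of the argument but whether the function changes the object itself or only points its local name at something new.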
So, using my previous example:
>>> x = 4
>>> x = inc(x)
>>> x
5
Note that if the argument had been a mutable object such as a list, modifying it in place inside the function would have been visible to the caller.
>>> def incList(xList):
...     for i in range(len(xList)):
...         xList[i] += 1
...
>>> xList = [1]
>>> xList
[1]
>>> incList(xList)
>>> xList
[2]
Note that the usual, Pythonic loop syntax:
for i in xList:
    i += 1
would not increment the values stored in the list, because i += 1 only rebinds the loop variable i.
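If you do want to increment every element, two common options are assigning back by index or building a new list. A short sketch (variable names are only illustrative):

```python
values = [1, 2, 3]

for v in values:
    v += 1                   # rebinds v only; the list is untouched
unchanged = list(values)     # [1, 2, 3]

for i, v in enumerate(values):
    values[i] = v + 1        # assigns the new value back into the list
in_place = list(values)      # [2, 3, 4]

rebuilt = [v + 1 for v in values]   # new list; values itself stays [2, 3, 4]

print(unchanged, in_place, rebuilt)
```

The list comprehension is usually the more idiomatic choice unless other code holds a reference to the original list and needs to see the update.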
Note: If you're looking to keep tabs on a lot of things, I also recommend the logging module that @SB. mentioned. It's super useful and makes debugging large programs a lot easier. You can get time, type of message, etc.
You've been bitten by scope. You may want to check out this link for a quick primer.
You can do something simple and say x = errorPrinting(x) in all cases you call errorPrinting and get what you want. But I think there are better solutions where you'll learn more.
Consider implementing an error logger object that maintains a count for you. Then you can do logger.errorPrinting() and your instance of logger will manage the counter. You may also want to look into python's built in logging facilities.
Edited for the OP's benefit, since if functions are a new concept, my earlier comments may be a little hard to follow.
I personally think the nicest way to address this issue is to wrap your related code in an object.
Python is heavily based on the concept of objects, which you can think of as grouping data with functions that operate on that data. An object might represent a thing, or in some cases might just be a convenient way to let a few related functions share some data.
Objects are defined as "classes," which define the type of the object, and then you make "instances," each of which is a separate copy of the grouping of data defined in the class.
class MyPrint(object):
    def __init__(self):
        self.x = 1

    def errorPrinting(self):
        self.x += 1
        return self.x

    def myPrint(self):
        for row in arcpy.SearchCursor(fc):
            if not row.INCLUSION_TYPE or len(row.TYPE.strip()) == 0:
                self.errorPrinting()
                print self.x
            elif len(row.TYPE) not in range(2,5):
                self.errorPrinting()
                print self.x
            elif row.INCLUSION_TYPE.upper() not in [y.upper() for y in TableList]:
                self.errorPrinting()
                print self.x

p = MyPrint()
p.myPrint()
The functions __init__(self), errorPrinting(self), and myPrint(self), are all called "methods," and they're the operations defined for any object in the class. Calling those functions for one of the class's instance objects automatically sticks a self argument in front of any arguments that contains a reference to the particular instance that the function is called for. self.x refers to a variable that's stored by that instance object, so the functions can share that variable.
What looks like a function call to the class's name:
p = MyPrint()
actually makes a new instance object of class MyPrint, calls MyPrint.__init__(<instance>), where <instance> is the new object, and then assigns the instance to p. Then, calling
p.myPrint()
calls MyPrint.myPrint(p).
This has a few benefits, in that variables you use this way only last as long as the object is needed, you can have multiple counters for different tasks that are doing the same thing, and scope is all taken care of, plus you're not cluttering up the global namespace or having to pass the value around between your functions.
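As a variation on the same idea, the counter object can also hand its messages to the standard logging module. This is only a sketch: ErrorCounter, the logger name and the sample TYPE values are made up, and the arcpy cursor is replaced by a plain list so the example runs on its own.

```python
import logging


class ErrorCounter:
    """Counts reported errors and forwards each message to a logger."""

    def __init__(self, name="row_checks"):
        self.count = 0
        self.log = logging.getLogger(name)

    def error(self, message):
        self.count += 1
        self.log.warning("error %d: %s", self.count, message)
        return self.count


logging.basicConfig(level=logging.WARNING)
counter = ErrorCounter()

# Hypothetical TYPE values standing in for rows from the cursor.
for type_code in ["AB", "", "TOOLONG"]:
    if not type_code.strip():
        counter.error("empty TYPE")
    elif len(type_code) not in range(2, 5):
        counter.error("bad TYPE length: %r" % type_code)

print(counter.count)   # 2
```

The count lives on the instance, so nothing has to be passed around or declared global, and the log output gives you timestamps and levels if you configure a format for them.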
The simplest fix, though perhaps not the best style:
def errorPrinting():
    global x
    x += 1
Then change x = errorPrinting(x) to errorPrinting().
"global x" makes the function use the x defined globally instead of creating one in the scope of the function.
The other examples are good though. Study all of them.