Pass list to function by value [duplicate] - python

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Closed 5 months ago.
I want to pass a list into a function by value.
By default, lists and other mutable objects are passed to a function by reference.
Here is one solution:
import copy

def add_at_rank(ad, rank):
    result_ = copy.copy(ad)
    # ... do something with result_
    return result_
Can this be written more concisely? In other words, I don't want ad to be changed.

You can use [:], but for a list containing lists (or other mutable objects) you should use copy.deepcopy():
lis[:] is equivalent to list(lis) or copy.copy(lis); all of them return a shallow copy of the list.
In [33]: def func(lis):
   ....:     print id(lis)
   ....:
In [34]: lis = [1,2,3]
In [35]: id(lis)
Out[35]: 158354604
In [36]: func(lis[:])
158065836
When to use deepcopy():
In [41]: lis = [range(3), list('abc')]
In [42]: id(lis)
Out[42]: 158066124
In [44]: lis1=lis[:]
In [45]: id(lis1)
Out[45]: 158499244 # different than lis, but the inner lists are still same
In [46]: [id(x) for x in lis1] == [id(y) for y in lis]
Out[46]: True
In [47]: lis2 = copy.deepcopy(lis)
In [48]: [id(x) for x in lis2] == [id(y) for y in lis]
Out[48]: False
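A Python 3 version of the same check, runnable as a plain script:

```python
import copy

lis = [list(range(3)), list('abc')]

shallow = lis[:]           # new outer list, same inner lists
deep = copy.deepcopy(lis)  # new outer list and new inner lists

print(shallow is lis)                                    # False
print([id(x) for x in shallow] == [id(y) for y in lis])  # True
print([id(x) for x in deep] == [id(y) for y in lis])     # False
```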

This might be an interesting use case for a decorator function. Something like this:
import copy

def pass_by_value(f):
    def _f(*args, **kwargs):
        args_copied = copy.deepcopy(args)
        kwargs_copied = copy.deepcopy(kwargs)
        return f(*args_copied, **kwargs_copied)
    return _f
pass_by_value takes a function f as input and creates a new function _f that deep-copies all its parameters and then passes them to the original function f.
Usage:
@pass_by_value
def add_at_rank(ad, rank):
    ad.append(4)
    rank[3] = "bar"
    print "inside function", ad, rank

a, r = [1,2,3], {1: "foo"}
add_at_rank(a, r)
print "outside function", a, r
Output:
inside function [1, 2, 3, 4] {1: 'foo', 3: 'bar'}
outside function [1, 2, 3] {1: 'foo'}
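One refinement worth considering (my addition, not part of the original answer): wrapping _f with functools.wraps so the decorated function keeps its name and docstring. A Python 3 sketch:

```python
import copy
import functools

def pass_by_value(f):
    @functools.wraps(f)  # preserve f's __name__ and __doc__ on the wrapper
    def _f(*args, **kwargs):
        return f(*copy.deepcopy(args), **copy.deepcopy(kwargs))
    return _f

@pass_by_value
def add_at_rank(ad, rank):
    ad.append(4)
    rank[3] = "bar"
    return ad, rank

a, r = [1, 2, 3], {1: "foo"}
inside = add_at_rank(a, r)
print(inside)                # ([1, 2, 3, 4], {1: 'foo', 3: 'bar'})
print(a, r)                  # [1, 2, 3] {1: 'foo'} -- caller's objects untouched
print(add_at_rank.__name__)  # add_at_rank
```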

A shallow copy is usually good enough, and potentially much faster than a deep copy.
You can take advantage of this if the modifications you are making to result_ do not mutate the items/attributes it contains.
For a simple example, if you have a chessboard:
board = [[' ']*8 for x in range(8)]
You could make a shallow copy
board2 = copy.copy(board)
It's safe to append/insert/pop/delete/replace items in board2, but not in the lists it contains. If you want to modify one of the contained lists, you must create a new list and replace the existing one:
row = list(board2[2])
row[3] = 'K'
board2[2] = row
It's a little more work, but a lot more efficient in time and storage.
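Putting the chessboard example together as a runnable snippet:

```python
import copy

board = [[' '] * 8 for _ in range(8)]
board2 = copy.copy(board)   # shallow copy: board2 shares the row lists

# Replace a whole row rather than mutating the shared one
row = list(board2[2])
row[3] = 'K'
board2[2] = row

print(board[2][3])   # ' '  -- original board untouched
print(board2[2][3])  # 'K'
```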

If ad is a list, you can simply call your function as add_at_rank(ad + [], rank).
This will create a NEW list instance every time you call the function, equal in value to ad.
>>> ad == ad + []
True
>>> ad is ad + []
False
Pure pythonic :)
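A quick sketch of the trick in use (the append is just a stand-in function body):

```python
def add_at_rank(ad, rank):
    ad.append(rank)  # mutates only the list it receives
    return ad

ad = [1, 2, 3]
result = add_at_rank(ad + [], 4)  # ad + [] builds a fresh list
print(result)  # [1, 2, 3, 4]
print(ad)      # [1, 2, 3] -- original unchanged
```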


List operation, keeping track of old list

After I apply an operation to a list, I would like to get access to both the modified list and the original one.
Somehow I am not able to.
In the following code snippet, I define two functions with which I modify the original list.
Afterwards, I get my values from a class and apply the transformation.
def get_min_by_col(li, col):  # get minimum from list
    return min(li, key=lambda x: x[col - 1])[col - 1]

def hashCluster(coords):  # transform to origin
    min_row = get_min_by_col(coords, 0)
    min_col = get_min_by_col(coords, 1)
    for pix in coords:
        pix[1] = pix[1] - min_row
        pix[0] = pix[0] - min_col
    return coords
pixCoords = hashCoords = originalPixCoords = []  # making sure they are empty
for j in dm.getPixelsForCluster(dm.clusters[i]):
    pixCoords.append([j['m_column'], j['m_row']])  # getting some values from a class -- ex: [[613, 265], [613, 266]] or [[615, 341], [615, 342], [616, 341], [616, 342]]
originalPixCoords = pixCoords.copy()  # just to be safe, I make a copy of the original list
print('Original : ', originalPixCoords)
hashCoords = hashCluster(pixCoords)  # apply transformation
print('Modified : ', hashCoords)
print('Original : ', originalPixCoords)  # should get the original list
Some results [Jupyter Notebook]:
Original : [[607, 268]]
Modified : [[0, 0]]
Original : [[0, 0]]
Original : [[602, 264], [603, 264]]
Modified : [[0, 0], [1, 0]]
Original : [[0, 0], [1, 0]]
Original : [[613, 265], [613, 266]]
Modified : [[0, 0], [0, 1]]
Original : [[0, 0], [0, 1]]
Is the function hashCluster able to modify the new list as well, even after the .copy()?
What am I doing wrong? My goal is to have access to both the original and the modified list, with as few operations and copies of lists as possible (since I am looping over a very large document).
You have a list of lists, and are modifying the inner lists. The operation pixCoords.copy() creates a shallow copy of the outer list. Both pixCoords and originalPixCoords now have two list buffers pointing to the same mutable objects. There are two ways to handle this situation, each with its own pros and cons.
The knee-jerk method that most users seem to have is to make a deep copy:
originalPixCoords = copy.deepcopy(pixCoords)
I would argue that this method is the less pythonic and more error prone approach. A better solution would be to make hashCluster actually return a new list. By doing that, you will make it treat the input as immutable, and eliminate the problem entirely. I consider this more pythonic because it reduces the maintenance burden. Also, conventionally, python functions that return a value create a new list without modifying the input while in-place operations generally don't return a value.
def hashCluster(coords):
    min_row = get_min_by_col(coords, 0)
    min_col = get_min_by_col(coords, 1)
    return [[pix[0] - min_col, pix[1] - min_row] for pix in coords]
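With that version, both lists stay available. Using one of the sample inputs from the question (the helper is repeated here so the snippet is self-contained):

```python
def get_min_by_col(li, col):
    return min(li, key=lambda x: x[col - 1])[col - 1]

def hashCluster(coords):
    min_row = get_min_by_col(coords, 0)
    min_col = get_min_by_col(coords, 1)
    return [[pix[0] - min_col, pix[1] - min_row] for pix in coords]

pixCoords = [[613, 265], [613, 266]]
hashCoords = hashCluster(pixCoords)
print(hashCoords)  # [[0, 0], [0, 1]]
print(pixCoords)   # [[613, 265], [613, 266]] -- original preserved
```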
Use
import copy
originalPixCoords = copy.deepcopy(pixCoords)
What you're using is a shallow copy. It effectively means you created a new list whose entries still point to the old objects. So if those objects get modified, your new list will still reflect those updates, since both lists reference the same objects.
>>> # Shallow Copy
>>> mylist = []
>>> mylist.append({"key": "original"})
>>> mynewlist = mylist.copy()
>>> mynewlist
[{'key': 'original'}]
>>> mylist[0]["key"] = "new value"
>>> mylist
[{'key': 'new value'}]
>>> mynewlist
[{'key': 'new value'}]
>>> # Now Deep Copy
>>> mylist = []
>>> mylist.append({"key": "original"})
>>> from copy import deepcopy
>>> mynewlist = deepcopy(mylist)
>>> mynewlist
[{'key': 'original'}]
>>> mylist[0]["key"] = "new value"
>>> mylist
[{'key': 'new value'}]
>>> mynewlist
[{'key': 'original'}]
Another similar question: What is the difference between shallow copy, deepcopy and normal assignment operation?
Setting multiple variables equal to the same value is the equivalent of a pointer in Python.
Check this out
a = b = [1,2,3]
a == b # True
a is b # True (same memory location)
b[1] = 3
print(b) # [1,3,3]
print(a) #[1,3,3]
Right now, you are creating shallow copies. If you need both copies (with different values and data history), you can simply assign the variables in the following manner:
import copy
original = data
original_copy = copy.deepcopy(data)
original_copy == original == data # True
original_copy is original # False
original_copy[0] = 4
original_copy == original # False

Is there any reason to prefer dict() to {} in Python? [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 months ago.
I understand that they are both essentially the same thing, but in terms of style, which is the better (more Pythonic) one to use to create an empty list or dict?
In terms of speed, it's no competition for empty lists/dicts:
>>> from timeit import timeit
>>> timeit("[]")
0.040084982867934334
>>> timeit("list()")
0.17704233359267718
>>> timeit("{}")
0.033620194745424214
>>> timeit("dict()")
0.1821558326547077
and for non-empty:
>>> timeit("[1,2,3]")
0.24316302770330367
>>> timeit("list((1,2,3))")
0.44744206316727286
>>> timeit("list(foo)", setup="foo=(1,2,3)")
0.446036018543964
>>> timeit("{'a':1, 'b':2, 'c':3}")
0.20868602015059423
>>> timeit("dict(a=1, b=2, c=3)")
0.47635635255323905
>>> timeit("dict(bar)", setup="bar=[('a', 1), ('b', 2), ('c', 3)]")
0.9028228448029267
Also, using the bracket notation lets you use list and dictionary comprehensions, which may be reason enough.
In my opinion [] and {} are the most pythonic and readable ways to create empty lists/dicts.
Be wary of sets, though. For example:
this_set = {5}
some_other_set = {}
can be confusing. The first creates a set with one element; the second creates an empty dict, not a set.
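A quick check makes the pitfall concrete:

```python
this_set = {5}        # set literal with one element
some_other_set = {}   # this is an empty dict, not a set!
empty_set = set()     # the only way to create an empty set

print(type(this_set))        # <class 'set'>
print(type(some_other_set))  # <class 'dict'>
print(type(empty_set))       # <class 'set'>
```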
The dict literal might be a tiny bit faster as its bytecode is shorter:
In [1]: import dis
In [2]: a = lambda: {}
In [3]: b = lambda: dict()
In [4]: dis.dis(a)
1 0 BUILD_MAP 0
3 RETURN_VALUE
In [5]: dis.dis(b)
1 0 LOAD_GLOBAL 0 (dict)
3 CALL_FUNCTION 0
6 RETURN_VALUE
The same applies to list vs [].
Be careful: list() and [] work differently:
>>> def a(p):
...     print(id(p))
...
>>> for r in range(3):
...     a([])
...
139969725291904
139969725291904
139969725291904
>>> for r in range(3):
...     a(list())
...
139969725367296
139969725367552
139969725367616
Both list() and [] create a new list object on every evaluation; the repeated ids above simply show CPython reusing the memory of a list that was garbage-collected right after the previous call, so identical ids do not mean it is the same object.
IMHO, using list() and dict() makes your Python look like C. Ugh.
In the case of the difference between [] and list(), there is a pitfall that I haven't seen anyone else point out.
If you use a dictionary as a member of the list, the two will give entirely different results:
In [1]: foo_dict = {"1":"foo", "2":"bar"}
In [2]: [foo_dict]
Out [2]: [{'1': 'foo', '2': 'bar'}]
In [3]: list(foo_dict)
Out [3]: ['1', '2']
A difference between list() and [] not mentioned by anyone is that list() will convert, for example, a tuple into a list, while [] will put said tuple into a list:
a_tuple = (1, 2, 3, 4)
test_list = list(a_tuple) # returns [1, 2, 3, 4]
test_brackets = [a_tuple] # returns [(1, 2, 3, 4)]
list() and [] also behave differently when given an iterator:
nums = [1,2,3,4,5,6,7,8]
In: print([iter(nums)])
Out: [<list_iterator object at 0x03E4CDD8>]
In: print(list(iter(nums)))
Out: [1, 2, 3, 4, 5, 6, 7, 8]
There is one difference in behavior between [] and list(), as the example below shows: we need to use list() if we want the list of numbers returned, otherwise we get a map object. (map() returns a lazy iterator; wrapping it in brackets makes a one-element list containing the iterator, while list() consumes the iterator into a list of values.)
sth = [(1,2), (3,4), (5,6)]
sth2 = map(lambda x: x[1], sth)
print(sth2) # <map object at 0x000001AB34C1D9B0>
sth2 = [map(lambda x: x[1], sth)]
print(sth2) # [<map object at 0x000001AB34C1D9B0>]
type(sth2) # list
type(sth2[0]) # map
sth2 = list(map(lambda x: x[1], sth))
print(sth2) #[2, 4, 6]
type(sth2) # list
type(sth2[0]) # int
One other difference between list() and []:
list_1 = ["Hello World"] # is a list of the word "Hello World"
list_2 = list("Hello World") # is a list of letters 'H', 'e', 'l'...
Something to keep in mind...
A box-bracket pair denotes a list object, or an index subscript, like my_list[x].
A curly-brace pair denotes a dictionary object.
a_list = ['on', 'off', 1, 2]
a_dict = {'on': 1, 'off': 2}

Different behaviour of functions in python

I've been learning Python for some time, but it keeps surprising me.
I have the following code:
def update_list(input_list):
    input_list.append(len(input_list))
    input_list[0] = 11
    return input_list

def update_string(input_string):
    input_string = 'NEW'
    return input_string
my_list = [0,1,2]
print my_list
print update_list(my_list)
print my_list
my_string = 'OLD'
print my_string
print update_string(my_string)
print my_string
This code provides following output:
[0, 1, 2]
[11, 1, 2, 3]
[11, 1, 2, 3]
OLD
NEW
OLD
Why is the variable my_list modified without an assignment, while the value of my_string stays the same after the update_string() function? I don't understand that mechanism; can you explain it to me?
There is nothing different about the behaviour of functions. What is different is that in one of them you rebound the name:
input_string = 'NEW'
This sets the name input_string to a new object. In the other function you make no assignments to a name. You only call a method on the object, and assign to indices on the object. This happens to alter the object contents:
input_list.append(len(input_list))
input_list[0] = 11
Note that assigning to an index is not the same as assigning to a name. You could assign the list object to another name first, then do the index assignment separately, and nothing would change:
_temp = input_list
_temp[0] = 11
because assigning to an index alters one element contained in the list, not the name that you used to reference the list.
Had you assigned directly to input_list, you'd have seen the same behaviour:
input_list = []
input_list.append(len(input_list))
input_list[0] = 11
You can do this outside a function too:
>>> a_str = 'OLD'
>>> b_str = a_str
>>> b_str = 'NEW'
>>> a_list = ['foo', 'bar', 'baz']
>>> b_list = a_list
>>> b_list.append('NEW')
>>> b_list[0] = 11
>>> a_str
'OLD'
>>> b_str
'NEW'
>>> a_list
[11, 'bar', 'baz', 'NEW']
>>> b_list
[11, 'bar', 'baz', 'NEW']
The initial assignments to b_str and b_list is exactly what happens when you call a function; the arguments of the function are assigned the values you passed to the function. Assignments do not create a copy, they create additional references to the object.
If you wanted to pass in a copy of the list object, do so by creating a copy:
new_list = old_list[:] # slicing from start to end creates a shallow copy
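For example, using update_list from the question above (shown here in Python 3 syntax) and passing it a copy:

```python
def update_list(input_list):
    input_list.append(len(input_list))
    input_list[0] = 11
    return input_list

my_list = [0, 1, 2]
result = update_list(my_list[:])  # pass a shallow copy of the list
print(result)   # [11, 1, 2, 3]
print(my_list)  # [0, 1, 2] -- the original is unchanged
```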

An iterator that doesn't squash references?

I want a for loop in Python that can modify variables in the iterator, not just handle the value of the variables. As a trivial example, the following clearly does not do what I want because b is still a string at the end.
a = 3
b = "4"
for x in (a, b):
    x = int(x)
print("b is %s" % type(b))
(Result is "b is <class 'str'>")
What is a good design pattern for "make changes to each variable in a long list of variables"?
Short answer: you can't do that.
a = "3"
b = "4"
for x in (a, b):
    x = int(x)
Variables in Python are only tags that reference values. There is no such thing as "tags on tags". When you write x = int(x) in the above code, you only change what x points to, not the pointed-to value.
What is a good design pattern for "make changes to each variable in a long list of variables"?
I'm not sure to really understand, but if you want to do things like that, maybe you should store your values not as individual variables, but as value in a dictionary, or as instance variables of an object.
my_vars = {'a': "3",
           'b': "4"}

for x in my_vars:
    my_vars[x] = int(my_vars[x])

print type(my_vars['b'])
Now if you're in the hackish mood:
As your variables are globals they are in fact stored as entries in a dictionary (accessible through the globals() function). So you could change them:
a = "3"
b = "4"
for x in ('a', 'b'):
    globals()[x] = int(globals()[x])
print type(b)
But, as of myself, I wouldn't call that "good design pattern"...
As mentioned in another answer, there's no way to update a variable indirectly. The best you can do is assign it explicitly with unpacking:
>>> a = 3
>>> b = 4
>>> a, b = [int(x) for x in a, b]
>>> print "b is %s" % type(b)
b is <type 'int'>
If you have an actual list of variables (as opposed to a number of individual variables you want to modify), then a list comprehension will do what you want:
>>> my_values = [3, "4"]
>>> my_values = [int(value) for value in my_values]
>>> print(my_values)
[3, 4]
If you want to do more complicated processing, you can define a function and use that in the list comprehension:
>>> my_values = [3, "4"]
>>> def number_crunching(value):
... return float(value)**1.42
...
>>> my_values = [number_crunching(value) for value in my_values]
>>> print(my_values)
[4.758961394052794, 7.160200567423779]

How to clone or copy a set in Python?

For copying a list: shallow_copy_of_list = old_list[:].
For copying a dict: shallow_copy_of_dict = dict(old_dict).
But for a set, I was worried that a similar thing wouldn't work, because saying new_set = set(old_set) might give a set of a set?
But it does work. So I'm posting the question and answer here for reference, in case anyone else has the same confusion.
Both of these will give a duplicate of a set:
shallow_copy_of_set = set(old_set)
Or:
shallow_copy_of_set = old_set.copy() #Which is more readable.
The reason that the first way above doesn't give a set of a set, is that the proper syntax for that would be set([old_set]). Which wouldn't work, because sets can't be elements in other sets, because they are unhashable by virtue of being mutable. However, this isn't true for frozensets, so e.g. frozenset(frozenset(frozenset([1,2,3]))) == frozenset([1, 2, 3]).
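A small Python 3 session confirming both points:

```python
old_set = {1, 2, 3}

new_set = set(old_set)     # a shallow copy, not a set of a set
print(new_set == old_set)  # True
print(new_set is old_set)  # False

# An actual set-of-a-set fails, because sets are unhashable:
try:
    nested = {old_set}
except TypeError as e:
    print(e)               # unhashable type: 'set'

# frozensets are hashable, and frozenset() on one just copies it:
print(frozenset(frozenset([1, 2, 3])) == frozenset([1, 2, 3]))  # True
```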
So a rule of thumb for replicating any instance of the basic data structures in Python (list, dict, set, frozenset, string):
a2 = list(a) #a is a list
b2 = set(b) #b is a set
c2 = dict(c) #c is a dict
d2 = frozenset(d) #d is a frozenset
e2 = str(e) #e is a string
#All of the above give a (shallow) copy.
So, if x is either of those types, then
shallow_copy_of_x = type(x)(x) #Highly unreadable! But economical.
Note that only dict, set and frozenset have a built-in copy() method. It would probably be a good idea for lists and strings to have a copy() method too, for uniformity and readability; but they don't, at least in Python 2.7.3, which I'm testing with.
Besides the type(x)(x) hack, you can import the copy module to make either a shallow copy or a deep copy:
In [29]: d={1: [2,3]}
In [30]: sd=copy.copy(d)
...: sd[1][0]=321
...: print d
{1: [321, 3]}
In [31]: dd=copy.deepcopy(d)
...: dd[1][0]=987
...: print dd, d
{1: [987, 3]} {1: [321, 3]}
From the docstring:
Definition: copy.copy(x)
Docstring:
Shallow copy operation on arbitrary Python objects.
