I've been trying to store an array of integers in a field of a Django model. Based on this reply, I've been trying to do so using a CommaSeparatedIntegerField, however this has proved less intuitive than the name would imply.
If I have a comma-separated list of integers (list = [12,23,31]), and I store it in a CommaSeparatedIntegerField, it comes back as a string (retrieved_list outputs u'[1,2,3]'). I cannot simply retrieve my integers: for instance, int(retrieved_list[1]) outputs 1 whereas list[1] would output 23.
So, do I have to do the parsing by hand, or is there another solution? And how exactly does a CommaSeparatedIntegerField differ from a CharField? It seems to me that they behave pretty much the same...
eval was accepted as the answer above. Avoid the temptation: it's just not safe.
See: Python: make eval safe
There is a literal_eval function that could be used the same way:
>>> from ast import literal_eval
>>> literal_eval("[1,2,3,4]")
[1, 2, 3, 4]
The only difference between this field type and a CharField is that it validates that the data is in the proper format - only digits separated by ','. You can view the source for the class at http://docs.nullpobug.com/django/trunk/django.db.models.fields-pysrc.html#CommaSeparatedIntegerField (expand the '+').
You can turn such a string into an actual Python list value (not array - list is the proper term for Python) by using eval.
>>> eval("[1,2,3,4]")
[1, 2, 3, 4]
EDIT: eval has safety concerns, however, and a better method is to use literal_eval as suggested by Alvin in another answer.
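If the stored value really is just digits separated by commas (for example '12,23,31', with no brackets), it can also be split by hand without evaluating anything. A minimal sketch, assuming that format; parse_int_list is a hypothetical helper name and not part of Django:

```python
def parse_int_list(raw):
    """Turn a comma-separated string such as '12,23,31' into a list of ints."""
    # Skip empty pieces so that '' or a trailing comma does not raise ValueError.
    return [int(piece) for piece in raw.split(",") if piece.strip()]


numbers = parse_int_list("12,23,31")
print(numbers[1])  # 23
```

int() raises ValueError on anything that is not a number, so malformed input fails loudly instead of being executed.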
Related
I am wondering whether Python has a data structure for which it is possible to impose a custom internal ordering policy. I am aware of OrderedDict and whatnot, but they do not explicitly provide what I am asking for. For example, OrderedDict just guarantees insertion order.
I would really like something like what C++ provides through a comparison object: for example, in std::set<Type,Compare,Allocator>, Compare is a parameter that defines the internal ordering of the data structure. Usually, or probably always, it is a binary predicate that is evaluated for a pair of elements belonging to the data structure.
Is there something similar in Python? Do you know any workaround?
SortedSet & Co support a key:
>>> SortedSet([-3, 1, 4, 1], key=abs)
SortedSet([1, -3, 4], key=<built-in function abs>)
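SortedSet comes from the third-party sortedcontainers package. Where installing a package is not an option, the same idea can be roughly sketched with the standard library by keeping a list ordered by a key function with bisect. KeySortedList is a hypothetical name used only for illustration, and unlike SortedSet it keeps duplicates:

```python
import bisect


class KeySortedList:
    """Keep items ordered by key(item); a sketch, not a full container."""

    def __init__(self, iterable=(), key=lambda item: item):
        self._key = key
        self._keys = []   # key(item) for each stored item, kept sorted
        self._items = []  # items in the same order as their keys
        for item in iterable:
            self.add(item)

    def add(self, item):
        k = self._key(item)
        # Insert after any existing entries with an equal key.
        pos = bisect.bisect_right(self._keys, k)
        self._keys.insert(pos, k)
        self._items.insert(pos, item)

    def __iter__(self):
        return iter(self._items)


ordered = KeySortedList([-3, 1, 4, 1], key=abs)
print(list(ordered))  # [1, 1, -3, 4]
```

Each insertion costs O(n) because of list.insert, so this only suits small or rarely modified collections.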
I have a config file, in which items can be single element or list.
pct_accepted=0.75
pct_rejected=0.35, 0.5
Upon reading back, they will all be strings:
config['pct_accepted']='0.75'
config['pct_rejected']=['0.35', '0.5']
Is there a clean method of converting them to float, other than having to check whether they are scalar or list?
My attempt for now is:
for k in ['pct_accepted','pct_rejected']:
    if isinstance(config[k], list):
        config[k] = [float(item) for item in config[k]]
    elif isinstance(config[k], str):
        config[k] = float(config[k])
Doesn't look so neat.
Since you included the numpy tag:
In [161]: np.array('12.23',float).tolist()
Out[161]: 12.23
In [162]: np.array(['12.23','12.23'],float).tolist()
Out[162]: [12.23, 12.23]
short, sweet and overkill!
There's no clean way, simply because the conversion is not valid on a list: something has to look at the data type. You can hide that in a function, but it's still there.
You can shorten the code a bit by using the built-in map. Something such as list(map(float, config[k])) will perhaps make it look a little better to you.
You can also store the type in a variable, and test the variable twice, rather than using two isinstance calls. This saves a few characters; it doesn't scale well, but it works nicely for simple applications.
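Hiding the type check in a function, as suggested above, might look like the following sketch. to_float is a hypothetical helper name, and the config dict simply mirrors the values from the question:

```python
def to_float(value):
    """Convert a string, or a list of strings, to float(s)."""
    if isinstance(value, (list, tuple)):
        return [float(item) for item in value]
    return float(value)


config = {"pct_accepted": "0.75", "pct_rejected": ["0.35", "0.5"]}
for k in ["pct_accepted", "pct_rejected"]:
    config[k] = to_float(config[k])
print(config)  # {'pct_accepted': 0.75, 'pct_rejected': [0.35, 0.5]}
```

The check is still there, but it lives in one place and the loop stays a single line.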
Say, I'm going to construct a probably large dictionary in Python 3 for in-memory operations. The dictionary keys are integers, but I'm going to read them from a file as string at first.
As far as storage and retrieval are concerned, I wonder if it matters whether I store the dictionary keys as integers themselves, or as strings.
In other words, would leaving them as integers help with hashing?
Dicts are fast but can be heavy on the memory.
Normally it shouldn't be a problem but you will only know when you test.
I would advise first testing with 1,000 lines, 10,000 lines and so on, and looking at the memory footprint.
If you run out of memory and your data structure allows it maybe try using named tuples.
from collections import namedtuple
import csv
EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')
with open("employees.csv", newline="") as f:  # Python 3: text mode, not "rb"
    for emp in map(EmployeeRecord._make, csv.reader(f)):
        print(emp.name, emp.title)
(Example taken from the link)
If you have ascending integers you could also try to get more fancy by using the array module.
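One way the array idea could look, as a rough sketch: two parallel typed arrays, one holding the sorted integer keys and one holding the values, searched with bisect. The sample data and the lookup name are made up for illustration, and this only works if the keys really are sorted and the values fit a single numeric type:

```python
from array import array
from bisect import bisect_left

keys = array("q", [3, 8, 15, 42])          # sorted integer keys, 8 bytes each
values = array("d", [0.3, 0.8, 1.5, 4.2])  # value at the same index as its key


def lookup(key):
    """Return the value stored for key, or raise KeyError if it is absent."""
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return values[pos]
    raise KeyError(key)


print(lookup(15))  # 1.5
```

Lookups become O(log n) instead of the dict's O(1) on average, so this trades speed for a smaller memory footprint.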
Actually, string hashing is rather efficient in Python 3. I expected this to have the opposite outcome:
>>> from timeit import timeit
>>> timeit('d["1"];d["4"]', setup='d = {"1": 1, "4": 4}')
0.05167865302064456
>>> timeit('d[1];d[4]', setup='d = {1: 1, 4: 4}')
0.06110116100171581
You don't seem to have bothered benchmarking the alternatives. It turns out that the difference is quite slight, and I also find the differences inconsistent between runs. Besides, how the lookup is implemented is an implementation detail; since both integers and strings are immutable, they could possibly be compared as pointers.
What you should consider is which one is the natural choice of key. For example if you don't interpret the key as a number anywhere else there's little reason to convert it to an integer.
Additionally you should consider if you want to consider keys equal if their numeric value is the same or if they need to be lexically identical. For example if you would consider 00 the same key as 0 you would need to interpret it as integer and then integer is the proper key, if on the other hand you want to consider them different then it would be outright wrong to convert them to integers (as they would become the same then).
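A small illustration of that last point, with made-up keys: the strings "0" and "00" remain two separate keys, while their integer conversions collapse into one.

```python
raw_keys = ["0", "00", "7"]

as_strings = {key: True for key in raw_keys}
as_integers = {int(key): True for key in raw_keys}

print(len(as_strings))   # 3: "0" and "00" stay distinct
print(len(as_integers))  # 2: both become the integer 0
```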
I just finished LearnPythonTheHardWay as my intro to programming and set my mind on a sudoku-related project. I've been reading through the code of a Sudoku Generator that was uploaded here to learn some things, and I ran into the line available = set(range(1,10)). I read that as available = set([1, 2, 3, 4, 5, 6, 7, 8, 9]) but I'm not sure what set is.
I tried googling python set, looked through the code to see if set had been defined anywhere, and now I'm coming to you.
Thanks.
set is a built-in type. From the documentation:
A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
A set in Python is the collection used to mimic the mathematical notion of set. To put it very succinctly, a set is a list of unique objects, that is, it cannot contain duplicates, which a list can do.
A set is kind of like an unordered list, with unique elements. Documentation exists though, so I'm not sure why you couldn't find it:
https://docs.python.org/2/library/stdtypes.html#set
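For the sudoku line in the question, a short sketch of what a set makes easy. The used_in_row digits are hypothetical, chosen only to show the operations:

```python
available = set(range(1, 10))  # the digits 1..9, as in the question
used_in_row = {1, 4, 9}        # hypothetical digits already placed in a row

print(5 in available)                 # True: fast membership test
candidates = available - used_in_row  # set difference: digits still free
print(sorted(candidates))             # [2, 3, 5, 6, 7, 8]
```

sorted() is used for printing because a set itself has no defined order.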
To make it easy to understand, let's take a list:
a = [1,2,3,4,5,5,5,6,7,7,9]
print(list(set(a)))
The output will be:
[1, 2, 3, 4, 5, 6, 7, 9]
You can remove duplicate numbers using a set.
For more uses of set, refer to the docs.
Thanks to my friend here who reminded me about the lack of order.
In case the list 'a' was like
a = [7,7,5,5,5,1,2,3,4,6,9]
print(list(set(a)))
it will (in CPython, for small integers like these) still print the output as
[1, 2, 3, 4, 5, 6, 7, 9]
A set does not preserve the original order of the list.
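If the original order matters, one common alternative is to deduplicate through a dict instead of a set. This sketch assumes Python 3.7 or later, where regular dicts keep insertion order:

```python
a = [7, 7, 5, 5, 5, 1, 2, 3, 4, 6, 9]

# dict keys are unique and keep the order in which they were first inserted
unique_in_order = list(dict.fromkeys(a))
print(unique_in_order)  # [7, 5, 1, 2, 3, 4, 6, 9]
```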
For example, in the code below I would like to obtain the list [1,2,3] using x as a reference.
In [1]: pasta=[1,2,3]
In [2]: pasta
Out[2]: [1, 2, 3]
In [3]: x='pas'+'ta'
In [4]: x
Out[4]: 'pasta'
What you are trying to do is a bad practice.
What you really need is a dict:
>>> dct = {'pasta': [1,2,3]}
>>> x = 'pas' + 'ta'
>>> dct[x]
[1, 2, 3]
This is the right data structure for the actual task you're trying to achieve: using a string to access an object.
Other answers suggested (or just showed with a warning) different ways to do that. Since Python is a very flexible language, you can almost always find such different ways to follow for a given task, but "there should be one-- and preferably only one --obvious way to do it"[1].
All of them will do the work, but not without downsides:
locals() is less readable, needlessly complex and also open to risks in some cases (see Mark Byers' answer). If you use locals() you are going to mix the real variables with the database ones, which is messy.
eval() is plain ugly, is a "quick-and-dirty way to get some source code dynamically"[2] and a bad practice.
When in doubt about the right way to choose, trying to follow the Zen of Python might be a start.
And hey, even the InteractiveInterpreter could be used to access an object using a string, but that doesn't mean I'm going to.
As others pointed out, you should normally avoid doing this and just use either a dictionary (in an example like the one you give) or in some cases a list (for example, instead of using my_var1, my_var2, my_var3 -> my_vars).
However, if you still want to do that, you have a couple of options.
You could do:
locals()[x]
or
eval(x)  # Always make sure you do proper validation before using eval. A very powerful feature of Python imo, but very risky if used without care.
If the pasta is an object attribute you can get it safely by:
getattr(your_obj, x)
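A minimal sketch of the getattr route. Pantry is a hypothetical class invented for illustration; the optional third argument to getattr is a default returned when the attribute does not exist:

```python
class Pantry:  # hypothetical class, for illustration only
    def __init__(self):
        self.pasta = [1, 2, 3]


your_obj = Pantry()
x = 'pas' + 'ta'

print(getattr(your_obj, x))             # [1, 2, 3]
print(getattr(your_obj, 'rice', None))  # None: no such attribute, default used
```

Without the default, a missing attribute raises AttributeError, which is usually easier to handle than the side effects of eval.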
Well, to do what you literally asked for, you could use locals:
>>> locals()[x]
[1, 2, 3]
However it is almost always a bad idea to do this. As Sven Marnach pointed out in the comments: Keep data out of your variable names. Using variables as data could also be a security risk. For example, if the name of the variable comes from the user they might be able to read or modify variables that you never intended them to have access to. They just need to guess the variable name.
It would be much better to use a dictionary instead.
>>> your_dict = {}
>>> your_dict['pasta'] = [1, 2, 3]
>>> x = 'pas' + 'ta'
>>> your_dict[x]
[1, 2, 3]
Use this
hello = [1,2,3]
print(vars()['hello'])
Prints [1, 2, 3].