Build a dictionary with a regular expression - python

I am getting some data through a serial connection which I'd like to process so I can do more with it.
My Python script gets the variable which looks like:
data = "P600F600"
and my goal is to get this:
finaldata = {
'P': 600,
'F': 600
}
I like regular expressions and my input format is very strict so I've devised this RegEx to grab the data:
/([A-Z])(\d+)/
Based on my limited knowledge of Python, I've devised this.
finaldata = eval( '{' + re.sub(r"([A-Z])(\d+)", r"'\1':\2,", data) + '}' )
but this is clearly a horrible and extremely hacky solution.

In this case, re.findall seems to be really helpful:
>>> import re
>>> re.findall(r'([A-Z])(\d+)', 'P600F600')
[('P', '600'), ('F', '600')]
It just so happens that a dict can be built from this directly:
>>> dict(re.findall(r'([A-Z])(\d+)', 'P600F600'))
{'P': '600', 'F': '600'}
Of course, this leaves you with string values rather than integer values. To get ints, you'd need to construct them more explicitly:
>>> items = re.findall(r'([A-Z])(\d+)', 'P600F600')
>>> {key: int(value) for key, value in items}
{'P': 600, 'F': 600}
Or, for compatibility with Python 2.6 and earlier:
>>> dict((key, int(value)) for key, value in items)
{'P': 600, 'F': 600}
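As a minimal sketch of how this could be wrapped up for the serial data in the question (Python 3; the names PAIR and parse_pairs and the strict flag are made up for illustration, and the format check assumes the input consists of nothing but letter-plus-digits pairs):

```python
import re

# One uppercase letter followed by one or more digits, as in the question.
PAIR = re.compile(r"([A-Z])(\d+)")


def parse_pairs(data, strict=True):
    """Turn a string such as 'P600F600' into {'P': 600, 'F': 600}."""
    # Optional sanity check: the whole string must be made of valid pairs.
    if strict and not re.fullmatch(r"(?:[A-Z]\d+)+", data):
        raise ValueError("unexpected input format: %r" % (data,))
    return {letter: int(digits) for letter, digits in PAIR.findall(data)}


print(parse_pairs("P600F600"))  # {'P': 600, 'F': 600}
```

With strict=False the function silently skips anything the pattern does not match, which is how plain re.findall behaves.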

Since findall already returns a sequence of two-element sequences:
re.findall(r'([A-Z])(\d+)', data) # [('P', '600'), ('F', '600')]
you may simply use the dict built-in function:
import re
dict(re.findall(r'([A-Z])(\d+)', data)) # {'P': '600', 'F': '600'}
Quoting docs:
If no positional argument is given, an empty dictionary is created. If
a positional argument is given and it is a mapping object, a
dictionary is created with the same key-value pairs as the mapping
object. Otherwise, the positional argument must be an iterable object.
Each item in the iterable must itself be an iterable with exactly two
objects. The first object of each item becomes a key in the new
dictionary, and the second object the corresponding value. If a key
occurs more than once, the last value for that key becomes the
corresponding value in the new dictionary.
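The last sentence of that quote matters if the same letter can appear twice in the input. A small illustration (Python 3.7+, where dictionaries keep insertion order; the sample pairs are invented):

```python
# 'P' appears twice; the later value replaces the earlier one.
pairs = [("P", "600"), ("F", "600"), ("P", "750")]
result = dict(pairs)
print(result)  # {'P': '750', 'F': '600'}
```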

Related

How to preserve quotes if the value is a character and remove the quotes if it is a number when creating a dictionary?

I need to be able to convert any string in the format
"var1=var a;var2=var b"
to a dictionary and I managed to do that as follows.
a = "a=b;b=2"
def str_to_conf_dict(input_str):
    return dict(u.split('=') for u in input_str.split(';'))
b = str_to_conf_dict(a)
result= {'a': 'b', 'b': '2'}
But the values in the dictionary are quoted strings regardless of whether var a and var b are numbers or letters.
I know that the values are always going to be a mix of characters and numbers (ints, floats, negatives). How can I keep the quotes if the value is a character and remove them if it is a number?
It is crucial to have the quotes only on characters, because I will pass the values to functions that only work if the values meet this criterion, and there is no way to modify that end.
Create a separate function for converting the value to its proper type.
Take a look at How to convert list of strings to their correct Python types?, where the answers use either ast.literal_eval or json.loads (amongst other solutions) to deserialize a string to a suitable Python object:
import json
def convert(val):
    try:
        return json.loads(val)
    except ValueError:
        return val # return as-is
Then apply that function on each of the values from the original string.
def str_to_conf_dict(input_str):
    d = {}
    for pair in input_str.split(";"):
        k, v = pair.split("=")
        d[k] = convert(v)
    return d
s = "a=b;b=2;c=-3;d=xyz;e=4ajkl;f=3.14"
print(str_to_conf_dict(s))
{'a': 'b', 'b': 2, 'c': -3, 'd': 'xyz', 'e': '4ajkl', 'f': 3.14}
All the numbers (ints, floats, and negative ones) should be converted to numbers, while others are retained as-is (as strings, with quotes).
If you want to (unnecessarily) force it into a one-liner, you'll need to set up a {key: convert(value)} dictionary comprehension. You can either .split twice to get each item of the pair:
def str_to_conf_dict(input_str):
    return {
        pair.split('=')[0]: convert(pair.split('=')[1])
        for pair in input_str.split(';')
    }
Or pair up the items from the .split('=') output. You can take inspiration from the pairwise recipe from the itertools package, or implement something simpler if you know the format is always going to be key=val:
def get_two(iterable):
    yield iterable[0], iterable[1]
def str_to_conf_dict(input_str):
    return {
        k: convert(v)
        for pair in input_str.split(';')
        for k, v in get_two(pair.split('='))
    }
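The answer above mentions ast.literal_eval as an alternative to json.loads. A rough sketch of that variant follows (Python 3; convert_literal is a hypothetical name, and restricting the result to int and float is an assumption based on the question, since literal_eval would otherwise also accept lists, dicts, booleans and so on):

```python
import ast


def convert_literal(val):
    """Return val as an int or float if it is a numeric literal, else unchanged."""
    try:
        parsed = ast.literal_eval(val)
    except (ValueError, SyntaxError, TypeError):
        return val  # not a Python literal at all, keep the string
    # Keep only real numbers; bool is excluded because it subclasses int.
    if isinstance(parsed, (int, float)) and not isinstance(parsed, bool):
        return parsed
    return val


def str_to_conf_dict(input_str):
    # split('=', 1) keeps any further '=' characters inside the value.
    pairs = (pair.split("=", 1) for pair in input_str.split(";"))
    return {key: convert_literal(value) for key, value in pairs}


print(str_to_conf_dict("a=b;b=2;c=-3;d=xyz;e=4ajkl;f=3.14"))
# {'a': 'b', 'b': 2, 'c': -3, 'd': 'xyz', 'e': '4ajkl', 'f': 3.14}
```

One difference from the json.loads version: values such as true or null stay strings here, whereas json.loads would turn them into True and None.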

Get the name of the instance of an object in python, when __str__ overridden?

I'm creating a simple container system, in which my objects (all children of a class called GeneralWidget) are grouped in some containers, which are in another set of containers, and so on until all is in one Global container.
I have a custom class, called GeneralContainer, in which I had to override the __str__ method to provide a descriptive name for my container, so I know what kind of objects or containers are stored inside it.
I am currently writing another class called ObjectTracker in which the positions of all my objects are stored. When a new object is created, it passes a list containing its name, in the __init__ method, to its "parent" in my hierarchy, which adds itself to the list and passes it on. At some point this list, with all objects that are above the newly created instance of GeneralWidget, will reach the global GeneralWidget (containing all containers and widgets), which can access the ObjectTracker object in my main().
This is the background of my problem. My ObjectTracker has a dictionary in which every "first-level container" is a key, and all objects inside such a container are stored in dictionaries as well. So I have many nested dictionaries.
As I don't know how many levels of containers there will be, I need a dynamic syntax that is independent of the number of dictionaries I need to pass through until I get to the place in the BIG dictionary that I want. A (static) call inside my ObjectRepository class would need to look something like this:
self._OBJECTREPOSITORY[firstlevelcontainer12][secondlevel8][lastlevel4] = myNewObject
with firstlevelcontainer12 containing secondlevel8, which contains lastlevel4, in which the new object should be placed.
But I know neither what the containers will be called nor how many there will be, so I decided to use exec() and compose a string with all names in it. Here is my actual code, the definition of ObjectTracker:
class ObjectTracker:
    def __init__(self):
        self._NAMEREPOSITORY = {}
    def addItem(self, pathAsList):
        usableList = list(reversed(pathAsList))
        string = "self._NAMEREPOSITORY"
        for thing in usableList:
            if usableList[-1] != [thing]:
                string += "[" + str(thing) + "]"
            else:
                string += "] = " + str(thing)
        print(string)
        exec(string)
The problem is that I have overridden the __str__ method of the classes GeneralContainer and GeneralWidget to give back a descriptive name. This came in VERY handy on many occasions, but now it has become a big problem. The code above only works if the custom name is the same as the name of the instance of the object (of course, I get why!).
The question is: does a built-in function exist to do the following?
>>> alis = ExampleClass()
>>> DOESTHISEXIST(alis)
'alis'
If not, how can I write a custom one without destroying my well-working naming system?
Note: Since I'm not exactly sure what you want, I'll attempt to provide a general solution.
First off, avoid eval/exec like the black plague. There are serious problems one encounters when using them, and there's almost always a better way. I propose one below.
You seem to want a way to reach a certain point in a nested dictionary given a list of specific keys. This can be done quite easily by using a for loop to step through the dictionary one key at a time. For example:
>>> def get_value(dictionary, keys):
...     value = dictionary
...     for key in keys:
...         value = value[key]
...     return value
...
>>> d = {'a': 1, 'b': {'c': 2, 'd': 3, 'e': {'f': 4, }, 'g': 5}}
>>> get_value(d, ('b', 'e', 'f'))
4
>>>
If you need to assign to a specific part of a certain nested dictionary, this can also be done using the above code:
>>> dd = get_value(d, ('b', 'e')) # grab a dictionary object
>>> dd
{'f': 4}
>>> dd['h'] = 6
>>> # the d dictionary is changed.
>>> d
{'a': 1, 'b': {'c': 2, 'd': 3, 'e': {'f': 4, 'h': 6}, 'g': 5}}
>>>
Below is a formalized version of the function above, with error testing and documentation (in a custom style):
NO_VALUE = object()

def traverse_mapping(mapping, keys, default=NO_VALUE):
    # Raw docstring: the traceback below contains a Windows path with backslashes.
    r"""
    Description
    -----------
    Given a - often nested - mapping structure and a list of keys, use the
    keys to walk through the mapping one level at a time and retrieve a
    certain key's value.

    If the function reaches a point where the mapping can no longer be
    traversed (i.e. the current value retrieved from the current mapping
    structure is itself not a mapping type) or a given key is found to
    be non-existent, a default value can be provided to return. If no
    default value is given, exceptions will be allowed to propagate as
    normal (a TypeError or KeyError respectively).

    Examples (In the form of a Python IDLE session)
    -----------------------------------------------
    >>> d = {'a': 1, 'b': {'c': 2, 'd': 3, 'e': {'f': 4, }, 'g': 5}}
    >>> traverse_mapping(d, ('b', 'e', 'f'))
    4
    >>> inner_d = traverse_mapping(d, ('b', 'e'))
    >>> inner_d
    {'f': 4}
    >>> inner_d['h'] = 6
    >>> d
    {'a': 1, 'b': {'c': 2, 'd': 3, 'e': {'f': 4, 'h': 6}, 'g': 5}}
    >>> traverse_mapping(d, ('b', 'e', 'x'))
    Traceback (most recent call last):
      File "<pyshell#14>", line 1, in <module>
        traverse_mapping(d, ('b', 'e', 'x'))
      File "C:\Users\Christian\Desktop\langtons_ant.py", line 33, in traverse_mapping
        value = value[key]
    KeyError: 'x'
    >>> traverse_mapping(d, ('b', 'e', 'x'), default=0)
    0
    >>>

    Parameters
    ----------
    - mapping : mapping
        Any map-like structure which supports key-value lookup.
    - keys : iterable
        An iterable of keys to use in traversing the given mapping.
    - default : object, optional
        Value to return instead of raising when traversal fails.
    """
    value = mapping
    for key in keys:
        try:
            value = value[key]
        except (TypeError, KeyError):
            if default is not NO_VALUE:
                return default
            raise
    return value
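The question was ultimately about storing a new object at a path of unknown depth, so here is a hedged sketch of the matching "setter" (Python 3). The name set_nested is made up, the strings stand in for the real container and widget objects, and creating missing levels with setdefault is an assumption about the desired behaviour:

```python
def set_nested(mapping, keys, value):
    """Store value at the path given by keys, creating missing levels."""
    *parents, last = keys  # raises ValueError if keys is empty
    current = mapping
    for key in parents:
        # Reuse the existing sub-dictionary or create an empty one.
        current = current.setdefault(key, {})
    current[last] = value


repo = {}
set_nested(repo, ["firstlevelcontainer12", "secondlevel8", "lastlevel4"], "myNewObject")
print(repo)
# {'firstlevelcontainer12': {'secondlevel8': {'lastlevel4': 'myNewObject'}}}
```

Because dictionary keys can be any hashable object, the path could hold the container objects themselves rather than their names, so no variable name ever has to be recovered and __str__ can stay overridden.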
I think you might be looking for vars().
a = 5
# prints the value of a
print(vars()['a'])
# prints all the currently defined variables
print(vars())
# this will throw an error since b is not defined
print(vars()['b'])
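Note that a namespace maps names to objects, not the other way round, so an object can have several names or none. The following sketch (Python 3; names_bound_to and Example are hypothetical, and the namespace is built by hand to keep the example self-contained) searches a namespace for every name bound to one object:

```python
def names_bound_to(obj, namespace):
    """Return every name in namespace that refers to exactly this object."""
    return sorted(name for name, value in namespace.items() if value is obj)


class Example:
    pass


alis = Example()
alias = alis  # a second name for the same object
namespace = {"alis": alis, "alias": alias, "other": Example()}
print(names_bound_to(alis, namespace))  # ['alias', 'alis']
```

Passing vars() or globals() instead of the hand-built dictionary would search the real namespace, but the result is still a list, which is why relying on "the" name of an instance is fragile.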

Iterating over a dictionary with value as a list

I've a dictionary 'mydict'.
{ 'a': ['xyz1', 'xyz2'],
'b': ['xyz3', 'xyz4'],
'c': ['xyz5'],
'd': ['xyz6']}
I'm trying to print out all the keys and values of this dictionary using the following code:
for username, details in mydict.iteritems():
    pprint.pprint(username + " " + details)
But, I'm getting the following error:
AttributeError: 'list' object has no attribute 'iteritems'
Any help would be appreciated.
This code works on your example
>>> import pprint
>>> mydict = {'a': ['xyz1', 'xyz2'],
...           'b': ['xyz3', 'xyz4'],
...           'c': ['xyz5'],
...           'd': ['xyz6']}
>>> for username, details in mydict.iteritems():
...     pprint.pprint((username, details))
...
('a', ['xyz1', 'xyz2'])
('c', ['xyz5'])
('b', ['xyz3', 'xyz4'])
('d', ['xyz6'])
The original line fails because the VALUE in each (KEY, VALUE) pair is a list, and a string cannot be concatenated with a list. Note that the AttributeError quoted in the question ('list' object has no attribute 'iteritems') would mean that mydict itself was a list, not a dictionary, when the loop ran.
In Python 2, mydict.items() creates a copy of each (KEY, VALUE) pair, which you can then print:
for key, value in mydict.items():
    print((key, value))
Of course, creating a copy with items() is memory-expensive if your dictionary is large. That is the big advantage of iteritems() in Python 2: it iterates through the dictionary lazily, at a lower memory cost.
Equally well, you could do the following:
for key in mydict:
    print((key, mydict[key]))
BUT (in Python) this is slower again, because every mydict[key] lookup has to hash the key and find it in the dictionary again. So it seems that iterating over the (KEY, VALUE) pairs is the best option here, as it gives you access to the values directly through tuple unpacking.
Raymond Hettinger (the iteritems and generator guru from Python) gave a great talk at US PyCon, which you can watch here: http://pyvideo.org/video/1780/transforming-code-into-beautiful-idiomatic-pytho
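For readers on Python 3, where iteritems() no longer exists and items() returns a lightweight view, the loop from the question might look like the sketch below. The helper name format_entries and the choice to join the list with commas are assumptions about the desired output:

```python
mydict = {
    "a": ["xyz1", "xyz2"],
    "b": ["xyz3", "xyz4"],
    "c": ["xyz5"],
    "d": ["xyz6"],
}


def format_entries(mapping):
    """Build one 'key value1, value2' line per dictionary entry."""
    # ', '.join turns each list of strings into text, avoiding str + list.
    return [
        "{} {}".format(username, ", ".join(details))
        for username, details in mapping.items()
    ]


for line in format_entries(mydict):
    print(line)
# a xyz1, xyz2
# b xyz3, xyz4
# c xyz5
# d xyz6
```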

retrieving keys from dictionaries depending on value in python

I'm trying to find the most efficient way in Python to create a dictionary of 'guids' (point ids in Rhino), retrieve them depending on the value(s) I assign them, change those value(s), and store them back in the dictionary. One catch is that in the Rhinoceros3d program the points have a randomly generated ID number which I don't know, so I can only look them up by the value I give them.
Are dictionaries the correct way? Should the guids be the values instead of the keys?
A very basic example:
arrPts=[]
arrPts = rs.GetPoints() # ---> creates a list of point-ids
ptsDict = {}
for ind, pt in enumerate(arrPts):
    ptsDict[pt] = ('A'+str(ind))
for i in ptsDict.values():
    if '1' in i :
        print ptsDict.keys()
How can I make the above code print the key that has the value '1', instead of all the keys? And then change that key's value from 1 to e.g. 2?
Any help with the general question would also be appreciated, so I know I'm heading in the right direction.
Thanks
Pav
You can use dict.items().
An example:
In [1]: dic={'a':1,'b':5,'c':1,'d':3,'e':1}
In [2]: for x,y in dic.items():
   ...:     if y==1:
   ...:         print x
   ...:         dic[x]=2
   ...:
a
c
e
In [3]: dic
Out[3]: {'a': 2, 'b': 5, 'c': 2, 'd': 3, 'e': 2}
dict.items() returns a list of (key, value) tuples in Python 2.x:
In [4]: dic.items()
Out[4]: [('a', 2), ('c', 2), ('b', 5), ('e', 2), ('d', 3)]
and in Python 3.x it returns an iterable view instead of a list.
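The same idea as a small reusable sketch in Python 3 (keys_with_value is a made-up name). Collecting the matching keys first and updating afterwards keeps the lookup separate from the modification:

```python
def keys_with_value(mapping, wanted):
    """Return the keys whose value equals wanted, in dictionary order."""
    return [key for key, value in mapping.items() if value == wanted]


dic = {"a": 1, "b": 5, "c": 1, "d": 3, "e": 1}
matches = keys_with_value(dic, 1)
print(matches)  # ['a', 'c', 'e']

# Change the value of every matching key from 1 to 2.
for key in matches:
    dic[key] = 2
print(dic)  # {'a': 2, 'b': 5, 'c': 2, 'd': 3, 'e': 2}
```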
I think you want the GUIDs to be values, not keys, since it looks like you want to look them up by something you assign... but it really depends on your use case.
# list of GUID's / Rhinoceros3d point ids
arrPts = ['D20EA4E1-3957-11d2-A40B-0C5020524153',
'1D2680C9-0E2A-469d-B787-065558BC7D43',
'ED7BA470-8E54-465E-825C-99712043E01C']
# reference each of these by a unique key
ptsDict = dict((i, value) for i, value in enumerate(arrPts))
# now `ptsDict` looks like: {0:'D20EA4E1-3957-11d2-A40B-0C5020524153', ...}
print(ptsDict[1]) # easy to "find" the one you want to print
# basically make both keys: `2`, and `1` point to the same guid
# Note: we've just "lost" the previous guid that the `2` key was pointing to
ptsDict[2] = ptsDict[1]
Edit:
If you were to use a tuple as the key to your dict, it would look something like:
ptsDict = {(loc, dist, attr3, attr4): 'D20EA4E1-3957-11d2-A40B-0C5020524153',
(loc2, dist2, attr3, attr4): '1D2680C9-0E2A-469d-B787-065558BC7D43',
...
}
As you know, tuples are immutable, so you can't change the key to your dict, but you can remove one key and insert another:
oldval = ptsDict.pop((loc2, dist2, attr3, attr4)) # remove old key and get value
ptsDict[(locx, disty, attr3, attr4)] = oldval # insert it back in with a new key
In order to have one key point to multiple values, you'd have to use a list or set to contain the guids:
{(loc, dist, attr3, attr4): ['D20E...', '1D2680...']}
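If the GUIDs have to stay as keys (because that is what Rhino hands back), one option is to build the reverse index once, with each label pointing to a list of GUIDs. This is only a sketch in Python 3; group_by_label and the labels A0/A1 are invented for illustration, and the GUID strings are the sample ones used above:

```python
from collections import defaultdict


def group_by_label(labels_by_guid):
    """Invert {guid: label} into {label: [guid, ...]}."""
    grouped = defaultdict(list)
    for guid, label in labels_by_guid.items():
        grouped[label].append(guid)
    return dict(grouped)


labels_by_guid = {
    "D20EA4E1-3957-11d2-A40B-0C5020524153": "A0",
    "1D2680C9-0E2A-469d-B787-065558BC7D43": "A1",
    "ED7BA470-8E54-465E-825C-99712043E01C": "A1",
}
guids_by_label = group_by_label(labels_by_guid)
print(guids_by_label["A1"])
# ['1D2680C9-0E2A-469d-B787-065558BC7D43', 'ED7BA470-8E54-465E-825C-99712043E01C']
```

The reverse index has to be rebuilt (or updated by hand) whenever a label changes, so it pays off mainly when lookups are much more frequent than changes.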

a(*{'q':'qqq'}): why does it only print the key?

def a(*x):
    print x
a({'q':'qqq'})
a(*{'q':'qqq'})#why only print key.
output:
({'q': 'qqq'},)
('q',)
That's how dictionaries get converted to sequences:
tuple(dictionary) == tuple(dictionary.keys())
For a similar reason,
for x in dictionary:
assigns keys, not pairs, to x.
When you're calling a function, using an asterisk before a list or dict will pass it in as positional parameters.
For example:
>>> a(*('test', 'testing'))
('test', 'testing')
>>> a(*{'a': 'b', 'c': 'd'})
('a', 'c')
Using * in front of an expression in a function call iterates over the value of the expression (your dict, in this case) and makes each item in the iteration another parameter to the function invocation. Iterating over a dict in Python yields the keys (for better or worse).
Iterating a dictionary will yield its keys.
d = {'a': 1, 'b': 2, 'c': 3}
for x in d:
    print x # prints a, b, c but not necessarily in that order
sorted(d) # Gives ['a', 'b', 'c'] in that order. No 1/2/3.
If you want to get both keys and values from a dictionary, you can use .items() or .iteritems()
sorted(d.items()) # [('a', 1), ('b', 2), ('c', 3)]
You are asking for a list of arguments, and then telling python to send a dict as a sequence of arguments. When a dict is converted to a sequence, it uses the keys.
I guess you are really looking for **, not *.
a(*{'q' : 'qqq'})
will try to expand your dict ({'q':'qqq'}) into an itemized list of arguments for the function.
Note that:
tuple({'q' : 'qqq'})
returns ('q',), which is exactly what you're seeing. When you coerce a dictionary to a list/tuple, you only get the list of keys.
Probably because that's what a dictionary returns when you do a standard iteration over it. It gets converted to a sequence containing its keys. This example exhibits the same behaviour:
>>> for i in {"a": "1", "b": "2"}:
...     print i
...
a
b
To get what I assume you expect you would pass it as variable keyword arguments instead, like this:
>>> def a(**kwargs):
...     print kwargs
...
>>> a(**{"a": "1", "b": "2"})
{'a': '1', 'b': '2'}
Note that you are now basically back where you began and have gained nothing.
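To see the different unpacking forms side by side, here is a short Python 3 sketch (show is a hypothetical helper that simply returns what it received):

```python
def show(*args, **kwargs):
    """Return the positional and keyword arguments exactly as received."""
    return args, kwargs


d = {"q": "qqq"}
print(show(d))           # (({'q': 'qqq'},), {})      the dict itself, one argument
print(show(*d))          # (('q',), {})               iteration yields keys only
print(show(*d.items()))  # ((('q', 'qqq'),), {})      one (key, value) tuple per item
print(show(**d))         # ((), {'q': 'qqq'})         keys become keyword names
```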
