Python: apply wildcard match to keys being read from dictionary

This is for a script I'm running in Blender, but the question pertains to the Python part of it. It's not specific to Blender.
The script is originally from this answer, and it replaces a given material (the key) with its newer equivalent (the value).
Here's the code:
import bpy

objects = bpy.context.selected_objects

mat_dict = {
    "SOLID-WHITE": "Sld_WHITE",
    "SOLID-BLACK": "Sld_BLACK",
    "SOLID-BLUE": "Sld_BLUE"
}

for obj in objects:
    for slot in obj.material_slots:
        slot.material = bpy.data.materials[mat_dict[slot.material.name]]
The snag is, how to handle duplicates when the scene may have not only objects with the material "SOLID-WHITE", but also "SOLID-WHITE.001", "SOLID-WHITE.002", and so on.
I was looking at this answer to a question about wildcards in Python, and it seems fnmatch might be well-suited for this task.
I've tried working fnmatch into the last line of the code. I've also tried wrapping the dictionary keys with it (very WET, I know). Neither of these approaches has worked.
How can I run a wildcard match on each dictionary key?
So for example, whether an object has "SOLID-WHITE" or "SOLID-WHITE"-dot-some-number, it will still be replaced with "Sld_WHITE"?

I have no clue about Blender so I'm not sure if I'm getting the problem right, but how about the following?
mat_dict = {
    "SOLID-WHITE": "Sld_WHITE",
    "SOLID-BLACK": "Sld_BLACK",
    "SOLID-BLUE": "Sld_BLUE"
}

def get_new_material(old_material):
    for k, v in mat_dict.items():
        # .split(".")[0] extracts the part to the left of the dot (if there is one)
        if old_material.split(".")[0] == k:
            return v
    return old_material

for obj in objects:
    for slot in obj.material_slots:
        new_material = get_new_material(slot.material.name)
        slot.material = bpy.data.materials[new_material]
Instead of the .split(".")[0] you could use startswith, or re.match by storing regexes as keys in your dictionary (a sketch follows the examples below). As you noticed in the comment, though, startswith could match too much, and the same would be the case for fnmatch.
Examples of the above function in action:
In [3]: get_new_material("SOLID-WHITE.001")
Out[3]: 'Sld_WHITE'
In [4]: get_new_material("SOLID-WHITE")
Out[4]: 'Sld_WHITE'
In [5]: get_new_material("SOLID-BLACK")
Out[5]: 'Sld_BLACK'
In [6]: get_new_material("test")
Out[6]: 'test'
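As a sketch of the regex-as-keys idea (the anchored patterns here are my assumption; adapt them to your material names):
import re

mat_patterns = {
    re.compile(r"SOLID-WHITE(\.\d+)?$"): "Sld_WHITE",
    re.compile(r"SOLID-BLACK(\.\d+)?$"): "Sld_BLACK",
    re.compile(r"SOLID-BLUE(\.\d+)?$"): "Sld_BLUE",
}

def get_new_material_re(old_material):
    # first pattern that matches from the start of the name wins
    for pattern, v in mat_patterns.items():
        if pattern.match(old_material):
            return v
    return old_material

print(get_new_material_re("SOLID-WHITE.001"))  # -> Sld_WHITE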

There are two ways you can approach this.
You can make a smart dictionary that matches vague names. Or you can change the key that is used to look up a color.
Here is an example of the first approach using fnmatch.
This approach changes the lookup time complexity from O(1) to O(n) when a color name contains a number. It extends UserDict with a __missing__ method, which gets called if the key is not found in the dictionary; __missing__ then compares every key with the given key using fnmatch.
from collections import UserDict
import fnmatch

import bpy

objects = bpy.context.selected_objects

class Colors(UserDict):
    def __missing__(self, key):
        for color in self.keys():
            if fnmatch.fnmatch(key, color + "*"):
                return self[color]
        raise KeyError(f"could not match {key}")

mat_dict = Colors({
    "SOLID-WHITE": "Sld_WHITE",
    "SOLID-BLACK": "Sld_BLACK",
    "SOLID-BLUE": "Sld_BLUE"
})

for obj in objects:
    for slot in obj.material_slots:
        slot.material = bpy.data.materials[mat_dict[slot.material.name]]
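Outside Blender, the fallback behaves like this (a quick illustrative check):
colors = Colors({"SOLID-WHITE": "Sld_WHITE"})
print(colors["SOLID-WHITE"])      # ordinary dict hit -> Sld_WHITE
print(colors["SOLID-WHITE.001"])  # falls through __missing__ -> Sld_WHITE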
Here is an example of the second approach using regex.
import re

import bpy

objects = bpy.context.selected_objects

mat_dict = {
    "SOLID-WHITE": "Sld_WHITE",
    "SOLID-BLACK": "Sld_BLACK",
    "SOLID-BLUE": "Sld_BLUE"
}

# matches any number of capital letters and dashes,
# optionally followed by a dot and any number of digits
# this pattern can match the following strings:
# ["AAAAA", "----", "AA-AA.00005"]
pattern = re.compile(r"([A-Z\-]+)(?:\.\d+)?")

for obj in objects:
    for slot in obj.material_slots:
        match = pattern.fullmatch(slot.material.name)
        if match:
            slot.material = bpy.data.materials[mat_dict[match.group(1)]]
        else:
            slot.material = bpy.data.materials[mat_dict[slot.material.name]]
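To see what the pattern extracts from a duplicated name (illustrative only):
match = pattern.fullmatch("SOLID-WHITE.001")
print(match.group(1))  # -> SOLID-WHITE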

Related

Reading from nested json and getting None type Error -> try/except

I am reading data from nested json with this code:
data = json.loads(json_file.json)
for nodesUni in data["data"]["queryUnits"]['nodes']:
try:
tm = (nodesUni['sql']['busData'][0]['engine']['engType'])
except:
tm = ''
try:
to = (nodesUni['sql']['carData'][0]['engineData']['producer']['engName'])
except:
to = ''
json_output_for_one_GU_owner = {
"EngineType": tm,
"EngineName": to,
}
I am having an issue with a None type error (e.g. nodesUni['sql']['busData'][0]['engine']['engType'] doesn't exist at all because there is no data), so I am using try/except. But my code is more complex, and having a try/except for every value is crazy. Is there any other option for dealing with this?
Error: "TypeError: 'NoneType' object is not subscriptable"
This is non-trivial as your requirement is to traverse the dictionaries without errors, and get an empty string value in the end, all that in a very simple expression like cascading the [] operators.
First method
My approach is to add a hook when loading the JSON file, so it creates infinitely nested default dictionaries:
import collections, json

def superdefaultdict():
    return collections.defaultdict(superdefaultdict)

def hook(s):
    c = superdefaultdict()
    c.update(s)
    return c

data = json.loads('{"foo":"bar"}', object_hook=hook)
print(data["x"][0]["zzz"])  # doesn't exist
print(data["foo"])          # exists
prints:
defaultdict(<function superdefaultdict at 0x000001ECEFA47160>, {})
bar
when accessing some combination of keys that don't exist (at any level), superdefaultdict recursively creates a defaultdict of itself (this is a nice pattern, you can read more about it in Is there a standard class for an infinitely nested defaultdict?), allowing any number of non-existing key levels.
Now the only drawback is that it returns a defaultdict(<function superdefaultdict at 0x000001ECEFA47160>, {}) which is ugly. So
print(data["x"][0]["zzz"] or "")
prints empty string if the dictionary is empty. That should suffice for your purpose.
Use it like this in your context:
import collections, json

def superdefaultdict():
    return collections.defaultdict(superdefaultdict)

def hook(s):
    c = superdefaultdict()
    c.update(s)
    return c

data = json.loads(json_file.json, object_hook=hook)
for nodesUni in data["data"]["queryUnits"]['nodes']:
    tm = nodesUni['sql']['busData'][0]['engine']['engType'] or ""
    to = nodesUni['sql']['carData'][0]['engineData']['producer']['engName'] or ""
Drawbacks:
It creates a lot of empty dictionaries in your data object. Shouldn't be a problem (except if you're very low in memory) as the object isn't dumped to a file afterwards (where the non-existent values would appear)
If a value already exists, trying to access it as a dictionary crashes the program
Also, if some value is 0 or an empty list, the or operator will pick "". This can be worked around with another wrapper that tests whether the object is an empty superdefaultdict instead. Less elegant but doable; a minimal sketch follows.
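For instance, a minimal sketch of such a wrapper (unwrap is a hypothetical name; it assumes empty superdefaultdicts only ever appear for missing paths):
def unwrap(value):
    # an empty superdefaultdict means the path was missing; real falsy
    # values such as 0 or [] are passed through untouched
    if isinstance(value, collections.defaultdict) and len(value) == 0:
        return ""
    return value

tm = unwrap(nodesUni['sql']['busData'][0]['engine']['engType'])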
Second method
Convert the access to your successive dictionaries into a string (for instance, just double-quote your expression, like "['sql']['busData'][0]['engine']['engType']"), parse it, and loop over the keys to get the data. If there's an exception, stop and return an empty string.
import json, re

def get(key, data):
    key_parts = [x.strip("'") if x.startswith("'") else int(x)
                 for x in re.findall(r"\[([^\]]*)\]", key)]
    try:
        for k in key_parts:
            data = data[k]
        return data
    except (KeyError, IndexError, TypeError):
        return ""
testing with some simple data:
data = json.loads('{"foo":"bar","hello":{"a":12}}')
print(get("['sql']['busData'][0]['engine']['engType']",data))
print(get("['hello']['a']",data))
print(get("['hello']['a']['e']",data))
we get: an empty string (some keys are missing), 12 (the path is valid), and an empty string (we tried to traverse an existing value that isn't a dict).
The syntax could be simplified (ex: "sql"."busData".0."engine"."engType") but would still have to retain a way to differentiate keys (strings) from indices (integers); a possible parser is sketched below.
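A possible parser for that simplified syntax (a sketch, under the assumption that keys never contain dots):
def parse_dotted(expr):
    parts = []
    for tok in expr.split("."):
        if tok.startswith('"') and tok.endswith('"'):
            parts.append(tok.strip('"'))  # quoted token: dictionary key
        else:
            parts.append(int(tok))        # bare token: list index
    return parts

print(parse_dotted('"sql"."busData".0."engine"."engType"'))
# ['sql', 'busData', 0, 'engine', 'engType']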
The second approach is probably the most flexible one.

How to map over a CommentedMap while preserving the comments/style?

Given a ruamel.yaml CommentedMap, and some transformation function f: CommentedMap → Any, I would like to produce a new CommentedMap with transformed keys and values, but otherwise as similar as possible to the original.
If I don't care about preserving style, I can do this:
result = {
    f(key): f(value)
    for key, value in my_commented_map.items()
}
If I didn't need to transform the keys (and I didn't care about mutating the original), I could do this:
for key, value in my_commented_map.items():
    my_commented_map[key] = f(value)
The style and comment information are each attached to the CommentedMap via special attributes. The style you can copy, but the comments are partly indexed to the key on whose line they occur, and if you transform that key, you also need to transform that indexed comment.
In your first example you apply f() to both key and value; I'll use separate functions in my example, all-capsing the keys and all-lowercasing the values (this of course only works on string type keys and values, so this is a restriction of the example, not of the solution).
import sys
import ruamel.yaml
from ruamel.yaml.comments import CommentedMap as CM
from ruamel.yaml.comments import Format, Comment

yaml_str = """\
# example YAML document
abc: All Strings are Equal # but some Strings are more Equal then others
klm: Flying Blue
xYz: the End # for now
"""

def fkey(s):
    return s.upper()

def fval(s):
    return s.lower()

def transform(data, fk, fv):
    d = CM()
    if hasattr(data, Format.attrib):
        setattr(d, Format.attrib, getattr(data, Format.attrib))
    ca = None
    if hasattr(data, Comment.attrib):
        setattr(d, Comment.attrib, getattr(data, Comment.attrib))
        ca = getattr(d, Comment.attrib)
    # as the key mapping could map new keys onto old keys, first gather everything
    key_com = {}
    for k in data:
        new_k = fk(k)
        d[new_k] = fv(data[k])
        if ca is not None and k in ca.items:
            key_com[new_k] = ca.items.pop(k)
    if ca is not None:
        assert len(ca.items) == 0
        ca._items = key_com  # set the attribute, not the read-only property
    return d

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
# the following will print any new CommentedMap with curly braces; it is just here
# to check that the style attribute copying works correctly, remove from real code
yaml.default_flow_style = True
data = transform(data, fkey, fval)
yaml.dump(data, sys.stdout)
which gives:
# example YAML document
ABC: all strings are equal # but some Strings are more Equal then others
KLM: flying blue
XYZ: the end # for now
Please note:
The above tries (and succeeds) to start a comment in the original column; if that is not possible, e.g. when a transformed key or value takes more space, it is pushed further to the right.
If you have a more complex datastructure, recursively walk the tree, descending into mappings and sequences. In that case it might be easier to store (key, value, comment) tuples, then pop() all the keys and reinsert the stored values (instead of rebuilding the tree). A rough sketch of such a walk follows.
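For what it's worth, a rough sketch of that recursive walk (a hypothetical helper, not exercised against every ruamel node type):
def walk(node, fk, fv):
    # mappings are rebuilt with transform from above, sequences are walked
    # element by element, scalars are returned untouched; for truly nested
    # data, transform itself would also need to call walk on its values
    # instead of applying fv directly
    if isinstance(node, CM):
        return transform(node, fk, fv)
    if isinstance(node, list):
        return [walk(item, fk, fv) for item in node]
    return node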

More pythonic way to replace keywords in a string?

I am attempting to wrap an API with the following function. The API has endpoints that look similar to this:
/users/{ids}
/users/{ids}/permissions
The idea is that I'll be able to pass a dictionary to my function that contains a list of ids and those will be formatted as the API expects:
users = {'ids': [1, 2, 3, 5]}
call_api('/users/{ids}/permissions', users)
Then in call_api, I currently do something like this:
def call_api(url, data):
    for k, value in data.items():
        if "{" + k + "}" in url:
            url = url.replace("{" + k + "}", ';'.join(str(x) for x in value))
            data.pop(k, None)
This works, but I can't imagine that if statement is efficient.
How can I improve it and have it work in both Python 2.7 and Python 3.5?
I've also been told that changing the dictionary while iterating is bad, but in my tests I've never had an issue. I am popping the value because I later check whether there are unexpected parameters (i.e. anything left in data). Is what I'm doing now the right way?
Instead of modifying a dictionary as you iterate over it, creating another object to hold the unused keys is probably the way to go. In Python 3.4+, at least, removing keys during iteration will raise a
RuntimeError: dictionary changed size during iteration.
def call_api(url, data):
    unused_keys = set()
    for k, value in data.items():
        key_pattern = "{" + k + "}"
        if key_pattern in url:
            formatted_value = ';'.join(map(str, value))
            url = url.replace(key_pattern, formatted_value)
        else:
            unused_keys.add(k)
    # returning both lets the caller check for unexpected parameters
    return url, unused_keys
Also, if you think that you're more likely to run into an unused key, reversing the conditions might be the way to go.
Here is the way to do it. First, the string is parsed for its keys. Then all keys not used in the url are remembered and saved aside. Lastly, the url is formatted with the given parameters of the dict. The function returns the unused variables and the formatted url. If you wish, you can remove the unused variables from the dict by iterating over them and deleting them from the dict.
Here's some documentation with examples regarding the format syntax.
import string

users = {'ids': [1, 2, 3, 5]}

def call_api(url, data):
    data_set = set(data)
    formatter = string.Formatter()
    used_set = {f[1] for f in formatter.parse(url) if f[1] is not None}
    unused_set = data_set - used_set
    formatted = url.format(**{k: ";".join(str(x) for x in v)
                              for k, v in data.items()})
    return unused_set, formatted
print(call_api('/users/{ids}/permissions', users))
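For reference, string.Formatter().parse yields (literal_text, field_name, format_spec, conversion) tuples, which is where the field names above come from:
import string

for literal, field, spec, conv in string.Formatter().parse('/users/{ids}/permissions'):
    print(repr(literal), field)
# '/users/' ids
# '/permissions' None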
You could use re.subn which returns the number of replacements made:
import re

def call_api(url, data):
    for k, value in list(data.items()):
        url, n = re.subn(r'\{%s\}' % k, ';'.join(str(x) for x in value), url)
        if n:
            del data[k]
    return url  # return the substituted url
Note that for compatibilty with both python2 and python3, it is also necessary to create a copy of the list of items when destructively iterating over the dict.
EDIT:
It seems the main bottleneck is checking that the key is in the url. The in operator is easily the most efficient way to do this, and is much faster than a regex for the simple pattern being used here. Recording the unused keys separately is also more efficient than destructive iteration, but it doesn't make as much difference (relatively speaking).
So: there's not much wrong with the original solution, but the one given by @wegry is the most efficient.
The formatting keys can be found with a regex and then compared to the keys in the dictionary. Your string is already set up to use str.format, so you apply a transformation to the values in data and then format the url with the result.
import re

from toolz import valmap

def call_api(url, data):
    unused = set(data) - set(re.findall(r'\{(\w+)\}', url))
    url = url.format_map(valmap(lambda v: ';'.join(map(str, v)), data))
    return url, unused
The usage looks like:
users = {'ids': [1, 2, 3, 5], 'unused_key': 'value'}
print(call_api('/users/{ids}/permissions', users))
# ('/users/1;2;3;5/permissions', {'unused_key'})
This isn't going to time that well, but it's concise. As noted in one of the comments, it seems unlikely that this method will be the bottleneck.

python return double entry in dictionary

I have been searching for hours and hours on this problem and tried everything possible, but I can't get it cracked; I am quite a dictionary noob.
I work with Maya and got clashing names of lights. This happens when you duplicate a group: all children are named the same as before, so having an ALL_KEY in one group results in a clashing name with a key_char in another group.
I need to identify a clashing short name and return the long name, so I can print that the long name is a double, or even run a cmds.select on it.
Unfortunately everything I find on this matter on the internet is about whether a list contains double values, returning only True or False, which is useless for me. So I tried list cleaning and list comparison, but I got stuck with a dictionary for maintaining long and short names at the same time.
I managed to fetch short names if they are duplicates and return them, but on the way the long name got lost, so of course I can't identify it clearly anymore.
import itertools
import fnmatch
import maya.cmds as mc

LIGHT_TYPES = ["spotLight", "areaLight", "directionalLight", "pointLight", "aiAreaLight", "aiPhotometricLight", "aiSkyDomeLight"]

# create dict
dblList = {'long' : 'short'}
for x in mc.ls(type=LIGHT_TYPES, transforms=True):
    y = x.split('|')[-1:][0]
    dblList['long','short'] = dblList.setdefault(x, y)

# reverse values with keys for easier detection
rev_multidict = {}
for key, value in dblList.items():
    rev_multidict.setdefault(value, set()).add(key)

# detect the doubles in the dict
# print [values for key, values in rev_multidict.items() if len(values) > 1]
flattenList = set(itertools.chain.from_iterable(values for key, values in rev_multidict.items() if len(values) > 1))

# so by now I got all the long names which clash in the scene already!
# means now I just need to make a for loop, strip away the pipes, ask if the object is already in the list, then return the path with the pipe, and ask if the object is in the light list and return the long name if so.
# but after many many hours I can't get this part working.

# as an example, until now print flattenList returns:
# set([u'ALL_blockers|ALL_KEY', u'ABCD_0140|scSet', u'SARAH_TOPShape', u'ABCD_0140|scChars', u'ALL|ALL_KEY', u'|scChars', u'|scSet', u'|scFX', ('long', 'short'), u'ABCD_0140|scFX'])
# we see ALL_KEY is double! and that's exactly what I need returned as the long name

# THIS IS THE PART THAT I CAN'T GET WORKING: CHECK IN THE LIST WHICH VALUES ARE DOUBLE IN THE LONG NAME AND RETURN THE SHORT NAME LIST.
# THE WHOLE DICTIONARY IS STILL COMPLETE AS
seen = set()
uniq = []
for x in dblList2:
    if x[0].split('|')[-1:][0] not in seen:
        uniq.append(x.split('|')[-1:][0])
        seen.add(x.split('|')[-1:][0])
thanks for your help.
I'm going to take a stab with this. If this isn't what you want let me know why.
If I have a scene with a hierarchy like this:
group1
    nurbsCircle1
group2
    nurbsCircle2
group3
    nurbsCircle1
I can run this (adjust ls() if you need it for selection or whatnot):
conflictObjs = {}
objs = cmds.ls(shortNames=True, transforms=True)
for obj in objs:
    if len(obj.split('|')) > 1:
        conflictObjs[obj] = obj.split('|')[-1]
And the output of conflictObjs will be:
# Dictionary of objects with non-unique short names
# {<long name>:<short name>}
{u'group1|nurbsCircle1': u'nurbsCircle1', u'group3|nurbsCircle1': u'nurbsCircle1'}
Showing me what objects don't have unique short names.
This will give you a list of all the lights which have duplicate short names, grouped by what the duplicated name is and including the full path of the duplicated objects:
def clashes_by_type(*types):
    long_names = cmds.ls(type=types, l=True) or []
    # get the parents from the lights, not using ls -type transform
    long_names = set(cmds.listRelatives(*long_names, p=True, f=True) or [])
    short_names = set([i.rpartition("|")[-1] for i in long_names])
    short_dict = dict()
    for sn in short_names:
        short_dict[sn] = [i for i in long_names if i.endswith("|" + sn)]
    clashes = dict((k, v) for k, v in short_dict.items() if len(v) > 1)
    return clashes

clashes_by_type('directionalLight', 'ambientLight')

The main points to note:
Work down from long names: short names are inherently unreliable!
When deriving the short names, include the last pipe so you don't get accidental overlaps of common names.
The values in short_dict will always be lists, since they're created by a comprehension.
Once you have a dict of (name, [objects with that short name]) pairs, it's easy to get clashes by looking for values longer than 1.

Lookup for a key in dictionary with... regular expressions?

I have a dictionary that has the following structure: the key is a link between a source and a destination, the value is the instance of an object wire.
wire_dict = { 'source1_destination1_1' : object,
              'source1_destination1_2' : object,
              'source2_destination1_3' : object,
              'source2_destination1_4' : object,
              'source2_destination2_1' : object,
              'source2_destination2_2' : object }
Let's suppose that I only have a destination value, and with that I want to find, perhaps with regular expressions, the key that has destination1_1. As you can see, the same source can have several destinations, but different sources cannot have the same destination. So I want to find the key that ends with the destination.
Since the wire_dict could contain a lot of key-value entries, please tell me how this approach can affect the performance of the application. Perhaps I should create another dictionary only for the relationship between source and destination?
UPDATE: I changed the dictionary to use tuples as keys:
wire_dict = { ('source1', 'destination1_1') : object1,
              ('source1', 'destination1_2') : object2,
              ('source2', 'destination1_3') : object3,
              ('source2', 'destination1_4') : object4,
              ('source2', 'destination2_1') : object5,
              ('source2', 'destination2_2') : object6 }
The logic of the application is the same. A destination cannot have more than one source, so only one match should be found when a destination is provided.
String searches through dict keys are going to take linear time with standard Python dictionaries, but they can be done with dict.keys() and the re module, as @avim helpfully said.
For the second concern, instead of string keys, how about having tuples as keys:
{(begin, end): connection_object}
It won't speed anything up (the search will likely stay linear), but it enables better code behind the logic you want to express; a minimal sketch follows.
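For example, a lookup over tuple keys might look like this (find_by_destination is a hypothetical name):
def find_by_destination(wire_dict, destination):
    # still a linear scan, but no string parsing: unpack each key and compare
    for (source, dest), obj in wire_dict.items():
        if dest == destination:
            return obj
    return None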
import re

wire_dict = {'source1_destination1_1' : 'object1',
             'source1_destination1_2' : 'object2',
             'source2_destination1_3' : 'object3',
             'source2_destination1_4' : 'object4',
             'source2_destination2_1' : 'object5',
             'source2_destination2_2' : 'object6'}

pattern = 'source1_destination1_1'
print([value for key, value in wire_dict.items() if re.search(pattern, key)])
Output:
['object1']
It's easy to run over all dict keys and find the ones that match your pattern, but it's slow for big dicts.
I think you need another dict with keys matching your destinations (as you thought); a sketch follows.
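A minimal sketch of that secondary index, assuming the tuple-key layout from the update (dest_index is a hypothetical name; it relies on the stated guarantee that each destination has only one source):
# build the index once: destination -> wire object
dest_index = {dest: obj for (source, dest), obj in wire_dict.items()}

# lookups are then O(1)
obj = dest_index.get('destination1_1')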
You just need str.endswith and to iterate over the dict checking each key.
print([k for k in wire_dict if k.endswith("destination1_1")])
If there is only ever one match, use next and a generator expression:
k = next((k for k in wire_dict if k.endswith("destination1_1")),"")
If you want the value, use wire_dict.get(k), in case there is no match and you get the empty string default returned from the next call.
In [18]: k = next((k for k in wire_dict if k.endswith("destination1_1")),"")
In [19]: wire_dict[k]
Out[19]: object
In [20]: k
Out[20]: 'source1_destination1_1'
You should also never use dict.keys in python2 unless you actually want a list. You can simply iterate over the dict object to access each key efficiently.
Object-oriented programming, my friend:
class Uberdict:
    def __init__(self, source, destination, obj):
        self.source, self.destination, self.obj = source, destination, obj

    def has_destination(self, destination):
        # True or False
        return self.destination == destination

    def has_source(self, source):
        return self.source == source

wire_object_list = [
    # list of the objects
]

# how to create them
example_obj = Uberdict(some_source, some_destination, some_instance)
wire_object_list.append(example_obj)

# filter
example_destination = 'some destination'
filtered_list = [item for item in wire_object_list if item.has_destination(example_destination)]
Only pseudo code; could have errors.
