Python Lambda Comparing Dictionaries - python

I have some code here to compare two dictionaries using lambda and filter. Basically, I have a required-tags dictionary and a tags dictionary for each EC2 instance.
I need to be able to process two conditions. The first condition only checks whether all the keys in the required tags exist in the instance tags and are not blank.
requiredTags = {'Name' : ['WebSense','NAT-V2'] }
instanceTags = i['Instances'][0]['Tags']
requiredTagsPresent = filter(lambda x: x['Key'] in requiredTags and
                             x['Value'] != '', instanceTags)
The next condition is the most common - check whethere all the keys and their corresponding values are
requiredTagsPresent = filter(lambda x: x['Key'] in requiredTags and x['Value'] in requiredTags, instanceTags)
So far, I haven't been able to accomplish both of the above in a single script.
The last condition is the one I am having trouble with. I want to have a specific tag value that, if present, means we only check for the existence of the corresponding key, regardless of its value. I have no idea how to do something like that.
Any tips?

This kind of thing is much easier to do if you use the built-in function all rather than lambda and filter. To check if all the keys in required_tags exist in instance_tags and they are not blank, use:
all_present = all(k in instance_tags and instance_tags[k] for k in required_tags.keys())
To check whether all the keys and values in instance_tags are in required_tags, use:
all_present2 = all(k in required_tags and v in required_tags for k, v in instance_tags.items())
This assumes Python 3.
But I am not sure this is what you want, since your description of the second test condition has words left off at the end: "check whethere [sic] all the keys and their corresponding values are ". Are what? Also, when you told me in your comment what the structure of instance_tags was, you had unmatched square brackets. You said it was a dictionary but it looks like a list of dictionaries, each containing one item.
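If instance_tags really is a list of one-item Key/Value dictionaries, as the question's lambda suggests, the three conditions could be sketched as follows. This is only a sketch under assumptions: the sample tags, the function names, and the '*' wildcard marker (ANY_VALUE) are hypothetical and not part of the original question.

```python
# Assumed structure: a list of {'Key': ..., 'Value': ...} dicts, as the question's lambda implies.
instance_tags_list = [
    {'Key': 'Name', 'Value': 'WebSense'},
    {'Key': 'Owner', 'Value': 'ops'},
]

# Hypothetical marker: listing ANY_VALUE for a key means
# "the key must exist and be non-blank, but any value is accepted".
ANY_VALUE = '*'

required_tags = {'Name': ['WebSense', 'NAT-V2'], 'Owner': [ANY_VALUE]}


def tags_to_dict(tag_list):
    """Flatten the list of Key/Value pairs into a plain dictionary."""
    return {tag['Key']: tag['Value'] for tag in tag_list}


def keys_present(required, tags):
    """Condition 1: every required key exists and has a non-blank value."""
    return all(tags.get(key) for key in required)


def values_allowed(required, tags):
    """Conditions 2 and 3: every required key has an allowed value,
    or any non-blank value when the wildcard marker is listed."""
    for key, allowed in required.items():
        value = tags.get(key)
        if not value:
            return False
        if ANY_VALUE not in allowed and value not in allowed:
            return False
    return True


tags = tags_to_dict(instance_tags_list)
print(keys_present(required_tags, tags))    # True
print(values_allowed(required_tags, tags))  # True
```

Converting the list to a dictionary first keeps each check to a single pass over required_tags, and both checks can live in the same script.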

Related

How to look up keys in dictionary when shortened key names are in another list?

I'm trying to look up values in a dictionary by shortened keys (The keys in the dictionary are full length), where the shortened keys are in a different list.
For example, the list might be
names = ["Bob", "Albert", "Man", "Richard"]
and the dictionary I want to look up is as such:
location_dict = {"Bob Franklin":"NYC", "Albert":"Chicago", "Manfred":"San Fransisco", "Richard Walker":"Houston"}
The code I have is:
for name in names:
    if name.startswith(f'{name}') in location_dict:
        print(location_dict[name.startswith('{name}')])
But this doesn't work (because it's self-referential, and I need a list of full key names associated with the shortened ones; I understand that much). How do I get the values of a dictionary with shortened key names? I have a list of over 1,000 names like this, and I don't know the most efficient way to do this.
Expected output: If I look up "Bob" from the list of names as a key for the dictionary, then it should return "NYC" from the dictionary
Your condition is wrong. name.startswith(f'{name}') will always return True.
Try this:
for name in names:
    for k, v in location_dict.items():
        if k.startswith(name):
            print(v)
To stop after the first match, add a break statement:
for name in names:
    for k, v in location_dict.items():
        if k.startswith(name):
            print(v)
            break
Try this in just one line using any():
[v for k,v in location_dict.items() if any(k.startswith(name) for name in names)]
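To keep track of which shortened name produced which value, the same prefix test can be wrapped in a small helper. The following is a minimal sketch using the question's data; the function name lookup_by_prefix is hypothetical.

```python
names = ["Bob", "Albert", "Man", "Richard"]
location_dict = {
    "Bob Franklin": "NYC",
    "Albert": "Chicago",
    "Manfred": "San Fransisco",
    "Richard Walker": "Houston",
}


def lookup_by_prefix(short_names, mapping):
    """Return {short_name: value} using the first key that starts with each short name."""
    result = {}
    for short in short_names:
        # next() stops at the first matching key; None is used when nothing matches.
        result[short] = next(
            (value for key, value in mapping.items() if key.startswith(short)),
            None,
        )
    return result


locations = lookup_by_prefix(names, location_dict)
print(locations["Bob"])  # NYC
```

Each lookup scans the dictionary once, so the total work is roughly the number of names times the number of keys, which should be acceptable for about a thousand names.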

How to get the key element based on matching the key element in a dictionary?

I have a dictionary which looks like this:
dictionary={
"ABC-6m-RF-200605-1352": "s3://blabla1.com",
"ABC-3m-RF-200605-1352": "s3://blabla2.com",
"DEF-6m-RF-200605-1352": "s3://blabla3.com"
}
Now, I want to do a matching which takes input such as helper="ABC-6m" and tries to match this string to the key of the dictionary and returns the key (not the value)!
My code currently looks like this but it is not robust, i.e. sometimes it works and sometimes it does not:
dictionary_scan = dict((el, el[:7]) for el in dictionary)
#swapping key and value
dictionary_scan = dict((v, k) for k, v in dictionary.items())
#concat string components in helper variable
helper = 'ABC'+'-'+'6m'
out=list(value for key, value in dictionary_scan.items() if helper in key)
The expected output is: 'ABC-6m-RF-200605-1352'. Sometimes this works in my code but sometimes it does not. Is there a better and more robust way to do this?
If you make a dictionary that maps prefixes to full keys, you'll only be able to get one key with a given prefix.
If there can be multiple keys that start with helper, you need to check them all with an ordinary list comprehension.
out = [key for key in dictionary if key.startswith(helper)]
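As a sketch of a slightly stricter variant, the prefix can be required to end at a separator, so that a helper such as "ABC-6" does not match "ABC-6m-...". The function name keys_with_prefix and the assumption that "-" always separates the key components are illustrative, not part of the original question.

```python
dictionary = {
    "ABC-6m-RF-200605-1352": "s3://blabla1.com",
    "ABC-3m-RF-200605-1352": "s3://blabla2.com",
    "DEF-6m-RF-200605-1352": "s3://blabla3.com",
}


def keys_with_prefix(mapping, prefix, sep="-"):
    """Return keys equal to prefix, or starting with prefix followed by sep."""
    return [key for key in mapping if key == prefix or key.startswith(prefix + sep)]


helper = "ABC" + "-" + "6m"
print(keys_with_prefix(dictionary, helper))  # ['ABC-6m-RF-200605-1352']
```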

Read json key value as insensitive key

I need to be able to pull the value of the key 'irr' from this json address in python:
IRR = conparameters['components'][i]['envelope'][j]['irr']
Even if 'irr' is in any other case, like IRR, Irr, etc.
Is that easy?
There's nothing built-in that does it, you have to search for a matching key.
See Get the first item from an iterable that matches a condition for how to write a first() function that finds the first element of an iterable that matches a condition. I'll use that in the solution below.
cur = conparameters['components'][i]['envelope'][j]
key = first(cur.keys(), lambda k: k.lower() == 'irr')
IRR = cur[key]
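For a version that does not depend on an external first() helper, the built-in next() with a generator expression does the same search. This is a minimal sketch; the function name get_case_insensitive and the sample envelope dictionary are hypothetical.

```python
def get_case_insensitive(mapping, wanted, default=None):
    """Return the value of the first key that equals `wanted`, ignoring case."""
    wanted = wanted.lower()
    return next(
        (value for key, value in mapping.items() if key.lower() == wanted),
        default,
    )


# Hypothetical stand-in for conparameters['components'][i]['envelope'][j]
envelope = {"Name": "wall", "IRR": 0.07}
print(get_case_insensitive(envelope, "irr"))  # 0.07
```

If several keys differ only by case, this returns the first one encountered, and it returns the default when no key matches.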

Sorting a list of dict from redis in python

In my current project I generate a list of data; each entry comes from a key in Redis, in a dedicated DB where only one type of key exists.
import operator
import redis
r = redis.StrictRedis(host=settings.REDIS_AD, port=settings.REDIS_PORT, db='14')
item_list = []
keys = r.keys('*')
for key in keys:
    item = r.hgetall(key)
    item_list.append(item)
newlist = sorted(item_list, key=operator.itemgetter('Id'))
The code above lets me retrieve the data and create a list of dicts, each containing the information of one entry. The problem is that I would like to sort them by ID so they come out in order when displayed in the HTML table in my template, but the sorted function doesn't seem to work, since the table isn't sorted.
Any idea why the sorted line doesn't work? I suppose I'm missing something to make it work, but I can't find what.
EDIT:
Thanks to the answer in the comments: the problem was that my 'Id' comes out of Redis as a string and needed to be cast to int to be sorted.
key=lambda d: int(d['Id'])
All values returned from redis are apparently strings, and strings do not sort numerically (the comparison "10" < "2" evaluates to True).
Therefore you need to cast it to a numerical value, probably to int (since they seem to be IDs):
newlist = sorted(item_list, key=lambda d: int(d['Id']))
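The difference is easy to see without a Redis server. The sketch below uses a hypothetical item_list shaped like the hashes in the question, with the Id values stored as strings.

```python
# Hypothetical sample data: Ids stored as strings, as they come back from Redis.
item_list = [{"Id": "10"}, {"Id": "2"}, {"Id": "1"}]

# Sorting on the raw strings compares character by character.
by_string = sorted(item_list, key=lambda d: d["Id"])

# Casting to int restores numeric order.
by_number = sorted(item_list, key=lambda d: int(d["Id"]))

print([d["Id"] for d in by_string])  # ['1', '10', '2']
print([d["Id"] for d in by_number])  # ['1', '2', '10']
```

Note that under Python 3, redis-py returns bytes rather than str unless the client is created with decode_responses=True, in which case the hash field would be b'Id' instead of 'Id'; that may or may not apply to the setup in the question.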

Regular expressions matching words which contain the pattern but also the pattern plus something else

I have the following problem:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
I need to find which elements of list2 are in list1. In actual fact the elements of list1 correspond to a numerical value which I need to obtain then change. The problem is that 'xyz2' contains 'xyz' and therefore matches also with a regular expression.
My code so far (where 'data' is a python dictionary and 'specie_name_and_initial_values' is a list of lists where each sublist contains two elements, the first being specie name and the second being a numerical value that goes with it):
all_keys = list(data.keys())
for i in range(len(all_keys)):
    if all_keys[i]!='Time':
        #print all_keys[i]
        pattern = re.compile(all_keys[i])
        for j in range(len(specie_name_and_initial_values)):
            print(re.findall(pattern,specie_name_and_initial_values[j][0]))
Variations of the regular expression I have tried include:
pattern = re.compile('^'+all_keys[i]+'$')
pattern = re.compile('^'+all_keys[i])
pattern = re.compile(all_keys[i]+'$')
And I've also tried using 'in' as a qualifier (i.e. within a for loop)
Any help would be greatly appreciated. Thanks
Ciaran
----------EDIT------------
To clarify, my current code is below. It's used within a class/method-like structure.
def calculate_relative_data_based_on_initial_values(self,copasi_file,xlsx_data_file,data_type='fold_change',time='seconds'):
    copasi_tool = MineParamEstTools()
    data=pandas.io.excel.read_excel(xlsx_data_file,header=0)
    #uses custom class and method to get the list of lists from a file
    specie_name_and_initial_values = copasi_tool.get_copasi_initial_values(copasi_file)
    if time=='minutes':
        data['Time']=data['Time']*60
    elif time=='hour':
        data['Time']=data['Time']*3600
    elif time=='seconds':
        print('Time is already in seconds.')
    else:
        print('Not a valid time unit')
    all_keys = list(data.keys())
    species=[]
    for i in range(len(specie_name_and_initial_values)):
        species.append(specie_name_and_initial_values[i][0])
    for i in range(len(all_keys)):
        for j in range(len(specie_name_and_initial_values)):
            if all_keys[i] in species[j]:
                print(all_keys[i])
The table returned from pandas is accessed like a dictionary. I need to go to my data table, extract the headers (i.e. the all_keys bit), then look up the name of the header in the specie_name_and_initial_values variable and obtain the corresponding value (the second element within the specie_name_and_initial_value variable). After this, I multiply all values of my data table by the value obtained for each of the matched elements.
I'm most likely overcomplicating this. Do you have a better solution?
thanks
----------edit 2 ---------------
Okay, below are my variables
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
species = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
You don't need a regex to find common elements, set.intersection will find all elements in list2 that are also in list1:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
print(set(list2).intersection(list1))
set(['xyz'])
Also, if you wanted to compare 'xyz' to 'xyz2', you would use ==, not in, and the comparison would correctly return False.
You can also rewrite your own code a lot more succinctly:
for key in data:
    if key != 'Time':
        # compile the column name itself as the pattern
        pattern = re.compile(key)
        for name, _ in specie_name_and_initial_values:
            print(re.findall(pattern, name))
Based on your edit, you have somehow managed to turn lists into strings. One option is to strip the []:
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
specie_name_and_initial_values = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
specie_name_and_initial_values = set(s.strip("[]") for s in specie_name_and_initial_values)
print(all_keys.intersection(specie_name_and_initial_values))
Which outputs:
set([u'Cyp26_G_R1', u'Cyp26_G_rep1'])
FYI, if you had lists inside the set you would have gotten an error as lists are mutable so are not hashable.
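To go one step further and fetch the initial value for each matching header, the list of [name, value] pairs can be turned into a dictionary keyed on the stripped name, so that every lookup is an exact match. This is a sketch only: the numeric values are invented for illustration, and the pair layout follows the question's description of specie_name_and_initial_values.

```python
# Hypothetical sample data shaped like the question's variables.
all_keys = ["Cyp26_G_R1", "Cyp26_G_rep1", "Time"]
specie_name_and_initial_values = [
    ["[Cyp26_G_R1]", 2.0],
    ["[Cyp26_G_R1R2]", 5.0],
    ["[Cyp26_G_rep1]", 0.5],
]

# Strip the surrounding brackets once and key the values on the bare name.
initial_values = {
    name.strip("[]"): value for name, value in specie_name_and_initial_values
}

# Exact dictionary lookups: 'Cyp26_G_R1' can no longer match 'Cyp26_G_R1R2'.
matched = {key: initial_values[key] for key in all_keys if key in initial_values}
print(matched)  # {'Cyp26_G_R1': 2.0, 'Cyp26_G_rep1': 0.5}
```

The 'Time' column drops out on its own because it has no entry in initial_values, so no special case is needed for it.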
