Accessing and extracting a dictionary set within my dictionary - python

I have a dictionary within a dictionary like below:
My code looks for todays date and if it exists in dict it should print “found a match at” + dict(name)
today = datetime.datetime.now()
if today in dict[name]:
print "Found a bday for " + str(dict[name])
#here i need a variable that will hold the email from that name so I can use to somewhere else
If I excuse print dict
It looks like this:
{'name1': set(['01-25', 'name1#company.com']), 'name2': set(['name2#company.com', '11-29']), 'name3': set(['08-15', 'name3#company.com']), 'name4': set(['01-24', 'name4#company.com'])}
My questions is how do I access the email. I understood the hard way that the values in my dictionary are currently a ‘set’ that I can’t access by indexing. Once I do print dict[name][1] then I get the error TypeError:
'set' object does not support indexing

Sets are unordered data structures, so you cannot access them with the index.
Even if you typecast them to a list, and then use a index,i.e. list(dict[name])[1] it is not guaranteed to return correct element, as the order of the list is not defined.
Your way to pick out matching element can be to iterate the set, and select the element which matches conditions like type and any possible regex, but that is also prone to error, unless your sets have few elements of different types.

You are using a set when you create the dict, where you should be using a list or a tuple.
dict.setdefault(name, [emails,bdays])

You could use something like this:
date = today.strftime('%m-%d')
for names in var.keys():
if date in var[names]:
for x in var[names]:
if x != date:
email = x
print "Found a bday for " + email
But there are better options to store this information.

Related

How to access single items in a list with a lenght of 1?

I have imported a json file and converted it to a python dict. From there I extracted the dict I needed and imported it into a list. Now I would like to manipulate the items further, for example only extract 'player_fullname' : 'Lenny Hampel'. However, it seems that there is another dict inside the list, which len is 1, and I cannot access it.
It is a list with 1 entry and the entry is the dict. Easiest way would be:
player = player[0] #get the dict
print(player['player_fullname'])
Check player_fullname whether in your dict and then compare name value.
[item for item in your_dict if item.get('player_fullname', '') == 'Lenny Hampel']

Thingspeak: Parse json response with Python

I would like to create an Alexa skill using Python to use data uploaded by sensors to Thingspeak. The cases where I only use one specific value is quite easy, the response from Thingspeak is the value only. When I want to use several values, in my case to sum up the athmospheric pressure to determine tendencies, teh response is a json object like this:
{"channel":{"id":293367,"name":"Weather Station","description":"My first attempt to build a weather station based on an ESP8266 and some common sensors.","latitude":"51.473509","longitude":"7.355569","field1":"humidity","field2":"pressure","field3":"lux","field4":"rssi","field5":"temp","field6":"uv","field7":"voltage","field8":"radiation","created_at":"2017-06-25T07:35:37Z","updated_at":"2018-08-04T12:11:22Z","elevation":"121","last_entry_id":1812},"feeds":
[{"created_at":"2018-10-21T18:11:45Z","entry_id":1713,"field2":"1025.62"},
{"created_at":"2018-10-21T18:12:05Z","entry_id":1714,"field2":"1025.58"},
{"created_at":"2018-10-21T18:12:25Z","entry_id":1715,"field2":"1025.56"},
{"created_at":"2018-10-21T18:12:45Z","entry_id":1716,"field2":"1025.65"},
{"created_at":"2018-10-21T18:13:05Z","entry_id":1717,"field2":"1025.58"},
{"created_at":"2018-10-21T18:13:25Z","entry_id":1718,"field2":"1025.63"}]
I now started with
f = urllib.urlopen(link) # Get your data
json_object = json.load(f)
for entry in json_object[0]
print entry["field2"]
The json object is a bit recursive, it is a list containing a list with an element with an array as the value.
Now I am not quite sure how to iterate over the values of the key "field2" in the array. I am quite new to Python and also json. Perhaps anyone can help me out?
Thanks in advance!
This has nothing to do with json - once the json string parsed by json.load(), what you get is a plain python object (usually a dict, sometimes a list, rarely - but this would be legal - a string, int, float, boolean or None).
it is a list containing a list with an element with an array as the value.
Actually it's a dict with two keys "channel" and "feeds". The first one has another dict for value, and the second a list of dicts. How to use dicts and lists is extensively documented FWIW
https://docs.python.org/3/tutorial/datastructures.html#dictionaries
https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
https://docs.python.org/3/tutorial/introduction.html#lists
https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
Here the values you're looking for are stored under the "field2" keys of the dicts in the "feeds" key, so what you want is:
# get the list stored under the "feeds" key
feeds = json_object["feeds"]
# iterate over the list:
for feed in feeds:
# get the value for the "field2" key
print feed["field2"]
You have a dictionary. Use key to access the value
Ex:
json_object = {"channel":{"id":293367,"name":"Weather Station","description":"My first attempt to build a weather station based on an ESP8266 and some common sensors.","latitude":"51.473509","longitude":"7.355569","field1":"humidity","field2":"pressure","field3":"lux","field4":"rssi","field5":"temp","field6":"uv","field7":"voltage","field8":"radiation","created_at":"2017-06-25T07:35:37Z","updated_at":"2018-08-04T12:11:22Z","elevation":"121","last_entry_id":1812},"feeds":
[{"created_at":"2018-10-21T18:11:45Z","entry_id":1713,"field2":"1025.62"},
{"created_at":"2018-10-21T18:12:05Z","entry_id":1714,"field2":"1025.58"},
{"created_at":"2018-10-21T18:12:25Z","entry_id":1715,"field2":"1025.56"},
{"created_at":"2018-10-21T18:12:45Z","entry_id":1716,"field2":"1025.65"},
{"created_at":"2018-10-21T18:13:05Z","entry_id":1717,"field2":"1025.58"},
{"created_at":"2018-10-21T18:13:25Z","entry_id":1718,"field2":"1025.63"}]}
for entry in json_object["feeds"]:
print entry["field2"]
Output:
1025.62
1025.58
1025.56
1025.65
1025.58
1025.63
I just figured it out, it was just like expected.
You have to get the entries array from the dict and than iterate over the list of items and print the value to the key field2.
# Get entries from the response
entries = json_object["feeds"]
# Iterate through each measurement and print value
for entry in entries:
print entry['field2']

Regular expressions matching words which contain the pattern but also the pattern plus something else

I have the following problem:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
I need to find which elements of list2 are in list1. In actual fact the elements of list1 correspond to a numerical value which I need to obtain then change. The problem is that 'xyz2' contains 'xyz' and therefore matches also with a regular expression.
My code so far (where 'data' is a python dictionary and 'specie_name_and_initial_values' is a list of lists where each sublist contains two elements, the first being specie name and the second being a numerical value that goes with it):
all_keys = list(data.keys())
for i in range(len(all_keys)):
if all_keys[i]!='Time':
#print all_keys[i]
pattern = re.compile(all_keys[i])
for j in range(len(specie_name_and_initial_values)):
print re.findall(pattern,specie_name_and_initial_values[j][0])
Variations of the regular expression I have tried include:
pattern = re.compile('^'+all_keys[i]+'$')
pattern = re.compile('^'+all_keys[i])
pattern = re.compile(all_keys[i]+'$')
And I've also tried using 'in' as a qualifier (i.e. within a for loop)
Any help would be greatly appreciated. Thanks
Ciaran
----------EDIT------------
To clarify. My current code is below. its used within a class/method like structure.
def calculate_relative_data_based_on_initial_values(self,copasi_file,xlsx_data_file,data_type='fold_change',time='seconds'):
copasi_tool = MineParamEstTools()
data=pandas.io.excel.read_excel(xlsx_data_file,header=0)
#uses custom class and method to get the list of lists from a file
specie_name_and_initial_values = copasi_tool.get_copasi_initial_values(copasi_file)
if time=='minutes':
data['Time']=data['Time']*60
elif time=='hour':
data['Time']=data['Time']*3600
elif time=='seconds':
print 'Time is already in seconds.'
else:
print 'Not a valid time unit'
all_keys = list(data.keys())
species=[]
for i in range(len(specie_name_and_initial_values)):
species.append(specie_name_and_initial_values[i][0])
for i in range(len(all_keys)):
for j in range(len(specie_name_and_initial_values)):
if all_keys[i] in species[j]:
print all_keys[i]
The table returned from pandas is accessed like a dictionary. I need to go to my data table, extract the headers (i.e. the all_keys bit), then look up the name of the header in the specie_name_and_initial_values variable and obtain the corresponding value (the second element within the specie_name_and_initial_value variable). After this, I multiply all values of my data table by the value obtained for each of the matched elements.
I'm most likely over complicating this. Do you have a better solution?
thanks
----------edit 2 ---------------
Okay, below are my variables
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
species = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
You don't need a regex to find common elements, set.intersection will find all elements in list2 that are also in list1:
list1=['xyz','xyz2','other_randoms']
list2=['xyz']
print(set(list2).intersection(list1))
set(['xyz'])
Also if you wanted to compare 'xyz' to 'xyz2' you would use == not in and then it would correctly return False.
You can also rewrite your own code a lot more succinctly, :
for key in data:
if key != 'Time':
pattern = re.compile(val)
for name, _ in specie_name_and_initial_values:
print re.findall(pattern, name)
Based on your edit you have somehow managed to turn lists into strings, one option is to strip the []:
all_keys = set([u'Cyp26_G_R1', u'Cyp26_G_rep1', u'Time'])
specie_name_and_initial_values = set(['[Cyp26_R1R2_RARa]', '[Cyp26_SRC3_1]', '[18-OH-RA]', '[p38_a]', '[Cyp26_G_rep1]', '[Cyp26]', '[Cyp26_G_a]', '[SRC3_p]', '[mRARa]', '[np38_a]', '[mRARa_a]', '[RARa_pp_TFIIH]', '[RARa]', '[Cyp26_G_L2]', '[atRA]', '[atRA_c]', '[SRC3]', '[RARa_Ser369p]', '[p38]', '[Cyp26_mRNA]', '[Cyp26_G_L]', '[TFIIH]', '[Cyp26_SRC3_2]', '[Cyp26_G_R1R2]', '[MSK1]', '[MSK1_a]', '[Cyp26_G]', '[Basal_Kinases]', '[Cyp26_R1_RARa]', '[4-OH-RA]', '[Cyp26_G_rep2]', '[Cyp26_Chromatin]', '[Cyp26_G_R1]', '[RXR]', '[SMRT]'])
specie_name_and_initial_values = set(s.strip("[]") for s in specie_name_and_initial_values)
print(all_keys.intersection(specie_name_and_initial_values))
Which outputs:
set([u'Cyp26_G_R1', u'Cyp26_G_rep1'])
FYI, if you had lists inside the set you would have gotten an error as lists are mutable so are not hashable.

How to efficiently select entries by date in python?

I have emails and dates. I can use 2 nested for loops to choose emails sent on same date, but how can i do it 'smart way' - efficiently?
# list of tuples - (email,date)
for entry in list_emails_dates:
current_date = entry[1]
for next_entry in list_emails_dates:
if current_date = next_entry[1]
list_one_date_emails.append(next_entry)
I know it can be written in shorter code, but I don't know itertools, or maybe use map, xrange?
You can just convert this to a dictionary, by collecting all emails related to a date into the same key.
To do this, you need to use defaultdict from collections. It is an easy way to give a new key in a dictionary a default value.
Here we are passing in the function list, so that each new key in the dictionary will get a list as the default value.
emails = defaultdict(list)
for email,email_date in list_of_tuples:
emails[email].append(email_date)
Now, you have emails['2013-14-07'] which will be a list of emails for that date.
If we don't use a defaultdict, and do a dictionary comprehension like this:
emails = {x[1]:x[0] for x in list_of_tuples}
You'll have one entry for each date, which will be the last email for that that, since assigning to the same key will override its value. A dictionary is the most efficient way to lookup a value by a key. A list is good if you want to lookup a value by its position in a series of values (assuming you know its position).
If for some reason you are not able to refactor it, you can use this template method, which will create a generator:
def find_by_date(haystack, needle):
for email, email_date in haystack:
if email_date == needle:
yield email
Here is how you would use it:
>>> email_list = [('foo#bar.com','2014-07-01'), ('zoo#foo.com', '2014-07-01'), ('a#b.com', '2014-07-03')]
>>> all_emails = list(find_by_date(email_list, '2014-07-01'))
>>> all_emails
['foo#bar.com', 'zoo#foo.com']
Or, you can do this:
>>> july_first = find_by_date(email_list, '2014-07-01')
>>> next(july_first)
'foo#bar.com'
>>> next(july_first)
'zoo#foo.com'
I would do an (and it's good to try using itertools)
itertools.groupby(list_of_tuples, lambda x: x[1])
which gives you the list of emails grouped by the date (x[1]). Note that when you do it you have to sort it regarding the same component (sorted(list_of_tuples, lambda x: x[1])).
One nice thing (other than telling the reader that we do a group) is that it works lazily and, if the list is kind of long, its performance is dominated by n log n for the sorting instead of n^2 for the nested loop.

use slice in for loop to build a list

I would like to build up a list using a for loop and am trying to use a slice notation. My desired output would be a list with the structure:
known_result[i] = (record.query_id, (align.title, align.title,align.title....))
However I am having trouble getting the slice operator to work:
knowns = "output.xml"
i=0
for record in NCBIXML.parse(open(knowns)):
known_results[i] = record.query_id
known_results[i][1] = (align.title for align in record.alignment)
i+=1
which results in:
list assignment index out of range.
I am iterating through a series of sequences using BioPython's NCBIXML module but the problem is adding to the list. Does anyone have an idea on how to build up the desired list either by changing the use of the slice or through another method?
thanks zach cp
(crossposted at [Biostar])1
You cannot assign a value to a list at an index that doesn't exist. The way to add an element (at the end of the list, which is the common use case) is to use the .append method of the list.
In your case, the lines
known_results[i] = record.query_id
known_results[i][1] = (align.title for align in record.alignment)
Should probably be changed to
element=(record.query_id, tuple(align.title for align in record.alignment))
known_results.append(element)
Warning: The code above is untested, so might contain bugs. But the idea behind it should work.
Use:
for record in NCBIXML.parse(open(knowns)):
known_results[i] = (record.query_id, None)
known_results[i][1] = (align.title for align in record.alignment)
i+=1
If i get you right you want to assign every record.query_id one or more matching align.title. So i guess your query_ids are unique and those unique ids are related to some titles. If so, i would suggest a dictionary instead of a list.
A dictionary consists of a key (e.g. record.quer_id) and value(s) (e.g. a list of align.title)
catalog = {}
for record in NCBIXML.parse(open(knowns)):
catalog[record.query_id] = [align.title for align in record.alignment]
To access this catalog you could either iterate through:
for query_id in catalog:
print catalog[query_id] # returns the title-list for the actual key
or you could access them directly if you know what your looking for.
query_id = XYZ_Whatever
print catalog[query_id]

Categories

Resources