PyMongo: iterating over a list of dictionaries inside a document - python

Let us say I have a big collection including documents such as these:
{"_id": 0,
"name":"John Doe",
"items": [
{"x":5,
"y":8,
"z":9},
{"x":4,
"y":2,
"z":1},
{"x":3,
"y":5,
"z":8}
]
}
I am fetching the collection into a variable called 'data' with pymongo and trying to iterate over the items in order to update some. I have tried to print those values with the following code:
for i in data:
    for j in data[i].get("items"):
        pprint(j.get("x"))
and it gave an error:
pymongo.errors.InvalidOperation: cannot set options after executing query
Even if I get the items, I do not see a way to change data. Why is this happening and how can I iterate through items and change their values?

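What does your db query look like?
Iterating over a cursor yields the documents themselves, so i is already a document. Indexing the cursor again with data[i] after iteration has started is most likely what raises the error.
Try something like this for iteration:
from pprint import pprint
cursor = db.get_collection('some_collection').find()
for doc in cursor:
    for item in doc['items']:
        pprint(item.get('x'))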

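Let's say you want to change the value of every x. Here data is a single document (a dict), not the cursor:
for i in data.get('items'):
    # to change the values in items
    # this will replace the value of key x in every item
    i['x'] = new_value
This simply assigns a new value to the x key of each item. Note that it only changes the in-memory dictionary; to persist the change you still have to write it back to the collection.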
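To make the "change and write back" step concrete, here is a minimal sketch. It assumes a collection object that offers find() and update_one(), which PyMongo collections do. FakeCollection and set_all_x are hypothetical names introduced only so the example runs without a database; with a real collection you would pass that object instead.

```python
import copy


class FakeCollection:
    """Tiny in-memory stand-in for a collection (illustration only)."""

    def __init__(self, docs):
        self.docs = {d["_id"]: d for d in docs}

    def find(self):
        # Return copies, so edits only become visible after update_one.
        return [copy.deepcopy(d) for d in self.docs.values()]

    def update_one(self, query, update):
        # Only supports {"_id": ...} queries and a "$set" update.
        self.docs[query["_id"]].update(update["$set"])


def set_all_x(collection, new_value):
    """Set x in every item of every document and write the change back."""
    for doc in collection.find():
        for item in doc["items"]:
            item["x"] = new_value
        # Replace the whole items array of this document.
        collection.update_one({"_id": doc["_id"]},
                              {"$set": {"items": doc["items"]}})


coll = FakeCollection([
    {"_id": 0, "name": "John Doe",
     "items": [{"x": 5, "y": 8, "z": 9},
               {"x": 4, "y": 2, "z": 1},
               {"x": 3, "y": 5, "z": 8}]},
])
set_all_x(coll, 0)
print(coll.docs[0]["items"])
```

Replacing the whole array is the simplest approach for small documents; it does rewrite every item, even the ones that did not change.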

Related

Python: Trying to extract a value from a list of dictionaries that is stored as a string

I am getting data from an API and storing it in json format. The data I pull is in a list of dictionaries. I am using Python. My task is to only grab the information from the dictionary that matches the ticker symbol.
This is the short version of my data printing using json dumps
[
{
"ticker": "BYDDF.US",
"name": "BYD Co Ltd-H",
"price": 25.635,
"change_1d_prc": 9.927101200686117
},
{
"ticker": "BYDDY.US",
"name": "BYD Co Ltd ADR",
"price": 51.22,
"change_1d_prc": 9.843448423761526
},
{
"ticker": "TSLA.US",
"name": "Tesla Inc",
"price": 194.7,
"change_1d_prc": 7.67018746889343
}
]
The task is to get only the dictionary for ticker = TSLA.US. If possible, I'd like to get only the price associated with this ticker.
I don't know how to reference "ticker" or loop through all of them to get the one I need.
I tried the following, but it says that it's a string, so it doesn't work:
if "ticker" == "TESLA.US":
    print(i)
Try (mylist is your list of dictionaries)
for entry in mylist:
    print(entry['ticker'])
Then try this to get what you want:
for entry in mylist:
    if entry['ticker'] == 'TSLA.US':
        print(entry)
This is a solution that I've seen divide the Python community. Some say that it's a feature and "very pythonic"; others say that it's a bad design choice we're stuck with now, and bad practice. I'm personally not a fan, but it is a way to solve this problem, so do with it what you will. :)
Python for loops don't create a new scope; the loop variable persists even after the loop ends. So this is a solution. Assuming that your list of dictionaries is stored as json_dict:
for target_dict in json_dict:
    if target_dict["ticker"] == "TSLA.US":
        break
At this point, target_dict will be the dictionary you want, provided a match exists. If no entry matches, target_dict is simply the last dictionary in the list.
It is possible to iterate through a list of dictionaries using a for loop. Since return only works inside a function, wrap the loop in one:
def get_price(stocks):
    for stock in stocks:
        if stock["ticker"] == "TSLA.US":
            return stock["price"]
    return None  # no matching ticker found
This loops through every item in the list, and for each item (which is a dictionary) looks up the key "ticker" and checks whether its value is equal to "TSLA.US". If it is, the function returns the value associated with the "price" key.
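Since the title mentions that the data is stored as a string, here is a small sketch of the whole path: parse the string with json.loads, then pick out one entry. The variable raw and the helper find_by_ticker are names made up for this example, and the lookup uses next() with a default rather than the loops shown above.

```python
import json

# Assumed input: the API response as a JSON string.
raw = '''
[
  {"ticker": "BYDDF.US", "name": "BYD Co Ltd-H", "price": 25.635},
  {"ticker": "BYDDY.US", "name": "BYD Co Ltd ADR", "price": 51.22},
  {"ticker": "TSLA.US", "name": "Tesla Inc", "price": 194.7}
]
'''


def find_by_ticker(entries, ticker):
    """Return the first dict whose "ticker" matches, or None if there is none."""
    return next((e for e in entries if e["ticker"] == ticker), None)


entries = json.loads(raw)  # now a list of dicts, not a string
tsla = find_by_ticker(entries, "TSLA.US")
price = tsla["price"] if tsla is not None else None
print(price)
```

Returning None for a missing ticker avoids the "last element" surprise of the break-based loop, at the cost of an explicit None check by the caller.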

Parse a json/dictionary with same key values

I currently have a list variable that looks like this:
list_of_dicts = [{"Away_Team":"KC", "Home_Team":"NYY"},
{"Away_Team":"TB", "Home_Team":"MIA"},
{"Away_Team":"TOR", "Home_Team":"BOS"},
]
As you can see, there are multiple keys with the same names, pertaining to the game matchups.
When I try to use:
print(json.dumps(list_of_dicts[0], indent=4, sort_keys=True))
...it only prints out the first matchup due to the same keys:
{
"Away_Team": "KC",
"Home_Team": "NYY"
}
How can I convert this list_of_dicts variable into something like the following output so I can use it like a valid dictionary or json object?
{
"Away_Team_1":"KC", "Home_Team_1":"NYY",
"Away_Team_2":"TB", "Home_Team_2":"MIA",
"Away_Team_3":"TOR", "Home_Team_3":"BOS",
}
This output doesn't need to be exactly that if a better solution is available; this is just to give you an idea of how I'd like to be able to parse the data.
The list_of_dicts variable can vary in size. I've shown 3 matchups here, but it could contain 1 or 10, so the solution needs to handle that dynamically.
You can add suffixes to the keys with enumerate:
list_of_dicts2 = [{f"{k}_{i}":v for k,v in d.items()} for i,d in enumerate(list_of_dicts, start=1)]
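Note that the comprehension above still produces a list of dictionaries, one per matchup. If a single flat dictionary like the one in the question is wanted, a nested dict comprehension is one way to get it. This is only a sketch; merged is a name chosen for the example.

```python
import json

list_of_dicts = [
    {"Away_Team": "KC", "Home_Team": "NYY"},
    {"Away_Team": "TB", "Home_Team": "MIA"},
    {"Away_Team": "TOR", "Home_Team": "BOS"},
]

# One flat dict: each key gets the 1-based position of its matchup as a suffix.
merged = {
    f"{key}_{i}": value
    for i, d in enumerate(list_of_dicts, start=1)
    for key, value in d.items()
}

print(json.dumps(merged, indent=4))
```

Because the suffix comes from enumerate, this works for any number of matchups.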
One option is to use pandas:
import pandas as pd
pd.DataFrame(list_of_dicts).to_csv('filename.csv', index=False)
gives
Away_Team,Home_Team
KC,NYY
TB,MIA
TOR,BOS
Now the index is implied by the row, and if you load it back in you'll have those indices. Pandas also supports to_json if you are set on using JSON. You can even recover your original list from a DataFrame using .to_dict(orient='records').
Data structure is important. You really don't need a dictionary for this. Simply reduce the data to a list of tuples, where the first slot is always the away team and the second is the home team.
list_of_dicts = [{"Away_Team":"KC", "Home_Team":"NYY"},
                 {"Away_Team":"TB", "Home_Team":"MIA"},
                 {"Away_Team":"TOR", "Home_Team":"BOS"},
                 ]
matchups = [tuple(d.values()) for d in list_of_dicts]
output:
[('KC', 'NYY'), ('TB', 'MIA'), ('TOR', 'BOS')]
The problem with your proposed solution is that iterating through dicts whose key names you don't know in advance is cumbersome. This solution makes the data structure easy to decipher, transform, or manipulate.

Check for string in list items using list as reference

I want to replace items in a list, using another list as a reference.
Take these example lists stored inside a dictionary:
dict1 = {
"artist1": ["dance pop","pop","funky pop"],
"artist2": ["chill house","electro house"],
"artist3": ["dark techno","electro techno"]
}
Then, I have this list as reference:
wish_list = ["house","pop","techno"]
My result should look like this:
dict1 = {
"artist1": ["pop"],
"artist2": ["house"],
"artist3": ["techno"]
}
I want to check if any of the items in wish_list appears inside one of the values of dict1. I experimented with regex and any().
This was an approach with just 1 list instead of a dictionary of multiple lists:
check = any(item in artist for item in wish_list)
if check == True:
    artist_genres.clear()
    artist_genres.append()
I am just beginning with Python on my own and am playing around with the SpotifyAPI to clean up my favorite songs into playlists. Thank you very much for your help!
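The idea is like this (note that "i in value" is an exact match against the genre list, which is why the sample data here contains plain "house" and "techno"):
dict1 = { "artist1" : ["dance pop","pop","funky pop"],
          "artist2" : ["house","electro house"],
          "artist3" : ["techno","electro techno"] }
wish_list = ["house","pop","techno"]
dict2 = {}
for key, value in dict1.items():
    for i in wish_list:
        if i in value:  # exact match with one of the genres
            dict2[key] = i
            break
print(dict2)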
A regex is not needed; you can get away with simply iterating over the list:
wish_list = ["house","pop","techno"]
dict1 = {
    "artist1": ["dance pop","pop","funky pop"],
    "artist2": ["chill house","electro house"],
    "artist3": ["dark techno","electro techno"]
}
dict1 = {
    # The key is reused as-is, no need to change it.
    # The new value is the wish list, filtered based on its presence in the current value
    key: [genre for genre in wish_list if any(genre in item for item in value)]
    for key, value in dict1.items()  # this method returns a tuple (key, value) for each entry in the dictionary
}
This implementation relies heavily on list comprehensions (and a dictionary comprehension); you might want to read up on them if they're new to you.
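If comprehensions are new to you, the same filtering can be spelled out with plain loops. The sketch below should behave the same way as the comprehension version; filter_genres is just a name picked for this example.

```python
def filter_genres(genres_by_artist, wish_list):
    """Keep, per artist, the wish-list genres found inside any of their genres."""
    result = {}
    for artist, genres in genres_by_artist.items():
        matched = []
        for wanted in wish_list:
            # Substring test: "pop" is found inside "dance pop".
            if any(wanted in genre for genre in genres):
                matched.append(wanted)
        result[artist] = matched
    return result


dict1 = {
    "artist1": ["dance pop", "pop", "funky pop"],
    "artist2": ["chill house", "electro house"],
    "artist3": ["dark techno", "electro techno"],
}
wish_list = ["house", "pop", "techno"]

cleaned = filter_genres(dict1, wish_list)
print(cleaned)
```

One thing to keep in mind: the substring test also matches inside longer words, so "pop" would match a genre like "k-pop". If that is not wanted, comparing against genre.split() instead is one option.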

Nested dictionary access is returning a list

The issue I'm having is that when I try to access the values within a nested dictionary, I cannot, because it's returning a list instead of a dictionary.
I have a .json file with this format:
{
"users": [
{
"1": {
"1": "value",
"2": "value"
}
}
]
}
I load the .json file and access the value I want by using this function:
def load_json(fn):
    with open(fn) as pf:
        data = json.load(pf)
    return data['users']['1']['2']
If I simply return data, it is a dictionary, but if I try to go further by adding ['users'], it turns into a list and gives an index error if I try to access key #1 or #2 inside of it.
My objective is to obtain the value of the nested key #2, for example, ideally without having to loop through it.
Your JSON contains an array (Python list) wrapping the inner dicts (that's what the [ and ] in your JSON indicate). All you need to do is change:
return data['users']['1']['2']
to:
return data['users'][0]['1']['2']
#                   ^^^ Added
to index the list to get into the inner dicts.
Given your data structure, and following it down:
data is a dictionary with one key 'users' whose value is a list
data['users'] is a list with one entry
data['users'][0] is a dictionary with one key '1' whose value is a dictionary
data['users'][0]['1'] is a dictionary with two keys '1' and '2'
So you need to do:
import json

def load_json(fn):
    with open(fn) as pf:
        data = json.load(pf)
    return data['users'][0]['1']['2']
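You can check that walk-through yourself by printing the type at each level. This sketch parses the same structure from a string with json.loads so that it runs without a file; the variable names are only for illustration.

```python
import json

# Same structure as the .json file in the question, as a string.
raw = '{"users": [{"1": {"1": "value", "2": "value"}}]}'

data = json.loads(raw)
users = data["users"]      # JSON array -> Python list
first_user = users[0]      # lists are indexed by position, not by key
inner = first_user["1"]    # JSON object -> Python dict, keys are strings
value = inner["2"]

for label, obj in [("data", data), ("users", users),
                   ("first_user", first_user), ("inner", inner)]:
    print(label, type(obj).__name__)
print(value)
```

Note that the keys are the strings "1" and "2", not integers, because JSON object keys are always strings.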

Getting specific field values from Json Python

I have a JSON file, and what I am trying to do is get the specific field '_id'. The problem is that when I use json.load('input_file'), it says that my variable data is a list, not a dictionary, so I can't do something like:
for value in data['_id']:
    print(data['_id'][i])
because I keep getting this error: TypeError: list indices must be integers or slices, not str
What I also tried to do is:
data = json.load(input_file)[0]
It kind of works. Now my type is a dictionary, and I can access it like this: data['_id']
But I only get the first '_id' from the file...
So, what I would like to do is add all of the '_id' values to a list, to use later.
input_file = open('input_file.txt')
data = json.load(input_file)[0]
print(data['_id'])  # only shows me the first '_id' value
Thanks for the help!
[{
"_id": "5436e3abbae478396759f0cf",
"name": "ISIC_0000000",
"updated": "2015-02-23T02:48:17.495000+00:00"
},
{
"_id": "5436e3acbae478396759f0d1",
"name": "ISIC_0000001",
"updated": "2015-02-23T02:48:27.455000+00:00"
},
{
"_id": "5436e3acbae478396759f0d3",
"name": "ISIC_0000002",
"updated": "2015-02-23T02:48:37.249000+00:00"
},
{
"_id": "5436e3acbae478396759f0d5",
"name": "ISIC_0000003",
"updated": "2015-02-23T02:48:46.021000+00:00"
}]
You want to print the _id of each element of your JSON list, so let's do it by simply iterating over the elements:
import json
with open('input_file.txt') as input_file:
    data = json.load(input_file)  # get the data list
for element in data:  # iterate over each element of the list
    # element is a dict
    element_id = element['_id']  # get the id
    print(element_id)  # print it
If you want to transform the list of elements into a list of ids for later use, you can use a list comprehension:
ids = [e['_id'] for e in data]  # get the id from each element and create a list of them
As you can see, the data is a list of dictionaries.
To loop over the data, use the following code:
for each in data:
    print(each['_id'])
    print(each['name'])
    print(each['updated'])
it says that my variable data is a list, not a dictionary, so I can't do something like:
for value in data['_id']:
    print(data['_id'][i])
Yes, but you can loop over all the dictionaries in your list and get the values for their '_id' keys. This can be done in a single line using list comprehension:
data = json.load(input_file)
ids = [value['_id'] for value in data]
print(ids)
['5436e3abbae478396759f0cf', '5436e3acbae478396759f0d1', '5436e3acbae478396759f0d3', '5436e3acbae478396759f0d5']
Another way to achieve this is to use Python's built-in map function:
ids = list(map(lambda value: value['_id'], data))
This creates a function, using a lambda expression, that returns the value of the key _id from a dictionary, and applies it to every item in data. In Python 3, map returns a lazy iterator rather than a list, so it is wrapped in list() to get an actual list.
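As a variation on the map approach, operator.itemgetter from the standard library can replace the lambda. The sketch below parses a shortened copy of the sample data from a string so that it runs on its own; raw and id_by_name are names made up for the example.

```python
import json
from operator import itemgetter

# Shortened copy of the sample data, as a JSON string.
raw = '''
[
  {"_id": "5436e3abbae478396759f0cf", "name": "ISIC_0000000"},
  {"_id": "5436e3acbae478396759f0d1", "name": "ISIC_0000001"},
  {"_id": "5436e3acbae478396759f0d3", "name": "ISIC_0000002"}
]
'''

data = json.loads(raw)

# itemgetter("_id") builds a callable equivalent to: lambda d: d["_id"]
ids = list(map(itemgetter("_id"), data))

# If the ids need to be looked up by name later, a dict may be handier.
id_by_name = {element["name"]: element["_id"] for element in data}

print(ids)
print(id_by_name["ISIC_0000001"])
```

Whether to keep a list or a dict depends on how the ids are used later: a list preserves order, while a dict gives direct lookup by name.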
