mongodb python get the element position from an array in a document - python

I use python + mongodb to store some item ranking data in a collection called chart
{
date: date1,
region: region1,
ranking: [
{
item: bson.dbref.DBRef(db.item.find_one()),
price: current_price,
version: '1.0'
},
{
item: bson.dbref.DBRef(db.item.find_another_one()),
price: current_price,
version: '1.0'
},
.... (and the array goes on)
]
}
Now my problem is, I want to make a history ranking chart for itemA. And according to the $ positional operator, the query should be something like this:
db.chart.find( {'ranking.item': bson.dbref.DBRef('item', itemA._id)}, ['$'])
And the $ operator doesn't work.
Any other possible solution to this?

The $ positional operator is only used in update(...) calls, you can't use it to return the position within an array.
However, you can use field projection to limit the fields returned to just those you need to calculate the position in the array from within Python:
db.foo.insert({
'date': '2011-04-01',
'region': 'NY',
'ranking': [
{ 'item': 'Coca-Cola', 'price': 1.00, 'version': 1 },
{ 'item': 'Diet Coke', 'price': 1.25, 'version': 1 },
{ 'item': 'Diet Pepsi', 'price': 1.50, 'version': 1 },
]})
db.foo.insert({
'date': '2011-05-01',
'region': 'NY',
'ranking': [
{ 'item': 'Diet Coke', 'price': 1.25, 'version': 1 },
{ 'item': 'Coca-Cola', 'price': 1.00, 'version': 1 },
{ 'item': 'Diet Pepsi', 'price': 1.50, 'version': 1 },
]})
db.foo.insert({
'date': '2011-06-01',
'region': 'NY',
'ranking': [
{ 'item': 'Coca-Cola', 'price': 1.00, 'version': 1 },
{ 'item': 'Diet Pepsi', 'price': 1.50, 'version': 1 },
{ 'item': 'Diet Coke', 'price': 1.25, 'version': 1 },
]})
def position_of(item, ranking):
for i, candidate in enumerate(ranking):
if candidate['item'] == item:
return i
return None
print [position_of('Diet Coke', x['ranking'])
for x in db.foo.find({'ranking.item': 'Diet Coke'}, ['ranking.item'])]
# prints [1, 0, 2]
In this (admittedly trivial) example, returning just a subset of fields may not show much benefit; however if your documents are especially large, doing may show performance improvements.

Related

How to return key/value pairs in nested dict whose value is a list that contains said dict

As you can see the question itself is convoluted, but so is this problem. I'm trying to get the key/value pairs for name: name, symbol: symbol, and price: price in this API output. I've tried several things but can't seem to get it straight. Any help is appreciated.
Here is the dict:
data = {
'coins':
[
{
'id': 'bitcoin',
'icon': 'https://static.coinstats.app/coins/Bitcoin6l39t.png',
'name': 'Bitcoin',
'symbol': 'BTC',
'rank': 1,
'price': 48918.27974434776,
'priceBtc': 1,
'volume': 44805486777.77573,
'marketCap': 924655133704.012,
'availableSupply': 18902037,
'totalSupply': 21000000,
'priceChange1h': 0.05,
'priceChange1d': 1.22,
'priceChange1w': -3.02,
'websiteUrl': 'http://www.bitcoin.org',
'twitterUrl': 'https://twitter.com/bitcoin',
'exp': [
'https://blockchair.com/bitcoin/',
'https://btc.com/', 'https://btc.tokenview.com/'
]
},
{
'id': 'ethereum',
'icon': 'https://static.coinstats.app/coins/EthereumOCjgD.png',
'name': 'Ethereum',
'symbol': 'ETH',
'rank': 2,
'price': 4037.7735178501143,
'priceBtc': 0.08254618952631135,
'volume': 37993691263.19707,
'marketCap': 479507778421.57574,
'availableSupply': 118755491.4365,
'totalSupply': 0,
'priceChange1h': 0.3,
'priceChange1d': 4.53,
'priceChange1w': -8.85,
'websiteUrl': 'https://www.ethereum.org/',
'twitterUrl': 'https://twitter.com/ethereum',
'contractAddress': '0x2170ed0880ac9a755fd29b2688956bd959f933f8',
'decimals': 18,
'exp': [
'https://etherscan.io/',
'https://ethplorer.io/',
'https://blockchair.com/ethereum',
'https://eth.tokenview.com/'
]
}
]
}
Here's one of my failed attempts:
print(data['coins'][1])
new_dict = data['coins'][1]
for i in new_dict:
a = i.split(',')
print(a)
I'm not sure what the expected output is since you do not clarify, though you can access each key-value for each stock as such:
for stock in data["coins"]:
print(stock["name"] + " - " + stock["symbol"] + " - " + str(stock["price"]))
Output:
Bitcoin - BTC - 48918.27974434776
Ethereum - ETH - 4037.7735178501143

Fastest way to get specific key from a dict if it is found

I am currently writing a scraper that reads from an API that contains a JSON. By doing response.json() it would return a dict where we could easily use the e.g response["object"]to get the value we want as I assume that converts it to a dict. The current mock data looks like this:
data = {
'id': 336461,
'thumbnail': '/images/product/123456?trim&h=80',
'variants': None,
'name': 'Testing',
'data': {
'Videoutgång': {
'Typ av gränssnitt': {
'name': 'Typ av gränssnitt',
'value': 'PCI Test'
}
}
},
'stock': {
'web': 0,
'supplier': None,
'displayCap': '50',
'1': 0,
'orders': {
'CL': {
'ordered': -10,
'status': 1
}
}
}
}
What I am looking after is that the API sometimes does contain "orders -> CL" but sometime doesn't . That means that both happy path and unhappy path is what I am looking for which is the fastest way to get a data from a dict.
I have currently done something like this:
data = {
'id': 336461,
'thumbnail': '/images/product/123456?trim&h=80',
'variants': None,
'name': 'Testing',
'data': {
'Videoutgång': {
'Typ av gränssnitt': {
'name': 'Typ av gränssnitt',
'value': 'PCI Test'
}
}
},
'stock': {
'web': 0,
'supplier': None,
'displayCap': '50',
'1': 0,
'orders': {
'CL': {
'ordered': -10,
'status': 1
}
}
}
}
if (
"stock" in data
and "orders" in data["stock"]
and "CL" in data["stock"]["orders"]
and "status" in data["stock"]["orders"]["CL"]
and data["stock"]["orders"]["CL"]["status"]
):
print(f'{data["stock"]["orders"]["CL"]["status"]}: {data["stock"]["orders"]["CL"]["ordered"]}')
1: -10
However my question is that I would like to know which is the fastest way to get the data from a dict if it is in the dict?
Lookups are faster in dictionaries because Python implements them using hash tables.
If we explain the difference by Big O concepts, dictionaries have constant time complexity, O(1). This is another approach using .get() method as well:
data = {
'id': 336461,
'thumbnail': '/images/product/123456?trim&h=80',
'variants': None,
'name': 'Testing',
'data': {
'Videoutgång': {
'Typ av gränssnitt': {
'name': 'Typ av gränssnitt',
'value': 'PCI Test'
}
}
},
'stock': {
'web': 0,
'supplier': None,
'displayCap': '50',
'1': 0,
'orders': {
'CL': {
'ordered': -10,
'status': 1
}
}
}
}
if (data.get('stock', {}).get('orders', {}).get('CL')):
print(f'{data["stock"]["orders"]["CL"]["status"]}: {data["stock"]["orders"]["CL"]["ordered"]}')
Here is a nice writeup on lookups in Python with list and dictionary as example.
I got your point. For this question, since your stock has just 4 values it is hard to say if .get() method will work faster than using a loop or not. If your dictionary would have more items then certainly .get() would have worked much faster but since there are few keys, using loop will not make much difference.

Python steamlit select box menu returns string, but I need dict or list

Stack on this case, Python steamlit select box menu returns string, but I need dict or list, to use it further in my code.
I want to see company1, company2, company3 in dropdown menu, and if user's choice was for example 'company2' get ['ID': 'zxc222’, 'NAME': 'company2','DESC': 'comp2'].
BaseObject = [{
'ID': 'zxc123',
'NAME': 'company1',
'DESC': 'comp1'
}, {
'ID': 'zxc222',
'NAME': 'company2',
'DESC': 'comp2'
}, {
'ID': 'zxc345',
'NAME': 'company3',
'DESC': 'comp3'
}]
lenbo = len(BaseObject)
options = []
for i in range(0, lenbo):
options.append((BaseObject[i])['NAME'])
st.selectbox('Subdivision:', options)
You can do the conversion to a dict after the selectbox:
import streamlit as st
BaseObject = [{
'ID': 'zxc123',
'NAME': 'company1',
'DESC': 'comp1'
}, {
'ID': 'zxc222',
'NAME': 'company2',
'DESC': 'comp2'
}, {
'ID': 'zxc345',
'NAME': 'company3',
'DESC': 'comp3'
}]
lenbo = len(BaseObject)
options = []
for i in range(0, lenbo):
options.append((BaseObject[i])['NAME'])
choice = st.selectbox('Subdivision:', options)
chosen_base_object = None
for base_object in BaseObject:
if base_object["NAME"] == choice:
chosen_base_object = dict(base_object)
print(chosen_base_object) # {'ID': 'zxc345', 'NAME': 'company3', 'DESC': 'comp3'}

How can I print specific integer variables in a nested dictionary by using Python?

This is my first question :)
I loop over a nested dictionary to print specific values. I am using the following code.
for i in lizzo_top_tracks['tracks']:
print('Track Name: ' + i['name'])
It works for string variables, but does not work for other variables. For example, when I use the following code for the date variable:
for i in lizzo_top_tracks['tracks']:
print('Album Release Date: ' + i['release_date'])
I receive a message like this KeyError: 'release_date'
What should I do?
Here is a sample of my nested dictionary:
{'tracks': [{'album': {'album_type': 'album',
'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'}],
'external_urls': {'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640,
'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'}]}
The code you posted isn't syntactically correct; running it through a Python interpreter gives a syntax error on the last line. It looks like you lost a curly brace somewhere toward the end. :)
I went through it and fixed up the white space to make the structure easier to see; the way you had it formatted made it hard to see which keys were at which level of nesting, but with consistent indentation it becomes much clearer:
lizzo_top_tracks = {
'tracks': [{
'album': {
'album_type': 'album',
'artists': [{
'external_urls': {
'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'
},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'
}],
'external_urls': {
'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'
},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640, 'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'
}
}]
}
So the first (and only) value you get for i in lizzo_top_tracks['tracks'] is going to be this dictionary:
i = {
'album': {
'album_type': 'album',
'artists': [{
'external_urls': {
'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'
},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'
}],
'external_urls': {
'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'
},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640, 'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'
}
}
The only key in this dictionary is 'album', the value of which is another dictionary that contains all the other information. If you want to print, say, the album release date and a list of the artists' names, you'd do:
for track in lizzo_top_tracks['tracks']:
print('Album Release Date: ' + track['album']['release_date'])
print('Artists: ' + str([artist['name'] for artist in track['album']['artists']]))
If these are dictionaries that you're building yourself, you might want to remove some of the nesting layers where there's only a single key, since they just make it harder to navigate the structure without giving you any additional information. For example:
lizzo_top_albums = [{
'album_type': 'album',
'artists': [{
'external_urls': {
'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'
},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'
}],
'external_urls': {
'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'
},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640, 'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'
}]
This structure allows you to write the query the way you were originally trying to do it:
for album in lizzo_top_albums:
print('Album Release Date: ' + album['release_date'])
print('Artists: ' + str([artist['name'] for artist in album['artists']]))
Much simpler, right? :)

Python - merge nested dictionaries

I have this data structure:
playlists_user1={'user1':[
{'playlist1':{
'tracks': [
{'name': 'Karma Police','artist': 'Radiohead', 'count': 1.0},
{'name': 'Bitter Sweet Symphony','artist': 'The Verve','count': 2.0}
]
}
},
{'playlist2':{
'tracks': [
{'name': 'We Will Rock You','artist': 'Queen', 'count': 3.0},
{'name': 'Roxanne','artist': 'Police','count': 5.0}
]
}
},
]
}
playlists_user2={'user2':[
{'playlist1':{
'tracks': [
{'name': 'Karma Police','artist': 'Radiohead', 'count': 2.0},
{'name': 'Sonnet','artist': 'The Verve','count': 4.0}
]
}
},
{'playlist2':{
'tracks': [
{'name': 'We Are The Champions','artist': 'Queen', 'count': 4.0},
{'name': 'Bitter Sweet Symphony','artist': 'The Verve','count': 1.0}
]
}
},
]
}
I would like to merge the two nested dictionaries into one single data structure, whose first item would be a playlist key, like so:
{'playlist1': {'tracks': [{'count': 1.0, 'name': 'Karma Police', 'artist': 'Radiohead'}, {'count': 2.0, 'name': 'Bitter Sweet Symphony', 'artist': 'The Verve'}]}}
I have tried:
prefs1 = playlists_user1['user1'][0]
prefs2 = playlists_user2['user2'][0]
prefs3 = prefs1.update(prefs2)
to no avail.
how do I solve this?
Since your playlist values are already a list of lists of dictionaries with one key (a bit obfuscated if you ask me):
combined = {}
for playlist in playlists_user1.values()[0]:
combined.update(playlist)
for playlist in playlists_user2.values()[0]:
combined.update(playlist)
or if you have many of these, make a list and:
combined = {}
for playlistlist in user_playlists:
for playlist in playlistlist.values()[0]:
combined.update(playlist)

Categories

Resources