How to extract a field value from UnQLite collection


from unqlite import UnQLite

db = UnQLite('test.db')
data = db.collection('data')
print(data.fetch(0))
This prints
{'id': b'abc', 'type': b'business', 'state': b'AZ', 'latitude': 33.3482589,
'name': b"ABC Restaurant", 'full_address': b'1835 E ABC Rd, Ste C109, Phoenix, AZ 85284',
'categories': [b'Restaurants', b'Buffets', b'Italian'],
'open': True, 'stars': 4, 'city': b'Phoenix', 'neighborhoods': [],
'__id': 0, 'review_count': 122, 'longitude': -111.9088346}
How do I fetch the value "Phoenix" for City?
type(data.fetch(0)) prints class 'dict'
I have been looking through the UnQLite documentation but am not finding much. Please help.

You already get a dict, so you only need to look up the key:
x = {'id': b'abc', 'type': b'business', 'state': b'AZ', 'latitude': 33.3482589,
'name': b"ABC Restaurant", 'full_address': b'1835 E ABC Rd, Ste C109, Phoenix, AZ 85284',
'categories': [b'Restaurants', b'Buffets', b'Italian'],
'open': True, 'stars': 4, 'city': b'Phoenix', 'neighborhoods': [],
'__id': 0, 'review_count': 122, 'longitude': -111.9088346}
x['city']
#b'Phoenix'
Here Phoenix is a bytes object, not a str, so if you want it as a string you can convert it with decode:
x['city'].decode()
#'Phoenix'
Or in your case:
data.fetch(0)['city'].decode()

I figured it out. Calling collection.fetch(0).get('city') returns the value.
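As a rough illustration of the answers above, the sketch below decodes every bytes value in a fetched record so that all fields read as ordinary strings. The helper name decode_record is hypothetical (it is not part of UnQLite), the sample dict is a shortened copy of the record in the question, and UTF-8 is an assumed encoding.

```python
def decode_record(value, encoding="utf-8"):
    """Recursively turn bytes into str inside dicts and lists."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, dict):
        return {key: decode_record(item, encoding) for key, item in value.items()}
    if isinstance(value, list):
        return [decode_record(item, encoding) for item in value]
    return value  # numbers, booleans and None are left untouched


# Shortened copy of the record shown in the question.
record = {'id': b'abc', 'city': b'Phoenix', 'stars': 4,
          'categories': [b'Restaurants', b'Buffets', b'Italian']}

clean = decode_record(record)
print(clean['city'])
# .get() returns a default instead of raising KeyError for a missing field.
print(clean.get('zip', 'n/a'))
```

With a real collection, the same helper could presumably be applied to the dict returned by data.fetch(0).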

Related

How to rename keys in a dictionary and make a dataframe of it?

I have a complex situation that I hope to solve, and the solution might benefit others too. I collected data from my API, added pagination, and put the complete data package in a tuple named q1. Finally, I made a dictionary named dict_1 from that tuple, which looks like this:
dict_1 = {100: {'ID': 100, 'DKSTGFase': None, 'DK': False, 'KM': None,
                'Country': {'Name': 'GE', 'City': {'Name': 'Berlin'}},
                'Type': {'Name': '219'}, 'DKObject': {'Name': '8555', 'Object': {'Name': 'Car'}},
                'Order': {'OrderId': 101, 'CreatedOn': '2018-07-06T16:54:36.783+02:00',
                          'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
                          'Name': 'Audi', 'Client': {'1'}}, 'DKComponent': {'Name': 'John'}},
          200: {'ID': 200, 'DKSTGFase': None, 'DK': False, 'KM': None,
                'Country': {'Name': 'ES', 'City': {'Name': 'Madrid'}}, 'Type': {'Name': '220'},
                'DKObject': {'Name': '8556', 'Object': {'Name': 'Car'}},
                'Order': {'OrderId': 102, 'CreatedOn': '2018-07-06T16:54:36.783+02:00',
                          'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
                          'Name': 'Mercedes', 'Client': {'2'}}, 'DKComponent': {'Name': 'Sergio'}}}
Please note that in the above dictionary I have just stated 2 records. The actual dictionary has 1400 records till it reaches ID 1500.
Now I want to do two things:
First, I want to change some keys for all the records: key DK has to become DK1, key Name in Country has to become Name1, and key Name in Object has to become Name2.
Second, I want to make a DataFrame of the whole set of data. My expected outcome is:
This is my code:
q1 = response_2.json()
next_link = q1['@odata.nextLink']
q1 = [tuple(q1.values())]
while next_link:
    new_response = requests.get(next_link, headers=headers, proxies=proxies)
    new_data = new_response.json()
    q1.append(tuple(new_data.values()))
    next_link = new_data.get('@odata.nextLink', None)
dict_1 = {
    record['ID']: record
    for tup in q1
    for record in tup[2]
}
#print(dict_1)
for x in dict_1.values():
    x['DK1'] = x['DK']
    x['Country']['Name1'] = x['Country']['Name']
    x['Object']['Name2'] = x['Object']['Name']
df = pd.DataFrame(dict_1)
When I run this, I receive the following error:
Traceback (most recent call last):
File "c:\data\FF\Desktop\Python\PythongMySQL\Talky.py", line 57, in <module>
x['Country']['Name1'] = x['Country']['Name']
TypeError: 'NoneType' object is not subscriptable

Working code:
import pandas as pd

lists = []
alldict = [{100: {'ID': 100, 'DKSTGFase': None, 'DK': False, 'KM': None,
                  'Country': {'Name': 'GE', 'City': {'Name': 'Berlin'}},
                  'Type': {'Name': '219'}, 'DKObject': {'Name': '8555', 'Object': {'Name': 'Car'}},
                  'Order': {'OrderId': 101, 'CreatedOn': '2018-07-06T16:54:36.783+02:00',
                            'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
                            'Name': 'Audi', 'Client': {'1'}}, 'DKComponent': {'Name': 'John'}}}]
for eachdict in alldict:
    key = list(eachdict.keys())[0]
    # Copy each value to its new key, then delete the old key.
    eachdict[key]['DK1'] = eachdict[key]['DK']
    del eachdict[key]['DK']
    eachdict[key]['Country']['Name1'] = eachdict[key]['Country']['Name']
    del eachdict[key]['Country']['Name']
    eachdict[key]['DKObject']['Object']['Name2'] = eachdict[key]['DKObject']['Object']['Name']
    del eachdict[key]['DKObject']['Object']['Name']
    lists.append([key, eachdict[key]['DK1'], eachdict[key]['KM'], eachdict[key]['Country']['Name1'],
                  eachdict[key]['Country']['City']['Name'], eachdict[key]['DKObject']['Object']['Name2'],
                  eachdict[key]['Order']['Client']])
pd.DataFrame(lists, columns=[<columnNamesHere>])
Output:
{100: {'ID': 100,
'DKSTGFase': None,
'KM': None,
'Country': {'City': {'Name': 'Berlin'}, 'Name1': 'GE'},
'Type': {'Name': '219'},
'DKObject': {'Name': '8555', 'Object': {'Name2': 'Car'}},
'Order': {'OrderId': 101,
'CreatedOn': '2018-07-06T16:54:36.783+02:00',
'ModifiedOn': '2018-07-06T16:54:36.783+02:00',
'Name': 'Audi',
'Client': {'1'}},
'DKComponent': {'Name': 'John'},
'DK1': False}}
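The TypeError in the question suggests that, for at least one record, x['Country'] is None; that is an assumption, since the full data is not shown. Note also that Object sits under DKObject in the sample data, not at the top level. A minimal sketch of a more defensive rename follows. The helpers rename_key and flatten are hypothetical names, and the second sample record with Country set to None is invented for illustration.

```python
def rename_key(mapping, old, new):
    """Rename old to new in place; do nothing if mapping is None or lacks old."""
    if isinstance(mapping, dict) and old in mapping:
        mapping[new] = mapping.pop(old)


def flatten(mapping, prefix=""):
    """Collapse nested dicts into one flat dict with dotted column names."""
    flat = {}
    for key, value in mapping.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


# Two shortened records; the second has Country set to None, which is
# an assumed cause of the TypeError in the question.
dict_1 = {
    100: {'ID': 100, 'DK': False,
          'Country': {'Name': 'GE', 'City': {'Name': 'Berlin'}},
          'DKObject': {'Name': '8555', 'Object': {'Name': 'Car'}}},
    200: {'ID': 200, 'DK': False, 'Country': None,
          'DKObject': {'Name': '8556', 'Object': {'Name': 'Car'}}},
}

for record in dict_1.values():
    rename_key(record, 'DK', 'DK1')
    rename_key(record.get('Country'), 'Name', 'Name1')
    rename_key((record.get('DKObject') or {}).get('Object'), 'Name', 'Name2')

# One flat dict per record, ready to be used as table rows.
rows = [flatten(record) for record in dict_1.values()]
print(rows[0])
```

A list of flat dicts like rows can be passed to pd.DataFrame(rows) to get one row per record, with missing columns filled in as NaN.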

Getting a KeyError: venues error in FourSquare/Python call

OK, I'm a newbie and I think I'm doing everything I should be, but I am still getting a KeyError: venues. (I also tried using "venue" instead and I am not at my maximum quota for the day at FourSquare)... I am using a Jupyter Notebook to do this
Using this code:
VERSION = '20200418'
RADIUS = 1000
LIMIT = 2
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, RADIUS, LIMIT)
url
results = requests.get(url).json()
I get 2 results (shown at end of this post)
When I try to take those results and put them into a dataframe, I get "KeyError: venues"
# assign relevant part of JSON to venues
venues = results['response']['venues']
# tranform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-29-5acf500bf9ad> in <module>
1 # assign relevant part of JSON to venues
----> 2 venues = results['response']['venues']
3
4 # tranform venues into a dataframe
5 dataframe = json_normalize(venues)
KeyError: 'venues'
I'm not really sure where I am going wrong... This has worked for me with other locations... But then again, like I said, I'm new at this... (I haven't maxed out my queries, and I've tried using "venue" instead)... Thank you
FourSquareResults:
{'meta': {'code': 200, 'requestId': '5ec42de01a4b0a001baa10ff'},
'response': {'suggestedFilters': {'header': 'Tap to show:',
'filters': [{'name': 'Open now', 'key': 'openNow'}]},
'warning': {'text': "There aren't a lot of results near you. Try something more general, reset your filters, or expand the search area."},
'headerLocation': 'Cranford',
'headerFullLocation': 'Cranford',
'headerLocationGranularity': 'city',
'totalResults': 20,
'suggestedBounds': {'ne': {'lat': 40.67401708586377,
'lng': -74.29300815204098},
'sw': {'lat': 40.65601706786374, 'lng': -74.31669390523408}},
'groups': [{'type': 'Recommended Places',
'name': 'recommended',
'items': [{'reasons': {'count': 0,
'items': [{'summary': 'This spot is popular',
'type': 'general',
'reasonName': 'globalInteractionReason'}]},
'venue': {'id': '4c13c8d2b7b9c928d127aa37',
'name': 'Cranford Canoe Club',
'location': {'address': '250 Springfield Ave',
'crossStreet': 'Orange Avenue',
'lat': 40.66022488705574,
'lng': -74.3061084180977,
'labeledLatLngs': [{'label': 'display',
'lat': 40.66022488705574,
'lng': -74.3061084180977},
{'label': 'entrance', 'lat': 40.660264, 'lng': -74.306191}],
'distance': 543,
'postalCode': '07016',
'cc': 'US',
'city': 'Cranford',
'state': 'NJ',
'country': 'United States',
'formattedAddress': ['250 Springfield Ave (Orange Avenue)',
'Cranford, NJ 07016',
'United States']},
'categories': [{'id': '4f4528bc4b90abdf24c9de85',
'name': 'Athletics & Sports',
'pluralName': 'Athletics & Sports',
'shortName': 'Athletics & Sports',
'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/sports_outdoors_',
'suffix': '.png'},
'primary': True}],
'photos': {'count': 0, 'groups': []},
'venuePage': {'id': '60380091'}},
'referralId': 'e-0-4c13c8d2b7b9c928d127aa37-0'},
{'reasons': {'count': 0,
'items': [{'summary': 'This spot is popular',
'type': 'general',
'reasonName': 'globalInteractionReason'}]},
'venue': {'id': '4d965995e07ea35d07e2bd02',
'name': 'Mizu Sushi',
'location': {'address': '103 Union Ave.',
'lat': 40.65664427772896,
'lng': -74.30343966195308,
'labeledLatLngs': [{'label': 'display',
'lat': 40.65664427772896,
'lng': -74.30343966195308}],
'distance': 939,
'postalCode': '07016',
'cc': 'US',
'city': 'Cranford',
'state': 'NJ',
'country': 'United States',
'formattedAddress': ['103 Union Ave.',
'Cranford, NJ 07016',
'United States']},
'categories': [{'id': '4bf58dd8d48988d1d2941735',
'name': 'Sushi Restaurant',
'pluralName': 'Sushi Restaurants',
'shortName': 'Sushi',
'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/sushi_',
'suffix': '.png'},
'primary': True}],
'photos': {'count': 0, 'groups': []}},
'referralId': 'e-0-4d965995e07ea35d07e2bd02-1'}]}]}}

Look more closely at the response you're getting: there is no "venues" key in it. The closest thing I see is the "groups" list, whose entries each contain an "items" list, and the individual items have a "venue" key.
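To make that path concrete, here is a small sketch that walks groups, then items, then venue. The results dict is a heavily trimmed copy of the response in the question, and the use of .get() with defaults is a design choice of this sketch, not something the API requires.

```python
# Trimmed copy of the response from the question: only the keys
# needed to reach each venue are kept.
results = {
    'response': {
        'groups': [{
            'type': 'Recommended Places',
            'items': [
                {'venue': {'id': '4c13c8d2b7b9c928d127aa37',
                           'name': 'Cranford Canoe Club',
                           'location': {'city': 'Cranford', 'state': 'NJ'}}},
                {'venue': {'id': '4d965995e07ea35d07e2bd02',
                           'name': 'Mizu Sushi',
                           'location': {'city': 'Cranford', 'state': 'NJ'}}},
            ],
        }],
    },
}

# Walk groups -> items -> venue; .get() with defaults avoids a KeyError
# when a level is missing from a particular response.
venues = [
    item['venue']
    for group in results.get('response', {}).get('groups', [])
    for item in group.get('items', [])
    if 'venue' in item
]

names = [venue['name'] for venue in venues]
print(names)
```

The resulting venues list can then be handed to json_normalize in place of results['response']['venues'].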

From MongoDB convert from dictionary to row with Pandas

This is a test document coming from MongoDB that I need to convert for MySQL. The catch is that sometimes there is more than one entry in "agents". In that case I need each agent on its own row, and every row should carry the same "display_name". For example, Walter should have Gloria on one row and Barb on the next, and both rows should have Walter Mosley under "display_name".
[{'name': 'Loomis, Gloria',
'primaryemail': 'gloria#gmail.com',
'primaryphone': '212-382-1121'},
{'name': 'Hogson, Barb',
'primaryemail': 'bho124#aol.com',
'primaryphone': ''}]
I've tried this but it just splits out the key/values.
a,b,c = [[d[e] for d in test] for e in sorted(test[0].keys())]
print(a,b,c)
This is the original JSON format:
{'_id': ObjectId('58e6ececafb08d6'),
'item_type': 'Contributor',
'role': 0,
'short_bio': 'Walter Mosley (b. 1952)',
'firebrand_id': 1588,
'display_name': 'Walter Mosley',
'first_name': 'Walter',
'last_name': 'Mosley',
'slug': 'walter-mosley',
'updated': datetime.datetime(2020, 1, 7, 8, 17, 11, 926000),
'image': 'https://s3.amazonaws.com/8588-book-contributor.jpg',
'social_media_name': '',
'social_media_link': '',
'website': '',
'agents': [{'name': 'Loomis, Gloria',
'primaryemail': 'gloria#gmail.com',
'primaryphone': '212-382-1121'},
{'name': 'Hogson, Barb',
'primaryemail': 'bho124#aol.com',
'primaryphone': ''}],
'estates': [],
'deleted': False}

If you have an array of dictionaries from your JSON file, try this:
JSON input:
inputJSON = [{'item_type': 'Contributor',
'role': 0,
'short_bio': 'Walter Mosley (b. 1952)',
'firebrand_id': 1588,
'display_name': 'Walter Mosley',
'first_name': 'Walter',
'last_name': 'Mosley',
'slug': 'walter-mosley',
'image': 'https://s3.amazonaws.com/8588-book-contributor.jpg',
'social_media_name': '',
'social_media_link': '',
'website': '',
'agents': [{'name': 'Loomis, Gloria',
'primaryemail': 'gloria#gmail.com',
'primaryphone': '212-382-1121'},
{'name': 'Hogson, Barb',
'primaryemail': 'bho124#aol.com',
'primaryphone': ''}],
'estates': [],
'deleted': False}]
Code:
import copy
finalJSON = []
for each in inputJSON:
    for agnt in each.get('agents'):
        # Copy the whole record, then replace the agents list with one agent.
        newObj = copy.deepcopy(each)
        newObj['agents'] = agnt
        finalJSON.append(newObj)
print(finalJSON)
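The code above still leaves each agent as a nested dict under 'agents'. If flat columns are wanted for a MySQL table, one possible variation is sketched below. The function name rows_per_agent and the agent_ column prefix are hypothetical choices, and keeping one row for a record without agents is an assumption about the desired behavior.

```python
def rows_per_agent(records):
    """Yield one flat dict per (record, agent) pair."""
    for record in records:
        # Everything except the nested list is shared by all rows of a record.
        shared = {key: value for key, value in record.items() if key != 'agents'}
        # Assumption: a record without agents still produces one row.
        agents = record.get('agents') or [{}]
        for agent in agents:
            row = dict(shared)
            for field, value in agent.items():
                row['agent_' + field] = value
            yield row


# Shortened copy of the input from the answer above.
inputJSON = [{'display_name': 'Walter Mosley',
              'slug': 'walter-mosley',
              'agents': [{'name': 'Loomis, Gloria', 'primaryphone': '212-382-1121'},
                         {'name': 'Hogson, Barb', 'primaryphone': ''}]}]

rows = list(rows_per_agent(inputJSON))
for row in rows:
    print(row['display_name'], '|', row['agent_name'])
```

Each row is a plain dict of scalar values, so the list can go straight into pd.DataFrame(rows) or be used as parameters for an INSERT statement.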

Iterating over a dict object that contains nested elements

Python 3.6
Here is the result from a call using the geocoder module
import geocoder
anaddress = 'State Street, Hood River, OR'
g = geocoder.arcgis(anaddress)
d = g.geojson
print(d)
{'geometry': {'type': 'Point', 'coordinates': [-121.52181774656506, 45.707876183969184]},
'type': 'Feature',
'properties': {'provider': 'arcgis', 'ok': True, 'location': '1037 State St, Hood River, OR',
'lat': 45.707876183969184, 'lng': -121.52181774656506,
'bbox': [-121.52281774656507, 45.706876183969186, -121.52081774656506, 45.70887618396918],
'encoding': 'utf-8', 'status': 'OK', 'address': '1037 State St, Hood River, Oregon, 97031',
'status_code': 200, 'confidence': 9},
'bbox': [-121.52281774656507, 45.706876183969186, -121.52081774656506, 45.70887618396918]}
How can I iterate through this structure and print it out nicely?

Is your goal only to print the structure, or to parse it as well?
If you just want to print your output nicely, try this:
from pprint import pprint
pprint(d)
This gives you a nicely printed structure.
To parse it, treat it like any other dictionary object and access its keys and values.
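For the parsing side, a minimal sketch of a recursive walk over nested dicts and lists is shown below. The function name walk is hypothetical, and d is a shortened copy of the geojson dict from the question.

```python
def walk(node, path=()):
    """Yield (path, value) pairs for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, path + (index,))
    else:
        yield path, node


# Shortened copy of the geojson dict from the question.
d = {'geometry': {'type': 'Point',
                  'coordinates': [-121.52181774656506, 45.707876183969184]},
     'type': 'Feature',
     'properties': {'provider': 'arcgis', 'status': 'OK', 'confidence': 9}}

leaves = list(walk(d))
for path, value in leaves:
    # Join the path parts with dots, e.g. geometry.coordinates.0
    print('.'.join(str(part) for part in path), '=', value)
```

Each leaf comes back with the full key path that leads to it, which is often easier to scan than the raw nested structure.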

Removing duplicate entries?

I need to compare values from different rows. Each row is a dictionary, and I need to compare the values in adjacent rows for the key 'flag'. How would I do this? Simply saying:
for row in range(1,len(myjson)):
    if row['flag'] == (row-1)['flag']:
        print yes
returns a TypeError: 'int' object is not subscriptable
Even though range returns a list of ints...
RESPONSE TO COMMENTS:
The list of rows is a list of dictionaries. Originally, I import a tab-delimited file and read it in using csv.DictReader, so that it becomes a list of dictionaries with keys corresponding to the variable names.
Code: (where myjson is a list of dictionaries)
for row in myjson:
    print row
Output:
{'website': '', 'phone': '', 'flag': 0, 'name': 'Diane Grant Albrecht M.S.', 'email': ''}
{'website': 'www.got.com', 'phone': '111-222-3333', 'flag': 1, 'name': 'Lannister G. Cersei M.A.T., CEP', 'email': 'cersei#got.com'}
{'website': '', 'phone': '', 'flag': 2, 'name': 'Argle D. Bargle Ed.M.', 'email': ''}
{'website': 'www.daManWithThePlan.com', 'phone': '000-000-1111', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123#gmail.com'}
{'website': '', 'phone': '', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': ''}
{'website': 'www.daManWithThePlan.com', 'phone': '111-222-333', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123#gmail.com'}
{'website': '', 'phone': '', 'flag': 4, 'name': 'D G Bamf M.S.', 'email': ''}
{'website': '', 'phone': '', 'flag': 5, 'name': 'Amy Tramy Lamy Ph.D.', 'email': ''}
Also:
type(myjson)
<type 'list'>

For comparing adjacent items you can use zip:
Example:
>>> lis = [1,1,2,3,4,4,5,6,7,7]
>>> for x,y in zip(lis, lis[1:]):
...     if x == y :
...         print x,y,'are equal'
...
1 1 are equal
4 4 are equal
7 7 are equal
For your list of dictionaries, you can do something like :
from itertools import izip
it1 = iter(list_of_dicts)
it2 = iter(list_of_dicts)
next(it2)  # advance the second iterator so it runs one item ahead
for x,y in izip(it1, it2):
    if x['flag'] == y['flag']:
        print 'yes'
Update:
For more than 2 adjacent items you can use itertools.groupby:
>>> from itertools import groupby
>>> lis = [1,1,1,1,1,2,2,3,4]
>>> for k,group in groupby(lis):
...     print list(group)
...
[1, 1, 1, 1, 1]
[2, 2]
[3]
[4]
For your code it would be :
>>> for k, group in groupby(dic, key = lambda x : x['flag']):
...     print list(group)
...
[{'website': '', 'phone': '', 'flag': 0, 'name': 'Diane Grant Albrecht M.S.', 'email': ''}]
[{'website': 'www.got.com', 'phone': '111-222-3333', 'flag': 1, 'name': 'Lannister G. Cersei M.A.T., CEP', 'email': 'cersei#got.com'}]
[{'website': '', 'phone': '', 'flag': 2, 'name': 'Argle D. Bargle Ed.M.', 'email': ''}]
[{'website': 'www.daManWithThePlan.com', 'phone': '000-000-1111', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123#gmail.com'}, {'website': '', 'phone': '', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': ''}, {'website': 'www.daManWithThePlan.com', 'phone': '111-222-333', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123#gmail.com'}]
[{'website': '', 'phone': '', 'flag': 4, 'name': 'D G Bamf M.S.', 'email': ''}]
[{'website': '', 'phone': '', 'flag': 5, 'name': 'Amy Tramy Lamy Ph.D.', 'email': ''}]
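The snippets above use Python 2 syntax (print statements, izip). For readers on Python 3, a rough equivalent of both techniques is sketched below. The rows are a shortened subset of the data in the question, and keeping the first row of each run is just one possible way to drop the adjacent duplicates.

```python
from itertools import groupby

# Shortened subset of the rows printed in the question.
rows = [
    {'flag': 0, 'name': 'Diane Grant Albrecht M.S.'},
    {'flag': 1, 'name': 'Lannister G. Cersei M.A.T., CEP'},
    {'flag': 3, 'name': 'Sam D. Man Ed.M.'},
    {'flag': 3, 'name': 'Sam D. Man Ed.M.'},
    {'flag': 4, 'name': 'D G Bamf M.S.'},
]

# Pair each row with its successor and keep the pairs whose flags match.
adjacent_matches = [
    (current['flag'], following['flag'])
    for current, following in zip(rows, rows[1:])
    if current['flag'] == following['flag']
]
print(adjacent_matches)

# groupby only merges consecutive rows, so it fits data that is already
# ordered by flag; here the first row of each run is kept.
first_of_each_run = [
    next(group) for _, group in groupby(rows, key=lambda row: row['flag'])
]
print([row['flag'] for row in first_of_each_run])
```

In Python 3, zip is already lazy, so izip is no longer needed.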

Your exception indicates that list_of_rows is not what you think it is.
To look at other, adjacent rows, provided list_of_rows is indeed a list, I'd use enumerate() to include the current index and then use that index to load next and previous rows:
for i, row in enumerate(list_of_rows):
    previous_row = list_of_rows[i - 1] if i else None
    # named next_row to avoid shadowing the built-in next()
    next_row = list_of_rows[i + 1] if i + 1 < len(list_of_rows) else None

Looks like you want to access list elements in batches:
http://code.activestate.com/recipes/303279/

You could try this:
pre_item = list_of_rows[0]['flag']
for row in list_of_rows[1:]:
    if row['flag'] == pre_item:
        print 'yes'
    # remember the current flag for the next comparison
    pre_item = row['flag']

list_of_rows = [ { 'a': 'foo',
                   'flag': 'bar' },
                 { 'a': 'blo',
                   'flag': 'bar' } ]
for row, successor_row in zip(list_of_rows, list_of_rows[1:]):
    if row['flag'] == successor_row['flag']:
        print "yes"

It's simple. The title of your post suggests you need to remove the dicts that share the same value for the key "flag" (the title is somewhat misleading, because your dictionaries are not, strictly speaking, duplicates). To do that, loop over the whole list of dictionaries and keep track of the flags you have seen in a separate list. If an item has a flag that is already in that list, don't add it to the result. It would look something like this:
def filterDicts(listOfDicts):
    result = []
    flags = []
    for di in listOfDicts:
        if di["flag"] not in flags:
            result.append(di)
            flags.append(di["flag"])
    return result
When called with the list of dictionaries that you have provided, it returns a list with 6 items, each with a unique flag value (0 through 5).
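A possible variation on the function above tracks seen flags in a set rather than a list, which keeps membership checks fast on long inputs. This is a sketch that assumes the flag values are hashable. The name filter_dicts and the key parameter are illustrative, and the rows carry only the flag values from the question.

```python
def filter_dicts(list_of_dicts, key="flag"):
    """Keep the first dict seen for each distinct value of key."""
    seen = set()      # set membership tests are O(1) on average
    result = []
    for item in list_of_dicts:
        value = item[key]
        if value not in seen:
            seen.add(value)
            result.append(item)
    return result


# Flags taken from the eight rows printed in the question.
rows = [{'flag': flag} for flag in (0, 1, 2, 3, 3, 3, 4, 5)]
unique = filter_dicts(rows)
print(len(unique))
```

Unlike the adjacent-only comparisons earlier in the thread, this also removes repeated flags that are not next to each other.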
