I am trying to pretty-print a Python object by calling:
from pprint import pprint
...
pprint(update)
But the output looks like this:
<telegram.update.Update object at 0xffff967e62b0>
However, using Python's internal print() I get the correct output:
{'update_id': 14191809, 'message': {'message_id': 22222, 'date': 11111, 'chat': {'id': 00000, 'type': 'private', 'username': 'xxxx', 'first_name': 'X', 'last_name': 'Y'}, 'text': '/start', 'entities': [{'type': 'bot_command', 'offset': 0, 'length': 6}], 'caption_entities': [], 'photo': [], 'new_chat_members': [], 'new_chat_photo': [], 'delete_chat_photo': False, 'group_chat_created': False, 'supergroup_chat_created': False, 'channel_chat_created': False, 'from': {'id': 01010101, 'first_name': 'X', 'is_bot': False, 'last_name': 'Y', 'username': 'xxxx', 'language_code': 'en'}}}
Is there a way to make pprint() show the object's data correctly and nicely formatted?
pprint uses the representation (__repr__() method) of the object while print uses __str__(). What you see in print output is not a dictionary but a string representation of the inner structure of the telegram.update.Update instance.
There is no generic solution to this, but since your question is about a specific library, consulting the relevant docs shows that there is a .to_json() method, so you can do this:
import json
from pprint import pprint
...
pprint(json.loads(update.to_json()))
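To illustrate the difference between the two functions, here is a minimal sketch using a hypothetical Event class (it is not part of any Telegram library) that defines __str__ but inherits the default __repr__. It only assumes that the real Update class behaves similarly, as the output in the question suggests.

```python
from pprint import pformat


class Event:
    """Hypothetical stand-in for a library object such as Update."""

    def __init__(self, update_id, text):
        self.update_id = update_id
        self.text = text

    def __str__(self):
        # print() uses this, which is why the printed output looks like a dict.
        return str(vars(self))


event = Event(14191809, '/start')

as_str = str(event)             # what print(event) would show
as_repr = pformat(event)        # what pprint(event) would show
as_dict = pformat(vars(event))  # pretty-printing the attribute dict instead

print(as_str)
print(as_repr)
print(as_dict)
```

vars() only works for objects that have a __dict__, so for library objects the conversion method documented by the library is usually the safer choice.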
I'm trying to create a program that reads messages from a Discord bot and retrieves links from those messages.
Here's the code:
import requests
import json
from bs4 import builder
import bs4
def retrieve_messages(channelid):
    headers = {
        'authorization': 'NTQ5OTM4ODEzOTUxMTQ4MDQ3.YMi7CQ.fOm6F-dmPJPEW0dehLwCkB_ilBU'
    }
    r = requests.get(f'https://discord.com/api/v9/channels/{channelid}/messages', headers=headers)
    jsonn = json.loads(r.text)
    for value in jsonn:
        print(value, '\n')
retrieve_messages('563699841377763348')
here's the output:
{'id': '908857015412084796', 'type': 0, 'content': '<#&624528614330859520>', 'channel_id': '5636998413777633, 2021.```\n5J53T-BKJK5-CTXBZ-JJJTJ-WW6F3```Redeem on48', 'author': {'id': '749499357761503284', 'username': 'shift', 'avatar': 'de9cd6f3224e660a4b6906a89fc2bc15/shift-code/5j53t-bkjk5-ctxbz-jjjtj-ww6f3/?utm_source', 'discriminator': '6125', 'public_flags': 0, 'bot': True}, 'attachments': [], 'embeds': [], 'mentions': []'pinned': False, 'mention_everyone': False, 'tts': Fa, 'mention_roles': ['624528614330859520'], 'pinned': False, 'mention_everyone': False, 'tts': False, 'timest}amp': '2021-11-12T23:13:18.221000+00:00', 'edited_timestamp': None, 'flags': 0, 'components': []}
{'id': '908857014430629898', 'type': 0, 'content': '', 'channel_id': '563699841377763348', 'author': {'id':
'749499357761503284', 'username': 'shift', 'avatar': 'de9cd6f3224e660a4b6906a89fc2bc15', 'discriminator': '6125', 'public_flags': 0, 'bot': True}, 'attachments': [], 'embeds': [{'type': 'rich', 'title': '<:GoldenKey:273763771929853962> Borderlands 1: 5 gold keys', 'description': 'Platform: Universal\nExpires: 30 November,
2021.```\n5J53T-BKJK5-CTXBZ-JJJTJ-WW6F3```Redeem on the [website](https://shift.gearboxsoftware.com/rewards) or in game.\n\n[Source](https://shift.orcicorn.com/shift-code/5j53t-bkjk5-ctxbz-jjjtj-ww6f3/?utm_source=json&utm_medium=shift&utm_campaign=automation)', 'color': 16040976}], 'mentions': [], 'mention_roles': [], 'pinned': False, 'mention_everyone': False, 'tts': False, 'timestamp': '2021-11-12T23:13:17.987000+00:00', 'edited_timestamp': None, 'flags': 1, 'components': []}
The output contains two links. I need to save the second link to a variable, and I'm wondering how I can do that.
The easiest way is to treat the response body as text and scan it with a regular expression to find the URLs.
Solution
The variable test_case_data holds the response body as a plain string.
import re
regex = r"(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])"
def find_embedded_urls(data):
    return re.finditer(regex, data)
test_case_data = """'id': '908857014430629898', 'type': 0, 'content': '', 'channel_id': '563699841377763348', 'author': {'id':
'749499357761503284', 'username': 'shift', 'avatar': 'de9cd6f3224e660a4b6906a89fc2bc15', 'discriminator': '6125', 'public_flags': 0, 'bot': True}, 'attachments': [], 'embeds': [{'type': 'rich', 'title': '<:GoldenKey:273763771929853962> Borderlands 1: 5 gold keys', 'description': 'Platform: Universal\nExpires: 30 November,
2021.```\n5J53T-BKJK5-CTXBZ-JJJTJ-WW6F3```Redeem on the [website](https://shift.gearboxsoftware.com/rewards) or in game.\n\n[Source](https://shift.orcicorn.com/shift-code/5j53t-bkjk5-ctxbz-jjjtj-ww6f3/?utm_source=json&utm_medium=shift&utm_campaign=automation)', 'color': 16040976}], 'mentions': [], 'mention_roles': [], 'pinned': False, 'mention_everyone': False, 'tts': False, 'timestamp': '2021-11-12T23:13:17.987000+00:00', 'edited_timestamp': None, 'flags': 1, 'components': []}"""
# test_case_data = response.text
matches = find_embedded_urls(test_case_data)
matches = [match[0] for match in matches] #convert all urls to strings
print(matches) # List of all the urls! Index for whatever one you need
Output
['https://shift.gearboxsoftware.com/rewards', 'https://shift.orcicorn.com/shift-code/5j53t-bkjk5-ctxbz-jjjtj-ww6f3/?utm_source=json&utm_medium=shift&utm_campaign=automation']
With the URLs in a list, you can assign whichever one you need to a variable by indexing the list, for example matches[1] for the second link.
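As a rough sketch of that last step, the helper below returns the n-th URL found in a string, or None if there are not enough matches. The name nth_url is hypothetical, the pattern is a simplified assumption rather than the regex above, and the sample text is a shortened copy of the embed description from the question.

```python
import re

# Simplified pattern (an assumption, not the regex above): a URL runs until
# whitespace, a quote, or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s'\")\]]+")


def nth_url(text, n):
    """Return the n-th URL (0-based) found in text, or None if missing."""
    urls = URL_PATTERN.findall(text)
    return urls[n] if n < len(urls) else None


sample = (
    "Redeem on the [website](https://shift.gearboxsoftware.com/rewards) "
    "or in game. [Source](https://shift.orcicorn.com/shift-code/"
    "5j53t-bkjk5-ctxbz-jjjtj-ww6f3/?utm_source=json)"
)

second_link = nth_url(sample, 1)
print(second_link)
```

Returning None instead of raising IndexError keeps the caller simple when a message happens to contain fewer links than expected.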
When I search for the book using this link https://www.googleapis.com/books/v1/volumes?q=9780310709626 I get the author name in the details.
However when I run my code and print items I don't see the author name. I've been trying to figure out why it doesn't show from the data but I don't see any problem with my code.
import json
from urllib.request import urlopen
def getBooks(id):
    api = "https://www.googleapis.com/books/v1/volumes?q=isbn:"
    resp = urlopen(api + id)  # fetch the search results for this ISBN
    data = json.load(resp)
    print(data["items"])
getBooks("9780310709626")
My code output:
[{'kind': 'books#volume', 'id': 'JEP3sgEACAAJ', 'etag': '92vdEneJ83g', 'selfLink': 'https://www.googleapis.com/books/v1/volumes/JEP3sgEACAAJ', 'volumeInfo': {'title': "The Beginner's Bible", 'subtitle': 'Timeless Bible Stories', 'publisher': 'Zondervan', 'publishedDate': '2005', 'description': 'Retells familiar Bible stories from the Old and New Testaments for children to enjoy.', 'industryIdentifiers': [{'type': 'ISBN_10', 'identifier': '0310709628'}, {'type': 'ISBN_13', 'identifier': '9780310709626'}], 'readingModes': {'text': False, 'image': False}, 'pageCount': 511, 'printType': 'BOOK', 'categories': ['Juvenile Nonfiction'], 'averageRating': 4.5, 'ratingsCount': 2, 'maturityRating': 'NOT_MATURE', 'allowAnonLogging': False, 'contentVersion': 'preview-1.0.0', 'panelizationSummary': {'containsEpubBubbles': False, 'containsImageBubbles': False}, 'imageLinks': {'smallThumbnail': 'http://books.google.com/books/content?id=JEP3sgEACAAJ&printsec=frontcover&img=1&zoom=5&source=gbs_api', 'thumbnail': 'http://books.google.com/books/content?id=JEP3sgEACAAJ&printsec=frontcover&img=1&zoom=1&source=gbs_api'}, 'language': 'en', 'previewLink': 'http://books.google.com.tw/books?id=JEP3sgEACAAJ&dq=isbn:9780310709626&hl=&cd=1&source=gbs_api', 'infoLink': 'http://books.google.com.tw/books?id=JEP3sgEACAAJ&dq=isbn:9780310709626&hl=&source=gbs_api', 'canonicalVolumeLink': 'https://books.google.com/books/about/The_Beginner_s_Bible.html?hl=&id=JEP3sgEACAAJ'}, 'saleInfo': {'country': 'TW', 'saleability': 'NOT_FOR_SALE', 'isEbook': False}, 'accessInfo': {'country': 'TW', 'viewability': 'NO_PAGES', 'embeddable': False, 'publicDomain': False, 'textToSpeechPermission': 'ALLOWED', 'epub': {'isAvailable': False}, 'pdf': {'isAvailable': False}, 'webReaderLink': 'http://play.google.com/books/reader?id=JEP3sgEACAAJ&hl=&printsec=frontcover&source=gbs_api', 'accessViewStatus': 'NONE', 'quoteSharingAllowed': False}, 'searchInfo': {'textSnippet': 'Retells familiar Bible stories from the Old and New 
Testaments for children to enjoy.'}}, {'kind': 'books#volume', 'id': 'ZRgnzQEACAAJ', 'etag': 'RXYM4Rbwx+g', 'selfLink': 'https://www.googleapis.com/books/v1/volumes/ZRgnzQEACAAJ', 'volumeInfo': {'title': "The Beginner's Bible", 'authors': ['Catherine DeVries'], 'publishedDate': '2005', 'industryIdentifiers': [{'type': 'ISBN_10', 'identifier': '0310709628'}, {'type': 'ISBN_13', 'identifier': '9780310709626'}], 'readingModes': {'text': False, 'image': False}, 'pageCount': 511, 'printType': 'BOOK', 'averageRating': 4, 'ratingsCount': 1, 'maturityRating': 'NOT_MATURE', 'allowAnonLogging': False, 'contentVersion': 'preview-1.0.0', 'panelizationSummary': {'containsEpubBubbles': False, 'containsImageBubbles': False}, 'language': 'en', 'previewLink': 'http://books.google.com.tw/books?id=ZRgnzQEACAAJ&dq=isbn:9780310709626&hl=&cd=2&source=gbs_api', 'infoLink': 'http://books.google.com.tw/books?id=ZRgnzQEACAAJ&dq=isbn:9780310709626&hl=&source=gbs_api', 'canonicalVolumeLink': 'https://books.google.com/books/about/The_Beginner_s_Bible.html?hl=&id=ZRgnzQEACAAJ'}, 'saleInfo': {'country': 'TW', 'saleability': 'NOT_FOR_SALE', 'isEbook': False}, 'accessInfo': {'country': 'TW', 'viewability': 'NO_PAGES', 'embeddable': False, 'publicDomain': False, 'textToSpeechPermission': 'ALLOWED', 'epub': {'isAvailable': False}, 'pdf': {'isAvailable': False}, 'webReaderLink': 'http://play.google.com/books/reader?id=ZRgnzQEACAAJ&hl=&printsec=frontcover&source=gbs_api', 'accessViewStatus': 'NONE', 'quoteSharingAllowed': False}}]
With the requests library:
import requests
url = 'https://www.googleapis.com/books/v1/volumes?q=9780310709626'
resp = requests.get(url)
json = resp.json()
print(json['items'][0]['volumeInfo']['authors'])
From the response you can see that authors is an array. To reach that array, use json['items'][0]['volumeInfo']['authors'].
Since items is also an array, the response can contain multiple items. You may want to loop over them rather than hard-coding index 0.
Note that you probably won't know the exact schema of the response in advance, so handle unexpected cases: some books may be missing certain keys (the first item in the question's output has no authors key), json['items'] could be an empty array, or items might not be in the response at all.
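A minimal sketch of that defensive handling follows. extract_authors is a hypothetical helper name, and the sample dict is a trimmed-down stand-in for the real API response, not actual output.

```python
def extract_authors(response_json):
    """Collect (title, authors) pairs, tolerating missing keys."""
    results = []
    for item in response_json.get('items', []):
        info = item.get('volumeInfo', {})
        # A volume without an 'authors' key yields an empty list.
        results.append((info.get('title'), info.get('authors', [])))
    return results


# Trimmed-down stand-in for the response shown in the question.
sample = {
    'items': [
        {'volumeInfo': {'title': "The Beginner's Bible"}},
        {'volumeInfo': {'title': "The Beginner's Bible",
                        'authors': ['Catherine DeVries']}},
    ]
}

for title, authors in extract_authors(sample):
    print(title, authors or 'no authors listed')
```

Using .get() with a default at each level means a missing items, volumeInfo, or authors key produces an empty result instead of a KeyError.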
I am trying to check if a specific item in a json file is equal to one of my python variables.
{'data': {'redemption': {'channel_id': 'secret',
                         'id': 'secret',
                         'redeemed_at': '2021-02-08T09:46:22.637059711Z',
                         'reward': {'background_color': '#FA1ED2',
                                    'channel_id': '145998001',
                                    'cooldown_expires_at': None,
                                    'cost': 500,
                                    'default_image': {'url_1x': 'https://static-cdn.jtvnw.net/custom-reward-images/ghost-1.png',
                                                      'url_2x': 'https://static-cdn.jtvnw.net/custom-reward-images/ghost-2.png',
                                                      'url_4x': 'https://static-cdn.jtvnw.net/custom-reward-images/ghost-4.png'},
                                    'global_cooldown': {'global_cooldown_seconds': 1,
                                                        'is_enabled': False},
                                    'id': '123',
                                    'image': None,
                                    'is_enabled': True,
                                    'is_in_stock': True,
                                    'is_paused': False,
                                    'is_sub_only': False,
                                    'is_user_input_required': False,
                                    'max_per_stream': {'is_enabled': False,
                                                       'max_per_stream': 1},
                                    'max_per_user_per_stream': {'is_enabled': False,
                                                                'max_per_user_per_stream': 1},
                                    'prompt': '*Dabs*',
                                    'redemptions_redeemed_current_stream': None,
                                    'should_redemptions_skip_request_queue': False,
                                    'template_id': 'template:4425c37e-6881-442a-aa3d-fdc6998a29de',
                                    'title': 'Dab!',
                                    'updated_for_indicator_at': '2020-09-10T18:55:40.064177881Z'},
                         'status': 'UNFULFILLED',
                         'user': {'display_name': 'Androteex',
                                  'id': 'secret',
                                  'login': 'androteex'}},
          'timestamp': '2021-02-08T09:46:22.637059711Z'},
 'type': 'reward-redeemed'}
I want to find the second id, 'id': '123', and check whether it is equal to 123. If so, I want to print that string. How could I do that?
You can use the JSON module.
import json
data = json.loads(my_json)
my_id = data['data']['redemption']['reward']['id']
if my_id == '123':
    print(data)
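Assuming the payload has already been loaded into data (for example with json.loads):
reward_id = data['data']['redemption']['reward']['id']
id_check = 123
# The id is stored as a string, so convert it before comparing with an int.
if int(reward_id) == id_check:
    print("YES")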
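If any level of that path might be missing, chained indexing raises KeyError. A minimal sketch of a more defensive lookup follows; dig is a hypothetical helper name, and the data dict is a trimmed copy of the payload from the question.

```python
def dig(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if any is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Trimmed copy of the payload shown in the question.
data = {'data': {'redemption': {'id': 'secret',
                                'reward': {'id': '123', 'title': 'Dab!'}}},
        'type': 'reward-redeemed'}

reward_id = dig(data, 'data', 'redemption', 'reward', 'id')
if reward_id == '123':  # the id is a string in the payload
    print(reward_id)
```

Note that the first 'id' under redemption and the second one under reward are distinguished purely by the path of keys, not by their order of appearance.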
Here is the backstory:
I am trying to pull data via an api:
r = requests.get(url, headers=headers)
df = pd.read_json(json.dumps(r.json()), orient='list')  # preferably I would use pd.io.json.json_normalize, but it throws an error
Sample Json response(slightly simplified):
{'childCollectors': [12345678,],
 'Param1': False,
 'creationTime': '2017-01-19T18:53:28Z',
 'physicalSerialNumber': 'someserialnumberhere',
 'accountName': 'account name',
 'Param2': False,
 'Param3': None,
 'Param4': None,
 'type': 'somesupersecrettype :)',
 'timeout': 5,
 'points': [],
 'archived': False,
 'Class': 'name',
 'firstReportedTime': '2017-10-26T00:24:12Z',
 'deviceIdentifier': '255CD895348K',
 'usenetmetering': False,
 'vendor': 'vendor name',
 'reportingIntervalLengthInMillis': 300000,
 'id': 12145278,
 'manualReadings': [],
 'readingsMultiplier': 1,
 'outageTimeInMillis': 43200000,
 'multiplier': 1,
 'parentPoint': None,
 'buildingName': 'buildingname',
 'queryable': True,
 'replications': [],
 'downwardCommunications': ['something here'],
 'buildings': [associatedbuildingid],
 'collectorClassDetails': {'identifierName': 'SERIAL_NUMBER',
                           'multiplierLabel': 'supersecretlabel',
                           'displayName': 'fancyversionofClass',
                           'downwardCommunications': ['something here'],
                           'vendor': 'vendor',
                           'isPhysicalMeter': False,
                           'name': 'fancyversionofClass',
                           'upwardCommunications': ['somesuacyformat'],
                           'gateway': True},
 'name': 'Gateway 1',
 'primaryBuilding': associatedbuildingid,
 'parentCollector': None,
 'metrics': ['well duh metrics'],
 'upwardCommunications': ['internet']}
My question is:
1) When I get an oddly formatted JSON response like the one above, what is the best practice for converting it into something more standard so that the data can be extracted and used? For example, I want to build a list of all the childCollectors IDs to use when querying data.
Sorry if this question is too specific, but I am still learning Python and would love to learn more about parsing data :)
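One possible approach, sketched under assumptions: r.json() already returns ordinary Python dicts and lists, so the IDs can be read directly without going through pandas. The records list below is made-up data shaped like the sample response, and collect_child_ids is a hypothetical helper name.

```python
def collect_child_ids(records):
    """Gather every childCollectors ID from a list of collector dicts."""
    ids = []
    for record in records:
        # A missing or null childCollectors entry is treated as an empty list.
        ids.extend(record.get('childCollectors') or [])
    return ids


# Made-up records that mimic the shape of the sample response.
records = [
    {'id': 12145278, 'name': 'Gateway 1', 'childCollectors': [12345678]},
    {'id': 22222222, 'name': 'Gateway 2', 'childCollectors': []},
    {'id': 33333333, 'name': 'Gateway 3'},
]

child_ids = collect_child_ids(records)
print(child_ids)
```

If the API returns a single dict rather than a list of them, wrap it in a list before passing it in.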
I am trying to pull 'created' from the Monzo data I'm pulling.
I have made a call to the Monzo api with the following code:
from monzo.monzo import Monzo
client = Monzo("INSERT API KEY")
data = client.get_transactions("INSERT ACCOUNT NUMBER")
print(data)
and I can't quite get the data I need which looks like this:
d': 'merch_000094MPASVBf7xCdrZOz3', 'created': '2016-01-20T21: 26: 33.985Z', 'name': 'DelicedeFrance', 'logo': 'https: //mondo-logo-cache.appspot.com/twitter/deliceuk/?size=large', 'emoji': '🇫🇷', 'category': 'eating_out', 'online': False, 'atm': False, 'address': {'short_formatted': 'LiverpoolStreetStation,
LondonEC2M7PY', 'formatted': 'LiverpoolStreetStation,
LondonEC2M7PY,
UnitedKingdom', 'address': 'LiverpoolStreetStation', 'city': 'London', 'region': 'GreaterLondon', 'country': 'GBR', 'postcode': 'EC2M7PY', 'latitude': 51.518159172221615, 'longitude': -0.08210659649555102, 'zoom_level': 17, 'approximate': False}, 'updated': '2016-02-02T14: 10: 48.664Z', 'metadata': {'foursquare_category': 'Restaurant', 'foursquare_category_icon': 'https: //ss3.4sqi.net/img/categories_v2/food/default_88.png','foursquare_website': '', 'google_places_icon': 'https: //maps.gstatic.com/mapfiles/place_api/icons/restaurant-71.png', 'google_places_name': 'DelicedeFrance', 'suggested_name': 'DelicedeFrance', 'suggested_tags': '#food', 'twitter_id': ''}, 'disable_feedback': False}, 'notes': '', 'metadata': {}, 'account_balance': 3112, 'attachments': [], 'category': 'eating_out', 'is_load': False, 'settled': '2017-04-28T04: 54: 18.167Z', 'local_amount': -199, 'local_currency': 'GBP', 'updated': '2017-04-28T06: 15: 06.095Z', 'counterparty': {}, 'originator': False, 'include_in_spending': True}, {'created': '2017-04-28T08: 54: 10.917Z','amount': -130, 'currency': 'GBP', 'merchant': {'created': '2016-04-21T08: 02: 13.537Z','logo': 'https: //mondo-logo-cache.appspot.com/twitter/MCSaatchiLondon/?size=large', 'emoji': '🍲', 'category': 'eating_out', 'online': False, 'atm': False...
How do I pull the 'created' date?
Try this:
#!/usr/bin/env python
import csv
from pymonzo import MonzoAPI
if __name__ == '__main__':
    monzo_api = MonzoAPI()
    monzo_transactions = monzo_api.transactions()
    with open('monzo_transactions.csv', 'w') as csvfile:
        writer = csv.writer(csvfile)
        for transaction in monzo_transactions:
            writer.writerow([
                transaction.amount, transaction.description,
                transaction.created,
            ])
    print('All done!')
If this is actually valid JSON and you just have paste errors, then you can use Python's json library:
import json
data = json.loads(datastring)
If this is not JSON, you will probably have to write a parser of your own.
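Assuming the data does parse as JSON, a minimal sketch of pulling out each 'created' value could look like this. The datastring below is made-up JSON shaped like the transactions in the question (with the stray spaces in the timestamps removed), and the timestamp format is an assumption based on those values.

```python
import json
from datetime import datetime

# Made-up JSON text shaped like the transaction data in the question.
datastring = '''
[
  {"created": "2017-04-28T08:54:10.917Z", "amount": -130, "currency": "GBP"},
  {"created": "2017-04-28T04:54:18.167Z", "amount": -199, "currency": "GBP"}
]
'''

transactions = json.loads(datastring)

# Convert each 'created' string into a datetime, skipping entries without one.
created_dates = [
    datetime.strptime(tx["created"], "%Y-%m-%dT%H:%M:%S.%fZ")
    for tx in transactions
    if "created" in tx
]

for created in created_dates:
    print(created.isoformat())
```

If only the raw strings are needed, tx["created"] can be used directly without the strptime step.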