I am creating a watcher for the Videos section of my YouTube channel that keeps monitoring the stats of the latest uploaded video. I am not using Selenium because it keeps a browser engaged and interrupts my work. Instead I use requests_html to load the /videos page, read stats such as the current view count, and send myself a message on Telegram.
When I make the requests_html call I retrieve a small JSON document that contains all the videos and their stats, but json.load() converts it into a dict of dicts. My intention is to treat the result of json.load() as JSON so that I can do JSON parsing on it.
I have attached a snapshot of the JSON structure with the desired keys highlighted.
I couldn't find a good example showing how to convert it into plain JSON instead of a dict of dicts. There are multiple levels, roughly 20+, to go through to reach the desired key's value. I have seen very simple examples like the following, but it is not practical to write out that many nodes in such a statement.
aKeyValue = dictName['parentnode']['childNode']
Secondly, if it is just JSON, then I believe jsonpath_ng and its parse method can be used to retrieve a desired key without having to provide the complete path from the root. If this is not the right way, please suggest another Python module.
The recursive functions didn't work properly. I tried maybe 10+ different functions, and none worked on the JSON. Finally I got jsonpath-ng to work after reading through its entire documentation.
It was pretty simple; I don't know why I didn't figure this out earlier.
import json
from jsonpath_ng import parse
data = json.loads(jsonString)
matches = parse('$..gridRenderer.items[0].gridVideoRenderer.viewCountText.simpleText').find(data)
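For comparison, here is a minimal standard-library sketch of what the `$..` recursive descent does: it walks nested dicts and lists and yields every value stored under a given key. This is not jsonpath-ng itself, just a plain recursive generator. The helper name `find_key` and the trimmed sample document are assumptions made up for illustration; only the key names come from the path above.

```python
import json

def find_key(node, key):
    """Yield every value stored under `key`, at any depth of nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending, the same key may appear deeper as well.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)

# Hypothetical, heavily trimmed sample; the real page data has many more levels.
json_string = """
{"contents": {"gridRenderer": {"items": [
    {"gridVideoRenderer": {"viewCountText": {"simpleText": "5927 views"}}},
    {"gridVideoRenderer": {"viewCountText": {"simpleText": "120 views"}}}
]}}}
"""
data = json.loads(json_string)

# Collect the view-count text of every video, without spelling out the full path.
view_counts = [v["simpleText"] for v in find_key(data, "viewCountText")]
print(view_counts)
```

Because `json.loads()` already returns ordinary dicts and lists, no extra conversion step is needed before walking the structure this way.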
But the statement below doesn't work. It wasn't required, but I tried it anyway. It worked in an online JSONPath evaluator but not in the Python code.
$..gridRenderer.items[0].gridVideoRenderer.viewCountText[?(simpleText="5927 views")]
Could someone point out why the above statement wouldn't work in Python code?
[Image: messed up dictionary view]
Change this format to this:
[Image: cleaned dictionary view]
I am working on a project for self-learning. Partway through, while following along with a video, my API's output was not cleaned up and was hard to read. I wanted my output to look like the one in the video, shown in the second image. I searched the internet but didn't find a clear answer.
I found a way to display the dict in JSON format. There is a Chrome extension named "JSON Viewer" which helps convert a single line of Python dict key-values into a formatted JSON view.
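As an alternative to a browser extension, the standard library can pretty-print a dict directly. The following is only a sketch: the `api_result` dict is a made-up stand-in for whatever your API returns.

```python
import json

# Hypothetical API result; any JSON-serializable dict works the same way.
api_result = {"user": {"id": 7, "tags": ["a", "b"]}, "ok": True}

# indent adds line breaks and nesting; sort_keys makes the order predictable.
pretty = json.dumps(api_result, indent=4, sort_keys=True)
print(pretty)
```

The string in `pretty` is valid JSON, so it can also be written to a file and opened in any JSON viewer.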
I have a huge collection of .json files, each containing hundreds or thousands of documents, that I want to import into ArangoDB collections. Can I do it using Python, and if so, can anyone share an example of how to do it from a list of files? For example:
for i in filelist:
    import i to collection
I have read the documentation but couldn't find anything even resembling that.
So after a lot of trial and error I found out that I had the answer in front of me. So I didn't need to import the .json file, I just needed to read it and then do a bulk import of documents. The code is like this:
import json

# `db` is assumed to be an already-connected python-arango database handle.
a = db.collection('collection_name')
for x in list_of_json_files:
    with open(x, 'r') as json_file:
        data = json.load(json_file)  # expected to be a list of documents
    a.import_bulk(data)
So actually it was quite simple. In my implementation I am collecting the .json files from multiple folders and importing them into multiple collections. I am using the python-arango 5.4.0 driver.
I had this same problem. Though your implementation will be slightly different, the answer you need (maybe not the one you're looking for) is to use the "bulk import" functionality.
Since ArangoDB doesn't have an "official" Python driver (that I know of), you will have to peruse other sources to give you a good idea on how to solve this.
The HTTP bulk import/export docs provide curl commands, which can be neatly translated to Python web requests. Also see the section on headers and values.
ArangoJS has a bulk import function, which works with an array of objects, so there's no special processing or preparation required.
I have also used the arangoimport tool to great effect. It's command-line, so it could be controlled from Python, or used stand-alone in a script. For me, the key here was making sure my data was in JSONL or "JSON Lines" format (each line of the file is a self-contained JSON object, no bounding array or comma separators).
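If your files hold a single JSON array, converting them to JSON Lines first can be sketched with the standard library alone. The function name `json_array_to_jsonl`, the file names, and the sample documents below are assumptions for illustration; the sketch also assumes each file fits in memory.

```python
import json
import os
import tempfile

def json_array_to_jsonl(src_path, dst_path):
    """Rewrite a file holding one JSON array as JSON Lines (one object per line)."""
    with open(src_path, "r", encoding="utf-8") as src:
        documents = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for doc in documents:
            # json.dumps never emits raw newlines, so each document stays on one line.
            dst.write(json.dumps(doc) + "\n")
    return len(documents)

# Small self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "docs.json")
    dst_path = os.path.join(tmp, "docs.jsonl")
    with open(src_path, "w", encoding="utf-8") as f:
        json.dump([{"_key": "a", "n": 1}, {"_key": "b", "n": 2}], f)
    written = json_array_to_jsonl(src_path, dst_path)
    with open(dst_path, encoding="utf-8") as f:
        jsonl_lines = f.read().splitlines()

print(written, jsonl_lines)
```

The resulting .jsonl file has no bounding array and no comma separators, which matches the format described above.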
So, I am trying to print out GIFs using the Tenor API.
I want it to print only one GIF link, but it prints out everything. Any idea how to fix this?
Thank you.
https://i.stack.imgur.com/xf084.png
Sadly, I cannot tell you the exact problem you are having. I replicated your code and used the official API docs.
From what I can tell, this is one GIF just in a lot of different formats.
You can filter them like so:
print(top_8gifs['weburl'])
or
print(top_8gifs['results'][0])
EDIT: Looking at your .png (please embed it as code in the future) this should work for you, if you want the url:
print(top_8gifs[0]['url'])
A Python dict is accessed by key (like gifs['weburl']).
A Python list is accessed by index, so gifs[0].
Using these techniques you can gather the data you need from that output.
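To make the dict-versus-list distinction concrete, here is a small sketch. The `response` structure is hypothetical and only borrows the key names used above; the shape of a real response may differ, so print it first and check whether each level is a dict or a list.

```python
# Hypothetical response shape, for illustration only.
response = {
    "weburl": "https://example.com/search",
    "results": [
        {"id": "1", "url": "https://example.com/gif-1"},
        {"id": "2", "url": "https://example.com/gif-2"},
    ],
}

web_url = response["weburl"]           # dict: select by key
first_result = response["results"][0]  # list: select by index
first_url = first_result["url"]        # dict again: select by key

# All links, if more than one is needed later.
all_urls = [result["url"] for result in response["results"]]

print(first_url)
```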
I am a newbie to the world of Python and JSON though I've managed to work my way through most problems. The latest though is stumping me. I am trying to work with the API at localbitcoins.com and the JSON file is here LBC_JSON--it's a public file.
The output is quite large. I have tried working with it in pandas using this code:
from pandas.io.json import json_normalize
from pandas.io.json import read_json
pandas_json = read_json('https://localbitcoins.com/buy-bitcoins-online/alipay/.json')
print(len(pandas_json))
print(type(pandas_json))
print(pandas_json)
But the complete data is not printed; only part of it shows up in the output.
I have tried using the requests library and generating a response.json() on the response. Even though this brings in the complete data I cannot find a way to access the data that I need. I've tried iteration through the data with no luck. All I need is the first price in the API.
I have managed to get this info by using BeautifulSoup and CSS tags but I don't feel this is the correct way to access this info since an API is provided.
Thanks in advance for your answers.
You have to iterate over ad_list, for example:
for ad in pandas_json['data']['ad_list']:
    print(ad['data']['profile']['username'], ad['data']['temp_price'])
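If only the first price is needed, the same path can be followed without a loop. This is a sketch on a made-up `payload` that mirrors the key names used in the loop above; with the requests library, the dict returned by response.json() would take its place.

```python
# Hypothetical trimmed payload using the key names from the loop above.
payload = {
    "data": {
        "ad_list": [
            {"data": {"profile": {"username": "alice"}, "temp_price": "7100.00"}},
            {"data": {"profile": {"username": "bob"}, "temp_price": "7150.50"}},
        ]
    }
}

# .get() with defaults avoids a KeyError if the outer keys are missing.
ad_list = payload.get("data", {}).get("ad_list", [])

# Take the first ad's price, or None when the list is empty.
first_price = ad_list[0]["data"]["temp_price"] if ad_list else None
print(first_price)
```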
I just wanted to ask what can I do to solve this issue I have.
Essentially I am making a stock checker for Adidas sneakers. I know the endpoint to obtain the stock, but the JSON data it returns, while readable and containing what I need, also contains a lot of other information that is unnecessary for what I am trying to do.
Example of a link to an endpoint:
http://production.store.adidasgroup.demandware.net/s/adidas-GB/dw/shop/v16_9/products/(BZ0221)?client_id=c1f3632f-6d3a-43f4-9987-9de920731dcb&expand=availability,variations,prices
This is a link to the JSON containing the stock of the shoe, price and availability. However, if you try to open it you'll see that it responds with a bunch of useless info, such as the description of the shoe and the price, which I do not need.
A github repository that I was using to try and get to grips with the requests I am trying to make is:
https://github.com/yzyio/adidas-stock-checker/blob/master/assets/index.js
I can get it to give me the JSON response; I am just trying to strip out what I don't need and keep what I do, which I am finding very difficult, especially in Python.
Many Thanks!
Since you've said you can get a JSON response from the server, the first thing you need to do is tell Python to load it as JSON.
import json
data = json.loads(response_from_server)
After doing this you can access the values in your JSON object the way you would access them in a Python dict.
data["artist"]["id"]
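To strip the response down to the parts you care about, one option is to copy just the wanted keys into a new dict. This is a sketch under assumptions: the helper `pick` and every field name in the sample payload are hypothetical, so substitute the keys that actually appear in your endpoint's response.

```python
import json

def pick(mapping, wanted):
    """Return a new dict holding only the `wanted` keys that exist in `mapping`."""
    return {key: mapping[key] for key in wanted if key in mapping}

# Hypothetical server response; the field names are made up for illustration.
response_from_server = json.dumps({
    "id": "BZ0221",
    "long_description": "A long marketing text that is not needed.",
    "price": 149.95,
    "availability": {"orderable": True, "stock_level": 12},
})

data = json.loads(response_from_server)

# Keep the identifier and the stock information, drop everything else.
slim = pick(data, ["id", "availability"])
print(slim)
```

Keys missing from the response are skipped rather than raising a KeyError, which helps when some products lack a field.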