How do I access JSON data to send to Pandas? - python

I make a call to an API and get a JSON response. For the better part of 2 days I have been trying to access the data and send it to a CSV file, or push it into Pandas, with no luck. I have gone through page after page of assistance, but something is just not clicking.
Below is the JSON data; all I need are the following elements: singleLineAddress, isAgentsAdvice and contractDate.
[
    {
        "_embedded":{
            "propertySummaryList":[
                {
                    "propertySummary":{
                        "address":{
                            "singleLineAddress":"Example St"
                        },
                        "id":47328284,
                        "lastSaleDetail":{
                            "contractDate":"Example",
                            "isAgentsAdvice":Yes,
                            "isArmsLength":0,
                            "isPriceWithheld":0,
The ideal end result would be a CSV export with the following column names:
Address | AA | Contract_Date
So far I've tried to create a simple dataframe in Python, however it seems to always print on single lines, so I moved over to attempting to 'flatten' the file for easier processing, but no data was extracted at all in those attempts.
I attempted to follow a few GitHub recipes, but the overall theme seemed to be just flattening the data, which as above didn't seem to work well with my JSON file.
I attempted the normal dataframe route but didn't have much luck; my main focus has been on both json_normalize and flatten_json type work, but if someone wants to offer a better solution, I am all ears.
Any help is appreciated. I'd also appreciate any leads on what may be 'missing' from my knowledge base around JSON and Pandas, as it's been quite handy the last few weeks.
Cheers
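A minimal sketch of the json_normalize route, assuming the response keeps the structure shown above. The sample below is reconstructed from the truncated snippet, so the values are placeholders rather than real API output:

import pandas as pd

# Reconstructed sample mirroring the shape of the API response above.
api_response = [
    {
        "_embedded": {
            "propertySummaryList": [
                {
                    "propertySummary": {
                        "address": {"singleLineAddress": "Example St"},
                        "id": 47328284,
                        "lastSaleDetail": {
                            "contractDate": "Example",
                            "isAgentsAdvice": "Yes",
                            "isArmsLength": 0,
                            "isPriceWithheld": 0,
                        },
                    }
                }
            ]
        }
    }
]

# One row per entry in propertySummaryList; nested keys become dot-joined
# column names such as 'propertySummary.address.singleLineAddress'.
df = pd.json_normalize(api_response, record_path=["_embedded", "propertySummaryList"])

out = df[[
    "propertySummary.address.singleLineAddress",
    "propertySummary.lastSaleDetail.isAgentsAdvice",
    "propertySummary.lastSaleDetail.contractDate",
]].rename(columns={
    "propertySummary.address.singleLineAddress": "Address",
    "propertySummary.lastSaleDetail.isAgentsAdvice": "AA",
    "propertySummary.lastSaleDetail.contractDate": "Contract_Date",
})

out.to_csv("properties.csv", index=False)

If the real response wraps this list in another layer, the record_path (and the dotted column names) will need adjusting to match.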

Related

Python script to automate tasks with Excel

First of all I am very sorry if this is a common question, however I am still very new to the programming world and cannot figure out the keywords for what I'm actually looking for.
Basically I have an Excel spreadsheet with data (picture of Excel sheet) - for each identifier I have a 0 or a 1. My task is to copy the number in the identifier column, paste it into a webpage from my workplace and click a button to exclude the customer from billing. Once I have done this, I go back to the Excel spreadsheet and change the number from 0 to 1. Since there is a lot to go through, I thought this would be a fun project to start learning a basic script with. I did not make it very far though!
import pandas as pd
Migreringer = pd.read_excel('Migreringer.xls')
ExludeBilling = Migreringer[["Exclude Billing", "Identifier"]]
IsExcludedBilling = Migreringer["Exclude Billing"] > 0
I was hoping that someone could give me a good idea of how to move on from here. My idea is that it would check each row for the True/False statement from IsExcludedBilling and then, as a start, copy the identifier if the statement is False and paste it into a Word document or something similar to test this out. I just cannot seem to figure out how to make Python go through each row, validate the statements, and then copy something from a different column in the same row to a different document.
I know from experience that I am more motivated in learning with a project in mind, so that is why I've chosen to start here. Maybe I should take a couple of steps back before engaging with a project like this?
Do you want this?
import pandas as pd

df = pd.DataFrame(
    {
        'Exclude Billing': [1, 0, 0, 1, 0],
        'Identifier': [41823, 41835, 41819, 41798, 41812],
    }
)

# Keep only the rows that have not been excluded yet and export them.
df_zero = df.loc[df['Exclude Billing'] == 0, :]
df_zero.to_csv('data_with_zero.csv', index=False)
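To go row by row instead, as described in the question, a sketch along these lines should work. The file and column names are taken from the question; the output file names are made up:

import pandas as pd

Migreringer = pd.read_excel('Migreringer.xls')

# Collect the identifiers that are not excluded yet in a plain text file.
with open('identifiers_to_process.txt', 'w') as f:
    for index, row in Migreringer.iterrows():
        if row['Exclude Billing'] == 0:
            f.write(f"{row['Identifier']}\n")
            # Once the customer has been excluded on the web page,
            # flip the flag in the DataFrame.
            Migreringer.loc[index, 'Exclude Billing'] = 1

# Save the updated flags to a new file rather than overwriting the original.
Migreringer.to_excel('Migreringer_updated.xlsx', index=False)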

Extract certain data from excel with Python to .ini-file

Hi, I have a spreadsheet with different values in column A (DeviceName) and column B (SystemIP).
I want to copy these 2 values into an .ini file.
The .ini file should basically look like the one below, with the IP and device name being dynamic and getting their values from an .xlsx file.
[Network]
SystemIP=0.0.0.0
SubnetMask=255.255.255.0
DefaultGateway=10.0.0.1
[Device]
DeviceName=XXXX
MAC=12:32:FA:AB:D2
I've tested some things with Pandas to read the Excel file, however it doesn't really work the way I want.
If anyone could guide me in the right direction to manage this with a Python script I'd really appreciate it!
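A minimal sketch, assuming the spreadsheet is called 'devices.xlsx', has the columns 'DeviceName' and 'SystemIP', and that one .ini file should be written per row; the remaining .ini values are copied as fixed text from the template above:

import pandas as pd

df = pd.read_excel('devices.xlsx')

for _, row in df.iterrows():
    lines = [
        "[Network]",
        f"SystemIP={row['SystemIP']}",
        "SubnetMask=255.255.255.0",
        "DefaultGateway=10.0.0.1",
        "[Device]",
        f"DeviceName={row['DeviceName']}",
        "MAC=12:32:FA:AB:D2",
        "",
    ]
    # One .ini file per device, named after the DeviceName column.
    with open(f"{row['DeviceName']}.ini", "w") as f:
        f.write("\n".join(lines))

Writing the lines directly (instead of going through configparser) keeps the exact key casing shown in the template, since configparser lowercases option names by default.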

Storing, modifying and manipulating web scraped data

I'm working on a Python web scraper that pulls data from a car advertising site. I got the scraping part all done with BeautifulSoup, but I've run into many difficulties trying to store and modify the data. I would really appreciate some advice on this part, since I'm lacking knowledge in this area.
So here is what I want to do:
Scrape the data each hour (done).
Store scraped data as a dictionary in a .JSON file (done).
Every time an ad_link is not found in scraped_data.json, set dict['Status'] = 'Inactive' (done).
If a car's price changes, print a notification and add the old price to the dictionary. On this part I came across many challenges with the .json approach.
I've kept using 2 .json files and comparing them to each other (scraped_data_temp, permanent_data.json), but I think this is far from the best method.
What would you guys suggest? How should I do this?
What would be the best way to approach manipulating this kind of data? (Databases maybe? - I have no experience with them but I'm eager to learn.) And what would be a good way to represent this kind of data, pygal?
Thank you very much.
If you have larger data, I would definitely recommend using some kind of DB. If you don't need a DB server, you can use SQLite; I have used it in the past to save bigger data locally. You can use SQLAlchemy in Python to interact with databases.
As for displaying data, I tend to use matplotlib. It's extremely flexible and has extensive documentation and examples, so you can adjust the data to your liking - graphs, charts, etc.
I'm assuming that you are using Python 3.
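For example, a minimal SQLite sketch using only the standard library; the database name, table layout and field names below are assumptions, not taken from the scraper in the question:

import sqlite3

conn = sqlite3.connect('ads.db')
conn.execute(
    """CREATE TABLE IF NOT EXISTS ads (
           ad_link TEXT PRIMARY KEY,
           price   REAL,
           status  TEXT
       )"""
)

def upsert_ad(ad_link, price):
    """Insert a new ad, or detect and record a price change on an existing one."""
    row = conn.execute(
        "SELECT price FROM ads WHERE ad_link = ?", (ad_link,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO ads (ad_link, price, status) VALUES (?, ?, 'Active')",
            (ad_link, price),
        )
    elif row[0] != price:
        print(f"Price change for {ad_link}: {row[0]} -> {price}")
        conn.execute("UPDATE ads SET price = ? WHERE ad_link = ?", (price, ad_link))
    conn.commit()

# Called once per scraped ad, e.g. every hour.
upsert_ad('https://example.com/car/123', 4500.0)

Marking ads as 'Inactive' then becomes a single UPDATE over the links that were not seen in the latest scrape, instead of comparing two JSON files.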

Parsing a CSV into a database for an API using Python?

I'm going to use data from a .csv to train a model to predict user activity on Google Ads (impressions, clicks) in relation to the weather for a given day. I have a .csv that contains 6000+ records of this info and want to parse it into a database using Python.
I tried making a df in pandas, but for some reason the whole table isn't shown. The middle columns (there are about 7 columns, I think) and rows (numbered over 6000, as I mentioned) are replaced with '...' when I print the table, so I'm not sure if all of the information is being stored and whether it will be usable.
My next attempt will possibly be SQLite, but since it's local storage, will this interfere with someone else making requests to my API endpoint if I don't have the db actively open at all times?
Thanks in advance.
If you used pd.read_csv() I can assure you all of the info is there, it's just not being displayed.
You can check by doing something like print(df['Column_name_you_are_interested_in'].tolist()) just to make sure, though. You can also use the various count-type methods in pandas to make sure all of your lines are there.
Pandas is pretty versatile, so it shouldn't have trouble with 6000 lines.
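For example (the file name here is made up; the '...' in the printout is only pandas truncating the display, not missing data):

import pandas as pd

df = pd.read_csv('ads_weather.csv')

print(df.shape)    # (rows, columns) actually held in memory
print(df.count())  # non-null count per column

# Show every column when printing; widen the row limit too if really needed.
pd.set_option('display.max_columns', None)
# pd.set_option('display.max_rows', None)  # careful with 6000+ rows
print(df.head())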

Trying to work with API JSON output in Python

I am a newbie to the world of Python and JSON, though I've managed to work my way through most problems. The latest one, though, is stumping me. I am trying to work with the API at localbitcoins.com and the JSON file is here: LBC_JSON - it's a public file.
The output is quite large. I have tried working with it in pandas using this code:
from pandas.io.json import json_normalize
from pandas.io.json import read_json
pandas_json = read_json('https://localbitcoins.com/buy-bitcoins-online/alipay/.json')
print(len(pandas_json))
print(type(pandas_json))
print(pandas_json)
But the complete data is not output, and what does print is truncated.
I have tried using the requests library and generating a response.json() on the response. Even though this brings in the complete data, I cannot find a way to access the data that I need. I've tried iterating through the data with no luck. All I need is the first price in the API.
I have managed to get this info by using BeautifulSoup and CSS tags, but I don't feel this is the correct way to access this info since an API is provided.
Thanks in advance for your answers.
You have to iterate over ad_list, for example:
for ad in pandas_json['data']['ad_list']:
    print(ad['data']['profile']['username'], ad['data']['temp_price'])
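Since you already have response.json() working with requests, a sketch along those lines (the URL is the one from the question; the key names follow the loop above):

import requests

url = 'https://localbitcoins.com/buy-bitcoins-online/alipay/.json'
payload = requests.get(url).json()

ads = payload['data']['ad_list']
if ads:
    # "The first price in the API", as asked in the question.
    print(ads[0]['data']['temp_price'])

# Or walk every ad:
for ad in ads:
    print(ad['data']['profile']['username'], ad['data']['temp_price'])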
