I'm working with nested JSON data in pandas, but I run into a problem once I extract the DataFrame of the nested data.
The data looks like this:
[{"export_id":"COL-EXP-1894","origin_office":"EXAMPLE","destination_office":"","incoterms":"","shipment_date":"","export_date":"2023-01-01","origin_port":"Buenaventura","destination_port":"New York/New Jersey","bl_number":null,"shipping_line":null,"shipping_mode":null,"vessel_name":null,"voyage_number":null,"reservation_number":null,"container_number":null,"seal_number":null,"eta":null,"etd":null,"export_status":"in_progress","ico_list":[]}]
Reading records like that works fine, but some records have a non-empty ico_list, like:
[{"export_id":"COL-EXP-1894","origin_office":"EXAMPLE","destination_office":"","incoterms":"","shipment_date":"","export_date":"2023-01-01","origin_port":"Buenaventura","destination_port":"New York/New Jersey","bl_number":null,"shipping_line":null,"shipping_mode":null,"vessel_name":null,"voyage_number":null,"reservation_number":null,"container_number":null,"seal_number":null,"eta":null,"etd":null,"export_status":"in_progress","ico_list":[{"ico_id":"03-0178-436-23","contract_id":"CI-1046","customer":null,"origin_office":"example","destination_office":"example","incoterm":"CIF","quality":"ML","mark":"example","packaging_type":"Nitrogen-Flushed Vac-Packed Boxes - 35KG","packaging_capacity":35.0,"units":1,"quantity":35.0,"certification":null}]}]
There can be more than one entry in ico_list, not just one as in the example, so I implemented this:
if response.status_code == 200:
    data_str = response.text
    try:
        atlas_api_data = json.loads(data_str)
        df_atlas = pd.json_normalize(atlas_api_data)
        #print(df_atlas)
    except:
        print('ErrorOccured While Parsing JSON ATLAS API TO Dataframe')
    df_atlas2 = pd.json_normalize(df_atlas['ico_list'].loc[95])
    for i, row in df_atlas.iterrows():
        export_id = row['export_id']
        origin_office = row['origin_office']
        destination_office = row['destination_office']
        export_date = row['export_date']
        origin_port = row['origin_port']
        destination_port = row['destination_port']
        bl_number = row['bl_number']
        shipping_line = row['shipping_line']
        shipping_mode = row['shipping_mode']
        vessel_name = row['vessel_name']
        voyage_number = row['voyage_number']
        reservation_number = row['reservation_number']
        container_number = row['container_number']
        seal_number = row['seal_number']
        export_status = row['export_status']
        values = [export_id,origin_office,destination_office,export_date,origin_port,destination_port,
                  bl_number,shipping_line,shipping_mode,vessel_name,voyage_number,reservation_number,container_number,
                  seal_number,export_status]
        data_list.append(values)
        df_atlas2 = pd.json_normalize(df_atlas['ico_list'].loc[i])
        if df_atlas2.empty:
            print('Empty DF')
        else:
            for row_ico, j in df_atlas2.iterrows():
                ico_id = row_ico['ico_id']
                contract_id = row_ico['contract_id']
                customer = row_ico['customer']
                incoterm = row_ico['incoterm']
                quality = row_ico['quality']
                mark = row_ico['mark']
                packaging_type = row_ico['packaging_type']
                packaging_capacity = row_ico['packaging_capacity']
                units = row_ico['units']
                quantity = row_ico['quantity']
                certification = row_ico['certification']
                ico_values = [export_id,ico_id,contract_id,customer,incoterm,quality,mark,packaging_type,packaging_capacity,units,quantity,certification]
                data_ico_list.append(ico_values)
This way I extract only the data that I need. It works for the first level, but when it reaches the second iterrows() it fails with:
TypeError Traceback (most recent call last)
Cell In [4], line 43
41 else:
42 for row_ico, j in df_atlas2.iterrows():
---> 43 ico_id = row_ico['ico_id']
44 contract_id = row_ico['contract_id']
45 customer = row_ico['customer']
TypeError: 'int' object is not subscriptable
When I print df_atlas2 it looks normal:
[Image: df_atlas2 before it goes into iterrows()]
I tried df_atlas2['ico_id'].astype(str) on all the columns and ico_id = str(row_ico['ico_id']), but I still get the same error.
If you know how to solve this, many thanks!
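One likely cause, offered as a sketch rather than a confirmed diagnosis: iterrows() yields (index, row) pairs, so in for row_ico, j in df_atlas2.iterrows() the name row_ico receives the integer index, and subscripting it raises the TypeError shown above. The example below uses made-up records shaped like the ones in the question (the ids and quantities are assumptions) and shows both the corrected unpacking order and an alternative that lets json_normalize flatten ico_list via its record_path and meta arguments.

```python
import pandas as pd

# Made-up records shaped like the API response in the question.
atlas_api_data = [
    {"export_id": "COL-EXP-1", "export_status": "in_progress", "ico_list": []},
    {
        "export_id": "COL-EXP-2",
        "export_status": "in_progress",
        "ico_list": [
            {"ico_id": "ICO-A", "contract_id": "CI-1", "quantity": 35.0},
            {"ico_id": "ICO-B", "contract_id": "CI-2", "quantity": 70.0},
        ],
    },
]

# Option 1: keep the loops, but unpack iterrows() as (index, row).
df_atlas = pd.json_normalize(atlas_api_data)
data_ico_list = []
for i, row in df_atlas.iterrows():
    df_ico = pd.json_normalize(row["ico_list"])  # empty frame for an empty list
    for j, row_ico in df_ico.iterrows():  # index first, row second
        data_ico_list.append(
            [row["export_id"], row_ico["ico_id"], row_ico["quantity"]]
        )

# Option 2: flatten the nested list in one call, carrying export_id along.
df_flat = pd.json_normalize(
    atlas_api_data, record_path="ico_list", meta=["export_id"]
)
print(df_flat)
```

Exports with an empty ico_list simply contribute no rows in either option. Fields that exist at both levels with the same name, such as origin_office, would probably need the meta_prefix argument to avoid a naming conflict.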
I have this JSON output from an HTML page and I want to check the stock. I have built everything else already, but I am stuck at the part where Python needs to tell me whether there is stock or not.
All the numbers are stores around the Netherlands. I just want to code that Python prints ''In Stock'' if only ONE of them is TRUE. I tried if ... or ... == 'True', but then if one of the stores is False, it still tells me the item is out of stock.
Any idea what kind of code I need so that Python tells me whether one of the stores has stock?
I am using BS4 (BeautifulSoup) to parse the JSON.
I'm just stuck at the if ... == 'True' part.
Thanks!
{"1665134":{"642":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1298":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1299":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1322":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1325":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1966":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1208":{"hasStock":false,"hasShowModel":false,"lowStock":false},"193":{"hasStock":false,"hasShowModel":false,"lowStock":false},"194":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1102":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1360":{"hasStock":false,"hasShowModel":false,"lowStock":false},"852":{"hasStock":false,"hasShowModel":false,"lowStock":false},"853":{"hasStock":false,"hasShowModel":false,"lowStock":false},"854":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1239":{"hasStock":false,"hasShowModel":false,"lowStock":false},"855":{"hasStock":false,"hasShowModel":false,"lowStock":false},"856":{"hasStock":false,"hasShowModel":false,"lowStock":false},"857":{"hasStock":false,"hasShowModel":false,"lowStock":false},"858":{"hasStock":false,"hasShowModel":false,"lowStock":false},"859":{"hasStock":false,"hasShowModel":false,"lowStock":false},"860":{"hasStock":false,"hasShowModel":false,"lowStock":false},"861":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1246":{"hasStock":false,"hasShowModel":false,"lowStock":false},"862":{"hasStock":false,"hasShowModel":false,"lowStock":false},"863":{"hasStock":false,"hasShowModel":false,"lowStock":false},"864":{"hasStock":false,"hasShowModel":false,"lowStock":false},"865":{"hasStock":false,"hasShowModel":false,"lowStock":false},"866":{"hasStock":false,"hasShowModel":false,"lowStock":false},"867":{"hasStock":false,"hasShowModel":false,"lowStock":false},"484":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1380":{"hasStock":false,"hasShowModel":false,"lowStock":false},"868":{"hasStock":false,
"hasShowModel":false,"lowStock":false},"869":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1381":{"hasStock":false,"hasShowModel":false,"lowStock":false},"870":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1255":{"hasStock":false,"hasShowModel":false,"lowStock":false},"871":{"hasStock":false,"hasShowModel":false,"lowStock":false},"360":{"hasStock":false,"hasShowModel":false,"lowStock":false},"872":{"hasStock":false,"hasShowModel":false,"lowStock":false},"873":{"hasStock":false,"hasShowModel":false,"lowStock":false},"746":{"hasStock":false,"hasShowModel":false,"lowStock":false},"875":{"hasStock":false,"hasShowModel":false,"lowStock":false},"876":{"hasStock":false,"hasShowModel":false,"lowStock":false},"749":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1391":{"hasStock":false,"hasShowModel":false,"lowStock":false},"880":{"hasStock":false,"hasShowModel":false,"lowStock":false},"499":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1275":{"hasStock":false,"hasShowModel":false,"lowStock":false},"1149":{"hasStock":false,"hasShowModel":false,"lowStock":false},"637":{"hasStock":false,"hasShowModel":false,"lowStock":false}}}
Python code:
def monitor():
    try:
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        voorraad = response.json()
        v1 = (voorraad['{}'.format(productid)]['193']['hasStock'])
        v2 = (voorraad['{}'.format(productid)]['194']['hasStock'])
        v3 = (voorraad['{}'.format(productid)]['360']['hasStock'])
        v4 = (voorraad['{}'.format(productid)]['484']['hasStock'])
        v5 = (voorraad['{}'.format(productid)]['499']['hasStock'])
        v6 = (voorraad['{}'.format(productid)]['637']['hasStock'])
        v7 = (voorraad['{}'.format(productid)]['642']['hasStock'])
        v8 = (voorraad['{}'.format(productid)]['746']['hasStock'])
        v9 = (voorraad['{}'.format(productid)]['749']['hasStock'])
        v10 = (voorraad['{}'.format(productid)]['852']['hasStock'])
        v11 = (voorraad['{}'.format(productid)]['853']['hasStock'])
        v12 = (voorraad['{}'.format(productid)]['854']['hasStock'])
        v13 = (voorraad['{}'.format(productid)]['855']['hasStock'])
        v14 = (voorraad['{}'.format(productid)]['856']['hasStock'])
        v15 = (voorraad['{}'.format(productid)]['857']['hasStock'])
        v16 = (voorraad['{}'.format(productid)]['858']['hasStock'])
        v17 = (voorraad['{}'.format(productid)]['859']['hasStock'])
        v18 = (voorraad['{}'.format(productid)]['860']['hasStock'])
        v19 = (voorraad['{}'.format(productid)]['861']['hasStock'])
        v20 = (voorraad['{}'.format(productid)]['862']['hasStock'])
        v21 = (voorraad['{}'.format(productid)]['863']['hasStock'])
        v22 = (voorraad['{}'.format(productid)]['864']['hasStock'])
        v23 = (voorraad['{}'.format(productid)]['865']['hasStock'])
        v24 = (voorraad['{}'.format(productid)]['866']['hasStock'])
        v25 = (voorraad['{}'.format(productid)]['867']['hasStock'])
        v26 = (voorraad['{}'.format(productid)]['868']['hasStock'])
        v27 = (voorraad['{}'.format(productid)]['869']['hasStock'])
        v28 = (voorraad['{}'.format(productid)]['870']['hasStock'])
        v29 = (voorraad['{}'.format(productid)]['871']['hasStock'])
        v30 = (voorraad['{}'.format(productid)]['872']['hasStock'])
        v31 = (voorraad['{}'.format(productid)]['873']['hasStock'])
        v32 = (voorraad['{}'.format(productid)]['875']['hasStock'])
        v33 = (voorraad['{}'.format(productid)]['876']['hasStock'])
        v34 = (voorraad['{}'.format(productid)]['880']['hasStock'])
        v35 = (voorraad['{}'.format(productid)]['1102']['hasStock'])
        v36 = (voorraad['{}'.format(productid)]['1149']['hasStock'])
        v37 = (voorraad['{}'.format(productid)]['1208']['hasStock'])
        v38 = (voorraad['{}'.format(productid)]['1239']['hasStock'])
        v39 = (voorraad['{}'.format(productid)]['1246']['hasStock'])
        v40 = (voorraad['{}'.format(productid)]['1255']['hasStock'])
        v41 = (voorraad['{}'.format(productid)]['1275']['hasStock'])
        v42 = (voorraad['{}'.format(productid)]['1298']['hasStock'])
        v43 = (voorraad['{}'.format(productid)]['1299']['hasStock'])
        v44 = (voorraad['{}'.format(productid)]['1322']['hasStock'])
        v45 = (voorraad['{}'.format(productid)]['1325']['hasStock'])
        v46 = (voorraad['{}'.format(productid)]['1360']['hasStock'])
        v47 = (voorraad['{}'.format(productid)]['1380']['hasStock'])
        v48 = (voorraad['{}'.format(productid)]['1381']['hasStock'])
        v49 = (voorraad['{}'.format(productid)]['1391']['hasStock'])
        v50 = (voorraad['{}'.format(productid)]['1966']['hasStock'])
        if any(v1, v2, v3):
            print(colored('[{}] ' + 'IN STOCK | ' + (product_title), 'green').format(str(datetime.now())))
            send_to_discord(product_title, webpagina, footerlogo, url, image_url)
            time.sleep(50)
            exit()
        else:
            print(colored('[{}] ' + 'OUT OF STOCK | ' + (product_title), 'red').format(str(datetime.now())))
            time.sleep(2)
The any() call was just a test; I'm not familiar with it.
Your code will get way out of hand if you have that many manual entries that each do the same thing. First, I'd suggest making a list of the store IDs you care about, as strings, because JSON object keys are strings:
stores = ['193', '194', '360', '484', ...]
response.json() from requests already parses the body into a Python dict, so that part of your code is fine. If you only have the raw text instead, first import
import json
then use json.loads() to parse a string, or json.dumps() to turn an object back into a string:
store = json.loads(response.text)
I assume "1665134" is the product ID, so you can iterate over the per-store objects stored under it:
for store_id, info in store['1665134'].items():
    if info['hasStock']:
        pass  # do stuff with stock
    else:
        pass  # this store has no stock
First, never write 50 variables manually when you can replace them with a loop.
Assuming you need some specific store IDs, put them in a list and loop over it:
ids = ['193', '194', '360']  # ...and the remaining store IDs
You said " just want to code that Python prints ''In Stock'' if only ONE of them is TRUE." Taken literally, that means exactly one: if the value is true for more than one store, the program should report false. (Logically you need at least one, not exactly one, but I'm following what you said.) Also, judging by the code you already wrote, you don't seem to care which specific store has the stock.
In this case, this should work for you:
ids = ['193', '194', '360']  # ...and the remaining store IDs
occurancesOfTrue = 0
for i in ids:
    if voorraad['{}'.format(productid)][i]['hasStock']:
        occurancesOfTrue += 1
if occurancesOfTrue == 1:
    print(colored('[{}] ' + 'IN STOCK | ' + (product_title), 'green').format(str(datetime.now())))
    send_to_discord(product_title, webpagina, footerlogo, url, image_url)
    time.sleep(50)
    exit()
else:
    print(colored('[{}] ' + 'OUT OF STOCK | ' + (product_title), 'red').format(str(datetime.now())))
    time.sleep(2)
If you need at least one occurrence instead of exactly one, replace if occurancesOfTrue == 1: with if occurancesOfTrue >= 1:
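For the "at least one store" reading, the built-in any() is probably the shortest route. Note that any() takes a single iterable, so any(v1, v2, v3) raises a TypeError, while any([v1, v2, v3]) or a generator expression works. The sketch below is only an illustration: the helper name any_store_has_stock and the sample data are assumptions, not part of the original code.

```python
def any_store_has_stock(voorraad, productid, store_ids=None):
    """Return True if at least one store reports hasStock for the product."""
    stores = voorraad[str(productid)]
    if store_ids is None:
        store_ids = stores.keys()  # check every store in the response
    # any() takes ONE iterable and stops at the first truthy value.
    return any(stores[store_id]["hasStock"] for store_id in store_ids)


# Made-up sample in the same shape as the response in the question.
sample = {
    "1665134": {
        "193": {"hasStock": False, "hasShowModel": False, "lowStock": False},
        "194": {"hasStock": True, "hasShowModel": False, "lowStock": False},
        "360": {"hasStock": False, "hasShowModel": False, "lowStock": False},
    }
}

in_stock = any_store_has_stock(sample, 1665134)
only_some = any_store_has_stock(sample, "1665134", ["193", "360"])
print("IN STOCK" if in_stock else "OUT OF STOCK")
```

Because the helper iterates over whatever stores the response contains by default, it keeps working if stores are added or removed, which the 50 hand-written lookups would not.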
I have two functions, one which creates a DataFrame from a CSV and another which manipulates that DataFrame. There is no problem the first time I pass the raw data through lsc_age(import_data()). However, I get the above-referenced error (TypeError: 'DataFrame' object is not callable) on the second and every later call. Any ideas for how to solve the problem?
def import_data(csv,date1,date2):
    global data
    data = pd.read_csv(csv,header=1)
    data = data.iloc[:,[0,1,4,6,7,8,9,11]]
    data = data.dropna(how='all')
    data = data.rename(columns={"National: For Dates 9//1//"+date1+" - 8//31//"+date2:'event','Unnamed: 1':'time','Unnamed: 4':'points',\
                                'Unnamed: 6':'name','Unnamed: 7':'age','Unnamed: 8':'lsc','Unnamed: 9':'club','Unnamed: 11':'date'})
    data = data.reset_index().drop('index',axis=1)
    data = data[data.time!='Time']
    data = data[data.points!='Power ']
    data = data[data['event']!="National: For Dates 9//1//"+date1+" - 8//31//"+date2]
    data = data[data['event']!='USA Swimming, Inc.']
    data = data.reset_index().drop('index',axis=1)
    for i in range(len(data)):
        if len(str(data['event'][i])) <= 3:
            data['event'][i] = data['event'][i-1]
        else:
            data['event'][i] = data['event'][i]
    data = data.dropna()
    age = []
    event = []
    gender = []
    for row in data.event:
        gender.append(row.split(' ')[0])
        if row[:9]=='Female 10':
            n = 4
            groups = row.split(' ')
            age.append(' '.join(groups[1:n]))
            event.append(' '.join(groups[n:]))
        elif row[:7]=='Male 10':
            n = 4
            groups = row.split(' ')
            age.append(' '.join(groups[1:n]))
            event.append(' '.join(groups[n:]))
        else:
            n = 2
            groups = row.split(' ')
            event.append(' '.join(groups[n:]))
            groups = row.split(' ')
            age.append(groups[1])
    data['age_group'] = age
    data['event_simp'] = event
    data['gender'] = gender
    data['year'] = date2
    return data
def lsc_age(data_two):
    global lsc, lsc_age, top, all_performers
    lsc = pd.DataFrame(data_two['event'].groupby(data_two['lsc']).count()).reset_index().sort_values(by='event',ascending=False)
    lsc_age = data_two.groupby(['year','age_group','lsc'])['event'].count().reset_index().sort_values(by=['age_group','event'],ascending=False)
    top = pd.concat([lsc_age[lsc_age.age_group=='10 & under'].head(),lsc_age[lsc_age.age_group=='11-12'].head(),\
                     lsc_age[lsc_age.age_group=='13-14'].head(),lsc_age[lsc_age.age_group=='15-16'].head(),\
                     lsc_age[lsc_age.age_group=='17-18'].head()],ignore_index=True)
    all_performers = pd.concat([lsc_age[lsc_age.age_group=='10 & under'],lsc_age[lsc_age.age_group=='11-12'],\
                                lsc_age[lsc_age.age_group=='13-14'],lsc_age[lsc_age.age_group=='15-16'],\
                                lsc_age[lsc_age.age_group=='17-18']],ignore_index=True)
    all_performers = all_performers.rename(columns={'event':'no. top 100'})
    all_performers['age_year_lsc'] = all_performers.age_group+' '+all_performers.year.astype(str)+' '+all_performers.lsc
    return all_performers
years = [i for i in range(2008,2018)]
for i in range(len(years)-1):
    lsc_age(import_data(str(years[i+1])+"national100.csv",\
                        str(years[i]),str(years[i+1])))
During the first call to your function lsc_age(), the line
lsc_age = data_two.groupby(['year','age_group','lsc'])['event'].count().reset_index().sort_values(by=['age_group','event'],ascending=False)
overwrites your function object with a DataFrame. This happens because you declared the name as global with
global lsc, lsc_age, top, all_performers
so the assignment rebinds the module-level name lsc_age instead of creating a local variable. Functions in Python are objects, and the name that refers to one can be reassigned like any other variable. On the second call, lsc_age is therefore a DataFrame, and calling it raises the TypeError.
To solve your problem, avoid the global declarations. They do not seem to be necessary. Pass your data around through function arguments and return values, and give the DataFrame a name that differs from the function's.
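The effect can be reproduced without pandas. The following is a minimal sketch with hypothetical names (summarize and summarize_fixed are not from the question); it assumes the code runs as an ordinary script or module, where the function is defined at module level.

```python
def summarize(values):
    # The global statement makes the assignment below rebind the
    # module-level name "summarize", replacing the function itself.
    global summarize
    summarize = {"total": sum(values)}
    return summarize


first = summarize([1, 2, 3])  # works: the name still refers to the function

try:
    summarize([4, 5, 6])  # the name now refers to a dict
    second_call_error = None
except TypeError as exc:
    second_call_error = str(exc)


def summarize_fixed(values):
    # No global statement: the result lives in a local name and is returned.
    result = {"total": sum(values)}
    return result


totals = []
for chunk in ([1, 2, 3], [4, 5, 6]):
    totals.append(summarize_fixed(chunk))  # safe to call repeatedly
```

The same pattern applies to lsc_age: keep the intermediate frames in local names and collect the returned all_performers in a list inside the loop over years.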
I'm not entirely sure why I'm getting a dictionary KeyError. I'm trying to build a multi-level dict by assigning with the = sign, and I get a KeyError on 'metrics', but not on the first two keys.
doc['timestamp']
and
doc['instance_id']
both work fine, but when it gets to metrics it gives me a metrics key error. I'm not entirely sure why.
doc = {}
doc['timestamp'] = datetime.now()
#doc['instance_id'] = get_cloud_app_name()
doc['instance_id'] = "MyMac"
cpu_dict_returned = get_cpu_info()
doc['metrics']['cpu_usage']['user_cpu'] = cpu_dict_returned['user_cpu']
doc['metrics']["cpu_usage"]['system_cpu'] = cpu_dict_returned['system_cpu']
doc['metrics']["cpu_usage"]['idle_cpu'] = cpu_dict_returned['idle_cpu']
doc['metrics']["cpu_usage"]['cpu_count'] = cpu_dict_returned['cpu_count']
You must create the sub-dictionaries before using them:
doc = {}
doc['timestamp'] = datetime.now()
doc['instance_id'] = "MyMac"
cpu_dict_returned = get_cpu_info()
doc['metrics'] = {}
doc['metrics']['cpu_usage'] = {}
doc['metrics']['cpu_usage']['user_cpu'] = cpu_dict_returned['user_cpu']
doc['metrics']["cpu_usage"]['system_cpu'] = cpu_dict_returned['system_cpu']
doc['metrics']["cpu_usage"]['idle_cpu'] = cpu_dict_returned['idle_cpu']
doc['metrics']["cpu_usage"]['cpu_count'] = cpu_dict_returned['cpu_count']
You can do this more succinctly using a dictionary comprehension:
doc = {}
doc['timestamp'] = datetime.now()
doc['instance_id'] = "MyMac"
cpu_dict_returned = get_cpu_info()
doc['metrics'] = {
'cpu_usage':
{k: cpu_dict_returned.get(k)
for k in ['user_cpu', 'system_cpu', 'idle_cpu', 'cpu_count']}
}
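Note that the inner cpu_usage dictionary is built first by the comprehension and is then inserted into the new metrics dictionary. Also, .get(k) yields None for a key missing from cpu_dict_returned instead of raising a KeyError.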
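If the nesting depth varies, creating the intermediate dictionaries on demand is another option. The sketch below shows two standard-library approaches, a recursive collections.defaultdict and dict.setdefault. The helper name nested_dict and the CPU values are made up for illustration; the plain dict cpu_dict_returned stands in for whatever get_cpu_info() returns.

```python
from collections import defaultdict
from datetime import datetime


def nested_dict():
    # Each missing key creates another nested_dict on first access.
    return defaultdict(nested_dict)


# Stand-in for the get_cpu_info() result; the values are made up.
cpu_dict_returned = {
    "user_cpu": 12.5,
    "system_cpu": 3.1,
    "idle_cpu": 84.4,
    "cpu_count": 8,
}

doc = nested_dict()
doc["timestamp"] = datetime.now()
doc["instance_id"] = "MyMac"
for key in ("user_cpu", "system_cpu", "idle_cpu", "cpu_count"):
    # No KeyError: "metrics" and "cpu_usage" are created automatically.
    doc["metrics"]["cpu_usage"][key] = cpu_dict_returned[key]

# With a plain dict, setdefault does the same job one level at a time.
plain = {}
plain.setdefault("metrics", {}).setdefault("cpu_usage", {})["user_cpu"] = (
    cpu_dict_returned["user_cpu"]
)
```

One trade-off with the defaultdict version: merely reading a missing key also creates it, so a typo in a lookup silently adds an empty entry instead of raising a KeyError.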