Keep calling an API until it is updated with the latest item (Python)

I want to call an API and compare the returned data to the data I have saved in a CSV. If there is a new data point, I want to update the CSV and return the DataFrame... The mystery is why the two variables appear to be the same, yet the if statement falls through to the else instead of recognizing that they are equal. If they are the same, the loop should keep running until an updated data point appears (see second_cell == lastItem1).
import pandas_datareader as pdr  # https://medium.com/swlh/pandas-datareader-federal-reserve-economic-data-fred-a360c5795013
import datetime

def datagetter():
    i = 1
    while i < 120:
        start = datetime.datetime(2005, 1, 1)  # Step 1: get data, and print last item
        end = datetime.datetime(2040, 1, 1)
        df = pdr.DataReader('PAYEMS', 'fred', start, end)  # This is the API
        lastItem1 = df["PAYEMS"].iloc[-1]  # find the last item in the data we have just downloaded
        print("Latest item from Fred API:", lastItem1)  # Print the last item
        with open('PAYEMS.csv', 'r') as logs:  # So first we open the most recent CSV file
            data = logs.readlines()
        last_row = data[-1].split(',')  # split is default on "," as CSVs should be
        second_cell = last_row[1]  # "second_cell" is our variable name for the saved datapoint from last month/week/day
        print("Last item, in thousands:", second_cell)
        if second_cell == lastItem1:
            print("CSV", second_cell, "API", lastItem1, "downloaded and stored items are the same, will re-loop until a new datapoint")
            print("attempt no.", i)
            i += 1
        else:
            df.to_csv("PAYEMS.csv")
            print("returning dataframe")
            # print(df.tail())
            return df

df = datagetter()
print(df.tail(3))

Solved my own problem:
my CSV was returning a string, and the API an int... not quite sure why.
So:

if second_cell == "":
    second_cell = 0
second_cell1 = int(float(second_cell))
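The root cause is a type mismatch: the value read back from the CSV is a string, while the value coming from the API is numeric, so == is always False. A minimal sketch of a safer comparison, assuming second_cell is the raw text from the CSV and lastItem1 is the numeric value from the API:

# Minimal sketch: normalise both sides to float before comparing.
saved_value = float(second_cell) if second_cell.strip() else 0.0
api_value = float(lastItem1)

if saved_value == api_value:
    print("no new data point yet, keep polling")
else:
    print("new data point found")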


Changing a specific "cell" in a Pandas DataFrame using a conditional on that "cell" but after finding the end of a column

Edited with more info
I am relatively new to Stack Overflow and to Pandas. I have tried to find answers on how to do this but cannot find a definitive one, so I am putting it to the hive mind.
I have data stored in a variable (from MQTT) that needs to be added at index 5. Before it gets added I first need to find the last entry with the ID "12345" and get the value of its matching TYPE.
Then I need to add the new data from the variable at index 5 with the opposite of what is in the filtered result: if the filtered value is "" or "OUT", create a new entry with the data from the variable but set TYPE to "IN"; if the value is "IN", create a new entry and set TYPE to "OUT".
So I have a df with just 4 columns
ID DATE TIME TYPE
0 12345 20200518 2018 IN
1 22345 20200518 2019 IN
2 32345 20200518 2036 IN
3 42345 20200518 2105 IN
4 12345 20200518 2201 OUT
I want to find the last entry for "12345" (I can build a mask with filt = df['ID'] == IDT, where IDT = 12345, which lets me select the rows with that ID). Once I have found that last row I want to "read" the value in its TYPE "cell", and then put the opposite underneath as a new record (sketched below).
If it is "" or "OUT", the new row should say "IN", and it should be added to the df (not overwrite the filtered result).
In essence it is to keep track of an RFID system, and line 5 should be
5 12345 20200518 2343 IN
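A minimal sketch of that lookup, assuming the DataFrame above is named df and IDT holds the tag that was just scanned:

IDT = 12345  # tag just read; assumed to match the dtype of the ID column

# boolean mask of all rows for this tag, then take the last matching row
matches = df.loc[df['ID'] == IDT]
last_type = matches['TYPE'].iloc[-1] if not matches.empty else ''

# toggle IN/OUT for the new record
new_type = 'OUT' if last_type == 'IN' else 'IN'
print(last_type, '->', new_type)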
New code here
Thank you for replying. I added that to my code and had to play around with it to get the errors out of the way, but there is a point I can't get past. Here is the entire script (I know I still need some automation to keep the script running, and some if statements to stop the file duplicating text, etc.).
This is the error log: Caught exception in on_message: index -1 is out of bounds for axis 0 with size 0
import datetime
import paho.mqtt.client as mqtt
import time
import pandas as pd
import numpy as np

# message = ("id2436520201757") used for testing
today = datetime.date.today()
filename = str(today)

# opens a file with today's date and adds headers
with open(filename + ".csv", "a") as hashsearch:
    hashsearch.write("ID,DATE,TIME,TYPE \n")
    hashsearch.close()

df = pd.read_csv(filename + ".csv")

def on_message(client, userdata, message):
    print("message received from phone ", str(message.payload.decode("utf-8")))
    print("message topic=", message.topic)
    print("message qos=", message.qos)
    print("message retain flag=", message.retain)
    # trying to find last entry of value and replace it in new line below...
    msg = str(message.payload.decode("utf-8"))
    idx = msg[2:7]
    print(idx, 'this is the new msg index')
    last_idx = df.loc[df['ID'] == idx].index[-1]
    last_entry = df['TYPE'][last_idx]
    if last_entry == 'OUT' or last_entry == '':
        new_entry = 'IN'
    elif last_entry == 'IN':
        new_entry = 'OUT'
    print(last_idx, " ", last_entry, " ", new_entry)
    payload = ("ID: " + msg[2:7] + " " + "DATE: " + msg[7:13] + " " + "TIME: " + msg[13:17] + new_entry + "\n")
    with open(filename + ".csv", "a") as f:
        f.write(payload)  # this should have all data in

def on_log(client, userdata, level, buf):
    print("log: ", buf)

########################################
broker_address = "192.168.0.46"
print("creating new instance")
client = mqtt.Client("esp2")  # create new instance
client.on_message = on_message  # attach function to callback
print("connecting to broker")
client.connect(broker_address)  # connect to broker
client.loop_start()  # start the loop
print("Subscribing to topic", "door")
client.subscribe("door")
print("Publishing message to topic", "window")
client.publish("window", "OFF")
client.on_log = on_log
time.sleep(10)  # wait
client.loop_stop()  # stop the loop
I thank you in advance for any help you may give on how to achieve the above.
Cheers,
J
So after having read your edit, I think I get what you want. You can find the 'opposite' entry as follows
IDT = 12345
# find the index in df corresponding to the last entry of IDT
last_idx = df.loc[df['ID'] == IDT].index[-1]
last_entry = df['TYPE'][last_idx]
if last_entry == 'OUT' or last_entry == '':
    new_entry = 'IN'
elif last_entry == 'IN':
    new_entry = 'OUT'
Now you can alter the data that you have from MQTT by setting its 'TYPE' column entry to new_entry and append it to the dataframe using df = df.append(new_data, ignore_index=True). Hope this helps :)
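To make that last step concrete, here is a minimal sketch of building the new row and adding it, assuming new_data holds the values parsed from the MQTT message; df.append works on older pandas but was removed in pandas 2.x, so pd.concat is the safer call on current versions:

import pandas as pd

# hypothetical values parsed from the MQTT payload
new_data = {'ID': 12345, 'DATE': 20200518, 'TIME': 2343, 'TYPE': new_entry}

# older pandas: df = df.append(new_data, ignore_index=True)
# current pandas equivalent:
df = pd.concat([df, pd.DataFrame([new_data])], ignore_index=True)
print(df.tail(1))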

IndexError: single positional indexer is out-of-bounds error while downloading data

I am running code to download data and save it to my local drive, but I am getting the error message mentioned above. Note that I first convert the dates to a different format, and I get this error while saving the files.
Can you please help me with this error?
import quandl
import os
import pandas as pd
import datetime as dt
import glob

if __name__ == "__main__":
    # Creating bucket to store missing data file.
    data_missing = []
    New_date = []
    # Defining a path to save CSV files after downloading, and deleting all existing csv files in one go.
    extension = 'csv'
    path = "F:/Tradepoint/MyMkt/"
    if not os.path.exists(path):
        os.mkdir(path)
    os.chdir(path)
    csv_count = [forma for forma in glob.glob('*.{}'.format(extension))]
    for csv_coun in range(len(csv_count)):
        os.remove(r"F:/Tradepoint/MyMkt/" + csv_count[csv_coun][0:])
    # Setting up quandl configuration, reading the ticker list, and setting the date range to download.
    quandl.ApiConfig.api_key = 'Hba3CzgNnEa2LMxR14FA'
    end_date = dt.date.today()
    diff_year = dt.timedelta(days=3650)
    start_date = end_date - diff_year
    stock_list = pd.read_csv(r"F:\Abhay_New\Abhay\Python\Project\SHARADAR_SF1.csv")
    # Looping through the quandl website to download data and renaming the files as required.
    for stock_lis in range(len(stock_list)):
        data = quandl.get_table('SHARADAR/SEP', date={'gte': start_date, 'lte': end_date}, ticker=stock_list.iloc[stock_lis])
        sort_by_date = data.sort_values('date')
        for sort_by_dat in range(len(sort_by_date['date'])):
            Date = dt.date.strftime(sort_by_date['date'][sort_by_dat], '%d-%m-%Y')
            New_date.append(Date)
        if len(data) > 1:
            Date = pd.Series(New_date).rename('Date').astype(str)
            OPEN = sort_by_date['open']
            HIGH = sort_by_date['high']
            LOW = sort_by_date['low']
            CLOSE = sort_by_date['close']
            VOLUME = sort_by_date['volume']
            final_data = pd.concat([Date, OPEN, HIGH, LOW, CLOSE, VOLUME], axis=1)
            stk = stock_list.iloc[sort_by_dat][0]
            final_data.to_csv(str(path + stk + '.csv'), sep=',', index=False, header=False)
        else:
            data_missing.append(stock_list.iloc[sort_by_dat])
            print(data_missing)
Thanks,
Abhay Dodiya
You keep using the inner loop's counter after that loop has exited: below the inner for loop you index stock_list with sort_by_dat, the date counter. Reusing a loop counter like this causes potentially unintended behavior:

for i in range(2):
    for i in range(3, 11):
        pass
    print(i)

gives

10
10

So even after exiting the inner loop, its last value is still there. Use the outer counting variable (or rename the inner one) where you index stock_list and your issue should be gone.
In your case you probably have more dates than stocks, which is why you observe this error message.
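A minimal, self-contained illustration of the leak and of the fix (the list names below are hypothetical, not from the question):

# 2 "tickers" but 10 "dates" per ticker
stocks = ['AAA', 'BBB']
dates = list(range(10))

for stock_i in range(len(stocks)):
    for date_i in range(len(dates)):
        pass                     # build the per-date list here
    # BUG: stocks[date_i] would use the leaked inner counter (9) and raise IndexError
    print(stocks[stock_i])       # FIX: index the ticker list with the OUTER counter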

I cannot discover the error; attempting to insert ... columns do not match

So the code runs until it tries to insert the new row, at which point I get:
'attempting to insert [3 item result] into these columns [5 items]'. I have tried to discover where my code is losing results, but cannot. Any suggestions would be great.
Additional information: the feature class I am inserting into has five fields, and they are the same as the source fields. The code reaches my length check (len(newRow) != len(outputFields)) and my error message prints. Please help if anyone would like to.
# coding: utf8
import arcpy
import os, sys
from arcpy import env

arcpy.env.workspace = r"E:\Roseville\ScriptDevel.gdb"
arcpy.env.overwriteOutput = bool('TRUE')  # set as python bool, not string "TRUE"

fc_buffers = "Parcels"           # my indv. parcel buffers
fc_Landuse = "Geology"           # my land use layer
outputLayer = "IntersectResult"  # output layer

outputFields = [f.name for f in arcpy.ListFields(outputLayer) if f.type not in ['OBJECTID', "Geometry"]] + ['SHAPE@']
landUseFields = [f.name for f in arcpy.ListFields(fc_Landuse) if f.type not in ['PTYPE']]
parcelBufferFields = [f.name for f in arcpy.ListFields(fc_buffers) if f.type not in ['APN']]

intersectionFeatureLayer = arcpy.MakeFeatureLayer_management(fc_Landuse, 'intersectionFeatureLayer').getOutput(0)
selectedBuffer = arcpy.MakeFeatureLayer_management(fc_buffers, 'selectedBuffer').getOutput(0)

def orderFields(luFields, pbFields):
    ordered = []
    for field in outputFields:
        # append the matching field
        if field in landUseFields:
            ordered.append(luFields[landUseFields.index(field)])
        if field in parcelBufferFields:
            ordered.append(pbFields[parcelBufferFields.index(field)])
    return ordered

print pbfields

with arcpy.da.SearchCursor(fc_buffers, ["OBJECTID", 'SHAPE@'] + parcelBufferFields) as sc, arcpy.da.InsertCursor(outputLayer, outputFields) as ic:
    for row in sc:
        oid = row[0]
        shape = row[1]
        print (oid)
        print "Got this far"
        selectedBuffer.setSelectionSet('NEW', [oid])
        arcpy.SelectLayerByLocation_management(intersectionFeatureLayer, "intersect", selectedBuffer)
        with arcpy.da.SearchCursor(intersectionFeatureLayer, ['SHAPE@'] + landUseFields) as intersectionCursor:
            for record in intersectionCursor:
                recordShape = record[0]
                print "list made"
                outputShape = shape.intersect(recordShape, 4)
                newRow = orderFields(row[2:], record[1:]) + [outputShape]
                if len(newRow) != len(outputFields):
                    print 'there is a problem. the number of columns in the record you are attempting to insert into', outputLayer, 'does not match the number of destination columns'
                    print '\tattempting to insert:', newRow
                    print '\tinto these columns:', outputFields
                    continue
                # insert into the outputFeatureClass
                ic.insertRow(newRow)
Your with statement where you define the cursors creates an insert cursor with 5 fields, but the row you are trying to feed it has only 3 values. You need to make sure the row you insert is the same length as the insert cursor's field list. I suspect the problem is actually in the orderFields method, or in what you pass to it.
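One hedged way to narrow this down, using the names from the question: print the field lists and check which output fields orderFields can never match. Note also that Field.type holds type names such as 'OID', 'String' or 'Geometry', while 'PTYPE' and 'APN' look like field names, so those filters may need f.name rather than f.type.

# Diagnostic sketch, assuming the layers above exist in the workspace.
for fc in (outputLayer, fc_Landuse, fc_buffers):
    print fc
    for f in arcpy.ListFields(fc):
        print '\t', f.name, f.type

# Every output field (except the geometry token) should be matched by orderFields;
# anything printed here is silently dropped and shortens newRow.
unmatched = [f for f in outputFields
             if f != 'SHAPE@' and f not in landUseFields and f not in parcelBufferFields]
print 'output fields with no source match:', unmatched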

Opening a shelf file / dbm file returns a dbm error, even though the file was created using dbm.open and shelve.Shelf.open

Python 3.4.2, on Linux
I'm pretty new to this language, but I'm coding this project. It started as a simple program that displayed a dictionary, and I'm trying to expand on it based on tutorials I am reading. I came to one about shelving and being able to preserve info in a save file in a format much like a dictionary. So far I have a program that takes input and updates the dictionary based on that input. It's very basic and works on a simple level, but naturally I want to save what I entered. Following is the code so far.
The updateSaveP1() function is what is giving me trouble. Although it is not coded like this currently, I would ultimately like the function to take two arguments: one to name the key in the shelf file, and one to reference the target dictionary/list etc. Currently it isn't even saving to the file.
The loadINV() function is a placeholder and doesn't work as coded currently. I need to figure out the dbm problem first, as I get the same dbm error with the load function too.
I originally just opened the file directly; documentation here on Stack Overflow said I should open it with one of these so that it creates the right kind of file. I have tried both to no avail.
NOTICE: this code will create an empty file named savedata.db in Python's working directory.
Many thanks and appreciation for any help.
import pygame, math, sys, os, pprint, shelve, dbm

SAVE_LOCATION = os.path.join(os.getcwd() + '/savedata')
SAVE_FILE_LIST = os.listdir(os.getcwd())
SOURCE = os.path.dirname('inventory.py')
YES = ["y", "Y"]
NO = ["n", "N"]

playerStash = {"rope": 77,
               "giant's toe": 1,
               "gold": 420}

'''def loadINV():
    fileTOload = dbm.open('savedata.db', 'w')
    print('opened savedata file')
    #for eachLine in fileTOload:
    playerStash.update(str(fileTOload.read()))
    print('updated dict')
    fileTOload.close()'''

def checkSavesFile():
    while True:
        if os.path.exists(SAVE_LOCATION):
            print('Save file found')
            break
        elif os.path.exists(SAVE_LOCATION + '.db'):
            print('.db Save file found')
            loadINV()
            break
        else:
            updateSaveP1()
            print('New Save Created')
            break

def updateSaveP1():
    with dbm.open('savedata', 'c') as save:
        save['player1'] = str(playerStash)
        save.close()

#print(SAVE_LOCATION) #debugging - file name format verification
#pprint.pprint(SAVE_FILE_LIST) #debugging - will pretty print list of files

checkSavesFile()  # runs the save file check

def askAboutInv(player):
    while True:
        print("What item would you like to add? \n\
(leave blank and press enter to quit)")
        name = input()  # Reads input and checks for duplicates or non entries
        if name == '':
            break  # Stop loop
        elif name in playerStash.keys():
            # the check to see if input was in dictionary
            dict_quant = int(playerStash.get(name, 0))
            # "dict_quant" represents the value in dictionary as an integer
            dict_item = str(playerStash.get(name, 0))
            # "dict_item" represents the value in dictionary as a string
            addedItem = dict_quant + 1
            # handles adding the value of the input
            print("You have " + dict_item + " already, \n\
would you like to add more Y/N?")
            # prints "You have <dictionary number> already"
            answer = input()
            # checks for input if you want to add more to inventory
            if answer in YES:  # checks to see if y or Y is entered
                playerStash[name] = addedItem
                # adds +1 to the quantity of "name" per the dict_quant variable
                print("you have " + str(addedItem) + " now")
                # prints "you have <new dictionary number> now"
            if answer in NO:  # checks to see if n or N was entered
                print("Nothing added")  # prints
                break  # ends loop
        else:  # if no other statements are true
            if name not in playerStash.keys():
                # if "name" / input is not in the dictionary
                playerStash[name] = playerStash.setdefault(name, 1)
                # add the item to the dictionary with a value of 1
                print('Inventory updated.')
                # prints
                updateSaveP1()

def inventoryDisp(player):  # displays dictionary pointed at by argument
    print("Inventory")
    item_total = 0
    for eachOne in playerStash.items():
        print(eachOne)  # looks at and prints each item/key in dictionary
    for i, q in playerStash.items():
        item_total = item_total + q  # adds all the quantities/values up
    print("Total number of items: " + str(item_total))
    # prints total number of items in inventory

def updatedInv(player):  # same as above, just reads "Updated Inventory"
    print("Updated Inventory")
    item_total = 0
    for eachOne in playerStash.items():
        print(eachOne)
    for i, q in playerStash.items():
        item_total = item_total + q
    print("Total number of items: " + str(item_total))

inventoryDisp(playerStash)
askAboutInv(playerStash)
updateSaveP1()
updatedInv(playerStash)
Update
After changing this:

def updateSaveP1():
    with dbm.open('savedata', 'c') as save:
        save['player1'] = str(playerStash)
        save.close()

to this:

def updateSaveP1():
    save = openShelf()
    #save = shelve.Shelf(dbm.open('savedata', 'c')) #old code
    save['player1'] = str(playerStash)
    print(save['player1'])
    save.close()
it would seem that the dictionary does get saved. Now the loadINV function is giving me trouble. This:

def loadINV():
    fileTOload = dbm.open('savedata.db', 'w')
    print('opened savedata file')
    #for eachLine in fileTOload:
    playerStash.update(str(fileTOload.read()))
    print('updated dict')
    fileTOload.close()

is now this:

def loadINV():
    file = openShelf()
    print('opened savedata file')
    playerStash.update(file['player1'])
    print('updated dict')
    fileTOload.close()
but the .update() method raises this error, which I can't seem to find any info on:
Traceback (most recent call last):
  File "/home/pi/inventoryShelve.py", line 58, in <module>
    checkSavesFile() # runs the save file check
  File "/home/pi/inventoryShelve.py", line 40, in checkSavesFile
    loadINV()
  File "/home/pi/inventoryShelve.py", line 25, in loadINV
    playerStash.update(file['player1'])
ValueError: dictionary update sequence element #0 has length 1; 2 is required
Turned out I was saving the data for the inventory dictionary into the shelf as something other than a dict:

def updateSaveP1():
    save = openShelf()
    save['player1'] = str(playerStash)
    #print(save['player1'])
    save.close()

became

def updateSaveP1():
    save = openShelf()
    save['player1'] = dict(playerStash)
    #print(save['player1'])
    save.close()
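For reference, a minimal sketch of the same save/load cycle using the shelve module directly; a context manager replaces the explicit close (supported since Python 3.4), and the openShelf() helper used above is the poster's own and is not shown here:

import shelve

playerStash = {"rope": 77, "giant's toe": 1, "gold": 420}

# save: store the dict itself, not str(dict)
with shelve.open('savedata') as save:
    save['player1'] = dict(playerStash)

# load: update() works because the stored object is a real dict
with shelve.open('savedata') as save:
    playerStash.update(save['player1'])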

Python Dictionary Throwing KeyError for Some Reason

In some code I index a dictionary (TankDict) with a string taken from a list. This throws a KeyError no matter what letter I put in. When I copied and pasted the dictionary out of the context of the program and looked up the same letters from the list, they came out correctly. I have also run type(TankDict) and it comes back as 'dict'.
Here is the dictionary:
TankDict = {'E':0, 'F':1, 'G':2, 'H':3, 'I':4, 'J':5,
'K':6, 'L':7, 'M':8, 'N':9,
'O':10, 'P':11, 'Q':12, 'R':13, 'S':14, 'T':15,
'U':16, 'V':17, 'W':18, 'X':19}
The error:
channelData[1] = tank_address_dict[channelData[1]]
KeyError: 'L'
(tank_address_dict is a function argument into which TankDict is passed)
the contents of channelData: ['447', 'L', '15', 'C']
Can anyone tell me the (probably simple) reason that this happens?
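A quick, hedged way to narrow down a KeyError like this is to print the repr of the key and the keys of the dictionary that actually arrived, right before the failing lookup; hidden whitespace or the wrong object being passed shows up immediately. A snippet using the names from the question (Python 2 prints, to match the code below):

key = channelData[1]
print repr(key)                          # reveals stray whitespace, e.g. 'L\n' instead of 'L'
print sorted(tank_address_dict.keys())   # shows which dictionary was actually passed in
print tank_address_dict.get(key, 'MISSING')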
EDIT: Code!
This is the function where the error is:
def getTankID(channel, tank_address_dict, PTM_dict, channel_ref):
    rawChannelData = 'NA'
    for line in channel_ref:
        if str(channel) in line: rawChannelData = line
    if(rawChannelData == 'NA'): return -1
    channelData = rawChannelData.split(' ')
    channelData.extend(['', ''])
    channelData[1] = channelData[1][:-1]
    channelData[3] = channelData[1][-1]
    channelData[1] = channelData[1][:-1]
    channelData[2] = channelData[1][1:]
    channelData[1] = channelData[1][:1]
    print channelData #debug
    print 'L' in tank_address_dict
    print 'E' in tank_address_dict
    print 'O' in tank_address_dict
    print 'U' in tank_address_dict
    print type(tank_address_dict)
    channelData[1] = tank_address_dict[channelData[1]]
    channelData[3] = PTM_dict[channelData[3]]
    return(channelData[1:])
This is the function that calls it:
def runFile(model, datafile, time_scale, max_PEs, tank_address_dict, PMT_dict, channel_ref):
    #add initSerial for ser0-4
    while(True):
        raw_data = datafile.readline() #intake data
        if(raw_data == ''): break #End while loop if the file is done
        data = raw_data.split(' ') #break up the parts of each line
        del data[::2] #delete the human formatting
        data[2] = data[2][:-1] #rm newline (NOTE: file must contain blank line at end!)
        TankID = getTankID(data[0], tank_address_dict, PMT_dict, channel_ref)
        if(TankID == -1):
            print '!---Invalid channel number passed by datafile---!'; break #check for valid TankID
        model[TankID[0]][TankID[1]][TankID[2]] = scale(data[2], (0, max_PEs), (0, 4096))
        createPackets(model)
        #updateModel(ser0,ser1,ser2,ser3,ser4,packet)
        data[2] = data[2]*time_scale #scale time
        time.sleep(data[2]) #wait until the next event
        print data #debug
    if(TankID != -1): print '---File', datafile, 'finished---' #report errors in file run
    else: print '!---File', datafile, 'finished with error---!'
And this is the code that calls that:
import hawc_func
import debug_options
#begin defs
model = hawc_func.createDataStruct() #create the data structure
TankDict = hawc_func.createTankDict() #tank grid coordinate conversion table
PTMDict = hawc_func.createPMTDict() #PMT conversion table
log1 = open('Logs/log1.txt','w') #open a logfile
data = open('Data/event.txt','r') #open data
channel_ref = open('aux_files/channel_map.dat','r')
time_scale = 1 #0-1 number to scale nano seconds? to seconds
#end defs
hawc_func.runFile(model,data,4000,TankDict,PTMDict,time_scale,channel_ref)
#hawc_func.runFile(model,data,TankDict,PTMDict)
#close files
log1.close()
data.close()
#end close files
print '-----Done-----' #confirm tasks finished
tank_address_dict is created through this function, run by the 3rd block of code, then passed on through the other two:
def createTankDict():
    TankDict = {'E':0, 'F':1, 'G':2, 'H':3, 'I':4, 'J':5,
                'K':6, 'L':7, 'M':8, 'N':9,
                'O':10, 'P':11, 'Q':12, 'R':13, 'S':14, 'T':15,
                'U':16, 'V':17, 'W':18, 'X':19}
    return TankDict
You are not passing your arguments correctly.
def runFile(model, datafile, time_scale, max_PEs, tank_address_dict, PMT_dict, channel_ref):
hawc_func.runFile(model,data,4000,TankDict,PTMDict,time_scale,channel_ref)
Here, you have max_PEs = TankDict.
That may not be your only problem. Fix that first, and if you are still having problems, update your post with your fixed code and then tell us what your new error is.
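One way to make this kind of positional mix-up impossible is to call the function with keyword arguments, using the parameter names from the signature above; a sketch of the corrected call:

hawc_func.runFile(model=model,
                  datafile=data,
                  time_scale=time_scale,
                  max_PEs=4000,
                  tank_address_dict=TankDict,
                  PMT_dict=PTMDict,
                  channel_ref=channel_ref)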
