WUnderground, Extraction of Extremes Today - python

As a contributor to WUnderground it is no problem to read, via an API call, the JSON output file with today's values for my station.
That JSON file contains a series of numbered 'bins', and the series grows over the day starting at 00:00.
Each numbered 'bin' holds an equivalent dataset of reported values.
By the end of the day there are a few hundred 'bins' in the JSON file.
To avoid setting up a local database, getting a current survey of Extremes_Today requires periodically scanning the latest JSON file from bin 0 up to the most recently added bin.
That means reading each numbered bin, extracting and evaluating its values, then jumping to the next bin until the last bin has been reached and processed.
I am trying the two approaches below in a Python script; these two script segments should just check and report that a bin exists. The script lines up to 442 do other jobs (including a complete read-out of bin 0 for reference values) and already run without error.
# Line 442 = in the WU JSON output of today's data, find & process the next bin up to and including the last bin
# Example call-string for ToDay-info = https://api.weather.com/v2/pws/observations/all/1day?stationId=KMAHANOV10&format=json&units=m&apiKey=yourApiKey
# Extracting contents of the JSON-file by the scriptlines below
# page = urllib.urlopen('https://api.weather.com/v2/pws/observations/all/1day?stationId=KMAHANOV10&format=json&units=m&apiKey=yourApiKey')
# content_test = page.read()
# obj_test2 = json.loads(content_test)
# Extraction of a value is like
# Epochcheck = obj_test2['observations'][Bin]['epoch']
# 'epoch' is present as an element in every bin of the JSON file (its value increases with the bin number) and is therefore chosen as the key for the scan & search. If it is not found, that bin does not exist = we have passed the last present bin
# Bin [0] has already been processed separately => its initial contents at 00:00 serve as references for the extremes search
# GENERAL setup of the scanning function:
# Bin = 0
# while 'epoch' exists:
#     Read 'epoch' & translate to CET/LocalTime
#     Compare values of Extremes in that bin with earlier Extremes:
#     if hi_value higher than hiExtreme => new hiExtreme & adapt HiTime (= translated 'epoch')
#     if low_value lower than LowExtreme => new lowExtreme & adapt LowTime (= translated 'epoch')
#     Bin = Bin + 1
# Approach1
Bin = 0
Epochcheck = obj_test2['observations'][0]['epoch']
try:
    Epochcheck = obj_test2['observations'][Bin]['epoch']
    print(Bin)
    Bin += 1
except NameError:
    Epochcheck = None
# Approach2
Bin = 0
Epochcheck = obj_test2['observations'][0]['epoch']
While Epochcheck is not None:
    Epochcheck = obj_test2['observations'][Bin]['epoch']
    Print(Bin)
    Bin += 1
Approach1 does not throw an error, but it steps out at Bin = 1.
Approach2 reports a syntax error.
File "/home/pi/domoticz/scripts/python/URL_JSON_WU_to_HWA_Start01a_0186.py", line 476
While Epochcheck is not None:
^
SyntaxError: invalid syntax
Apparently a check line whose index Bin changes dynamically cannot be set up in this way: the dynamic setting of the variable Bin must be expressed differently.
Epochcheck = obj_test2['observations'][Bin]['epoch']
What is the appropriate way in Python to perform such JSON scanning with a dynamic index [Bin]?
Or is there a simpler way to scan & extract a series of bins from a JSON file?
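For reference, a minimal sketch of the idiomatic approach, assuming the JSON layout described above: obj_test2['observations'] is a plain Python list, so it can be iterated directly and no per-bin existence check or manual counter is needed. The metric field names (tempHigh, tempLow) are assumptions for illustration; substitute whatever fields your station reports.

from datetime import datetime, timezone, timedelta

observations = obj_test2['observations']    # list of bins, index 0 .. len-1
cet = timezone(timedelta(hours=1))          # CET as a fixed offset; adjust for DST if needed

# Bin 0 supplies the 00:00 reference values, as in the question
hiExtreme = observations[0]['metric']['tempHigh']    # assumed field name
lowExtreme = observations[0]['metric']['tempLow']    # assumed field name
HiTime = LowTime = datetime.fromtimestamp(observations[0]['epoch'], cet)

for bin_data in observations[1:]:           # iterate the remaining bins directly
    stamp = datetime.fromtimestamp(bin_data['epoch'], cet)
    if bin_data['metric']['tempHigh'] > hiExtreme:
        hiExtreme = bin_data['metric']['tempHigh']
        HiTime = stamp
    if bin_data['metric']['tempLow'] < lowExtreme:
        lowExtreme = bin_data['metric']['tempLow']
        LowTime = stamp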

Related

How to rewrite my Pinescript code to Python

I am trying to rewrite this code to Python:
src = input.source(close, "Source")
volStop(src) =>
    var max = src
    var min = src
    max := math.max(max, src)
    min := math.min(min, src)
    [max, min]
[max, min] = volStop(src)
plot(max, "Max", style=plot.style_cross)
Precisely I have a problem with these lines:
max := math.max(max, src)
min := math.min(min, src)
In Python I have a function, let's call it func1, and I want to get the same result the Pine Script is returning.
I have only tried a for loop since, from what I understand, calling a function in Pine Script works kind of like a for loop. I tried to replicate the calculation but I couldn't achieve the expected results.
This is the line that is being plotted on TradingView (screenshot omitted).
And this is the line that is being plotted in Python; the area framed with a red square is approximately the area visible in the TradingView screenshot (screenshot omitted).
My current code:
maxmin = pd.DataFrame()
maxmin["max"] = price_df[f"{name_var}"]
maxmin["min"] = price_df[f"{name_var}"]
for i in range(price_df.shape[0]):
    maxmin["max"].iloc[i] = max(maxmin["max"].shift(1).fillna(0).iloc[i], price_df[f"{name_var}"].iloc[i])
    maxmin["min"].iloc[i] = min(maxmin["min"].shift(1).fillna(0).iloc[i], price_df[f"{name_var}"].iloc[i])
The name_var variable is set to 'Close' column.
How can I rewrite the Pinescript code to Python to get the same results?
Pine Script basically runs your code on every bar and stores variables into a history (series). This is why you can access old data.
I think the problem here is that you should give the same input data to get the same output. The calculation starts from the first available bar, where bar_index is 0. You can scroll to the left to see the time of the first bar; your input data in Python should then be the same. Or you can restrict your Pine Script to start the calculation like this:
start_time = input.time(timestamp("1 Jan 2023 00:00 +0000"), "Date")
// ...
volStop(src) =>
    var max = src
    var min = src
    if time >= start_time
        max := math.max(max, src)
        min := math.min(min, src)
        [max, min]
    else
        [na, na]
// ...
The Python code doesn't need the shift and fillna. The input data should not contain na at all, because you need the same data as on TradingView, which has no na. So a for loop that compares each value against the previous running value with the built-in max/min does the job (row 0 is its own max and min, so the loop starts at 1):
for i in range(1, price_df.shape[0]):
    maxmin["max"].iloc[i] = max(maxmin["max"].iloc[i - 1], price_df[f"{name_var}"].iloc[i])
    maxmin["min"].iloc[i] = min(maxmin["min"].iloc[i - 1], price_df[f"{name_var}"].iloc[i])
But it is slow because of the one-by-one iteration and element access. You can use pandas methods here:
maxmin['max'] = price_df[name_var].cummax()
maxmin['min'] = price_df[name_var].cummin()
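For instance, a quick check with made-up numbers shows the cumulative behaviour:

import pandas as pd

price_df = pd.DataFrame({"Close": [3.0, 1.0, 4.0, 1.0, 5.0]})
maxmin = pd.DataFrame()
maxmin["max"] = price_df["Close"].cummax()   # 3, 3, 4, 4, 5
maxmin["min"] = price_df["Close"].cummin()   # 3, 1, 1, 1, 1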

Reading a binary file using np.fromfile()

I have a binary file that has numerous sections. Each section has its own pattern (i.e. the placement of integers, floats, and strings).
The pattern of each section is known; however, the number of times that pattern occurs within the section is unknown. Each record sits between two identical integers that give the record's size in bytes. A section name is an 8-byte record, so it sits between the two record-length words 8 and 8. Within each section there are also multiple records (whose layouts are known).
Header
---------------------
Known header pattern
---------------------
8 Section One 8
---------------------
Section One pattern repeating i times
---------------------
8 Section Two 8
---------------------
Section Two pattern repeating j times
---------------------
8 Section Three 8
---------------------
Section Three pattern repeating k times
---------------------
Here was my approach:
Loop through and read each record using f.read(record_length); if the record is 8 bytes, convert it to a string: this will be the section name.
Then I call np.fromfile(file, dtype=section_pattern, count=n) for each section.
The issue I am having is twofold:
How do I determine n for each section without doing a first-pass read?
Reading every record just to find a section name seems rather inefficient. Is there a more efficient way to accomplish this?
(As noted, the section names are always between the two integer record-length words 8 and 8.)
Here is sample code; note that in this case I do not have to specify count, since the OES section is the last section:
with open('m13.op2', "rb") as f:
    filesize = os.fstat(f.fileno()).st_size
    f.seek(108, 1)  # skip header
    while True:
        rec_len_1 = unpack_int(f.read(4))
        record_bytes = f.read(rec_len_1)
        rec_len_2 = unpack_int(f.read(4))
        record_num = record_num + 1
        if rec_len_1 == 8:
            tablename = unpack_string(record_bytes).strip()
            if tablename == 'OES':
                OES = [
                    # Top keys
                    ('1','i4',1),('op2key7','i4',1),('2','i4',1),
                    ('3','i4',1),('op2key8','i4',1),('4','i4',1),
                    ('5','i4',1),('op2key9','i4',1),('6','i4',1),
                    # Record 2 -- IDENT
                    ('7','i4',1),('IDENT','i4',1),('8','i4',1),
                    ('9','i4',1),
                    ('acode','i4',1),
                    ('tcode','i4',1),
                    ('element_type','i4',1),
                    ('subcase','i4',1),
                    ('LSDVMN','i4',1),      # Load set number
                    ('UNDEF(2)','i4',2),    # Undefined
                    ('LOADSET','i4',1),     # Load set number or zero or random code identification number
                    ('FCODE','i4',1),       # Format code
                    ('NUMWDE(C)','i4',1),   # Number of words per entry in DATA record
                    ('SCODE(C)','i4',1),    # Stress/strain code
                    ('UNDEF(11)','i4',11),  # Undefined
                    ('THERMAL(C)','i4',1),  # =1 for heat transfer and 0 otherwise
                    ('UNDEF(27)','i4',27),  # Undefined
                    ('TITLE(32)','S1',32*4),    # Title
                    ('SUBTITL(32)','S1',32*4),  # Subtitle
                    ('LABEL(32)','S1',32*4),    # Label
                    ('10','i4',1),
                    # Record 3 -- Data
                    ('11','i4',1),('KEY1','i4',1),('12','i4',1),
                    ('13','i4',1),('KEY2','i4',1),('14','i4',1),
                    ('15','i4',1),('KEY3','i4',1),('16','i4',1),
                    ('17','i4',1),('KEY4','i4',1),('18','i4',1),
                    ('19','i4',1),
                    ('EKEY','i4',1),  # Element key = 10*EID + Device Code. EID = (Element key)//10
                    ('FD1','f4',1),
                    ('EX1','f4',1),
                    ('EY1','f4',1),
                    ('EXY1','f4',1),
                    ('EA1','f4',1),
                    ('EMJRP1','f4',1),
                    ('EMNRP1','f4',1),
                    ('EMAX1','f4',1),
                    ('FD2','f4',1),
                    ('EX2','f4',1),
                    ('EY2','f4',1),
                    ('EXY2','f4',1),
                    ('EA2','f4',1),
                    ('EMJRP2','f4',1),
                    ('EMNRP2','f4',1),
                    ('EMAX2','f4',1),
                    ('20','i4',1)]
                nparr = np.fromfile(f, dtype=OES)
        if f.tell() == filesize:
            break
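To the two questions above, one possible answer (a sketch only, not tested against a real OP2 file): since every record is framed by its length words, a preliminary pass can seek over record bodies instead of reading them, noting where each section's data starts and ends; count then follows from the section's byte span divided by the dtype's itemsize. The helper name index_sections is made up for illustration.

import os
import struct
import numpy as np

def index_sections(path):
    """Seek-only pass: map section name -> (data offset, data size in bytes)."""
    sections = {}
    with open(path, 'rb') as f:
        filesize = os.fstat(f.fileno()).st_size
        f.seek(108, 1)                      # skip header, as in the question
        name, start = None, None
        while f.tell() < filesize:
            rec_len = struct.unpack('i', f.read(4))[0]
            if rec_len == 8:                # a section-name record
                if name is not None:        # close out the previous section
                    sections[name] = (start, f.tell() - 4 - start)
                name = f.read(8).decode().strip()
                f.seek(4, 1)                # skip the trailing length word
                start = f.tell()
            else:
                f.seek(rec_len + 4, 1)      # skip record body + trailing length word
        if name is not None:
            sections[name] = (start, f.tell() - start)
    return sections

# Each section can then be read with a single np.fromfile call:
# dtype_oes = np.dtype(OES)                 # the pattern list above
# offset, nbytes = index_sections('m13.op2')['OES']
# with open('m13.op2', 'rb') as f:
#     f.seek(offset)
#     nparr = np.fromfile(f, dtype=dtype_oes, count=nbytes // dtype_oes.itemsize)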

I get the following error "NameError: name 'ref_atoms' is not defined" on my python code

I was wondering why I'm getting this error for the following code, as I have defined ref_atoms as ['CA']. The error occurs at line 121, where the superimposer is initiated with the following line of code: super_imposer.set_atoms(ref_atoms, sample_atoms)
def alignPDB(potentialTag, tagProtein):
    # Select what residue numbers you wish to align
    # and put them in a list
    start_id = 1
    end_id = 10
    atoms_to_be_aligned = range(start_id, end_id + 1)
    # Start the parser
    pdb_parser = Bio.PDB.PDBParser(QUIET = True)
    # Get the structures
    ref_structure = pdb_parser.get_structure("tagProtein", "4ABN.pdb")
    sample_structure = pdb_parser.get_structure("potentialTag", "2LYZ.pdb")
    # Use the first model in the pdb-files for alignment
    # Change the number 0 if you want to align to another structure
    ref_model = ref_structure[0]
    sample_model = sample_structure[0]
    # Make a list of the atoms (in the structures) you wish to align.
    # In this case we use CA atoms whose index is in the specified range
    ref_atoms = ['CA']
    sample_atoms = ['CA']
    # Iterate over all chains in the model in order to find all residues
    for ref_chain in ref_model:
        # Iterate over all residues in each chain in order to find proper atoms
        for ref_res in ref_chain:
            # Check if residue number ( .get_id() ) is in the list
            if ref_res.get_id()[1] in atoms_to_be_aligned:
                # Append CA atom to list
                ref_atoms.append(ref_res['CA'])
    # Do the same for the sample structure
    for sample_chain in sample_model:
        for sample_res in sample_chain:
            if sample_res.get_id()[1] in atoms_to_be_aligned:
                sample_atoms.append(sample_res['CA'])
    # Now we initiate the superimposer:
    super_imposer = Bio.PDB.Superimposer()
    super_imposer.set_atoms(ref_atoms, sample_atoms)
    super_imposer.apply(sample_model.get_atoms())
    # Print RMSD:
    print super_imposer.rms
    # Save the aligned version of 2LYZ.pdb
    io = Bio.PDB.PDBIO()
    io.set_structure(sample_structure)
    io.save("2LYZ_aligned.pdb")

Python: slow for loop performance on reading, extracting and writing from a list of thousands of files

I am extracting 150 different cell values from 350,000 (20 kB) ASCII raster files. My current code is fine for processing the 150 cell values from hundreds of the ASCII files, but it is very slow when running on the full data set.
I am still learning Python, so are there any obvious inefficiencies, or suggestions to improve the code below?
I have tried closing the 'dat' file in the 2nd function; no improvement.
dat = None
First: I have a function which returns the row and column locations from a cartesian grid.
def world2Pixel(gt, x, y):
    ulX = gt[0]
    ulY = gt[3]
    xDist = gt[1]
    yDist = gt[5]
    rtnX = gt[2]
    rtnY = gt[4]
    pixel = int((x - ulX) / xDist)
    line = int((ulY - y) / xDist)
    return (pixel, line)
Second: a function to which I pass lists of 150 'id', 'x' and 'y' values in a for loop. The first function is called within it and used to extract the cell value, which is appended to a new list. I also have a list of files 'asc_list' and corresponding times in 'date_list'. Please ignore count / enumerate, as I use them later; unless it is impeding efficiency.
def asc2series(id, x, y):
    #count = 1
    ls_id = []
    ls_p = []
    ls_d = []
    for n, (asc, date) in enumerate(zip(asc_list, date_list)):
        dat = gdal.Open(asc_list)
        gt = dat.GetGeoTransform()
        pixel, line = world2Pixel(gt, east, nort)
        band = dat.GetRasterBand(1)
        #dat = None
        value = band.ReadAsArray(pixel, line, 1, 1)[0, 0]
        ls_id.append(id)
        ls_p.append(value)
        ls_d.append(date)
Many thanks
In world2Pixel you are setting rtnX and rtnY, which you don't use.
You probably meant gdal.Open(asc) -- not asc_list.
You could move gt = dat.GetGeoTransform() out of the loop. (Rereading made me realize you can't really.)
You could cache calls to world2Pixel.
You're opening the dat file for each pixel -- you should probably turn the logic around to open each file only once and look up all the pixels mapped to that file (see the sketch after this list).
Benchmark, check the links in this podcast to see how: http://talkpython.fm/episodes/show/28/making-python-fast-profiling-python-code
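A minimal sketch of that last suggestion, assuming points is a list of (id, x, y) tuples and asc_list/date_list are as in the question:

from osgeo import gdal

def extract_points(points, asc_list, date_list):
    rows = []
    for asc, date in zip(asc_list, date_list):
        dat = gdal.Open(asc)                       # open each file exactly once
        gt = dat.GetGeoTransform()
        arr = dat.GetRasterBand(1).ReadAsArray()   # a 20 kB grid: read it whole
        for pid, x, y in points:
            pixel, line = world2Pixel(gt, x, y)
            rows.append((pid, date, arr[line, pixel]))
        dat = None                                 # release the GDAL dataset
    return rows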

Instructables open source code: Python IndexError: list index out of range

I've seen this error on several other questions but couldn't find the answer.
I'm a complete stranger to Python, but I'm following the instructions from a site and I keep getting this error once I try to run the script:
IndexError: list index out of range
Here's the script:
##//txt to stl conversion - 3d printable record
##//by Amanda Ghassaei
##//Dec 2012
##//http://www.instructables.com/id/3D-Printed-Record/
##
##/*
## * This program is free software; you can redistribute it and/or modify
## * it under the terms of the GNU General Public License as published by
## * the Free Software Foundation; either version 3 of the License, or
## * (at your option) any later version.
##*/
import wave
import math
import struct
bitDepth = 8 # target bitDepth
frate = 44100 # target frame rate
fileName = "bill.wav" # file to be imported (change this)

# read file and get data
w = wave.open(fileName, 'r')
numframes = w.getnframes()
frame = w.readframes(numframes) # w.getnframes()
frameInt = map(ord, list(frame)) # turn into array

# separate left and right channels and merge bytes
frameOneChannel = [0]*numframes # initialize list of one channel of wave
for i in range(numframes):
    frameOneChannel[i] = frameInt[4*i+1]*2**8+frameInt[4*i] # separate channels and store one channel in new list
    if frameOneChannel[i] > 2**15:
        frameOneChannel[i] = (frameOneChannel[i]-2**16)
    elif frameOneChannel[i] == 2**15:
        frameOneChannel[i] = 0
    else:
        frameOneChannel[i] = frameOneChannel[i]

# convert to string
audioStr = ''
for i in range(numframes):
    audioStr += str(frameOneChannel[i])
    audioStr += "," # separate elements with comma

fileName = fileName[:-3] # remove .wav extension
text_file = open(fileName+"txt", "w")
text_file.write("%s"%audioStr)
text_file.close()
Thanks a lot,
Leart
Leart - check these, they may help:
Is your input file in the correct format? As I see it, you need to produce that file beforehand before you can use it in this program... Post that file here as well.
Check that your bit depth and frame rate are correct.
Just for debugging purposes (if the code is correct this may not produce correct results, but it is good for testing): you are accessing frameInt[4*i+1], with the index i multiplied by 4 and then incremented by 1, which eventually goes beyond the last index of frameInt.
Add an 'if' to check the size before accessing the array element in frameInt:
if len(frameInt) > 4*i+1:
Add that statement right after the first occurrence of "for i in range(numframes):" and just before "frameOneChannel[i] = frameInt[4*i+1]*2**8+frameInt[4*i]", as shown in the sketch below (watch the tab spacing).
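In context, the patched loop would look roughly like this (a sketch; note the original script is Python 2, where map() returns a list):

for i in range(numframes):
    if len(frameInt) > 4*i+1: # guard: both 4*i and 4*i+1 must be valid indices
        frameOneChannel[i] = frameInt[4*i+1]*2**8 + frameInt[4*i]
        if frameOneChannel[i] > 2**15:
            frameOneChannel[i] = frameOneChannel[i] - 2**16
        elif frameOneChannel[i] == 2**15:
            frameOneChannel[i] = 0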
