How to optimize the finding of divergences between 2 signals


I am trying to create an indicator that will find all the divergences between 2 signals.
The output of the function so far looks like this
But the problem is that it is painfully slow when I use it with long signals. Could anyone help me make it faster, if possible?
My code:
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd
from scipy.signal import argrelextrema
from sortedcontainers import SortedSet


def find_divergence(price: pd.Series, indicator: pd.Series, width_divergence: int, order: int):
    div = pd.DataFrame(index=range(price.size), columns=[
        f"Bullish_{width_divergence}_{order}",
        f"Berish_{width_divergence}_{order}"
    ])
    div[f'Bullish_idx_{width_divergence}_{order}'] = False
    div[f'Berish_idx_{width_divergence}_{order}'] = False

    # np.numarray does not exist in NumPy; np.ndarray is the array type
    def calc_argrelextrema(price_: np.ndarray):
        return argrelextrema(price_, np.less_equal, order=order)[0]

    # one expanding prefix of the price series per bar
    price_ranges = []
    for i in range(len(price)):
        price_ranges.append(price.values[0:i + 1])

    f = []
    with ThreadPoolExecutor(max_workers=16) as exe:
        for i in price_ranges:
            f.append(exe.submit(calc_argrelextrema, i))

    # collect the indices of all lows found in any prefix
    prices_lows = SortedSet()
    for r in concurrent.futures.as_completed(f):
        data = r.result()
        for d in reversed(data):
            if d not in prices_lows:
                prices_lows.add(d)
            else:
                break

    price_lows_idx = pd.Series(prices_lows)
    for idx_1 in range(price_lows_idx.size):
        min_price = price[price_lows_idx[idx_1]]
        min_indicator = indicator[price_lows_idx[idx_1]]
        for idx_2 in range(idx_1 + 1, idx_1 + width_divergence):
            if idx_2 >= price_lows_idx.size:
                break
            if price[price_lows_idx[idx_2]] < min_price:
                min_price = price[price_lows_idx[idx_2]]
            if indicator[price_lows_idx[idx_2]] < min_indicator:
                min_indicator = indicator[price_lows_idx[idx_2]]
            consistency_price_rd = min_price == price[price_lows_idx[idx_2]]
            consistency_indicator_rd = min_indicator == indicator[price_lows_idx[idx_1]]
            consistency_price_hd = min_price == price[price_lows_idx[idx_1]]
            consistency_indicator_hd = min_indicator == indicator[price_lows_idx[idx_2]]
            diff_price = price[price_lows_idx[idx_1]] - price[price_lows_idx[idx_2]]  # should be neg
            diff_indicator = indicator[price_lows_idx[idx_1]] - indicator[price_lows_idx[idx_2]]  # should be pos
            is_regular_divergence = diff_price > 0 and diff_indicator < 0
            is_hidden_divergence = diff_price < 0 and diff_indicator > 0
            if is_regular_divergence and consistency_price_rd and consistency_indicator_rd:
                div.at[price_lows_idx[idx_2], f'Bullish_{width_divergence}_{order}'] = (price_lows_idx[idx_1], price_lows_idx[idx_2])
                div.at[price_lows_idx[idx_2], f'Bullish_idx_{width_divergence}_{order}'] = True
            elif is_hidden_divergence and consistency_price_hd and consistency_indicator_hd:
                div.at[price_lows_idx[idx_2], f'Berish_{width_divergence}_{order}'] = (price_lows_idx[idx_1], price_lows_idx[idx_2])
                div.at[price_lows_idx[idx_2], f'Berish_idx_{width_divergence}_{order}'] = True
    return div
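A possible speed-up for the low-detection stage, offered as a sketch rather than a drop-in replacement. With np.less_equal and argrelextrema's default mode='clip', the last point of each prefix only has itself to its right, so the union of lows over all prefixes appears to reduce to the points that are no higher than any of the previous `order` values. If that reading is right, the expanding-window loop and the thread pool can be replaced by a single sliding-window minimum. The name trailing_lows is hypothetical, and the sketch assumes the data contains no NaN values.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def trailing_lows(values, order):
    """Indices j where values[j] <= every one of the previous `order` values.

    Hypothetical helper: assumes no NaNs and order >= 1.
    """
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return np.array([], dtype=int)
    # Pad the left edge with +inf so early points only compare against
    # the values that actually exist before them.
    padded = np.concatenate([np.full(order, np.inf), values])
    # Row j of the view covers values[j - order .. j] (inclusive).
    window_min = sliding_window_view(padded, order + 1).min(axis=1)
    return np.flatnonzero(values <= window_min)


lows = trailing_lows([5, 4, 6, 3, 3, 7, 2], order=2)
print(lows.tolist())
```

Under these assumptions the result could be used as price_lows_idx = pd.Series(trailing_lows(price.values, order)). It is worth comparing it against the original function on a short signal before relying on it.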

Related

Parallelized tasks are not distributed across available CPUs

As shown in the code posted below in the section DecoupleGridCellsProfilerLoopsPool, run() is called as many times as there are items in self.__listOfLoopDecouplers, and it works as it is supposed to; I mean the parallelization works properly.
As shown in the same section, DecoupleGridCellsProfilerLoopsPool.pool.map returns results and I populate some lists. Let's discuss the list named self.__iterablesOfZeroCoverageCell: it contains a number of objects of type GridCellInnerLoopsIteratorsForZeroCoverageModel.
After that, I created the pool ZeroCoverageCellsProcessingPool, with the code posted below as well.
The problem I am facing is that the parallelized code in ZeroCoverageCellsProcessingPool is very slow, and the visualization of the CPU tasks shows that no processes work in parallel, as shown in the video at the URL posted below.
I suspected pickling issues when parallelizing the code in ZeroCoverageCellsProcessingPool, so I removed the entire body of run() in ZeroCoverageCellsProcessingPool. However, that did not change the behaviour of the parallelized code.
The URL posted below also shows how the parallelized method of ZeroCoverageCellsProcessingPool behaves.
Given the code posted below, please let me know why the parallelization does not work for the code in ZeroCoverageCellsProcessingPool.
output url:please click the link
output url
DecoupleGridCellsProfilerLoopsPool
def postTask(self):
    self.__postTaskStartTime = time.time()
    with Pool(processes=int(config['MULTIPROCESSING']['proceses_count'])) as DecoupleGridCellsProfilerLoopsPool.pool:
        self.__chunkSize = PoolUtils.getChunkSize(lst=self.__listOfLoopDecouplers,cpuCount=int(config['MULTIPROCESSING']['cpu_count']))
        logger.info(f"DecoupleGridCellsProfilerLoopsPool.self.__chunkSize(task per processor):{self.__chunkSize}")
        for res in DecoupleGridCellsProfilerLoopsPool.pool.map(self.run,self.__listOfLoopDecouplers,chunksize=self.__chunkSize):
            if res[0] is not None and res[1] is None and res[2] is None:
                self.__iterablesOfNoneZeroCoverageCell.append(res[0])
            elif res[1] is not None and res[0] is None and res[2] is None:
                self.__iterablesOfZeroCoverageCell.append(res[1])
            elif res[2] is not None and res[0] is None and res[1] is None:
                self.__iterablesOfNoDataCells.append(res[2])
            else:
                raise Exception (f"WTF.")
    DecoupleGridCellsProfilerLoopsPool.pool.join()
    assert len(self.__iterablesOfNoneZeroCoverageCell)+len(self.__iterablesOfZeroCoverageCell)+len(self.__iterablesOfNoDataCells) == len(self.__listOfLoopDecouplers)
    zeroCoverageCellsProcessingPool = ZeroCoverageCellsProcessingPool(self.__devModeForWSAWANTIVer2,self.__iterablesOfZeroCoverageCell)
    zeroCoverageCellsProcessingPool.postTask()
def run(self,param:LoopDecoupler):
row = param.getRowValue()
col = param.getColValue()
elevationsTIFFWindowedSegmentContents = param.getElevationsTIFFWindowedSegment()
verticalStep = param.getVericalStep()
horizontalStep = param.getHorizontalStep()
mainTIFFImageDatasetContents = param.getMainTIFFImageDatasetContents()
NDVIsTIFFWindowedSegmentContentsInEPSG25832 = param.getNDVIsTIFFWindowedSegmentContentsInEPSG25832()
URLOrFilePathForElevationsTIFFDatasetInEPSG25832 = param.getURLOrFilePathForElevationsTIFFDatasetInEPSG25832()
threshold = param.getThreshold()
rowsCnt = 0
colsCnt = 0
pixelsValuesSatisfyThresholdInTIFFImageDatasetCnt = 0
pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt = int(config['window']['width']) * int(config['window']['height'])
pixelsWithNoDataValueInTIFFImageDatasetCnt = int(config['window']['width']) * int(config['window']['height'])
_pixelsValuesSatisfyThresholdInNoneZeroCoverageCell = []
_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell = []
_pixelsValuesInNoDataCell = []
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel = None
gridCellInnerLoopsIteratorsForZeroCoverageModel = None
gridCellInnerLoopsIteratorsForNoDataCellsModel = None
for x in range(row,row + verticalStep):
if rowsCnt == verticalStep:
rowsCnt = 0
for y in range(col,col + horizontalStep):
if colsCnt == horizontalStep:
colsCnt = 0
pixelValue = mainTIFFImageDatasetContents[0][x][y]
# windowIOUtils.writeContentsToFile(windowIOUtils.getPathToOutputDir()+"/"+config['window']['file_name']+".{0}".format(config['window']['file_extension']), "pixelValue:{0}\n".format(pixelValue))
if pixelValue >= float(threshold):
pixelsValuesSatisfyThresholdInTIFFImageDatasetCnt+=1
_pixelsValuesSatisfyThresholdInNoneZeroCoverageCell.append(elevationsTIFFWindowedSegmentContents[0][rowsCnt][colsCnt])
elif ((pixelValue < float(threshold)) and (pixelValue > float(config['TIFF']['no_data_value']))):
pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt-=1
_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell.append(elevationsTIFFWindowedSegmentContents[0][rowsCnt][colsCnt])
elif (pixelValue <= float(config['TIFF']['no_data_value'])):
pixelsWithNoDataValueInTIFFImageDatasetCnt-=1
_pixelsValuesInNoDataCell.append(elevationsTIFFWindowedSegmentContents[0][rowsCnt][colsCnt])
else:
raise Exception ("WTF.Exception: unhandled condition for pixel value: {0}".format(pixelValue))
# _pixelCoordinatesInWindow.append([x,y])
colsCnt+=1
rowsCnt+=1
'''Grid-cell classfication'''
if (pixelsValuesSatisfyThresholdInTIFFImageDatasetCnt > 0):
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel = GridCellInnerLoopsIteratorsForNoneZeroCoverageModel()
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setRowValue(row)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setColValue(col)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setVericalStep(verticalStep)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setHorizontalStep(horizontalStep)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setMainTIFFImageDatasetContents(mainTIFFImageDatasetContents)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setNDVIsTIFFWindowedSegmentContentsInEPSG25832(NDVIsTIFFWindowedSegmentContentsInEPSG25832)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setURLOrFilePathForElevationsTIFFDatasetInEPSG25832(URLOrFilePathForElevationsTIFFDatasetInEPSG25832)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setPixelsValuesSatisfyThresholdInTIFFImageDatasetCnt(pixelsValuesSatisfyThresholdInTIFFImageDatasetCnt)
gridCellInnerLoopsIteratorsForNoneZeroCoverageModel.setPixelsValuesSatisfyThresholdInNoneZeroCoverageCell(_pixelsValuesSatisfyThresholdInNoneZeroCoverageCell)
elif (pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt < (int(config['window']['width']) * int(config['window']['height'])) and pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt >= 0):
gridCellInnerLoopsIteratorsForZeroCoverageModel = GridCellInnerLoopsIteratorsForZeroCoverageModel()
gridCellInnerLoopsIteratorsForZeroCoverageModel.setRowValue(row)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setColValue(col)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setVericalStep(verticalStep)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setHorizontalStep(horizontalStep)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setMainTIFFImageDatasetContents(mainTIFFImageDatasetContents)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setNDVIsTIFFWindowedSegmentContentsInEPSG25832(NDVIsTIFFWindowedSegmentContentsInEPSG25832)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setURLOrFilePathForElevationsTIFFDatasetInEPSG25832(URLOrFilePathForElevationsTIFFDatasetInEPSG25832)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setPixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt(pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setPixelsWithNoDataValueInTIFFImageDatasetCnt(pixelsWithNoDataValueInTIFFImageDatasetCnt)
gridCellInnerLoopsIteratorsForZeroCoverageModel.setPixelsValuesDoNotSatisfyThresholdInZeroCoverageCell(_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell)
elif (pixelsWithNoDataValueInTIFFImageDatasetCnt == 0):
gridCellInnerLoopsIteratorsForNoDataCellsModel = GridCellInnerLoopsIteratorsForNoDataCellsModel()
gridCellInnerLoopsIteratorsForNoDataCellsModel.setRowValue(row)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setColValue(col)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setVericalStep(verticalStep)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setHorizontalStep(horizontalStep)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setMainTIFFImageDatasetContents(mainTIFFImageDatasetContents)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setNDVIsTIFFWindowedSegmentContentsInEPSG25832(NDVIsTIFFWindowedSegmentContentsInEPSG25832)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setURLOrFilePathForElevationsTIFFDatasetInEPSG25832(URLOrFilePathForElevationsTIFFDatasetInEPSG25832)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setPixelsWithNoDataValueInTIFFImageDatasetCnt(pixelsWithNoDataValueInTIFFImageDatasetCnt)
gridCellInnerLoopsIteratorsForNoDataCellsModel.setPixelsValuesInNoDataCell(_pixelsValuesInNoDataCell)
if gridCellInnerLoopsIteratorsForZeroCoverageModel is not None:
gridCellInnerLoopsIteratorsForZeroCoverageModel.setPixelsWithNoDataValueInTIFFImageDatasetCnt(pixelsWithNoDataValueInTIFFImageDatasetCnt)
else:
raise Exception (f"WTF.")
return gridCellInnerLoopsIteratorsForNoneZeroCoverageModel,gridCellInnerLoopsIteratorsForZeroCoverageModel,gridCellInnerLoopsIteratorsForNoDataCellsModel
ZeroCoverageCellsProcessingPool:
def postTask(self):
self.__postTaskStartTime = time.time()
"""to collect results per each row
"""
resAllCellsForGridCellsClassifications = []
# NDVIs
resAllCellsForNDVITIFFDetailsForZeroCoverageCell = []
# area of coverage
resAllCellsForAreaOfCoverageForZeroCoverageCell = []
# interception
resAllCellsForInterceptionForZeroCoverageCell = []
# fourCornersOfWindowInEPSG25832
resAllCellsForFourCornersOfWindowInEPSG25832ZeroCoverageCell = []
# outFromEPSG25832ToEPSG4326-lists
resAllCellsForOutFromEPSG25832ToEPSG4326ForZeroCoverageCells = []
# fourCornersOfWindowsAsGeoJSON
resAllCellsForFourCornersOfWindowsAsGeoJSONInEPSG4326ForZeroCoverageCell = []
# calculatedCenterPointInEPSG25832
resAllCellsForCalculatedCenterPointInEPSG25832ForZeroCoverageCell = []
# centerPointsOfWindowInImageCoordinatesSystem
resAllCellsForCenterPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell = []
# pixelValuesOfCenterPoints
resAllCellsForPixelValuesOfCenterPointsForZeroCoverageCell = []
# centerPointOfKeyWindowAsGeoJSONInEPSG4326
resAllCellsForCenterPointOfKeyWindowAsGeoJSONInEPSG4326ForZeroCoverageCell = []
# centerPointInEPSG4326
resAllCellsForCenterPointInEPSG4326ForZeroCoveringCell = []
# average heights
resAllCellsForAverageHeightsForZeroCoverageCell = []
# pixels values
resAllCellsForPixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCell = []
# area Of Coverage
resAllCellsForAreaOfCoverageForZeroCoverageCell = []
resAllCellsForPixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt = []
noneKeyWindowCnt=0
# center points as string
centerPointsAsStringForZeroCoverageCell = ""
with Pool(processes=int(config['MULTIPROCESSING']['proceses_count'])) as ZeroCoverageCellsProcessingPool.pool:
self.__chunkSize = PoolUtils.getChunkSize(lst=self.__iterables,cpuCount=int(config['MULTIPROCESSING']['cpu_count']))
logger.info(f"ZeroCoverageCellsProcessingPool.self.__chunkSize(task per processor):{self.__chunkSize}")
for res in ZeroCoverageCellsProcessingPool.pool.map(func=self.run,iterable=self.__iterables,chunksize=self.__chunkSize):
resAllCellsForGridCellsClassifications.append(res[0])
# NDVIs
resAllCellsForNDVITIFFDetailsForZeroCoverageCell.append(res[1])
# area of coverage
resAllCellsForAreaOfCoverageForZeroCoverageCell.append(res[2])
# interception
resAllCellsForInterceptionForZeroCoverageCell.append(res[3])
# fourCornersOfWindowInEPSG25832
resAllCellsForFourCornersOfWindowInEPSG25832ZeroCoverageCell.append(res[4])
# outFromEPSG25832ToEPSG4326-lists
resAllCellsForOutFromEPSG25832ToEPSG4326ForZeroCoverageCells.append(res[5])
# fourCornersOfWindowsAsGeoJSONInEPSG4326
resAllCellsForFourCornersOfWindowsAsGeoJSONInEPSG4326ForZeroCoverageCell.append(res[6])
# calculatedCenterPointInEPSG25832
resAllCellsForCalculatedCenterPointInEPSG25832ForZeroCoverageCell.append(res[7])
# centerPointsOfWindowInImageCoordinatesSystem
resAllCellsForCenterPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell.append(res[8])
# pixelValuesOfCenterPoints
resAllCellsForPixelValuesOfCenterPointsForZeroCoverageCell.append(res[9])
# centerPointInEPSG4326
resAllCellsForCenterPointInEPSG4326ForZeroCoveringCell.append(res[10])
# centerPointOfKeyWindowAsGeoJSONInEPSG4326
resAllCellsForCenterPointOfKeyWindowAsGeoJSONInEPSG4326ForZeroCoverageCell.append(res[11])
# average heights
resAllCellsForAverageHeightsForZeroCoverageCell.append(res[12])
# pixels values
resAllCellsForPixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCell.append(res[13])
# pixelsValues cnt
resAllCellsForPixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt.append(res[14])
noneKeyWindowCnt +=res[15]
# centerPoints-As-String
if (res[16] is not None):
centerPointsAsStringForZeroCoverageCell+=str(res[16])
assert noneKeyWindowCnt == len(self.__iterables)
ZeroCoverageCellsProcessingPool.pool.close()
ZeroCoverageCellsProcessingPool.pool.terminate()
ZeroCoverageCellsProcessingPool.pool.join()
return
def run(self,params:GridCellInnerLoopsIteratorsForZeroCoverageModel):
if params is not None:
logger.info(f"Processing zero coverage cell #(row{params.getRowValue()},col:{params.getColValue()})")
row = params.getRowValue()
col = params.getColValue()
mainTIFFImageDatasetContents = params.getMainTIFFImageDatasetContents()
NDVIsTIFFWindowedSegmentContentsInEPSG25832 = params.getNDVIsTIFFWindowedSegmentContentsInEPSG25832()
URLOrFilePathForElevationsTIFFDatasetInEPSG25832 = params.getURLOrFilePathForElevationsTIFFDatasetInEPSG25832()
datasetElevationsTIFFInEPSG25832 = rasterio.open(URLOrFilePathForElevationsTIFFDatasetInEPSG25832,'r')
_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell = params.getPixelsValuesDoNotSatisfyThresholdInZeroCoverageCell()
pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt = params.getPixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt()
countOfNoDataCells = params.getPixelsWithNoDataValueInTIFFImageDatasetCnt()
outFromEPSG25832ToEPSG4326ForZeroCoverageCells = []
fourCornersOfWindowsAsGeoJSONInEPSG4326ForZeroCoverageCell = []
ndviTIFFDetailsForZeroCoverageCell = NDVITIFFDetails(None,None,None).getNDVIValuePer10mX10m()
"""area of coverage per grid-cell"""
areaOfCoverageForZeroCoverageCell = None
""""interception"""
interceptionForZeroCoverageCell = None
CntOfNDVIsWithNanValueInZeroCoverageCell = 0
fourCornersOfWindowInEPSG25832ZeroCoverageCell = None
outFromEPSG25832ToEPSG4326ForZeroCoverageCells = []
fourCornersOfWindowsAsGeoJSONInEPSG4326ForZeroCoverageCell = []
calculatedCenterPointInEPSG25832ForZeroCoverageCell = None
centerPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell = None
pixelValuesOfCenterPointsOfZeroCoverageCell = None
centerPointInEPSG4326ForZeroCoveringCell = None
centerPointOfKeyWindowAsGeoJSONInEPSG4326ForZeroCoverageCell = None
centerPointsAsStringForZeroCoverageCell = None
"""average heights"""
averageHeightsForZeroCoverageCell = None
gridCellClassifiedAs = GridCellClassifier.ZERO_COVERAGE_CELL.value
cntOfNoneKeyWindow = 1
ndviTIFFDetailsForZeroCoverageCell = NDVITIFFDetails(ulX=row//int(config['ndvi']['resolution_height']),ulY=col//int(config['ndvi']['resolution_width']),dataset=NDVIsTIFFWindowedSegmentContentsInEPSG25832).getNDVIValuePer10mX10m()
"""area of coverage per grid-cell"""
areaOfCoverageForZeroCoverageCell = round(AreaOfCoverageDetails(pixelsCount=(int(config['window']['width']) * int(config['window']['height'])) - (pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt + countOfNoDataCells)).getPercentageOfAreaOfCoverage(),2)
""""interception"""
if math.isnan(ndviTIFFDetailsForZeroCoverageCell):
# ndviTIFFDetailsForZeroCoverageCell = 0
CntOfNDVIsWithNanValueInZeroCoverageCell = 1
interceptionForZeroCoverageCell = config['sentinel_values']['interception']
else:
Indvi = INDVI()
Ic = Indvi.calcInterception(ndviTIFFDetailsForZeroCoverageCell)
Pc=areaOfCoverageForZeroCoverageCell,"""percentage of coverage"""
Pnc=float((int(config['window']['width'])*int(config['window']['height'])) - areaOfCoverageForZeroCoverageCell),"""percentage of non-coverage"""
Inc=float(config['interception']['noneCoverage']),"""interception of none-coverage"""
I=(float(Pc[0])*(Ic))+float((Pnc[0]*Inc[0]))
interceptionForZeroCoverageCell = round(I,2)
if I != 10 and I != float('nan'):
logger.error(f"ndviTIFFDetailsForZeroCoverageCell:{ndviTIFFDetailsForZeroCoverageCell}")
logger.error(f"I:{I}")
fourCornersOfWindowInEPSG25832ZeroCoverageCell = RasterIOPackageUtils.convertFourCornersOfWindowFromImageCoordinatesToCRSByCoordinatesOfCentersOfPixelsMethodFor(row,col,int(config['window']['height']),int(config['window']['width']),datasetElevationsTIFFInEPSG25832)
for i in range(0,len(fourCornersOfWindowInEPSG25832ZeroCoverageCell)):
# fourCornersOfKeyWindowInEPSG4326.append(RasterIOPackageUtils.convertCoordsToDestEPSGForDataset(fourCornersOfWindowInEPSG25832[i],datasetElevationsTIFFInEPSG25832,destEPSG=4326))
outFromEPSG25832ToEPSG4326ForZeroCoverageCells.append(OSGEOUtils.fromEPSG25832ToEPSG4326(fourCornersOfWindowInEPSG25832ZeroCoverageCell[i])) # resultant coords order is in form of lat,lon and it must be in lon,lat.thus, out[1]-lat out[0]-lon
"""fourCornersOfWindowsAsGeoJSONInEPSG4326"""
fourCornersOfWindowInEPSG4326 = []
for i in range(0,len(outFromEPSG25832ToEPSG4326ForZeroCoverageCells)):
fourCornersOfWindowInEPSG4326.append(([outFromEPSG25832ToEPSG4326ForZeroCoverageCells[i][1]],[outFromEPSG25832ToEPSG4326ForZeroCoverageCells[i][0]]))
fourCornersOfWindowsAsGeoJSONInEPSG4326ForZeroCoverageCell.append(jsonUtils.buildFeatureCollectionAsGeoJSONForFourCornersOfKeyWindow(fourCornersOfWindowInEPSG4326[0],fourCornersOfWindowInEPSG4326[1],fourCornersOfWindowInEPSG4326[2],fourCornersOfWindowInEPSG4326[3],areaOfCoverageForZeroCoverageCell))
# debugIOUtils.writeContentsToFile(debugIOUtils.getPathToOutputDir()+"/"+"NDVIsPer10mX10mForKeyWindow"+config['window']['file_name']+".{0}".format(config['window']['file_extension']),"{0}\n".format(NDVIsPer10mX10mForKeyWindow))
"""
building geojson object for a point "center-point" to visualize it.
"""
calculatedCenterPointInEPSG25832ForZeroCoverageCell = MiscUtils.calculateCenterPointsGivenLLOfGridCell(fourCornersOfWindowInEPSG25832ZeroCoverageCell[1])#lower-left corner
centerPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell = RasterIOPackageUtils.convertFromCRSToImageCoordinatesSystemFor(calculatedCenterPointInEPSG25832ForZeroCoverageCell[0],calculatedCenterPointInEPSG25832ForZeroCoverageCell[1],datasetElevationsTIFFInEPSG25832)
pixelValuesOfCenterPointsOfZeroCoverageCell = mainTIFFImageDatasetContents[0][centerPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell[0]][centerPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell[1]]
centerPointInEPSG4326ForZeroCoveringCell = RasterIOPackageUtils.convertCoordsToDestEPSGForDataset(calculatedCenterPointInEPSG25832ForZeroCoverageCell,datasetElevationsTIFFInEPSG25832,destEPSG=4326)
centerPointOfKeyWindowAsGeoJSONInEPSG4326ForZeroCoverageCell = jsonUtils.buildGeoJSONForPointFor(centerPointInEPSG4326ForZeroCoveringCell)
"""average heights"""
averageHeightsForZeroCoverageCell = round(MiscUtils.getAverageFor(_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell),2)
assert len(_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell) > 0 and (len(_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell) <= (int(config['window']['width']) * int(config['window']['height'])) )
"""the following code block is for assertion only"""
if self.__devModeForWSAWANTIVer2 == config['DEVELOPMENT_MODE']['debug']:
assert pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt >= 0 and (pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt < (int(config['window']['width']) * int(config['window']['height'])) )
assert (pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt+countOfNoDataCells) == (int(config['window']['width']) * int(config['window']['height']))
print(f"profiling for gridCellClassifiedAs:{gridCellClassifiedAs}....>pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt:{pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt}")
print(f"profiling for gridCellClassifiedAs:{gridCellClassifiedAs}....>countOfNoDataCells:{countOfNoDataCells}")
pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt = (int(config['window']['width']) * int(config['window']['height'])) - (pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt + countOfNoDataCells)
assert pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt == 0, (f"WTF.")
print(f"profiling for gridCellClassifiedAs:{gridCellClassifiedAs}....>computed pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt:{pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt}")
print(f"\n")
centerPointAsTextInWKTInEPSG3857 = CoordinatesUtils.buildWKTPointFormatForSinglePointFor(calculatedCenterPointInEPSG25832ForZeroCoverageCell[0],calculatedCenterPointInEPSG25832ForZeroCoverageCell[1])
s = centerPointAsTextInWKTInEPSG3857.replace("POINT","")
s = s.replace("(","")
s = s.replace(")","")
s = s.strip()
s = s.split(" ")
centerPointsAsStringForZeroCoverageCell = s[0] + "\t" + s[1] + "\n"
centerPointsAsStringForZeroCoverageCell = centerPointsAsStringForZeroCoverageCell.replace('\'',"")
return gridCellClassifiedAs,ndviTIFFDetailsForZeroCoverageCell,areaOfCoverageForZeroCoverageCell,interceptionForZeroCoverageCell,fourCornersOfWindowInEPSG25832ZeroCoverageCell,outFromEPSG25832ToEPSG4326ForZeroCoverageCells,fourCornersOfWindowsAsGeoJSONInEPSG4326ForZeroCoverageCell,calculatedCenterPointInEPSG25832ForZeroCoverageCell,centerPointsOfWindowInImageCoordinatesSystemForZeroCoverageCell,pixelValuesOfCenterPointsOfZeroCoverageCell,centerPointInEPSG4326ForZeroCoveringCell,centerPointOfKeyWindowAsGeoJSONInEPSG4326ForZeroCoverageCell,averageHeightsForZeroCoverageCell,np.array(_pixelsValuesDoNotSatisfyThresholdInZeroCoverageCell).tolist(),pixelsValuesDoNotSatisfyThresholdInTIFFImageDatasetCnt,cntOfNoneKeyWindow,centerPointsAsStringForZeroCoverageCell,CntOfNDVIsWithNanValueInZeroCoverageCell
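One thing that may be worth measuring here, as a hypothesis rather than a confirmed diagnosis: multiprocessing.Pool.map pickles every task item to send it to a worker and pickles every result to send it back. Each model object above carries mainTIFFImageDatasetContents, so if that array is large, the transfer could dominate the run time and keep the workers idle. The sketch below uses plain lists and dicts as hypothetical stand-ins (the raster size and the file name elevations.tif are invented) to show how the payload of a task can be compared.

```python
import pickle

# Hypothetical stand-in for one raster band: 500 x 500 floats.
raster = [[float(x) for x in range(500)] for _ in range(500)]

# A task that carries the whole raster, similar to the model objects
# that hold mainTIFFImageDatasetContents.
heavy_task = {"row": 0, "col": 0, "raster": raster}

# A task that carries only what a worker needs to locate its data.
light_task = {"row": 0, "col": 0, "path": "elevations.tif"}

heavy_bytes = len(pickle.dumps(heavy_task))
light_bytes = len(pickle.dumps(light_task))
print(f"heavy task: {heavy_bytes} bytes, light task: {light_bytes} bytes")
```

If the real objects turn out to be large, one option is to pass only row, col, and the file path, and let each worker open the dataset itself, as run() already does with rasterio.open.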

Infinity loop issue using for loops

import pandas as pd
import time
import yfinance as yf
import money_18
import talib
def backtest(df,us_code, profit_target, stop_loss, macd_diff):
    pos_opened = False
    open_price = 0
    close_price = 0
    pnl = 0
    pnl_list = []
    original_capital = 100000
    temp_capital = original_capital
    num_of_lot = 0
    equity_value = 0
    equity_value_list = []
    dd_dollar = 0
    dd_dollar_list = []
    dd_pct = 0
    dd_pct_list = []
    mdd_dollar = 0
    mdd_pct = 0
    total_profit = 0
    num_of_trade = 0
    for i in range(1, len(df)):
        now_date = df.loc[i,'Date']
        now_open = df.loc[i,'Open']
        now_high = df.loc[i,'High']
        now_low = df.loc[i,'Low']
        now_close = df.loc[i,'Close']
        now_rsi = df.loc[i,'RSI']
        now_upper_band = df.loc[i,'Upper_Band']
        now_middle_band = df.loc[i,'Middle_Band']
        now_lower_band = df.loc[i,'Lower_Band']
        now_macd = df.loc[i,'MACD']
        now_macd_signal = df.loc[i,'MACD_Signal']
        now_macd_hist = df.loc[i,'MACD_Hist']
        ##### equity curve #####
        equity_value = round(temp_capital + (now_open - open_price) * num_of_lot )
        equity_value_list.append(equity_value)
        temp_max_equity = max(equity_value_list)
        dd_dollar = temp_max_equity - equity_value
        dd_dollar_list.append(dd_dollar)
        mdd_dollar = max(dd_dollar_list)
        dd_pct = (temp_max_equity - equity_value) / temp_max_equity
        dd_pct_list.append(dd_pct)
        mdd_pct = max(dd_pct_list)
        ##### open position #####
        if (pos_opened == False) and (i < len(df) - 1) and now_macd_hist > macd_diff :
            pos_opened = True
            open_price = now_close
            num_of_lot = temp_capital // (open_price)
        ##### profit taking and stop loss #####
        if (pos_opened == True) and ((now_open - open_price > profit_target * open_price) or (now_open - open_price < stop_loss * open_price) or (i == len(df) -1)):
            pos_opened = False
            close_price = now_open
            pnl = (close_price - open_price) * num_of_lot
            pnl_list.append(pnl)
            open_price = 0
            num_of_lot = 0
            temp_capital = temp_capital + pnl
    if len(pnl_list) > 0:
        total_profit = sum(pnl_list)
        num_of_trade = len(pnl_list)
    return us_code, profit_target, stop_loss, total_profit, num_of_trade, mdd_dollar, mdd_pct, macd_diff
if __name__ == '__main__':
    us_code_list = ['TSLA', 'AAPL']
    macd_diff_list = [0, 0.05]
    profit_target_list = [0.03, 0.06]
    stop_loss_list = [-0.01, -0.02, -0.03]
    start_date = '2020-01-01'
    end_date = '2020-12-31'
    df_dict = {}
    for us_code in us_code_list:
        df= yf.Ticker(us_code).history(start=start_date, end=end_date)
        df= df[df['Volume'] > 0]
        df = df[['Open', 'High', 'Low', 'Close']]
        df['RSI'] = talib.RSI(df['Close'], timeperiod=14)
        df['Upper_Band'], df['Middle_Band'], df['Lower_Band'] = talib.BBANDS(df['Close'], 20, 2, 2)
        df['MACD'], df['MACD_Signal'], df['MACD_Hist'] = talib.MACD(df['Close'], fastperiod=12, slowperiod=26,
                                                                    signalperiod=9)
        df = df[df['MACD_Hist'].notna()]
        df = df.reset_index()
        df_dict[us_code] = df
    save_us_code = ''
    save_macd_diff = 0
    save_profit_target = 0
    save_stop_loss = 0
    total_profit = 0
    num_of_trade = 0
    mdd_dollar = 0
    mdd_pct = 0
    save_us_code_list = []
    save_macd_diff_list = []
    save_profit_target_list = []
    save_stop_loss_list = []
    total_profit_list = []
    num_of_trade_list = []
    mdd_dollar_list = []
    mdd_pct_list = []
    result_dict = {}
    for us_code in us_code_list:
        for macd_diff in macd_diff_list:
            for profit_target in profit_target_list:
                for stop_loss in stop_loss_list:
                    print(us_code, macd_diff, profit_target, stop_loss) ## the problem should be starting from here##
                    save_us_code, save_profit_target, save_stop_loss, total_profit, num_of_trade, mdd_dollar, mdd_pct, macd_diff = backtest(df, us_code, profit_target, stop_loss, macd_diff)
                    save_us_code_list.append(save_us_code)
                    save_profit_target_list.append(save_profit_target)
                    save_stop_loss_list.append(save_stop_loss)
                    total_profit_list.append(total_profit)
                    num_of_trade_list.append(num_of_trade)
                    mdd_dollar_list.append(mdd_dollar)
                    mdd_pct_list.append(mdd_pct)
                    macd_diff_list.append(macd_diff)
I am working on algo trading. I created a for loop to feed my parameters into my backtest function, but the loop never stops.
I think the error starts at "for macd_diff in macd_diff_list:", because when I print the values just below that line, the output is already endless.

Now that you've shown the full code, your problem is obvious. Your original example didn't show the issue because you didn't include all relevant code. Here's your example with the relevant code that's causing the issue:
for us_code in us_code_list:
    for macd_diff in macd_diff_list:
        for profit_target in profit_target_list:
            for stop_loss in stop_loss_list:
                ... # irrelevant code not shown
                macd_diff_list.append(macd_diff)
The issue is that you're looping through each item in macd_diff_list, but then for each loop iteration, you add an item to that list. So of course the loop will be infinite. You need to be looping through a different list, or adding items to a different list.
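A minimal sketch of one way to apply that advice: iterate over the parameter lists with itertools.product and collect the results in a separate list, so the list being iterated never grows. The function fake_backtest is a hypothetical stand-in for backtest() that only echoes its parameters.

```python
from itertools import product

us_code_list = ['TSLA', 'AAPL']
macd_diff_list = [0, 0.05]
profit_target_list = [0.03, 0.06]
stop_loss_list = [-0.01, -0.02, -0.03]


def fake_backtest(us_code, profit_target, stop_loss, macd_diff):
    """Hypothetical stand-in for backtest(); just echoes its parameters."""
    return us_code, profit_target, stop_loss, macd_diff


# Results go into their own list; the parameter lists are only read.
results = []
for us_code, macd_diff, profit_target, stop_loss in product(
        us_code_list, macd_diff_list, profit_target_list, stop_loss_list):
    results.append(fake_backtest(us_code, profit_target, stop_loss, macd_diff))

print(len(results))  # 2 * 2 * 2 * 3 = 24 combinations
```

In the original script the equivalent change would be to append to save_macd_diff_list, which is already defined but never used, instead of macd_diff_list.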

While and for loop with global variable not working Python updated

Working for single symbol
todate = zerodha.get_trade_day(datetime.now().astimezone(to_india) - timedelta(days=0))
fromdate = zerodha.get_trade_day(datetime.now().astimezone(to_india) - timedelta(days=5))
symbol = "ZINC20MAYFUT"
instype = "MCX"
Timeinterval = "5minute"
tradeDir = 0 #neutral
while (True):
    histdata1 = zerodha.get_history(symbol, fromdate, todate, Timeinterval, instype)
    df = pd.DataFrame(histdata1)
    df = heikinashi(df)
    df = bollinger_bands(df,field='h_close',period=20, numsd=2)
    df1 =pd.DataFrame(df, columns=['date','volume','close','h_close','middle_band', 'upper_band'])
    pp = pd.DataFrame(df1.tail(3))
    print(pp)
    dfCToList = pp['h_close'].tolist()
    dfCList = list(pp['h_close'])
    dfHValues = pp['h_close'].values
    dfBMValues = pp['middle_band'].values
    H_last = dfHValues[2] # tail 1
    BM_last = dfBMValues[2] # tail 1
    if (H_last > BM_last and (tradeDir == 0 or tradeDir == -1)):
        print("buy")
        tradeDir = 1 # up
    if (H_last < BM_last and (tradeDir == 0 or tradeDir == 1)):
        print("SELL")
        tradeDir = -1 # down
    # pdb.set_trace()
Question: When the conditions are met, it prints "buy"/"SELL" again and again. I want it to print just once, when the condition is first met.
todate = zerodha.get_trade_day(datetime.now().astimezone(to_india) - timedelta(days=0))
fromdate = zerodha.get_trade_day(datetime.now().astimezone(to_india) - timedelta(days=5))
tradeDir = 0 #neutral
def script():
    global tradeDir
    ##For historical Data##
    symbol = ["ZINC20MAYFUT" ,"CRUDEOIL20MAYFUT","GOLD20JUNFUT"]
    instype = "MCX"
    Timeinterval = "5minute"
    for symbol in symbol:
        global tradeDir
        histdata1 = zerodha.get_history(symbol, fromdate, todate, Timeinterval, instype)
        df = pd.DataFrame(histdata1)
        df = heikinashi(df)
        df = bollinger_bands(df,field='h_close',period=20, numsd=2)
        df1 =pd.DataFrame(df, columns=['date','volume','close','h_close','middle_band', 'upper_band'])
        pp = pd.DataFrame(df1.tail(3))
        print(pp)
        dfCToList = pp['h_close'].tolist()
        dfCList = list(pp['h_close'])
        dfHValues = pp['h_close'].values
        dfBMValues = pp['middle_band'].values
        H_last = dfHValues[2] # tail 1
        BM_last = dfBMValues[2] # tail 1
        if (H_last > BM_last and (tradeDir == 0 or tradeDir == -1)):
            print("buy")
            tradeDir = 1 # up
        if (H_last < BM_last and (tradeDir == 0 or tradeDir == 1)):
            print("SELL")
            tradeDir = -1 # down
        # pdb.set_trace()
while True:
    try:
        script()
    except Exception as e:
        sleep(2)
        continue
When the conditions are met, it prints "BUY/SELL" again and again. I want it to print only once, the first time the condition is met. This is the full script, and it should keep running continuously.

If you want the code to stop looping after the first time it prints "buy" or "SELL", you just need to add a break statement after each of the prints (inside the scope of the containing if blocks).
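If the script has to keep running, as the question asks, a break would end the loop altogether. One possible cause of the repeated prints in the multi-symbol version is that all symbols share a single tradeDir, so each symbol can flip the direction for the others. A minimal sketch of keeping one direction per symbol follows. The names last_dir and update_signal are hypothetical, and the numbers are made-up stand-ins for the h_close and middle_band values taken from the DataFrame:

```python
# Sketch: keep one trade direction per symbol instead of one shared global.
# `last_dir` and `update_signal` are illustrative names, not from the original post.
last_dir = {}  # symbol -> 1 (up) or -1 (down); a missing key means neutral (0)


def update_signal(symbol, h_close, middle_band):
    """Return "buy" or "SELL" only when the direction for this symbol changes."""
    previous = last_dir.get(symbol, 0)
    if h_close > middle_band and previous in (0, -1):
        last_dir[symbol] = 1
        return "buy"
    if h_close < middle_band and previous in (0, 1):
        last_dir[symbol] = -1
        return "SELL"
    return None  # direction unchanged, nothing to print


# Made-up readings: (symbol, h_close, middle_band).
readings = [
    ("ZINC20MAYFUT", 10.0, 9.0),
    ("GOLD20JUNFUT", 5.0, 6.0),
    ("ZINC20MAYFUT", 11.0, 9.5),  # still above the band, so no second "buy"
    ("ZINC20MAYFUT", 8.0, 9.5),
]
signals = [update_signal(sym, h, m) for sym, h, m in readings]
for (sym, _, _), sig in zip(readings, signals):
    if sig is not None:
        print(sym, sig)
```

With this layout the outer while True loop can keep polling every symbol, and a signal is printed only when that particular symbol changes direction.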

Single list.count instead of multiple

I have parsed a crew list that looks like this:
20;mechanic;0;68
21;cook;0;43
22;scientist;0;79
23;manager;1;65
24;mechanic;1;41
etc
Now I'm trying to count the number of workers who have 60 or more stamina (the last element of each employee record).
Here is my code:
with open('employee.txt', 'r') as employee_list:
    count = 0
    for employee in employee_list.readlines():
        employee_data = employee.rstrip().split(';')
        if int(employee_data[3]) >= 60:
            count += 1
            print(count)
Print from terminal:
1
2
3
...
90
I think the last number is the right answer, but is there any way to get only the single total count instead of 90 lines?

Just print one line after the loop is done.
with open('employee.txt', 'r') as employee_list:
    count = 0
    for employee in employee_list.readlines():
        employee_data = employee.rstrip().split(';')
        if int(employee_data[3]) >= 60:
            count += 1
print(count)
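But I would also recommend using pandas for data manipulation. The file has no header row, so pass header=None and name the columns yourself. For example:
import pandas as pd
df = pd.read_csv('employee.txt', sep=';', header=None, names=['col1', 'col2', 'col3', 'stamina'])
Then filter and count the rows (.size would return rows times columns, not the number of rows):
len(df[df.stamina >= 60])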
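If pandas is not available, the same total can be had in plain Python by summing a generator of booleans. This is only a sketch: the sample rows are the five records shown in the question, and in the real script the lines would come from employee.txt instead of a list.

```python
# The five sample records from the question, in the same semicolon-separated layout.
lines = [
    "20;mechanic;0;68",
    "21;cook;0;43",
    "22;scientist;0;79",
    "23;manager;1;65",
    "24;mechanic;1;41",
]

# Each comparison yields True or False; sum() treats True as 1, so this
# counts the records whose last field (stamina) is at least 60.
count = sum(int(line.rstrip().split(';')[3]) >= 60 for line in lines)
print(count)  # → 3
```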

So after a day of thinking I wrote this and got the right answer (maybe someone will find it helpful):
def total_resist_count():
    # with open('employee.txt', 'r') as employee_list:
    employee_list = [input() for i in range(120)]
    candidates = []
    for employee in employee_list:
        employee_data = employee.rstrip().split(';')
        if int(employee_data[3]) >= 60:
            candidates.append(employee_data)
    return candidates
required_professionals = {
    'computers specialist': 5,
    'cook': 3,
    'doctor': 5,
    'electrical engineer': 4,
    'manager': 1,
    'mechanic': 8,
    'scientist': 14
}
expedition_total = 40
female_min = 21
male_min = 12
def validate_solution(cur_team, num_females, num_males):
    global expedition_total, female_min, male_min
    if sum(cur_team) != expedition_total or num_females < female_min or num_males < male_min:
        return False
    num_of_free_vacancies = 0
    for k in required_professionals:
        num_of_free_vacancies += required_professionals[k]
    if num_of_free_vacancies > 0:
        return False
    return True
TEAM = None
def backtrack(candidates, cur_team, num_females, num_males):
    global required_professionals, expedition_total, TEAM
    if sum(cur_team) > expedition_total or TEAM is not None:
        return
    if validate_solution(cur_team, num_females, num_males):
        team = []
        for i, used in enumerate(cur_team):
            if used == 1:
                team.append(candidates[i])
        TEAM = team
        return
    for i in range(len(candidates)):
        if cur_team[i] == 0 and required_professionals[candidates[i][1]] > 0:
            cur_team[i] = 1
            required_professionals[candidates[i][1]] -= 1
            if candidates[i][2] == '1':
                backtrack(candidates, cur_team, num_females, num_males + 1)
            else:
                backtrack(candidates, cur_team, num_females + 1, num_males)
            required_professionals[candidates[i][1]] += 1
            cur_team[i] = 0
if __name__ == '__main__':
    # decode_fcc_message() is not shown in this post
    ec = decode_fcc_message()
    candidates = total_resist_count(ec)
    cur_team = [0] * len(candidates)
    backtrack(candidates, cur_team, 0, 0)
    s = ""
    for t in TEAM:
        s += str(t[0]) + ';'
    print(s)

monte carlo simulation python

I would like to simulate a seven-game baseball playoff series. Let's say I have the win probabilities for each game in the series. I would like to know the probabilities for each possible series outcome, i.e. TeamA in 4 games, TeamB in 4 games, TeamA in 5 games, etc.
This is what I came up with and it seems to work but I think it could be done better.
winPercGM1 = .5
winPercGM2 = .56
winPercGM3 = .47
winPercGM4 = .55
winPercGM5 = .59
winPercGM6 = .59
winPercGM7 = .38
winPercs = [winPercGM1, winPercGM2, winPercGM3, winPercGM4, winPercGM5, winPercGM6, winPercGM7]
def WinSeries():
    teamAwins = 0
    teamBwins = 0
    for perc in winPercs:
        if teamAwins == 4:
            break
        elif teamBwins == 4:
            break
        elif perc > np.random.random():
            teamAwins += 1
        else:
            teamBwins += 1
    return teamAwins, teamBwins
def RunFun(n):
    teamAWins = []
    teamBWins = []
    for i in xrange(n):
        result = WinSeries()
        teamAWin = result[0]
        teamBWin = result[1]
        teamAWins.append(teamAWin)
        teamBWins.append(teamBWin)
    return teamAWins, teamBWins
n = 500000
results = RunFun(n)
teamAwinSeries = results[0]
teamBwinSeries = results[1]
teamBin4 = teamAwinSeries.count(0)/n
teamBin5 = teamAwinSeries.count(1)/n
teamBin6 = teamAwinSeries.count(2)/n
teamBin7 = teamAwinSeries.count(3) / n
teamAin4 = teamBwinSeries.count(0)/n
teamAin5 = teamBwinSeries.count(1)/n
teamAin6 = teamBwinSeries.count(2)/n
teamAin7 = teamBwinSeries.count(3) / n

This can be done easily with numpy (Python 2.7)
import numpy as np
probs = np.array([.5 ,.56 ,.47 ,.55 ,.59 ,.59 ,.38])
nsims = 500000
chance = np.random.uniform(size=(nsims, 7))
teamAWins = (chance > probs[None, :]).astype('i4')
teamBWins = 1 - teamAWins
teamAwincount = {}
teamBwincount = {}
for ngames in range(4, 8):
    afilt = teamAWins[:, :ngames].sum(axis=1) == 4
    bfilt = teamBWins[:, :ngames].sum(axis=1) == 4
    teamAwincount[ngames] = afilt.sum()
    teamBwincount[ngames] = bfilt.sum()
    teamAWins = teamAWins[~afilt]
    teamBWins = teamBWins[~bfilt]
teamAwinprops = {k : 1. * count/nsims for k, count in teamAwincount.iteritems()}
teamBwinprops = {k : 1. * count/nsims for k, count in teamBwincount.iteritems()}
Output:
>>> sum(teamAwinprops.values()) + sum(teamBwinprops.values())
1.0
>>> teamAwincount
{4: 26186, 5: 47062, 6: 59222, 7: 95381}
>>> teamBwincount
{4: 36187, 5: 79695, 6: 97802, 7: 58465}
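For Python 3, the same idea can be sketched without the loop that shrinks the arrays. This is a variant under stated assumptions, not the code above: it uses NumPy's Generator API (np.random.default_rng), a smaller nsims, and it follows the question's convention that team A wins a game when its win probability exceeds the random draw (chance < probs). Names such as a_won, last_game and team_a_props are illustrative.

```python
import numpy as np

probs = np.array([.5, .56, .47, .55, .59, .59, .38])  # P(team A wins game i)
nsims = 100_000
rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

# a_won[s, g] is True when team A wins game g of simulated series s.
a_won = rng.random((nsims, 7)) < probs
a_cum = a_won.cumsum(axis=1)       # team A's wins after each game
b_cum = (~a_won).cumsum(axis=1)    # team B's wins after each game

# 0-based index of the first game at which either side reaches 4 wins.
# argmax returns the first True; one always exists because 7 games are played.
finished = (a_cum == 4) | (b_cum == 4)
last_game = finished.argmax(axis=1)
a_took_series = a_cum[np.arange(nsims), last_game] == 4

team_a_props = {}
team_b_props = {}
for ngames in range(4, 8):
    ended_here = last_game == ngames - 1
    team_a_props[ngames] = float((ended_here & a_took_series).sum()) / nsims
    team_b_props[ngames] = float((ended_here & ~a_took_series).sum()) / nsims

print(team_a_props)
print(team_b_props)
```

Every simulated series ends at exactly one game index with exactly one winner, so the eight proportions add up to 1.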
