Error recognizing parameters for a spatial join using ArcPy - python

I'm trying to iterate a spatial join through a folder - then iterate a second spatial join through the outputs of the first.
This is my initial script:
import arcpy, os, sys, glob
'''This script loops a spatial join through all the feature classes
in the input folder, then performs a second spatial join on the output
files'''
#set local variables
input = "C:\\Users\\Ryck\\Test\\test_Input"
boundary = "C:\\Users\\Ryck\\Test\\area_Input\\boundary_Test.shp"
admin = "C:\\Users\\Ryck\\Test\\area_Input\\admi_Boundary_Test.shp"
outloc = "C:\\Users\\Ryck\\Test\\join_02"
#overwrite any files with the same name
arcpy.env.overwriteOutput = True
#perform spatial joins
for fc in input:
    outfile = outloc + fc
    join1 = [arcpy.SpatialJoin_analysis(fc, boundary, outfile) for fc in input]
    for fc in join1:
        arcpy.SpatialJoin_analysis(fc, admin, outfile)
I keep receiving ERROR 000732: Target Features: Dataset C does not exist or is not supported.
I'm sure this is a simple error, but none of the solutions that have previously been recommended to solve this error allow me to still output my results to their own folder.
Thanks in advance for any suggestions

You appear to be trying to loop through a given directory, performing the spatial join on (shapefiles?) contained therein.
However, this syntax is a problem:
input = "C:\\Users\\Ryck\\Test\\test_Input"
for fc in input:
    # do things to fc
In this case, the for loop is iterating over a string. So each time through the loop, it takes one character at a time: first C, then :, then \... and of course the arcpy function fails with this input, because it expects a file path, not a character. Hence the error: Target Features: Dataset C does not exist...
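For example, a quick check in the interpreter shows exactly what the loop is handed (the path is just the one from the question):
path = "C:\\Users\\Ryck\\Test\\test_Input"
for fc in path:
    print(fc)   # prints 'C', then ':', then '\', then 'U', ... one character per pass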
To instead loop through files in your input directory, you need a couple extra steps. Build a list of files, and then iterate through that list.
arcpy.env.workspace = input # sets "workspace" to input directory, for next tool
shp_list = arcpy.ListFiles("*.shp") # list of all shapefiles in workspace
for fc in shp_list:
    # do things to fc
(Ref. this answer on GIS.SE.)
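Putting the pieces together, a minimal sketch of the corrected two-pass workflow might look like the following (folder names taken from the question, assuming the inputs are shapefiles, and using os.path.join to build each output path):
import arcpy
import os

in_dir = "C:\\Users\\Ryck\\Test\\test_Input"
boundary = "C:\\Users\\Ryck\\Test\\area_Input\\boundary_Test.shp"
admin = "C:\\Users\\Ryck\\Test\\area_Input\\admi_Boundary_Test.shp"
out1 = "C:\\Users\\Ryck\\Test\\join_01"   # outputs of the first join
out2 = "C:\\Users\\Ryck\\Test\\join_02"   # outputs of the second join

arcpy.env.workspace = in_dir
arcpy.env.overwriteOutput = True

for fc in arcpy.ListFiles("*.shp"):
    first = os.path.join(out1, fc)
    arcpy.SpatialJoin_analysis(fc, boundary, first)
    # feed the result of the first join straight into the second join
    arcpy.SpatialJoin_analysis(first, admin, os.path.join(out2, fc))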

After working through some kinks, and thanks to the advice of @erica, I decided to abandon my original concept of a nested for loop and approach it more simply. I'm still working on a GUI that will create system arguments that can be assigned to the variables and then used as parameters for the spatial joins, but for now, this is the solution I've worked out.
import arcpy
input = "C:\\Users\\Ryck\\Test\\test_Input\\"
boundary = "C:\\Users\\Ryck\\Test\\area_Input\\boundary_Test.shp"
outloc = "C:\\Users\\ryck\\Test\\join_01"
admin = "C:\\Users\\Ryck\\Test\\area_Input\\admin_boundary_Test.shp"
outloc1 = "C:\\Users\\Ryck\\Test\\join_02"
arcpy.env.workspace = input
arcpy.env.overwriteOutput = True
shp_list = arcpy.ListFeatureClasses()
print shp_list
for fc in shp_list:
    join1 = arcpy.SpatialJoin_analysis(fc, boundary, "C:\\Users\\ryck\\Test\\join_01\\" + fc)
arcpy.env.workspace = outloc
fc_list = arcpy.ListFeatureClasses()
print fc_list
for fc in fc_list:
    arcpy.SpatialJoin_analysis(fc, admin, "C:\\Users\\ryck\\Test\\join_02\\" + fc)
Setting multiple environments and using the actual paths feels clunky, but it works for me at this point.
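For reference, the repeated literal folder strings could also be built with os.path.join from the variables already defined above; a small sketch of the same flow, behaviour unchanged:
import os

for fc in shp_list:
    arcpy.SpatialJoin_analysis(fc, boundary, os.path.join(outloc, fc))

arcpy.env.workspace = outloc
for fc in arcpy.ListFeatureClasses():
    arcpy.SpatialJoin_analysis(fc, admin, os.path.join(outloc1, fc))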

Subgroup ArcPy list Query

Morning, folks.
I have two equal sets of layers, disposed in subgroups in my ArcGIS Pro (2.9.0), as shown here.
It's important that they have the same name (Layer1, Layer2, ...) in both groups.
Now, I'm writing ArcPy code that applies a definition query, but I want to apply it only to one specific sublayer (e.g. Compare\Layer1 and Compare\Layer2).
For now, I have this piece of code that, I hope, can help.
p = arcpy.mp.ArcGISProject('current')
m = p.listMaps()[0]
l = m.listLayers()
for row in l:
    print(row.name)
COD_QUERY = 123
for row in l:
    if row.name in ('Compare\Layer1'):
        row.definitionQuery = "CODIGO_EOL = {}".format(COD_QUERY)
        print('ok')
When I write 'Compare\Layer1', which is supposed to select only the Layer1 placed in the Compare group, the code doesn't work as expected and applies the query to both Compare\Layer1 and Base\Layer2. That's the exact problem that I'm having.
Hope I can find some help with u guys. XD
The layer's name (or longName) does not include the group layer's name.
Try using a wildcard (follow the link and search for listLayers) and filter for the particular group layer. A group layer object has a listLayers method too; you can leverage it again to get a specific layer.
import arcpy
COD_QUERY = 123
project = arcpy.mp.ArcGISProject("current")
map = project.listMaps()[0]
filtered_group_layers = map.listLayers("Compare")
if filtered_group_layers and filtered_group_layers[0].isGroupLayer:
    filtered_layers = filtered_group_layers[0].listLayers("Layer1")
    if filtered_layers:
        filtered_layers[0].definitionQuery = f"CODIGO_EOL = {COD_QUERY}"
Or you can use loops. The key here is to filter out the group layers using the isGroupLayer property before accessing a layer's listLayers method.
import arcpy
COD_QUERY = 123
project = arcpy.mp.ArcGISProject("current")
map = project.listMaps()[0]
group_layers = (layer for layer in map.listLayers() if layer.isGroupLayer)
for group_layer in group_layers:
    if group_layer.name in "Compare":
        for layer in group_layer.listLayers():
            if layer.name in "Layer1":
                layer.definitionQuery = f"CODIGO_EOL = {COD_QUERY}"

How do I preserve the colours in a STEP file when modifying the geometry in Open Cascade?

I'm writing a script in python using Open Cascade Technology (using the pyOCCT package for Anaconda) to import STEP files, defeature them procedurally and re-export them. I want to preserve the product hierarchy, names and colours as much as possible. Currently the script can import STEP files, simplify all of the geometry while roughly preserving the hierarchy and re-export the step file. The problem is no matter how I approach the problem, I can't manage to make it preserve the colours of the STEP file in a few particular cases.
Here's the model I pass in to the script:
And here's the result of the simplification:
In this case, the simplification has worked correctly but the colours of some of the bodies were not preserved. The common thread is that the bodies that lose their colours are children of products which only have other bodies as their children (i.e. they don't contain sub-products).
This seems to be related to the way that Open Cascade imports STEP files and translates them into the XCAF document structure (in particular, which labels get marked as Assemblies versus SimpleShapes).
Alright, now for some code:
from OCCT.STEPControl import STEPControl_Reader, STEPControl_Writer, STEPControl_AsIs
from OCCT.BRepAlgoAPI import BRepAlgoAPI_Defeaturing
from OCCT.TopAbs import TopAbs_FACE, TopAbs_SHAPE, TopAbs_COMPOUND
from OCCT.TopExp import TopExp_Explorer
from OCCT.ShapeFix import ShapeFix_Shape
from OCCT.GProp import GProp_GProps
from OCCT.BRepGProp import BRepGProp
from OCCT.TopoDS import TopoDS
from OCCT.TopTools import TopTools_ListOfShape
from OCCT.BRep import BRep_Tool
from OCCT.Quantity import Quantity_ColorRGBA
from OCCT.ShapeBuild import ShapeBuild_ReShape
from OCCT.STEPCAFControl import STEPCAFControl_Reader, STEPCAFControl_Writer
from OCCT.XCAFApp import XCAFApp_Application
from OCCT.XCAFDoc import XCAFDoc_DocumentTool, XCAFDoc_ColorGen, XCAFDoc_ColorSurf
from OCCT.XmlXCAFDrivers import XmlXCAFDrivers
from OCCT.TCollection import TCollection_ExtendedString
from OCCT.TDF import TDF_LabelSequence
from OCCT.TDataStd import TDataStd_Name
from OCCT.TDocStd import TDocStd_Document
from OCCT.TNaming import TNaming_NamedShape
from OCCT.Interface import Interface_Static
# DBG
def export_step(shape, path):
    writer = STEPControl_Writer()
    writer.Transfer(shape, STEPControl_AsIs)
    writer.Write(path)
# DBG
def print_shape_type(label, shapeTool):
    if shapeTool.IsFree_(label):
        print("Free")
    if shapeTool.IsShape_(label):
        print("Shape")
    if shapeTool.IsSimpleShape_(label):
        print("SimpleShape")
    if shapeTool.IsReference_(label):
        print("Reference")
    if shapeTool.IsAssembly_(label):
        print("Assembly")
    if shapeTool.IsComponent_(label):
        print("Component")
    if shapeTool.IsCompound_(label):
        print("Compound")
    if shapeTool.IsSubShape_(label):
        print("SubShape")
# Returns a ListOfShape containing the faces to be removed in the defeaturing
# NOTE: For conciseness I've simplified this algorithm and as such it *MAY* not produce exactly
# the same output as shown in the screenshots but should still do SOME simplification
def select_faces(shape):
    exp = TopExp_Explorer(shape, TopAbs_FACE)
    selection = TopTools_ListOfShape()
    nfaces = 0
    while exp.More():
        rgb = None
        s = exp.Current()
        exp.Next()
        nfaces += 1
        face = TopoDS.Face_(s)
        gprops = GProp_GProps()
        BRepGProp.SurfaceProperties_(face, gprops)
        area = gprops.Mass()
        surf = BRep_Tool.Surface_(face)
        if area < 150:
            selection.Append(face)
            #log(f"\t\tRemoving face with area: {area}")
    return selection, nfaces
# Performs the defeaturing
def simplify(shape):
    defeaturer = BRepAlgoAPI_Defeaturing()
    defeaturer.SetShape(shape)
    sel = select_faces(shape)
    if sel[0].Extent() == 0:
        return shape
    defeaturer.AddFacesToRemove(sel[0])
    defeaturer.SetRunParallel(True)
    defeaturer.SetToFillHistory(False)
    defeaturer.Build()
    if (not defeaturer.IsDone()):
        return shape  # TODO: Handle errors
    return defeaturer.Shape()
# Given the label of an entity it finds its displayed colour. If the entity has no defined colour the parents are searched for defined colours as well.
def find_color(label, colorTool):
    col = Quantity_ColorRGBA()
    status = False
    while not status and label != None:
        try:
            status = colorTool.GetColor(label, XCAFDoc_ColorSurf, col)
        except:
            break
        label = label.Father()
    return (col.GetRGB().Red(), col.GetRGB().Green(), col.GetRGB().Blue(), col.Alpha(), status, col)
# Finds all child shapes and simplifies them recursively. Returns true if there were any subshapes.
# For now this assumes all shapes passed into this are translated as "SimpleShape".
# "Assembly" entities should be skipped as we don't need to touch them, "Compound" entities should work with this as well, though the behaviour is untested.
# Use the print_shape_type(shapeLabel, shapeTool) method to identify a shape.
def simplify_subshapes(shapeLabel, shapeTool, colorTool, set_colours=None):
    labels = TDF_LabelSequence()
    shapeTool.GetSubShapes_(shapeLabel, labels)
    #print_shape_type(shapeLabel, shapeTool)
    #print(f"{shapeTool.GetShape_(shapeLabel).ShapeType()}")
    cols = {}
    for i in range(1, labels.Length()+1):
        label = labels.Value(i)
        currShape = shapeTool.GetShape_(label)
        print(f"\t{currShape.ShapeType()}")
        if currShape.ShapeType() == TopAbs_COMPOUND:
            # This code path should never be taken as far as I understand
            simplify_subshapes(label, shapeTool, colorTool, set_colours)
        else:
            ''' See the comment at the bottom of the main loop for an explanation of the function of this block
            col = find_color(label, colorTool)
            #print(f"{name} RGBA: {col[0]:.5f} {col[1]:.5f} {col[2]:.5f} {col[3]:.5f} defined={col[4]}")
            cols[label.Tag()] = col
            if set_colours != None:
                colorTool.SetColor(label, set_colours[label.Tag()][5], XCAFDoc_ColorSurf)'''
            # Doing both of these things seems to result in colours being reset but the geometry doesn't get replaced
            nshape = simplify(currShape)
            shapeTool.SetShape(label, nshape)  # This doesn't work
    return labels.Length() > 0, cols
# Set up XCaf Document
app = XCAFApp_Application.GetApplication_()
fmt = TCollection_ExtendedString('MDTV-XCAF')
doc = TDocStd_Document(fmt)
app.InitDocument(doc)
shapeTool = XCAFDoc_DocumentTool.ShapeTool_(doc.Main())
colorTool = XCAFDoc_DocumentTool.ColorTool_(doc.Main())
# Import the step file
reader = STEPCAFControl_Reader()
reader.SetNameMode(True)
reader.SetColorMode(True)
Interface_Static.SetIVal_("read.stepcaf.subshapes.name", 1) # Tells the importer to import subshape names
reader.ReadFile("testcolours.step")
reader.Transfer(doc)
labels = TDF_LabelSequence()
shapeTool.GetShapes(labels)
# Simplify each shape that was imported
for i in range(1, labels.Length()+1):
    label = labels.Value(i)
    shape = shapeTool.GetShape_(label)
    # Assemblies are just made of other shapes, so we'll skip this and simplify them individually...
    if shapeTool.IsAssembly_(label):
        continue
    # This function call here is meant to be the fix for the bug described.
    # The idea was to check if the TopoDS_Shape we're looking at is a COMPOUND and if so we would simplify and call SetShape()
    # on each of the sub-shapes instead in an attempt to preserve the colours stored in the sub-shape's labels.
    #status, loadedCols = simplify_subshapes(label, shapeTool, colorTool)
    #if status:
    #    continue
    shape = simplify(shape)
    shapeTool.SetShape(label, shape)
    # The code gets a bit messy here because this was another attempt at fixing the problem by building a dictionary of colours
    # before the shapes were simplified and then resetting the colours of each subshape after simplification.
    # This didn't work either.
    # But the idea was to call this function once to generate the dictionary, then simplify, then call it again passing in the dictionary so it could be re-applied.
    #if status:
    #    simplify_subshapes(label, shapeTool, colorTool, loadedCols)
shapeTool.UpdateAssemblies()
# Re-export
writer = STEPCAFControl_Writer()
Interface_Static.SetIVal_("write.step.assembly", 2)
Interface_Static.SetIVal_("write.stepcaf.subshapes.name", 1)
writer.Transfer(doc, STEPControl_AsIs)
writer.Write("testcolours-simplified.step")
There's a lot of stuff here for a minimum reproducible example but the general flow of the program is that we import the step file:
reader.ReadFile("testcolours.step")
reader.Transfer(doc)
Then we iterate through each label in the file (essentially every node in the tree):
labels = TDF_LabelSequence()
shapeTool.GetShapes(labels)
# Simplify each shape that was imported
for i in range(1, labels.Length()+1):
    label = labels.Value(i)
    shape = shapeTool.GetShape_(label)
We skip any labels marked as assemblies, since they contain children and we only want to simplify individual bodies. We then call simplify(shape), which performs the simplification and returns a new shape, and then call shapeTool.SetShape() to bind the new shape to the old label.
The thing that doesn't work here is that, as explained, Component3 and Component4 don't get marked as Assemblies but are treated as SimpleShapes, and when each is simplified as one shape, the colours are lost.
One solution I attempted was to call a method simplify_subshapes() which would iterate through each of the subshapes and do the same thing as the main loop, simplifying them and then calling SetShape(). This ended up being even worse, as it resulted in those bodies not being simplified at all but still losing their colours.
I also attempted to use the simplify_subshapes() method to make a dictionary of all the colours of the subshapes, then simplify the COMPOUND shape and then call the same method again to this time re-apply the colours to the subshapes using the dictionary (the code for this is commented out with an explanation as to what it did).
col = find_color(label, colorTool)
#print(f"{name} RGBA: {col[0]:.5f} {col[1]:.5f} {col[2]:.5f} {col[3]:.5f} defined={col[4]}")
cols[label.Tag()] = col
if set_colours != None:
    colorTool.SetColor(label, set_colours[label.Tag()][5], XCAFDoc_ColorSurf)
As far as I see it the issue could be resolved either by getting open cascade to import Component3 and Component4 as Assemblies OR by finding a way to make SetShape() work as intended on subshapes.
Here's a link to the test file:
testcolours.step

How to save a qgis graph to a shapefile to use in networkx?

I created a graph from a layer using the code below. I want to save this graph to a shapefile for further use in networkx.
I don't want to also save it as a QGIS layer. So how can I simply save it without giving a layer as the first argument of writeAsVectorFormat?
And if I try to give an existing layer as the argument I get a strange bug: the code runs, and Windows shows the .shp file in the recent files in Windows Explorer, but when I want to open it it says that the file does not exist, and I also can't see it in the folder where it should be.
I also can't find how to just create a random layer, so if it's really needed to create a layer, can someone tell me how to do it?
Thank you for help
from qgis.analysis import *
vectorLayer = qgis.utils.iface.mapCanvas().currentLayer()
director = QgsVectorLayerDirector(vectorLayer, 12, '2.0', '3.0', '1.0', QgsVectorLayerDirector.DirectionBoth)
# The index of the field that contains information about the edge speed
attributeId = 1
# Default speed value
defaultValue = 50
# Conversion from speed to metric units ('1' means no conversion)
toMetricFactor = 1
strategy = QgsNetworkSpeedStrategy(attributeId, defaultValue, toMetricFactor)
director.addStrategy(strategy)
builder = QgsGraphBuilder(vectorLayer.crs())
startPoint = QgsPointXY(16.8346339,46.8931070)
endPoint = QgsPointXY(16.8376039,46.8971058)
tiedPoints = director.makeGraph(builder, [startPoint, endPoint])
graph = builder.graph()
vl = QgsVectorLayer("Point", "temp", "memory")
QgsVectorFileWriter.writeAsVectorFormat(vl, "zzsh.shp", "CP1250", vectorLayer.crs(), "ESRI Shapefile")
#QgsProject.instance().mapLayersByName('ZZMap07 copy')[0]

Process 100 feature classes through a script and add the feature class name to the end of each output

NOTE: Due to work constraints I must use Python 2.7 (I know - eyeroll) and standard modules. I'm still learning Python.
I have about 100 tiled 'area of interest' polygons in a geodatabase that need to be processed through my script. My script has been tested on individual tiles and works great. I need advice on how to iterate this process so I don't have to run one tile at a time. (I don't want to iterate ALL 100 at once in case something fails - I just want to make a list or something to run about 10-15 at a time.) I also need to add the name of the tile that I am processing to each feature class that I output.
So far I have tried using fnmatch.fnmatch, which errors because it does not accept a list. I changed the syntax to parentheses, which did NOT error but also did NOT print anything.
I figure once that naming piece is done, running the process in the for loop should work. Please advise on what I am doing wrong or whether there is a better way - thanks!
This is just a snippet of the full process:
tilename = 'T0104'
HIFLD_fc = os.path.join(work_dir, 'fc_clipped_lo' + tilename)
HIFLD_fc1 = os.path.join(work_dir, 'fc1_hifldstr_lo' + tilename)
HIFLD_fc2 = os.path.join(work_dir, 'fc2_non_ex_lo' + tilename)
HIFLD_fc3 = os.path.join(work_dir, 'fc3_no_wilder_lo' + tilename)
arcpy.env.workspace = (env_dir)
fcs = arcpy.ListFeatureClasses()
tile_list = ('AK1004', 'AK1005')
for tile in fcs:
    filename, ext = os.path.splitext(tile)
    if fnmatch.fnmatch(tile, tile_list):
        print(tile)
        arcpy.Clip_analysis(HIFLD_fc, bufferOut2, HIFLD_fc1, "")
        print('HIFLD clipped for analysis')
        arcpy.Clip_analysis(HIFLD_fc, env_mask, HIFLD_masked_rds, "")
        print('HIFLD clipped by envelopes and excluded from analysis')
        arcpy.Clip_analysis(HIFLD_masked_rds, wild_mask, HIFLD_excluded, "")
        print('HIFLD clipped by wilderness mask and excluded from analysis')
        arcpy.MakeFeatureLayer_management(HIFLD_fc1, 'hifld_lyr')
        arcpy.SelectLayerByLocation_management('hifld_lyr', "COMPLETELY_WITHIN", bufferOut1, "", "NEW_SELECTION", "INVERT")
        if arcpy.GetCount_management('hifld_lyr') > 0:
            arcpy.CopyFeatures_management('hifld_lyr', HIFLD_fc2)
            print('HIFLD split features deleted fc2')
        else:
            pass
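For reference, fnmatch.fnmatch compares one name against a single pattern string, not against a list or tuple, which is why passing tile_list into it fails. A minimal sketch of just the matching/naming piece, reusing the names from the snippet above (fcs, work_dir) and testing each pattern in turn:
import fnmatch
import os

tile_list = ['AK1004', 'AK1005']   # the subset of tiles to run in this pass

for tile in fcs:
    filename, ext = os.path.splitext(tile)
    # fnmatch takes (name, pattern), so test the name against each pattern separately
    if any(fnmatch.fnmatch(filename, pattern) for pattern in tile_list):
        tilename = filename   # carry the matched tile name into the output names
        HIFLD_fc1 = os.path.join(work_dir, 'fc1_hifldstr_lo' + tilename)
        print(tile)
        # ...rest of the per-tile processing as above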

Python sparse matrix creation - parallelize to speed up

I am creating a sparse matrix file by extracting the features from an input file. Each row of the input file contains one film ID followed by some feature IDs and each feature's score.
6729792 4:0.15568 8:0.198796 9:0.279261 13:0.17829 24:0.379707
The first number is the ID of the film; the value to the left of each colon is a feature ID and the value to the right is the score of that feature.
Each line represents one film, and the number of feature:score pairs varies from one film to another.
Here is how I construct my sparse matrix.
import sys
import os
import os.path
import time
import json
import numpy as np
import tables as tb  # PyTables, used below as tb.open_file / tb.Filters (missing from the original snippet)
from Film import Film
import scipy
from scipy.sparse import coo_matrix, csr_matrix, rand
def sparseCreate(self, Debug):
    a = rand(self.total_rows, self.total_columns, format='csr')
    l, m = a.shape[0], a.shape[1]
    f = tb.open_file("sparseFile.h5", 'w')
    filters = tb.Filters(complevel=5, complib='blosc')
    data_matrix = f.create_carray(f.root, 'data', tb.Float32Atom(), shape=(l, m), filters=filters)
    index_film = 0
    input_data = open('input_file.txt', 'r')
    for line in input_data:
        my_line = np.array(line.split())
        id_film = my_line[0]
        my_line = np.core.defchararray.split(my_line[1:], ":")
        self.data_matrix_search_normal[str(id_film)] = index_film
        self.data_matrix_search_reverse[index_film] = str(id_film)
        for element in my_line:
            if int(element[0]) in self.selected_features:
                column = self.index_selected_feature[str(element[0])]
                data_matrix[index_film, column] = float(element[1])
        index_film += 1
    self.selected_matrix = data_matrix
    json.dump(self.data_matrix_search_reverse,
              open(os.path.join(self.output_path, "data_matrix_search_reverse.json"), 'wb'),
              sort_keys=True, indent=4)
    my_films = Film(
        self.selected_matrix, self.data_matrix_search_reverse, self.path_doc, self.output_path)
    x_matrix_unique = self.selected_matrix[:, :]
    r_matrix_unique = np.asarray(x_matrix_unique)
    f.close()
    return my_films
Question:
I feel that this function is too slow on big datasets, and it takes too long to calculate.
How can I improve and accelerate it? Maybe using MapReduce? What is wrong in this function that makes it so slow?
IO + conversions (from str, to str, even twice to str for the same variable, etc.) + splits + explicit loops. By the way, there is a csv Python module which may be used to parse your input file; you can experiment with it (I suppose you use space as the delimiter). Also, I see you convert element[0] to int/str, which is bad - you create many temporary objects. If you call this function several times, you may try to reuse some internal objects (the array?). Also, you can try to implement it in another style, with map or a list comprehension, but experiments are needed...
The general idea of Python code optimization is to avoid explicit Python byte-code execution and to prefer native/C Python functions wherever possible. And certainly try to reduce the number of conversions. Also, if the input file is yours, you can format it with fixed-length fields - this lets you avoid splitting/parsing entirely (only string indexing).
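As a rough illustration of that advice (prefer built-in parsing, avoid per-element writes and repeated conversions), here is a sketch that accumulates the triplets in plain lists and builds a scipy coo_matrix in one call; it deliberately skips the PyTables carray and the class attributes from the question, so names like selected_features and index_selected_feature are just stand-ins for the ones used above:
from scipy.sparse import coo_matrix

def build_sparse(path, selected_features, index_selected_feature, n_rows, n_cols):
    # Parse lines of the form "film_id feat:score feat:score ..." into a COO matrix.
    rows, cols, vals = [], [], []
    film_index = {}                      # film id -> row number
    with open(path) as fh:
        for row, line in enumerate(fh):
            tokens = line.split()
            film_index[tokens[0]] = row
            for token in tokens[1:]:
                feat, score = token.split(":")
                if int(feat) in selected_features:
                    rows.append(row)
                    cols.append(index_selected_feature[feat])
                    vals.append(float(score))
    return coo_matrix((vals, (rows, cols)), shape=(n_rows, n_cols)), film_index
A single coo_matrix construction (followed by .tocsr() if needed) is usually much faster than assigning elements one at a time.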
