How to get a Python pathlib Path from an Azure blob datastore?

I am trying to do some custom manipulation of a torch.utils.data.DataLoader in AzureML but cannot get it to instantiate directly from my azureml.core.Datastore :
ws = Workspace(  # ... etc ...
)
ds = Datastore.get(ws, datastore_name='my_ds')
am = ds.as_mount()
# HOW DO I GET base_path, data_file from am?
dataloader = DataLoader(
    ListDataset(base_path, data_file),  # ... etc ...
)
The value of am.path() is "$AZUREML_DATAREFERENCE_my_ds" but I cannot figure out how to go from that to a pathlib.Path as is expected by the constructor to ListDataset. Things I've tried include Path(am.path()) and Path(os.environ[am.path()]) but they don't seem to work.
It's clear that there's some answer, since:
script_params = {
    '--base_path': ds.as_mount(),
    '--epochs': 30,
    '--batch_size': 16,
    '--use_cuda': 'true'
}
torch = PyTorch(source_directory='./',
                script_params=script_params,
                compute_target=compute_target,
                entry_script='train.py',
                pip_packages=packages,
                use_gpu=True)
seems to create a legit object.

You could try the DataPath class. It exposes attributes such as path_on_datastore, which might be the path you're looking for.
To construct it from your DataReference object (the variable am), you can use the create_from_data_reference() method.
Example:
ds = Datastore.get(ws, datastore_name='my_ds')
am = ds.as_mount()
dp = DataPath().create_from_data_reference(am)
base_path = dp.path_on_datastore

The above code generated an error for me. Removing the parentheses after DataPath, as below, made the code run:
ds = Datastore.get(ws, datastore_name='my_ds')
am = ds.as_mount()
dp = DataPath.create_from_data_reference(am)
base_path = dp.path_on_datastore
Thank you for the code snippet, very useful!
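For the original question, one thing worth checking is that the value of am.path() starts with a `$`, so it looks like a shell-style reference to an environment variable rather than the variable's name. Below is a minimal sketch, assuming the mount point is exposed through an environment variable of that name inside the run. The helper name resolve_mount_path is hypothetical and not part of the AzureML SDK.

```python
import os
from pathlib import Path


def resolve_mount_path(reference: str) -> Path:
    """Expand a '$NAME'-style reference into a pathlib.Path.

    Assumption: the mount location is stored in an environment variable
    called NAME. Raises KeyError when that variable is not set, instead of
    returning a path that still contains the literal '$NAME'.
    """
    expanded = os.path.expandvars(reference)
    # expandvars leaves unknown variables untouched, so detect that case.
    if expanded == reference and reference.startswith("$"):
        raise KeyError(f"environment variable {reference[1:]} is not set")
    return Path(expanded)
```

Note that os.environ[am.path()] looks up a key that still includes the `$`, so it would raise a KeyError even where the variable itself is set.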

Related

How to solve great expectations "MetricResolutionError: Cannot compile Column object until its 'name' is assigned." Error?

I am trying to use Great Expectations. The function I want to use is "expect_compound_columns_to_be_unique".
This is the code (the main code comes from the template):
import datetime
import pandas as pd
import great_expectations as ge
import great_expectations.jupyter_ux
from great_expectations.core.batch import BatchRequest
from great_expectations.checkpoint import SimpleCheckpoint
from great_expectations.exceptions import DataContextError
context = ge.data_context.DataContext()
# Note that if you modify this batch request, you may save the new version as a .json file
# to pass in later via the --batch-request option
batch_request = {'datasource_name': 'impala_okh', 'data_connector_name': 'default_inferred_data_connector_name', 'data_asset_name': 'okh.okh_forecast_prod', 'limit': 1000}
# Feel free to change the name of your suite here. Renaming this will not remove the other one.
expectation_suite_name = "okh_forecast_prod"
try:
    suite = context.get_expectation_suite(expectation_suite_name=expectation_suite_name)
    print(f'Loaded ExpectationSuite "{suite.expectation_suite_name}" containing {len(suite.expectations)} expectations.')
except DataContextError:
    suite = context.create_expectation_suite(expectation_suite_name=expectation_suite_name)
    print(f'Created ExpectationSuite "{suite.expectation_suite_name}".')
validator = context.get_validator(
    batch_request=BatchRequest(**batch_request),
    expectation_suite_name=expectation_suite_name
)
column_names = [f'"{column_name}"' for column_name in validator.columns()]
print(f"Columns: {', '.join(column_names)}.")
validator.head(n_rows=5, fetch_all=False)
This is the call that raises the error:
validator.expect_compound_columns_to_be_unique(['column1', 'column2'])
Then I get the following error:
MetricResolutionError: Cannot compile Column object until its 'name' is assigned.
How can I solve this problem?

Access STEP Instance IDs with PythonOCC

Let's suppose I'm using this STEP file data as input:
#417=ADVANCED_FACE('face_1',(#112),#405,.F.);
#418=ADVANCED_FACE('face_2',(#113),#406,.F.);
#419=ADVANCED_FACE('face_3',(#114),#407,.F.);
I'm using pythonocc-core to read the STEP file.
Then the following code will print the names of the ADVANCED_FACE instances (face_1,face_2 and face_3):
from OCC.Core.STEPControl import STEPControl_Reader
from OCC.Core.TopExp import TopExp_Explorer
from OCC.Core.TopAbs import TopAbs_FACE
from OCC.Core.StepRepr import StepRepr_RepresentationItem
reader = STEPControl_Reader()
tr = reader.WS().TransferReader()
reader.ReadFile('model.stp')
reader.TransferRoots()
shape = reader.OneShape()
exp = TopExp_Explorer(shape, TopAbs_FACE)
while exp.More():
    s = exp.Current()
    exp.Next()
    item = tr.EntityFromShapeResult(s, 1)
    item = StepRepr_RepresentationItem.DownCast(item)
    name = item.Name().ToCString()
    print(name)
How can I access the identifiers of the individual shapes? (#417,#418 and #419)
Minimal reproduction
https://github.com/flolu/step-occ-instance-ids
Create a STEP model after reader.TransferRoots() like this:
model = reader.StepModel()
And access the ID like this in the loop:
id = model.IdentLabel(item)
The full code looks like this and can also be found on GitHub:
from OCC.Core.STEPControl import STEPControl_Reader
from OCC.Core.TopExp import TopExp_Explorer
from OCC.Core.TopAbs import TopAbs_FACE
from OCC.Core.StepRepr import StepRepr_RepresentationItem
reader = STEPControl_Reader()
tr = reader.WS().TransferReader()
reader.ReadFile('model.stp')
reader.TransferRoots()
model = reader.StepModel()
shape = reader.OneShape()
exp = TopExp_Explorer(shape, TopAbs_FACE)
while exp.More():
    s = exp.Current()
    exp.Next()
    item = tr.EntityFromShapeResult(s, 1)
    item = StepRepr_RepresentationItem.DownCast(item)
    label = item.Name().ToCString()
    id = model.IdentLabel(item)
    print('label', label)
    print('id', id)
Thanks to temurka1 for pointing this out!
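As a cross-check that does not depend on pythonocc at all, the instance numbers can also be read straight from the STEP text. This is only a rough sketch, assuming face names are unique and contain no escaped quotes. The function name face_ids_by_name is made up for illustration.

```python
import re

# Matches entries such as: #417=ADVANCED_FACE('face_1',(#112),#405,.F.);
_FACE_RE = re.compile(r"#(\d+)\s*=\s*ADVANCED_FACE\s*\(\s*'([^']*)'")


def face_ids_by_name(step_text: str) -> dict:
    """Map each ADVANCED_FACE name to its STEP instance number."""
    return {name: int(num) for num, name in _FACE_RE.findall(step_text)}


sample = """#417=ADVANCED_FACE('face_1',(#112),#405,.F.);
#418=ADVANCED_FACE('face_2',(#113),#406,.F.);
#419=ADVANCED_FACE('face_3',(#114),#407,.F.);"""

print(face_ids_by_name(sample))
```

The resulting mapping can be compared with what model.IdentLabel(item) returns for the same faces.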
I was unable to run your code because of issues installing the pythonocc module. However, I suspect you can inspect the StepRepr_RepresentationItem object (before the string conversion) through its __dict__ to discover whatever attributes, properties, or methods you need:
entity = tr.EntityFromShapeResult(s, 1)
item = StepRepr_RepresentationItem.DownCast(entity)
print(entity.__dict__)
print(item.__dict__)
If necessary the inspect module exists to pry deeper into the object.
References
https://docs.python.org/3/library/stdtypes.html#object.__dict__
https://docs.python.org/3/library/inspect.html
https://github.com/tpaviot/pythonocc-core/blob/66d6e1ef6b7552a1110a90e86a1ed34eb12ecf16/src/SWIG_files/wrapper/StepElement.pyi

openai.error.InvalidRequestError: Engine not found

I tried running the OpenAI example "Explain code",
but it fails with this error:
InvalidRequestError: Engine not found
response = openai.Completion.create(
engine="code-davinci-002",
prompt="class Log:\n def __init__(self, path):\n dirname = os.path.dirname(path)\n os.makedirs(dirname, exist_ok=True)\n f = open(path, \"a+\")\n\n # Check that the file is newline-terminated\n size = os.path.getsize(path)\n if size > 0:\n f.seek(size - 1)\n end = f.read(1)\n if end != \"\\n\":\n f.write(\"\\n\")\n self.f = f\n self.path = path\n\n def log(self, event):\n event[\"_event_id\"] = str(uuid.uuid4())\n json.dump(event, self.f)\n self.f.write(\"\\n\")\n\n def state(self):\n state = {\"complete\": set(), \"last\": None}\n for line in open(self.path):\n event = json.loads(line)\n if event[\"type\"] == \"submit\" and event[\"success\"]:\n state[\"complete\"].add(event[\"id\"])\n state[\"last\"] = event\n return state\n\n\"\"\"\nHere's what the above class is doing:\n1.",
temperature=0,
max_tokens=64,
top_p=1.0,
frequency_penalty=0.0,
presence_penalty=0.0,
stop=["\"\"\""]
)
I had been trying to access the engine named code-davinci-002, which is in private beta, so it cannot be used without access. It seems only the GPT-3 models are publicly available. You need to join the OpenAI Codex Private Beta Waitlist in order to access Codex models through the API.
Please note that your code is not very readable.
However, from the given error, I think it has to do with the missing colon : in the engine name.
Change this line from:
engine="code-davinci-002",
to
engine="code-davinci:002",
If you are using a finetuned model instead of an engine, you'd want to use model= instead of engine=.
response = openai.Completion.create(
model="<finetuned model>",
prompt=

How to instantiate an ontology using rdflib?

I have an ontology where I have defined series of classes, subclasses and properties. Now I want to automatically instantiate the ontology with Python code and save it in RDF/XML again and load it in Protege. I have written the following code:
def instantiating_ontology(rdf_address):
    from rdflib import *
    g = Graph()
    input_RDF = g.parse(rdf_address)
    #input_RDF = g.open(rdf_address, create=False)
    myNamespace="http://www.semanticweb.org/.../ontologies/2015/3/RNO_V5042_RDF"
    rno = Namespace(myNamespace+"#")
    nodeClass = URIRef(rno+"Node")
    arcClass = URIRef(rno+"Arc")
    #owlNamespace = 'http://www.w3.org/2002/07/owl#NamedIndividual'
    namedIndividual = URIRef('http://www.w3.org/2002/07/owl#NamedIndividual')
    rdftype = URIRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
    for i in range(0,100):
        individualName = rno + "arc_"+str(arcID)
        #arc_individual= BNode(individualName)
        arc_individual = BNode()
        #g.add()
        #g.add((arc_individual,rdftype, namedIndividual))
        g.add((arc_individual,rdftype, arcClass))
        g.add((arc_individual,rdftype, arcClass))
    #g.commit()
    output_address ="RNO_V5042_RDF.owl"
    g.serialize(destination = output_address)
The file contains the added triples to the rdf/xml:
<rdf:Description rdf:nodeID="N0009844208f0490887a02160fbbf8b98">
<rdf:type rdf:resource="http://www.semanticweb.org/ehsan.abdolmajidi/ontologies/2015/3/RNO_V5042#Arc"/>
but when I open the file in Protege there are no instances for the classes.
Can someone tell me if the way I defined instances is wrong or I should use different tags?
After playing around with the code and the results, I realized that rdf:nodeID should be replaced with rdf:about. To do so, I only needed to change the loop:
for i in range(0,100):
    individualName = rno + "arc_"+str(arcID)
    #arc_individual= BNode(individualName)
    arc_individual = BNode() #---> remove this one
    arc_individual = URIRef(individualName) #----> add this one
    g.add((arc_individual,rdftype, arcClass))
    g.add((arc_individual,rdftype, arcClass))
    arc_individual = URIRef(individualName)
That might seem easy, but it took me some time to understand. I hope this can help others. :D

LibreOffice - How to create a file dialog via python macro?

I'd like to know if it's possible to create a standard file dialog to save a pdf via a python macro. I've tried to write some code based on this outdated documentation: wiki.openoffice.org but LibreOffice crashes after execution:
import os
import uno
import sys
import traceback
from com.sun.star.ui.dialogs.TemplateDescription import FILESAVE_SIMPLE
def file_dialog():
    try:
        oCtx = uno.getComponentContext()
        oServiceManager = oCtx.getServiceManager()
        oFilePicker = oServiceManager.createInstanceWithArgumentsAndContext(
            'com.sun.star.ui.dialogs.FilePicker',
            (FILESAVE_SIMPLE,),
            oCtx
        )
        oFilePicker.Title = 'Export as'
        #oDisp = oFilePicker.Text
        oFilePicker.execute()
    except:
        pass
        #oDisp = traceback.format_exc(sys.exc_info()[2])
In the end I need to pass the selected path on to write the document, but oDisp = oFilePicker.Text fails with <type 'exceptions.AttributeError'>. Is there also a way to set the file type?
Does anyone have experience with it?
I used Xray on the oFilePicker object. There are a couple of interesting methods called setCurrentFilter and appendFilterGroup. Just based on the names, they might be used to filter what file types are visible. Unfortunately I'm not sure how to use them.
Also with Xray, I determined that Text is not a method or property of the oFilePicker object, and I'm not sure what the code snippet is trying to do there. If the goal is to retrieve the file path, 1) that needs to be done after the .execute call, and 2) the selected file path is stored in an array of strings, so the path has to be pulled out of the array. Most of my work in OpenOffice is in StarBasic; below is a working example in Basic that prints the file path selected by the user:
Sub TestFilePicker
    oFilePickerDlg = createUnoService( "com.sun.star.ui.dialogs.FilePicker" )
    oFilePickerDlg.setTitle("My test title")
    If oFilePickerDlg.execute() > 0 Then
        Print ConvertFromURL(oFilePickerDlg.Files(0))
    End If
End Sub
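The picker hands back a file URL rather than a system path, which is why the Basic example calls ConvertFromURL. In a Python macro, uno.fileUrlToSystemPath should do the same job. Outside LibreOffice, a standard-library sketch of the conversion could look like this (the function name file_url_to_path is made up):

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path(url: str) -> str:
    """Convert a file:// URL, as returned by a file picker, to a local path."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    # url2pathname undoes percent-encoding such as %20 for spaces.
    return url2pathname(parsed.path)
```

On a POSIX system, file_url_to_path("file:///tmp/My%20Report.pdf") gives /tmp/My Report.pdf.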
Answer given and accepted (because the question was cross posted!) here:
import uno
from com.sun.star.beans import PropertyValue
# shortcut:
createUnoService = (
    XSCRIPTCONTEXT
    .getComponentContext()
    .getServiceManager()
    .createInstance
)
def pypdf_test():
    desktop = XSCRIPTCONTEXT.getDesktop()
    doc = desktop.getCurrentComponent()
    # filter data
    fdata = []
    fdata1 = PropertyValue()
    fdata1.Name = "SelectPdfVersion"
    fdata1.Value = 1
    fdata2 = PropertyValue()
    fdata2.Name = "Quality"
    fdata2.Value = 100
    fdata.append(fdata1)
    fdata.append(fdata2)
    # export arguments: the PDF filter name plus the filter data above
    args = []
    arg1 = PropertyValue()
    arg1.Name = "FilterName"
    arg1.Value = "writer_web_pdf_Export"
    arg2 = PropertyValue()
    arg2.Name = "FilterData"
    arg2.Value = uno.Any("[]com.sun.star.beans.PropertyValue", tuple(fdata) )
    args.append(arg1)
    args.append(arg2)
    fileurl = FilePicker()
    if fileurl:
        doc.storeToURL( fileurl, tuple(args) )
def FilePicker(path=None, mode=1):
    """
    Open file: `mode in (0, 6, 7, 8, 9)`
    Write file: `mode in (1, 2, 3, 4, 5, 10)`
    see: ('''http://api.libreoffice.org/docs/idl/ref/
    namespacecom_1_1sun_1_1star_1_1ui_1_1
    dialogs_1_1TemplateDescription.html''' )
    """
    filepicker = createUnoService( "com.sun.star.ui.dialogs.OfficeFilePicker" )
    if path:
        filepicker.setDisplayDirectory(path )
    filepicker.initialize( ( mode,) )
    # execute() is truthy when the user confirmed the dialog
    if filepicker.execute():
        return filepicker.getFiles()[0]
