download from URL Kubeflow pipeline - python

I have been trying to create a Kubeflow pipeline on my local machine and created a simple one-component pipeline to download data from a given URL.
import kfp
import kfp.components as comp

downloader_op = kfp.components.load_component_from_url(path)

def my_pipeline(url):
    downloader_task = downloader_op(url=url)
This works fine if I execute the pipeline through Python code:
client = kfp.Client()
client.create_run_from_pipeline_func(my_pipeline, arguments={
    'url': 'https://storage.googleapis.com/ml-pipeline-playground/iris-csv-files.tar.gz'
})
But it shows an error when I try to execute the same pipeline through the Kubeflow UI.
First I execute the code above, which creates a YAML file using the line below, and then I upload that YAML file in the Kubeflow UI to create a new run.
kfp.compiler.Compiler().compile(pipeline_func=my_pipeline,package_path='pipeline.yaml')
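For reference, here are the snippets above combined into one script (a sketch only, assuming the kfp v1 SDK; component_url stands for the component YAML URL that the path variable pointed to in the original code):

import kfp
import kfp.dsl as dsl

component_url = '<URL of the downloader component YAML>'  # placeholder
downloader_op = kfp.components.load_component_from_url(component_url)

@dsl.pipeline(name='download-pipeline')
def my_pipeline(url: str):
    downloader_task = downloader_op(url=url)

kfp.compiler.Compiler().compile(pipeline_func=my_pipeline, package_path='pipeline.yaml')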
The run created from the uploaded YAML then fails with the error below:
This step is in Error state with this message: Error (exit code 1): cannot enter chroot for container named "main": no PID known - maybe short running container
I am not able to understand why it causes problems through the UI and what the solution is.

Related

AWS CDK Python: Error when trying to create A Record using zone from lookup

Using Python 3.8, CDK 2.19.0.
I want to create an A Record against a hosted zone that's already in my AWS account.
I am doing the following:
hosted_zone = route53.HostedZone.from_hosted_zone_attributes(self, "zone",
    zone_name="my.awesome.zone.",
    hosted_zone_id="ABC12345DEFGHI"
)

route53.ARecord(self, "app_record_set",
    target=self.lb.load_balancer_dns_name,  # this is declared above, and works fine.
    zone=hosted_zone,
    record_name="test-cdk.my.awesome.zone"
)
Inside my app.py I have:
env_EU = cdk.Environment(account="12345678901112", region="eu-west-1")
app = cdk.App()
create_a_record = DomianName(app, "DomianName", env=env_EU)
When I run cdk synth I get the following error:
➜ cdk synth
jsii.errors.JavaScriptError:
Error: Expected object reference, got "${Token[TOKEN.303]}"
File ".../.venv/lib/python3.8/site-packages/jsii/_kernel/providers/process.py", line 326, in send
...(full traceback)
Subprocess exited with error 1
I've tried from_lookup (rather than from_hosted_zone_attributes) and Python 3.9 / Node 17/16/12 (just in case), but nothing helps. I get the same error every time.
If I comment out the A Record creation, then the synth completes as expected.
cdk.context.json also has the correct hosted zone cached, but that only happens if I comment out the A record creation.
The ARecord target parameter expects a RecordTarget; you are passing a raw string (a token from load_balancer_dns_name). Use RecordTarget.from_alias with a LoadBalancerTarget:
import aws_cdk.aws_elasticloadbalancingv2 as elbv2
import aws_cdk.aws_route53_targets as targets

# zone: route53.HostedZone
# lb: elbv2.ApplicationLoadBalancer
route53.ARecord(self, "AliasRecord",
    zone=zone,
    target=route53.RecordTarget.from_alias(targets.LoadBalancerTarget(lb))
)

How to dump and utilize multiple ML algorithm objects in one single pickle file in Azure ML workspace?

I am trying to create an ML model in an Azure ML workspace using a Jupyter notebook. I am not using the AutoML feature or the Designer provided by Azure, and I want to run the complete code prepared locally.
My ML model uses 3 different algorithm objects. I am confused about how I can save all the objects in one single pickle file, which I can then use in the "Inference configuration" and the "Score.py" file. Also, once saved, how can I access them in the "Score.py" file (which is the main file that contains the driver code)?
Currently I am using the following method:
import pickle

f = 'prediction.pkl'
all_models = [Error_Message_countvector, ErrorMessage_tfidf_fit, model_naive]
with open(f, 'wb') as files:
    pickle.dump(all_models, files)
and to access the objects:
cv_output = loaded_model[0].transform(input_series)
tfidf_output = loaded_model[1].transform(cv_output)
loaded_model_prediction = loaded_model[2].predict(tfidf_output)
Somehow, this method works fine when I run it in the same cell as the rest of the code, but it throws an error when I deploy the complete model.
My "Score.py" file looks something like this:
import json
from azureml.core.model import Model
import joblib
import pandas as pd

def init():
    global prediction_model
    prediction_model_path = Model.get_model_path("prediction")
    prediction_model = joblib.load(prediction_model_path)

def run(data):
    try:
        data = json.loads(data)
        input_string = str(data['errorMsg']).strip()
        input_series = pd.Series(input_string)
        cv_output = prediction_model[0].transform(input_series)
        tfidf_output = prediction_model[1].transform(cv_output)
        result = prediction_model[2].predict(tfidf_output)
        return {'response': result}
    except Exception as e:
        error = str(e)
        return {'response': error}
and the error received on deployment is:
Error:
{
    "code": "AciDeploymentFailed",
    "statusCode": 400,
    "message": "Aci Deployment failed with exception: Error in entry script, AttributeError: module '__main__' has no attribute 'text_cleaning', please run print(service.get_logs()) to get details.",
    "details": [
        {
            "code": "CrashLoopBackOff",
            "message": "Error in entry script, AttributeError: module '__main__' has no attribute 'text_cleaning', please run print(service.get_logs()) to get details."
        }
    ]
}
Can anyone help me understand the issue or figure out if there is something missing/wrong in the code?
What is the right way of saving multiple algorithm objects in one single pickle file?
> Can anyone help me understand the issue or figure out if there is something missing/wrong in the code?
From your error message:
"Error in entry script, AttributeError: module '__main__' has no attribute 'text_cleaning'..."
It seems like your first step, the cv_output transform from prediction_model[0], is trying to call a function named text_cleaning which is not defined or imported in your scoring script.
> What is the right way of saving multiple algorithm objects in one single pickle file?
If you want to persist a sequence of transformations, like the one in your example, the best practice is to use the Pipeline class from sklearn:
https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
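For instance, with the objects from the question the whole sequence could be wrapped like this (a minimal sketch; the toy training data only stands in for your own corpus and labels):

import joblib
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

# toy training data standing in for your error-message corpus
train_texts = ["disk full", "connection timed out", "disk quota exceeded"]
train_labels = ["storage", "network", "storage"]

# one estimator chaining the count vectorizer, the TF-IDF step and the classifier
model = Pipeline([
    ('count', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultinomialNB()),
])
model.fit(train_texts, train_labels)

# a single object is persisted; score.py then only needs joblib.load(...).predict(...)
joblib.dump(model, 'prediction.pkl')

Note that any custom function such as text_cleaning that the vectorizer depends on still has to be importable in the scoring environment, otherwise the same AttributeError will come back when the pickle is loaded.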

Create file system/container if not found

I'm trying to export a CSV to Azure Data Lake Storage, but when the file system/container does not exist, the code breaks. I have also read through the documentation, but I cannot seem to find anything helpful for this situation.
How do I go about creating a container in Azure Data Lake Storage if the container specified by the user does not exist?
Current Code:
try:
    file_system_client = service_client.get_file_system_client(file_system="testfilesystem")
except Exception:
    file_system_client = service_client.create_file_system(file_system="testfilesystem")
Traceback:
(FilesystemNotFound) The specified filesystem does not exist.
RequestId:XXXX
Time:2021-03-31T13:39:21.8860233Z
The try/except pattern should not be used here, since the Azure Data Lake Gen2 library has a built-in exists() method on file_system_client.
First, make sure you've installed the latest version of the library: azure-storage-file-datalake 12.3.0. If you're not sure which version you're using, run pip show azure-storage-file-datalake to check the current version.
Then you can use the code below:
from azure.storage.filedatalake import DataLakeServiceClient

service_client = DataLakeServiceClient(
    account_url="{}://{}.dfs.core.windows.net".format("https", "xxx"),
    credential="xxx")

# the get_file_system_client method will not throw an error if the file system
# does not exist, if you're using the latest library 12.3.0
file_system_client = service_client.get_file_system_client("filesystem333")
print("the file system exists: " + str(file_system_client.exists()))

# create the file system if it does not exist
if not file_system_client.exists():
    file_system_client.create_file_system()
    print("the file system is created.")

# other code
I've tested it locally and it works successfully.
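Once the file system is guaranteed to exist, the CSV itself can be written with a file client. A small sketch of that next step (the file name and CSV content are placeholders, not from the question):

csv_bytes = "col1,col2\n1,2\n".encode("utf-8")

# create_file returns a DataLakeFileClient for the new path
file_client = file_system_client.create_file("output.csv")
file_client.append_data(csv_bytes, offset=0, length=len(csv_bytes))
file_client.flush_data(len(csv_bytes))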

Upload file using telegram-upload in flask?

We can upload a file using the telegram-upload library by running the following command in the terminal:
telegram-upload file1.mp4 /path/to/file2.mkv
But if I want to call this inside a Python function, how should I do it? I mean, if a user passes the file path as an argument to a Python function, that function should be able to upload the file to the Telegram server. It is not mentioned in the documentation.
In other words, how do I execute or run shell commands from inside a Python function?
For telegram-upload you can use the upload method in telegram_upload.management, and for telegram-download the download method in the same file.
Or you can see how they are implemented there:
from telegram_upload.client import Client
from telegram_upload.config import default_config, CONFIG_FILE
from telegram_upload.exceptions import catch
from telegram_upload.files import NoDirectoriesFiles, RecursiveFiles

DIRECTORY_MODES = {
    'fail': NoDirectoriesFiles,
    'recursive': RecursiveFiles,
}

def upload(files, to, config, delete_on_success, print_file_id, force_file, forward, caption, directories,
           no_thumbnail):
    """Upload one or more files to Telegram using your personal account.

    The maximum file size is 1.5 GiB and by default they will be saved in
    your saved messages.
    """
    client = Client(config or default_config())
    client.start()
    files = DIRECTORY_MODES[directories](files)
    if directories == 'fail':
        # Validate now
        files = list(files)
    client.send_files(to, files, delete_on_success, print_file_id, force_file, forward, caption, no_thumbnail)
I found the solution. Using the os module we can run command-line strings inside a Python function, e.g. os.system('telegram-upload file1.mp4 /path/to/file2.mkv').
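If you prefer not to build a shell string, a hedged alternative is subprocess.run, which takes the file path as a list element and lets you check the exit code (this assumes the telegram-upload executable is on PATH):

import subprocess

def upload_to_telegram(file_path: str) -> None:
    # raises CalledProcessError if telegram-upload exits with a non-zero status
    subprocess.run(['telegram-upload', file_path], check=True)

upload_to_telegram('/path/to/file2.mkv')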

pyinstaller Error starting service: The service did not respond to the start or control request in a timely fashion

I have been searching for a couple of days for a solution, without success.
We have a Windows service built to copy some files from one location to another.
So I built the code shown below with Python 3.7.
The full code can be found on GitHub.
When I run the service using Python, everything works fine: I can install the service and also start it.
Using these commands:
Install the service:
python jis53_backup.py install
Run the service:
python jis53_backup.py start
I then compile this code using PyInstaller with the command:
pyinstaller -F --hidden-import=win32timezone jis53_backup.py
After the exe is created, I can install the service, but when trying to start the service I get the error:
Error starting service: The service did not respond to the start or
control request in a timely fashion
I have gone through multiple posts on Stack Overflow and on Google related to this error, however without success. I don't have the option to install Python 3.7 on the PCs that would need to run this service. That's why we are trying to get a .exe build.
I have made sure the path is updated according to the information I found in the different questions.
Image of path definitions:
I also copied the pywintypes37.dll file.
From -> Python37\Lib\site-packages\pywin32_system32
To -> Python37\Lib\site-packages\win32
Does anyone have any other suggestions on how to get this working?
'''
Windows service to copy a file from one location to another
at a certain interval.
'''
import sys
import time
from distutils.dir_util import copy_tree

import servicemanager
import win32serviceutil
import win32service

from HelperModules.CheckFileExistance import check_folder_exists, create_folder
from HelperModules.ReadConfig import (check_config_file_exists,
                                      create_config_file, read_config_file)
from ServiceBaseClass.SMWinService import SMWinservice

sys.path += ['filecopy_service/ServiceBaseClass',
             'filecopy_service/HelperModules']


class Jis53Backup(SMWinservice):
    _svc_name_ = "Jis53Backup"
    _svc_display_name_ = "JIS53 backup copy"
    _svc_description_ = "Service to copy files from server to local drive"

    def start(self):
        self.conf = read_config_file()
        if not check_folder_exists(self.conf['dest']):
            create_folder(self.conf['dest'])
        self.isrunning = True

    def stop(self):
        self.isrunning = False

    def main(self):
        self.ReportServiceStatus(win32service.SERVICE_RUNNING)
        while self.isrunning:
            # Copy the files from the server to a local folder
            # TODO: build function to trigger only when a file is changed.
            copy_tree(self.conf['origin'], self.conf['dest'], update=1)
            time.sleep(30)


if __name__ == '__main__':
    if sys.argv[1] == 'install':
        if not check_config_file_exists():
            create_config_file()
    if len(sys.argv) == 1:
        servicemanager.Initialize()
        servicemanager.PrepareToHostSingle(Jis53Backup)
        servicemanager.StartServiceCtrlDispatcher()
    else:
        win32serviceutil.HandleCommandLine(Jis53Backup)
I was also facing this issue after compiling with PyInstaller. For me, the issue was that I was building the paths to the config and log files dynamically, for example:
curr_path = os.path.dirname(os.path.abspath(__file__))
configs_path = os.path.join(curr_path, 'configs', 'app_config.json')
opc_configs_path = os.path.join(curr_path, 'configs', 'opc.json')
log_file_path = os.path.join(curr_path, 'logs', 'application.log')
This was working fine when I started the service using python service.py install/start, but after compiling it with PyInstaller, it always gave me the error about not starting in a timely fashion.
To resolve this, I changed all the dynamic paths to static ones, for example:
configs_path = 'C:\\Program Files (x86)\\ScantechOPC\\configs\\app_config.json'
opc_configs_path = 'C:\\Program Files (x86)\\ScantechOPC\\configs\\opc.json'
debug_file = 'C:\\Program Files (x86)\\ScantechOPC\\logs\\application.log'
After compiling via PyInstaller, it is now working fine without any error. It looks like with dynamic paths the service does not resolve the actual path to the files, and therefore fails.
Hope this solves your problem too. Thanks
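If hardcoding absolute paths is undesirable, a commonly used alternative (a sketch, not part of the answer above) is to resolve the base directory from sys.executable when the service runs as a PyInstaller-frozen exe:

import os
import sys

# PyInstaller sets sys.frozen at runtime; sys.executable then points to the exe
if getattr(sys, 'frozen', False):
    base_dir = os.path.dirname(sys.executable)
else:
    base_dir = os.path.dirname(os.path.abspath(__file__))

configs_path = os.path.join(base_dir, 'configs', 'app_config.json')
log_file_path = os.path.join(base_dir, 'logs', 'application.log')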
