Running mlflow ui in AWS Sagemaker - python

I want to run the MLflow UI in SageMaker, but it does not work. When it outputs the HTTP address, going to it results in a "this site cannot be reached" error.
Here is the code:
import mlflow


def mlflow_test(server_uri, experiment_name):
    mlflow.set_tracking_uri(server_uri)
    mlflow.set_experiment(experiment_name)
    # The context manager ends the run on exit, so no explicit end_run() is needed.
    with mlflow.start_run():
        params = {
            "n-estimators": 100,
            "min-samples-leaf": 10,
            "features": "feature_test",
        }
        mlflow.log_params(params)
        mlflow.log_metric("foo", 5)
Running that code returns:
[2022-05-24 15:48:44 +0000] [27820] [INFO] Starting gunicorn 20.1.0
[2022-05-24 15:48:44 +0000] [27820] [INFO] Listening at: http://127.0.0.1:5000 (27820)
[2022-05-24 15:48:44 +0000] [27820] [INFO] Using worker: sync
[2022-05-24 15:48:44 +0000] [27823] [INFO] Booting worker with pid: 27823
Going to the http://127.0.0.1:5000 link doesn't work. Does anyone know how to get the MLflow UI running in SageMaker? There isn't much information on this at an easy-to-understand level. I just want to log my metrics and params in SageMaker and view them using the MLflow UI.

Are you working in SageMaker Studio or a classic notebook instance? You could technically use the Studio terminal to try to launch MLflow, but I would not recommend this. It would be better to set this up on something like EC2, where you have full control over your environment.
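The log in the question shows the server listening on 127.0.0.1, which is the loopback interface of the machine where MLflow runs. A browser on a different machine resolves 127.0.0.1 to itself, which would explain the "this site cannot be reached" error. The sketch below is only an illustration using the standard library: it parses a gunicorn "Listening at" line and reports whether the address is loopback-only. The helper names are hypothetical.

```python
import ipaddress
import re

# Matches the address in a gunicorn startup line such as
# "[INFO] Listening at: http://127.0.0.1:5000 (27820)".
LISTEN_RE = re.compile(r"Listening at: http://([^:\s/]+):(\d+)")


def parse_listen_address(log_line):
    """Return (host, port) from a gunicorn 'Listening at' line, or None."""
    match = LISTEN_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def is_loopback_only(host):
    """True if the host is reachable only from the machine itself."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (some other hostname); cannot decide here.
        return False


line = "[2022-05-24 15:48:44 +0000] [27820] [INFO] Listening at: http://127.0.0.1:5000 (27820)"
host, port = parse_listen_address(line)
print(host, port, is_loopback_only(host))
```

On a machine you control, such as EC2, `mlflow ui` accepts `--host` and `--port` options, so binding to 0.0.0.0 (with network access restricted appropriately) is one way to make the UI reachable from another machine.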

Related

Azure - Python Dash Application

I have a Dash application and some questions about deploying it to Azure (App Service). I use Git in Deployment Center.
1) In my requirements.txt I have a package that causes an issue: pywin32. It gives me the error below during deployment:
ERROR: Could not find a version that satisfies the requirement pywin32==302 (from versions: none)
ERROR: No matching distribution found for pywin32==302
It happens during the installation of dependencies.
2) When I remove pywin32==302 from requirements.txt, I can build and deploy; however, the application shows the error below (I previously deployed a Flask app this way and it worked).
Any ideas how to fix it, please?
Logs here:
2021-12-14T14:09:25.051923007Z Updated PYTHONPATH to ':/tmp/8d9bee966ce1769/antenv/lib/python3.8/site-packages'
2021-12-14T14:09:25.497472285Z [2021-12-14 14:09:25 +0000] [37] [INFO] Starting gunicorn 20.1.0
2021-12-14T14:09:25.500486120Z [2021-12-14 14:09:25 +0000] [37] [INFO] Listening at: http://0.0.0.0:8000 (37)
2021-12-14T14:09:25.504178862Z [2021-12-14 14:09:25 +0000] [37] [INFO] Using worker: sync
2021-12-14T14:09:25.507938905Z [2021-12-14 14:09:25 +0000] [39] [INFO] Booting worker with pid: 39
2021-12-14T14:09:26.617892557Z Application object must be callable.
2021-12-14T14:09:26.619256872Z [2021-12-14 14:09:26 +0000] [39] [INFO] Worker exiting (pid: 39)
2021-12-14T14:09:26.677663238Z [2021-12-14 14:09:26 +0000] [37] [INFO] Shutting down: Master
2021-12-14T14:09:26.677730439Z [2021-12-14 14:09:26 +0000] [37] [INFO] Reason: App failed to load.
/home/LogFiles/2021_12_14_pl0sdlwk00000V_docker.log (https://*****.scm.azurewebsites.net/api/vfs/LogFiles/2021_12_14_pl0sdlwk00000V_docker.log)
2021-12-14T14:04:22.384Z INFO - Stopping site ***** because it failed during startup.
2021-12-14T14:09:08.430Z INFO - Starting container for site
2021-12-14T14:09:08.430Z INFO - docker run -d -p 1972:8000 --name taxdevelopment_0_bbeb99e2 -e WEBSITE_SITE_NAME=***** -e WEBSITE_AUTH_ENABLED=False -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=*****.azurewebsites.net -e WEBSITE_INSTANCE_ID=31a267ed7b71ec86982412cc9dc4ad2f31ca2b8f51b692363aa765c405b03b84 appsvc/python:3.8_20210810.1
2021-12-14T14:09:08.431Z INFO - Logging is not enabled for this container.Please use https://aka.ms/linux-diagnostics to enable logging to see container logs here.
2021-12-14T14:09:10.641Z INFO - Initiating warmup request to container *****_0_bbeb99e2 for site *****
2021-12-14T14:09:33.292Z ERROR - Container *****_0_bbeb99e2 for site ***** has exited, failing site start
2021-12-14T14:09:33.294Z ERROR - Container *****_0_bbeb99e2 didn't respond to HTTP pings on port: 8000, failing site start. See container logs for debugging.
2021-12-14T14:09:33.300Z INFO - Stopping site ***** because it failed during startup.
/home/LogFiles/webssh/.log (https://*****.scm.azurewebsites.net/api/vfs/LogFiles/webssh/.log)
ERROR - Container *****_0_bbeb99e2 for site ***** has exited, failing site start
ERROR - Container *****_0_bbeb99e2 didn't respond to HTTP pings on port: 8000, failing site start. See container logs for debugging.
The error message shows that the site failed to start in the Linux container. Your Dash application might depend on some additional references. You can use the container logs to view detailed information about the error:
Docker logs appear on the Container Settings page in the portal.
You can find the Docker log in the /LogFiles directory. You can access it via the Kudu (Advanced Tools) Bash console or with an FTP client.
You can use our API to download the current logs.
Refer here
Since you have already removed pywin32==302 from requirements.txt, also check whether pypiwin32==302 is listed there and remove it.
Refer here for more information
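pywin32 is a Windows-only package, while the logs above show the app running in a Linux container, which is why pip finds no matching distribution. If the package is still needed for local development on Windows, an alternative to deleting the line is a PEP 508 environment marker, which pip honors in requirements.txt. The sketch below is a hypothetical helper (the function name and the WINDOWS_ONLY set are assumptions for illustration) that adds such a marker to matching lines.

```python
import re

# Assumed set of Windows-only packages, for illustration only.
WINDOWS_ONLY = {"pywin32", "pypiwin32"}


def add_windows_marker(requirement_line):
    """Append a sys_platform marker to Windows-only requirements."""
    stripped = requirement_line.strip()
    # Leave blanks, comments, and lines that already carry a marker untouched.
    if not stripped or stripped.startswith("#") or ";" in stripped:
        return requirement_line
    # The package name is everything before the first version/extras character.
    name = re.split(r"[=<>!~\[ ]", stripped, maxsplit=1)[0].lower()
    if name in WINDOWS_ONLY:
        return f'{stripped}; sys_platform == "win32"'
    return requirement_line


for req in ["pywin32==302", "dash==1.12.0"]:
    print(add_windows_marker(req))
```

With the marker in place, pip skips the package on Linux and still installs it on Windows.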

airflow webserver Workers terminating due to signal 11

I'm trying to start the airflow webserver with an existing application and the latest version (2.2.2). The executor is set as LocalExecutor, the datastore is Postgres, Python v3.9. Upon start up, it launches 4 workers which promptly die. It then spins its wheels continually trying to restart them.
Here's an example of the messages showing a worker starting and dying with signal 11 (segmentation violation). This is all within a second of starting.
Using worker: sync
[2021-11-30 17:29:31 -0500] [12529] [INFO] Booting worker with pid: 12529
[2021-11-30 17:29:31 -0500] [12530] [INFO] Booting worker with pid: 12530
[2021-11-30 17:29:31 -0500] [12531] [INFO] Booting worker with pid: 12531
[2021-11-30 17:29:31 -0500] [12532] [INFO] Booting worker with pid: 12532
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
settings.prepare_engine_args(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=12529
[WARNING] Worker with pid 12529 was terminated due to signal 11
Any suggestions on how to debug these workers?
I gave up on trying to run Airflow directly on my M1 Mac. I suspect it has something to do with the emulator for the M1 chip, but don't know for sure. I fell back to running Airflow in Docker. It takes a very long time to start, but does run OK that way.
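For the debugging part of the question, one general option is Python's standard faulthandler module, which dumps the Python traceback when the process receives a fatal signal such as SIGSEGV. The sketch below is an assumption-laden illustration, not an Airflow-specific recipe: the helper names are hypothetical, and it simply records which architecture the interpreter reports (typically arm64 natively on an M1, x86_64 under emulation) and enables the handler.

```python
import faulthandler
import platform
import sys


def runtime_summary():
    """Collect basic facts about the interpreter that is running."""
    return {
        "machine": platform.machine(),        # CPU architecture as seen by Python
        "python": platform.python_version(),
        "executable": sys.executable,
    }


def enable_crash_tracebacks(log_file):
    """Dump Python tracebacks of all threads to log_file on a fatal signal."""
    faulthandler.enable(file=log_file, all_threads=True)
    return faulthandler.is_enabled()


print(runtime_summary())
```

The same handler can be switched on without code changes by setting the environment variable PYTHONFAULTHANDLER=1 or by running Python with -X faulthandler. The resulting traceback usually points at the extension module that was executing when the worker crashed.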

Dash/Flask Python App Azure Deployment: Application object must be callable

I'm trying to deploy my Dash app to Azure using Azure Web App. When I use the "GitHub Actions" deployment method, I get the following (I even made my repository public for this):
Package deployment using ZIP Deploy initiated.
##[error]Failed to deploy web package to App Service.
##[error]Deployment Failed with Error: Error: Failed to deploy web package to App Service.
Unauthorized (CODE: 401)
##[warning]Error: Failed to update deployment history.
Unauthorized (CODE: 401)
App Service Application URL: http://my-site.azurewebsites.net
I've also tried deploying with "Kudu App Build" instead of GitHub Actions. That "deploys successfully", giving me a status of "Success(Active)", but the site fails when I visit it. When I then go to "Diagnose and solve problems" in the Azure portal, I see the following error:
2020-06-05T14:32:08.567155259Z
2020-06-05T14:32:08.567159059Z Documentation: http://aka.ms/webapp-linux
2020-06-05T14:32:08.567163159Z Python 3.8.0
2020-06-05T14:32:08.567166959Z Note: Any data outside '/home' is not persisted
2020-06-05T14:32:09.126144120Z Starting OpenBSD Secure Shell server: sshd.
2020-06-05T14:32:09.177989757Z Site's appCommandLine: gunicorn --bind=0.0.0.0 --timeout 600 application:app
2020-06-05T14:32:09.179066899Z Launching oryx with: -appPath /home/site/wwwroot -output /opt/startup/startup.sh -virtualEnvName antenv -defaultApp /opt/defaultsite -userStartupCommand 'gunicorn --bind=0.0.0.0 --timeout 600 application:app'
2020-06-05T14:32:09.289253728Z Oryx Version: 0.2.20200114.13, Commit: 204922f30f8e8d41f5241b8c218425ef89106d1d, ReleaseTagName: 20200114.13
2020-06-05T14:32:09.315423056Z Found build manifest file at '/home/site/wwwroot/oryx-manifest.toml'. Deserializing it...
2020-06-05T14:32:09.322220623Z Build Operation ID: |mqhAw8K+i4s=.a5925a0c_
2020-06-05T14:32:09.937709104Z Writing output script to '/opt/startup/startup.sh'
2020-06-05T14:32:10.656204831Z Found virtual environment .tar.gz archive.
2020-06-05T14:32:10.656991662Z Removing existing virtual environment directory /antenv...
2020-06-05T14:32:10.667494675Z Extracting to directory /antenv...
2020-06-05T14:32:42.888599159Z Using packages from virtual environment antenv located at /antenv.
2020-06-05T14:32:42.891102058Z Updated PYTHONPATH to ':/antenv/lib/python3.8/site-packages'
2020-06-05T14:32:44.966320399Z [2020-06-05 14:32:44 +0000] [42] [INFO] Starting gunicorn 20.0.4
2020-06-05T14:32:44.973255871Z [2020-06-05 14:32:44 +0000] [42] [INFO] Listening at: http://0.0.0.0:8000 (42)
2020-06-05T14:32:44.974184808Z [2020-06-05 14:32:44 +0000] [42] [INFO] Using worker: sync
2020-06-05T14:32:45.047728498Z [2020-06-05 14:32:45 +0000] [44] [INFO] Booting worker with pid: 44
2020-06-05T14:32:50.315649084Z Application object must be callable.
2020-06-05T14:32:50.321677821Z [2020-06-05 14:32:50 +0000] [44] [INFO] Worker exiting (pid: 44)
2020-06-05T14:32:50.630621560Z [2020-06-05 14:32:50 +0000] [42] [INFO] Shutting down: Master
2020-06-05T14:32:50.630656561Z [2020-06-05 14:32:50 +0000] [42] [INFO] Reason: App failed to load.
That was the application error; here is the container error:
Please use https://aka.ms/linux-diagnostics to enable logging to see container logs here.
2020-06-05T14:30:22.831Z INFO - Initiating warmup request to container my-site_0_06a70784 for site my-site
2020-06-05T14:30:38.890Z INFO - Waiting for response to warmup request for container my-site_0_06a70784. Elapsed time = 16.0594968 sec
2020-06-05T14:30:54.492Z INFO - Waiting for response to warmup request for container my-site_0_06a70784. Elapsed time = 31.6616936 sec
2020-06-05T14:31:09.663Z ERROR - Container my-site_0_06a70784 for site my-site has exited, failing site start
2020-06-05T14:31:09.695Z ERROR - Container my-site_0_06a70784 didn't respond to HTTP pings on port: 8000, failing site start. See container logs for debugging.
2020-06-05T14:31:10.217Z INFO - Stopping site my-site because it failed during startup.
2020-06-05T14:31:11.646Z INFO - Starting container for site
2020-06-05T14:31:11.647Z INFO - docker run -d -p 4040:8000 --name my-site_0_9c00d1f9 -e WEBSITE_SITE_NAME=my-site -e WEBSITE_AUTH_ENABLED=False -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=my-site.azurewebsites.net -e WEBSITE_INSTANCE_ID=b78e390e9ef390f579e4e316c4d4ea6c7f187e4af48b856ebd3562fda3c5ef4f appsvc/python:3.8_20200101.1 gunicorn --bind=0.0.0.0 --timeout 600 application:app
Here is what my application.py looks like (I've tried several variations of this to no avail):
import app as application
app = application.app
if __name__ == "__main__":
    app.run_server(debug=False)
And at the top of my app.py file I have:
app = dash.Dash(__name__)
This is my requirements.txt:
bs4==0.0.1
dash==1.12.0
dash-table==4.7.0
Flask==1.1.2
numpy==1.18.0
pandas==1.0.4
requests==2.12.4
Lastly, this is the startup command I use in Configuration -> General Settings -> Startup Command:
gunicorn --bind=0.0.0.0 --timeout 600 application:app
The application that gunicorn should run is dash.Dash.server (in your case app.server), which is the underlying Flask application. Because application.py imports the app module under the name application, update its second line to this:
app = application.app.server
See more in Dash docs.
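The "Application object must be callable." message appears when gunicorn is handed an object that is not a WSGI callable. A minimal sketch of the idea, using a hypothetical stand-in class (FakeDash is not the real dash.Dash, and resolve_wsgi_app is an illustrative helper, not part of gunicorn or Dash):

```python
class FakeDash:
    """Hypothetical stand-in: not callable itself, but exposes a WSGI app on .server."""

    def __init__(self):
        self.server = self._wsgi_app

    @staticmethod
    def _wsgi_app(environ, start_response):
        # Minimal WSGI application that always answers 200 with a short body.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]


def resolve_wsgi_app(obj):
    """Return obj if it is callable, else its callable .server attribute."""
    if callable(obj):
        return obj
    server = getattr(obj, "server", None)
    if callable(server):
        return server
    raise TypeError("Application object must be callable.")


dash_like = FakeDash()
print(callable(dash_like), callable(dash_like.server))
```

In this sketch the wrapper object fails the callable check while its server attribute passes, which mirrors why the startup command has to point at the Flask server rather than at the wrapper.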

how to make my python-flask website available 24*7 using nginx and gunicorn

How do I configure gunicorn to make sure my python-flask website is available 24*7?
The issue I am facing is that as soon as I close my terminal window, the website is no longer reachable.
I am using RHEL 7.6 to host a website built with python-flask.
I have configured nginx as the web server and gunicorn as the application server.
I would really appreciate it if someone could help me configure gunicorn so that my website is available 24*7.
Here is some of my code:
[root@syed-dashboard-4 ~]# pwd
/root
[root@syed-dashboard-4 ~]#
[root@syed-dashboard-4 ~]# cat hello.py
#!/usr/bin/python
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello():
    return "Hello magentabox!"
#if __name__ == "__main__":
#    app.run(host='10.145.29.23',port=5000)
[root@syed-dashboard-4 ~]#
[root@syed-dashboard-4 ~]# gunicorn hello:app
[2019-07-17 10:34:11 +0000] [9346] [INFO] Starting gunicorn 19.9.0
[2019-07-17 10:34:11 +0000] [9346] [INFO] Listening at: http://127.0.0.1:8000 (9346)
[2019-07-17 10:34:11 +0000] [9346] [INFO] Using worker: sync
[2019-07-17 10:34:11 +0000] [9351] [INFO] Booting worker with pid: 9351
I am pretty new to web development and, as I mentioned, as soon as I close the terminal the website is no longer reachable. I can share the nginx configuration and logs as well if that helps fix my issue.
Thanks much.
You can use Supervisor. This is a robust way to keep your server running 24*7.
Add the following program section to your Supervisor config:
[program:your_project_name]
command=/home/virtualenvpath/your_env/bin/gunicorn --log-level debug run_apiengine:main_app --bind 0.0.0.0:5006 --workers 5 --worker-class gevent
stdout_logfile=/home/your_path_to_log/supervisor_stdout.log
stderr_logfile=/home/your_path_to_log/supervisor_stderr.log
user=your_user
autostart=true
autorestart=true
environment=PYTHONPATH="/home/path_to_your_project",OAUTHLIB_INSECURE_TRANSPORT="1"
Once this is configured, Supervisor keeps the server running 24*7 and starts it again automatically whenever the machine restarts.
On Linux, you can start your app in a tmux session and detach from it once the server has started.
# Create a new tmux session
tmux new -s server
# Start your gunicorn server
cd /path/to/app
gunicorn hello:app
# Detach from the current tmux session by pressing Ctrl-B, then D
You can close your terminal and your server will still be running.
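The server stops when the terminal closes because it belongs to the terminal's session and is hung up along with it. As a rough illustration of what detaching achieves, the sketch below starts a command in its own session from Python on a POSIX system. The function name and log path are hypothetical, and unlike Supervisor this approach does not restart the process after a crash or a reboot.

```python
import os
import subprocess
import sys


def start_detached(command, log_path):
    """Start command in a new session so closing the terminal does not hang it up."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # no dependency on the terminal's input
            stdout=log,                 # keep output in a file instead of the tty
            stderr=subprocess.STDOUT,
            start_new_session=True,     # child becomes leader of its own session
        )
    finally:
        # The child holds its own copy of the file descriptor.
        log.close()


# Example (not executed here): start_detached(["gunicorn", "hello:app"], "gunicorn.log")
```

The returned Popen object exposes the child's pid, which can be written to a file so the server can be stopped later.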

Django + Gunicorn - deploy and stay connected to the port?

Correct me if I am wrong: I can use gunicorn to deploy a django project, for instance I can deploy my app - helloapp in this way:
$ cd env
$ . bin/activate
(env) $ cd ..
(env) $ pip install -r requirements.txt
(env) root@localhost:/var/www/html/helloapp# gunicorn helloapp.wsgi:application
[2017-05-18 22:22:38 +0000] [1779] [INFO] Starting gunicorn 19.7.1
[2017-05-18 22:22:38 +0000] [1779] [INFO] Listening at: http://127.0.0.1:8000 (1779)
[2017-05-18 22:22:38 +0000] [1779] [INFO] Using worker: sync
[2017-05-18 22:22:38 +0000] [1783] [INFO] Booting worker with pid: 1783
So now my django site is running at http://127.0.0.1:8000.
But it is no longer available as soon as I close or exit my terminal. How can I keep it listening on port 8000 even after I have closed my terminal?
As with any long-running process, you need to run it as a service under some kind of manager. Since you're on Ubuntu, you probably want to use systemd; full instructions are in the gunicorn deployment docs. Note, you will also need to configure nginx as a reverse proxy in front of gunicorn.
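Whichever manager runs the process, the gunicorn settings themselves can live in a Python config file that the service's start command points to with `-c`. A minimal sketch, assuming the nginx reverse proxy runs on the same host; the file name and the specific values are illustrative assumptions, not taken from the question:

```python
# gunicorn.conf.py: gunicorn reads its settings from module-level names in this file.
import multiprocessing

# Listen on loopback only; nginx on the same host forwards public traffic here.
bind = "127.0.0.1:8000"

# Common starting point from the gunicorn docs: (2 x cores) + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30

# "-" sends access logs to stdout and error logs to stderr,
# where the service manager can collect them.
accesslog = "-"
errorlog = "-"
```

The service would then start the app with something like `gunicorn -c gunicorn.conf.py helloapp.wsgi:application`, run from the project directory.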
