How to deploy a flow remotely on a Prefect server? - python

I started working with the Prefect Orchestration tool.
My goal is to set up a server managing my automation on different other PCs and servers.
I do not fully understand the architecture of Prefect yet (with all these Agents etc.), but I managed to start a server on a remote Ubuntu environment.
To access the UI remotely, I created a config.toml and added the following lines:
[server]
endpoint = "<IPofserver>:4200/graphql"
[server.ui]
apollo_url = "http://<IPofserver>:4200/graphql"
[telemetry]
[server.telemetry]
enabled = false
The telemetry part is just there to disable sending usage data to Prefect.
Afterwards it was possible to access the UI from another PC, and also to start an agent on another PC with:
prefect agent local start --api "http://<IPofserver>:4200/graphql"
But how can I deploy flows now? I cannot find an option to set their API endpoint the way I did for the agent.
Even if I try to register a flow on the machine where the server itself is running, I get the following error message:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/prefect/core/flow.py", line 1726, in register
    registered_flow = client.register(
  File "/usr/local/lib/python3.10/dist-packages/prefect/client/client.py", line 831, in register
    project = self.graphql(query_project).data.project  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/prefect/client/client.py", line 443, in graphql
    result = self.post(
  File "/usr/local/lib/python3.10/dist-packages/prefect/client/client.py", line 398, in post
    response = self._request(
  File "/usr/local/lib/python3.10/dist-packages/prefect/client/client.py", line 633, in _request
    response = self._send_request(
  File "/usr/local/lib/python3.10/dist-packages/prefect/client/client.py", line 497, in _send_request
    response = session.post(
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 635, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 695, in send
    adapter = self.get_adapter(url=request.url)
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 792, in get_adapter
    raise InvalidSchema(f"No connection adapters were found for {url!r}")
requests.exceptions.InvalidSchema: No connection adapters were found for ':4200/graphql'
Example code used:
import prefect
from prefect import task, Flow

@task
def say_hello():
    logger = prefect.context.get("logger")
    logger.info("Hello, Cloud!")

with Flow("hello-flow") as flow:
    say_hello()

# Register the flow under the "Test" project
flow.register(project_name="Test")

If you are getting started with Prefect, I'd recommend using Prefect 2.0 - check the documentation pages on getting started and on the underlying architecture.
If you still need help with Prefect Server and Prefect 1.0, check the extensive troubleshooting guide, and if that doesn't help, send us a message on Slack and we'll try to help you there.
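As a hint on the error itself: the traceback ends with InvalidSchema: No connection adapters were found for ':4200/graphql', which suggests the client picked up an endpoint without the http:// scheme, so it is worth double-checking that server.endpoint in config.toml carries a full http://... URL just like apollo_url. Below is a minimal sketch for Prefect 1.x, assuming the Server backend was selected with prefect backend server; <IPofserver> stays a placeholder, and PREFECT__SERVER__ENDPOINT is the environment override for the same config key:

import os

# Assumption: Prefect 1.x with the Server backend; the http:// scheme must be included.
os.environ["PREFECT__SERVER__ENDPOINT"] = "http://<IPofserver>:4200/graphql"

import prefect
from prefect import task, Flow

@task
def say_hello():
    prefect.context.get("logger").info("Hello, Cloud!")

with Flow("hello-flow") as flow:
    say_hello()

# The "Test" project must already exist on the server.
flow.register(project_name="Test")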

Related

Authenticate a python application in VM with Managed Service Identity(MSI)

I am trying to use the MSI example provided in the link below:
https://learn.microsoft.com/en-us/python/azure/python-sdk-azure-authenticate?view=azure-python#mgmt-auth-msi
To do that, I created a Linux VM, installed the MSI extension on it, and ran the code below in a Python application. When I run that Python application, I get the following error:
[azureuser@vish-redhat ~]$ python msi-auth.py
No handlers could be found for logger "msrestazure.azure_active_directory"
Traceback (most recent call last):
  File "msi-auth.py", line 10, in <module>
    subscription = next(subscription_client.subscriptions.list())
  File "/usr/lib/python2.7/site-packages/msrest/paging.py", line 121, in __next__
    self.advance_page()
  File "/usr/lib/python2.7/site-packages/msrest/paging.py", line 107, in advance_page
    self._response = self._get_next(self.next_link)
  File "/usr/lib/python2.7/site-packages/azure/mgmt/resource/subscriptions/v2016_06_01/operations/subscriptions_operations.py", line 207, in internal_paging
    request, header_parameters, **operation_config)
  File "/usr/lib/python2.7/site-packages/msrest/service_client.py", line 191, in send
    session = self.creds.signed_session()
  File "/usr/lib/python2.7/site-packages/msrestazure/azure_active_directory.py", line 685, in signed_session
    self.set_token()
  File "/usr/lib/python2.7/site-packages/msrestazure/azure_active_directory.py", line 681, in set_token
    self.scheme, _, self.token = get_msi_token(self.resource, self.port, self.msi_conf)
  File "/usr/lib/python2.7/site-packages/msrestazure/azure_active_directory.py", line 590, in get_msi_token
    result = requests.post(request_uri, data=payload, headers={'Metadata': 'true'})
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 108, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))
[azureuser@vish-redhat ~]$
Code:
from msrestazure.azure_active_directory import MSIAuthentication
from azure.mgmt.resource import ResourceManagementClient, SubscriptionClient

# Create MSI Authentication
credentials = MSIAuthentication()

# Create a Subscription Client
subscription_client = SubscriptionClient(credentials)
subscription = next(subscription_client.subscriptions.list())
subscription_id = subscription.subscription_id

# Create a Resource Management client
resource_client = ResourceManagementClient(credentials, subscription_id)

# List resource groups as an example. The only limit is what role and policy are assigned to this MSI token.
for resource_group in resource_client.resource_groups.list():
    print(resource_group.name)
You need to install the Python SDK in your Linux VM; please refer to this official document.
pip install azure
Also, you need to give your VM's identity the Owner role at the subscription level.
For more information about this, please refer to this link.
Now you can use this code to test on the VM. I tested it in my lab, and it works for me.
Note: you need to modify resource_client = ResourceManagementClient(credentials, subscription_id) to resource_client = ResourceManagementClient(credentials, str(subscription_id)), as it requires a string.
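Put together, a minimal sketch of the corrected code from the question (the same calls, with the str() cast from the note applied) would be:

from msrestazure.azure_active_directory import MSIAuthentication
from azure.mgmt.resource import ResourceManagementClient, SubscriptionClient

# Authenticate with the VM's managed identity.
credentials = MSIAuthentication()

# Look up the first subscription visible to this identity.
subscription = next(SubscriptionClient(credentials).subscriptions.list())
subscription_id = subscription.subscription_id

# Cast to str as noted above, since the client expects a plain string.
resource_client = ResourceManagementClient(credentials, str(subscription_id))

for resource_group in resource_client.resource_groups.list():
    print(resource_group.name)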
A connection error usually means the extension is not yet available. You can check whether the extension is available using the CLI with az login --msi
https://learn.microsoft.com/en-us/azure/active-directory/managed-service-identity/how-to-use-vm-sign-in
If it works, your VM was created correctly with MSI support. If it doesn't, your extension is probably not configured correctly.
Note that we changed the way to get a token with MSI from inside a VM. We now use IMDS:
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service
Starting with the next release of the CLI (the first one of April 2018), the CLI will authenticate with IMDS directly and no longer use the VM extension. This is already shipped in the underlying library msrestazure in its 0.4.25 version, which bypasses your VM extension completely in favor of IMDS and is the preferred scenario now. Could you try with this version of msrestazure? If it works with 0.4.25 but not with 0.4.24, that likely means your VM extension is not installed correctly, but you don't need to care, since it's a deprecated scenario :)
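For reference, a rough sketch of what the IMDS token request looks like from Python. The 169.254.169.254 address and the Metadata header are the documented IMDS contract; the exact api-version shown is an assumption and may differ in your environment:

import requests

# Sketch: request a management-plane token from the Instance Metadata
# Service (IMDS) inside an Azure VM.
resp = requests.get(
    "http://169.254.169.254/metadata/identity/oauth2/token",
    params={
        "api-version": "2018-02-01",  # assumption; adjust to your environment
        "resource": "https://management.azure.com/",
    },
    headers={"Metadata": "true"},  # required, or IMDS rejects the request
)
resp.raise_for_status()
access_token = resp.json()["access_token"]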
Note that in order to get a token, your VM doesn't need any special permissions or ownership of the subscription. However, for that token to be useful, you do need them :). But since your error is related to the "get a token" part and not to permissions, I would just kindly suggest that you might need this complementary info later if you run into permission issues:
https://learn.microsoft.com/en-us/azure/active-directory/managed-service-identity/howto-assign-access-cli
(full disclosure, I work at MS in the SDK/CLI team and wrote the MSI support)

using Google BigQuery through Python script

I want to perform some very simple tasks on BigQuery via a Python script. I found this package, which does not work well. Indeed, when I try this code:
from bigquery import get_client
project_id = 'txxxxxxxxxxxxxxxxxx9'
# Service account email address as listed in the Google Developers Console.
service_account = '7xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com'
# PKCS12 or PEM key provided by Google.
key = '/home/fxxxxxxxxxxxx/Dropbox/access_keys/google_storage/xxxxxxxxxxxxxxxxxxxxx.pem'
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
# Submit an async query.
results = client.get_table_schema('newdataset', 'newtable2')
print(results)
I get this error:
/home/xxxxxx/anaconda3/envs/snakes/bin/python2.7 /home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py
Traceback (most recent call last):
  File "/home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py", line 9, in <module>
    client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 83, in get_client
    readonly=readonly)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 101, in _get_bq_service
    service = build('bigquery', 'v2', http=http)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/util.py", line 142, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 196, in build
    cache)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 242, in _retrieve_discovery_doc
    resp, content = http.request(actual_url)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 565, in new_request
    self._refresh(request_orig)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 835, in _refresh
    self._do_refresh_request(http_request)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 862, in _do_refresh_request
    body = self._generate_refresh_request_body()
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1541, in _generate_refresh_request_body
    assertion = self._generate_assertion()
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1670, in _generate_assertion
    private_key, self.private_key_password), payload)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/_pycrypto_crypt.py", line 121, in from_string
    pkey = RSA.importKey(parsed_pem_key)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 665, in importKey
    return self._importKeyDER(der)
  File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 588, in _importKeyDER
    raise ValueError("RSA key format is not supported")
ValueError: RSA key format is not supported
Process finished with exit code 1
My question: is there a tutorial in Python which shows how to communicate easily with BigQuery: importing a dataset from Google Storage or S3, querying something, and exporting the results to Google Storage?
A lot depends on your environment, and once you've figured that out, everything should be super simple. The only problem I see in the error log you pasted is with authentication.
Python pandas has had support for BigQuery for a while:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.read_gbq.html
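For instance, a minimal sketch with read_gbq, assuming pandas' BigQuery dependencies are installed; the project id and query are placeholders:

import pandas as pd

# Run a query and get the result back as a DataFrame.
df = pd.read_gbq(
    "SELECT word, SUM(word_count) AS n "
    "FROM [publicdata:samples.shakespeare] "
    "GROUP BY word ORDER BY n DESC LIMIT 10",
    project_id="my-project-id",  # placeholder: use your own project id
)
print(df)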
And I did a video with the creators of the module:
https://www.youtube.com/watch?v=gLeTDUMb7HY
Now, the simplest and fastest way these days to launch a Jupyter notebook with all of the Google Cloud goodies you mention is our new Google Datalab project:
https://cloud.google.com/datalab/
The only Datalab caveat is that it runs on cloud servers, but if you want a fully managed Jupyter/IPython environment that is totally secure, persistent, and ready to handle BigQuery, storage, etc., try it out.
Meanwhile, if you are writing a web application, look at how other web applications solve this task.
For example, re:dash code to connect to BigQuery:
https://github.com/EverythingMe/redash/blob/master/redash/query_runner/big_query.py

Error in python webservice client using suds

I'm using Python 2.7 and suds 0.4 on Windows and Linux, and in both cases I get the same error when calling a method of a web service:
Traceback (most recent call last):
  File "wsclient.py", line 23, in <module>
    client.service.Echo()
  File "build\bdist.win32\egg\suds\client.py", line 542, in __call__
  File "build\bdist.win32\egg\suds\client.py", line 602, in invoke
  File "build\bdist.win32\egg\suds\client.py", line 643, in send
  File "build\bdist.win32\egg\suds\client.py", line 678, in succeeded
  File "build\bdist.win32\egg\suds\bindings\binding.py", line 149, in get_reply
AttributeError: 'NoneType' object has no attribute 'promotePrefixes'
My code is really simple:
import suds.bindings
suds.bindings.binding.envns = ('SOAP-ENV', 'http://www.w3.org/2003/05/soap-envelope')
from suds.client import Client
url = 'http://servicios.publipayments.com/ServicioDW.svc?wsdl'
client = Client(url)
print client
client.service.Echo()
As you can see, I already did what the author of suds suggests here and also set up logging as shown here, but the result is the same.
Any ideas will be appreciated.
Regards.
My understanding of web services was not good enough: the service implementation uses HTTP to expose the WSDL AND HTTPS for the service endpoint.
So, after a helpful hint from the service author, I declared the client as:
client = Client('http://someUrl?wsdl',
                location='https://someUrl/Service.svc')
And that solved the problem. There was nothing wrong with suds.

Redis example giving HTTP 400: Bad request error

I am trying to cache MySQL queries in my CherryPy server.
I could not figure out how to solve the error I got while installing pylibmc, so I decided to use redis-py instead.
Here I am trying a very simple example.
import redis
cache = redis.StrictRedis(host='localhost', port=8080, db=0)
...
...
cache.set('0', '1') # I also tested with other string keys, but failed with same error
and it's throwing the following error!
[05/May/2014:13:11:13] HTTP Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/Library/Python/2.7/site-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "server.py", line 92, in submit_data
    cache.set(str(idx), '1')#res)
  File "/Library/Python/2.7/site-packages/redis/client.py", line 897, in set
    return self.execute_command('SET', *pieces)
  File "/Library/Python/2.7/site-packages/redis/client.py", line 461, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/Library/Python/2.7/site-packages/redis/client.py", line 471, in parse_response
    response = connection.read_response()
  File "/Library/Python/2.7/site-packages/redis/connection.py", line 339, in read_response
    response = self._parser.read_response()
  File "/Library/Python/2.7/site-packages/redis/connection.py", line 118, in read_response
    (str(byte), str(response)))
InvalidResponse: Protocol Error: H, TTP/1.1 400 Bad Request
I could not figure out what was wrong, and my website runs without a problem on localhost at port 8080 when I am not using Redis.
Your web server is running on port 8080, not your Redis server. Your Redis server is most likely running on port 6379, unless you changed your config for some reason. Right now you are running Redis queries against your web server, and that is not going to work. Make sure you are connecting to the correct Redis server address and port, and then try again.
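For example, a minimal sketch of the corrected connection (assuming a stock Redis install; check your redis.conf if you changed the port):

import redis

# Connect to Redis on its default port 6379, not 8080 (that is where
# the CherryPy web server is listening).
cache = redis.StrictRedis(host='localhost', port=6379, db=0)

cache.set('0', '1')
print(cache.get('0'))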

dev-server HTTP Error 403: Forbidden

After updating from 1.7.5 (where everything worked fine), I'm getting an HTTP Error 403: Forbidden when trying to open any site via localhost. The strange thing is that I have pretty much the same setup at home as here at work, and everything works there... It might be an issue with the proxy server we're using at work, since that's the only difference I can think of. Here's the error log I'm getting, so if anyone knows what's going on, please help (;
Traceback (most recent call last):
  File "U:\Dev\GAE\lib\cherrypy\cherrypy\wsgiserver\wsgiserver2.py", line 1302, in communicate
    req.respond()
  File "U:\Dev\GAE\lib\cherrypy\cherrypy\wsgiserver\wsgiserver2.py", line 831, in respond
    self.server.gateway(self).respond()
  File "U:\Dev\GAE\lib\cherrypy\cherrypy\wsgiserver\wsgiserver2.py", line 2115, in respond
    response = self.req.server.wsgi_app(self.env, self.start_response)
  File "U:\Dev\GAE\google\appengine\tools\devappserver2\wsgi_server.py", line 246, in __call__
    return app(environ, start_response)
  File "U:\Dev\GAE\google\appengine\tools\devappserver2\request_rewriter.py", line 311, in _rewriter_middleware
    response_body = iter(application(environ, wrapped_start_response))
  File "U:\Dev\GAE\google\appengine\tools\devappserver2\python\request_handler.py", line 89, in __call__
    self._flush_logs(response.get('logs', []))
  File "U:\Dev\GAE\google\appengine\tools\devappserver2\python\request_handler.py", line 220, in _flush_logs
    apiproxy_stub_map.MakeSyncCall('logservice', 'Flush', request, response)
  File "U:\Dev\GAE\google\appengine\api\apiproxy_stub_map.py", line 94, in MakeSyncCall
    return stubmap.MakeSyncCall(service, call, request, response)
  File "U:\Dev\GAE\google\appengine\api\apiproxy_stub_map.py", line 320, in MakeSyncCall
    rpc.CheckSuccess()
  File "U:\Dev\GAE\google\appengine\api\apiproxy_rpc.py", line 156, in _WaitImpl
    self.request, self.response)
  File "U:\Dev\GAE\google\appengine\ext\remote_api\remote_api_stub.py", line 200, in MakeSyncCall
    self._MakeRealSyncCall(service, call, request, response)
  File "U:\Dev\GAE\google\appengine\ext\remote_api\remote_api_stub.py", line 226, in _MakeRealSyncCall
    encoded_response = self._server.Send(self._path, encoded_request)
  File "U:\Dev\GAE\google\appengine\tools\appengine_rpc.py", line 393, in Send
    f = self.opener.open(req)
  File "U:\Dev\Python\lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "U:\Dev\Python\lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "U:\Dev\Python\lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "U:\Dev\Python\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "U:\Dev\Python\lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
INFO 2013-04-19 12:28:52,576 server.py:561] default: "GET / HTTP/1.1" 500 -
INFO 2013-04-19 12:28:52,619 server.py:561] default: "GET /favicon.ico HTTP/1.1" 304 -
Also, the launcher throws an error when closing:
Traceback (most recent call last):
  File "launcher\mainframe.pyc", line 327, in OnStop
  File "launcher\taskcontroller.pyc", line 167, in Stop
  File "launcher\dev_appserver_task_thread.pyc", line 82, in stop
  File "launcher\taskthread.pyc", line 107, in stop
  File "launcher\platform.pyc", line 397, in KillProcess
pywintypes.error: (5, 'TerminateProcess', 'Access is denied.')
I had this very same issue on my Mac OS X machine behind a proxy server, using Google App Engine Launcher 1.8.6. Apparently there's an issue with "proxy_bypass" in "urllib2.py".
There are two possible solutions:
Downgrade to 1.7.5, but, who wants to downgrade?
Edit "[GAE Instalattion path]/google/appengine/tools/appengine_rpc.py" and look for the line that says
opener.add_handler(fancy_urllib.FancyProxyHandler())
In my computer it was line 578, and then put a hash (#) at the beginning of the line, like this:
`#opener.add_handler(fancy_urllib.FancyProxyHandler())`
Save the file, stop and then restart your application. Now dev_appserver.py shouldn't try to use any proxy server at all.
If your application uses any external resources, like a SOAP web service or something similar, and you can't reach that server without the proxy, then you'll have to downgrade. Please keep in mind that external JavaScript files (like the Facebook SDK or similar) are loaded by your browser, not by your application.
Since I'm not using any external REST or SOAP services it worked for me!
Hopefully it will work for you as well.
Try either:
- accessing it through a different proxy, i.e. a proxy within a proxy, or
- accessing it through your local IP, e.g. 192.168.1.1
I faced the same issue with version 1.9.5. It seems that the API proxy sends some RPCs to the proxy server, which are then rejected with HTTP 403 (since proxy servers are generally configured to reject connection attempts to arbitrary ports). In my case I was using the urlfetch module in my app to access external web pages, so disabling the proxy server was not an option for me.
This is how I worked around the issue some time back (most probably it was based on comments found under this issue, but I cannot remember the exact sources).
NOTE:
For this approach to work, you'll have to know the hostname/IP address and default port of your proxy server, and change them appropriately in the code if you happen to connect to a different proxy server.
When you are not behind the proxy server, you will have to revert the applied changes in order to return to a working state (if you want internet access inside your app).
Here it goes:
Disable proxy settings for the Python (Google App Engine Launcher) environment in some way. (In my case it was easy since I was launching the dev_appserver.py from a Terminal shell (on Linux), and the unset http_proxy and unset https_proxy commands did the trick.)
Edit {App Engine SDK root}/google/appengine/api/urlfetch_stub.py. Find the code block
if _CONNECTION_SUPPORTS_TIMEOUT:
  connection = connection_class(host, timeout=deadline)
else:
  connection = connection_class(host)
(lines 376-379 in my case) and replace it with:
if _CONNECTION_SUPPORTS_TIMEOUT:
  if host[:9] == 'localhost' or host[:9] == '127.0.0.1':
    connection = connection_class(host, timeout=deadline)
  else:
    connection = connection_class('your_proxy_host_goes_here', your_proxy_port_number_goes_here, timeout=deadline)
else:
  if host[:9] == 'localhost' or host[:9] == '127.0.0.1':
    connection = connection_class(host)
  else:
    connection = connection_class('your_proxy_host_goes_here', your_proxy_port_number_goes_here)
replacing the placeholders your_proxy_host_goes_here and your_proxy_port_number_goes_here with appropriate values.
(I believe this code can be written more elegantly, though... any Python geeks out there? :) A compact variant is sketched after these steps.)
In my case, I also had to delete the existing compiled file urlfetch_stub.pyc (located in the same directory as urlfetch_stub.py) because the SDK didn't seem to pick up the changes until I did so.
Now you can use dev_appserver to launch your app, and use urlfetch-backed services within the app, free from HTTP 403 errors.
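On the "more elegantly" remark above, here is a sketch of a more compact, behavior-equivalent form of the replacement block. The names host, deadline, connection_class, and _CONNECTION_SUPPORTS_TIMEOUT come from the surrounding SDK code, and the proxy placeholders are the same ones you fill in yourself:

# Compact variant of the patch in step 2; a drop-in for the same block.
if host[:9] == 'localhost' or host[:9] == '127.0.0.1':
    args = (host,)
else:
    args = ('your_proxy_host_goes_here', your_proxy_port_number_goes_here)

if _CONNECTION_SUPPORTS_TIMEOUT:
    connection = connection_class(*args, timeout=deadline)
else:
    connection = connection_class(*args)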
