Cannot get gcloud to work with Python and Pycharm - python

I am trying to connect to the Google App Engine Datastore from my local machine. I have spent all day digging into this without any luck.
I have tried the approach here (as well as a lot of other suggestions from SO, such as Using gcloud-python in GAE and Unable to run dev_appserver.py with gcloud):
How to access a remote datastore when running dev_appserver.py?
I first installed gcloud based on this description from google:
https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27
According to the description I should add the following to my appengine_config.py:
from google.appengine.ext import vendor
vendor.add('lib')
If I do that I get an error saying ImportError: No module named gcloud
If I then move the code to my main.py, it seems to pick up the lib folder and the modules there. That seems a bit strange to me, since I thought appengine_config was run first to make sure things were initialised.
But now I am getting the following stack trace:
ERROR 2016-09-23 17:22:30,623 cgi.py:122] Traceback (most recent call last):
File "/Users/thomasd/Documents/github/myapp/main.py", line 10, in <module>
from gcloud import datastore
File "/Users/thomasd/Documents/github/myapp/lib/gcloud/__init__.py", line 17, in <module>
from pkg_resources import get_distribution
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2985, in <module>
@_call_aside
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2971, in _call_aside
f(*args, **kwargs)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 3013, in _initialize_master_working_set
dist.activate(replace=False)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2544, in activate
declare_namespace(pkg)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2118, in declare_namespace
_handle_ns(packageName, path_item)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2057, in _handle_ns
loader.load_module(packageName)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pkgutil.py", line 246, in load_module
mod = imp.load_module(fullname, self.file, self.filename, self.etc)
File "/Library/Python/2.7/site-packages/google/cloud/logging/__init__.py", line 18, in <module>
File "/usr/local/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 999, in load_module
raise ImportError('No module named %s' % fullname)
ImportError: No module named google.cloud.logging.client
What am I doing wrong here?

The google-cloud library does not work on App Engine, and most likely you don't even need it, since you can use the built-in functionality.
From the official docs, you can use it like this:
import cloudstorage as gcs

I solved it this way:-
1.) Create a lib folder in your project path.
2.) Install the gcloud libraries by running the following command in a terminal from your project path:-
pip install -t lib gcloud
3.) Create an appengine_config.py module in your project and add the following lines of code:-
import sys
import os.path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'lib'))
4.) After this, you can import like this:-
from gcloud import datastore
5.) To save data to the live Google Datastore from your local machine:-
client = datastore.Client("project-id")
key = client.key('Person')
entity = datastore.Entity(key=key)
entity['name'] = 'ashish'
entity['age'] = 23
client.put(entity)
This saves an entity of kind Person with the properties name and age. Do not forget to specify your correct project id.
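As a rough illustration of why step 3 matters, the sketch below builds a throwaway lib folder containing a made-up module and shows that it becomes importable once that folder is placed on sys.path. The module name vendored_demo is hypothetical and only stands in for a vendored package such as gcloud; the sketch does not touch App Engine or the Datastore.

```python
import importlib
import os
import sys
import tempfile

# Create a temporary "project" with a lib folder, mimicking `pip install -t lib ...`.
project_dir = tempfile.mkdtemp()
lib_dir = os.path.join(project_dir, "lib")
os.makedirs(lib_dir)

# `vendored_demo` is a hypothetical stand-in for a vendored package.
with open(os.path.join(lib_dir, "vendored_demo.py"), "w") as f:
    f.write("GREETING = 'hello from lib'\n")

# Same idea as the appengine_config.py lines: put lib at the front of sys.path.
sys.path.insert(0, lib_dir)
importlib.invalidate_caches()

vendored_demo = importlib.import_module("vendored_demo")
print(vendored_demo.GREETING)
```

Inserting at position 0 means the vendored copy is found before any copy of the same package elsewhere on the path, which may or may not be what you want for packages that the SDK also ships.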

Old question, but this may be worth including:
I'm unsure of the state of your requirements.txt file, but I looked through mine and noticed setuptools was not included.
See the related question: pip freeze doesn't export setuptools
Assuming you're following the tutorial, you likely installed those libraries EXCEPT for setuptools to lib.
I added setuptools=={versionnumber} to requirements.txt and that fixed this related issue for me.
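A quick way to check whether a requirements file names setuptools at all is sketched below. The helper pins_package and the sample file contents (including the version numbers) are made up for illustration; the parsing is deliberately simple and is not a full requirements parser.

```python
import re

def pins_package(requirements_text, package):
    """Return True if any requirement line names `package` (case-insensitive)."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The project name ends at the first version specifier, extra, or marker.
        name = re.split(r"[\s<>=!~\[;]", line, maxsplit=1)[0]
        if name.lower() == package.lower():
            return True
    return False

# Hypothetical file contents; the versions are placeholders, not recommendations.
sample = "gcloud==0.18.3\nsix==1.10.0\n"
print(pins_package(sample, "setuptools"))
print(pins_package(sample + "setuptools==28.0.0\n", "setuptools"))
```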

Related

MLFlow -> ModuleNotFoundError: No module named 'sqlalchemy.future'

It seems that to use the MLflow Model Registry locally, one option is to build my own backend database with SQLite.
I found a site that advised running:
mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./artifacts --host 0.0.0.0 --port 5000
When running the command above, I get the following error message:
2022/05/22 23:08:58 ERROR mlflow.cli: Error initializing backend store
2022/05/22 23:08:58 ERROR mlflow.cli: No module named 'sqlalchemy.future'
Traceback (most recent call last):
File "/home/username/.local/lib/python3.8/site-packages/mlflow/cli.py", line 426, in server
initialize_backend_stores(backend_store_uri, default_artifact_root)
File "/home/username/.local/lib/python3.8/site-packages/mlflow/server/handlers.py", line 259, in initialize_backend_stores
_get_tracking_store(backend_store_uri, default_artifact_root)
File "/home/username/.local/lib/python3.8/site-packages/mlflow/server/handlers.py", line 244, in _get_tracking_store
_tracking_store = _tracking_store_registry.get_store(store_uri, artifact_root)
File "/home/username/.local/lib/python3.8/site-packages/mlflow/tracking/_tracking_service/registry.py", line 39, in get_store
return self._get_store_with_resolved_uri(resolved_store_uri, artifact_uri)
File "/home/username/.local/lib/python3.8/site-packages/mlflow/tracking/_tracking_service/registry.py", line 49, in _get_store_with_resolved_uri
return builder(store_uri=resolved_store_uri, artifact_uri=artifact_uri)
File "/home/username/.local/lib/python3.8/site-packages/mlflow/server/handlers.py", line 110, in _get_sqlalchemy_store
from mlflow.store.tracking.sqlalchemy_store import SqlAlchemyStore
File "/home/username/.local/lib/python3.8/site-packages/mlflow/store/tracking/sqlalchemy_store.py", line 11, in <module>
from sqlalchemy.future import select
ModuleNotFoundError: No module named 'sqlalchemy.future'
This seems odd, because if I run pip freeze, sqlalchemy shows up, and if I do from sqlalchemy.future import select in a notebook, I get no error.
I think this may be related to using a virtual environment. The one I'm currently using is in /home/username/folder/mlflow/.mlflow, but mlflow seems to be looking elsewhere for the file...
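The traceback paths point at /home/username/.local/lib/python3.8/site-packages rather than the virtual environment, so one thing worth checking is which interpreter is running and where each package would be loaded from. The sketch below is a generic diagnostic using only the standard library; the helper name describe_module is made up, and the output depends entirely on your environment.

```python
import importlib.util
import sys

def describe_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("sqlalchemy: ", describe_module("sqlalchemy"))
print("mlflow:     ", describe_module("mlflow"))
```

Running this with the same interpreter that the mlflow command uses, and again inside the notebook, should show whether the two are picking up different installations.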
I resolved the issue by downgrading to a lower version of mlflow, (from v1.26.0 to v1.23.1).

GCP dataflow with python. "AttributeError: Can't get attribute '_JsonSink' on module 'dataflow_worker.start'"

I am new to GCP Dataflow.
I am trying to read text files (one-line JSON strings) from GCP Cloud Storage, parse them as JSON, split the records based on the value of a certain field, and write the output to GCP Cloud Storage (as text files of JSON strings).
Here is my code
However, I encounter an error on GCP Dataflow:
Traceback (most recent call last):
File "main.py", line 169, in <module>
run()
File "main.py", line 163, in run
shard_name_template='')
File "C:\ProgramData\Miniconda3\lib\site-packages\apache_beam\pipeline.py", line 426, in __exit__
self.run().wait_until_finish()
File "C:\ProgramData\Miniconda3\lib\site-packages\apache_beam\runners\dataflow\dataflow_runner.py", line 1346, in wait_until_finish
(self.state, getattr(self._runner, 'last_error_msg', None)), self)
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
self._load_main_session(self.local_staging_directory)
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
pickler.load_session(session_file)
File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
return dill.load_session(file_path)
File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
module = unpickler.load()
File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 474, in find_class
return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute '_JsonSink' on <module 'dataflow_worker.start' from '/usr/local/lib/python3.7/site-packages/dataflow_worker/start.py'>
I am able to run this script locally, but it fails when I try to use dataflowRunner
Please give me some suggestions.
PS. apache-beam version: 2.15.0
[Update1]
I tried @Yueyang Qiu's suggestion and added
pipeline_options.view_as(SetupOptions).save_main_session = True
The provided link says:
DoFn's in this workflow relies on global context (e.g., a module
imported at module level)
This link supports the suggestion above.
However, the same error occurred.
So I am wondering whether my implementation of _JsonSink (which inherits from filebasedsink.FileBasedSink) is wrong, or whether something else needs to be added.
Any opinion would be appreciated. Thank you all!
You have encountered a known issue: currently (as of the 2.17.0 release), Beam does not support super() calls in the main module on Python 3. Please take a look at the possible solutions in BEAM-6158. Udi's answer is a good way to address this until BEAM-6158 is resolved; this way you don't have to run your pipeline on Python 2.
Using the guidelines from here, I managed to get your example to run.
Directory structure:
./setup.py
./dataflow_json
./dataflow_json/dataflow_json.py (no change from your example)
./dataflow_json/__init__.py (empty file)
./main.py
setup.py:
import setuptools
setuptools.setup(
    name='dataflow_json',
    version='1.0',
    install_requires=[],
    packages=setuptools.find_packages(),
)
main.py:
from __future__ import absolute_import
from dataflow_json import dataflow_json
if __name__ == '__main__':
    dataflow_json.run()
and you run the pipeline with python main.py.
Basically what's happening is that the '--setup_file=./setup.py' flag tells Beam to create a package and install it on the Dataflow remote worker. The __init__.py file is required for setuptools to identify the dataflow_json/ directory as a package.
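The error message in the question is the one Python's pickling machinery raises when a class was pickled by reference to a module and that module, on the loading side, has no attribute with that name. The sketch below reproduces that failure with the standard pickle module only. It is a simplification: Beam uses dill and its own main-session handling, and the module name fake_main is made up to stand in for the pipeline's main module.

```python
import pickle
import sys
import types

# A stand-in for the module the class "lives in" when the pipeline is submitted.
fake_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = fake_main

class _JsonSink(object):
    pass

# Make the class look as if it were defined at the top level of fake_main.
_JsonSink.__module__ = "fake_main"
_JsonSink.__qualname__ = "_JsonSink"
fake_main._JsonSink = _JsonSink

# Instances are pickled by reference: only "fake_main" and "_JsonSink" are stored.
payload = pickle.dumps(_JsonSink())

# Simulate a worker where the module exists but does not define the class.
del fake_main._JsonSink
try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Moving the class into an installable package, as in the directory layout above, means the worker can import the real module by name instead of relying on a pickled copy of the main module.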
I finally found the problem:
my implementation of the '_JsonSink' class uses some Python 3 features.
However, I was not aware of which Python version 'DataflowRunner' was using.
(Actually, I have not figured out how to specify the Python version for the Dataflow runner on GCP. Any suggestions?)
Hence, I rewrote my code to be Python 2-compatible, and everything works fine!
Thanks to all of you!
Can you try setting option save_main_session = True as in here: https://github.com/apache/beam/blob/a2b0ad14f1525d1a645cb26f5b8ec45692d9d54e/sdks/python/apache_beam/examples/cookbook/coders.py#L88.

Google App Engine: No module named google.api

I have installed the latest version of the Google Cloud SDK and google-cloud-sdk-app-engine-python on my Ubuntu PC, as mentioned in the docs, in order to test the google-cloud-endpoints-framework sample app.
But on invoking an API request, I got the traceback below. It seems there is a conflict between the google package inside the GAE SDK and the google package installed automatically into the lib folder by the google-endpoints package.
$ dev_appserver.py app.yaml
INFO 2017-03-14 07:51:36,173 devappserver2.py:764] Skipping SDK update check.
INFO 2017-03-14 07:51:36,199 api_server.py:268] Starting API server at: http://localhost:44561
INFO 2017-03-14 07:51:36,213 dispatcher.py:199] Starting module "default" running at: http://localhost:8080
INFO 2017-03-14 07:51:36,213 admin_server.py:116] Starting admin server at: http://localhost:8000
INFO 2017-03-14 07:51:45,811 module.py:806] default: "GET /_ah/start HTTP/1.1" 404 -
ERROR 2017-03-14 07:51:45,877 wsgi.py:263]
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/gemini/gae projects/python-docs-samples/appengine/standard/endpoints-frameworks-v2/echo/main.py", line 19, in <module>
import endpoints
File "/home/gemini/gae projects/python-docs-samples/appengine/standard/endpoints-frameworks-v2/echo/lib/endpoints/__init__.py", line 29, in <module>
from apiserving import *
File "/home/gemini/gae projects/python-docs-samples/appengine/standard/endpoints-frameworks-v2/echo/lib/endpoints/apiserving.py", line 74, in <module>
from google.api.control import client as control_client
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 1001, in load_module
raise ImportError('No module named %s' % fullname)
ImportError: No module named google.api
I tried creating a separate virtualenv, but the problem still exists.
Here is the reply from someone at Google:
Local development with endpoints framework v2 isn't currently supported, you'll need to deploy the app.
https://github.com/GoogleCloudPlatform/python-docs-samples/issues/853
Your error:
ImportError: No module named google.api
So you first need to install the gcloud Python module and the google-api-python-client module with:
pip install --upgrade gcloud
pip install --upgrade google-api-python-client
from here
I had a similar issue with other Google packages in my lib directory.
I solved/monkey patched it using the following code in my appengine_config.py file:
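import logging
import os
import sys
import google
from google.appengine.ext import vendor
# Placeholder from the original answer: fill in the relative path to your lib directory.
lib_directory = os.path.dirname(__file__) + "<relative path to lib dir>"
# Let the SDK's google package also search the google folder vendored under lib.
google.__path__.append(os.path.join(lib_directory, 'google'))
logging.info("importing lib %s" % (lib_directory))
vendor.add(lib_directory)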
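The google.__path__.append(...) line relies on a general Python mechanism: a package's __path__ is the list of directories searched for its submodules, so adding a second directory lets one package name span two locations. The sketch below demonstrates this with throwaway directories; the package name demo_ns and its api submodule are made up and merely play the roles of google and google.api. Whether this is sufficient inside the dev_appserver sandbox is not something the sketch can show.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
sdk_pkg = os.path.join(root, "sdk", "demo_ns")   # plays the SDK's google package
lib_pkg = os.path.join(root, "lib", "demo_ns")   # plays lib/google
os.makedirs(sdk_pkg)
os.makedirs(lib_pkg)

# The "SDK" copy is a regular package; the "lib" copy holds the extra submodule.
open(os.path.join(sdk_pkg, "__init__.py"), "w").close()
with open(os.path.join(lib_pkg, "api.py"), "w") as f:
    f.write("NAME = 'from lib'\n")

sys.path.insert(0, os.path.join(root, "sdk"))
importlib.invalidate_caches()
demo_ns = importlib.import_module("demo_ns")

# Before patching, the submodule under lib is not visible through the package.
try:
    importlib.import_module("demo_ns.api")
    found_before = True
except ImportError:
    found_before = False

# After appending the lib copy to __path__, the submodule resolves.
demo_ns.__path__.append(lib_pkg)
api = importlib.import_module("demo_ns.api")
print(found_before, api.NAME)
```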

How do I install PDFNet python module?

I have been running in circles trying to figure out how to get my Django app to recognise the trial version of PDFNet:
https://pypi.python.org/pypi/PDFTron-PDFNet-SDK-for-Python/5.7
I tried adding the files to my ~/usr/bin directory, and I tried dropping them into my virtualenv's bin directory. Neither has worked. I have read all the documentation I can find. I am too new a Python developer to look at this package and know how to install it and use it in my project.
Please help!
Update: I attempted to create an app folder and list this component as an app in the app list, but when I run the code to start the application, I get the following error:
Library not loaded: @rpath/libPDFNetC.dylib
I placed all of the lib files into this folder within my project:
__init__.py (empty)
_PDFNetPython.so
libPDFNetC.dylib
PDFNetPython.py
PDFNetRuby.bundle
I used the following import code at the top of the py file I was attempting to use the component on:
import site
site.addsitedir("../pdfnetc")
from pdfnetc.PDFNetPython import *
I put the lib files into an app folder named pdfnetc. Once I had this, the import statements were no longer flagged as unresolved by PyCharm.
Here is the stack trace:
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 2217, in <module>
globals = debugger.run(setup['file'], None, None)
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1643, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Users/ntregillus/myapp/manage.py", line 12, in <module>
execute_from_command_line(sys.argv)
File "/Users/ntregillus/Envs/myapp/lib/python2.7/site-packages/django/core/management/__init__.py", line 385, in execute_from_command_line
utility.execute()
File "/Users/ntregillus/Envs/myapp/lib/python2.7/site-packages/django/core/management/__init__.py", line 354, in execute
django.setup()
File "/Users/ntregillus/Envs/myapp/lib/python2.7/site-packages/django/__init__.py", line 21, in setup
apps.populate(settings.INSTALLED_APPS)
File "/Users/ntregillus/Envs/myapp/lib/python2.7/site-packages/django/apps/registry.py", line 108, in populate
app_config.import_models(all_models)
File "/Users/ntregillus/Envs/myapp/lib/python2.7/site-packages/django/apps/config.py", line 202, in import_models
self.models_module = import_module(models_module_name)
File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/Users/ntregillus/myapp/statements/models.py", line 12, in <module>
from statements.managers import StatementTemplateManager, StatementManager
File "/Users/ntregillus/myapp/statements/managers.py", line 8, in <module>
from statements.utils import render_to_pdf, StatementContextBuilder
File "/Users/ntregillus/myapp/statements/utils.py", line 23, in <module>
from pdfnetc.PDFNetPython import *
File "/Users/ntregillus/myapp/pdfnetc/PDFNetPython.py", line 28, in <module>
_PDFNetPython = swig_import_helper()
File "/Users/ntregillus/myapp/pdfnetc/PDFNetPython.py", line 24, in swig_import_helper
_mod = imp.load_module('_PDFNetPython', fp, pathname, description)
ImportError: dlopen(/Users/ntregillus/myapp/pdfnetc/_PDFNetPython.so, 2): Library not loaded: @rpath/libPDFNetC.dylib
Referenced from: /Users/ntregillus/myapp/pdfnetc/_PDFNetPython.so
Reason: image not found
python -m pip install PDFNetPython3
home page link here
Below are steps to install the PDFNet module:
Download the Python and Ruby prebuilt binaries. Make sure you download the right architecture for your Python interpreter.
Extract the downloaded zip file and navigate to it.
Navigate to the /PDFNetC/Lib directory of the Python SDK download and execute:
> chmod a+x fix_rpaths.sh
./fix_rpaths.sh
I am not knowledgeable about Django, but take a close look at the sample .py files that are in the SDK. For example, take a look at the PDFNetC/Samples/AddImageTest/PYTHON folder. It contains a .py file whose first 4 lines of code indicate what needs to be loaded.
Please note that you need all the files that are in PDFNetC/Lib folder, including the PDFNetC.dll file.
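Since a single missing file in the Lib folder is enough to break the import, a small pre-flight check can help narrow things down. The sketch below is only an illustration: the helper missing_files is made up, and the file list is taken from the question's macOS listing, so it would need adjusting for other platforms (for example, the .dll mentioned above).

```python
import os
import tempfile

# File names taken from the question's listing; adjust for your platform.
REQUIRED = ["_PDFNetPython.so", "libPDFNetC.dylib", "PDFNetPython.py"]

def missing_files(lib_dir, required=REQUIRED):
    """Return the required file names that are not present in lib_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(lib_dir, name))]

# Demo against a throwaway directory that only contains one of the files.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "PDFNetPython.py"), "w").close()
print(missing_files(demo_dir))
```

Note that this only confirms the files exist; it says nothing about whether the dynamic loader can resolve libPDFNetC.dylib, which is what the fix_rpaths.sh step addresses.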
If this does not help, please post the exact (first) error message you get.
It works fine for me when I run the shell script. By the way, I am using Python as my programming language for PDFNet.
Try these commands to run PDFNet:
#!/bin/sh
TEST_NAME=PDFATest
export LD_LIBRARY_PATH=../../../PDFNetC/Lib
python -u $TEST_NAME.py
PDFATest is the name of the Python script, and the export makes the required libraries, such as PDFNet, available from the
../../Lib
folder. Make sure you use the files in that Lib folder when you run your code, and you should not have any issues with PDFNet. This answer is not specific to Django, but it should help you out.

Unable to start App Engine application after updating it via Google Cloud SDK

Recently, I updated Google App Engine from 1.9.17 to 1.9.18 via the Google Cloud SDK by using the command 'gcloud components update' on Windows 7 64-bit. After that, I wasn't able to start any project using the App Engine launcher. I get this error:
Traceback (most recent call last):
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\dev_appserver.py", line 83, in <module>
_run_file(__file__, globals())
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\dev_appserver.py", line 79, in _run_file
execfile(_PATHS.script_file(script_name), globals_)
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 36, in <module>
from google.appengine.tools.devappserver2 import dispatcher
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\dispatcher.py", line 29, in <module>
from google.appengine.tools.devappserver2 import module
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\module.py", line 71, in <module>
from google.appengine.tools.devappserver2 import vm_runtime_factory
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\vm_runtime_factory.py", line 25, in <module>
from google.appengine.tools.devappserver2 import vm_runtime_proxy
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\vm_runtime_proxy.py", line 29, in <module>
from google.appengine.tools.devappserver2 import log_manager
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\log_manager.py", line 34, in <module>
from google.appengine.tools.docker import containers
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\docker\containers.py", line 47, in <module>
import docker
ImportError: No module named docker
2015-03-05 19:11:27 (Process exited with code 1)
I even installed the latest Google Cloud SDK, but I'm getting the same error.
I'm able to install the App Engine SDK 1.9.18 (without using the Google Cloud SDK) and run the project successfully.
This error is happening only for the App Engine launcher installed via Google Cloud SDK in Windows 7.
This issue has been raised in the App Engine issue tracker: Issue 125. I recommend that you star this issue.
This happened to me today after reinstalling the App Engine SDK. I could not run my code in the launcher.
I remember reading that pip is not used with App Engine, but using it solved the problem for me.
In short, what I did was:
Install pip by following the steps at https://pip.pypa.io/en/latest/installing.html (this also correctly installs setuptools)
Install docker-py with pip: pip install docker-py. After that, I could run my code in the launcher.
P.S.
Previously I tried to install the docker-py package downloaded from https://github.com/docker/docker-py, but setuptools was missing, so installing the downloaded package did not work. Use pip instead.
This is currently an issue with the dev_appserver bundled in the Cloud SDK. A fix will be out soon. In the meanwhile, your options are:
1) Use gcloud preview app run to run your app when using the Cloud SDK
2) Install the standalone AppEngine SDK (which you already mentioned in your question)
If installing docker-py doesn't work and the stacktrace shows that the error line is:
from docker import docker
Change this line to:
import docker
Source
Jumping on the answer from @Tzach and adding some info.
The file to modify is containers.py
For me it is located here:
C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\docker
If you can't modify it because the file appears to be open in an application, the actual cause is that the folder is protected. Just copy the file to your desktop, modify it there, and then copy it back into the original folder.
