How to install libraries python-docx / docx on Google Cloud Shell? - python

I work with python-docx and docx on my PC, but after I cloned the project to Google Cloud, problems arose.
Both docx and python-docx are installed there:
igorsavinkin555@cloudshell:~/corrections-msword (coral-heuristic-5610)$ pip install docx --user
Requirement already satisfied: docx in /home/igorsavinkin555/.local/lib/python2.7/site-packages (0.2.4)
Requirement already satisfied: lxml in /home/igorsavinkin555/.local/lib/python2.7/site-packages (from docx) (4.2.4)
Requirement already satisfied: Pillow>=2.0 in /home/igorsavinkin555/.local/lib/python2.7/site-packages (from docx) (5.2.0)
igorsavinkin555@cloudshell:~/corrections-msword (coral-heuristic-215610)$ pip install python-docx --user
Requirement already satisfied: python-docx in /home/igorsavinkin555/.local/lib/python2.7/site-packages (0.8.7)
Requirement already satisfied: lxml>=2.3.2 in /home/igorsavinkin555/.local/lib/python2.7/site-packages (from python-docx) (4.2.4)
igorsavinkin555@cloudshell:~/corrections-msword (coral-heuristic-215610)$
Problem with docx.Document:
igorsavinkin555@cloudshell:~/corrections-msword (coral-heuristic-215610)$ dev_appserver.py $PWD
...
INFO 2018-09-07 14:31:48,503 api_server.py:275] Starting API server at: http://0.0.0.0:41739
INFO 2018-09-07 14:31:48,518 dispatcher.py:270] Starting module "default" running at: http://0.0.0.0:8080
INFO 2018-09-07 14:31:48,519 admin_server.py:152] Starting admin server at: http://0.0.0.0:8000
INFO 2018-09-07 14:31:50,533 instance.py:294] Instance PID: 727
ERROR 2018-09-07 14:32:00,844 wsgi.py:263]
Traceback (most recent call last):
File "/google/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/google/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/google/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/igorsavinkin555/corrections-msword/main.py", line 2, in <module>
from docx.document import Document
ImportError: No module named docx.document
Update
Installing those third-party libraries into the project's lib folder helped: the packages are now found in project/lib. However, the error has now moved to the lxml library:
File "/home/igorsavinkin555/corrections-msword/main.py", line 2, in <module>
from docx.document import Document
File "/home/igorsavinkin555/corrections-msword/lib/docx/__init__.py", line 3, in <module>
from docx.api import Document # noqa
File "/home/igorsavinkin555/corrections-msword/lib/docx/api.py", line 14, in <module>
from docx.package import Package
File "/home/igorsavinkin555/corrections-msword/lib/docx/package.py", line 11, in <module>
from docx.opc.package import OpcPackage
File "/home/igorsavinkin555/corrections-msword/lib/docx/opc/package.py", line 12, in <module>
from .part import PartFactory
File "/home/igorsavinkin555/corrections-msword/lib/docx/opc/part.py", line 12, in <module>
from .oxml import serialize_part_xml
File "/home/igorsavinkin555/corrections-msword/lib/docx/opc/oxml.py", line 12, in <module>
from lxml import etree
File "/google/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/runtime/sandbox.py", line 1095, in load_module
raise ImportError('No module named %s' % fullname)
ImportError: No module named lxml.etree
By the way, this was the right approach, since:
You can install additional software packages on the Google Cloud Shell virtual machine instance, but the installation will not persist after the instance terminates unless you install the software in your $HOME directory
(source).

In Google Cloud Shell, only data stored in the $HOME directory persists. After 60 minutes of inactivity the VM instance is terminated; the next time you access Cloud Shell, a new VM instance is provisioned from the image.
However, Google Cloud Shell provides 5 GB of persistent disk storage located in the $HOME directory. It does not expire with the instance, but it may be recycled; the user receives an email before the recycling process.
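When pip reports a package as installed but the import still fails, it can help to ask the interpreter where (or whether) it would load the module from. The following is a minimal diagnostic sketch, not part of the original answers. It assumes Python 3 (importlib.util.find_spec); the helper name locate is made up for illustration. Note that it reports what the plain interpreter sees, which may differ from what the dev_appserver sandbox allows.

```python
import importlib.util


def locate(module_name):
    """Return the origin (usually a file path) a module would load from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "docx" in "docx.document") is missing.
        return None
    if spec is None:
        return None
    return spec.origin


# Module names taken from the tracebacks in the question.
for name in ("docx", "docx.document", "lxml.etree"):
    print(name, "->", locate(name))
```

A result of None for docx.document alongside a path for docx would suggest that the wrong docx distribution is the one being picked up.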

Related

Google Python cloud-dataflow instances broke without new deployment (failed pubsub import)

I have defined a few different Cloud Dataflow jobs for Python in the Google AppEngine Flex Environment. I have defined my requirements in a requirements.txt file, included my setup.py file, and everything was working just fine. My last deployment was on May 3rd, 2018. Looking through logs, I see that one of my jobs began failing on May 22nd, 2018. The job fails with a stack trace resulting from a bad import, seen below.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 582, in do_work
work_executor.execute()
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 166, in execute
op.start()
File "apache_beam/runners/worker/operations.py", line 294, in apache_beam.runners.worker.operations.DoOperation.start (apache_beam/runners/worker/operations.c:10607)
def start(self):
File "apache_beam/runners/worker/operations.py", line 295, in apache_beam.runners.worker.operations.DoOperation.start (apache_beam/runners/worker/operations.c:10501)
with self.scoped_start_state:
File "apache_beam/runners/worker/operations.py", line 300, in apache_beam.runners.worker.operations.DoOperation.start (apache_beam/runners/worker/operations.c:9702)
pickler.loads(self.spec.serialized_fn))
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 225, in loads
return dill.loads(s)
File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in loads
return load(file)
File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in load
obj = pik.load()
File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1090, in load_global
klass = self.find_class(module, name)
File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 423, in find_class
return StockUnpickler.find_class(self, module, name)
File "/usr/lib/python2.7/pickle.py", line 1124, in find_class
__import__(module)
File "/usr/local/lib/python2.7/dist-packages/dataflow_pipeline/tally_overages.py", line 27, in <module>
from google.cloud import pubsub
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub.py", line 17, in <module>
from google.cloud.pubsub_v1 import PublisherClient
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/__init__.py", line 17, in <module>
from google.cloud.pubsub_v1 import types
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/types.py", line 26, in <module>
from google.iam.v1.logging import audit_data_pb2
ImportError: No module named logging
So the main issue seems to come from the pubsub dependency relying on importing google.iam.v1.logging, which is installed from grpc-google-iam-v1.
Here is my requirements.txt file
Flask==0.12.2
apache-beam[gcp]==2.1.1
gunicorn==19.7.1
google-cloud-dataflow==2.1.1
google-cloud-datastore==1.3.0
pytz
google-cloud-pubsub
google-gax
grpc-google-iam-v1
googleapis-common-protos
google-cloud==0.32
six==1.10.0
protobuf
I am able to run everything locally just fine by doing the following from my project.
$ virtualenv --no-site-packages .
$ . bin/activate
$ pip install --ignore-installed -r requirements.txt
$ python main.py
No handlers could be found for logger "oauth2client.contrib.multistore_file"
INFO:werkzeug: * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
INFO:werkzeug: * Restarting with stat
No handlers could be found for logger "oauth2client.contrib.multistore_file"
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 317-820-645
Specifically, I am able to do the following locally just fine:
$ python
>>> from google.cloud import pubsub
>>> import google.iam.v1.logging
>>> google.iam.v1.logging.__file__
'/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/google/iam/v1/logging/__init__.pyc'
So I know that the installation of the grpc-google-iam-v1 package works locally: the required files are there.
My questions are
Why is the install of grpc-google-iam-v1 on the Google AppEngine Flex Environment not installing all of the files correctly? I must be missing the /site-packages/google/iam/v1/logging directory.
Why would this randomly start failing? I didn't do any more deploys; the same code was running and working on the 21st and then it broke on the 22nd of May.
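One way to investigate a difference between the local and deployed environments is to log the installed versions of the relevant distributions in both places and compare them. This is a hedged sketch, not something from the original post. It assumes Python 3.8+ (importlib.metadata); on Python 2.7, pkg_resources.get_distribution(name).version plays a similar role. The helper names are made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(dist_names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


# Distribution names taken from the requirements.txt in the question.
print(report(["google-cloud-pubsub", "grpc-google-iam-v1", "apache-beam"]))
```

Running the same report locally and on a worker would show whether an unpinned dependency resolved to a different version in the two environments.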
I was able to get the pipeline running again after changing the requirements.txt file to
Flask==0.12.2
apache-beam[gcp]
google-cloud-dataflow
gunicorn==19.7.1
google-cloud-datastore==1.3.0
pytz
google-cloud-pubsub
google-gax
grpc-google-iam-v1
googleapis-common-protos
google-cloud==0.32
six==1.10.0
protobuf
So simply removing the version requirements from apache-beam[gcp] and google-cloud-dataflow did the trick.
Building on the solution provided by John Allard: removing the version pin from requirements.txt makes pip default to the latest version. Thus, with no version specified for apache-beam[gcp], google-cloud-dataflow and google-cloud-pubsub, they will all run on the latest version, which solves the dependency issue. The requirements.txt will look like the following:
Flask==0.12.2
apache-beam[gcp]
gunicorn==19.7.1
google-cloud-dataflow
google-cloud-datastore==1.3.0
pytz
google-cloud-pubsub
google-gax
grpc-google-iam-v1
googleapis-common-protos
google-cloud==0.32
six==1.10.0
protobuf
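Since the fix hinges on which entries are pinned and which float to the latest release, it can be useful to list the two groups explicitly. The sketch below is a deliberately naive parser, written for illustration only: it recognises only == pins and ignores other specifiers (>=, ~=), environment markers and pip options. The function name split_pins is an assumption, not an existing tool.

```python
def split_pins(lines):
    """Split requirement lines into ({name: version} for '==' pins, [unpinned entries])."""
    pinned, unpinned = {}, []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
        else:
            unpinned.append(line)
    return pinned, unpinned


# A few entries from the working requirements.txt above.
pinned, unpinned = split_pins([
    "Flask==0.12.2",
    "apache-beam[gcp]",
    "google-cloud-dataflow",
    "google-cloud-pubsub",
    "six==1.10.0",
])
print("pinned:", pinned)
print("floating:", unpinned)
```

Every entry in the floating group can change version between two otherwise identical builds, which is one plausible way for a job to break without a new deployment.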

Google App Engine: No module named google.api

I have installed the latest version of the Google Cloud SDK and google-cloud-sdk-app-engine-python on my Ubuntu PC, as mentioned in the docs, in order to test the google-cloud-endpoints-framework sample app.
But on invoking an api request, I got the below traceback. Seems like there is a conflict between google package inside GAE sdk and the google package installed automatically to the lib folder because of google-endpoints package.
$ dev_appserver.py app.yaml
INFO 2017-03-14 07:51:36,173 devappserver2.py:764] Skipping SDK update check.
INFO 2017-03-14 07:51:36,199 api_server.py:268] Starting API server at: http://localhost:44561
INFO 2017-03-14 07:51:36,213 dispatcher.py:199] Starting module "default" running at: http://localhost:8080
INFO 2017-03-14 07:51:36,213 admin_server.py:116] Starting admin server at: http://localhost:8000
INFO 2017-03-14 07:51:45,811 module.py:806] default: "GET /_ah/start HTTP/1.1" 404 -
ERROR 2017-03-14 07:51:45,877 wsgi.py:263]
Traceback (most recent call last):
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
handler, path, err = LoadObject(self._handler)
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
obj = __import__(path[0])
File "/home/gemini/gae projects/python-docs-samples/appengine/standard/endpoints-frameworks-v2/echo/main.py", line 19, in <module>
import endpoints
File "/home/gemini/gae projects/python-docs-samples/appengine/standard/endpoints-frameworks-v2/echo/lib/endpoints/__init__.py", line 29, in <module>
from apiserving import *
File "/home/gemini/gae projects/python-docs-samples/appengine/standard/endpoints-frameworks-v2/echo/lib/endpoints/apiserving.py", line 74, in <module>
from google.api.control import client as control_client
File "/usr/lib/google-cloud-sdk/platform/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 1001, in load_module
raise ImportError('No module named %s' % fullname)
ImportError: No module named google.api
I tried creating a separate virtualenv, but the problem still exists.
Here is the reply from someone at Google:
Local development with endpoints framework v2 isn't currently supported, you'll need to deploy the app.
https://github.com/GoogleCloudPlatform/python-docs-samples/issues/853
Your error:
ImportError: No module named google.api
So you first need to install the gcloud Python module and the google-api-python-client module with:
pip install --upgrade gcloud
pip install --upgrade google-api-python-client
I had a similar issue with other Google packages in my lib directory.
I solved it with a monkey patch, using the following code in my appengine_config.py file:
import logging
import os
import sys
import google
from google.appengine.ext import vendor
# Replace the placeholder with the relative path to your lib directory.
lib_directory = os.path.dirname(__file__) + "<relative path to lib dir>"
# Let the SDK's "google" package also search the "google" folder inside lib.
google.__path__.append(os.path.join(lib_directory, 'google'))
logging.info("importing lib %s" % (lib_directory))
vendor.add(lib_directory)
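The key line in that workaround is the append to google.__path__. The mechanism can be demonstrated in isolation with throwaway directories: a package found in one location gains access to modules stored in a second location once that location is appended to its __path__. This is an illustrative sketch in Python 3. The package name demo_ns_pkg and the directory layout are made up, and nothing here touches the real google package or the App Engine SDK.

```python
import importlib
import os
import sys
import tempfile


def demo_path_extension():
    """Show that appending to a package's __path__ exposes modules from a second directory."""
    root = tempfile.mkdtemp()
    sdk_pkg = os.path.join(root, "sdk", "demo_ns_pkg")  # stands in for the SDK's google/
    lib_pkg = os.path.join(root, "lib", "demo_ns_pkg")  # stands in for lib/google/
    os.makedirs(sdk_pkg)
    os.makedirs(lib_pkg)
    with open(os.path.join(sdk_pkg, "__init__.py"), "w") as f:
        f.write("")
    with open(os.path.join(lib_pkg, "extra.py"), "w") as f:
        f.write("VALUE = 'from lib'\n")

    # Only the "sdk" copy is on sys.path, so only it is found at first.
    sys.path.insert(0, os.path.join(root, "sdk"))
    importlib.invalidate_caches()
    pkg = importlib.import_module("demo_ns_pkg")
    try:
        importlib.import_module("demo_ns_pkg.extra")
        found_before = True
    except ImportError:
        found_before = False

    # Same idea as google.__path__.append(...) in the workaround above.
    pkg.__path__.append(lib_pkg)
    importlib.invalidate_caches()
    extra = importlib.import_module("demo_ns_pkg.extra")
    return found_before, extra.VALUE
```

Calling demo_path_extension() once in a fresh interpreter returns (False, 'from lib'): the submodule is not importable before the append and is importable after it.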

Cannot get gcloud to work with Python and Pycharm

I am trying to connect to the Google App Engine Datastore from my local machine. I have spent all day digging in to this without any luck.
I have tried the approach here (as well as a lot of other suggestions from SO such as Using gcloud-python in GAE and Unable to run dev_appserver.py with gcloud):
How to access a remote datastore when running dev_appserver.py?
I first installed gcloud based on this description from google:
https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27
According to the description I should add the following to my appengine_config.py:
from google.appengine.ext import vendor
vendor.add('lib')
If I do that I get an error saying ImportError: No module named gcloud
If I then move the code to my main.py, it seems to pick up the lib folder and the modules there. That seems a bit strange to me, since I thought appengine_config was run first to make sure things were initialised.
But now I am getting the following stack trace:
ERROR 2016-09-23 17:22:30,623 cgi.py:122] Traceback (most recent call last):
File "/Users/thomasd/Documents/github/myapp/main.py", line 10, in <module>
from gcloud import datastore
File "/Users/thomasd/Documents/github/myapp/lib/gcloud/__init__.py", line 17, in <module>
from pkg_resources import get_distribution
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2985, in <module>
@_call_aside
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2971, in _call_aside
f(*args, **kwargs)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 3013, in _initialize_master_working_set
dist.activate(replace=False)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2544, in activate
declare_namespace(pkg)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2118, in declare_namespace
_handle_ns(packageName, path_item)
File "/Users/thomasd/Documents/github/myapp/lib/pkg_resources/__init__.py", line 2057, in _handle_ns
loader.load_module(packageName)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pkgutil.py", line 246, in load_module
mod = imp.load_module(fullname, self.file, self.filename, self.etc)
File "/Library/Python/2.7/site-packages/google/cloud/logging/__init__.py", line 18, in <module>
File "/usr/local/google_appengine/google/appengine/tools/devappserver2/python/sandbox.py", line 999, in load_module
raise ImportError('No module named %s' % fullname)
ImportError: No module named google.cloud.logging.client
What am I doing wrong here?
The google-cloud library does not work on App Engine, and most likely you don't even need it, since you can use the built-in functionality.
From the official docs you can use it like this:
import cloudstorage as gcs
I solved it this way:
1.) Create a lib folder in your project path.
2.) Install the gcloud libraries by running the following command in a terminal from your project path:
pip install -t lib gcloud
3.) Create an appengine_config.py module in your project and add the following lines of code:
import sys
import os.path
# Put the vendored lib folder at the front of the import path.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'lib'))
4.) After this, you can import like this:
from gcloud import datastore
5.) To save data to the live Google Datastore from your local machine:
client = datastore.Client("project-id")
key = client.key('Person')
entity = datastore.Entity(key=key)
entity['name'] = 'ashish'  # string values must be quoted
entity['age'] = 23
client.put(entity)
This saves an entity of kind Person with the properties name and age. Do not forget to specify your correct project id.
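Step 3 above can be wrapped in a small helper that checks the folder exists and avoids inserting the same path twice if the config module is loaded more than once. This is a sketch under assumptions: the function name add_lib_dir is hypothetical, and the folder name lib simply follows the steps above.

```python
import os
import sys


def add_lib_dir(base_dir, name="lib"):
    """Put <base_dir>/<name> at the front of sys.path once and return its path."""
    lib_dir = os.path.join(os.path.abspath(base_dir), name)
    if not os.path.isdir(lib_dir):
        raise OSError("vendored library folder not found: %s" % lib_dir)
    if lib_dir not in sys.path:
        # Front of the path, so vendored packages win over system-wide copies.
        sys.path.insert(0, lib_dir)
    return lib_dir
```

In appengine_config.py this would be called with os.path.dirname(__file__) as base_dir, before any import of the vendored packages.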
Old question, but this may be worth including:
I'm unsure of the state of your requirements.txt file, but I went through mine and noticed that setuptools was not included (see the related question "pip freeze doesn't export setuptools").
Assuming you're following the tutorial, you likely installed all of those libraries to lib EXCEPT setuptools.
I added setuptools=={versionnumber} to requirements.txt and that fixed this related issue for me.

Shapely install on Mac with pyenv-virtualenv failed

OS: mac, El Capitan
Python version: 2.7.11
I tried to install shapely after setting up a brand new pyenv-virtualenv and changing my directory to use that 'pyenv version' and I get the following error:
dhcp-uris-1290:seam Chelsea$ pip install shapely
Collecting shapely
Using cached Shapely-1.5.15.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/1l/r_vh6y997rdgk6sdnkqsntpr0000gn/T/pip-build-odrxBR/shapely/setup.py", line 38, in <module>
from shapely._buildcfg import geos_version_string, geos_version, \
File "/private/var/folders/1l/r_vh6y997rdgk6sdnkqsntpr0000gn/T/pip-build-odrxBR/shapely/shapely/_buildcfg.py", line 88, in <module>
clibs = get_geos_config('--clibs')
File "/private/var/folders/1l/r_vh6y997rdgk6sdnkqsntpr0000gn/T/pip-build-odrxBR/shapely/shapely/_buildcfg.py", line 71, in get_geos_config
stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
File "/Users/Chelsea/.pyenv/versions/2.7.11/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/Users/Chelsea/.pyenv/versions/2.7.11/lib/python2.7/subprocess.py", line 1334, in _execute_child
child_exception = pickle.loads(data)
File "/Users/Chelsea/.pyenv/versions/2.7.11/lib/python2.7/pickle.py", line 1388, in loads
return Unpickler(file).load()
File "/Users/Chelsea/.pyenv/versions/2.7.11/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/Users/Chelsea/.pyenv/versions/2.7.11/lib/python2.7/pickle.py", line 972, in load_string
raise ValueError, "insecure string pickle"
ValueError: insecure string pickle
-----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/1l/r_vh6y997rdgk6sdnkqsntpr0000gn/T/pip-build-odrxBR/shapely/
The whole reason I got into this virtualenv business is that I was getting insecure string pickle errors when trying to use matplotlib with system Python on El Capitan, which is a known issue (https://github.com/matplotlib/matplotlib/issues/5314). I tried to build my own new framework and run it, which didn't work, and have now tried to set up a virtualenv, but it looks like virtualenv might just not play nicely with certain Python packages? :'(
Shapely installs correctly OUTSIDE of a virtualenv on my machine, with system Python. Sadly, system Python, as mentioned, was not playing nicely with matplotlib.
P.S. these are the packages I have installed in this particular pyenv-virtualenv:
cycler (0.10.0)
distribute (0.7.3)
matplotlib (1.5.1)
numpy (1.11.0)
pip (8.1.2)
pyparsing (2.1.4)
python-dateutil (2.5.3)
pytz (2016.4)
setuptools (21.1.0)
six (1.10.0)
wheel (0.29.0)
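The traceback shows setup.py failing inside get_geos_config while it spawns a subprocess. Assuming the command being spawned is geos-config (the name is inferred from the function name and is not shown in the traceback), a quick check of whether that executable is visible on PATH inside the virtualenv might narrow things down. This is a diagnostic sketch in Python 3 (shutil.which), not a confirmed fix; the helper name find_tool is made up.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


# "geos-config" is an assumption based on the get_geos_config frame in the traceback.
path = find_tool("geos-config")
if path is None:
    print("geos-config not found on PATH in this environment")
else:
    print("geos-config found at", path)
```

If the tool is found outside the virtualenv but not inside it, the difference in PATH between the two shells would be the next thing to compare.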

Unable to start App Engine application after updating it via Google Cloud SDK

Recently, I have updated Google App Engine from 1.9.17 to 1.9.18 via Google Cloud SDK by using command 'gcloud components update' in Windows 7 64 bit. After that I wasn't able to start any project using the App Engine launcher. Getting this error:
Traceback (most recent call last):
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\dev_appserver.py", line 83, in <module>
_run_file(__file__, globals())
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\dev_appserver.py", line 79, in _run_file
execfile(_PATHS.script_file(script_name), globals_)
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 36, in <module>
from google.appengine.tools.devappserver2 import dispatcher
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\dispatcher.py", line 29, in <module>
from google.appengine.tools.devappserver2 import module
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\module.py", line 71, in <module>
from google.appengine.tools.devappserver2 import vm_runtime_factory
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\vm_runtime_factory.py", line 25, in <module>
from google.appengine.tools.devappserver2 import vm_runtime_proxy
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\vm_runtime_proxy.py", line 29, in <module>
from google.appengine.tools.devappserver2 import log_manager
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\log_manager.py", line 34, in <module>
from google.appengine.tools.docker import containers
File "C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\docker\containers.py", line 47, in <module>
import docker
ImportError: No module named docker
2015-03-05 19:11:27 (Process exited with code 1)
I even installed the latest Google Cloud SDK, but I'm getting the same error.
I am able to install the App Engine SDK 1.9.18 (without using the Google Cloud SDK) and run the project successfully.
This error is happening only for the App Engine launcher installed via Google Cloud SDK in Windows 7.
This issue is raised in App Engine Issue Tracker: Issue 125. I recommend you to star this issue.
This happened to me today after reinstalling the App Engine SDK: I could not run my code in the launcher.
I remember reading that pip is not used with App Engine, but I have now solved the problem.
In short, what I did was:
Install pip by following the steps at https://pip.pypa.io/en/latest/installing.html (this also installs setuptools correctly)
Install docker-py with pip: pip install docker-py. After that, I can run my code in the launcher.
P.S.
Previously I tried to install the docker-py package downloaded from https://github.com/docker/docker-py, but setuptools was missing, so installing the downloaded package did not work. So install it with pip instead.
This is currently an issue with the dev_appserver bundled in the Cloud SDK. A fix will be out soon. In the meanwhile, your options are:
1) Use gcloud preview app run to run your app when using the Cloud SDK
2) Install the standalone AppEngine SDK (which you already mentioned in your question)
If installing docker-py doesn't work and the stacktrace shows that the error line is:
from docker import docker
Change this line to:
import docker
Jumping on the answer from @Tzach and adding some info.
The file to modify is containers.py
For me it is located here:
C:\Program Files\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\docker
If you can't modify it and the editor reports that the file is open in another application, the actual cause is that the folder is protected. Copy the file to your desktop, modify it there, then copy it back to the original folder.
