Python executable created with PyInstaller isn't working - python

I compiled a Python project with PyInstaller into a single executable. Does anyone have an idea why it works when I run the Python project directly, but fails when I execute the executable?
My dependencies are listed below.
Package Version Latest
Pillow 9.3.0 9.3.0
altgraph 0.17.3 0.17.3
certifi 2022.9.24 2022.9.24
charset-normalizer 2.1.1 3.0.1
contourpy 1.0.6 1.0.6
cycler 0.11.0 0.11.0
fonttools 4.38.0 4.38.0
idna 3.4 3.4
kiwisolver 1.4.4 1.4.4
matplotlib 3.6.2 3.6.2
numpy 1.23.5 1.23.5
nvidia-cublas-cu11 11.10.3.66 11.11.3.6
nvidia-cuda-nvrtc-cu11 11.7.99 11.8.89
nvidia-cuda-runtime-cu11 11.7.99 11.8.89
nvidia-cudnn-cu11 8.5.0.96 8.6.0.163
opencv-python 4.6.0.66 4.6.0.66
packaging 21.3 21.3
pip 21.3.1 22.3.1
pyinstaller 5.6.2 5.6.2
pyinstaller-hooks-contrib 2022.13 2022.13
pyparsing 3.0.9 3.0.9
python-dateutil 2.8.2 2.8.2
requests 2.28.1 2.28.1
setuptools 60.2.0 65.6.3
six 1.16.0 1.16.0
torch 1.13.0 1.13.0
torchvision 0.14.0 0.14.0
typing-extensions 4.4.0 4.4.0
urllib3 1.26.13 1.26.13
wheel 0.38.4 0.38.4
I executed it and got these errors:
[19507] WARNING: file already exists but should not: /tmp/_MEIF8RVEq/torch/_C.cpython-39-x86_64-linux-gnu.so
[19507] WARNING: file already exists but should not: /tmp/_MEIF8RVEq/torch/_C_flatbuffer.cpython-39-x86_64-linux-gnu.so
torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
torch/_jit_internal.py:839: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x7fc0c005b8b0>.
warnings.warn(
torch/_jit_internal.py:839: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x7fc0c006fb80>.
warnings.warn(
torch/serialization.py:834: UserWarning: Couldn't retrieve source code for container of type RRDBNet. It won't be checked for correctness upon loading.
warnings.warn("Couldn't retrieve source code for container of "
torch/serialization.py:868: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
torch/serialization.py:868: SourceChangeWarning: source code of class 'torch.nn.modules.container.Sequential' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
torch/serialization.py:834: UserWarning: Couldn't retrieve source code for container of type RRDB. It won't be checked for correctness upon loading.
warnings.warn("Couldn't retrieve source code for container of "
torch/serialization.py:834: UserWarning: Couldn't retrieve source code for container of type ResidualDenseBlock. It won't be checked for correctness upon loading.
warnings.warn("Couldn't retrieve source code for container of "
torch/serialization.py:868: SourceChangeWarning: source code of class 'torch.nn.modules.activation.LeakyReLU' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
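Note that everything quoted above is a warning rather than a hard error, so the actual failure output, if any, is worth including as well. That said, a common first step with torch-heavy PyInstaller builds is to collect the torch package wholesale. A minimal sketch of such a build command, where the entry-point name main.py is an assumption rather than something from the original post:

pyinstaller --onefile --collect-all torch main.py

The --collect-all flag bundles torch's submodules, data files, and shared libraries into the executable.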

Related

I keep getting ModuleNotFoundError; where is the issue?

So, I'm trying to install and use the google-images-download repo, both through pip install google-images-download and pip install git+https://github.com/Joeclinton1/google-images-download.git.
I've tried installing it as superuser as well. In PyCharm, when I view packages, I do see it, but when I try this code:
from google_images_download import google_images_download

# instantiate the class
response = google_images_download.googleimagesdownload()
arguments = {"keywords": "aeroplane, school bus, dog in front of house",
             "limit": 10, "print_urls": False}
paths = response.download(arguments)
# print complete paths to the downloaded images
print(paths)
it gives this error continuously:
Traceback (most recent call last):
  File "/Users/*x*/Desktop/SchoolPython/PythonUVA/Webscrape.py", line 1, in <module>
    from google_images_download import google_images_download
ModuleNotFoundError: No module named 'google_images_download'
I think it might not be looking in the right filepath or library but any other repo I tried previously did work.
Any help is greatly appreciated.
*edit for versions
(3.9UVA) MacBook-Pro-van-Flavia:Webscrape.py flavia$ which pip
/Users/flavia/PycharmProjects/3.9UVA/bin/pip
(3.9UVA) MacBook-Pro-van-Flavia:Webscrape.py flavia$ which python
/Users/flavia/PycharmProjects/3.9UVA/bin/python
(3.9UVA) MacBook-Pro-van-Flavia:Webscrape.py flavia$ pip list
Package Version
---------------------- -----------
async-generator 1.10
attrs 21.4.0
certifi 2022.5.18.1
cffi 1.15.0
charset-normalizer 2.0.12
cryptography 37.0.2
google-images-download 2.8.0
h11 0.13.0
idna 3.3
outcome 1.1.0
Pillow 9.1.1
pip 21.3.1
pycparser 2.21
pyOpenSSL 22.0.0
PySocks 1.7.1
requests 2.27.1
selenium 4.2.0
setuptools 60.2.0
sniffio 1.2.0
sortedcontainers 2.4.0
trio 0.20.0
trio-websocket 0.9.2
urllib3 1.26.9
wheel 0.37.1
wsproto 1.1.0
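One generic check worth adding here (my suggestion, not from the original post): install the package with the exact interpreter the script runs under, and then ask Python where it resolves the module from:

python -m pip install git+https://github.com/Joeclinton1/google-images-download.git
python -c "import google_images_download; print(google_images_download.__file__)"

If the second command raises ModuleNotFoundError even though pip list shows the package, the pip and python on your PATH belong to different environments.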

Why can't the import be resolved?

I've seen several answers to this question, although none of the solutions have worked for my particular situation. I'm trying to get started building an API with Flask. When I try to import Flask-RESTful, I get an error in VS Code. For context, I am using Windows 11. Here are the first two lines of my .py file:
from flask import Flask
from flask_restful import Resource, Api, reqparse
The error I get reads as:
Import "flask_restful" could not be resolved Pylance(reportMissingImports)
Now, to add more context: I've checked to make sure the interpreter path is set, using Ctrl+Shift+P to open the Command Palette and selecting the correct (and only) Python interpreter for the project inside my virtual environment. When I run pip list, I get this output:
(api) C:\Users\<Username>\OneDrive\Documents\PythonProjects\api>pip list
Package Version
----------------------- ---------
aiohttp 3.8.1
aiosignal 1.2.0
alembic 1.8.0
aniso8601 9.0.1
anyio 3.6.1
async-timeout 4.0.2
attrs 21.4.0
bleach 5.0.1
certifi 2022.6.15
charset-normalizer 2.1.0
click 8.1.3
click-log 0.4.0
colorama 0.4.5
deprecation 2.1.0
docutils 0.19
dotty-dict 1.3.0
Flask 2.1.2
Flask-Migrate 3.1.0
Flask-RESTful 0.3.9
Flask-SQLAlchemy 2.5.1
flask-swagger 0.2.14
frozenlist 1.3.0
gitdb 4.0.9
GitPython 3.1.27
gotrue 0.5.0
greenlet 1.1.2
h11 0.12.0
httpcore 0.14.7
httpx 0.21.3
idna 3.3
importlib-metadata 4.12.0
invoke 1.7.1
itsdangerous 2.1.2
Jinja2 3.1.2
keyring 23.6.0
Mako 1.2.1
MarkupSafe 2.1.1
multidict 6.0.2
packaging 21.3
pip 22.0.4
pkginfo 1.8.3
postgrest-py 0.10.2
psycopg2 2.9.3
pydantic 1.9.1
Pygments 2.12.0
pyparsing 3.0.9
python-dateutil 2.8.2
python-gitlab 3.6.0
python-semantic-release 7.28.1
pytz 2022.1
pywin32-ctypes 0.2.0
PyYAML 6.0
readme-renderer 35.0
realtime 0.0.4
requests 2.28.1
requests-toolbelt 0.9.1
rfc3986 1.5.0
semver 2.13.0
setuptools 58.1.0
setuptools-scm 7.0.4
six 1.16.0
smmap 5.0.0
sniffio 1.2.0
SQLAlchemy 1.4.39
storage3 0.3.4
supabase 0.5.8
supabase-client 0.2.4
tomli 2.0.1
tomlkit 0.10.2
tqdm 4.64.0
twine 3.8.0
typing_extensions 4.3.0
urllib3 1.26.10
webencodings 0.5.1
websockets 9.1
Werkzeug 2.1.2
wheel 0.37.1
yarl 1.7.2
zipp 3.8.0
Why would the from flask import Flask import work, but not flask_restful? I can see both in the Lib\site-packages folder in my project directory, and the output from pip list outside the virtual environment is different, which signals to me that there isn't an issue with the path or directories.
EDIT: I forgot to mention that when I run the code using Ctrl + Alt + N, I get this output:
Traceback (most recent call last):
  File "c:\Users\<Username>\OneDrive\Documents\PythonProjects\api\api.py", line 3, in <module>
    from flask_restful import Resource, Api, reqparse
ModuleNotFoundError: No module named 'flask_restful'
Again, no errors with importing flask, only with flask_restful.
Any help with this will be greatly appreciated! Thank you in advance for your time. I'm happy to provide more info if needed. Thanks.
EDIT: I have updated pip and attempted to simply run the program inside the command prompt. This is what I got. I'm still getting the import error inside VS Code, though. I am going to see if using a different version of Python makes a difference. Thanks everyone for all of your help so far, I appreciate it!
EDIT: Okay, it seems like the issue is a little closer to being solved. So, I updated pip. I retried setting the interpreter path and, as some of you mentioned, it turns out that I'd been doing it wrong. I had to do Ctrl + Shift + P >> Python: Select Interpreter >> Enter interpreter path and select the correct path that way. I did this by going into the project directory, going to the Scripts folder, and selecting python.exe.
That solved the issue with Pylance. I no longer see an error in the editor when working on the project. However, the interpreter will not show in the bottom right hand corner of the window. That may just be a bug and I can either look through the issues on GitHub or open a new one some other time I assume.
When I run the code with Ctrl + Alt + N, I get a ModuleNotFoundError relating to flask_restful again. But when I run set flask_app=api.py followed by flask run in the terminal, the page in the browser has changed from a white background to a black background and displays the message it is intended to display (a simple "Hello, World" as a test).
Should I just keep going until I run into another issue? I also tried python -m api and that worked as well. Should I just ignore the VS Code output window? Also, sorry about the late replies. I appreciate everyone's help and patience.
Use the Ctrl+Shift+P command, search for and select Python: Select Interpreter (or click directly on the Python version displayed in the lower-right corner), and select the correct interpreter.
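A quick follow-up check in the integrated terminal (generic commands, not part of the original answer) confirms that the terminal's python is the venv's interpreter and that flask_restful is installed into it:

where python
python -m pip show flask-restful

If pip show reports nothing, running python -m pip install flask-restful installs the package into that exact interpreter, sidestepping any PATH mismatch.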

Compiling scikit-image on Windows to run tests

I'm stumped. I'm developing some enhancements to scikit-image which are failing the automated build tests, probably due to rounding errors. I therefore need to get the automated tests running on my Windows system so that I can debug and work out what's wrong. I've so far tried two approaches, neither of which are working:
In my Anaconda Python 3.6 environment, when I try to run the automated tests, I am getting the following error:
RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
...which I have found reference to in other contexts, but have not been able to eliminate.
Since the automated test do run (but fail) on a Python 3.5-based system, I thought things might work if I tried a local Python 3.5 environment. Here, I am running into the issue that, despite being installed, the environment cannot find the MS C++ compiler cl.exe. It is installed in C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\VC\Tools\MSVC\14.15.26726\bin\HostX86\x64\ and is found and executed by my Python 3.6 environment, but my Python 3.5 environment doesn't find it despite me adding that directory to my PATH. I should add that my Python 3.6 environment finds it without the directory being added to the PATH. I understand that both Python 3.5 and 3.6 use MSVC 14.0.
I would prefer to fix the problem in my Python 3.6 environment if possible. Any assistance much appreciated.
Update
I have made a box-fresh Python 3.6 conda environment as follows:
conda create --name sk36 python=3.6
conda activate sk36
conda install scikit-image --only-deps
conda install cython
git clone https://github.com/scikit-image/scikit-image.git
cd scikit-image
pip install -e .
pytest skimage/feature
The specific error I am getting is as follows:
..\Anaconda3\lib\site-packages\py\_path\local.py:662: in pyimport
__import__(modname)
skimage\__init__.py:135: in <module>
from .data import data_dir
skimage\data\__init__.py:13: in <module>
from ..io import imread, use_plugin
skimage\io\__init__.py:7: in <module>
from .manage_plugins import *
skimage\io\manage_plugins.py:24: in <module>
from .collection import imread_collection_wrapper
skimage\io\collection.py:12: in <module>
from ..external.tifffile import TiffFile
skimage\external\tifffile\__init__.py:1: in <module>
from .tifffile import imsave, imread, imshow, TiffFile, TiffWriter, TiffSequence
skimage\external\tifffile\tifffile.py:292: in <module>
from . import _tifffile
E RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
...which appears to have something to do with tifffile. Since this package wasn't originally explicitly installed in my new environment, I tried installing various versions of it, including some which downgraded numpy and scipy. Still the same error as above.
Having done some more research it would appear that something is seeing numpy 1.13.x when in fact version 1.15.4 is installed. Here is the full output from conda list:
# Name Version Build Channel
blas 1.0 mkl anaconda
ca-certificates 2018.03.07 0 anaconda
certifi 2018.10.15 py36_0 anaconda
cloudpickle 0.6.1 py36_0 anaconda
cycler 0.10.0 py36h009560c_0 anaconda
cython 0.29 py36ha925a31_0 anaconda
dask-core 0.20.0 py36_0 anaconda
decorator 4.3.0 py36_0 anaconda
freetype 2.9.1 ha9979f8_1 anaconda
icc_rt 2017.0.4 h97af966_0 anaconda
icu 58.2 ha66f8fd_1 anaconda
imageio 2.4.1 py36_0 anaconda
intel-openmp 2019.0 118 anaconda
jpeg 9b hb83a4c4_2 anaconda
kiwisolver 1.0.1 py36h6538335_0 anaconda
libpng 1.6.35 h2a8f88b_0 anaconda
libtiff 4.0.9 h36446d0_2 anaconda
matplotlib 3.0.1 py36hc8f65d3_0 anaconda
mkl 2019.0 118 anaconda
mkl_fft 1.0.6 py36hdbbee80_0 anaconda
mkl_random 1.0.1 py36h77b88f5_1 anaconda
networkx 2.2 py36_1 anaconda
numpy 1.15.4 py36ha559c80_0 anaconda
numpy-base 1.15.4 py36h8128ebf_0 anaconda
olefile 0.46 py36_0 anaconda
openssl 1.0.2p hfa6e2cd_0 anaconda
package_has_been_revoked 1.0 0 enable_revoked
pillow 5.3.0 py36hdc69c19_0 anaconda
pip 18.1 py36_0 anaconda
pyparsing 2.3.0 py36_0 anaconda
pyqt 5.9.2 py36h6538335_2 anaconda
python 3.6.7 h33f27b4_1 anaconda
python-dateutil 2.7.5 py36_0 anaconda
pytz 2018.7 py36_0 anaconda
pywavelets 1.0.1 py36h8c2d366_0 anaconda
qt 5.9.6 vc14h1e9a669_2 anaconda
scikit-image 0.15.dev0 <pip>
scipy 1.1.0 py36h4f6bf74_1 anaconda
setuptools 40.5.0 py36_0 anaconda
sip 4.19.8 py36h6538335_0 anaconda
six 1.11.0 py36_1 anaconda
sqlite 3.25.2 hfa6e2cd_0 anaconda
tifffile 0.15.1 py36h452e1ab_1001 conda-forge
tk 8.6.8 hfa6e2cd_0 anaconda
toolz 0.9.0 py36_0 anaconda
tornado 5.1.1 py36hfa6e2cd_0 anaconda
vc 14.1 h21ff451_3 anaconda
vs2015_runtime 15.5.2 3 anaconda
wheel 0.32.2 py36_0 anaconda
wincertstore 0.2 py36h7fe50ca_0 anaconda
zlib 1.2.11 h8395fce_2 anaconda
Update 2
I've solved the problem for Python 3.6, and I think there's enough information above for the astute to be able to work out what was wrong. I'll put the solution in an answer below.
A cleanly built Python 3.5 environment can't find the compiler, so that issue still remains.
One approach you could try is to upgrade your numpy with
pip install numpy --upgrade
as described here: RuntimeError: module compiled against API version a but this version of numpy is 9
Otherwise (if for some reason you cannot upgrade numpy) I would suggest going with a virtual environment for scikit-image project. I just tried it on Windows 10 and was able to successfully execute tests. My steps (from cmd, inside the project folder):
conda uninstall scikit-image to remove any previously built/installed versions
conda create -n scikit-image python=3.6 to create a virtual environment for this project (I used Python 3.6, but you can change it to 3.5)
activate scikit-image to activate the new virtual env
pip install -r requirements.txt to install dependencies (without this step I wasn't getting the dependencies for tests installed)
pip install -e .
pytest
It turns out that pytest wasn't actually installed in the correct environment; it was being invoked from base, which did indeed have numpy 1.13.3 installed. Installing it in the cleanly built Python 3.6 environment solved the problem, for Python 3.6 at least.
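A generic way to confirm which environment pytest and numpy actually come from (my addition, not part of the original answer):

where pytest
python -c "import numpy; print(numpy.__version__, numpy.__file__)"
python -m pytest skimage/feature

Invoking the tests via python -m pytest guarantees they run under the active environment's interpreter rather than whichever pytest happens to be first on the PATH.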

ipywidgets.embed missing dependencies? KeyError when run in venv

I am writing a script that simply asks the Google API for the latitudes and longitudes of a list of addresses read in from a CSV file, and outputs an HTML file with the Google Maps widget embedded. Further, I hoped to run PyInstaller to turn this into a .exe.
Running the code in my original conda environment, it works fine; however, the .exe that PyInstaller creates is massive for such a small script (over 300 MB). So I created a new virtual environment to work in, installed what I believe to be the bare minimum packages necessary, and rewrote the code to use as few packages as I am able, which for the currently working portion of the code dropped it down considerably to just over 10 MB. (No numpy or pandas for me... ah well.)
The code again works fine up until the final step:
from ipywidgets.embed import embed_minimal_html
embed_minimal_html("exporttest.html", None)
The above line should take any widgets, in particular the figure created from
fig = gmaps.figure(layout=figure_layout)
markers = gmaps.marker_layer(coordinates)
fig.add_layer(markers)
fig
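As an aside, embed_minimal_html can also be handed the widgets explicitly instead of None; a minimal sketch of that variant, assuming fig is the figure built above:

from ipywidgets.embed import embed_minimal_html

# embed only the named widget rather than every widget in the session
embed_minimal_html('exporttest.html', views=[fig])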
Running the currently modified version of the script in my original conda environment with all of my usual packages installed, it runs as expected without errors. Running in the virtual environment, however, I get the following KeyError on the mentioned lines:
KeyError                                  Traceback (most recent call last)
c:\programdata\anaconda3\envs\synod_environ\lib\sre_parse.py in parse_template(source, pattern)
1020 try:
-> 1021 this = chr(ESCAPES[this][1])
1022 except KeyError:
KeyError: '\\u'
During handling of the above exception, another exception occurred:
error Traceback (most recent call last)
<ipython-input-5-3359941239ab> in <module>
1 from ipywidgets.embed import embed_minimal_html
2
----> 3 embed_minimal_html("exporttest.html", None)
...
error: bad escape \u at position 0
(For clarification, the KeyError has two backslashes before the u; some frustration in getting this to post correctly.)
As the code runs correctly in one environment but not the other, I can only assume that I'm missing a package somewhere that ipywidgets requires, but running pip check doesn't notify me of anything missing.
pip list returns the following packages:
altgraph 0.16.1
backcall 0.1.0
bleach 3.0.2
certifi 2018.10.15
chardet 3.0.4
colorama 0.4.0
decorator 4.3.0
defusedxml 0.5.0
entrypoints 0.2.3
future 0.17.1
geojson 2.4.1
gmaps 0.8.2
idna 2.7
ipykernel 5.1.0
ipython 7.1.1
ipython-genutils 0.2.0
ipywidgets 7.4.2
jedi 0.13.1
Jinja2 2.10
jsonschema 2.6.0
jupyter 1.0.0
jupyter-client 5.2.3
jupyter-console 6.0.0
jupyter-core 4.4.0
macholib 1.11
MarkupSafe 1.0
mistune 0.8.4
nbconvert 5.4.0
nbformat 4.4.0
notebook 5.7.0
pandocfilters 1.4.2
parso 0.3.1
pefile 2018.8.8
pickleshare 0.7.5
pip 10.0.1
prometheus-client 0.4.2
prompt-toolkit 2.0.7
Pygments 2.2.0
PyInstaller 3.4
python-dateutil 2.7.5
pywin32-ctypes 0.2.0
pywinpty 0.5.4
pyzmq 17.1.2
qtconsole 4.4.2
requests 2.20.0
Send2Trash 1.5.0
setuptools 40.4.3
six 1.11.0
terminado 0.8.1
testpath 0.4.2
tornado 5.1.1
traitlets 4.3.2
urllib3 1.24
wcwidth 0.1.7
webencodings 0.5.1
wheel 0.32.2
widgetsnbextension 3.4.2
wincertstore 0.2
Any thoughts on how to further identify what went wrong, what package might be missing or how to fix the issue, and/or alternate ways to save a googlemaps output?
Fiddling with it and comparing one environment to the other, I found that my virtual environment had ipywidgets 7.4.2 while the base environment had ipywidgets 7.2.1. Downgrading ipywidgets fixed the issue I was having.
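In pip terms, pinning the known-good version inside the virtual environment looks like this (a one-liner implied by the finding above, not quoted from the post):

pip install ipywidgets==7.2.1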

Accessing BigQuery from Cloud DataLab using Pandas

I have a Jupyter Notebook accessing BigQuery using pandas as the vehicle:
df = pd.io.gbq.read_gbq(query, project_id='xxxxxxx-xxxx')
This works fine from my local machine! (great, in fact!)
But when I load the same notebook to Cloud DataLab I get:
DistributionNotFound: google-api-python-client
Which seems rather disappointing! I believe that the module should be installed with pandas... but somehow Google is not including it?
It would be most preferable, for a bunch of reasons, not to have to change the code from what we develop on our local machines to what is needed in Cloud DataLab; in this case we heavily parameterize the data access...
Ok I ran:
!pip install --upgrade google-api-python-client
Now when I run the notebook I get an auth prompt that I cannot resolve, since DataLab is on a remote machine:
Your browser has been opened to visit:
>>> Browser string>>>>
If your browser is on a different machine then exit and re-run this
application with the command-line parameter
--noauth_local_webserver
I don't see an obvious answer to this.
I used the code suggested below by @Anthonios Partheniou from within the same notebook (executing it in a cell block) after updating the google-api-python-client in the notebook, and I got the following traceback:
TypeError Traceback (most recent call last)
<ipython-input-3-038366843e56> in <module>()
5 scope='https://www.googleapis.com/auth/bigquery',
6 redirect_uri='urn:ietf:wg:oauth:2.0:oob')
----> 7 storage = Storage('bigquery_credentials.dat')
8 authorize_url = flow.step1_get_authorize_url()
9 print 'Go to the following link in your browser: ' + authorize_url
/usr/local/lib/python2.7/dist-packages/oauth2client/file.pyc in __init__(self, filename)
37
38 def __init__(self, filename):
---> 39 super(Storage, self).__init__(lock=threading.Lock())
40 self._filename = filename
41
TypeError: object.__init__() takes no parameters
He mentions the need to execute the notebook from the same folder, yet the only way that I know of for executing a Datalab notebook is via the repo?
While the new Jupyter Datalab module is a possible alternative, the ability to use the full pandas BQ interface unchanged on local and DataLab instances would be hugely helpful! So crossing my fingers for a solution!
pip installed:
GCPDataLab 0.1.0
GCPData 0.1.0
wheel 0.29.0
tensorflow 0.6.0
protobuf 3.0.0a3
oauth2client 1.4.12
futures 3.0.3
pexpect 4.0.1
terminado 0.6
pyasn1 0.1.9
jsonschema 2.5.1
mistune 0.7.2
statsmodels 0.6.1
path.py 8.1.2
ipython 4.1.2
nose 1.3.7
MarkupSafe 0.23
py-dateutil 2.2
pyparsing 2.1.1
pickleshare 0.6
pandas 0.18.0
singledispatch 3.4.0.3
PyYAML 3.11
nbformat 4.0.1
certifi 2016.2.28
notebook 4.0.2
cycler 0.10.0
scipy 0.17.0
ipython-genutils 0.1.0
pyasn1-modules 0.0.8
functools32 3.2.3-2
ipykernel 4.3.1
pandocfilters 1.2.4
decorator 4.0.9
jupyter-core 4.1.0
rsa 3.4.2
mock 1.3.0
httplib2 0.9.2
pytz 2016.3
sympy 0.7.6
numpy 1.11.0
seaborn 0.6.0
pbr 1.8.1
backports.ssl-match-hostname 3.5.0.1
ggplot 0.6.5
simplegeneric 0.8.1
ptyprocess 0.5.1
funcsigs 0.4
scikit-learn 0.16.1
traitlets 4.2.1
jupyter-client 4.2.2
nbconvert 4.1.0
matplotlib 1.5.1
patsy 0.4.1
tornado 4.3
python-dateutil 2.5.2
Jinja2 2.8
backports-abc 0.4
brewer2mpl 1.4.1
Pygments 2.1.3
Google BigQuery authentication in pandas is normally straightforward, except when pandas code is executed on a remote server, for example when running pandas on Datalab in the cloud. In that case, use the following code to create the credentials file that pandas needs to access Google BigQuery from Google Datalab.
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.file import Storage

flow = OAuth2WebServerFlow(client_id='<Client ID from Google API Console>',
                           client_secret='<Client secret from Google API Console>',
                           scope='https://www.googleapis.com/auth/bigquery',
                           redirect_uri='urn:ietf:wg:oauth:2.0:oob')
storage = Storage('bigquery_credentials.dat')
authorize_url = flow.step1_get_authorize_url()
print 'Go to the following link in your browser: ' + authorize_url
code = raw_input('Enter verification code: ')
credentials = flow.step2_exchange(code)
storage.put(credentials)
Once you complete the process I don't expect you will see the error (as long as the notebook is in the same folder as the newly created 'bigquery_credentials.dat' file).
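Tying this back to the question's code: once bigquery_credentials.dat sits next to the notebook, the original pandas call should run without prompting again. A sketch, reusing the query variable and project-id placeholder from the question:

import pandas as pd

# pandas finds bigquery_credentials.dat in the working directory and reuses it
df = pd.io.gbq.read_gbq(query, project_id='xxxxxxx-xxxx')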
You also need to install the google-api-python-client python package as it is required by pandas for Google BigQuery support. You can run either of the following in a notebook to install it.
Either
!pip install google-api-python-client --no-deps
!pip install uritemplate --no-deps
!pip install simplejson --no-deps
or
%%bash
pip install google-api-python-client --no-deps
pip install uritemplate --no-deps
pip install simplejson --no-deps
The --no-deps option is needed so that you don't accidentally update a python package which is installed in datalab by default (to ensure other parts of datalab don't break).
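A quick way to confirm the install landed without pulling in new dependencies (generic pip usage, not from the original answer):

!pip show google-api-python-client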
Note: With pandas 0.19.0 (not released yet), it will be much easier to use pandas in Google Cloud Datalab. See Pull Request #13608
Note: You also have the option to use the (new) google datalab module inside of jupyter (and that way the code will also work in Google Datalab on the cloud). See the following related stack overflow answer:
How do I use gcp package from outside of google datalabs?
