Download and use Python libraries locally - python

I'm working on a solution that uses the logpdf() function of scipy.stats.multivariate_normal. The solution has to be uploaded to a server where it is evaluated, and that server has only Python 3.6 Anaconda, numpy and pickle installed. Any other library has to be attached as a file in the package that I upload.
My question is: which files do I need to download, and how do I use them locally in my script? Or do I need to implement logpdf manually with numpy only?
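For the second option, here is a minimal sketch of a numpy-only log-density, assuming the covariance matrix is symmetric positive definite. The function name mvn_logpdf is made up for illustration and is not part of numpy or scipy. As far as I know, scipy's own implementation can also handle singular covariance matrices (via allow_singular), which this Cholesky-based version does not.

```python
import numpy as np


def mvn_logpdf(x, mean, cov):
    """Log-density of a multivariate normal, using numpy only.

    x    : one point of shape (d,) or several points of shape (n, d)
    mean : shape (d,)
    cov  : shape (d, d), assumed symmetric positive definite
    Returns an array of shape (n,).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.shape[0]

    # cov = L @ L.T; raises LinAlgError if cov is not positive definite.
    chol = np.linalg.cholesky(cov)

    # Solve L z = (x - mean)^T so that sum(z**2) is the squared
    # Mahalanobis distance of each point.
    z = np.linalg.solve(chol, (x - mean).T)
    maha = np.sum(z ** 2, axis=0)

    # log det(cov) = 2 * sum(log(diag(L)))
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))

    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)
```

It would be worth comparing the output against scipy.stats.multivariate_normal.logpdf on your own machine before uploading, since the server cannot run that check.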

Related

missing_dependencies error using Pandas in Azure Web Job

I need to run a long-running job via an Azure Web Job in Python.
I get the error below when trying to import pandas.
File "D:\local\Temp\jobs\triggered\demo2\eveazbwc.iyd\pandas\__init__.py", line 13
missing_dependencies.append(f"{dependency}: {e}")
The Web App (under which I will run the web job) also has Python code using pandas, and it does not throw any error.
I have tried uploading the pandas and numpy folders inside the zip file (creating a venv, installing the packages and zipping the contents of Lib/site-packages) for both 32-bit and 64-bit Python. I have also tried appending 'D:/home/site/wwwroot/my_app_name/env/Lib/site-packages' to sys.path.
I do not have such issues when importing standard Python modules or additional packages like requests.
An error is also thrown when trying to import numpy.
So I assume some kind of version mismatch is happening somewhere.
Any pointers to solve this would be really useful.
I have been using Python 3.x and am not sure whether I should try Python 2.x (create a virtual env, install the package and zip the contents of Lib/site-packages).
Regards
Kunal
The key to solving the problem is to add the Python 3.6.4 x64 extension on the portal.
Steps:
1. Add the extension on the portal.
2. Create a requirements.txt file:
   pandas==1.1.4
   numpy==1.19.3
3. Create run.cmd.
4. Create the files below and zip them into a single zip.
5. Upload the zip for the webjob.
For more details, please read this post.
Webjobs Running Error (3587fd: ERR ) from zipfile
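One detail in the traceback is worth noting: the line it points at is an f-string, and f-strings are a syntax error on any interpreter older than Python 3.6. That would be consistent with the job being started by an older default interpreter than the one the packages were installed for. A small check like the sketch below, run as the first thing in the job, shows which interpreter is actually in use. The function name interpreter_report is made up for illustration, and the code avoids f-strings on purpose so that it also runs on old interpreters.

```python
import struct
import sys


def interpreter_report(min_version=(3, 6)):
    """Describe the interpreter that is running this script."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Pointer size tells 32-bit and 64-bit builds apart.
        "bits": struct.calcsize("P") * 8,
        "meets_minimum": sys.version_info[:2] >= tuple(min_version),
    }


if __name__ == "__main__":
    report = interpreter_report()
    for key in sorted(report):
        print("{}: {}".format(key, report[key]))
```

If meets_minimum comes back False, or the bitness differs from the one the packages in the zip were built for, the import errors above are to be expected.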

No module named 'pyarrow.lib' found from lambda function

I have installed pyarrow version 0.14.0 and am creating a package to run it from Lambda.
When executing from Lambda I get the error: No module named 'pyarrow.lib'
I have included the pyarrow package in my deployment zip file as well. I am using Python 3.7.
Can someone please help with this issue?
The underlying problem is that modules like pyarrow wrap code written in C/C++. If you check the pyarrow codebase, you will find that two pyarrow.lib files do exist, but they have .pyx and .pxd file extensions. These are Cython sources, not pure Python code: they are compiled into a native extension module, which therefore depends on the underlying CPU architecture.
Bundling pyarrow with my code in the same zip did not work, irrespective of the .whl file I used. I was able to solve the problem in two ways.
1. Lambda Layer
I had to manually download the .whl files for my required version of pyarrow and its dependency numpy. From http://pypi.org/project/pyarrow/, click on Download files and search for your matching version. In the file name, cp39 means CPython 3.9 and x86_64 is the CPU architecture. Follow the same steps for numpy. I ended up downloading these files: pyarrow-8.0.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl and numpy-1.22.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
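To make the file name convention concrete, here is a small sketch that splits a wheel file name into its tags. It relies on the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag, separated by hyphens). The helper name parse_wheel_filename is made up for illustration and is not a pip or PyPI API.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: " + filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,  # e.g. cp39 = CPython 3.9
        "abi_tag": abi_tag,
        # One wheel can declare several compatible platforms, joined by dots.
        "platform_tags": platform_tag.split("."),
    }


info = parse_wheel_filename(
    "pyarrow-8.0.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(info["python_tag"], info["platform_tags"])
```

The Python tag has to match the Lambda runtime version, and the platform tags have to match the Linux architecture that the function runs on.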
You then have to unzip them and create an archive where both sit together in a folder named python. This folder can be used to create a layer in Lambda. Attach this layer to your project and import pyarrow should now work.
I followed this guide for creating a pyarrow layer.
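The unzip-and-repack step can also be scripted. The sketch below is one possible way to do it, not the method from the guide: since a wheel is itself a zip archive, it copies every file from the given wheels into a new archive under a top-level python folder. The function name build_layer_zip and the output file name are made up for illustration.

```python
import zipfile


def build_layer_zip(wheel_paths, out_zip):
    """Merge the contents of several wheels under a top-level python/ folder."""
    seen = set()
    with zipfile.ZipFile(out_zip, "w") as layer:
        for wheel_path in wheel_paths:
            with zipfile.ZipFile(wheel_path) as wheel:
                for member in wheel.infolist():
                    if member.is_dir():
                        continue
                    arcname = "python/" + member.filename
                    if arcname in seen:
                        continue  # keep the first copy of a duplicated path
                    seen.add(arcname)
                    target = zipfile.ZipInfo(arcname, date_time=member.date_time)
                    target.compress_type = zipfile.ZIP_DEFLATED
                    # Mark files as world-readable so another user can load them.
                    target.external_attr = 0o644 << 16
                    layer.writestr(target, wheel.read(member))
    return out_zip
```

Usage would look like build_layer_zip(["pyarrow-...whl", "numpy-...whl"], "layer.zip") with the two downloaded files, after which layer.zip is what gets uploaded as the layer.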
2. Docker container
The other solution is to use custom Docker images. This worked smoothly for me. I believe the AWS docs are exhaustive on that topic. I have written a PoC and all the steps that I followed here.

Import Python module into AWS Lambda

I have followed all the steps in the documentation:
https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
Create a directory.
Save all of your Python source files (the .py files) at the root level of this directory.
Install any libraries using pip at the root level of the directory.
Zip the content of the project-dir directory.
But after I uploaded the zip file to the Lambda function, I got an error message when I tested the script.
my code:
import psycopg2
#my code...
the error:
Unable to import module 'myfilemane': No module named 'psycopg2._psycopg'
I don't know where the suffix '_psycopg' comes from...
Any help regarding this?
You are using native libraries with Lambda. We had a similar problem, and here is how we solved it.
Spin up a machine with the AWS-supported AMI that your real Lambda runs on:
https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
As of this writing, it is:
AMI name: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2
Full documentation on installing native modules for your Python Lambda:
https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
Install the modules required for your Lambda,
pip install module-name -t /path/to/project-dir
and prepare your package for upload, along with the native modules, in the Lambda AMI environment.
Hope this helps.
I believe this is caused because psycopg2 needs to be built and compiled with statically linked libraries for Linux. Please see Using psycopg2 with Lambda to Update Redshift (Python) for more details on this issue, and another reference on the problems of compiling psycopg2 on OSX.
There are a few solutions, but basically it comes down to installing the library on a Linux machine and using that build as the psycopg2 library in your upload package.
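As for where '_psycopg' comes from: psycopg2._psycopg is the compiled C extension inside the psycopg2 package, so the error means Python found the package folder but no binary it could load for that module. A quick way to see this in a project directory is to list the compiled files and compare their names with the suffixes the current interpreter accepts. This is only a sketch, and the function name classify_native_files is made up. It checks file names only and cannot prove that a binary was built for the right operating system, so it is most useful when run with the same Python version and platform as the Lambda runtime.

```python
import importlib.machinery
from pathlib import Path

NATIVE_SUFFIXES = (".so", ".pyd")


def classify_native_files(package_dir):
    """Split compiled files into (loadable, foreign) by file name.

    A file counts as loadable when its name is the module name followed by
    one of the extension suffixes accepted by the running interpreter.
    """
    loadable, foreign = [], []
    for path in sorted(Path(package_dir).rglob("*")):
        if not path.is_file() or path.suffix not in NATIVE_SUFFIXES:
            continue
        module_name = path.name.split(".", 1)[0]
        expected = {
            module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES
        }
        (loadable if path.name in expected else foreign).append(path.name)
    return loadable, foreign
```

A package installed on macOS or Windows and then zipped would typically show its _psycopg file in the foreign list when checked on Linux, which matches the advice above to build the package on a Linux machine.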

ImportError: No module named cassandra in Azure Machine Learning Studio

I am trying to install the Python package cassandra-driver in Azure Machine Learning Studio. I am following this answer from here. Unfortunately I don't see any wheel file for cassandra-driver at https://pypi.python.org/pypi/cassandra-driver/, so I downloaded the .tar file and converted it to a zip.
I included this .zip file as a dataset and connected it to the Python script.
But when I run it, it says No module named cassandra.
Does this work only with a wheel file? Any solution is much appreciated.
I am using Python version: Anaconda 4.0/Python 3.5
I got it working. I changed the folder inside the .zip file to "cassandra" (just like the cassandra package).
And in the Python script, I added
from cassandra import *
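The reason the rename helps is that Python looks for a folder with exactly the import name at the top level of each sys.path entry, while source archives typically unpack to a versioned top-level folder with the real package nested inside. The sketch below shows the effect using Python's built-in ability to import from a zip file on sys.path as a stand-in. How Azure ML actually unpacks the dataset is not shown here, and the package name demo_bundle_pkg is made up for illustration.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def make_bundle(zip_path, init_member):
    """Create a zip that contains a single package __init__.py file."""
    with zipfile.ZipFile(zip_path, "w") as bundle:
        bundle.writestr(init_member, "GREETING = 'hello from the bundle'\n")
    return zip_path


def importable_from(zip_path, package_name):
    """Report whether package_name can be imported with zip_path on sys.path."""
    entry = str(zip_path)
    sys.path.insert(0, entry)
    try:
        importlib.invalidate_caches()
        importlib.import_module(package_name)
        return True
    except ImportError:
        return False
    finally:
        # Undo the changes so the check has no lasting side effects.
        sys.path.remove(entry)
        sys.modules.pop(package_name, None)


workdir = Path(tempfile.mkdtemp())
# Package nested under a versioned folder, as in an unpacked source archive.
nested = make_bundle(
    workdir / "nested.zip", "demo_bundle_pkg-1.0/demo_bundle_pkg/__init__.py"
)
# Package folder directly at the top level of the zip.
flat = make_bundle(workdir / "flat.zip", "demo_bundle_pkg/__init__.py")

print("nested:", importable_from(nested, "demo_bundle_pkg"))
print("flat:", importable_from(flat, "demo_bundle_pkg"))
```

Only the flat layout should be importable, which corresponds to the fix of having a top-level folder named "cassandra" inside the zip.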

Azure ML Python with Script Bundle cannot import module

In Azure ML, I'm trying to execute a Python module that needs to import the module pyxdameraulevenshtein (https://pypi.python.org/pypi/pyxDamerauLevenshtein).
I followed the usual way, which is to create a zip file and then import it; however for this specific module, it seems to never be able to find it. The error message is as usual:
ImportError: No module named 'pyxdameraulevenshtein'
Has anyone successfully included this pyxdameraulevenshtein module in Azure ML?
(I took the package from https://pypi.python.org/pypi/pyxDamerauLevenshtein.)
Thanks for any help you can provide,
PH
I looked at the pyxdameraulevenshtein module page. There are two packages you can download: a wheel file for MacOS and a source code tar file. I don't think you can use either of them directly on Azure ML, because the MacOS one is just a shared library (.so file) for Darwin, which is not compatible with Azure ML, and the other one has to be compiled first.
So my suggestion for using pyxdameraulevenshtein is as follows.
First, compile the source code of pyxdameraulevenshtein to a DLL file on Windows; please refer to the document for Python 2/3 or search for how to do this.
Write a Python script that uses the DLL you compiled to implement what you need; please refer to the SO thread How can I use a DLL file from Python? for how to use a DLL from Python, and to the official Azure tutorial for how to write your Python script.
Package your Python script and the DLL file as a zip file, then upload the zip file to use it in the Execute Python Script module of Azure ML.
Hope it helps.
Adding the path to pyxdameraulevenshtein to your system path should alleviate this issue. The script checks the system path that the python script is running on and doesn't know where else to look for anything other than the default packages. If your python script is in the same directory as the pyxdameraulevenshtein package in your ZIP file, this should do the trick. Because you are running this within Azure ML and can't be sure of the exact location of your script each time you run it, this solution should account for that.
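import os
import sys
# Add the bundled package folder inside the current working directory to the module search path.
sys.path.append(os.path.join(os.getcwd(), 'pyxdameraulevenshtein'))
# The import can now be resolved from the folder added above.
import pyxdameraulevenshtein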
