I am trying to install python package cassandra driver in Azure Machine Learning studio. I am following this answer from here. Unfortunately i don't see any wheel file for cassandra-driver https://pypi.python.org/pypi/cassandra-driver/ so i downloaded the .tar file and converted to zip.
I included this .zip file as dataset and connected to python script
But when i run it, it says No module named cassandra
Does this work only with wheel file? Any solution is much appreciated.
I am using Python Version : Anoconda 4.0/Python 3.5
I got it working. Changed the folder inside .zip file to "cassandra" (just like cassandra package).
And in the Python script, i added
from cassandra import *
Related
I need to run some long running job via Azure Web Job in Python.
I am facing below error trying to import pandas.
File "D:\local\Temp\jobs\triggered\demo2\eveazbwc.iyd\pandas_init_.py", line 13
missing_dependencies.append(f"{dependency}: {e}")
The Web app (under which I will run the web job) also has python code using pandas, and it does not throw any error.
I have tried uploading pandas and numpy folder inside the zip file (creating venv, installing packages and zipping Lib/site-packages content), (for 32 bit and 64 bit python) as well as tried appending 'D:/home/site/wwwroot/my_app_name/env/Lib/site-packages' to sys.path.
I am not facing such issues in importing standard python modules or additional package modules like requests.
Error is also thrown in trying to import numpy.
So, I am assuming some kind of version mismatch is happening somewhere.
Any pointers to solve this will be really useful.
I have been using Python 3.x, not sure if I should try Python 2.x (virtual env, install package and zip content of Lib/site-packages).
Regards
Kunal
The key to solving the problem is to add Python 3.6.4 x64 Extension on the portal.
Steps:
Add Extensions on portal.
Create a requirements.txt file.
pandas==1.1.4
numpy==1.19.3
Create run.cmd.
Create below file and zip them into a single zip.
Upload the zip for the webjob.
For more details, please read this post.
Webjobs Running Error (3587fd: ERR ) from zipfile
I am trying to install apache spark to run locally on my windows machine. I have followed all instructions here https://medium.com/#loldja/installing-apache-spark-pyspark-the-missing-quick-start-guide-for-windows-ad81702ba62d.
After this installation I am able to successfully start pyspark, and execute a command such as
textFile = sc.textFile("README.md")
When I then execute a command that operates on textFile such as
textFile.first()
Spark gives me the error 'worker failed to connect back', and I can see an exception in the console coming from worker.py saying 'ModuleNotFoundError: No module named resource'. Looking at the source file I can see that this python file does indeed try to import the resource module, however this module is not available on windows systems. I understand that you can install spark on windows so how do I get around this?
I struggled the whole morning with the same problem. Your best bet is to downgrade to Spark 2.3.2
The fix can be found at https://github.com/apache/spark/pull/23055.
The resource module is only for Unix/Linux systems and is not applicaple in a windows environment. This fix is not yet included in the latest release but you can modify the worker.py in your installation as shown in the pull request. The changes to that file can be found at https://github.com/apache/spark/pull/23055/files.
You will have to re-zip the pyspark directory and move it the lib folder in your pyspark installation directory (where you extracted the pre-compiled pyspark according to the tutorial you mentioned)
Adding to all those valuable answers,
For windows users,Make sure you have copied the correct version of the winutils.exe file(for your specific version of Hadoop) to the spark/bin folder
Say,
if you have Hadoop 2.7.1, then you should copy the winutils.exe file from the Hadoop 2.7.1/bin folder
Link for that is here
https://github.com/steveloughran/winutils
I edited worker.py file. Removed all resource-related lines. Actually # set up memory limits block and import resource. The error disappeared.
I am writing a python script which will run in AWS as lambda function. Since it needs to connect to a Postgres database, library psycopg2 is required. It seems the standard psycopg2 does not work with AWS lambda. I downloaded it from this git repo.
I am using virtualenv for all the dependencies, so I copied the psycopg2-3.6 folder from the downloaded package to [myproject]/env/Lib/site-packages. In my main script this library is imported
import psycopg2
However when I run it in PyCharm, I got error:
File "C:\Users\dxx0111\WorkSpace\iq-iot-lambda\app.py", line 2, in <module>
import psycopg2
File "C:\Users\dxx0111\WorkSpace\iq-iot-lambda\env\lib\site-packages\psycopg2\__init__.py", line 50, in <module>
from psycopg2._psycopg import ( # noqa
ModuleNotFoundError: No module named 'psycopg2._psycopg'
Based on the error message, it looks like it was able to locate the directory of psycopg2 under virtual environment package folder. It just couldn't find psycopg2._psycopg. What am I missing here?
As it turns out, the psycopg2 library downloaded from that link only works on Amazon Linux because that is where the package was compiled on. It doesn't work on my Windows machine. In order to make it work locally, I had to install with pip install psycopg2 to my virtual env. When I deploy to AWS Lambda though, I make the zip with the downloaded library. So different psycopg2 in different platform. Now it works in both places.
I have followed all the steps in the documentation:
https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
create a directory.
Save all of your Python source files (the .py files) at the root level of this directory.
Install any libraries using pip at the root level of the directory.
Zip the content of the project-dir directory)
But after I uploaded the zip-file to lambda function, I got the error message when I test the script
my code:
import psycopg2
#my code...
the error:
Unable to import module 'myfilemane': No module named 'psycopg2._psycopg'
I don't know where is the suffix '_psycopg' from...
Any help regarding this?
You are using native libraries with lambda. We had this similar problem and here is how we solved it.
Spin a machine with AWS supported AMI that runs your real lambda.
https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
As this writing, it is,
AMI name: amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2
Full documentation in installing native modules your python lambda.
https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
Install the required modules required for your lambda,
pip install module-name -t /path/to/project-dir
and prepare your package to upload along with the native modules under lambda ami environment.
Hope this helps.
I believe this is caused because psycopg2 needs to be build an compiled with statically linked libraries for Linux. Please reference Using psycopg2 with Lambda to Update Redshift (Python) for more details on this issue. Another [reference][1] of problems of compiling psycopg2 on OSX.
There are a few solutions, but basically it comes down to installing the library on a Linux machine and using that as the Psycopg2 Library in your upload package.
In Azure ML, I'm trying to execute a Python module that needs to import the module pyxdameraulevenshtein (https://pypi.python.org/pypi/pyxDamerauLevenshtein).
I followed the usual way, which is to create a zip file and then import it; however for this specific module, it seems to never be able to find it. The error message is as usual:
ImportError: No module named 'pyxdameraulevenshtein'
Has anyone included this pyxdameraulevenshtein module in Azure ML with success ?
(I took the package from https://pypi.python.org/pypi/pyxDamerauLevenshtein.)
Thanks for any help you can provide,
PH
I viewed the pyxdameraulevenshtein module page, there are two packages you can download which include a wheel file for MacOS and a source code tar file. I don't think you can directly use the both on Azure ML, because the MacOS one is just a share library .so file for darwin which is not compatible with Azure ML, and the other you need to first compile it.
So my suggestion is as below for using pyxdameraulevenshtein.
First, compile the source code of pyxdameraulevenshtein to a DLL file on Windows, please refer to the document for Python 2/3 or search for doing this.
Write a Python script using the DLL you compiled to implement your needs, please refer to the SO thread How can I use a DLL file from Python? for how to use DLL from Python and refer to the Azure offical tutorial to write your Python script
Package your Python script and DLL file as a zip file, then to upload the zip file to use it in Execute Python script model of Azure ML.
Hope it helps.
Adding the path to pyxdameraulevenshtein to your system path should alleviate this issue. The script checks the system path that the python script is running on and doesn't know where else to look for anything other than the default packages. If your python script is in the same directory as the pyxdameraulevenshtein package in your ZIP file, this should do the trick. Because you are running this within Azure ML and can't be sure of the exact location of your script each time you run it, this solution should account for that.
import os
import sys
sys.path.append(os.path.join(os.getcwd(), 'pyxdameraulevenshtein'))
import pyxdameraulevenshtein