Unable to reference Python libraries from NiFi ExecuteScript processor

I have been trying to run a Python script in NiFi's ExecuteScript processor. The catch is that I don't have access to the server's file system, and all the Python libraries are installed at "/data/jython", "/data/jython/Lib/site-packages/" and "data/nltk".
Below is the import section of my python script:
import json, traceback, pycountry, requests, geocoder, re, sys, nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.corpus import stopwords
from java.nio.charset import StandardCharsets
from org.apache.commons.io import IOUtils
from org.apache.nifi.processor.io import StreamCallback
from org.python.core.util import StringUtil
I have added path references to the packages/libraries:
Here's the screenshot of the error message:
Is there something I am missing? I have referred to another answer here, but couldn't figure out what's wrong with my code.

As the other answers state, Apache NiFi's ExecuteScript processor uses Jython, not Python. Jython has a limitation: it cannot load native modules (compiled C code, such as files ending in .so). It is very likely that the pycountry module, or one of its dependencies, contains a native module. You can try a workaround proposed by Matt Burgess on the NiFi Developers Mailing List.
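One way to test the native-module theory is to inspect the package on any machine where it can be installed under regular CPython. The sketch below is only an illustration: the helper names native_files_in and native_files_for are made up here, it needs CPython 3 (it will not run inside ExecuteScript), and it only recognises extension suffixes for the platform it runs on.

```python
import importlib.machinery
import importlib.util
import os


def native_files_in(directory):
    """Return sorted relative paths of compiled extension files under directory."""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)  # e.g. '.so' or '.pyd'
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(suffixes):
                found.append(os.path.relpath(os.path.join(root, name), directory))
    return sorted(found)


def native_files_for(module_name):
    """Locate an installed module or package and list its compiled extension files."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        raise ImportError("module not found: %s" % module_name)
    if spec.submodule_search_locations:
        # A package: scan every directory that belongs to it.
        result = []
        for location in spec.submodule_search_locations:
            result.extend(native_files_in(location))
        return result
    # A single-file module: it is native only if the file itself is an extension.
    origin = spec.origin or ""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return [origin] if origin.endswith(suffixes) else []
```

An empty list from native_files_for("pycountry") would suggest the package itself is pure Python, in which case the problem is more likely one of its dependencies or the module path configuration.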

The ExecuteScript processor uses its own Jython engine to execute your Python scripts.
Because the libraries you are importing are not available in NiFi's built-in Jython engine, it throws an error.
SOLUTION:
If Python is already installed, with all those libraries, on the same machine where NiFi is installed, you can use that Python interpreter to execute your script.
You can run your Python code with the ExecuteProcess processor; see the configuration of ExecuteProcess.
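ExecuteProcess runs an operating system command and writes the command's standard output to a FlowFile, so a script meant for it only has to print its result. Below is a minimal sketch of such a script. The payload (a character and word count) and the function name build_record are placeholders, not the asker's logic, and it assumes the input text arrives through the processor's Command Arguments property.

```python
import json
import sys


def build_record(text):
    """Return a small JSON-serialisable summary of text (placeholder payload)."""
    words = text.split()
    return {"characters": len(text), "words": len(words)}


def main(argv):
    # Assumption: the text to process is passed as command-line arguments.
    text = " ".join(argv[1:])
    # Whatever is written to standard output becomes the FlowFile content.
    json.dump(build_record(text), sys.stdout)
    sys.stdout.write("\n")


if __name__ == "__main__":
    main(sys.argv)
```

Because the script runs under the system's CPython interpreter, it can import nltk, requests and other installed libraries in the usual way, but it can no longer use the Java and NiFi classes (such as StreamCallback) that the Jython script imported.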

Related

NiFi Processor Issue

In NiFi's "ExecuteScript" processor, a Python script that tries to import the "unidecode" module fails with "No module found", even though "unidecode" is installed for Python 2.x on the NiFi server.
Will this work on Python 3, or do we need to use a different processor?
I tried to resolve an error
Check the documentation of the ExecuteScript processor:
The engine listed as "python" in the list of available script engines is actually Jython, not Python. When using Jython, you cannot import native (CPython) modules such as pandas; only pure Python modules can be imported.

Including modules in IronPython for .Net

I'm using IronPython 2.7.7 for .Net to run some python code (Installed using NuGet in an MVC project).
I got a simple test case to work quite easily, but when I went to include the actual python code I want to call I ran into problems with missing modules.
No module named Crypto.Cipher Description: An unhandled exception
occurred during the execution of the current web request. Please
review the stack trace for more information about the error and where
it originated in the code.
Exception Details: IronPython.Runtime.Exceptions.ImportException: No module named Crypto.Cipher
These are the modules/libraries that the Python script imports at the beginning. Visual Studio complains about missing Crypto.Cipher and random.
from datetime import datetime
from Crypto.Cipher import AES
import time
import random
import socket
This is the .Net code that I'm calling the Python code with.
var basePath = System.Web.Hosting.HostingEnvironment.ApplicationPhysicalPath;
var realPath = Path.Combine(basePath, @"Python/PythonThingy.py");
var ipy = Python.CreateRuntime();
dynamic pyThing = ipy.UseFile(realPath);
var items = pyThing.discover(timeout: 5);
I am not allowed to install Python on the production server, so I'm hoping I can include these libraries in some way and avoid rewriting a quite huge set of functions. Any tips on how to do this?
P.S. If not apparent from post, I know very little about python.

The difference between 'from pylons import config' and 'import pylons.config'

I'm trying to import a company module into my software and I get the error:
ImportError: No module named config
from:
from pylons.config import config
So obviously, the module that I'm importing requires pylons.config but can't find it in my virtual environment.
If I go to the terminal and try it in the Python interpreter, config can be found if I use:
from pylons import config
but I get an error if I try:
import pylons.config
Why is this?
And does anybody know how or where I can get:
from pylons.config import config
to work. Bearing in mind that I cannot change the code for this module, only mine which is importing it or my own system files.
UPDATE
If anyone finding this page has a similar problem you may find that you are trying to run two modules with different versions of Pylons.
For example, you are creating a login application called myApp. You have some Python modules which help with login handling called pyLogin.
First you install pyLogin with python setup.py install. This adds the libraries to your site-packages and updates any libraries it depends on, such as SQLAlchemy.
Next you install myApp in the same way which again updates libraries and dependencies.
This problem will occur if pyLogin and myApp are using different versions of Pylons. If pyLogin is using Pylons 0.9.6 and myApp is using Pylons 1.0 for example, then the pyLogin code will be called from myApp but it will be running in the wrong Pylons framework and hence will require EITHER from pylons import config or from pylons.config import config, but will only work with one. If it is using the wrong call for Pylons then you will find yourself with this error message.
So the only solution to this error is to either find earlier or later libraries which use the same Pylons version as your application or to convert your application to the same Pylons version as the libraries you are using.
There is a difference between the two usages.
import loads a Python module and binds it under its own namespace, while from ... import binds names from a module directly into the current namespace.
So, using from pylons import config brings config into your current namespace. But importing a class, function or other attribute with a plain import statement is not possible, because the dotted form of import only accepts modules. You can only import modules, and then reach their functions or classes through the module's namespace, like
import pylons
....
pylons.config  # to retrieve config
More about import in Python
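The difference can be reproduced without Pylons. The sketch below builds a throwaway package, here called demo_pkg (a made-up name), whose __init__.py defines an attribute named config but which has no config submodule. That is an assumption about how the installed Pylons version in the question is laid out, inferred from the behaviour described there.

```python
import importlib
import os
import sys
import tempfile

# Create demo_pkg/__init__.py containing an attribute called "config".
pkg_root = tempfile.mkdtemp()
os.mkdir(os.path.join(pkg_root, "demo_pkg"))
with open(os.path.join(pkg_root, "demo_pkg", "__init__.py"), "w") as f:
    f.write("config = {'debug': True}\n")
sys.path.insert(0, pkg_root)
importlib.invalidate_caches()

# Works: "from package import name" accepts any attribute of the package.
from demo_pkg import config

# Fails: "import package.name" requires name to be a module or subpackage.
try:
    import demo_pkg.config
    dotted_import_ok = True
except ImportError:
    dotted_import_ok = False

print(config, dotted_import_ok)
```

The same reasoning applies to from pylons.config import config: it can only succeed in a Pylons version where pylons.config is itself a module or package.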

Use SMO DLL in Python using win32com

I'm attempting to use the SQL Server SMO library from Python 2.7 using pyWin32. I can import win32com, but I've been stymied in any attempt to access the library. The code I am attempting is below.
import sys
sys.path.append(r'C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies')
import win32com.client
server = win32com.client.Dispatch('Microsoft.SqlServer.Management.SMO.Server')
def main():
    print server
if __name__ == '__main__':
    main()
When this code is run, I get pywintypes.com_error: (-2147221005, 'Invalid class string', None, None).
It seems likely that I am simply getting the library's name wrong in the call to Dispatch, but I can't figure out any way to know what it should be.
It seems like this may actually be a path problem.
This works:
import win32api
win32api.LoadLibrary(r'C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies\Microsoft.SqlServer.Smo.dll')
However, this does not:
import sys
sys.path.append(r'C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies')
import win32api
win32api.LoadLibrary(r'Microsoft.SqlServer.Smo.dll')
win32com is for using COM libraries and it doesn't know anything about .NET assemblies. Currently, there is no way to use .NET directly from CPython, so your options are to use IronPython or write a command line tool in C# or whatever and then call it from Python.
In addition to @Pondlife's answer, which is correct for the question asked, it is possible to load .NET assemblies in CPython using the Python for .NET module and the following code:
import clr
clr.AddReference("Microsoft.SqlServer.Smo")
from Microsoft.SqlServer.Management.Smo import Server
To make this work, I needed to add the assembly's path to PYTHONPATH manually. I couldn't get it to work using sys.path.append.
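On the path behaviour described in the question: sys.path only controls where Python looks for modules to import, so it plays no part when win32api.LoadLibrary asks Windows to locate a DLL by bare file name. A small sketch of resolving the full path first is shown here; find_library is a hypothetical helper, and the directory is the one quoted in the question.

```python
import os


def find_library(filename, directories):
    """Return the full path of filename in the first directory containing it, else None."""
    for directory in directories:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Directory taken from the question; adjust for the local installation.
SMO_DIRS = [r'C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies']
dll_path = find_library('Microsoft.SqlServer.Smo.dll', SMO_DIRS)

# On Windows with pywin32 installed, a non-None result could then be passed on:
#     import win32api
#     win32api.LoadLibrary(dll_path)
```

This only addresses locating the file. As the answers note, loading the DLL this way still does not give COM-style access to the .NET classes inside it.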

Python GData Import

I'm trying to use the gdata Python library, but when I execute my script it keeps telling me "ImportError: No module named docs".
I have tried importing it by running python directly in the shell and everything seems fine.
Can someone help me out with this problem?
edit:
import gdata.docs
import gdata.docs.service
import gdata.docs.client
import gdata.spreadsheet.service
I had this problem when I started. My guess is that your gdata library is not on your Python path. For example, my gdata and atom modules are located in the Python27/Lib/site-packages folder.
Another option is to update your PYTHONPATH environment variable to point to the current location of the gdata files.
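Since the import works in the shell but not in the script, it can help to run the same diagnostic in both places and compare the output. The sketch below is written for Python 3.4 or later (it relies on importlib.util.find_spec, which Python 2.7 does not have), and where_is is a made-up helper name.

```python
import importlib.util
import sys


def where_is(module_name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        # Note: for a dotted name, find_spec imports the parent package first.
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (e.g. "gdata" for "gdata.docs") is missing.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print(sys.executable)            # which interpreter is running
    for entry in sys.path:           # where it searches for modules
        print("  ", entry)
    print("gdata.docs ->", where_is("gdata.docs"))
```

If the two runs report different interpreters or different search paths, the script is not using the environment in which gdata was installed.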
