Python UDF in Pig - python

Whenever I try to import external Python packages in a Pig UDF, I get the following error:
Python Error. Traceback (most recent call last):
File "pythonudf.py", line 5, in <module>
from bs4 import BeautifulSoup
ImportError: No module named bs4
I've tried including the library path
import sys
sys.path.append('/usr/local/lib/python3.5/dist-packages')
And set
export JYTHONPATH=$JYTHONPATH:/usr/local/lib/python3.5/dist-packages
But it is still showing the same error.
What else can I do?
The script isn't running in local or mapreduce mode.
PS: Other functions which do not import external packages are running perfectly.
EDIT:
The packages in the python code are installed.

Use the -embedded option when running Pig with a Python UDF that imports packages:
pig -embedded jython pythonudf.py

Related

import module failed with running by console

First of all the project structure looks like this
-- ba-amin-code (main directory)
-- Diverse
-- LanguageIdentification.py
-- TextAlignment
-- FileHandler.py
-- TextHandler.py
I want to import a Python file from TextAlignment when I'm in Diverse.
[Image: project structure and import statements]
This is how I imported it, but I'm getting an error when running it like this.
Is this the wrong way to import?
Update:
This is how I'm importing two .py files from TextAlignment.
I'm in LanguageIdentification.py
import fasttext
import urllib
import typer
from TextAlignment import TextHandler
from TextAlignment import FileHandler
After that I'm running the program from the root directory (ba-amin-code) in the console with this command
python Diverse/LanguageIdentification.py "INPUT.txt"
getting this error
Traceback (most recent call last):
File "C:\Users\Snur\Pycharm Projects\Gitlab\ba-amin-code\Diverse\LanguageIdentification.py", line 4, in <module>
from TextAlignment import TextHandler
ModuleNotFoundError: No module named 'TextAlignment'
The package might be installed in a different venv. Did you try running your code from the PyCharm console, with your venv activated?
You have two yellow warnings in PyCharm at your imports: what are they?
Otherwise, try the command pip list to see whether the package is correctly installed in your venv.
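Another possible cause, offered here as an assumption rather than a confirmed diagnosis: when a script is started as python Diverse/LanguageIdentification.py, Python puts the script's own folder (Diverse) at the front of sys.path, not the directory the command was typed in, so the sibling TextAlignment folder is not searched. A minimal sketch of one workaround follows; the helper name add_project_root is made up for illustration.

```python
import sys
from pathlib import Path


def add_project_root(script_file, levels_up=1):
    """Put the directory `levels_up` above the script's folder on sys.path.

    Hypothetical helper: for Diverse/LanguageIdentification.py, levels_up=1
    resolves to the ba-amin-code root, which contains TextAlignment.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


# In LanguageIdentification.py this would go before the TextAlignment imports:
#     add_project_root(__file__)
#     from TextAlignment import TextHandler
```

Running the script as a module from the root directory (python -m Diverse.LanguageIdentification "INPUT.txt") may achieve the same result without touching sys.path, because with -m the current directory is searched for imports.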

When trying to import selenium it does not work

When I attempt to import selenium I get the error:
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
import selenium
ModuleNotFoundError: No module named 'selenium'
My selenium module is currently in:
C:\Users\Maseeek\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium
I've seen other people have it in a different directory and want to know how to fix it.
If you are using PyCharm and created the project with a virtual environment, you should choose "Python 3.x.x" as the interpreter instead of the "Virtual Environment Python 3.9" that PyCharm created.
If you installed Anaconda earlier, you need to change the paths of the libraries.
If you're using PyCharm, then this might help:
PyCharm sometimes shows an error in importing modules that have already been installed (that's annoying). But, try to change the file location to this destination:
C:\Users\Maseeek\PyCharm Projects\(project name)\venv\site-packages\
or try installing the package from within PyCharm by running the following Python code:
import subprocess
import sys
# Run pip with the same interpreter that is executing this script
subprocess.check_call([sys.executable, "-m", "pip", "install", "selenium"])

How can I run a python code with Keras module in VS Code and Ubuntu?

The OS is Ubuntu 16.04. I have installed the Python extension for VS Code. I can run a Hello World program in VS Code, but when there is import keras in the code, I encounter an error:
[Running] python "/home/lym/Documents/py/test.py"
Traceback (most recent call last):
File "/home/lym/Documents/py/test.py", line 8, in <module>
import keras
ImportError: No module named keras
I can run this code in the terminal, but it seems that VS Code doesn't recognize the Keras module.
Thanks!
Most likely you are using different Python interpreters in VS Code and in the terminal.
Check which interpreter and search path VS Code uses:
import sys
print(sys.executable)
print(sys.path)
and compare the interpreter to the result of this command in your terminal:
which python
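The comparison above can be made a little more direct by running the same small script in both places. This is a sketch, assuming Python 3.6 or later (on Python 2, printing sys.executable alone serves the same purpose); the function name describe_environment is made up for illustration.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether it can see a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "version": sys.version.split()[0],
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    # "keras" is the module from the question; use any top-level module name.
    for key, value in describe_environment("keras").items():
        print(f"{key}: {value}")
```

If the interpreter line differs between VS Code and the terminal, the package was most likely installed for the other interpreter.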

Not able to import module after installing using pip

I'm trying to import a module called geoip2 from PyPI into Python; it is not included in the standard library.
I open command prompt and type:
pip install geoip2
The command prompt returns
Successfully installed geoip2-2.4.2
After it is installed I try importing it using IDLE:
import geoip2.webservice
which returns the error:
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
import geoip2.webservice
ImportError: No module named 'geoip2'
Although it is already installed, I cannot use it. How can I prevent this? Note that I use Python 3.6.
Maybe you have two different versions of Python installed. Try opening IDLE using the Python version where you installed geoip2.
Instead of:
import geoip2.webservice
Try doing:
import geoip2
from geoip2 import webservice
This is because geoip2 is the installed package, and webservice is a submodule of that package.
Further, you can avoid typing geoip2.webservice every time by doing:
import geoip2
from geoip2 import webservice as gws
Then any time you want to use the webservice module, you can just use gws.
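Alternatively, just do:
import geoip2
Then in your script you can reach it through the package name (this assumes the geoip2 package itself loads its webservice submodule on import, which is not guaranteed):
geoip2.webservice  # do stuff here, or however you use the module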
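Whether webservice is reachable after a plain import geoip2 depends on how Python treats packages and their submodules. The sketch below illustrates that behaviour without needing geoip2 installed: it builds a throwaway package on disk, and the names demo_pkg and service are made up for the illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/__init__.py and demo_pkg/service.py.
# Both names are made up for this illustration.
base = tempfile.mkdtemp()
pkg_dir = os.path.join(base, "demo_pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "service.py"), "w") as fh:
    fh.write("def ping():\n    return 'pong'\n")

sys.path.insert(0, base)
importlib.invalidate_caches()

import demo_pkg

# The submodule is not an attribute of the package until it is imported.
loaded_before = hasattr(demo_pkg, "service")

from demo_pkg import service as svc  # same pattern as "webservice as gws"

loaded_after = hasattr(demo_pkg, "service")
print(loaded_before, loaded_after, svc.ping())
```

With an empty __init__.py, the submodule only becomes available once it is imported explicitly. None of this helps if the package cannot be found at all, which is what "No module named 'geoip2'" reports.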

'No module named happybase' when running from PIG

I have a Python UDF which is connecting to HBase using Happybase. If I run the code from Python 2.7 it works perfectly.
However when I call the Python UDF from Pig 0.15.0 I am getting the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1121: Python Error. Traceback (most recent call last): import happybase ImportError: No module named happybase
In my Pig script I am registering my Python script (pigtest.py) like this:
REGISTER 'pigtest.py' using jython as myfuncs;
I tried to set the Happybase path in my Python script as follows but that didn't make a difference:
import sys
sys.path.append('/usr/local/lib/python2.7/dist-packages/happybase')
import happybase
I also tried adding "/usr/local/lib/python2.7/dist-packages/happybase" to the JYTHON_PATH in the .bashrc file (I'm on Ubuntu) but same error comes up.
It seems to me like I need to set the Happybase path somewhere, but I can't figure out where.
I was able to solve this with the help of the Jython user mailing list. You need to specify the parent directory of the Happybase folder, not the path to the Happybase dir itself like I was doing, by doing one of the following:
Append the location to the sys.path in the Python script:
import sys
sys.path.append('/usr/local/lib/python2.7/dist-packages')
import happybase
OR
Add the location as an environment variable to JYTHONPATH (not JYTHON_PATH!):
export JYTHONPATH=$JYTHONPATH:/usr/local/lib/python2.7/dist-packages
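The rule behind this fix (a search path entry must be the directory that contains the package folder, not the package folder itself) can be checked with a small experiment. This is a sketch run under ordinary CPython 3, on the assumption that Jython resolves packages the same way; the package name fakebase and the helper can_import are made up for illustration.

```python
import importlib
import os
import sys
import tempfile


def can_import(name, path_entry):
    """Return True if `name` imports with `path_entry` temporarily on sys.path."""
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
    finally:
        sys.path.remove(path_entry)
        sys.modules.pop(name, None)  # forget the module so each check starts fresh


# Throwaway stand-in for dist-packages/happybase; "fakebase" is a made-up name.
site_dir = tempfile.mkdtemp()
package_dir = os.path.join(site_dir, "fakebase")
os.makedirs(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()

print(can_import("fakebase", package_dir))  # the package folder itself
print(can_import("fakebase", site_dir))     # its parent directory
```

Only the second call should succeed, which mirrors the difference between appending .../dist-packages/happybase and appending .../dist-packages.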
