mrjob virtualenv error in Hadoop cluster: Permission denied - python

I work at a large corporate organization where we have a Hadoop cluster. I got the admin to install virtualenv on all the Hadoop worker nodes so that I can submit mrjob jobs with standard Python dependencies that may not exist on the worker nodes. As per the documentation here, this is what my mrjob.conf file looks like:
runners:
  hadoop:
    setup:
    - virtualenv venv
    - . venv/bin/activate
    - pip install nltk
I have a simple job that uses the nltk package. I can verify that this setup script runs on the worker nodes (I can add simple commands, such as writing some data to a file in /tmp, and they work). However, I get the following error:
New python executable in venv/bin/python
Installing setuptools............done.
Installing pip...
Error [Errno 13] Permission denied while executing command /storage5/hadoop/map...env/bin/easy_install /usr/share/python-virtualenv/pip-1.1.tar.gz
...Installing pip...done.
Traceback (most recent call last):
File "/usr/bin/virtualenv", line 3, in <module>
virtualenv.main()
File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 938, in main
never_download=options.never_download)
File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 1054, in create_environment
install_pip(py_executable, search_dirs=search_dirs, never_download=never_download)
File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 643, in install_pip
filter_stdout=_filter_setup)
File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 976, in call_subprocess
cwd=cwd, env=env)
File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
What may be causing this error?

Thanks for this idea for deploying packages to the cluster.
As for your problem, I think the task doesn't have permission to write to the directory it is working in.
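One way to test that guess is to add a small check to the setup script and look at the task logs. The following is only a minimal sketch: the function name describe_access is made up for illustration, and it assumes the directory of interest is the task's current working directory, which the question does not confirm.

```python
import os


def describe_access(path):
    """Report what the current user may do with `path` (hypothetical helper)."""
    return {
        "exists": os.path.exists(path),
        "is_dir": os.path.isdir(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "executable": os.access(path, os.X_OK),
    }


if __name__ == "__main__":
    # Assumption: `virtualenv venv` writes into the current working directory.
    for name, value in describe_access(os.getcwd()).items():
        print(name, value)
```

If "writable" comes back False for the working directory, that supports the write-permission theory. If it is True, the failure is more likely in executing a file than in writing one.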

Related

Getting permission denied while using virtual env

I'm trying to install OpenCV 2 on shared hosting inside a virtualenv.
I already have numpy and the other dependencies installed using pip. I'm just having a bit of trouble with OpenCV 2.
I run this command in the ssh session:
(penv)[dire]$ cmake -D MAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=$VIRTUAL_ENV/local/ -D PYTHON_EXECTUABLE=$PYTHONPATH/python2.7 -D PYTHON_PACKAGES_PATH=$VIRTUAL_ENV/lib/python2.7/site-packages -D INSTALL_PYTHON_EXAMPLES=ON ..
and the error I get is
Traceback (most recent call last):
File "/home/bashtroubles/website.com/public/NNPics/penv/bin/cmake", line 11, in <module>
sys.exit(cmake())
File "/home/bashtroubles/website.com/public/NNPics/penv/local/lib/python2.7/site-packages/cmake/__init__.py", line 33, in cmake
raise SystemExit(_program('cmake', sys.argv[1:]))
File "/home/bashtroubles/website.com/public/NNPics/penv/local/lib/python2.7/site-packages/cmake/__init__.py", line 29, in _program
return subprocess.call([os.path.join(CMAKE_BIN_DIR, name)] + args)
File "/usr/lib/python2.7/subprocess.py", line 493, in call
return Popen(*popenargs, **kwargs).wait()
File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied
(penv)[dire]$
This is also my .bashrc if it matters
# ~/.bashrc: executed by bash(1) for non-login shells.
# Load pythonbrew
alias pb='pythonbrew'
export PYTHONPATH=~/.pythonbrew/pythons/Python-2.7.3/lib
[[ -s /home/bashtroubles/.pythonbrew/etc/bashrc ]] && source /home/bashtroubles/.python$
# Load custom python installation
export PATH=~/opt/python-2.7.3/bin:${PATH}
export PYTHONPATH=~/opt/python-2.7.3/lib
The specific version is opencv-2.4.13 and the python version is 2.7.3
I believe the issue is because it's using the python2.7 from
File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
raise child_exception
Any ideas on how to get this going without a permission denied issue?
I ran into this problem too. It looks like the binary it's trying to call isn't marked as executable. I ran this to change the permissions:
sudo chmod +x -R /usr/local/lib/python2.7/dist-packages/cmake-3.13.3-py2.7-linux-x86_64.egg/cmake/data/bin
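If sudo is not available (for example on shared hosting), something similar can be done from Python for files you own. This is a rough sketch rather than an exact equivalent of chmod +x -R: the function name add_exec_bits is made up, it only touches regular files (not directories), and it sets the execute bit for user, group and others regardless of umask.

```python
import os
import stat


def add_exec_bits(root):
    """Add execute permission to regular files under `root`; return changed paths."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and special files
            mode = stat.S_IMODE(os.stat(path).st_mode)
            wanted = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
            if wanted != mode:
                os.chmod(path, wanted)
                changed.append(path)
    return changed
```

You would call it on the cmake/data/bin directory inside your own site-packages, and it will raise PermissionError for files you do not own.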

Azure Batch Apps install python packages on startup

We are using Azure Batch Apps, which creates multiple VMs that we use to run our tasks in parallel. We are using Python for data fetching tasks.
We have configured Batch Apps to install Anaconda on the VMs when they start up, and Anaconda is installed properly. We have listed the packages we need to run the tasks in requirements.txt and install them with:
pip install -r requirements.txt
Some packages are installed correctly, but others fail with the following error:
Error [Error 6] The handle is invalid while executing command python setup.py egg_info
Exception:
Traceback (most recent call last):
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\basecommand.py", line 209, in main
status = self.run(options, args)
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\commands\install.py", line 310, in run
wb.build(autobuilding=True)
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\wheel.py", line 748, in build
self.requirement_set.prepare_files(self.finder)
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\req\req_set.py", line 360, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\req\req_set.py", line 591, in _prepare_file
abstract_dist.prep_for_dist()
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\req\req_set.py", line 127, in prep_for_dist
self.req_to_install.run_egg_info()
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\req\req_install.py", line 430, in run_egg_info
command_desc='python setup.py egg_info')
File "C:\user\tasks\shared\anaconda2\lib\site-packages\pip\utils\__init__.py", line 678, in call_subprocess
cwd=cwd, env=env)
File "C:\user\tasks\shared\anaconda2\lib\subprocess.py", line 702, in __init__
errread, errwrite), to_close = self._get_handles(stdin, stdout, stderr)
File "C:\user\tasks\shared\anaconda2\lib\subprocess.py", line 823, in _get_handles
p2cread = _subprocess.GetStdHandle(_subprocess.STD_INPUT_HANDLE)
WindowsError: [Error 6] The handle is invalid
When we log in to the VM and run the same command, all packages are installed correctly.
I just wonder where the issue is.
It seems that the issue is caused by limits of the Azure Batch service; you can see these limits here.
According to the error information, the installation process needs to start a subprocess, but the maximum number of tasks per compute node is 4.

mkvirtualenv python3.2 permission denied

Trying to create a virtualenv using the command:
mkvirtualenv -p /usr/local/lib/python3.2 splinter
Gives me the response:
Running virtualenv with interpreter /usr/local/lib/python3.2
Traceback (most recent call last):
File "/usr/local/bin/virtualenv", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.2/dist-packages/virtualenv.py", line 784, in main
popen = subprocess.Popen([interpreter, file] + sys.argv[1:], env=env)
File "/usr/lib/python3.2/subprocess.py", line 745, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.2/subprocess.py", line 1361, in _execute_child
raise child_exception_type(errno_num, err_msg)
OSError: [Errno 13] Permission denied
How oh how can I start a virtualenv using python3.2?
You need to supply the path to the Python interpreter with -p, not the lib directory.
Because you're passing that directory, virtualenv is trying to execute it, and therefore you get Permission denied. So use the path to the python executable in the bin directory instead (use which python3.2 to find out if you don't know the location).
This should work, assuming your Python 3.2 interpreter can be found at /usr/local/bin/python3.2:
mkvirtualenv -p /usr/local/bin/python3.2 splinter
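If you are unsure whether a given path is usable with -p, a quick check along these lines can help. It is only a sketch: check_interpreter is a made-up helper name, and it tests just the conditions described above (the path exists, is not a directory, and is executable by you), not whether the file is really a Python interpreter.

```python
import os
import sys


def check_interpreter(path):
    """Return None if `path` looks runnable, else a short reason (hypothetical helper)."""
    if not os.path.exists(path):
        return "path does not exist"
    if os.path.isdir(path):
        return "path is a directory, not an executable"
    if not os.access(path, os.X_OK):
        return "file is not executable by the current user"
    return None


if __name__ == "__main__":
    # Check the path given on the command line, or the running interpreter.
    target = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    print(target, "->", check_interpreter(target) or "looks OK")
```

Passing a lib directory such as the one in the question would be reported as a directory, which is the case that ends in Permission denied when virtualenv tries to execute it.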

Chromium build gclient runhooks error number 13

I am getting the following error while running gclient runhooks for building chromium.
running '/usr/bin/python src/tools/clang/scripts/update.py --if-needed' in '/media/usrname/!!ChiLL out!!'
Traceback (most recent call last):
File "src/tools/clang/scripts/update.py", line 283, in <module>
sys.exit(main())
File "src/tools/clang/scripts/update.py", line 269, in main
stderr=os.fdopen(os.dup(sys.stdin.fileno())))
File "/usr/lib/python2.7/subprocess.py", line 522, in call
return Popen(*popenargs, **kwargs).wait()
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied
Error: Command /usr/bin/python src/tools/clang/scripts/update.py --if-needed returned non-zero exit status 1 in /media/usrname/!!ChiLL out!!
To get permission on the directory src/tools/clang/scripts I tried chown and chmod, but I got the same error.
I think the Python scripts in the scripts directory are trying to modify some other files or directories. Try to trace what they are trying to do. You have not specified the OS you are working on. See this link: https://github.com/aerospike/aerospike-client-python/issues/22
It says Linux Mint 17 is not officially supported.
Actually, the directory was not mounted with execute permission. So I remounted it with execute permission using
mount -o exec /dev/sda5 /media/usrname
and it worked fine.
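Before remounting, you can confirm that a noexec mount is the cause. Here is a minimal sketch, assuming Linux and Python 3 (os.ST_NOEXEC is not available on every platform); the function name is_mounted_noexec is made up for illustration.

```python
import os
import sys


def is_mounted_noexec(path):
    """Return True if the filesystem holding `path` is mounted with noexec."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_NOEXEC)  # ST_NOEXEC: Linux-specific constant


if __name__ == "__main__":
    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    print(target, "noexec:", is_mounted_noexec(target))
```

If this prints True for the checkout directory, chown and chmod on individual files will not help, because the restriction comes from the mount options rather than from the file modes.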

Why does virtualenv throw an error when I try to define the python version as 2.7?

I tried to create a new virtualenv directory with sudo virtualenv curdir -p /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7 and it threw the following error:
Running virtualenv with interpreter /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
Traceback (most recent call last):
File "/usr/local/bin/virtualenv", line 9, in <module>
load_entry_point('virtualenv==1.6.4', 'console_scripts', 'virtualenv')()
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/virtualenv.py", line 785, in main
popen = subprocess.Popen([interpreter, file] + sys.argv[1:], env=env)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/subprocess.py", line 741, in __init__
restore_signals, start_new_session)
File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/subprocess.py", line 1356, in _execute_child
raise child_exception_type(errno_num, err_msg)
OSError: [Errno 13] Permission denied
I understand that I was not allowed to do that, but why? python 2.7 is located there and I want to use it. Is there any way to use it in my virtualenv?
I hope that this isn't too basic of a question. I am still pretty new to Unix command line.
You have to point to the python executable, which you are not doing here. It's located at /Library/Frameworks/Python.framework/Versions/2.7/bin/python. Run this:
sudo virtualenv curdir -p /Library/Frameworks/Python.framework/Versions/2.7/bin/python
