No module named 'resource' installing Apache Spark on Windows - python

I am trying to install apache spark to run locally on my windows machine. I have followed all instructions here https://medium.com/#loldja/installing-apache-spark-pyspark-the-missing-quick-start-guide-for-windows-ad81702ba62d.
After this installation I am able to successfully start pyspark, and execute a command such as
textFile = sc.textFile("README.md")
When I then execute a command that operates on textFile such as
textFile.first()
Spark gives me the error 'worker failed to connect back', and I can see an exception in the console coming from worker.py saying 'ModuleNotFoundError: No module named resource'. Looking at the source file I can see that this python file does indeed try to import the resource module, however this module is not available on windows systems. I understand that you can install spark on windows so how do I get around this?

I struggled the whole morning with the same problem. Your best bet is to downgrade to Spark 2.3.2

The fix can be found at https://github.com/apache/spark/pull/23055.
The resource module is only for Unix/Linux systems and is not applicaple in a windows environment. This fix is not yet included in the latest release but you can modify the worker.py in your installation as shown in the pull request. The changes to that file can be found at https://github.com/apache/spark/pull/23055/files.
You will have to re-zip the pyspark directory and move it the lib folder in your pyspark installation directory (where you extracted the pre-compiled pyspark according to the tutorial you mentioned)

Adding to all those valuable answers,
For windows users,Make sure you have copied the correct version of the winutils.exe file(for your specific version of Hadoop) to the spark/bin folder
Say,
if you have Hadoop 2.7.1, then you should copy the winutils.exe file from the Hadoop 2.7.1/bin folder
Link for that is here
https://github.com/steveloughran/winutils

I edited worker.py file. Removed all resource-related lines. Actually # set up memory limits block and import resource. The error disappeared.

Related

missing_dependencies error using Pandas in Azure Web Job

I need to run some long running job via Azure Web Job in Python.
I am facing below error trying to import pandas.
File "D:\local\Temp\jobs\triggered\demo2\eveazbwc.iyd\pandas_init_.py", line 13
missing_dependencies.append(f"{dependency}: {e}")
The Web app (under which I will run the web job) also has python code using pandas, and it does not throw any error.
I have tried uploading pandas and numpy folder inside the zip file (creating venv, installing packages and zipping Lib/site-packages content), (for 32 bit and 64 bit python) as well as tried appending 'D:/home/site/wwwroot/my_app_name/env/Lib/site-packages' to sys.path.
I am not facing such issues in importing standard python modules or additional package modules like requests.
Error is also thrown in trying to import numpy.
So, I am assuming some kind of version mismatch is happening somewhere.
Any pointers to solve this will be really useful.
I have been using Python 3.x, not sure if I should try Python 2.x (virtual env, install package and zip content of Lib/site-packages).
Regards
Kunal
The key to solving the problem is to add Python 3.6.4 x64 Extension on the portal.
Steps:
Add Extensions on portal.
Create a requirements.txt file.
pandas==1.1.4
numpy==1.19.3
Create run.cmd.
Create below file and zip them into a single zip.
Upload the zip for the webjob.
For more details, please read this post.
Webjobs Running Error (3587fd: ERR ) from zipfile

Install PL/Python on Windows for PostgreSQL 12

I've been working on FHIR for a project and we are using PostgreSQL as a database. While reading docs, I've come to know about PL/Python and decided to give it a shot but I am unable to install the python extension.
When I run the command CREATE EXTENSION pypthon3u; I get the following error
Could not load library "C:/Program Files/PostgreSQL/12/lib/plpython3.dll": The specified module could not be found.
I've checked this SO answer but it couldn't help.
My PostgreSQL version: PostgreSQL 12.2, compiled by Visual C++ build 1914, 64-bit
Installed Python version: 3.7.7 (64 Bit)
OS Info: Windows 10 Enterprise Version 1909 OS Build 18363.657
For me, it looks like incorrect version of Python but I'm installing python 3.7.* version against which PostgreSQL is compiled as specified in doc\installation-notes.html inside the install directory.
Any help will be appreciated.
Even if you use the EDB installer's Stack Builder to install Python, you still have to follow the instructions to "ensure they are included in the PATH variable under which the database server will be started". I had to do this at the system level, as I can't find a way to set the PATH for individual services.
And then you also need to set PYTHONPATH as well, which seems to be undocumented.
So I ended up adding c:\edb\languagepack\v1\Python-3.7 to PATH and creating PYTHONPATH with c:\edb\languagepack\v1\Python-3.7\Lib.
I had to add directory containing plpython3.dll do system/user variables (windows).
I was trying the same. Though I am no expert and my errors were a bit different, here are a few things to check:
under share\extension\, there should be plpython3u.control and plpython3u--1.0.sql to be able to do CREATE EXTENSION plpython3u;
ensure you are not having any typo in the command above.
under lib\, there should be a plpython3.dll. This one is possibly created after the above step.
make sure lib\ is in your PATH.
PYTHONHOME should be set in either system/user environment, or should be set before the server start. It seems the version is not as much important as having this name set. plpython3.dll seems to look for a python39.dll but I could use it with a 3.8 installation.
I used a zipped version of Postgres, so the following is my way to run the server.
set PATH=%PATH%;d:\pgsql\bin\;d:\pgsql\lib\
set PYTHONHOME=c:\DevTools\Python\Python38
pg_ctl -D d:\pgsql\pgdata -l logfile start
PS: without PYTHONHOME, the server crashes. the server itself or psql restarts the server but I am not sure about pgAdmin.

How to fix this error on tabula.read_pdf() function in Python

I am trying to extract tables from a PDF file using Python (Pycharm).
I tried the following code:
from tabula import wrapper
object = wrapper.read_pdf("C:/Users/Ojasvi/Desktop/sample.pdf")
However, the error i got was:
"tabula.errors.JavaNotFoundError: `java` command is not found from this Python process. Please ensure Java is installed and PATH is set for `java`"
You probably need to add java to your system path. You can check those posts, they should help you in solving your problem:
How to Set Java Path On Windows?
Environment variables for java installation
I had every thing setup Java installed and Java path setup but was still getting same error, after spending half a day , I did below and everything worked.
I was using a python environment and running Tabula in python environment. I was getting error mentioned in questions.
I changed my python environment basically default one which is no environment and everything worked. I think Tabula is not able to detect Java once we are inside an python environment.

ImportError: No module named 'resource'

I am using python 3.5 and I am doing Algorithms specialization courses on Coursera. Professor teaching this course posted a program which can help us to know the time and memory associated with running a program. It has import resource command at the top. I tried to run this program along with the programs I have written in python and every time I received ImportError: No module named 'resource'
I used the same code in ubuntu and have no errors at all.
I followed suggestions in stackoverflow answers and I have tried adding PYTHONPATH PYTHONHOME and edited the PATH environment variable.
I have no idea of what else I can do here.
Is there any file that I can download and install it in the Lib or site-packages folder of my python installation ?
resource is a Unix specific package as seen in https://docs.python.org/2/library/resource.html which is why it worked for you in Ubuntu, but raised an error when trying to use it in Windows.
I ran into similar error in window 10. Here is what solved it for me.
Downgrade to the Apache Spark 2.3.2 prebuild version
Install (or downgrade) jdk to version 1.8.0
My installed jdk was 1.9.0, which doesn't seem to be compatiable with spark 2.3.2 or 2.4.0
make sure that when you run java -version in cmd (command prompt), it show java version 8. If you are seeing version 9, you will need to change your system ENV PATH to ensure it points to java version 8.
Check this link to get help on changing the PATH if you have multiple java version installed.
Hope this helps someone, I was stuck on this issue for almost a week before finally finding a solution.

HTTP Error 502.2 - Bad Gateway Configuring IIS 7 with Python 2.7

I am a newbie to Stack Overflow (first post), but really see the use of this website.
I'm stumped. We are trying to setup IIS 7.0 to run with WinPython 2.7 on a Windows 7 machine.
I am an IIS newb, but veteran Python user. IIS 7 can NOT find a library, which python finds, and executes, perfectly when ran on it's own. When executed via IIS, the script fails with a traceback, and IIS returns the 502.2.
I found this thread http://forums.iis.net/p/1209465/2073173.aspx?HTTP+Error+502+2+Bad+Gateway+Frustrations but the advised solution is simply another troubleshooting suggestion.
I found IIS's description (http://support.microsoft.com/kb/942057) of the error helpful, but futile.
I found Python's start-up options/parameters helpful (http://docs.python.org/2/using/cmdline.html), but futile.
I found IIS's advice for configuring Python helpful (http://support.microsoft.com/kb/276494, but (questionably?) incomplete.
This thread on manually defining an alternate bin folder (http://forums.asp.net/t/1303052.aspx?Tell+IIS+to+load+dll+from+another+directory+not+Bin+web+config+) might be where my solution lies, but I don't think it is because of the fact that this all worked on 2.6 without doing that to IIS.
IIS seems to allow python to import any module that is just a python script. As soon as it gets to a *.pyd (basically just python's version of a dll file) file, it screams. I'm no pro when it comes to DLLs and windows environments, but wouldn't IIS have to have paths to a bin folder of some kind? Do I have to manually edit them, as discussed in the last link above?
ACTUAL ERROR Details below for DLL failed Load:
The Error :
" HTTP Error 502.2 - Bad Gateway The specified CGI application
misbehaved by not returning a complete set of HTTP headers. The
headers it did return are "Traceback (most recent call last): File
"\estorage.equitable.int\riskmgmt\Quants\web\LinksPage.py", line 2,
in import pyweb File
"\estorage.equitable.int\riskmgmt\Quants\Common2014\Python\pyweb__init__.py",
line 5, in from core import * File
"\estorage.equitable.int\riskmgmt\Quants\Common2014\Python\pyweb\core.py",
line 2, in from pylib import pgdb File
"\estorage.equitable.int\riskmgmt\Quants\Common2014\Python\pylib\pgdb.py",
line 8, in from scikits import timeseries as ts File
"C:\WinPython-32bit-2.7.6.2-20140401\python-2.7.6\lib\site-packages\scikits.timeseries-0.91.3-py2.7-win32.egg\scikits\timeseries__init__.py",
line 13, in import const File
"C:\WinPython-32bit-2.7.6.2-20140401\python-2.7.6\lib\site-packages\scikits.timeseries-0.91.3-py2.7-win32.egg\scikits\timeseries\const.py",
line 79, in from cseries import freq_constants ImportError:
DLL load failed: The specified module could not be found. ".
I'm confident that the python environment is configured properly, as the script runs from the same executable (python.exe) via a command line. I'm thinking that I don't have IIS configured properly, for the new Python 2.7 install. The same script worked yesterday, on IIS and python 2.6. But during our upgrade from 2.6 to 2.7, a bunch of PATH and PYTHONPATH parameters all changed, plus we went from ActivePython to WinPython. WinPython is "registered" on the machine.
What I've tried
confirming python's sys.path is as expected at run-time in both IIS and command line - it is.
using the module from python command line.
recompiling the failing module using two different compilers (ming32 and VS2008).
putting duplicates of my new 2.7 modules in the old python26 folder.
pulling out lots of hair and other hacky stuff.
My next step, is to post this same message on a python forum. If anybody can advise on a good one for python-IIS related challenges, that would be appreciated.
Please help! Thanks in advance.
I got this 502.2 error when doing a clean installation of PHP 5.5 in Windows Server 2012 R2 with IIS 8.5.
It turns out PHP is a Visual C++ application which needs the library MSVCR110.dll in order to run properly. My computer does not have Visual Studio 2012 installed and thus it does not have this file. I got my problem solved by installing the Visual C++ Redistributable Packages https://www.microsoft.com/en-us/download/details.aspx?id=30679#
(Note: jc77 is my associate, and I'm actually the OP, as this was an x-post from IIS forums.)
We solved the problem.
tl,dr; portable python + sloppy/rookie compiling = strange behaviour + frustrations.
Bottom line, compile properly. For scikits.timeseries, using ming32 everything will walk, talk, and sound like it works in Spyder.exe, but not in python.exe. You have to use VS2008, if you want it to work in both.
More Info:
Winpython (as well as others) presents itself as identical to any other python installations, if you "register" the installation. It works great, 99% of the time. We learned the hard way, that "Winpython Interpreter.exe" and "python.exe" provided in the install are in fact different. Can't explain why, but the two executables gave different behavior. We were doing all our testing in Spyder, which must use "winpython interpreter.exe". The module which IIS couldn't find, would import and run no problem in Spyder. Then, in IIS, using python.exe, the module wouldn't import. We were operating on the assumption that the IDE would use python.exe, and that the stack was identical. As, 99% of the time, they appear to be. The way we were compiling scikits worked in winpython interpreter.exe. We were making a rookie mistake when compiling scikits, but it went un-noticed because it was working fine in our IDE (Spyder).
I'm adding these keywords for others : Anybody else who receives errors like this is likely using a portable python installation AND not compiling something properly. Winpython, Portable Python, eGenix, [and possibly?] Active State and Enthought Canopy.
While trying to configure CGI to run Perl in Windows 8.1, I had HTTP Error 502.2, but then I read loste's post and solved the problem. I had previously installed both Perl64 and Strawberry Perl. Although the IIS EventHandler pointed to only the Perl64 directory, both directories appeared in my Windows PATH variable. I prefer Strawberry Perl, so I changed the EventHandler to point to the Strawberry Perl directory and deleted the paths to Perl64 from the Windows PATH variable to solve the error.
Try this
print("Content-Type: text/html\n")
print("Hello Python World!")
You must specify the type of document

Categories

Resources