Cannot Find Module 'preproc' in Python/PySpark - python

I am trying to follow this tutorial: https://runawayhorse001.github.io/LearningApacheSpark/textmining.html
I have loaded my data into a PySpark DataFrame, but when I get to the preprocessing step I receive the error "ModuleNotFoundError: No module named 'preproc'". I can't find any information online about what to pip install in order to use the preproc module.
Running !pip install preproc within a Jupyter notebook returns:
Defaulting to user installation because normal site-packages is not writeable
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
ERROR: Could not find a version that satisfies the requirement preproc (from versions: none)
ERROR: No matching distribution found for preproc
Running python -m pip install preproc within cmd returns:
ERROR: Could not find a version that satisfies the requirement preproc (from versions: none)
ERROR: No matching distribution found for preproc
How do I go about finding the correct package to install?

I emailed the tutorial's creator and am posting his response for anyone who needs help in the future:
"The preproc module is designed for the preprocessing functions, such as check_blanks, check_lang, remove_features etc. If you include those functions explicitly, you do not need to import the preproc module."

The functions used are defined earlier in the tutorial:
https://runawayhorse001.github.io/LearningApacheSpark/textmining.html#text-preprocessing
You can use them directly, like check_lang_udf = udf(check_lang, StringType()).
Alternatively, save these functions in a Python file named preproc.py in the same directory as your notebook or script, so that import preproc works.
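Saving the functions to preproc.py can be sketched as follows. This is a minimal illustration, not the tutorial's code: the body of check_blanks is a simplified stand-in for the tutorial's function, and the temporary directory is a hypothetical stand-in for your project folder. In practice you would simply create preproc.py by hand next to your notebook.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for your project folder.
project_dir = Path(tempfile.mkdtemp())

# Simplified stand-in for one of the tutorial's helper functions.
module_source = '''
def check_blanks(data_str):
    """Return "True" if the string contains only whitespace, else "False"."""
    return str(data_str.isspace())
'''
(project_dir / "preproc.py").write_text(module_source)

# Python can import preproc once its folder is on sys.path. The folder of
# the running script or notebook is normally searched automatically, so
# this line is only needed because the file lives in a temporary folder.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

import preproc

print(preproc.check_blanks("   "))        # whitespace only
print(preproc.check_blanks("some text"))  # contains non-whitespace
```

With PySpark available, the imported function can then be wrapped the same way as in the tutorial, for example udf(preproc.check_blanks, StringType()).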

Related

Python package version not found, but it's clearly there

I am trying to create a specific Python environment inside Docker for reproducible builds, but the OpenCV package (opencv_python), which I previously installed manually, refuses to install with this error:
ERROR: Could not find a version that satisfies the requirement opencv_python==4.7.0 (from versions: 3.4.0.14, 3.4.10.37, 3.4.11.39, 3.4.11.41, 3.4.11.43, 3.4.11.45, 3.4.13.47, 3.4.15.55, 3.4.16.57, 3.4.16.59, 3.4.17.61, 3.4.17.63, 3.4.18.65, 4.3.0.38, 4.4.0.40, 4.4.0.42, 4.4.0.44, 4.4.0.46, 4.5.1.48, 4.5.3.56, 4.5.4.58, 4.5.4.60, 4.5.5.62, 4.5.5.64, 4.6.0.66, 4.7.0.68)
ERROR: No matching distribution found for opencv_python==4.7.0
Command was:
pip3 install face_recognition==1.3.0 opencv_python==4.7.0
Inside Docker: Ubuntu 22.04; Python 3.10.6; pip 22.0.2
Why can't pip3 find opencv_python version 4.7.0 when it's clearly in the list of available versions? What's the best way to create a reproducible Python environment when building a Docker image?
You need to specify the full version number, including the fourth component:
opencv_python==4.7.0.68
Alternatively, you can ask for a compatible version using one of the following:
opencv_python~=4.7.0
opencv_python==4.7.0.*
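Why ==4.7.0 fails can be illustrated with a small sketch. Under Python's version-matching rules (PEP 440), a shorter version is padded with zeros before an exact comparison, so 4.7.0 is treated as 4.7.0.0, which is not in the list of available versions. The helper names below are hypothetical, and this is a simplified illustration of the rule for plain numeric versions, not pip's actual implementation.

```python
def release_tuple(version):
    """Split a plain numeric version such as '4.7.0.68' into integers."""
    return tuple(int(part) for part in version.split("."))


def pad(release, width):
    """Right-pad a release tuple with zeros up to the given width."""
    return release + (0,) * (width - len(release))


def exact_match(candidate, wanted):
    """Simplified '==' check: zero-pad both versions, then compare."""
    a, b = release_tuple(candidate), release_tuple(wanted)
    width = max(len(a), len(b))
    return pad(a, width) == pad(b, width)


def prefix_match(candidate, prefix):
    """Simplified '==X.Y.Z.*' check: compare only the leading components."""
    p = release_tuple(prefix)
    return release_tuple(candidate)[:len(p)] == p


print(exact_match("4.7.0.68", "4.7.0"))     # False: 4.7.0 means 4.7.0.0
print(exact_match("4.7.0.68", "4.7.0.68"))  # True
print(prefix_match("4.7.0.68", "4.7.0"))    # True: like ==4.7.0.*
```

For reproducible builds, pinning the full four-part version is the stricter choice, since the wildcard form can resolve to a different build later.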

Poshmark API - pip install poshmarkapi is not working

I tried installing the Poshmark API client for Python using the following command:
pip install poshmarkapi
I see the error below:
ERROR: Could not find a version that satisfies the requirement poshmarkapi
ERROR: No matching distribution found for poshmarkapi
Note: you may need to restart the kernel to use updated packages.
How can I resolve this?

FlightRadarAPI not found as a module

I am trying to use the FlightRadar24 API as described in this link:
https://pypi.org/project/FlightRadarAPI/#description
I have followed the steps as outlined; however, when I try to run the following simple script, I get an error saying "No module named 'FlightRadar24'":
from FlightRadar.api import FlightRadar24API
fr_api = FlightRadar24API()
I re-checked, and the module is installed; I installed it using pip3.
When I check whether the module has been installed, I get the following:
Requirement already satisfied: flightradar24 in /usr/local/lib/python3.9/site-packages (0.3.1)
Does this have to do with the fact that it is installed under Python 3.9? How would I install it for Python 3.7 specifically using pip?
Does anyone know how to solve this?
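One way to narrow this down is to check which interpreter the script actually runs under and whether that interpreter can locate the module. The sketch below uses only the standard library; describe_module is a hypothetical helper, and the two module names are taken from the error message and the pip output above, without assuming either is installed on your machine.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether the current interpreter can locate a top-level module."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not found by {sys.executable}"
    return f"{name}: found at {spec.origin}"


# The interpreter that runs this script may differ from the one pip3 uses.
print("Interpreter:", sys.executable)
print("Version:", sys.version.split()[0])

# Module names are case-sensitive, so both spellings are checked.
for candidate in ("FlightRadar24", "flightradar24"):
    print(describe_module(candidate))
```

If the interpreter shown is not the one pip3 installed into, invoking that interpreter with -m pip (for example python3.7 -m pip install followed by the package name) installs into the matching environment. It may also be worth checking that the distribution reported by pip, flightradar24, is the same project as the FlightRadarAPI page linked above, since the names differ.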

FCPython installation

When I try to import FCPython in a Jupyter notebook:
from FCPython import createPitch
I get this error:
ModuleNotFoundError: No module named 'FCPython'
When I tried to install it:
pip install FCPython
I got this error:
Note: you may need to restart the kernel to use updated packages.
ERROR: Could not find a version that satisfies the requirement FCPython (from versions: none)
ERROR: No matching distribution found for FCPython
I encountered the same issue when following the Soccermatics tutorial on expected goals from the Friends of Tracking YouTube channel.
Solution: FCPython is not an installable package. The code is shared in the GitHub repository where 3xgModel is located:
https://github.com/Friends-of-Tracking-Data-FoTD/SoccermaticsForPython/blob/master/FCPython.py
Just put FCPython.py in the same directory as your code file and run it.

ERROR: Could not find a version that satisfies the requirement time (from versions: none)

I get this type of error when installing the library, but only for some packages, not all. Is it because of the pip version, or because of the Python interpreter?
Please advise what I should do.
You don't need to install this package because it's already in the standard library.
Just import time in any Python file.
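Before running pip install, you can check whether a name already belongs to the standard library. The sketch below is one possible check; is_stdlib is a hypothetical helper. It relies on sys.stdlib_module_names, which exists only in Python 3.10 and later, and falls back to the shorter list of built-in modules on older versions, where it may miss standard-library modules written in Python.

```python
import sys


def is_stdlib(name):
    """Return True if the name is known to ship with this Python."""
    # sys.stdlib_module_names was added in Python 3.10.
    names = getattr(sys, "stdlib_module_names", None)
    if names is None:
        # Fallback for older versions: compiled-in modules only (partial list).
        names = sys.builtin_module_names
    return name in names


for module in ("time", "requests"):
    verdict = "standard library" if is_stdlib(module) else "not standard library"
    print(f"{module}: {verdict}")
```

A name reported as not standard library is not necessarily on PyPI either; it may be a local file that ships with a tutorial or repository.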
