Python boilerpipe installation issue - python

I am trying to install Python Boilerpipe on Ubuntu 14. It fails with the following error:
Traceback (most recent call last):
File "setup.py", line 27, in <module>
download_jars(datapath=DATAPATH)
File "setup.py", line 21, in download_jars
tar = tarfile.open(tgz_name, mode='r:gz')
File "/usr/lib/python2.7/tarfile.py", line 1678, in open
return func(name, filemode, fileobj, **kwargs)
File "/usr/lib/python2.7/tarfile.py", line 1730, in gzopen
raise ReadError("not a gzip file")
tarfile.ReadError: not a gzip file
These are the steps I am following:
pip install JPype1
pip install charade
git clone https://github.com/misja/python-boilerpipe.git
cd python-boilerpipe
sudo python setup.py install

Found the issue: setup.py looks for the boilerpipe tar file and downloads it from googlecode, where it is no longer hosted.
def download_jars(datapath, version=boilerpipe_version):
tgz_url = 'https://boilerpipe.googlecode.com/files/boilerpipe-{0}-bin.tar.gz'.format(version)
So I replaced the same line with the new file location:
tgz_url='https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/boilerpipe/boilerpipe-1.2.0-bin.tar.gz'
This worked for me.
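For context, here is a minimal sketch (not part of the original setup.py) of why tarfile reports "not a gzip file": when the old URL stops serving the archive, the saved file presumably holds an error page rather than gzip data. The helper name looks_like_gzip and the sample file names are made up for illustration.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def demo():
    """Compare a real gzip file with an HTML page saved under a .tar.gz name."""
    tmpdir = tempfile.mkdtemp()
    good = os.path.join(tmpdir, "good.tar.gz")
    bad = os.path.join(tmpdir, "bad.tar.gz")
    with gzip.open(good, "wb") as fh:
        fh.write(b"payload")
    with open(bad, "wb") as fh:
        fh.write(b"<html>404 Not Found</html>")
    return looks_like_gzip(good), looks_like_gzip(bad)
```

Running a check like this on the downloaded file before calling tarfile.open would show quickly whether the download itself is the problem.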

You can also use a similar alternative, for example jusText:
pip install justext
Below are some alternatives:
http://sourceforge.net/projects/webascorpus/?source=navbar
https://github.com/jiminoc/goose
https://github.com/grangier/python-goose
https://github.com/miso-belica/readability.py
https://github.com/dcramer/decruft
https://github.com/FeiSun/ContentExtraction
https://github.com/JalfResi/justext
https://github.com/andreypopp/extracty/tree/master/justext
https://github.com/dreamindustries/jaws/tree/master/justext
https://github.com/says/justext
https://github.com/chbrown/justext
https://github.com/says/justext-app

Related

Error while creating virtual environment in Python3.8

virtualenv myvirtualenv
I am new to the virtual environment in Python. I was following this tutorial https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/26/python-virtual-env/
But got stuck in step 3
The Error I got:
Traceback (most recent call last):
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\via_app_data.py", line 58, in _install
installer.install(creator.interpreter.version_info)
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\pip_install\base.py", line 46, in install
for name, module in self._console_scripts.items():
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\pip_install\base.py", line 116, in _console_scripts
entry_points = self._dist_info / "entry_points.txt"
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\pip_install\base.py", line 103, in _dist_info
raise RuntimeError(msg) # pragma: no cover
RuntimeError: no .dist-info at C:\Users\Vivek\AppData\Local\pypa\virtualenv\wheel\3.8\image\1\CopyPipInstall\setuptools-50.3.1-py3-none-any, has distutils-precedence.pth, easy_install.py, pkg_resources, setuptools, _distutils_hack
Traceback (most recent call last):
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\via_app_data.py", line 58, in _install
installer.install(creator.interpreter.version_info)
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\pip_install\base.py", line 46, in install
for name, module in self._console_scripts.items():
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\pip_install\base.py", line 116, in _console_scripts
entry_points = self._dist_info / "entry_points.txt"
File "c:\users\vivek\appdata\local\programs\python\python38-32\lib\site-packages\virtualenv\seed\embed\via_app_data\pip_install\base.py", line 103, in _dist_info
raise RuntimeError(msg) # pragma: no cover
RuntimeError: no .dist-info at C:\Users\Vivek\AppData\Local\pypa\virtualenv\wheel\3.8\image\1\CopyPipInstall\pip-20.2.3-py2.py3-none-any, has pip
Thanks in advance
You don't need the virtualenv package in Python 3.3 or above. Those versions ship a built-in solution, the venv module.
Just run the following command:
python -m venv myvirtualenv
It will create a new virtualenv named "myvirtualenv".
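If you would rather do the same thing from a script, the standard library venv module can be called directly. This is only a sketch: the function name create_env is made up here, and returning the path to pyvenv.cfg is just a convenient way to confirm that the environment was written.

```python
import os
import venv


def create_env(path, with_pip=True):
    """Create a virtual environment at `path`, like `python -m venv path`."""
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(path)
    # venv writes a pyvenv.cfg file at the root of every environment.
    return os.path.join(path, "pyvenv.cfg")
```

Calling create_env("myvirtualenv") should be roughly equivalent to the command above.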
Try the following:
virtualenv -p python3 myvirtualenv

pip is giving conflict error while installing package

While running $ pip install <package>, I get the error below. Here I am installing PyJWT, and I cross-checked with other packages as well. This started after upgrading pip from 19.0.1 to 19.0.2.
The errors from pip install PyJWT are shown below. The requirement is already satisfied, but I still get an error. Please suggest how to fix it.
To reinstall pip, I ran:
$ easy_install pip
$ pip install PyJWT
Tejeshs-MacBook-Air:selenium_testing tejeshagrawal$ pip install PyJWT
Requirement already satisfied: PyJWT in /usr/local/lib/python3.7/site-packages (1.7.1)
Error checking for conflicts.
Traceback (most recent call last):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2897, in _dep_map
return self.__dep_map
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2691, in __getattr__
raise AttributeError(attr)
AttributeError: _DistInfoDistribution__dep_map
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2888, in _parsed_pkg_info
return self._pkg_info
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2691, in __getattr__
raise AttributeError(attr)
AttributeError: _pkg_info
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/commands/install.py", line 503, in _warn_about_conflicts
package_set, _dep_info = check_install_conflicts(to_install)
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/operations/check.py", line 108, in check_install_conflicts
package_set, _ = create_package_set_from_installed()
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/operations/check.py", line 47, in create_package_set_from_installed
package_set[name] = PackageDetails(dist.version, dist.requires())
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2635, in requires
dm = self._dep_map
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2899, in _dep_map
self.__dep_map = self._compute_dependencies()
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2908, in _compute_dependencies
for req in self._parsed_pkg_info.get_all('Requires-Dist') or []:
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2890, in _parsed_pkg_info
metadata = self.get_metadata(self.PKG_INFO)
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 1410, in get_metadata
value = self._get(self._fn(self.egg_info, name))
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 1522, in _get
with open(path, 'rb') as stream:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.7/site-packages/~ip-18.1.dist-info/METADATA'
Tejeshs-MacBook-Air:selenium_testing tejeshagrawal$ pip freeze > require.txt
Could not parse requirement: -ip
Exception:
Traceback (most recent call last):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2584, in version
return self._version
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2691, in __getattr__
raise AttributeError(attr)
AttributeError: _version
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/cli/base_command.py", line 179, in main
status = self.run(options, args)
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/commands/freeze.py", line 93, in run
for line in freeze(**freeze_kwargs):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/operations/freeze.py", line 62, in freeze
req = FrozenRequirement.from_dist(dist)
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_internal/operations/freeze.py", line 239, in from_dist
req = dist.as_requirement()
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2716, in as_requirement
if isinstance(self.parsed_version, packaging.version.Version):
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2551, in parsed_version
self._parsed_version = parse_version(self.version)
File "/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2589, in version
raise ValueError(tmpl % self.PKG_INFO, self)
ValueError: ("Missing 'Version:' header and/or METADATA file", Unknown [unknown version] (/usr/local/lib/python3.7/site-packages))
Your problem seems similar to, or the same as, this bug, which ironically was fixed in 19.0.2. Somewhere along the line you tried to install a module and it failed; pip didn't properly clean up after itself and left a package in a broken state.
The solution seems to be to find any directories starting with - in your site-packages directory (/Users/tejeshagrawal/Library/Python/3.7/lib/python/site-packages in your case) and rename them to what they should be. For example, if you find -yJWT-1.0.dist-info, rename it to PyJWT-1.0.dist-info. If you're not sure what its real name should be, look for the Name value in -yJWT-1.0.dist-info/METADATA. NB: I just used PyJWT as an example; it might not be the package that is broken. After that, pip should be able to get up and running again.
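A rough sketch of that search in Python, in case it helps. The function name find_broken_dist_info is made up, and treating a leading ~ as mangled too is an assumption based on the ~ip-18.1.dist-info path in the traceback in the question. It only reports candidates; renaming is left to you.

```python
import os


def find_broken_dist_info(site_packages):
    """Return (directory_name, declared_name) pairs for mangled metadata dirs.

    A directory counts as mangled when it ends in .dist-info and starts
    with '-' or '~'. declared_name is the Name header read from its
    METADATA file, or None when that file is missing.
    """
    results = []
    for entry in sorted(os.listdir(site_packages)):
        if not entry.endswith(".dist-info") or entry[0] not in "-~":
            continue
        declared = None
        metadata = os.path.join(site_packages, entry, "METADATA")
        if os.path.isfile(metadata):
            with open(metadata, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    if line.startswith("Name:"):
                        declared = line.split(":", 1)[1].strip()
                        break
        results.append((entry, declared))
    return results
```

Pass it the site-packages path from your own traceback; an entry whose declared name is None has no METADATA file to recover the name from.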
Complementary finding:
Using @Dunes' answer, I couldn't fix the file name, so I ended up uninstalling every package with pip from PowerShell:
pip uninstall -y (pip freeze)
When the loop broke on the package "Plotly", I found the culprit.
You could try to install the package pip-conflict-checker:
pip install pip-conflict-checker
and then run the command:
pipconflictchecker
This will show you the packages that cause trouble.
You can also create a virtual environment by following this link: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands
Step 1: conda create -n myenv python=3.7.4
Step 2: conda activate myenv
Step 3: pip install package_name
This gives you a separate conda environment in which to manage packages.

Installing pip for arcpy

I'm attempting to install pip for arcpy (arcgis 10.2 on windows 7). Running get-pip.py results in the following error message:
X:\python>python get-pip.py
Traceback (most recent call last):
File "get-pip.py", line 20061, in <module>
main()
File "get-pip.py", line 194, in main
bootstrap(tmpdir=tmpdir)
File "get-pip.py", line 82, in bootstrap
import pip
File "c:\temp\tmpou5fje\pip.zip\pip\__init__.py", line 26, in <module>
File "c:\temp\tmpou5fje\pip.zip\pip\utils\__init__.py", line 27, in <module>
File "c:\temp\tmpou5fje\pip.zip\pip\_vendor\pkg_resources\__init__.py", line 73, in <module>
File "c:\temp\tmpou5fje\pip.zip\pip\_vendor\packaging\specifiers.py", line 275, in <module>
File "c:\temp\tmpou5fje\pip.zip\pip\_vendor\packaging\specifiers.py", line 373, in Specifier
File "C:\Python27\ArcGIS10.2\Lib\re.py", line 190, in compile
return _compile(pattern, flags)
File "C:\Python27\ArcGIS10.2\Lib\re.py", line 242, in _compile
raise error, v # invalid expression
sre_constants.error: nothing to repeat
Using an administrator command prompt doesn't help. My real goal is to get win32com working under arcpy. I usually just copy the appropriate directories out of c:\python27\lib\site-packages to c:\python27\arcgis10.2\lib\site-packages to install a package under arcpy (why doesn't arcpy come with pip?), but that's not working for win32com, presumably due to a missing DLL or other Windows-specific file.
I would recommend the following:
Get the setuptools module
Get the pip module
And then run the following in command line (assuming windows)
path-to-python path-to-setuptools install
path-to-python path-to-pip install
I work on a closed network (away from the interwebs of old) and cannot use get-pip.py so I find it best to simply download the actual modules and hard install.
Keep us posted!
Copy get-pip.py to "C:\Python27\ArcGIS10.2", then run the command "python get-pip.py" in that directory.
Keep the network connected during the process so that setuptools, wheel, etc. can be downloaded and set up automatically.
Hope that helps.
Try opening a CMD prompt and typing:
C:\Python27\ArcGIS10.2\python.exe -m pip install -U pip

broken easy_install and pip after upgrading to OS X Mavericks

Upgraded to OS X 10.9 Mavericks and installed XCode, Command Line Tools, XQuartz, etc. Trying to run a pip install now, but it says that the distribution is not found:
Traceback (most recent call last):
File "/usr/local/bin/pip", line 5, in <module>
from pkg_resources import load_entry_point
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2603, in <module>
working_set.require(__requires__)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 666, in require
needed = self.resolve(parse_requirements(requirements))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
raise DistributionNotFound(req) # XXX put more info here
pkg_resources.DistributionNotFound: pip==1.4.1
So I tried to install pip with an easy_install. Turns out that's borked too:
Traceback (most recent call last):
File "/usr/local/bin/easy_install", line 5, in <module>
from pkg_resources import load_entry_point
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2607, in <module>
parse_requirements(__requires__), Environment()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
raise DistributionNotFound(req) # XXX put more info here
pkg_resources.DistributionNotFound: setuptools==1.1.6
So some of the other threads say to reinstall setuptools with a sudo python ez_setup.py. It seems to work fine:
Installed /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/setuptools-1.1.6-py2.7.egg
Processing dependencies for setuptools==1.1.6
Finished processing dependencies for setuptools==1.1.6
But when running the easy_install pip, the same pkg_resources.DistributionNotFound: setuptools==1.1.6 error occurs. Anyone else have this problem? Any ideas how to fix this?
Install easy_install:
Download ez_setup.py module from https://pypi.python.org/pypi/setuptools
$ cd path/to/ez_setup.py
$ python ez_setup.py
Install pip:
$ sudo easy_install pip
Try sudo python -m easy_install pip
I ran into a similar problem with git-review.
$ git review -s
Traceback (most recent call last):
File "/usr/local/bin/git-review", line 11, in <module>
sys.exit(main())
File "/Library/Python/2.7/site-packages/git_review/cmd.py", line 1132, in main
(os.path.split(sys.argv[0])[-1], get_version()))
File "/Library/Python/2.7/site-packages/git_review/cmd.py", line 180, in get_version
provider = pkg_resources.get_provider(requirement)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 197, in get_provider
return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 666, in require
needed = self.resolve(parse_requirements(requirements))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
raise DistributionNotFound(req) # XXX put more info here
pkg_resources.DistributionNotFound: git-review
The git-review team said it was a bug with pkg_resources that could be fixed with
sudo pip install --upgrade setuptools
This worked fine for me.
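One thing that stands out in the question: the tracebacks load pkg_resources from /System/Library/Frameworks/..., while ez_setup.py reported installing into /Library/Frameworks/..., which may mean two Python installations are involved. A small diagnostic sketch for checking where a module would be imported from follows. The function name module_origin is made up, and the code is written for Python 3, not the Python 2.7 used in this thread.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Show which interpreter is running and where it finds the packaging tools.
    print("interpreter:", sys.executable)
    for name in ("pkg_resources", "setuptools", "pip"):
        print(name, "->", module_origin(name))
```

If the interpreter path and the module paths point at different installations, that mismatch is a likely suspect.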

Why can't I install the openstack nova client on OS X?

I am attempting to install the openstack nova client on my Mac (10.4.8)
nova = https://github.com/openstack/python-novaclient#command-line-api
python --version
Python 2.7.2
I successfully got nova installed (after installing pip)
When I run the client, I get the following error
foo@bar-macbook-pro:~$ nova
Traceback (most recent call last):
File "/usr/local/bin/nova", line 6, in <module>
from novaclient.shell import main
File "/Library/Python/2.7/site-packages/novaclient/__init__.py", line 15, in <module>
import pbr.version
ImportError: No module named pbr.version
In my research, I have found conflicting information about pbr, some say it is required for nova, while others say it isn't required for nova.
https://github.com/rackspace/pyrax/issues/121
When I attempt to install pbr, I see the following error.
foo@bar-macbook-pro:~$ sudo python ~/Downloads/pbr/setup.py install
Traceback (most recent call last):
File "setup.py", line 22, in <module>
**util.cfg_to_args())
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/util.py", line 241, in cfg_to_args
pbr.hooks.setup_hook(config)
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/hooks/__init__.py", line 27, in setup_hook
metadata_config.run()
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/hooks/base.py", line 29, in run
self.hook()
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/hooks/metadata.py", line 28, in hook
self.config['name'], self.config.get('version', None))
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/packaging.py", line 817, in get_version
version = _get_version_from_git(pre_version)
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/packaging.py", line 776, in _get_version_from_git
"git --git-dir=\"" + git_dir + "\" describe --always").replace(
File "/Volumes/WDBlack750/spencerowen/Downloads/pbr/pbr/packaging.py", line 220, in _run_shell_command
stderr=err_location)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1228, in _execute_child
raise child_exception
TypeError: must be encoded string without NULL bytes, not str
Is there anything apparent that would explain why I can not get the library installed?
Surely I must not be the first person to try and install nova on OS X.
Over a year later, I finally got this working on OS X Yosemite
sudo pip install python-novaclient
I did not have to install pbr.
I did the install based on venv:
virtualenv venv_name
source venv_name/bin/activate
pip install python-novaclient fabric
In my case, I had a mixup in which python I was using by way of fabric being installed globally.
Prior to that, I ran rm -rf on all my virtualenvs and on all references to novaclient (locally and globally), and deleted a global install of fabric which was calling the novaclient.
Also as a precaution, I do not install pip globally, and only use it without sudo in virtualenvs.
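To guard against the same kind of mixup, a script can check whether it is running inside a virtual environment before it installs or imports anything. This is only a sketch, and the function name in_virtualenv is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix
```

Printing sys.executable next to the result shows which python is actually in use.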
