I am trying to create a program that takes a set of numbers from a webpage and adds them together. I used the beautifulsoup module that I installed (I ran "pip install beautifulsoup4" in the command prompt).
Code:
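from urllib import request
from bs4 import BeautifulSoup
# Download the page; read() returns the raw HTML as bytes
html = request.urlopen('http://py4e-data.dr-chuck.net/comments_845350.html').read()
x = BeautifulSoup(html, 'html.parser')
# Calling the soup object with a tag name returns all matching tags
tags = x('span')
total = 0
for tag in tags:
    total = total + int(tag.contents[0])
print(total)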
However, whenever I run the program, Python gives me a ModuleNotFoundError: No module named 'bs4'. How can I fix this?
If you look here, you see that
pip install beautifulsoup4 should do the job.
If you are on Linux you might have to use pip3 instead.
Do you have more than one version of Python installed on your machine?
If so, try running
pip --version
It will return something like this:
pip 18.1 from c:\...\lib\site-packages\pip (python 3.6)
Then verify that you are using the same Python version to run your script.
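One way to do that check from inside the script itself is sketched below. It is only a minimal illustration using the standard library; the helper name describe_environment is made up for this example and is not part of bs4 or pip.

```python
import importlib.util
import sys


def describe_environment(module_name="bs4"):
    """Report which interpreter is running and whether it can see module_name."""
    # find_spec returns None when the top-level module is not on sys.path
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "version": "{}.{}".format(*sys.version_info[:2]),
        "module": module_name,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the interpreter path or version printed here differs from the one shown by pip --version, the package was probably installed for a different Python than the one running the script.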
I am getting the same error as in this 4-year-old thread: bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
But I am using MacOS, IntelliJ and Conda / Python 3 as my environment. Things I have tried:
$ STATIC_DEPS=true sudo pip install lxml
and
$ pip install -U lxml
Collecting lxml
Downloading https://files.pythonhosted.org/packages/16/31/be98027f5cd909e698210092ffc7d2e339492bc82cc872557b05f2ba3546/lxml-4.2.4-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (8.7MB)
100% |████████████████████████████████| 8.7MB 2.8MB/s
Installing collected packages: lxml
Found existing installation: lxml 4.1.1
Uninstalling lxml-4.1.1:
Successfully uninstalled lxml-4.1.1
Successfully installed lxml-4.2.4
after that:
$ python3 -m pip install lxml
Requirement already satisfied: lxml in /anaconda3/lib/python3.6/site-packages (4.2.4)
But I still get the same error upon executing my script in IntelliJ:
File "/Users/blabla/katalog-scanner/KatalogScanner.py", line 149, in <module>
soup = BeautifulSoup(html, 'lxml')
File "/anaconda3/envs/katalog-scanner/lib/python3.6/site-packages/bs4/__init__.py", line 198, in __init__
% ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
I also tried switching to html5lib in my code, resulting in the same error, saying that html5lib was requested and not found. What else can I try?
I had multiple installations of Python on my machine, provided by
homebrew
Anaconda
easy_install
package managers. I deleted the Anaconda installation completely (it was directly under my Macintosh HD), removed easy_install, and ran brew uninstall python --force to remove all the instances of Python (2.7, 3.6, 3.7) I had in /usr/local/bin.
Then I installed Python with Homebrew only: brew install python3
Then you need to link the python and pip commands to python3/pip3 by opening
~/.bash_profile
putting this there and saving:
alias python='python3'
alias pip='pip3'
then reload the profile in the terminal (you may need to restart the terminal completely, or even the OS):
source ~/.bash_profile
then python --version should show the newest 3.x version and you should be able to do the following (the second command starts the Python interpreter, the fourth ends it):
pip install beautifulsoup4
python
import bs4
exit()
Now you have to go to IntelliJ > File > Project Structure, add the Python 3.x SDK under Platform Settings (SDKs), and set Project Settings > Project SDK to that SDK.
Previously I also had an IntelliJ .iml file, but the project seems to work fine without it.
I am using Python 3.6.5 in Visual Studio Code on a Mac.
I installed pip3 and it is up to date, when I put in the command :
$ pip --version
I get this result :
pip 10.0.1 from /usr/local/lib/python3.6/site-packages/pip (python 3.6)
I imported the module requests.
And when I put in this command :
pip freeze | grep requests
I get this result :
requests==2.19.1
So I thought this meant the requests module was installed, but I still get the error ImportError: No module named requests when I put in : import requests in my file and try to run it.
Can somebody explain what is happening? Thank you :)
I found the answer, turns out I was using an extension called Code Runner and I thought it used the integrated terminal, where I had configured Python3. But turns out it uses its own interpreter. I added the following to my user settings:
"code-runner.executorMap": {
    "python": "python3"
}
and now it works! :)
I have the following code (in PyCharm (MacOS)):
import pandas as pd
fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
print(fiddy_states)
And I get the following error:
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/user_name/PycharmProjects/PandasTest/Doc3.py
Traceback (most recent call last):
File "/Users/user_name/PycharmProjects/PandasTest/Doc3.py", line 9, in <module>
fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/html.py", line 906, in read_html
keep_default_na=keep_default_na)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/html.py", line 733, in _parse
parser = _parser_dispatch(flav)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/html.py", line 693, in _parser_dispatch
raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
Anaconda shows the latest version of lxml (3.8.0) as installed. Despite that, I have tried to reinstall it by 1) running pip install lxml and 2) downloading the lxml wheel corresponding to my Python version (lxml-3.8.0-cp36-cp36m-win_amd64.whl), but in both cases nothing changes (in the second case I am told that it is not a supported wheel on this platform, even though the Python version is correct (3.6, 64-bit)).
I've read similar questions here (even with the same code above, since it's from a tutorial), but the problem still persists.
Based on the fact that the error is:
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6
This means that you are working with python-3.6. Now usually the package manager for python-3.x is pip3. So you probably should install it with:
pip3 install lxml
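If it is unclear which pip belongs to which Python, a common alternative is to call pip through the interpreter itself (python -m pip), so the package lands in that interpreter's site-packages. The sketch below only builds and prints such a command; the function name pip_install_command is hypothetical and the actual install call is left commented out.

```python
import subprocess
import sys


def pip_install_command(package, python=sys.executable):
    """Build a pip command that targets one specific interpreter."""
    return [python, "-m", "pip", "install", package]


if __name__ == "__main__":
    cmd = pip_install_command("lxml")
    print("Would run:", " ".join(cmd))
    # Uncomment to actually install into the interpreter running this script:
    # subprocess.check_call(cmd)
```

Run from PyCharm, this would print the same /Library/Frameworks/... interpreter path that appears at the top of the traceback, which is the Python that needs lxml.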
For people who reached here using Jupyter Notebook: I restarted the kernel after pip install lxml and the error was gone.
I got the same error. It seems that my python3 was pointing to pandas in python2 (since I had not installed pandas for python3). After doing pip3 install pandas and restarting the notebook, it worked fine.
You may have to (re)install some of your libraries: pip install lxml bs4 html5lib
pd.read_html() uses the 'lxml' library by default, so try another parser that you installed above, for example pd.read_html(some_url, flavor='html5lib')
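The choice of flavor can be made automatically, as in the sketch below. It assumes, as I understand pandas' behaviour, that the 'lxml' flavor needs lxml.etree and that the 'html5lib' flavor needs both bs4 and html5lib. The helpers can_import and pick_flavor are invented for this example and are not pandas functions.

```python
import importlib


def can_import(name):
    """Return True if `name` imports cleanly in this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def pick_flavor(available=can_import):
    """Pick a read_html flavor based on which parser libraries import."""
    if available("lxml.etree"):
        return "lxml"
    # Assumption: the html5lib flavor goes through BeautifulSoup, so both are needed
    if available("bs4") and available("html5lib"):
        return "html5lib"
    return None
```

The result could then be passed on as pd.read_html(some_url, flavor=pick_flavor()); a return value of None means neither parser set is importable in the current interpreter.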
You can go to Settings > Project Interpreter > Click on '+' icon
Find 'lxml' from the list of packages and click 'Install Package' button found below.
I am using PyCharm 2019.2.1 (Community Edition)
Build #PC-192.6262.63, built on August 22, 2019
Runtime version: 11.0.3+12-b304.39 amd64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o
Linux 4.15.0-58-generic
GC: ParNew, ConcurrentMarkSweep
Memory: 937M
Cores: 4
I tried to reinstall lxml without any progress.
I ended up uninstalling pandas, then reinstalling and updating it, and that solved my issues!
pip uninstall pandas
pip install pandas
pip3 install --upgrade pandas
I got the same error when trying to run some code that was using pandas. I tried some suggestions here but those did not work. Finally, what worked for me was the following two steps :
conda update anaconda
conda install spyder=5.0.5
Now when I restarted Spyder and ran my code it worked fine.
I have only just installed and started using Anaconda, so I don't know the root cause of this issue, but my guess is that there was some "cross-connection" with the packages I had installed prior to my installation of Anaconda, and after running the above two steps everything now runs from within the Anaconda environment.
This error occurs when lxml is not installed, so just go to the terminal
and run: pip3 install lxml
I got the same problem. Trying to reinstall lxml did not work. After rereading the error message and tracing the error to ~\Miniconda3\envs\mini_ds\lib\site-packages\pandas\io\html.py:872, I think the problem lies in the function _importers() in ~/pandas/io/html.py.
Here is the function:
def _importers() -> None:
    # import things we need
    # but make this done on a first use basis
    global _IMPORTS
    if _IMPORTS:
        return
    global _HAS_BS4, _HAS_LXML, _HAS_HTML5LIB
    bs4 = import_optional_dependency("bs4", errors="ignore")
    _HAS_BS4 = bs4 is not None
    lxml = import_optional_dependency("lxml.etree", errors="ignore")
    _HAS_LXML = lxml is not None
    html5lib = import_optional_dependency("html5lib", errors="ignore")
    _HAS_HTML5LIB = html5lib is not None
    _IMPORTS = True
You can see that for the lxml option, it actually tries importing "lxml.etree" instead of "lxml". So this is probably why reinstalling "lxml" would not help.
In conclusion, I think this is perhaps a problem with the pandas version (mine is 1.4.1). For me, a quick solution is to specify flavor='html5lib' in pd.read_html().
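Because the function above imports with errors="ignore", the real reason lxml.etree fails to load never reaches the user. A small sketch like the following can surface it by importing the module directly; import_error_for is a hypothetical helper name, not something pandas or lxml provides.

```python
import importlib


def import_error_for(name):
    """Return the ImportError message raised when importing `name`, or None."""
    try:
        importlib.import_module(name)
    except ImportError as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


if __name__ == "__main__":
    for name in ("lxml", "lxml.etree"):
        print(name, "->", import_error_for(name) or "imports fine")
```

If "lxml" imports but "lxml.etree" does not, the printed message should point at the underlying cause, for instance a missing shared library rather than a missing Python package.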
I installed lxml 4.9.1, but it didn't work. So I tried to install lxml 4.8.0 instead, and it worked!
pip install lxml==4.8
As OP is using Anaconda, in order to solve that issue, install lxml by opening the CMD.Exe Prompt for the environment one is working on, and run
conda install -c anaconda lxml
One can also do it by specifying the version as follows
conda install -c anaconda lxml=4.8.0
Notes:
pip doesn't manage dependencies the same way conda does and can potentially damage one's installation. Therefore, I would recommend using it only if conda doesn't work.
pip install lxml
# or
pip install lxml==4.9.1
If one is using pip and one has already the package installed and one is getting errors, one can pass -I (--ignore-installed) and -v as follows
pip install -Iv lxml==4.9.1
lxml official documentation can be found here.
This is their official GitHub repo.
I was seeing this issue as well on my RPi.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py", line 1113, in read_html
displayed_only=displayed_only,
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py", line 902, in _parse
parser = _parser_dispatch(flav)
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py", line 859, in _parser_dispatch
raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
Looking into /home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py it was attempting to use lxml.etree, so I attempted to just use that module
>>> from lxml import etree
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: libxslt.so.1: cannot open shared object file: No such file or directory
I searched for that error and found that the following package needed to be installed on the RPi
sudo apt-get install libxslt
After installing I was successfully able to use pandas
import pandas as pd
from urllib.request import Request, urlopen
url = 'WEB-SITE'
request_site = Request(url, headers={"User-Agent": "Mozilla/5.0"})
webpage = urlopen(request_site)
dfk1 = pd.read_html(webpage, flavor='html5lib')
print(dfk1)
please help!
I am using a macbook running OS X 10.10.5
I am trying to install and then import beautifulsoup using python 3.6 but am getting the following error:
ModuleNotFoundError: No module named ‘beautiful soup’
This is what I have done:
installed python 3.6, this has installed in the applications folder, this is working fine with idle.
downloaded and installed beautifulsoup4, installed using: sudo python setup.py install. This has installed beautifulsoup4-4.5.3-py2.7.egg files into the Library/2.7/site-packages directory
My code is as follows:
import sys
sys.path.append("Macintosh HD/Library/Python/2.7/site-packages")
import beautifulsoup4
Any ideas? Many thanks in advance.
Have you ever checked the documentation of that package?
You import this package as below:
from bs4 import BeautifulSoup
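The point is that the name given to pip and the name used in import can differ. The sketch below illustrates this with a tiny lookup table; the table and both helper functions are made up for this example, and only the beautifulsoup4 to bs4 pair comes from the answer above.

```python
import importlib.util

# Hypothetical lookup table for illustration; only the beautifulsoup4 -> bs4
# pair comes from the answer above.
IMPORT_NAMES = {"beautifulsoup4": "bs4"}


def import_name(pip_name):
    """Return the module name to import for a given pip package name."""
    return IMPORT_NAMES.get(pip_name, pip_name)


def is_installed(pip_name):
    """Check whether the importable module for pip_name is visible here."""
    return importlib.util.find_spec(import_name(pip_name)) is not None


if __name__ == "__main__":
    print("import name:", import_name("beautifulsoup4"))
    print("visible to this interpreter:", is_installed("beautifulsoup4"))
```

Note that the check only says something about the interpreter that runs it, so it would need to be run with the same Python 3.6 that runs the script, not the Python 2.7 whose site-packages received the .egg files.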