Python 2.7.10: ImportError when importing bs4

I installed bs4 successfully, but when I import it, the interpreter reports:
Traceback (most recent call last):
Python Shell, prompt 3, line 1
File "C:\Python27\Lib\site-packages\bs4\__init__.py", line 303, in <module>
from . import _htmlparser
File "C:\Python27\Lib\site-packages\bs4\_htmlparser.py", line 36, in <module>
from bs4.builder import (
ImportError: No module named builder
I have searched Google but didn't find a solution.
Could anyone help me with this issue?
Thanks a lot!
My system info:
OS: Windows 7 64-bit
Python version: 2.7.10

You must first pip install beautifulsoup4, then try import bs4. If this doesn't work, odds are your pip configuration is broken. To remedy this, either reinstall pip, use easy_install, or build from source.
To use easy_install, just run easy_install beautifulsoup4. To build from source, download and extract this zip (unzip /path/to/beautifulsoup4.zip if you're at the terminal). Next, cd into the unzipped folder with cd /path/to/beautifulsoup and run python setup.py install. The package will then be installed and ready to import.
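Before reinstalling anything, it can help to check whether the interpreter you are running can see bs4 at all, and from which location. The following is only a minimal sketch and assumes Python 3.4 or newer (importlib.util.find_spec does not exist on Python 2.7); describe_module is a made-up helper name, not part of bs4 or pip.

```python
import importlib.util
import sys


def describe_module(name):
    """Return a short report on where `name` would be imported from."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec can raise for broken parents or odd entries in sys.modules.
        spec = None
    if spec is None:
        return "{0}: not found by {1}".format(name, sys.executable)
    return "{0}: found at {1}".format(name, spec.origin)


if __name__ == "__main__":
    # Shows the file that "import bs4" would load, without importing it.
    print(describe_module("bs4"))
```

If the reported location is not the site-packages directory that pip installed into, the import is picking up a different (possibly stale) copy.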

I uninstalled the bs4 package and reinstalled it, and now it works.
It is quite strange, because I had already tried uninstalling and reinstalling before, but only this time did it work.
Thanks for your kind help :)

Related

Python: ImportError: lxml not found, please install it

I have the following code (in PyCharm (MacOS)):
import pandas as pd
fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
print(fiddy_states)
And I get the following error:
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/user_name/PycharmProjects/PandasTest/Doc3.py
Traceback (most recent call last):
File "/Users/user_name/PycharmProjects/PandasTest/Doc3.py", line 9, in <module>
fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/html.py", line 906, in read_html
keep_default_na=keep_default_na)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/html.py", line 733, in _parse
parser = _parser_dispatch(flav)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/html.py", line 693, in _parser_dispatch
raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
Anaconda shows the latest version of lxml (3.8.0) as installed. Despite that, I have tried to reinstall it by 1) running pip install lxml and 2) downloading the lxml wheel corresponding to my Python version (lxml-3.8.0-cp36-cp36m-win_amd64.whl), but in both cases nothing changes (in the second case I am told that it is not a supported wheel on this platform, even though the Python version is correct (3.6, 64-bit)).
I've read similar questions here (even with the same code as above, since it's from a tutorial), but the problem still persists.
The output starts with the interpreter path:
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6
This means that you are working with Python 3.6. The package manager for Python 3.x is usually invoked as pip3, so you should probably install lxml with:
pip3 install lxml
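If it is unclear which interpreter a given pip or pip3 belongs to, one alternative is to invoke pip through the interpreter itself with "python -m pip". The sketch below only builds that command and does not run it; pip_install_command is a hypothetical helper name used for illustration.

```python
import sys


def pip_install_command(package, python=sys.executable):
    """Build a pip command that is tied to one specific interpreter."""
    return [python, "-m", "pip", "install", package]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print(" ".join(pip_install_command("lxml")))
```

To actually run it, the returned list can be passed to subprocess.check_call. Because the command starts with the path of the running interpreter, the package lands in the same environment that later does the import.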
For people who arrived here using Jupyter Notebook: I restarted the kernel after pip install lxml and the error was gone.
I got the same error. It seems that my python3 was picking up the pandas installed for Python 2 (since I had not installed pandas for Python 3). After running pip3 install pandas and restarting the notebook, it worked fine.
You may have to (re)install some of your libraries: pip install lxml bs4 html5lib
pd.read_html() uses the 'lxml' library by default, so try another parser that you installed above, for example pd.read_html(some_url, flavor='html5lib')
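The flavor suggestion above can be automated along these lines. This is a sketch under assumptions: pick_flavor and is_installed are hypothetical helpers, and the check only tests whether the packages can be located, not whether they import cleanly (a broken lxml build would still be reported as present).

```python
import importlib.util


def is_installed(name):
    """True if the top-level module `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def pick_flavor(installed=is_installed):
    """Return a read_html flavor whose dependencies are present, or None."""
    if installed("lxml"):
        return "lxml"
    # pandas' "html5lib" flavor parses with bs4 + html5lib, so it needs both.
    if installed("bs4") and installed("html5lib"):
        return "html5lib"
    return None
```

A call such as pd.read_html(some_url, flavor=pick_flavor()) would then use whichever parser is available; if pick_flavor() returns None, none of the parsers is installed and read_html will still fail.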
In PyCharm you can go to Settings > Project Interpreter and click the '+' icon.
Find 'lxml' in the list of packages and click the 'Install Package' button below it.
I am using PyCharm 2019.2.1 (Community Edition)
Build #PC-192.6262.63, built on August 22, 2019
Runtime version: 11.0.3+12-b304.39 amd64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o
Linux 4.15.0-58-generic
GC: ParNew, ConcurrentMarkSweep
Memory: 937M
Cores: 4
I tried to reinstall lxml without any progress.
I ended up uninstalling pandas, then reinstalling and upgrading it, and that solved my issue!
pip uninstall pandas
pip install pandas
pip3 install --upgrade pandas
I got the same error when trying to run some code that was using pandas. I tried some suggestions here but those did not work. Finally, what worked for me was the following two steps :
conda update anaconda
conda install spyder=5.0.5
When I then restarted Spyder and ran my code, it worked fine.
I have only just installed and started using Anaconda, so I don't know the root cause of this issue, but my guess is that there was some "cross-connection" with the packages I had installed before installing Anaconda, and after the above two steps everything runs from within the Anaconda environment.
This error occurs when lxml is not installed, so just go to the terminal
and run: pip3 install lxml
I got the same problem. Reinstalling lxml did not work. After rereading the error message and tracing the error to ~\Miniconda3\envs\mini_ds\lib\site-packages\pandas\io\html.py:872, I think the problem lies in the function _importers() in ~/pandas/io/html.py.
Here is the function:
def _importers() -> None:
    # import things we need
    # but make this done on a first use basis
    global _IMPORTS
    if _IMPORTS:
        return
    global _HAS_BS4, _HAS_LXML, _HAS_HTML5LIB
    bs4 = import_optional_dependency("bs4", errors="ignore")
    _HAS_BS4 = bs4 is not None
    lxml = import_optional_dependency("lxml.etree", errors="ignore")
    _HAS_LXML = lxml is not None
    html5lib = import_optional_dependency("html5lib", errors="ignore")
    _HAS_HTML5LIB = html5lib is not None
    _IMPORTS = True
You can see that for the lxml option it actually tries importing "lxml.etree" rather than "lxml". Because errors="ignore" swallows any failure of that import, pandas reports "lxml not found" even when lxml is installed but lxml.etree cannot be loaded, which is probably why reinstalling "lxml" did not help.
In conclusion, I think this may be a problem with the pandas version (mine is 1.4.1). For me, a quick workaround is to specify flavor='html5lib' in pd.read_html().
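Since pandas hides the underlying failure, one way to see the real error is to import lxml.etree yourself. The snippet below is a small sketch of that idea; try_import is a made-up helper name, and the modules it probes are just the two discussed above.

```python
import importlib


def try_import(name):
    """Import `name` and return (ok, message) instead of raising."""
    try:
        importlib.import_module(name)
    except ImportError as exc:
        # Covers a missing module as well as a compiled extension that fails to load.
        return False, "{0}: {1}".format(type(exc).__name__, exc)
    return True, "imported {0}".format(name)


if __name__ == "__main__":
    for module in ("lxml", "lxml.etree"):
        print(module, try_import(module))
```

If "lxml" imports but "lxml.etree" does not, the message usually names the actual cause, for example a missing shared library.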
I installed lxml 4.9.1, but it didn't work. So I tried to install lxml 4.8.0 instead, and it worked!
pip install lxml==4.8
As the OP is using Anaconda, the way to solve this is to install lxml with conda: open the CMD.exe Prompt for the environment you are working in and run
conda install -c anaconda lxml
(Source)
One can also do it by specifying the version as follows
conda install -c anaconda lxml=4.8.0
Notes:
pip doesn't manage dependencies the same way conda does and can potentially damage one's installation. I would therefore recommend using it only if conda doesn't work.
pip install lxml
# or
pip install lxml==4.9.1
If one is using pip, already has the package installed, and is still getting errors, one can pass -I (--ignore-installed) and -v as follows
pip install -Iv lxml==4.9.1
lxml official documentation can be found here.
This is their official GitHub repo.
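To decide between the conda and pip commands above, it helps to know whether the running interpreter lives inside a conda environment. A common heuristic, sketched here as an assumption and not as an official conda API, is to look for a conda-meta directory under sys.prefix; both function names are hypothetical.

```python
import os
import sys


def is_conda_prefix(prefix):
    """True if `prefix` looks like a conda environment (it has a conda-meta directory)."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def suggested_installer(prefix=sys.prefix):
    """Suggest 'conda' inside a conda environment, otherwise 'pip'."""
    return "conda" if is_conda_prefix(prefix) else "pip"


if __name__ == "__main__":
    print(suggested_installer())
```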
I was seeing this issue as well on my RPi.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py", line 1113, in read_html
displayed_only=displayed_only,
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py", line 902, in _parse
parser = _parser_dispatch(flav)
File "/home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py", line 859, in _parser_dispatch
raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
Looking into /home/pi/python3-ml/lib/python3.7/site-packages/pandas/io/html.py, I saw that it was attempting to use lxml.etree, so I tried importing that module directly
>>> from lxml import etree
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: libxslt.so.1: cannot open shared object file: No such file or directory
I searched for that error and found that the following package needed to be installed on the RPi
sudo apt-get install libxslt
After installing it, I was able to use pandas successfully
import pandas as pd
from urllib.request import Request, urlopen
url = 'WEB-SITE'
request_site = Request(url, headers={"User-Agent": "Mozilla/5.0"})
webpage = urlopen(request_site)
dfk1 = pd.read_html(webpage, flavor='html5lib')
print(dfk1)

ImportError in Python

I tried to run a program that uses geonames_rdf, but it fails with this error:
Traceback (most recent call last):
File "geo1.py", line 13, in <module>
import geonames.config.log
ImportError: No module named config.log
I read several posts about ImportError and checked the system path, and it is correct. I'm working in VirtualBox with a fresh Ubuntu 16.04.
The imports of my program are:
import sys
import os
import os.path
import logging
import geonames.config.log
import geonames.compat
import geonames.adapters.search
I've also tried adding this line:
sys.path.append('/usr/local/lib/python2.7/dist-packages/geonames/')
The command I used to install this package was
sudo pip install geonames_rdf
Try appending site-packages, not dist-packages. From a bit of searching, it looks like dist-packages is Debian-specific.
sys.path.append('/usr/local/lib/python2.7/site-packages/geonames/')
Reason:
Since you're installing a third-party Python package via pip, it will not go into dist-packages, and Python rightfully cannot find it on the path.
Reference link:
What's the difference between dist-packages and site-packages?
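Whichever directory turns out to be the right one, note that import geonames.config.log needs the directory that contains the geonames folder on sys.path, not the geonames folder itself. The sketch below lists the path entries that hold such a folder; path_entries_containing is a hypothetical helper written for this illustration.

```python
import os
import sys


def path_entries_containing(package, search_path=None):
    """List the sys.path entries that hold a directory named `package`."""
    if search_path is None:
        search_path = sys.path
    return [
        entry for entry in search_path
        # An empty entry means the current working directory.
        if os.path.isdir(os.path.join(entry or ".", package))
    ]


if __name__ == "__main__":
    # An empty list means no entry on sys.path contains a geonames folder.
    print(path_entries_containing("geonames"))
```

If the list is empty, appending the parent directory (for example the dist-packages or site-packages directory, without the trailing geonames/) is the form that the import system expects.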
I just tried to use geonames_rdf. I didn't know I needed it to do a geonames search, so I installed geonames first and then discovered I had to install fiona and gdal (I'm on Windows and had to install these two from prebuilt wheels at http://www.lfd.uci.edu/~gohlke/pythonlibs/). I don't know why these dependencies aren't baked into geonames.
Anyway, once I installed geonames_rdf, it seemed to install into the geonames folder in c:\Python27\lib\site-packages; it added at least the adapters package. In c:\Python27\lib\site-packages\geonames there is a config folder with log.py in it.

Importing bs4 in Python 3.5

I have installed both Python 3.5 and Beautifulsoup4. When I try to import bs4, I get the error below. Is there any fix for that? Or should I just install Python 3.4 instead?
Please be very explicit - I am new to programming. Many thanks!
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python 3.5\lib\site-packages\bs4\__init__.py", line 30, in <module>
from .builder import builder_registry, ParserRejectionMarkup
File "C:\Python 3.5\lib\site-packages\bs4\__init__.py", line 308, in <module>
from . import _htmlparser
File "C:\Python 3.5\lib\site-packages\bs4\_htmlparser.py", line 7, in <module>
from html.parser import (
ImportError: cannot import name 'HTMLParseError'
Update: Starting with 4.4.0, BeautifulSoup is compatible with Python 3.5. Upgrade:
pip install --upgrade beautifulsoup4
Old answer:
This is caused by the changes made for the "Deprecate strict mode of HTMLParser" issue:
Issue #15114: the strict mode and argument of HTMLParser,
HTMLParser.error, and the HTMLParserError exception have been removed.
I'm afraid beautifulSoup4 is not compatible with Python 3.5 at the moment. Use Python 3.4.
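The incompatibility described above can be checked from Python. The following is a rough sketch, assuming the version string is obtained separately (for example from pip show beautifulsoup4); version_tuple and bs4_needs_upgrade are made-up helper names, and 4.4.0 is the first compatible release named in this thread.

```python
import html.parser


def version_tuple(version):
    """Turn '4.3.2' into (4, 3, 2), padding short versions such as '4.4' with zeros."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'b1'
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def bs4_needs_upgrade(version, has_parse_error=None):
    """True if bs4 predates 4.4.0 and the interpreter no longer has HTMLParseError."""
    if has_parse_error is None:
        has_parse_error = hasattr(html.parser, "HTMLParseError")
    return (not has_parse_error) and version_tuple(version) < (4, 4, 0)
```

On Python 3.5 and later, html.parser has no HTMLParseError, so any bs4 release older than 4.4.0 is flagged.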
Update: BeautifulSoup 4.4.0 is compatible with Python 3.5, so pip install --upgrade beautifulsoup4 should do the trick if you are still hitting this issue.
I've sent the author a followup about this bug. If you want to install BeautifulSoup on Python 3.5a, I've uploaded a working patch of the source code to github.
https://github.com/jjangsangy/BeautifulSoup4
You can install it using setup.py or just copy & paste this code into terminal.
git clone https://github.com/jjangsangy/BeautifulSoup4 \
&& cd BeautifulSoup4 \
&& python3.5 setup.py install
I'm assuming here that since you're trying out 3.5a your python interpreter is installed with proper user permissions for your site-packages directory so no sudo invocation is necessary.

Scrapy, problems on the tutorial

I've been trying to get my head around Scrapy.
I already had Python 2.7 installed on my Mac (OS X 10.8.5), so I installed pip, scrapy, lxml and twisted (the last one manually, from a .dmg file).
I try to run scrapy startproject tutorial without success; I only get:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/bin/scrapy", line 3, in <module>
from scrapy.cmdline import execute
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/__init__.py", line 43, in <module>
from twisted import version as _txv
ImportError: No module named twisted
I've looked around for hours and can't seem to find what the problem is, so I thought I'd give this a shot here. Any suggestions?
pJ
You might be better off setting up a virtualenv and installing twisted with pip instead.
I got the same error. It works if you install twisted manually, then uninstall scrapy and reinstall it:
pip install twisted
pip uninstall scrapy
pip install scrapy

Installing paramiko on Windows

This may sound like a repeated question on SF, but I could not find a clear answer to it yet. So:
I installed Paramiko 1.7 with "setup.py install" command and while running the demo.py program, I got this error:
Traceback (most recent call last):
File "C:\Documents and Settings\fixavier\Desktop\paramiko-1.7\demos\demo.py", line 33, in <module>
import paramiko
File "C:\Python26\lib\site-packages\paramiko\__init__.py", line 69, in <module>
from transport import randpool, SecurityOptions, Transport
File "C:\Python26\lib\site-packages\paramiko\transport.py", line 32, in <module>
from paramiko import util
File "C:\Python26\lib\site-packages\paramiko\util.py", line 31, in <module>
from paramiko.common import *
File "C:\Python26\lib\site-packages\paramiko\common.py", line 99, in <module>
from Crypto.Util.randpool import PersistentRandomPool, RandomPool
ImportError: No module named Crypto.Util.randpool
I'm getting this error even after installing PyCrypto 2.1.
On running test.py (which comes with the installation), I got the following error:
Traceback (most recent call last):
File "C:\Documents and Settings\fixavier\Desktop\pycrypto-2.0.1\pycrypto-2.0.1\test.py", line 18, in <module>
from Crypto.Util import test
File "C:\Documents and Settings\fixavier\Desktop\pycrypto-2.0.1\pycrypto-2.0.1\build/lib.win32-2.6\Crypto\Util\test.py", line 17, in <module>
import testdata
File "C:\Documents and Settings\fixavier\Desktop\pycrypto-2.0.1\pycrypto-2.0.1\test\testdata.py", line 450, in <module>
from Crypto.Cipher import AES
ImportError: cannot import name AES
I don't have the confidence to go ahead and install AES after all this; for all I know I may get another ImportError!
Please advise. Is the installation method the problem?
It looks like your pycrypto installation is broken or missing.
Try getting a pycrypto installer for Python 2.6 here and try again after installing it.
http://www.voidspace.org.uk/python/modules.shtml#pycrypto
I tried Vijay's method, but it doesn't work.
I used the method from 'http://kmdarshan.com/blog/?p=3208', and it works:
Go to http://twistedmatrix.com/trac/wiki/Downloads and download the pycrypto package .exe for Windows/Python 2.5. This is needed for running paramiko.
Next, download the paramiko package from http://www.lag.net/paramiko/.
Unzip paramiko to a temporary folder, preferably the folder where Python is installed.
Go into the paramiko folder.
Open a command prompt and make sure that python is on your PATH environment variable.
Run this command: python setup.py install
You will get a series of lines of compilation output. Just make sure there are no errors in them. If there are any errors, you will need to compile again.
Just to be sure everything is all right, import paramiko in your program and see.
FYI: paramiko is used for SSH and so on.
Download paramiko for windows. You get the zip file:
www.lag.net/paramiko/
To build it you need the dependency package pycrypto. Again, keep in mind that you will need a version of pycrypto that matches your Python. This is a prebuilt version for Windows, so no compilation is required. http://www.voidspace.org.uk/python/modules.shtml#pycrypto
You could do an easy_install by downloading setuptools, but I ran into some issues, so I chose to download the MinGW tool. This is again an installer, and no build is required. http://sourceforge.net/projects/mingw/files/Automated%20MinGW%20Installer/mingw-get-inst/mingw-get-inst-20110316/
Once you have pycrypto and MinGW installed on your Windows machine, just browse to the folder where you extracted the paramiko module from the zip file and issue this command:
python setup.py build --compiler=mingw32 bdist_wininst
TADA! You are all set to use ssh on your windows machine with Python.
I have installed paramiko on 64-bit Windows 7 successfully:
Install Python 2.7
Download the 64-bit PyCrypto installation package from: http://www.dragffy.com/posts/ython-pycrypto-2-4-1-32-and-64-bit-windows-32x64-amdintel-installers
Download the paramiko package from: http://www.lag.net/paramiko/
Extract the paramiko package
Start a command-line terminal in the extracted paramiko folder and run
"python setup.py install"
I wanted to install Paramiko for Python 3.3.2 on Windows XP. I followed the instructions here
After I downloaded all programs on the list for my Python version, Paramiko starts without problems.
Install python-2.7.3.amd64.msi
Install pycrypto-2.6.win-amd64-py2.7.exe
Install setuptools-1.4.2.win-amd64-py2.7.exe
Install pip-1.4.1.win-amd64-py2.7.exe
Download and extract https://github.com/paramiko/paramiko/archive/master.zip
The actual problem does not seem to be a broken Crypto install but a slightly different one. After installing paramiko and crypto with easy_install on Windows, I have crypto installed but not Crypto. I installed the package PyCrypt (which gave an error because I didn't have a C compiler until I installed Visual Studio Express)
It appears that the Crypto package you downloaded doesn't have AES.
You should try the following:
import Crypto
import Crypto.Util
import Crypto.Cipher
If any of those fail, you still need to make sure pycrypto is installed (see the link from S.Mark here); otherwise, Paramiko might not depend on having AES (even though there is a test for that).
It seems PyCrypto needs a C compiler (which is normally present on a Linux system as gcc).
Also, the PyCrypto readme.txt says somewhere that it needs to be built first, before doing an install.
On Linux, I built it first and then ran the install command, and it installed successfully.
I searched for a long time looking for a solution to this problem. I'm running Windows 7 64-bit and python 2.7. None of the above solutions worked for me.
This one did.
Don't forget to include the C++ compiler when you download the Microsoft SDK; it wasn't checked by default.
I downloaded the pycrypto 2.5 source to compile it, along with paramiko 2.3, and things work well.
Here is a very precise answer:
Step 1: Go to https://github.com/paramiko/paramiko
Step 2: Download the zip file, and extract it
Step 3: Move into the folder and run python setup.py install
You are done!
I had a similar problem on my Mac, and I solved it by simply renaming the "crypto" directory to "Crypto". I already had paramiko and ssh installed. They both work perfectly fine now. This may or may not work for everyone, but it is a simple way around the problem.
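Python's import statement is case-sensitive, so import Crypto does not pick up a folder named crypto. A quick way to spot such a mismatch is sketched below; miscased_packages is a hypothetical helper, and the site-packages path has to be supplied by you.

```python
import os


def miscased_packages(site_packages, expected="Crypto"):
    """Return directory names that match `expected` only when case is ignored."""
    return sorted(
        name for name in os.listdir(site_packages)
        if name.lower() == expected.lower()
        and name != expected
        and os.path.isdir(os.path.join(site_packages, name))
    )
```

For example, miscased_packages(r"C:\Python27\Lib\site-packages") returning ['crypto'] would point to the situation described in the answers above.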
Just try
pip install paramiko
If this shows an error, then run
pip install cryptography
pip install paramiko
