AttributeError: module 'copy' has no attribute 'deepcopy' - python

I'm new to Python and BS4.
I decided to write a script that scrapes a website, oscarmini.com to be precise. The code was running fine until today, when I wanted to modify it; now I keep getting errors. From the little I know about exceptions and errors, there's nothing wrong with my code; the problem seems to come from importing the 'bs4' module.
from bs4 import BeautifulSoup as BS
import requests
url = 'https://oscarmini.com/2018/05/techfest-2018.html'
page = requests.get(url)
soup = BS(page.text, 'lxml')
mydivs = soup.find("div", {"class": "entry-content"})
soup.find('div', id="dpsp-content-top").decompose()
print(mydivs.get_text())
input()
Below is the error message I get.
Traceback (most recent call last):
File "C:/Users/USERNaME/Desktop/My Programs/Random/Oscarmini-Scrapper.py", line 1, in <module>
from bs4 import BeautifulSoup as BS
File "C:\Users\USERNaME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\bs4\__init__.py", line 35, in <module>
import xml.etree.cElementTree as default_etree
File "C:\Users\USERNaME\AppData\Local\Programs\Python\Python36-32\lib\xml\etree\cElementTree.py", line 3, in <module>
from xml.etree.ElementTree import *
File "C:\Users\USERNaME\AppData\Local\Programs\Python\Python36-32\lib\xml\etree\ElementTree.py", line 1654, in <module>
from _elementtree import *
AttributeError: module 'copy' has no attribute 'deepcopy'
Process finished with exit code 1
Please I really need help on this..

I encountered the same problem and finally found the cause: I had another script named copy.py, which shadowed the standard-library copy module.
You can show the real path of the copy module with print(copy.__file__) just before the exception occurs and check whether it is the file you expect.
You can also list your PYTHONPATH environment variable (after import os) with:
print(os.environ.get('PYTHONPATH', '').split(os.pathsep))
just before the line that causes the exception, and see whether it contains anything unexpected. Using .get() avoids a KeyError when PYTHONPATH is not set.
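The check described above can be sketched as a small helper. This is only an illustration: find_module_candidates is a hypothetical name, not part of any library, and it only looks for name.py and name/__init__.py, so it ignores built-in, frozen and extension modules.

```python
import os
import sys


def find_module_candidates(module_name, directories):
    """Return files in `directories` that could be imported as `module_name`.

    Hypothetical helper for illustration: it only checks for
    `<name>.py` and `<name>/__init__.py`, in search-path order.
    """
    candidates = []
    for directory in directories:
        # An empty sys.path entry means the current directory.
        base = directory or os.curdir
        for relative in (module_name + ".py",
                         os.path.join(module_name, "__init__.py")):
            path = os.path.join(base, relative)
            if os.path.isfile(path):
                candidates.append(path)
    return candidates


# Earlier entries on sys.path win, so a local copy.py listed before the
# standard-library one is the file that "import copy" would load.
for path in find_module_candidates("copy", sys.path):
    print(path)
```

If the first path printed is inside your project folder rather than the Python installation, renaming that file should be the first thing to try.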

Make sure no file named copy.py exists in your project's working directory, for example:
project folder:
    copy.py
    currentOpenFile.py # when you import the copy module here, the local copy.py is found first

Related

Can I import a list of modules from a shared file? i.e. can I import imports?

I have about a dozen python module imports that are going to be reused on many different scrapers, and I would love to just throw them into a single file (scraper_functions.py) that also contains a bunch of functions, like this:
import smtplib
import requests
import re
from urllib.request import urlopen
from bs4 import BeautifulSoup
import time
def function_name(var1):
    # function code here
    pass
then in my scraper I would simply do something like:
import scraper_functions
and be done with it. But listing the imports at the top of scraper_functions.py doesn't work, and neither does putting all the imports in a function. In each case I get errors in the scraper that is doing the importing.
Traceback (most recent call last):
File "{actual-scraper-name-here}.py", line 24, in <module>
x = requests.get(main_url)
NameError: name 'requests' is not defined
In addition, in VSCode, under Problems, I get errors like
Undefined variable 'requests' pylint(undefined-variable) [24,5]
None of the modules are recognized. I have made sure that all files are in the same directory.
Is such a thing possible please?
You need to either use the scraper_functions prefix (the module name you imported) or use the from keyword with the * selector to import everything from scraper_functions.
Using the from keyword (Recommended)
from scraper_functions import * # import everything with *
...
x = requests.get(main_url)
Using the scraper_functions prefix (Not recommended)
import scraper_functions
...
x = scraper_functions.requests.get(main_url)
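To see why both forms work, here is a minimal sketch of where the imported names actually live. The module name scraper_helpers_demo and the function find_numbers are made up for the example, and the helper module imports only standard-library modules (re, time) instead of requests or bs4 so that it runs anywhere.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for scraper_functions.py. It imports only
# standard-library modules so the sketch runs without extra packages.
HELPER_SOURCE = '''\
import re
import time


def find_numbers(text):
    return re.findall("[0-9]+", text)
'''


def load_demo_helpers():
    """Write the helper module to a temporary directory and import it."""
    with tempfile.TemporaryDirectory() as directory:
        Path(directory, "scraper_helpers_demo.py").write_text(HELPER_SOURCE)
        sys.path.insert(0, directory)
        try:
            importlib.invalidate_caches()
            return importlib.import_module("scraper_helpers_demo")
        finally:
            sys.path.remove(directory)


helpers = load_demo_helpers()

# A plain "import" binds only the module name in the importing file.
# The modules the helper imported are attributes of that module object:
print(helpers.re.findall("[0-9]+", "page 12 of 40"))
print(helpers.find_numbers("page 12 of 40"))

# Roughly what "from scraper_helpers_demo import *" does when the module
# defines no __all__: copy every name without a leading underscore
# (here into a plain dict instead of the caller's globals).
namespace = {name: value for name, value in vars(helpers).items()
             if not name.startswith("_")}
print(sorted(namespace))
```

Listing the names explicitly (from scraper_functions import requests, function_name) is a third option that keeps it visible where each name comes from.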

python: called function from other file needs modules

I am calling a function from functions.py into work.py, which works fine:
from functions import get_ad_page_urls
The get_ad_page_urls function uses, among others, the requests module.
Now, whether or not I import the requests module into work.py, running the called function in work.py gives an error: NameError: name 'requests' is not defined.
I have defined get_ad_page_urls in functions.py with the import inside the function, like so,
def get_ad_page_urls():
    import requests
    <rest of function>
or with the import at module level, like so,
import requests
def get_ad_page_urls():
    <rest of function>
but it doesn't matter, the NameError persists.
How should I write the function so that everything works when I call it in work.py?
Traceback:
get_ad_page_urls(page_root_url)
Traceback (most recent call last):
File "<ipython-input-253-ac55b8b1e24c>", line 1, in <module>
get_ad_page_urls(page_root_url)
File "/Users/myname/Documents/RentIndicator/Python Code/idealista_functions.py", line 35, in get_ad_page_urls
NameError: name 'requests' is not defined
functions.py
import requests
import bs4
import re
from bs4 import BeautifulSoup
def get_ad_page_urls(page_root_url):
    response = requests.get(page_root_url)
    soup = bs4.BeautifulSoup(response.text)
    container=soup.find("div",{"class":"items-container"})
    return [link.get("href") for link in container.findAll("a", href=re.compile("^(/inmueble/)((?!:).)*$"))]
work.py
import requests
import bs4
import re
from bs4 import BeautifulSoup
from functions import get_ad_page_urls
city='Valencia'
lcity=city.lower()
root_url = 'https://www.idealista.com'
house_href='/alquiler-habitacion/'
page_root_url = root_url +house_href +lcity+ '-' + lcity + '/'
get_ad_page_urls(page_root_url)
Mine works perfectly fine on Python 3.4.4.
functions.py
import requests
def get_ad_page_urls():
    return requests.get("https://www.google.com")
work.py
from functions import get_ad_page_urls
print(get_ad_page_urls())
# outputs <Response [200]>
Make sure both files are in the same directory. You might also be using two different Python installations, one of which doesn't have requests installed.
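One way to check both suggestions is to ask the running interpreter which Python it is and where it would load a module from. The sketch below uses a hypothetical helper name, describe_import, and is meant for top-level module names only.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report which interpreter is running and where it would find `module_name`.

    Hypothetical helper; intended for top-level names such as
    "requests" or "functions".
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


# Run this in the same session (IDE console, IPython, terminal) that raises
# the NameError, once for the third-party package and once for your own module.
for name in ("requests", "functions"):
    print(name, describe_import(name))
```

If "location" for your own module points at a different file than the one you are editing, or "python" differs between the console and the terminal, that mismatch is worth resolving first.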

How to open url source code?

I'm trying to write code that opens a URL connection and reads the text of the response. I've tried:
import urllib
urllib.urlopen('http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345')
But it gives me this error:
Traceback (most recent call last):
File "<pyshell#40>", line 1, in <module>
urllib.urlopen('http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345')
AttributeError: 'module' object has no attribute 'urlopen'
What's wrong with my code and what is the solution to the problem?
I think the problem is that you are on Python 3.x but using code that only works on 2.x.
In Python 3.x, the urlopen function is contained in urllib.request:
>>> from urllib.request import urlopen
>>> urlopen
<function urlopen at 0x020DA7C8>
>>>
Edit:
I think this does everything you want:
from urllib.request import urlopen
page = urlopen('http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345').read()
print(page)
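Note that in Python 3, read() returns bytes, so print(page) shows a b'...' literal. A minimal sketch of decoding the response to text follows; fetch_text is a hypothetical helper name, and falling back to UTF-8 when the server declares no charset is an assumption.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Fetch `url` and return the body as str.

    Uses the charset declared in the response headers when present and
    falls back to `default_encoding` (an assumption) otherwise.
    """
    with urlopen(url) as response:
        raw = response.read()  # bytes in Python 3
        encoding = response.headers.get_content_charset() or default_encoding
    return raw.decode(encoding)
```

It would be called as print(fetch_text('http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345')).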
Try "requests". It is much easier to work with.
http://www.python-requests.org/en/latest/user/quickstart/
import requests
r = requests.get('http://www.pythonchallenge.com/pc/def/linkedlist.php', params={
    'nothing': '12345'
})
print(r.text)

Import Error using cPickle in Python

I am using pickle in Python 2.7 and I get an error when calling cPickle.load(). The code and error are shown below. Can someone guide me through this?
Code:
#!/usr/bin/python
import cPickle
fo = open('result','rb')
dict1 = cPickle.load(fo)
Error:
Traceback (most recent call last):
File "C:\Python27\test.py", line 7, in <module>
dicts = cPickle.load(fo)
ImportError: No module named options
It seems that you cannot do
import options
but when you or someone else did
cPickle.dump(xxx, open('result', 'wb'))
xxx contained an object whose class or function came from a module named options that existed at that time.
Solution
You can open the file in binary mode and replace options with the name of the module that replaced the old options module.
You probably created the file from inside your package, for example in module package.main by executing main.py directly, with a module options in the same directory.
Now you do import package.main and try to read the file, but options is now called package.options, so the module options cannot be found.
How did you create this file? How do you load it now? cPickle/pickle does not store source code, so if you pickle a function or a class instance, the module that defines it must be importable when you load the file.
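The behaviour described here can be reproduced in a few lines. The sketch below uses Python 3's pickle rather than Python 2's cPickle, and demo_options and Settings are made-up names standing in for the missing options module and whatever class it defined.

```python
import pickle
import sys
import types

MODULE_NAME = "demo_options"  # hypothetical stand-in for the missing "options" module


def make_demo_module():
    """Build a throwaway module that defines one class, as options.py might."""
    module = types.ModuleType(MODULE_NAME)
    source = (
        "class Settings:\n"
        "    def __init__(self, level):\n"
        "        self.level = level\n"
    )
    exec(source, module.__dict__)
    return module


# 1. While the module is importable, pickling an instance works.
module = make_demo_module()
sys.modules[MODULE_NAME] = module
payload = pickle.dumps(module.Settings(level=3))

# The payload records the module and class *names*, not the class source.
print(MODULE_NAME.encode() in payload)

# 2. Once the module can no longer be imported, loading fails.
del sys.modules[MODULE_NAME]
load_error = None
try:
    pickle.loads(payload)
except ImportError as error:  # ModuleNotFoundError is a subclass of ImportError
    load_error = error
    print("load failed:", error)

# 3. Making a module with that name importable again fixes the load.
sys.modules[MODULE_NAME] = module
restored = pickle.loads(payload)
print(restored.level)
```

In a real project, step 3 corresponds to putting the directory that contains options.py back on the import path before loading the file.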

Fixed: Python NameError, fixed AttributeError and got this?

FIXED: It turns out there is already a module called parser. I renamed mine and it's working fine! Thanks all.
I got a Python NameError I can't figure out; it appeared after I fixed an AttributeError. I've tried what I know and can't come up with anything.
main.py:
from random import *
from xml.dom import minidom
import parser
from parser import *
print("+---+ Roleplay Stat Reader +---+")
print("Load previous DAT file, or create new one (new/load file)")
IN=input()
splt = IN.split(' ')
if splt[0]=="new":
    xmlwrite(splt[1])
else:
    if len(splt[1])<2:
        print("err")
    else:
        xmlread(splt[1])
ex=input("Press ENTER to Exit...")
parser.py:
from xml.dom import minidom
from random import *
def xmlread(doc):
    xmldoc = minidom.parse(doc)
    itemlist = xmldoc.getElementsByTagName('item')
    for s in itemlist:
        print(s.attributes['name'].value,":",s.attributes['value'].value)
def xmlwrite(doc):
    print("no")
And no matter what I get the error:
Traceback (most recent call last):
File "K:\Python Programs\Stat Reader\main.py", line 10, in <module>
xmlwrite.xmlwrite(splt[1])
NameError: name 'xmlread' is not defined
The same error occurs when trying to access xmlwrite.
When I change xmlread and xmlwrite to parser.xmlread and parser.xmlwrite I get:
Traceback (most recent call last):
File "K:\Python Programs\Stat Reader\main.py", line 15, in <module>
parser.xmlread(splt[1])
AttributeError: 'module' object has no attribute 'xmlread'
The drive is K:\ because it's my personal drive at my school.
If your file is really called parser.xml, that's your problem. It needs to be parser.py in order to work.
EDIT: Okay, since that wasn't your issue, it looks like you have a namespacing issue. You import your parser module twice: once with import parser and again with from parser import *. The first form makes parser a namespace and the second imports its names directly, so in theory you should have both parser.xmlwrite and xmlwrite in scope. It's also not useful to import minidom in main.py, since you don't use any minidom functionality there.
If you clear those up and still have the issue, I would suggest looking at __init__.py. If that still does nothing, it could simply be a conflict with Python's own parser module; you could substitute a name like myxmlparser.
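Before naming a file, you can ask the interpreter whether the name is already taken by a module it ships with. This is a rough sketch: is_reserved_module_name is a hypothetical helper, and the result depends on the Python version, because sys.stdlib_module_names only exists from Python 3.10 on.

```python
import sys


def is_reserved_module_name(name):
    """Return True if `name` matches a module that ships with this interpreter.

    sys.builtin_module_names exists on every version; sys.stdlib_module_names
    was added in Python 3.10, so older interpreters only get the built-in check.
    """
    if name in sys.builtin_module_names:
        return True
    return name in getattr(sys, "stdlib_module_names", frozenset())


for candidate in ("parser", "copy", "myxmlparser"):
    print(candidate, is_reserved_module_name(candidate))
```

A True result means a file with that name in your project may either hide the standard module or be hidden by it, so a different file name is the safer choice.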
