I am calling a function from functions.py into work.py, which works fine:
from functions import get_ad_page_urls
The get_ad_page_urls function uses, among others, the requests module.
Now, whether or not I import the requests module into work.py, running the called function in work.py raises an error: NameError: name 'requests' is not defined.
I have defined get_ad_page_urls in functions.py with the import inside the function, like so:
def get_ad_page_urls():
    import requests
    <rest of function>
or with the import at module level, like so:
import requests
def get_ad_page_urls():
    <rest of function>
Either way, the NameError persists.
How should I write the function such that when I call the function in work.py everything works fine?
Traceback:
get_ad_page_urls(page_root_url)
Traceback (most recent call last):
File "<ipython-input-253-ac55b8b1e24c>", line 1, in <module>
get_ad_page_urls(page_root_url)
File "/Users/myname/Documents/RentIndicator/Python Code/idealista_functions.py", line 35, in get_ad_page_urls
NameError: name 'requests' is not defined
functions.py
import requests
import bs4
import re
from bs4 import BeautifulSoup
def get_ad_page_urls(page_root_url):
    response = requests.get(page_root_url)
    soup = bs4.BeautifulSoup(response.text)
    container = soup.find("div", {"class": "items-container"})
    return [link.get("href") for link in container.findAll("a", href=re.compile("^(/inmueble/)((?!:).)*$"))]
work.py
import requests
import bs4
import re
from bs4 import BeautifulSoup
from functions import get_ad_page_urls
city='Valencia'
lcity=city.lower()
root_url = 'https://www.idealista.com'
house_href='/alquiler-habitacion/'
page_root_url = root_url +house_href +lcity+ '-' + lcity + '/'
get_ad_page_urls(page_root_url)
Mine works perfectly fine on Python 3.4.4:
functions.py
import requests
def get_ad_page_urls():
    return requests.get("https://www.google.com")
work.py
from functions import get_ad_page_urls
print(get_ad_page_urls())
# outputs <Response [200]>
Make sure both files are in the same directory. You might also be using two different Python installations, one of which doesn't have requests installed.
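One way to check the "two different Python versions" idea is to ask the running interpreter where it lives and whether it can locate the module at all. The snippet below is only a minimal sketch: describe_environment is a hypothetical helper name, not something from the question, and it only inspects the interpreter that executes it.

```python
import importlib.util
import sys


def describe_environment(module_name="requests"):
    """Report which interpreter is running and whether it can find module_name.

    describe_environment is a hypothetical helper written for this sketch.
    """
    spec = importlib.util.find_spec(module_name)  # None if the module is not installed
    return {
        "interpreter": sys.executable,
        "version": sys.version.split()[0],
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

Running this from the same console or IDE that raises the NameError, and again from a plain terminal, should show whether both use the same interpreter.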
Related
I created a Python console app that imports a custom class I'm using. Every time I run my app I get the error ModuleNotFoundError: No module named 'DataServices'.
Can you help?
Provided below is my folder structure:
ETL
Baseball
Baseball_DataImport.py
DataServices
DataService.py
ConfigServices.py
PageDataMode.py
SportType.py
Here is the import section of the Baseball_DataImport.py file. This is the file that raises the error when I run it:
from bs4 import BeautifulSoup
import scrapy
import requests
import BaseballEntity
import mechanize
import re
from time import sleep
import logging
import time
import datetime
from functools import wraps
import json
import DataServices.DataService  # Error occurs here
Here is my DataService.py file:
import pymongo
import json
import ConfigServices
import PageDataModel
#from SportType import SportType
class DataServices(object):
    AppConfig: object
    def __init__(self):
        AppConfig = ConfigServices.ConfigService()
        #print(AppConfig)
    #def GetPagingDataBySport(self,Sport:SportType):
    def GetPagingDataBySport(self):
        #if Sport == SportType.BASEBALL:
        pagingData = []
        pagingData.append(PageDataModel.PageDataModel("", 2002, 2))
        pagingData.append(PageDataModel.PageDataModel("", 2003, 2))
        return pagingData
It seems that your structure is:
Baseball
Baseball_DataImport.py
Dataservices
Dataservice.py
Maybe you need to write: from Dataservices.Dataservice import DataServices
Edit:
I created the folder structure, and the method I showed you works.
Here's the implementation:
Dataservice.py only contains:
class DataServices():
    pass
Did you try copying Dataservice.py into the project folder next to main.py?
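The idea behind the package-style import can be tried in isolation. The following is a minimal sketch, not the asker's real project: it builds a throwaway DataServices folder in a temporary directory (the names mirror the question, the __init__.py file and the temporary location are assumptions made for the demo) and imports the class the way the answer suggests.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_project(root: Path) -> None:
    """Create root/DataServices/DataService.py holding an empty class (demo layout only)."""
    package_dir = root / "DataServices"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")  # marks the folder as a regular package
    (package_dir / "DataService.py").write_text("class DataServices:\n    pass\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_project(root)
    # The folder that CONTAINS the package must be on sys.path, not the package folder itself.
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("DataServices.DataService")
        service = module.DataServices()
        class_name = type(service).__name__
    finally:
        sys.path.remove(str(root))

print(class_name)
```

The point of the sketch is that the import only resolves when the parent directory of the package is on sys.path; running a script from inside a sibling folder does not put that parent there automatically.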
I have about a dozen python module imports that are going to be reused on many different scrapers, and I would love to just throw them into a single file (scraper_functions.py) that also contains a bunch of functions, like this:
import smtplib
import requests
import re
from urllib.request import urlopen
from bs4 import BeautifulSoup
import time
def function_name(var1):
    # function code here
    pass
then in my scraper I would simply do something like:
import scraper_functions
and be done with it. But listing the imports at the top of scraper_functions.py doesn't work, and neither does putting all the imports in a function. In each case I get errors in the scraper that is doing the importing.
Traceback (most recent call last):
File "{actual-scraper-name-here}.py", line 24, in <module>
x = requests.get(main_url)
NameError: name 'requests' is not defined
In addition, in VSCode, under Problems, I get errors like
Undefined variable 'requests' pylint(undefined-variable) [24,5]
None of the modules are recognized. I have made sure that all files are in the same directory.
Is such a thing possible please?
You need to either use the scraper_functions prefix (the name you imported) or use the from keyword to import everything from scraper_functions with the * selector.
Using the from keyword (Recommended)
from scraper_functions import * # import everything with *
...
x = requests.get(main_url)
Using the scraper_functions prefix (Not recommended)
import scraper_functions
...
x = scraper_functions.requests.get(main_url)
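The difference between the two import styles can be seen without any third-party package. The sketch below is an illustration under stated assumptions: demo_scraper_functions and find_digits are hypothetical names, and the standard library's re module stands in for requests so the example runs anywhere.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical helper module; `re` stands in for `requests`.
HELPER_SOURCE = '''
import re

def find_digits(text):
    return re.findall("[0-9]+", text)
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_scraper_functions.py").write_text(HELPER_SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()

        # Style 1: a plain import binds only the module name in the importer.
        plain_namespace = {}
        exec("import demo_scraper_functions", plain_namespace)
        plain_import_has_re = "re" in plain_namespace

        # Style 2: a star import copies the module's public names, including `re`.
        star_namespace = {}
        exec("from demo_scraper_functions import *", star_namespace)
        star_import_has_re = "re" in star_namespace

        # With the plain import, everything is reachable through the prefix.
        digits = plain_namespace["demo_scraper_functions"].find_digits("a1b22")
    finally:
        sys.path.remove(tmp)

print(plain_import_has_re, star_import_has_re, digits)
```

This is why the scraper raised NameError: the imports at the top of the helper file are names inside that module, and they only appear in the importing script if it asks for them.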
I'm new to Python and BS4.
I decided to create a script that scrapes a website, oscarmini.com to be precise. The code was running fine until today, when I wanted to modify it, and now I keep getting errors. With the little I know about exceptions and errors, there seems to be nothing wrong with the code itself; the problem appears to come from importing the 'bs4' module.
from bs4 import BeautifulSoup as BS
import requests
url = 'https://oscarmini.com/2018/05/techfest-2018.html'
page = requests.get(url)
soup = BS(page.text, 'lxml')
mydivs = soup.find("div", {"class": "entry-content"})
soup.find('div', id="dpsp-content-top").decompose()
print(mydivs.get_text())
input()
Below is the error message I get.
Traceback (most recent call last):
File "C:/Users/USERNaME/Desktop/My Programs/Random/Oscarmini-Scrapper.py", line 1, in <module>
from bs4 import BeautifulSoup as BS
File "C:\Users\USERNaME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\bs4\__init__.py", line 35, in <module>
import xml.etree.cElementTree as default_etree
File ":\Users\USERNaME\AppData\Local\Programs\Python\Python36-32\lib\xml\etree\cElementTree.py", line 3, in <module>
from xml.etree.ElementTree import *
File "C:\Users\USERNaME\AppData\Local\Programs\Python\Python36-32\lib\xml\etree\ElementTree.py", line 1654, in <module>
from _elementtree import *
AttributeError: module 'copy' has no attribute 'deepcopy'
Process finished with exit code 1
Please I really need help on this..
I encountered the same problem, and I finally found that I had another script named copy.py that shadowed the standard library's copy module.
You can show the real path of the copy module with print(copy.__file__) just before the exception occurs and check whether it is the one you expect.
You can also list your PYTHONPATH environment variable with:
import os; print(os.environ.get('PYTHONPATH', '').split(os.pathsep))
just before the line that causes the exception, and see whether it contains anything unexpected.
Make sure no copy.py file exists in your project's working directory, for example:
project folder:
copy.py
currentOpenFile.py # when you import the copy module...
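Both checks above can be wrapped in a small script. This is only a sketch: module_origin and is_shadowed_by_local_file are hypothetical helper names, and the second one only looks for a same-named .py file in one directory, which is the common case but not the only way a module can be shadowed.

```python
import importlib.util
from pathlib import Path


def module_origin(module_name):
    """Return the file the interpreter would load for module_name, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


def is_shadowed_by_local_file(module_name, search_dir="."):
    """Return True if search_dir contains a file named <module_name>.py."""
    return (Path(search_dir) / (module_name + ".py")).is_file()


if __name__ == "__main__":
    print("copy is loaded from:", module_origin("copy"))
    print("local copy.py present:", is_shadowed_by_local_file("copy"))
```

If the printed origin points into the project folder rather than the Python installation, renaming the local file (and deleting any stale .pyc next to it) is the usual fix.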
I can't run my script. I'm using Python 3 and I installed pyrebase and its dependencies.
I get the exception below when I try to run my script on Ubuntu Linux:
Traceback (most recent call last):
File "scrapping2fb.py", line 9, in <module>
import pyrebase
File "/usr/local/lib/python3.4/dist-packages/pyrebase/__init__.py", line 1, in <module>
from .pyrebase import initialize_app
File "/usr/local/lib/python3.4/dist-packages/pyrebase/pyrebase.py", line 19, in <module>
from requests.packages.urllib3.contrib.appengine import is_appengine_sandbox
Can someone help me?
Thank you
The script that I am trying to run:
from urllib.request import urlopen ,URLError,HTTPError,Request
from socket import timeout
from bs4 import BeautifulSoup
from time import sleep
import mysql.connector
from datetime import datetime
import pyrebase
def is_exist_firebase_db_AR(siteName,title):#(siteName,title):
    global config
    global email
    global password
    firebase = pyrebase.initialize_app(config)
    db=firebase.database()
    auth = firebase.auth()
    user = auth.sign_in_with_email_and_password(email, password)
    all_items = db.child("items_ar").get(user['idToken'])
    if(all_items.each() is not None):
        for item in all_items.each():
            if(siteName in item.val().get("nomSite") and title in item.val().get("titre")):
                return 1
    return 0
This is a problem with the pyrebase package.
Since commit 8e17600ef60de4faf632acb55d15cb3c178de9bb, which went into v2.16.0, requests no longer bundles urllib3.
The pyrebase package relied on this implementation detail and, like all things that rely on implementation details eventually do, it broke.
so I have this code:
def crawl(self, url):
    data = urllib.request.urlopen(url)
    print(data)
but then when I call the function, it returns
data = urllib.request.urlopen(url)
AttributeError: 'module' object has no attribute 'request'
what did I do wrong? I already imported urllib..
using python 3.1.3
In Python 3, urllib is a package with several submodules, including request, response, error, and parse, each for its own purpose.
Wherever you had import urllib or import urllib2 in Python 2, replace them with
import urllib.request
import urllib.response
import urllib.error
The classes and methods are largely the same.
BTW, use the 2to3 tool if you are converting from Python 2 to Python 3.
urllib.request is a separate module; import it explicitly.
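A minimal sketch of the explicit import follows. fetch_text is a hypothetical helper name, and the data: URL is used only so the example runs without network access; a normal http(s) URL goes through the same urlopen call.

```python
import urllib.request  # importing the submodule explicitly makes urllib.request available


def fetch_text(url):
    """Open url with urllib.request.urlopen and return the body decoded as UTF-8."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# A data: URL carries its own payload, so no network is needed for this demo.
body = fetch_text("data:text/plain,hello")
print(body)
```

With only import urllib at the top of the file, the attribute lookup urllib.request may fail exactly as in the question, because importing a package does not automatically import its submodules.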