I built a scraping module "scraper.py" that also has the ability to download file and I imported this module into django views. Issue is that in the scraper.py, this " __name__='__main__" is included where the multiprocessing pool is, so when I import the module and try to run it, it doesn't work because it isn't the main.
This is the script(scraper.py) that uses the pool method.
def download(self, url):
response = self._is_downloadable(url)
if response:
name = response.headers.get('content-disposition')
fname = re.findall('filename=(.+)', name)
if len(fname) != 0:
filename = fname[0]
filename = filename.replace("\"", "")
print(filename)
else :
filename = "Lecture note"
with open(filename, 'wb') as files:
for chunk in response.iter_content(100000):
files.write(chunk)
def download_course_file(self, course):
username = self._login_data["username"]
p = Path(f"{username}-{course}.txt").exists()
if not p:
self.get_download_links(course)
statime = time.time()
if __name__ == "__main__":
with Pool() as p:
with open(f"{username}-{course}.txt", "r") as course_link:
data = course_link.read().splitlines(False)[::2]
p.map(self.download, data)
print(data)
print(f"Process done {time.time()-statime}")
This module is imported in the views and then ran as
import scraper
def download_course(request, id):
course = course = get_object_or_404(Course, id=id)
course_name = (course.course_name)[:6]
person, error = create_session(request)
if "invalid" in error:
data = {"error":error}
return JsonResponse(data)
person.download_course_file(course_name)
data = {"success":"Your notes are being downloaded"}
return JsonResponse(data)
PS: create_session is a function for initialising the scraper object with a username and password.
Is there a workaround for this name statement and even if there isn't, can't I remove it when I am deploying to a server as long as the server don't use windows as its OS.
Related
I am trying to make a Python script that runs as a background service with NSSM that will only preform an action when a new version of a program is released.
To do this I want to check a web URL with requests every x amount of time and only if the version the server reports has changed from the local version run a updater block.
I currently have the block that does the update working fine but I just don't know how I would compare the request result and local version info from a file and only run that block if they don't match.
I get the newest version on the server with this code:
winclientsettings = requests.get('https://myapi.website/client-version/Windows')
winclientdict = winclientsettings.json()
serverversion = list(winclientdict.values())[1]
And the local version with:
appdata = os.getenv('LOCALAPPDATA')
programfolder = appdata + '/My Program'
versionfile = programfolder + '/localversion'
with open(versionfile, "r") as file:
localversion = file.readline()
I want to put these together into an infinite loop, where if serverversion and localversion are the same it just does nothing and waits a set amount of time before repeating, and if serverversion and localversion are different, will trigger a block higher in the file.
Put what you have in a while loop, and use the builtin time.sleep
import os
import time
import requests
WAIT_SECONDS = 300
appdata = os.getenv('LOCALAPPDATA')
programfolder = appdata + '/My Program'
versionfile = programfolder + '/localversion'
def get_server_version():
winclientsettings = requests.get(
'https://myapi.website/client-version/Windows')
winclientdict = winclientsettings.json()
serverversion = list(winclientdict.values())[1]
return serverversion
def get_local_version():
with open(versionfile, "r") as file:
localversion = file.readline()
return localversion
def update_if_new():
server_version = get_server_version()
local_version = get_local_version()
if server_version != local_version:
with open(versionfile, "w") as file:
file.write(server_version)
def main():
while True:
update_if_new()
time.sleep(WAIT_SECONDS)
if __name__ == '__main__':
main()
run display
when run it spits out the error couldnt write token to cache at: .cache but it still manages to grab whatever is playing. and each time it makes a request it reopens a auth page. it also doesnt save to any of the files. everything runs fine in pycharm using the main python environment.
from spotipy.oauth2 import SpotifyOAuth
import requests
import time
#file locations
file_time_playing = "time_playing.txt"
file_time_playing_ms = "time_playing_ms.txt"
file_time_total = "time_total.txt"
file_time_total_ms = "time_total_ms.txt"
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(client_id = ".......",
client_secret = ".......",
redirect_uri = "http://localhost:8080",
scope = "streaming"))
song = ""
def track():
print("======================")
global song
global currently_playing
# print(sp.currently_playing())
try:
currently_playing = sp.currently_playing()
#print(currently_playing)
if song == str(currently_playing["item"]["name"]):
print(song)
play_time()
return
else:
song = str(currently_playing["item"]["name"])
print(song)
with open("song.txt", "w", encoding="utf-8") as f:
f.write(song)
except:
pass
starttime = time.time()
while True:
track()
time.sleep(0.5 - ((time.time() - starttime) % 0.5))```
i found a link to a git hub error and it suggested ruining it from a .bat i used the below links to solve my issue
https://github.com/plamere/spotipy/issues/363
How to run a .py file through a batch file using Command line
I'm trying to get the generated domain name or IP-address of flask_ngrok or py-ngrok after been deploy. I want to deploy flask_app to localhost and get the new IP-address or domain name on the main page.
I.E: If I access 127.0.0.1/ I want it to return something like
You can now log in through https://aaf8447ee878.ngrok.io/
I have tried checking through the directories and read some help but I can't still get it. Thanks in advance ❤
add
import atexit
import json
import os
import platform
import shutil
import subprocess
import tempfile
import time
import zipfile
from pathlib import Path
from threading import Timer
import requests
def _run_ngrok():
ngrok_path = str(Path(tempfile.gettempdir(), "ngrok"))
_download_ngrok(ngrok_path)
system = platform.system()
if system == "Darwin":
command = "ngrok"
elif system == "Windows":
command = "ngrok.exe"
elif system == "Linux":
command = "ngrok"
else:
raise Exception(f"{system} is not supported")
executable = str(Path(ngrok_path, command))
os.chmod(executable, 777)
ngrok = subprocess.Popen([executable, 'http', '5000'])
atexit.register(ngrok.terminate)
localhost_url = "http://localhost:4040/api/tunnels" # Url with tunnel details
time.sleep(1)
tunnel_url = requests.get(localhost_url).text # Get the tunnel information
j = json.loads(tunnel_url)
tunnel_url = j['tunnels'][0]['public_url'] # Do the parsing of the get
tunnel_url = tunnel_url.replace("https", "http")
return tunnel_url
def _download_ngrok(ngrok_path):
if Path(ngrok_path).exists():
return
system = platform.system()
if system == "Darwin":
url = "https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-darwin-amd64.zip"
elif system == "Windows":
url = "https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-windows-amd64.zip"
elif system == "Linux":
url = "https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip"
else:
raise Exception(f"{system} is not supported")
download_path = _download_file(url)
with zipfile.ZipFile(download_path, "r") as zip_ref:
zip_ref.extractall(ngrok_path)
def _download_file(url):
local_filename = url.split('/')[-1]
r = requests.get(url, stream=True)
download_path = str(Path(tempfile.gettempdir(), local_filename))
with open(download_path, 'wb') as f:
shutil.copyfileobj(r.raw, f)
return download_path
def start_ngrok():
global ngrok_address
ngrok_address = _run_ngrok()
print(f" * Running on {ngrok_address}")
print(f" * Traffic stats available on http://127.0.0.1:4040")
def run_with_ngrok(app):
"""
The provided Flask app will be securely exposed to the public internet via ngrok when run,
and the its ngrok address will be printed to stdout
:param app: a Flask application object
:return: None
"""
old_run = app.run
def new_run():
thread = Timer(1, start_ngrok)
thread.setDaemon(True)
thread.start()
old_run()
app.run = new_run
####################
dont import flask_ngrok
at the end at before name == 'main' add this function
def ngrok_url():
global tunnel_url
while True:
try:
print(ngrok_address)
except Exception as e:
print(e)
and after before app.run() put
thread = Timer(1, ngrok_url)
thread.setDaemon(True)
thread.start()
and run Warning: this will crash your code editor/ or terminal if u dont want that in the ngrok url function replace print with whatever you want to do with the url
and you dont need that
global tunnel_url
def ngrok_url():
while True:
try:
print(ngrok_address)
except Exception as e:
print(e)
you can delete the threading part before the name == 'main' too after the imports set
ngrok_address = ''
then you can accses the ngrok_address anywhere in your code
I found out the easiest way to do this is the just copy the url when the user is visiting the site. You can do this by...
#app.before_request
def before_request():
global url
url = request.url
# url = url.replace('http://', 'https://', 1)
url = url.split('.ngrok.io')[0]
url += '.ngrok.io'
Could anyone please help me in getting a way to get the source code from Environment or SB Console or Weblogic.
I created the python script whick exports the JAR, but I need the source code. Because if I unjar the jar, I do not get the exact source code, as file names are shorten and some code is added by itself in wsdls, xqueries etc. I don't want that.
Here's my wlst Python/Jython Script:
from java.io import FileInputStream
from java.io import FileOutputStream
from java.util import ArrayList
from java.util import Collections
from com.bea.wli.sb.util import EnvValueTypes
from com.bea.wli.config.env import EnvValueQuery;
from com.bea.wli.config import Ref
from com.bea.wli.config.customization import Customization
from com.bea.wli.config.customization import FindAndReplaceCustomization
import sys
#=======================================================================================
# Utility function to load properties from a config file
#=======================================================================================
def exportAll(exportConfigFile):
def exportAll(exportConfigFile):
try:
print "Loading export config from :", exportConfigFile
exportConfigProp = loadProps(exportConfigFile)
adminUrl = exportConfigProp.get("adminUrl")
exportUser = exportConfigProp.get("exportUser")
exportPasswd = exportConfigProp.get("exportPassword")
exportJar = exportConfigProp.get("exportJar")
customFile = exportConfigProp.get("customizationFile")
passphrase = exportConfigProp.get("passphrase")
project = sys.argv[2]
if project == None :
project = exportConfigProp.get("project")
connectToServer(exportUser, exportPasswd, adminUrl)
ALSBConfigurationMBean = findService("ALSBConfiguration", "com.bea.wli.sb.management.configuration.ALSBConfigurationMBean")
print "ALSBConfiguration MBean found"
print "Input project: ", project
if project == None :
ref = Ref.DOMAIN
collection = Collections.singleton(ref)
if passphrase == None :
print "Export the config"
theBytes = ALSBConfigurationMBean.exportProjects(collection, None)
else :
print "Export and encrypt the config"
theBytes = ALSBConfigurationMBean.export(collection, true, passphrase)
else :
ref = Ref.makeProjectRef(project);
print "Export the project", project
collection = Collections.singleton(ref)
theBytes = ALSBConfigurationMBean.export(collection, false, None)
aFile = File(exportJar)
out = FileOutputStream(aFile)
out.write(theBytes)
out.close()
print "ALSB Configuration file: "+ exportJar + " has been exported"
if customFile != None:
print collection
query = EnvValueQuery(None, Collections.singleton(EnvValueTypes.WORK_MANAGER), collection, false, None, false)
customEnv = FindAndReplaceCustomization('Set the right Work Manager', query, 'Production System Work Manager')
print 'EnvValueCustomization created'
customList = ArrayList()
customList.add(customEnv)
print customList
aFile = File(customFile)
out = FileOutputStream(aFile)
Customization.toXML(customList, out)
out.close()
print "ALSB Dummy Customization file: "+ customFile + " has been created"
except:
raise
#=======================================================================================
# Utility function to load properties from a config file
#=======================================================================================
def loadProps(configPropFile):
propInputStream = FileInputStream(configPropFile)
configProps = Properties()
configProps.load(propInputStream)
return configProps
#=======================================================================================
# Connect to the Admin Server
#=======================================================================================
def connectToServer(username, password, url):
connect(username, password, url)
domainRuntime()
# EXPORT script init
try:
exportAll(sys.argv[1])
except:
print "Unexpected error: ", sys.exc_info()[0]
dumpStack()
raise
Any help would be appreciated.
What you get as a result of the export is the deployed unit. Yes, there is some metadata added/modified as a result of the deployment on the OSB runtime (deployment could also mean creating/editing components directly on the servicebus console).
To get it back as "source code" from the exported jar, you can simply import it back into JDeveloper (12c) or Eclipse with OEPE (11g)
I have a function that parses a file and inserts the data into MySQL using SQLAlchemy. I've been running the function sequentially on the result of os.listdir() and everything works perfectly.
Because most of the time is spent reading the file and writing to the DB, I wanted to use multiprocessing to speed things up. Here is my pseduocode as the actual code is too long:
def parse_file(filename):
f = open(filename, 'rb')
data = f.read()
f.close()
soup = BeautifulSoup(data,features="lxml", from_encoding='utf-8')
# parse file here
db_record = MyDBRecord(parsed_data)
session.add(db_record)
session.commit()
pool = mp.Pool(processes=8)
pool.map(parse_file, ['my_dir/' + filename for filename in os.listdir("my_dir")])
The problem I'm seeing is that the script hangs and never finishes. I usually get 48 of 63 records into the database. Sometimes it's more, sometimes it's less.
I've tried using pool.close() and in combination with pool.join() and neither seems to help.
How do I get this script to complete? What am I doing wrong? I'm using Python 2.7.8 on a Linux box.
You need to put all code which uses multiprocessing, inside its own function. This stops it recursively launching new pools when multiprocessing re-imports your module in separate processes:
def parse_file(filename):
...
def main():
pool = mp.Pool(processes=8)
pool.map(parse_file, ['my_dir/' + filename for filename in os.listdir("my_dir")])
if __name__ == '__main__':
main()
See the documentation about making sure your module is importable, also the advice for running on Windows(tm)
The problem was a combination of 2 things:
my pool code being called multiple times (thanks #Peter Wood)
my DB code opening too many sessions (and/or) sharing sessions
I made the following changes and everything works now:
Original File
def parse_file(filename):
f = open(filename, 'rb')
data = f.read()
f.close()
soup = BeautifulSoup(data,features="lxml", from_encoding='utf-8')
# parse file here
db_record = MyDBRecord(parsed_data)
session = get_session() # see below
session.add(db_record)
session.commit()
pool = mp.Pool(processes=8)
pool.map(parse_file, ['my_dir/' + filename for filename in os.listdir("my_dir")])
DB File
def get_session():
engine = create_engine('mysql://root:root#localhost/my_db')
Base.metadata.create_all(engine)
Base.metadata.bind = engine
db_session = sessionmaker(bind=engine)
return db_session()