Looping a function with its input being a url [closed] - python

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
So I am trying to get into Python, and I'm using examples I find online to understand certain functions better.
I found a post that shared a way to check prices on an item through CamelCamelCamel.
It was set up to request a specific URL, so I decided to change it to take user input instead.
How can I simply loop this function?
It runs fine once, as far as I know, but after the initial run I get 'Process finished with exit code 0', which isn't necessarily a problem.
For the script to behave the way I'd like, it would be nice to be able to break out by typing 'quit' or something, but after it processes the URL it was given, I would like it to prompt for a new URL.
I'm also sure there's a way to check for a specific URL, i.e. this should only work for CamelCamelCamel, so it's limited to that domain.
I'm more familiar with Batch, and have kind of gotten away with using Batch to run my Python files to work around what I don't understand.
If I could, I would just mark the function as 'top:' and put a goto top at the bottom of the script.
from bs4 import BeautifulSoup
import requests

plink = input("Enter CamelCamelCamel Link: ")

headers = {'User-Agent': 'Mozilla/5.0'}
r = requests.get(plink, headers=headers)

soup = BeautifulSoup(r.text, 'html.parser')
table_data = soup.select('table.product_pane tbody tr td')

# Highest/lowest price cells and their dates
hprice = table_data[1].string
hdate = table_data[2].string
lprice = table_data[7].string
ldate = table_data[8].string

print('High price-', hprice)
print("[H-Date]", hdate)
print('---------------')
print('Low price-', lprice)
print("[L-Date]", ldate)
Also, how could I find the difference between the date I get from hdate or ldate and today/now? The dates I parsed are strings, and when I tried doing arithmetic with them I got: TypeError: unsupported operand type(s) for +=: 'int' and 'str'.
This is really just for learning, any example works, It doesnt have to be that site in specific.

In Python, you have access to several different types of looping control structures, including:
while statements
while condition:  # will execute until condition is no longer True (or until break is called)
    <statements to execute while looping>
for statements
for i in range(10):  # will execute 10 times (or until break is called)
    <statements to execute while looping>
Each one has its strengths and weaknesses, and the documentation at Python.org is very thorough but easy to assimilate.
https://docs.python.org/3/tutorial/controlflow.html
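As a concrete sketch of how that applies to the question, the scraping code can be wrapped in a while True loop that prompts for a new URL on each pass, breaks when the user types 'quit', and skips anything that isn't a camelcamelcamel.com link. The domain check with urllib.parse and the exact prompt wording are assumptions of mine, not part of the original post:

from urllib.parse import urlparse

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0'}

while True:  # keep looping until the user types 'quit'
    plink = input("Enter CamelCamelCamel link (or 'quit' to exit): ")
    if plink.lower() == 'quit':
        break

    # Assumed domain check: only accept camelcamelcamel.com URLs
    if urlparse(plink).netloc not in ('camelcamelcamel.com', 'www.camelcamelcamel.com'):
        print("That doesn't look like a CamelCamelCamel link, try again.")
        continue

    r = requests.get(plink, headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    table_data = soup.select('table.product_pane tbody tr td')

    print('High price-', table_data[1].string, '[H-Date]', table_data[2].string)
    print('Low price-', table_data[7].string, '[L-Date]', table_data[8].string)

As for the date question: the strings have to be parsed into datetime objects before subtraction works, e.g. datetime.strptime(hdate, '%b %d, %Y') (that format string is a guess; it must match whatever the site actually prints). Subtracting the parsed value from datetime.now() then yields a timedelta, whose .days attribute gives the difference in days.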

Related

How to access webpage with variables in python [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 4 years ago.
I have a project and want to access a URL with Python. If I have variable1=1 and variable2=2, I want the output to be like this:
www.example.com/data.php?variable1=1&variable2=2
How do I achieve this? Thanks!
Check this out:
try:
    from urllib2 import urlopen  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3

vars = ['variable1=1', 'variable2=2']

for i in vars:
    url = 'http://www.example.com/data.php?' + i
    response = urlopen(url)
    html = response.read()
    print(html)
The first four lines import some code we can use to make an HTTP request.
Then we create a list of variables named vars.
Then we pass each of those variables into a loop; that loop runs once per item in vars.
Next we build the URL from the current value in vars.
Finally we get the HTML at that address and print it to the terminal.
You can use a format operation in Python for the string.
For example:
variable1 = 1
variable2 = 2
url = 'www.example.com/data.php?variable1={}&variable2={}'.format(variable1, variable2)
Or, if you want to use the URL with requests, you can build a dict and pass it as the query parameters, like this:
import requests

url = 'http://www.example.com/data.php'
data = {'variable1': '1', 'variable2': '2'}
r = requests.get(url, params=data)
and it will make the request to this URL:
http://www.example.com/data.php?variable1=1&variable2=2
Try string formatting...
url = 'www.example.com/data.php?variable1={}&variable2={}'.format(variable1, variable2)
This means the two {} placeholders will be replaced with whatever you pass to .format(), which in this case is just the variables' values.
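If there are more than a couple of variables, another option is urllib.parse.urlencode, which joins the key=value pairs for you and escapes anything that needs escaping. A minimal sketch, with the names simply mirroring the question:

from urllib.parse import urlencode

params = {'variable1': 1, 'variable2': 2}
url = 'http://www.example.com/data.php?' + urlencode(params)
print(url)  # http://www.example.com/data.php?variable1=1&variable2=2

This is also essentially what requests does under the hood when you pass params= to requests.get.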

Check textfile for arguments Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have two Python scripts:
Script 1: Checks elements on a webpage and writes them to a file.
Script 2: Reads from this file and uses the contents as the argument for an if statement. This is the part that I'm unsure about.
The text file has at least 500 items all on new lines, and I want to check if these items are still there when I revisit the site.
def read_input_file(self):
    inFile = open("page_items.txt", "r")
    if inFile == current_content:
        do.stuff
What would be the best way to go about this?
Use the first script to scrape the site again and save the result in a set, then use .issubset to check whether everything in inFile is contained within current_site?
current_site = set(scraped_items)
if set(inFile).issubset(current_site):
    do.stuff
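As a sketch of that approach: passing the file object to set() directly would keep the trailing newlines, so stripping each line first makes the comparison reliable. Here, scraped_items stands in for whatever the first script collected:

# Read the saved items (one per line), stripping newlines before comparing
with open("page_items.txt") as f:
    saved_items = {line.strip() for line in f}

current_site = set(scraped_items)  # scraped_items: output of the first script

if saved_items.issubset(current_site):
    print("All previously saved items are still on the page")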
It turned out that the sets were not really what I was looking for after all, mainly because the scraped contents needed to survive a reboot, so the text file was the only option I could think of.
I did find a solution, however: instead of scraping the current site and matching it against the infile, I now start with the infile and search for each line on the current site, using Selenium.
Here is what I came up with. It's not very clean, but maybe it's useful to somebody from the future:
import linecache
import time
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException

for i in range(0, 200):
    # linecache line numbers are 1-based; 'count' in the original was presumably the loop index
    scraped_content = linecache.getline('scraped.txt', i + 1).rstrip()
    search_path = "//*[contains(text(),'" + scraped_content + "')]"
    scroll_down = driver.find_element_by_tag_name('a')
    scroll_down.send_keys(Keys.PAGE_DOWN)
    scroll_to_element = None
    while not scroll_to_element:
        try:
            scroll_to_element = driver.find_element_by_xpath(search_path)
            time.sleep(1)
        except NoSuchElementException:
            print "Searching for Content:", scraped_content
            break
    if scroll_to_element is not None:
        print scraped_content, "Found!"

Python Shelve like requirement [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 8 years ago.
I want to add some kind of persistent tracking to my application. For example:
application.py supports 3 arguments: --firstarg, --secondarg, --thirdarg.
The user can execute application.py with any one of the arguments. I want to provide a feature where the user can check that he has already completed running the firstarg. To achieve this, I wanted to store a file called status.log, where the tracker will be kept.
Initially, status.log has
firstarg : 0
secondarg : 0
thirdarg : 0
Once the user executes the first arg, the line should change to firstarg : 1,
so that whenever the user wants, he can check whether he has already completed running the firstarg or not.
Is it possible?
I tried shelve, but it's not working for me for some reason; I am not seeing anything written to the file.
Thanks.
I'd say use a JSON file as a persistent store and use the load and dump methods for reads and writes.
The json module is built in, so all you need is to import it and use it:
import json

data_file = '/path/to/file'

# Load current state from file
with open(data_file) as f:
    contents = json.load(f)

# Do something
# ....

# Change values
contents['firstarg'] = firstarg

# Save to file
with open(data_file, 'w') as f:
    json.dump(contents, f)
Note that json.load and json.dump take an open file object, not a path string. Since you didn't include a code sample, this is just a simple approach.
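Putting that together with the tracker idea from the question, a minimal sketch might look like the following. The file name status.json, the load_status/mark_done helpers, and the default dict are all assumptions for illustration:

import json
import os

STATUS_FILE = 'status.json'  # assumed name; the question used status.log

def load_status():
    # Return the saved tracker, or a fresh one if the file doesn't exist yet
    if os.path.exists(STATUS_FILE):
        with open(STATUS_FILE) as f:
            return json.load(f)
    return {'firstarg': 0, 'secondarg': 0, 'thirdarg': 0}

def mark_done(arg_name):
    # Flip one flag to 1 and persist the whole tracker
    status = load_status()
    status[arg_name] = 1
    with open(STATUS_FILE, 'w') as f:
        json.dump(status, f)

# e.g. after application.py finishes handling --firstarg:
# mark_done('firstarg')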

Receiving JSON dicts with Python [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I'm pretty new to Python and I want to fetch a JSON dict and build a Python dict based on it.
An example of those JSON dicts is this:
http://ajax.googleapis.com/ajax/services/search/web?v=2.0&q=python?cod=&start=2
So I need to get it and print all the "url" keys to the screen.
I have seen some things like this, but they did not work well. My code now is:
while (page < 50):
    page_src = urllib.urlopen(search_url + '?cod=&start=' + str(page)).read()
    json_src = json.loads(page_src)
    for item in json_src['responseData']:
        sub_item = json_src['responseData']['results']
        for link in sub_item:
            for key in link:
                if (key == u'"url"'):
                    print link[key]
    page = page + 1
But when it is executed I get:
TypeError: 'NoneType' object is not iterable
I don't know where I'm going wrong; please help me.
Thank you all.
TheV0iD
Check to make sure your URL is correct; the code worked for me. My only revision would be:
for item in json_src['responseData']['results']:
    print item['url']
Also, make sure your starting and ending values of page produce real URLs. You are getting the NoneType error because no 'responseData' was found.
Also, what is your value of search_url? Are you including the ?v=2.0&q=python in it? If you messed up your URL at all, your NoneType comes from trying to iterate through json_src['responseData']['results'], because there is no such thing.
EDIT:
The issue is that you reassign search_url in the loop. On the second iteration the URL becomes http://ajax.googleapis.com/ajax/services/search/web?v=2.0&q=python?cod=&start=0?cod=&start=1, with both suffixes appended. Simply change the search_url = to current_url =.
Final code:
print "\n\n RESULTS:"
while (page < 2):
    current_url = search_url + '?cod=&start=' + str(page)
    json_src = json.load(urllib.urlopen(current_url))
    print json_src
    results = json_src['responseData']['results']
    for result in results:
        print "\t" + result['url']
    page = page + 1

Is there a better way to format this Python/Django code as valid PEP8? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I have code written both ways and I see flaws in both of them. Is there another way to write this or is one approach more "correct" than the other?
def functionOne(subscriber):
    try:
        results = MyModelObject.objects.filter(
            project__id=1,
            status=MyModelObject.STATUS.accepted,
            subscriber=subscriber).values_list(
                'project_id',
                flat=True).order_by('-created_on')
    except:
        pass
def functionOne(subscriber):
    try:
        results = MyModelObject.objects.filter(
            project__id=1,
            status=MyModelObject.STATUS.accepted,
            subscriber=subscriber)
        results = results.values_list('project_id', flat=True)
        results = results.order_by('-created_on')
    except:
        pass
This is valid code, but it isn't real code: I adapted a similar chunk of code to give an example of the objects.filter section. Please don't spend time commenting on the other parts of the code. I put the try/except in there to force an indent, to push certain elements onto new lines (80 columns).
I would do this:
def functionOne(subscriber):
    try:
        results = MyModelObject.objects.filter(
            project__id=1,
            status=MyModelObject.STATUS.accepted,
            subscriber=subscriber,
        ).values_list(
            'project_id',
            flat=True,
        ).order_by(
            '-created_on',
        )
    except:
        pass
Use indentation to make the hierarchy more readable. However, this code isn't particularly nice. Using code like this directly in views should be considered an anti-pattern. Model Managers might be a better option for such recurring code.
You might want to read http://dabapps.com/blog/higher-level-query-api-django-orm/
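As a sketch of the manager idea: the query chain moves onto a custom manager, and the view shrinks to a single readable call. The manager and method names here (and the model fields) are assumptions that simply mirror the question's example:

from django.db import models

class MyModelObjectManager(models.Manager):
    def accepted_project_ids(self, subscriber, project_id=1):
        # Encapsulates the recurring filter/order_by/values_list chain
        return (self.filter(
                    project__id=project_id,
                    status=self.model.STATUS.accepted,
                    subscriber=subscriber)
                .order_by('-created_on')
                .values_list('project_id', flat=True))

class MyModelObject(models.Model):
    # ... project, status, subscriber, created_on fields as in the question ...
    objects = MyModelObjectManager()

# The view code then becomes:
# results = MyModelObject.objects.accepted_project_ids(subscriber)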
