I'm writing code that sends values one by one to a send function, which sends each one to me as a message. I have two lists of regex keywords, allowed and denied: titles matching an allowed pattern should pass, and titles containing a denied word should be dropped. I'm scraping elements from the web, filtering them with re.search against allowed (which works), and passing them to a second function where I filter them a second time, this time dropping any string that contains a denied word. Here is the problem: with only the allowed loop everything works, but when I add the second loop over denied, the string gets sent to my send function twice. How can I change the code to make it work?
Here is the code:
import re
import requests
from random import randint
from time import sleep
from bs4 import BeautifulSoup
from skpy import Skype  # Skype(), chats and sendMsg come from the SkPy package

allowed = ["pc", "FUJITSU", "LIFEBOOK", "win", "Windows",
           "PC", "Linux", "linux", "HP", "hp", "notebook", "desktop",
           "raspberry", "NEC", "mac", "Mac", "Core"]
denied = ["philips", "samsung"]
used = set()

source = requests.get("https://jmty.jp/aichi/sale-pcp").text
soup = BeautifulSoup(source, 'lxml')
def skype_login(user, password):
    sk = Skype(user, password)
    return sk


def send(sk, title, address, price, town, topic='Not'):
    for c in sk.chats.recent():
        chat = sk.chats[c]
        if hasattr(chat, 'topic') and chat.topic == topic:
            chat.sendMsg(f'Some string {title} \n {price} \n \n {town} \n \n URL: {address}')
            break
    sleep(1)
    chat.sendMsg("Additional Message")


def jimoti(sk):
    global used
    for h2 in soup.find_all('div', class_='p-item-content-info'):
        title = h2.select_one('.p-item-title').text
        address = h2.select_one('.p-item-title').a["href"]
        price = (h2.select_one('.p-item-most-important').text).replace("円", "").replace("\n", "").replace(",", "")
        price = int(price)
        town = h2.select_one('.p-item-supplementary-info').text
        if price < 2000 and title not in used:
            used.add(title)
            for pattern in allowed:
                print(pattern)
                if re.search(pattern, title):
                    second(sk, title, address, price, town)
                    break


def second(sk, title, address, price, town):
    sk = sk
    title = title
    address = address
    price = price
    town = town
    # for prh in denied:  # Here it makes the problem
    #     print(prh)
    #     if re.search(prh, title):
    #         break
    #     else:
    send(sk, title, address, price, town)


if __name__ == '__main__':
    sk = skype_login('username', 'pass')
    while True:
        jimoti(sk)
        sleep(randint(11, 20))
If I read your code correctly, you had this set up (pseudo-Python):
for bad in denied:
    if bad exists in string:
        break
    else:
        send message
This means that for every denied word that isn't found, you send the message. So if there are two denied words and the title contains neither, you'll send it twice.
You can easily fix this with a boolean flag:
def second(sk, title, address, price, town):
    # I'm not sure why you do this, it's 100% unnecessary
    # sk = sk
    # title = title
    # address = address
    # price = price
    # town = town
    is_ok = True
    for prh in denied:
        if re.search(prh, title):
            is_ok = False
            break
    if is_ok:
        send(sk, title, address, price, town)
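A more compact alternative (a minimal sketch with the same behavior, reusing the denied list and send function from the question) is to let any() do the loop; Python's for ... else would also work, but only with the else aligned to the for, not the if:
def second(sk, title, address, price, town):
    # Send only when no denied pattern matches the title.
    if not any(re.search(prh, title) for prh in denied):
        send(sk, title, address, price, town)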
I want to find the title, address, and price of some items in an online mall.
But sometimes the address is empty and my code breaks (below is only the Selenium part):
num = 1
while 1:
    try:
        title = browser.find_element_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/span').text
        datas_title.append(title)
        address = browser.find_element_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/div/p[2]').text
        datas_address.append(address)
        price = browser.find_element_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/p').text
        datas_price.append(price)
        print('crawling....num = '+str(num))
        num = num + 1
    except Exception as e:
        print("finish get data...")
        break

print(datas_title)
print(datas_address)
print(datas_price)
What should I do if the address is empty? Just ignore it and move on to the next item?
Use this so you can skip the entries with missing information:
num = 1
while 1:
    try:
        title = browser.find_element_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/span').text
        datas_title.append(title)
        address = browser.find_element_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/div/p[2]').text
        datas_address.append(address)
        price = browser.find_element_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/p').text
        datas_price.append(price)
        print('crawling....num = '+str(num))
        num = num + 1
    except:
        print("an error was encountered")
        num = num + 1  # skip this entry; note you still need a stop condition now that errors no longer break the loop
        continue

print(datas_title)
print(datas_address)
print(datas_price)
address = browser.find_elements_by_xpath('//*[@id="root"]/div[1]/section/article/div/div['+str(num)+']/div/div/a/div/p[2]')
if not address:
    address = "None"
else:
    address = address[0].text
datas_address.append(address)
You could use find_elements to check whether the result is empty and then proceed with either value. You can then encapsulate this into a function, pass it the XPath and the target list, and your code becomes repeatable.
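For example, a minimal sketch of such a helper (the function name is made up; it assumes the older Selenium find_elements_by_xpath API used in the question):
def grab_text(browser, xpath, target_list, default="None"):
    # find_elements returns an empty list instead of raising when nothing matches
    elements = browser.find_elements_by_xpath(xpath)
    text = elements[0].text if elements else default
    target_list.append(text)
    return text
Each of the three lookups inside the loop then becomes a single grab_text(browser, xpath, datas_address) style call.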
I think you need to first check that the web element returned isn't None, and then proceed with fetching its text.
You could write a function for it and catch that exception inside it.
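A rough sketch of that idea (the helper name is made up; it assumes Selenium's NoSuchElementException):
from selenium.common.exceptions import NoSuchElementException

def text_or_default(browser, xpath, default="None"):
    # Catch the failed lookup instead of letting it break the whole loop.
    try:
        return browser.find_element_by_xpath(xpath).text
    except NoSuchElementException:
        return default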
So I have this Python script that scrapes listings off a specific Craigslist URL the user constructs (location, max price, type of item, etc.). It goes to the URL, scrapes the listings' info (price, date posted, etc.) and returns three outputs. One is 'x' number of items around the average price (the user determines the number of items and the range around the average price, such as $100 off the average). Next are the 'x' closest listings based on the zip code the user provided in the beginning (the user also determines how many items are displayed based on proximity to the zip code). Lastly, the Craigslist URL is output to the user so they can visit the page and look at the items displayed to them earlier. The scraped data is stored in a data.json file and a data.csv file; the content is the same, just different formats. I would like to offload this data to a database every time a scrape is done, either Cloud Firestore or AWS DynamoDB, since I want to host this as a web app in the future.
What I want to do is allow the user to have multiple instances of the same script, all with unique Craigslist URLs, running at the same time. All of the code is the same; the only difference is the Craigslist URL each script scrapes.
I made a method that iterated through the creation of the attributes (location, max price, etc.) and returned a list of the URLs, but in my main I call the constructor and it needs all of those attributes, so I had to fish them back out of the URLs, which seemed over the top.
I then tried to have the loop in my main: the user determines how many URL links they want to make, and the completed links are appended to a list. Again I ran into the same problem.
class CraigslistScraper(object):

    # Constructor of the URL that is being scraped
    def __init__(self, location, postal_code, max_price, query, radius):
        self.location = location  # Location (i.e. city) being searched
        self.postal_code = postal_code  # Postal code of location being searched
        self.max_price = max_price  # Max price of the items that will be searched
        self.query = query  # Search for the type of items that will be searched
        self.radius = radius  # Radius of the area searched derived from the postal code given previously
        self.url = f"https://{location}.craigslist.org/search/sss?&max_price={max_price}&postal={postal_code}&query={query}&20card&search_distance={radius}"
        self.driver = webdriver.Chrome(r"C:\Program Files\chromedriver")  # Path of Chrome web driver
        self.delay = 7  # The delay the driver gives when loading the web page

    # Load up the web page
    # Gets all relevant data on the page
    # Goes to next page until we are at the last page
    def load_craigslist_url(self):
        data = []
        # url_list = []
        self.driver.get(self.url)
        while True:
            try:
                wait = WebDriverWait(self.driver, self.delay)
                wait.until(EC.presence_of_element_located((By.ID, "searchform")))
                data.append(self.extract_post_titles())
                # url_list.append(self.extract_post_urls())
                WebDriverWait(self.driver, 2).until(
                    EC.element_to_be_clickable((By.XPATH, '//*[@id="searchform"]/div[3]/div[3]/span[2]/a[3]'))).click()
            except:
                break
        return data
    # Extracts all relevant information from the web-page and returns them as individual lists
    def extract_post_titles(self):
        all_posts = self.driver.find_elements_by_class_name("result-row")
        dates_list = []
        titles_list = []
        prices_list = []
        distance_list = []
        for post in all_posts:
            title = post.text.split("$")
            if title[0] == '':
                title = title[1]
            else:
                title = title[0]
            title = title.split("\n")
            price = title[0]
            title = title[-1]
            title = title.split(" ")
            month = title[0]
            day = title[1]
            title = ' '.join(title[2:])
            date = month + " " + day
            if not price[:1].isdigit():
                price = "0"
            int(price)
            raw_distance = post.find_element_by_class_name(
                'maptag').text
            distance = raw_distance[:-2]
            titles_list.append(title)
            prices_list.append(price)
            dates_list.append(date)
            distance_list.append(distance)
        return titles_list, prices_list, dates_list, distance_list

    # Gets all of the url links of each listing on the page
    # def extract_post_urls(self):
    #     soup_list = []
    #     html_page = urllib.request.urlopen(self.driver.current_url)
    #     soup = BeautifulSoup(html_page, "html.parser")
    #     for link in soup.findAll("a", {"class": "result-title hdrlnk"}):
    #         soup_list.append(link["href"])
    #
    #     return soup_list

    # Kills browser
    def kill(self):
        self.driver.close()
    # Gets price value from dictionary and computes average
    @staticmethod
    def get_average(sample_dict):
        price = list(map(lambda x: x['Price'], sample_dict))
        sum_of_prices = sum(price)
        length_of_list = len(price)
        average = round(sum_of_prices / length_of_list)
        return average

    # Displays items around the average price of all the items in prices_list
    @staticmethod
    def get_items_around_average(avg, sample_dict, counter, give):
        print("Items around average price: ")
        print("-------------------------------------------")
        raw_list = []
        for z in range(len(sample_dict)):
            current_price = sample_dict[z].get('Price')
            if abs(current_price - avg) <= give:
                raw_list.append(sample_dict[z])
        final_list = raw_list[:counter]
        for index in range(len(final_list)):
            print('\n')
            for key in final_list[index]:
                print(key, ':', final_list[index][key])

    # Displays nearest items to the zip provided
    @staticmethod
    def get_items_around_zip(sample_dict, counter):
        final_list = []
        print('\n')
        print("Closest listings: ")
        print("-------------------------------------------")
        x = 0
        while x < counter:
            final_list.append(sample_dict[x])
            x += 1
        for index in range(len(final_list)):
            print('\n')
            for key in final_list[index]:
                print(key, ':', final_list[index][key])

    # Converts all_of_the_data list of dictionaries to json file
    @staticmethod
    def convert_to_json(sample_list):
        with open(r"C:\Users\diego\development\WebScraper\data.json", 'w') as file_out:
            file_out.write(json.dumps(sample_list, indent=4))

    @staticmethod
    def convert_to_csv(sample_list):
        df = pd.DataFrame(sample_list)
        df.to_csv("data.csv", index=False, header=True)
# Main where the big list data is broken down to its individual parts to be converted to a .csv file,
# and the parameters of the website are set
if __name__ == "__main__":
    location = input("Enter the location you would like to search: ")  # Location Craigslist searches
    zip_code = input(
        "Enter the zip code you would like to base radius off of: ")  # Postal code Craigslist uses as a base for 'MILES FROM ZIP'
    type_of_item = input(
        "Enter the item you would like to search (ex. furniture, bicycles, cars, etc.): ")  # Type of item you are looking for
    max_price = input(
        "Enter the max price you would like the search to use: ")  # Max price Craigslist limits the items to
    radius = input(
        "Enter the radius you would like the search to use (based off of zip code provided earlier): ")  # Radius from postal code Craigslist limits the search to

    scraper = CraigslistScraper(location, zip_code, max_price, type_of_item,
                                radius)  # Constructs the URL with the given parameters

    results = scraper.load_craigslist_url()  # Inserts the result of the scraping into a large multidimensional list

    titles_list = results[0][0]
    prices_list = list(map(int, results[0][1]))
    dates_list = results[0][2]
    distance_list = list(map(float, results[0][3]))

    scraper.kill()

    # Merge all of the lists into a dictionary
    # Dictionary is then sorted by distance from smallest -> largest
    list_of_attributes = []
    for i in range(len(titles_list)):
        content = {'Listing': titles_list[i], 'Price': prices_list[i], 'Date posted': dates_list[i],
                   'Distance from zip': distance_list[i]}
        list_of_attributes.append(content)
    list_of_attributes.sort(key=lambda x: x['Distance from zip'])

    scraper.convert_to_json(list_of_attributes)
    scraper.convert_to_csv(list_of_attributes)
    # scraper.export_to_mongodb()

    # Below function calls:
    # Get average price and prints it
    # Gets/prints listings around said average price
    # Gets/prints nearest listings
    average = scraper.get_average(list_of_attributes)
    print(f'Average price of items searched: ${average}')
    num_items_around_average = int(input("How many listings around the average price would you like to see?: "))
    avg_range = int(input("Range of listings around the average price: "))
    scraper.get_items_around_average(average, list_of_attributes, num_items_around_average, avg_range)
    print("\n")
    num_items = int(input("How many items would you like to display based off of proximity to zip code?: "))
    print(f"Items around you: ")
    scraper.get_items_around_zip(list_of_attributes, num_items)
    print("\n")
    print(f"Link of listings : {scraper.url}")
What I want is for the program to get the number of URLs the user wants to scrape. That input will determine the number of instances of this script that need to be running.
Then the user will run through the prompts for every scraper, such as making the URL ("What location would you like to search?: "). After they are done creating the URLs, each scraper will run with its specific URL and display back the three outputs described above, specific to the URL that scraper was assigned.
In the future I would like to add a timing function where the user determines how often the script should run (every hour, every day, every other day, etc.), connect to a database, and instead just query from it the 'x' listings around the average price range and the 'x' closest listings based on proximity for each specific URL's results.
If you want several instances of your scraper running in parallel while your main is running in a loop, you need to use subprocesses.
https://docs.python.org/3/library/subprocess.html
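As a rough sketch of that idea (assuming you move the scraper into a file such as scraper.py that takes the Craigslist URL as a command-line argument, which is not how the posted code is structured yet):
import subprocess

# Hypothetical entry point: scraper.py reads the target URL from sys.argv[1]
urls = [
    "https://newyork.craigslist.org/search/sss?query=bicycles&max_price=200",
    "https://boston.craigslist.org/search/sss?query=furniture&max_price=100",
]

# Launch one independent process per URL; they all run at the same time
processes = [subprocess.Popen(["python", "scraper.py", url]) for url in urls]

# Wait for every scraper to finish before the main script exits
for p in processes:
    p.wait()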
My code doesn't give the desired output when I use user input, but it works fine when I use a simple variable assignment.
I checked both the user input and the variable; both are of type str.
When I use the input, it gives the error below:
print("\nIPAbuse check for the IP Address: {} \nDatabase Check: \nConfidence of Abuse: \nISP: {} \nUsage: {} \nDomain Name: {} \nCountry: {} \nCity: {}".format(num,description1,description2,isp,usage,domain,country,city))
NameError: name 'description1' is not defined
# sys.stdout.write("Enter Source IP Address: ")
# sys.stdout.flush()
# ip = sys.stdin.readline()
ip = '212.165.108.173'
url = ""
num = str(ip)

req = requests.get(url + num)
html = req.text
soup = BeautifulSoup(html, 'html.parser')

try:
    div = soup.find('div', {"class": "well"})
    description1 = div.h3.text.strip()
    description2 = div.p.text.strip()
    isp = soup.find("th", text="ISP").find_next_sibling("td").text.strip()
    usage = soup.find("th", text="Usage Type").find_next_sibling("td").text.strip()
    domain = soup.find("th", text="Domain Name").find_next_sibling("td").text.strip()
    country = soup.find("th", text="Country").find_next_sibling("td").text.strip()
    city = soup.find("th", text="City").find_next_sibling("td").text.strip()
except:
    isp = 'Invalid'
    usage = 'Invalid'
    domain = 'Invalid'
    country = 'Invalid'
    city = 'Invalid'

print(
    "\nIPAbuse check for the IP Address: {} \nDatabase Check: \nConfidence of Abuse: \nISP: {} \nUsage: {} \nDomain Name: {} \nCountry: {} \nCity: {}".format(
        num, description1, description2, isp, usage, domain, country, city))
readline() adds a '\n' character to the input, so the value is different from a hardcoded assignment like ip = '212.165.108.173'. The newline char is messing up the request. As a quick patch, confirm that the last character of the user input is '\n' and make sure that character doesn't get into the URL for the request. On the other hand, I'd also suggest going for input() like someone said in the comments (if only because it does not add the '\n' at the end).
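A minimal sketch of both options, keeping the variable names from the question:
# strip the trailing '\n' from readline()
ip = sys.stdin.readline().strip()
# or let input() handle the prompt and the newline for you
ip = input("Enter Source IP Address: ").strip()
num = str(ip)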
How do I make the text clickable?
class ComplainceServer():
    def __init__(self, jira_server, username, password, encoding='utf-8'):
        if jira_server is None:
            error('No server provided.')
            #print(jira_server)
        self.jira_server = jira_server
        self.username = username
        self.password = password
        self.encoding = encoding

    def checkComplaince(self, appid, toAddress):
        query = "/rest/api/2/search?jql=issuetype = \"Application Security\" AND \"Prod Due Date\" < now()"
        request = self._createRequest()
        response = request.get(query, contentType='application/json')
        # Parse result
        if response.status == 200 and action == "warn":
            data = Json.loads(response.response)
            print "#### Issues found"
            issues = {}
            msg = "WARNING: The below tickets are non-complaint in fortify, please fix them or raise exception.\n"
            issue1 = data['issues'][0]['key']
            for item in data['issues']:
                issue = item['key']
                issues[issue] = item['fields']['summary']
                print u"* {0} - {1}".format(self._link(issue), item['fields']['summary'])
                print "\n"
                data = u" {0} - {1}".format(self._link(issue), item['fields']['summary'])
                msg += '\n' + data
            SOCKET_TIMEOUT = 30000  # 30s
            email = SimpleEmail()
            email.setHostName('smtp.com')
            email.setSmtpPort(25)
            email.setSocketConnectionTimeout(SOCKET_TIMEOUT)
            email.setSocketTimeout(SOCKET_TIMEOUT)
            email.setFrom('R@group.com')
            for toAddress in toAddress.split(','):
                email.addTo(toAddress)
            email.setSubject('complaince report')
            email.addHeader('X-Priority', '1')
            email.setMsg(str(msg))
            email.send()

    def _createRequest(self):
        return HttpRequest(self.jira_server, self.username, self.password)

    def _link(self, issue):
        return '[{0}]({1}/browse/{0})'.format(issue, self.jira_server['url'])
This is the calling code. appid and toAddress will be passed in from a different UI.
from Complaince import ComplainceServer
jira = ComplainceServer(jiraServer, username, password)
issues = jira.checkComplaince(appid, toAddress)
I want issueid to be an embedded link.
currently the email sends as below:
MT-4353(https://check.com/login/browse/MT-4353) - Site Sc: DM isg_cq5
but I want [MT-4353] to be a hyperlink to the URL https://check.com/login/browse/MT-4353
Firstly, you need to encode your email as HTML. I'm not familiar with the library you are using, so I cannot give an example of this.
I have replaced a snippet of your code with HTML syntax just to illustrate the point that you are meant to use HTML syntax to get clickable links in an email.
msg = "<p>WARNING: The below tickets are non-compliant in fortify, please fix them or raise exception.</p>"
issue1 = data['issues'][0]['key']
for item in data['issues']:
    issue = item['key']
    issues[issue] = item['fields']['summary']
    data = u"<a href='{0}'>{1}</a>".format(self._link(issue), item['fields']['summary'])
    msg += '<br />' + data
In future, please word your questions carefully, as your title does not indicate what you are actually asking. You also have spelling mistakes: Compliant.
Oh, I missed the point of self._link(issue) not returning the correct link. It returns MT-4353(https://check.com/login/browse/MT-4353) so you are going to need to extract the link part between the brackets. I suggest a regular expression.
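A rough sketch of that extraction (the helper name is made up; it assumes the value really does look like MT-4353(https://check.com/login/browse/MT-4353)):
import re

def extract_url(link_text):
    # Pull the URL out of the parenthesised part, e.g. "MT-4353(https://...)"
    match = re.search(r'\((https?://[^)]+)\)', link_text)
    return match.group(1) if match else link_text
Inside checkComplaince the anchor could then be built as u"<a href='{0}'>{1}</a>".format(extract_url(self._link(issue)), issue).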
The function search_for_song(pbody) is running twice and I can't figure out why.
I would like some help; I just started learning Python a few days ago.
Here's the full code:
#a bot that replies with youtube songs that were mentioned in the comments
import traceback
import praw
import time
import sqlite3
import requests
from lxml import html
import socket
import errno
import re
import urllib
from bs4 import BeautifulSoup
import sys
import urllib2
'''USER CONFIGURATION'''
APP_ID = ""
APP_SECRET = ""
APP_URI = ""
APP_REFRESH = ""
# https://www.reddit.com/comments/3cm1p8/how_to_make_your_bot_use_oauth2/
USERAGENT = "Python automatic youtube linkerbot"
# This is a short description of what the bot does.
# For example "Python automatic replybot v2.0 (by /u/GoldenSights)"
SUBREDDIT = "kqly"
# This is the sub or list of subs to scan for new posts. For a single sub, use "sub1". For multiple subreddits, use "sub1+sub2+sub3+..."
DO_SUBMISSIONS = False
DO_COMMENTS = True
# Look for submissions, comments, or both.
KEYWORDS = ["linksong"]
# These are the words you are looking for
KEYAUTHORS = []
# These are the names of the authors you are looking for
# The bot will only reply to authors on this list
# Keep it empty to allow anybody.
#REPLYSTRING = "**Hi, I'm a bot.**"
# This is the word you want to put in reply
MAXPOSTS = 100
# This is how many posts you want to retrieve all at once. PRAW can download 100 at a time.
WAIT = 30
# This is how many seconds you will wait between cycles. The bot is completely inactive during this time.
CLEANCYCLES = 10
# After this many cycles, the bot will clean its database
# Keeping only the latest (2*MAXPOSTS) items
'''All done!'''
try:
    import bot
    USERAGENT = bot.aG
except ImportError:
    pass
print('Opening SQL Database')
sql = sqlite3.connect('sql.db')
cur = sql.cursor()
cur.execute('CREATE TABLE IF NOT EXISTS oldposts(id TEXT)')
print('Logging in...')
r = praw.Reddit(USERAGENT)
r.set_oauth_app_info(APP_ID, APP_SECRET, APP_URI)
r.refresh_access_information(APP_REFRESH)
def replybot():
    print('Searching %s.' % SUBREDDIT)
    subreddit = r.get_subreddit(SUBREDDIT)
    posts = []
    if DO_SUBMISSIONS:
        posts += list(subreddit.get_new(limit=MAXPOSTS))
    if DO_COMMENTS:
        posts += list(subreddit.get_comments(limit=MAXPOSTS))
    posts.reverse()
    for post in posts:
        #print ("Searching for another the next comment")
        # Anything that needs to happen every loop goes here.
        pid = post.id
        try:
            pauthor = post.author.name
        except AttributeError:
            # Author is deleted. We don't care about this post.
            continue
        if pauthor.lower() == r.user.name.lower():
            # Don't reply to yourself, robot!
            print('Will not reply to myself.')
            continue
        if KEYAUTHORS != [] and all(auth.lower() != pauthor for auth in KEYAUTHORS):
            # This post was not made by a keyauthor
            continue
        cur.execute('SELECT * FROM oldposts WHERE ID=?', [pid])
        if cur.fetchone():
            # Post is already in the database
            continue
        if isinstance(post, praw.objects.Comment):
            pbody = post.body
        else:
            pbody = '%s %s' % (post.title, post.selftext)
        pbody = pbody.lower()
        if not any(key.lower() in pbody for key in KEYWORDS):
            # Does not contain our keyword
            continue
        cur.execute('INSERT INTO oldposts VALUES(?)', [pid])
        sql.commit()
        print('Replying to %s by %s' % (pid, pauthor))
        try:
            if search_for_song(pbody):
                # pbody=pbody[8:]
                # pbody=pbody.replace("\n", "")
                temp = pbody[8:].lstrip()
                post.reply("[**"+temp+"**]("+search_for_song(pbody)+") \n ---- \n ^^This ^^is ^^an ^^automated ^^message ^^by ^^a ^^bot, ^^if ^^you ^^found ^^any ^^bug ^^and/or ^^willing ^^to ^^contact ^^me. [**^^Press ^^here**](https://www.reddit.com/message/compose?to=itailitai)")
        except praw.errors.Forbidden:
            print('403 FORBIDDEN - is the bot banned from %s?' % post.subreddit.display_name)
def search_for_song(pbody):
    #print("in search_for_song")
    song = pbody
    if len(song) > 8:
        song = song[8:]
    if song.isspace() == True or song == '':
        return False
    else:
        print("Search if %s exists in the database" % song)
        #HEADERS = {'User-Agent': 'Song checker - check if songs exists by searching this website, part of a bot for reddit'}
        author, song_name = song_string_generator(song)
        url = 'http://www.songlyrics.com/'+author+'/'+song_name+'-lyrics/'
        print url
        #page = requests.get(url, HEADERS)
        check = 1
        while check == 1:
            try:
                headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:40.0) Gecko/20100101 Firefox/40.0'}
                req = urllib2.Request(url, None, headers)
                page = urllib2.urlopen(req)
                check = 2
            except socket.error as error:
                pass
            except Exception:
                print('An error occurred while trying to verify song existence')
                return False
        soup = BeautifulSoup(page.read(), "lxml")
        if "Please check the spelling and try again" not in soup.get_text():
            print ("Song was found in the database!")
            result = first_youtube(song)
            return result
        else:
            print ("Song was not found in the database!")
            return False
def song_string_generator(song):
    #print("in song_string_generator")
    song = song
    author, song_name = '', ''
    try:
        if "-" in song:
            l = song.split('-', 1)
            print ("2 ", l)
            author = l[0]
            song_name = l[1]
        elif "by" in song:
            l = song.split('by', 1)
            print ("2 ", l)
            author = l[1]
            song_name = l[0]
        song_name = " ".join(song_name.split())
        author = " ".join(author.split())
        print (author, song_name)
        if author == 'guns and roses':
            author = "guns n' roses"
        song_name = song_name.replace("\n", "")
        author = author.replace("\n", "")
        author = author.replace(" ", "-")
        song_name = song_name.replace(" ", "-")
        author = author.replace("'", "-")
        song_name = song_name.replace("'", "-")
        song_name = song_name.rstrip()
        song_name = " ".join(song_name.split())
        return author, song_name
    except:
        print ("No song was mentioned in the comment!")
        return False
def first_youtube(textToSearch):
    reload(sys)
    sys.setdefaultencoding('UTF-8')
    query_string = textToSearch
    try:
        html_content = urllib.urlopen("http://www.youtube.com/results?search_query=" + query_string)
        search_results = re.findall(r'href=\"\/watch\?v=(.{11})', html_content.read().decode())
        result = "http://www.youtube.com/watch?v=" + search_results[0]
        return result
    except IOError:
        print ("IOError Occured while contacting Youtube!")
    except Exception:
        print ("A non IOError Occured while contacting Youtube!")
    return False
cycles = 0
while True:
    try:
        replybot()
        cycles += 1
    except Exception as e:
        traceback.print_exc()
    if cycles >= CLEANCYCLES:
        print('Cleaning database')
        cur.execute('DELETE FROM oldposts WHERE id NOT IN (SELECT id FROM oldposts ORDER BY id DESC LIMIT ?)', [MAXPOSTS * 2])
        sql.commit()
        cycles = 0
    print('Running again in %d seconds \n' % WAIT)
    time.sleep(WAIT)
This is the output I'm getting:
Opening SQL Database
Logging in...
Searching kqly.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Will not reply to myself.
Replying to d0kwcrs by itailitai
Search if guns and roses - paradise city exists in the database
('2 ', [u' guns and roses ', u' paradise city'])
(u'guns and roses', u'paradise city')
http://www.songlyrics.com/guns-n--roses/paradise-city-lyrics/
Song was found in the database!
Search if guns and roses - paradise city exists in the database
('2 ', [u' guns and roses ', u' paradise city'])
(u'guns and roses', u'paradise city')
http://www.songlyrics.com/guns-n--roses/paradise-city-lyrics/
Song was found in the database!
Running again in 30 seconds
It's a bot for Reddit that replies with the YouTube video of a song that was mentioned in the comments, if anyone wants to know.
With a cursory reading of your code, you have:
if search_for_song(pbody):
    # do stuff..
    post.reply("[**"+temp+"**]("+search_for_song(pbody)+") \n ---- \n ^^This ^^is ^^an ^^automated ^^message ^^by ^^a ^^bot, ^^if ^^you ^^found ^^any ^^bug ^^and/or ^^willing ^^to ^^contact ^^me. [**^^Press ^^here**](https://www.reddit.com/message/compose?to=itailitai)")
You call the function at the start of the if and again in your post.reply line.
RESPONDING TO COMMENTS
If you need to check the result but don't want to call the function twice, simply save the output:
res = search_for_song(pbody)
if res:
    #...
    post.reply(... + res + ...)
I've just quickly searched for the function call search_for_song, and I suppose the following piece of code is resulting in two function calls:
if search_for_song(pbody):
    # pbody=pbody[8:]
    # pbody=pbody.replace("\n", "")
    temp = pbody[8:].lstrip()
    post.reply("[**"+temp+"**]("+search_for_song(pbody)+")
Once at the if statement, and once inside the post.reply statement.
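Putting both answers together, the reply block in replybot() would look something like this (a sketch that only removes the duplicate call; everything else stays as in the question):
result = search_for_song(pbody)  # call the function once and reuse the value
if result:
    temp = pbody[8:].lstrip()
    post.reply("[**" + temp + "**](" + result + ") \n ---- \n ^^This ^^is ^^an ^^automated ^^message ^^by ^^a ^^bot, ^^if ^^you ^^found ^^any ^^bug ^^and/or ^^willing ^^to ^^contact ^^me. [**^^Press ^^here**](https://www.reddit.com/message/compose?to=itailitai)")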