How to fix user input when a plain variable assignment works fine - python

My code doesn't give the desired output when I use user input, but it works fine with a simple variable assignment.
I checked both the user input and the variable; both are of type str.
When I use the input, it gives the error below:
print("\nIPAbuse check for the IP Address: {} \nDatabase Check: \nConfidence of Abuse: \nISP: {} \nUsage: {} \nDomain Name: {} \nCountry: {} \nCity: {}".format(num, description1, description2, isp, usage, domain, country, city))
NameError: name 'description1' is not defined
import sys

import requests
from bs4 import BeautifulSoup

# sys.stdout.write("Enter Source IP Address: ")
# sys.stdout.flush()
# ip = sys.stdin.readline()
ip = '212.165.108.173'
url = ""
num = str(ip)
req = requests.get(url + num)
html = req.text
soup = BeautifulSoup(html, 'html.parser')
try:
    div = soup.find('div', {"class": "well"})
    description1 = div.h3.text.strip()
    description2 = div.p.text.strip()
    isp = soup.find("th", text="ISP").find_next_sibling("td").text.strip()
    usage = soup.find("th", text="Usage Type").find_next_sibling("td").text.strip()
    domain = soup.find("th", text="Domain Name").find_next_sibling("td").text.strip()
    country = soup.find("th", text="Country").find_next_sibling("td").text.strip()
    city = soup.find("th", text="City").find_next_sibling("td").text.strip()
except:
    isp = 'Invalid'
    usage = 'Invalid'
    domain = 'Invalid'
    country = 'Invalid'
    city = 'Invalid'
print(num, description1, description2, isp, usage, domain, country, city)

readline() keeps the trailing '\n' character, so the value is different from a hardcoded assignment like ip = '212.165.108.173'. That newline ends up in the request URL and breaks the request. As a quick patch, confirm that the last character of the user input is '\n' and make sure it doesn't get into the URL for the request. I'd also suggest switching to input(), as someone said in the comments, if only because it doesn't add the '\n' at the end.
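As a sketch of that patch (the `read_ip` helper is illustrative, not from the original code):

```python
import io
import sys

def read_ip(stream=sys.stdin):
    """Read one line and drop the trailing newline that readline() keeps."""
    return stream.readline().strip()

# Simulate typed input ending with Enter:
fake = io.StringIO('212.165.108.173\n')
ip = read_ip(fake)
assert ip == '212.165.108.173'  # now identical to the hardcoded assignment
```

With the newline stripped, `url + num` no longer contains a stray `\n`, so the request behaves the same as with the literal assignment.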

Related

How to extract a URL's domain name python (no imports)

Currently doing a school mini-project where we have to make a program that extracts the domain name from a few given URLs, and puts those which end in .uk (i.e. websites from the United Kingdom) in a list.
A couple of specifications:
We cannot import any modules or anything.
We can ignore URLs that don't start with either "http://" or "https://"
I was originally just going to do:
uksites = []
file = open('urlfile.txt', 'r')
urllist = file.read().splitlines()
for url in urllist:
    if "http://" in url:
        domainstart = url.find("http://") + len("http://")
    elif "https://" in url:
        domainstart = url.find("https://") + len("https://")
    domainend = url.find("/", domainstart)
    if domainend >= 0:
        domain = url[domainstart:domainend]
    else:
        domain = url[domainstart:]
    if domain[-3:] == ".uk":
        uksites.append(url)
But then our professor warned us that not all domain names will end with a "/" (for example, one of the given ones in the test file we were supplied ends with ":").
Is this the only other valid character that can signify the end of a domain name, or are there even more?
If so, how could I approach this?
The test file is pretty short, it contains a few "links" (some aren't real sites apparently):
http://google.com.au/
https://google.co.uk/
https://00.auga.com/shop/angel-haired-dress/
http://applesandoranges.co.uk:articles/best-seasonal-fruits-for-your-garden/
https://www.youtube.com/watch?v=GVbG35DeMto
http://docs.oracle.com/en/java/javase/11/docs/api/
https://www.instagram.co.uk/posts/hjgjh42324/
This should work
def compute():
    ret = []
    hs = ['https:', 'http:']
    domains = ['uk']
    file = open('urlfile.txt', 'r')
    urllist = file.read().splitlines()
    for url in urllist:
        url_t = url.split('/')
        # \setminus {http*}
        if url_t[0] in hs:
            url_t = (url_t[2]).split('.')
        else:
            url_t = (url_t[0]).split('.')
        # get domain
        if ':' in url_t[-1]:
            url_t = (url_t[-1].split(':'))[0]
        else:
            url_t = url_t[-1]
        # verify it
        if url_t in domains:
            ret.append(url)
    return ret

if __name__ == '__main__':
    print(compute())
Analyze the structure of the URL first.
A URL contains a few characters that cannot appear arbitrarily, e.g. the dot ".".
The last dot therefore marks the boundary just before the country code.
The easiest solution could be to split the URL at the last dot with .rsplit(".", 1), take the first two letters after the split and re-attach them to the first part. I chose a different approach and checked the second part of the split for alphanumeric characters: after the country code there is always a special (non-alphanumeric) character, so this allows an additional split of the second part.
s = '''http://google.com.au/
https://google.co.uk/
https://00.auga.com/shop/angel-haired-dress/
http://applesandoranges.co.uk:articles/best-seasonal-fruits-for-your-garden/
https://www.youtube.com/watch?v=GVbG35DeMto
http://docs.oracle.com/en/java/javase/11/docs/api/
https://www.instagram.co.uk/posts/hjgjh42324/'''

s_splitted = s.split("\n")
for raw_url in s_splitted:
    a, b = raw_url.rsplit(".", 1)
    c = "".join([x if x.isalnum() else " " for x in b]).split(" ", 1)[0]
    url = a + "." + c
    if url.endswith(".uk"):
        print(url)
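To answer the terminator question more generally: per the generic URI syntax, the host portion of a URL ends at the first '/', ':', '?' or '#'. A minimal no-imports sketch along those lines (the `extract_domain` helper is illustrative, not from either answer):

```python
def extract_domain(url):
    """Return the host part of a URL without importing any modules."""
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            rest = url[len(prefix):]
            break
    else:
        return None  # ignore URLs without a recognized scheme

    end = len(rest)
    for i, ch in enumerate(rest):
        if ch in "/:?#":  # characters that terminate the host
            end = i
            break
    return rest[:end]

assert extract_domain("http://applesandoranges.co.uk:articles/x") == "applesandoranges.co.uk"
assert extract_domain("https://google.co.uk/") == "google.co.uk"
```

Cutting at ':' also handles port numbers (e.g. `http://host:8080/`), which terminate the host the same way.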

Double regex search loop makes problems

I'm writing code that sends values (one by one) to a send function, which sends each one as a message to me. I have two lists of regex keywords, allowed and denied: patterns that are allowed by re.search and patterns that are denied. I scrape elements from the web, filter them with the allowed patterns (which works), and pass them to a second function where I filter them a second time, this time dropping strings that contain denied words. Here is the problem: with a single loop over allowed it works fine, but when I add the loop over denied, the string gets sent twice to my send function. How could I change the code to make it work?
Here is the code:
import re
from random import randint
from time import sleep

import requests
from bs4 import BeautifulSoup
from skpy import Skype

allowed = ["pc", "FUJITSU", "LIFEBOOK", "win", "Windows",
           "PC", "Linux", "linux", "HP", "hp", "notebook", "desktop",
           "raspberry", "NEC", "mac", "Mac", "Core"]
denied = ["philips", "samsung"]
used = set()
source = requests.get("https://jmty.jp/aichi/sale-pcp").text
soup = BeautifulSoup(source, 'lxml')

def skype_login(user, password):
    sk = Skype(user, password)
    return sk

def send(sk, title, address, price, town, topic='Not'):
    for c in sk.chats.recent():
        chat = sk.chats[c]
        if hasattr(chat, 'topic') and chat.topic == topic:
            chat.sendMsg(f'Some string {title} \n {price} \n \n {town} \n \n URL: {address}')
            break
    sleep(1)
    chat.sendMsg("Additional Message")

def jimoti(sk):
    global used
    for h2 in soup.find_all('div', class_='p-item-content-info'):
        title = h2.select_one('.p-item-title').text
        address = h2.select_one('.p-item-title').a["href"]
        price = (h2.select_one('.p-item-most-important').text).replace("円", "").replace("\n", "").replace(",", "")
        price = int(price)
        town = h2.select_one('.p-item-supplementary-info').text
        if price < 2000 and title not in used:
            used.add(title)
            for pattern in allowed:
                print(pattern)
                if re.search(pattern, title):
                    second(sk, title, address, price, town)
                    break

def second(sk, title, address, price, town):
    # for prh in denied:  # Here it makes the problem
    #     print(prh)
    #     if re.search(prh, title):
    #         break
    #     else:
    send(sk, title, address, price, town)

if __name__ == '__main__':
    sk = skype_login('username', 'pass')
    while True:
        jimoti(sk)
        sleep(randint(11, 20))
If I read your code correctly you had this set up (pseudo-python):
for bad in denied:
    if bad exists in string:
        break
    else:
        send message
This means that for every denied word that isn't found, you send the message. So if there are two bad words and it doesn't contain either, then you'll send it twice.
You can easily fix this by just using a bool:
def second(sk, title, address, price, town):
    # I'm not sure why you do this, it's 100% unnecessary
    # sk = sk
    # title = title
    # address = address
    # price = price
    # town = town
    is_ok = True
    for prh in denied:
        if re.search(prh, title):
            is_ok = False
            break
    if is_ok:
        send(sk, title, address, price, town)
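The flag pattern can also be collapsed with the built-in `any()`; here is a self-contained sketch of just the filtering step (the `is_ok` helper is illustrative, not from the original code):

```python
import re

denied = ["philips", "samsung"]

def is_ok(title):
    """True when no denied pattern matches the title."""
    return not any(re.search(prh, title) for prh in denied)

assert is_ok("FUJITSU LIFEBOOK notebook")
assert not is_ok("samsung desktop")
```

`any()` stops at the first match, so it behaves exactly like the loop with `break`.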

How to extract a Data from Riot Games API?

I am still a beginner and have just started with Python.
I'm trying to get the tier and rank of a player with the Riot Games API (EUW only) via JSON, but I get an exception:
print (responseJSON2[ID][0]['tier'])
TypeError: list indices must be integers or slices, not str
I don't know what I have to change; maybe someone can help me :)
Code:
import requests

def requestSummonerData(summonerName, APIKey):
    URL = "https://euw1.api.riotgames.com/lol/summoner/v3/summoners/by-name/" + summonerName + "?api_key=" + APIKey
    print(URL)
    response = requests.get(URL)
    return response.json()

def requestRankedData(ID, APIKey):
    URL = "https://euw1.api.riotgames.com/lol/league/v3/positions/by-summoner/" + ID + "?api_key=" + APIKey
    print(URL)
    response = requests.get(URL)
    return response.json()

def main():
    summonerName = str(input('Type your Summoner Name here: '))
    APIKey = str(input('Copy and paste your API Key here: '))
    responseJSON = requestSummonerData(summonerName, APIKey)
    print(responseJSON)
    ID = responseJSON['id']
    ID = str(ID)
    print(ID)
    responseJSON2 = requestRankedData(ID, APIKey)
    print(responseJSON2[ID][0]['tier'])
    print(responseJSON2[ID][0]['entries'][0]['rank'])
    print(responseJSON2[ID][0]['entries'][0]['leaguePoints'])

if __name__ == "__main__":
    main()
responseJSON2 is a list. A list has integer indexes (0, 1, 2, ...), so you can't index it with a string.
The line ID = str(ID) is the problem: a list index has to be an int, so try
ID = int(ID)
and convert back to a string only where the URL is built:
def requestRankedData(ID, APIKey):
    URL = "https://euw1.api.riotgames.com/lol/league/v3/positions/by-summoner/" + str(ID) + "?api_key=" + APIKey
    print(URL)
    response = requests.get(URL)
    return response.json()
You need to find the index matching your ID in your response:
responseJSON2 = requestRankedData(ID, APIKey)
ID_idx = responseJSON2.index(str(ID))
print(responseJSON2[ID_idx][0]['tier'])
print(responseJSON2[ID_idx][0]['entries'][0]['rank'])
print(responseJSON2[ID_idx][0]['entries'][0]['leaguePoints'])
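The error itself is plain Python semantics, independent of the API: a list accepts only integer indices or slices, while string keys belong to dicts. A minimal illustration (the data here is made up):

```python
entries = [{'tier': 'GOLD'}]

# An integer index works:
assert entries[0]['tier'] == 'GOLD'

# A string index raises exactly the reported error:
try:
    entries['3657281']
except TypeError as e:
    assert 'list indices must be integers' in str(e)
```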
Here is my code:
from riotwatcher import LolWatcher

region = str(input("Your region : "))  # If you only need EUW, just do region = "euw1"
summonerName = str(input("Your summoner name : "))  # Ask for the user's summoner name
watcher = LolWatcher(api_key="your_api_key")
summoner = watcher.summoner.by_name(region=region, summoner_name=summonerName)  # Account info; print(summoner) to see what it gives
rank = watcher.league.by_summoner(region=region, encrypted_summoner_id=summoner["id"])  # User's ranks, using the id from summoner
tier = rank[0]["tier"]
ranklol = rank[0]["rank"]
lp = rank[0]["leaguePoints"]
print(f"{tier} {ranklol} {lp} LP")
It should be OK. I don't know why you're working with raw URLs; use the API wrapper's features, it's much easier. Hope I helped you.

Printing the output from a loop under two different titles

I want to format the output so it is listed under a title.
I made a Python (v3.6) script which checks URLs (contained in a text file) and outputs whether each one is safe or malicious.
Loop statement:
"""
This iterates through each url in the textfile and checks it against google's database.
"""
f = open("file.txt", "r")
for weburl in f:
    sb = threat_matches_find(weburl)  # Google API module
    if sb == {}:  # an empty dict means the URL is safe
        print("Safe :", weburl)
    else:
        print("Malicious :", weburl)
The resulting output from this is:
>>> python url_checker.py
Safe : url1.com
Malicious : url2.com
Safe : url3.com
Malicious : url4.com
Safe : url5.com
The objective is to get the url to be listed/sorted under a Title (group), as follows:
If the url is safe, print the url under 'Safe URL', else 'Malicious'.
>>> python url_checker.py
Safe URLs:
url1.com
url3.com
url5.com
Malicious URLs:
url2.com
url4.com
I was unsuccessful in finding other posts related to my problem. Any help would be much appreciated.
You could append to lists as you loop and then print once both lists are populated:
safe = []
malicious = []
for weburl in f:
    sb = threat_matches_find(weburl)  # Google API module
    if sb == {}:  # an empty dict means the URL is safe
        safe.append(weburl)
    else:
        malicious.append(weburl)

print('Safe URLs', *safe, '', sep='\n')
print('Malicious URLs', *malicious, '', sep='\n')
Sample Output:
safe = ['url1.com','url3.com','url5.com']
malicious = ['url2.com','url4.com']
Safe URLs
url1.com
url3.com
url5.com
Malicious URLs
url2.com
url4.com
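If more categories might appear later, the same grouping can be done with a dict of lists. A self-contained sketch, with hardcoded verdicts standing in for the real threat_matches_find() results:

```python
results = {'Safe URLs': [], 'Malicious URLs': []}

# checks maps each URL to its verdict; in the real script each verdict
# would come from threat_matches_find(). Hardcoded here for illustration.
checks = {'url1.com': {}, 'url2.com': {'matches': ['x']},
          'url3.com': {}, 'url4.com': {'matches': ['x']}, 'url5.com': {}}

for weburl, verdict in checks.items():
    key = 'Safe URLs' if verdict == {} else 'Malicious URLs'
    results[key].append(weburl)

for title, urls in results.items():
    print(title)
    print(*urls, sep='\n')
    print()
```

Adding a third category then only requires a new key, not a new list variable and a new print call.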

Python - print 2nd argument

I am new to Python and I've written this test code for practice, in order to find and print email addresses from various web pages:
import re
import urllib2

def FindEmails(*urls):
    for i in urls:
        totalemails = []
        req = urllib2.Request(i)
        aResp = urllib2.urlopen(req)
        webpage = aResp.read()
        patt1 = r'(\w+[-\w]\w+@\w+[.]\w+[.\w+]\w+)'
        patt2 = r'(\w+[\w]\w+@\w+[.]\w+)'
        regexlist = [patt1, patt2]
        for regex in regexlist:
            match = re.search(regex, webpage)
            if match:
                totalemails.append(match.group())
                break
    # return totalemails
    print "Mails from webpages are: %s " % totalemails

if __name__ == "__main__":
    FindEmails('https://www.urltest1.com', 'https://www.urltest2.com')
When I run it, it prints only one argument.
My goal is to print the emails acquired from webpages and store them in a list, separated by commas.
Thanks in advance.
The problem here is the line totalemails = []. You are re-creating the variable totalemails as an empty list on every iteration, so it only ever holds the entry added in that iteration; after the last iteration you end up with just the last entry. To get a list of all emails, move the assignment outside of the for loop.
Example:
def FindEmails(*urls):
    totalemails = []
    for i in urls:
        req = urllib2.Request(i)
        ....
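A self-contained sketch of the fixed shape, using Python 3 and re.findall on in-memory strings instead of urllib2 so it runs standalone (the regex here is a simplified illustration, not the original patterns):

```python
import re

# Simplified email pattern, for illustration only
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+')

def find_emails(*pages):
    totalemails = []          # created once, outside the loop
    for webpage in pages:
        totalemails.extend(EMAIL_RE.findall(webpage))
    return totalemails

pages = ["contact: alice@example.com", "sales: bob@mail.example.org"]
assert find_emails(*pages) == ["alice@example.com", "bob@mail.example.org"]
```

Because the list is created before the loop and extended inside it, every page's matches accumulate instead of being overwritten.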
