imgfound=False
imgexists=0
img_ext=['.jpg','.jpeg','.png','.gif']
while True:
    httpfind=html.find('http',imgexists)
    if httpfind==-1:
        break
    imgexists=httpfind
    imgexist=html.find('"',imgexists)
    imgurl=html[imgexists:imgexist]
    imgexists+=len(imgurl)
    for extscan in img_ext:
        if not imgurl.find(extscan)==-1:
            imgfound=True
            break
    #print imgfound
    if imgfound==False:
        continue
    print imgurl
I want to find links to images in an HTML document, but something is not working as it should: it prints all links regardless of whether they contain one of the img_ext substrings. I printed the value of imgfound, and it is True for all the links. Where have I gone wrong?
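The expression
not imgurl.find(extscan) == -1
is not the problem by itself: in Python, not binds more loosely than ==, so it is parsed as not (imgurl.find(extscan) == -1), which is True exactly when the extension is found. The flag stays True because imgfound is set to False only once, before the while loop. After the first image link sets it to True, nothing resets it, so every later link is printed as well.
How can you fix it?
Set imgfound=False at the top of each pass through the while loop. For readability, you can also change the test to
imgurl.find(extscan) != -1
Or, equivalently, to
not(imgurl.find(extscan) == -1)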
Christian's answer is correct, but it's worth noting that comparing find() against -1 is not good Python style. The preferred form for "the extension occurs in the URL" is:
if extscan in imgurl:
Your version looks like a Java-ism.
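As a rough sketch of how the pieces above fit together, the loop can be written in Python 3 with the in operator and any(), so no flag carries over from one link to the next. The function name find_image_urls and the sample HTML are made up for illustration, and the code keeps the question's assumption that each URL ends at the next double quote.

```python
IMG_EXT = ('.jpg', '.jpeg', '.png', '.gif')

def find_image_urls(html):
    """Return each double-quoted http(s) URL in html that contains an image extension."""
    urls = []
    pos = 0
    while True:
        start = html.find('http', pos)
        if start == -1:
            break
        end = html.find('"', start)
        if end == -1:
            # No closing quote: treat the rest of the text as the URL
            end = len(html)
        url = html[start:end]
        pos = end  # continue searching after this URL
        # any() is evaluated afresh for every URL, so nothing leaks between iterations
        if any(ext in url for ext in IMG_EXT):
            urls.append(url)
    return urls

# Hypothetical sample input
sample = '<a href="http://example.com/page.html">x</a><img src="http://example.com/cat.png">'
print(find_image_urls(sample))
```

For anything beyond a quick script, an HTML parser is likely a safer choice than string searching, since this sketch also matches URLs that merely contain an extension somewhere in the middle.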
I'm looking to see if this is the most Pythonic way to compare a string variable passed in as an argument in Python 3. My testing shows that this works; however, I was confused as to why or would not work and and will. This is just a demo; the tag variable is set from the command line. When I test with centos6, centos7, and centos8, I hit the else and it works as expected. Is this the best way to do this, or is it wrong?
tag = 'centos6'
if tag != 'centos6' and tag != 'centos7' \
        and tag != 'centos8':
    print('[--os %s] must be [--os centos6] or '
          '[--os centos7] or [--os centos8]' % tag)
    print('fail')
else:
    print('good')
With or, the condition is True whenever the tag differs from at least one of the centos values, and since a tag cannot equal all three at once, that is always the case. With and, the condition is True only if the tag differs from every value. This is simpler to write as:
options = ['centos6', 'centos7', 'centos8']
tag = 'centos6'
if tag not in options:
    ...
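To make the difference concrete, here is a small sketch comparing the three ways of writing the check. The function names are hypothetical and exist only for this comparison.

```python
OPTIONS = ['centos6', 'centos7', 'centos8']

def rejected_with_or(tag):
    # Always True: no tag can equal all three values at once
    return tag != 'centos6' or tag != 'centos7' or tag != 'centos8'

def rejected_with_and(tag):
    # True only when the tag differs from every allowed value
    return tag != 'centos6' and tag != 'centos7' and tag != 'centos8'

def rejected_with_in(tag):
    # Same result as the "and" chain, but easier to read and extend
    return tag not in OPTIONS

for tag in ['centos6', 'centos8', 'ubuntu']:
    print(tag, rejected_with_or(tag), rejected_with_and(tag), rejected_with_in(tag))
```

The or version rejects even the valid tags, while the and chain and the membership test agree on every input.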
I am pretty new to Python, and am more used to JS, so I am a little lost on how to do this.
Basically I have JSON from a Google API, and the first result isn't always valid for what I need. But I only need the first result that returns true.
I am pretty sure I have the syntax wrong in more than one place, but I need the first imageUrl where [gi]['pagemap'] would be true.
item_len = len(deserialized_output['items'])
for gi in range(item_len):
    def loop_tgi():
        if deserialized_output['items'][gi]['pagemap'] is True:
            imageUrl = deserialized_output['items'][gi]['pagemap']['cse_image'][0]['src']
            break
    loop_tgi()
You could iterate over the items directly, without using an index.
In Python, a for loop leaks its loop variable, so when you break out of the loop, gi holds the element you need (or the last element, if nothing matched).
To handle the "or the last element" case, we can use the loop's else clause, which runs only when the loop finished without a break.
for gi in deserialized_output['items']:
    if gi['pagemap'] is True:
        break
else:
    gi = None # or throw some sort of exception when there is no good elements
if gi: # checking that returned element is good
    print(gi) # now we can use that element to do what you want!
    imageUrl = gi['pagemap']['cse_image'][0]['src']
I am a bit worried about your gi['pagemap'] is True check, because later you access gi['pagemap']['cse_image']. That means gi['pagemap'] is not a boolean but some sort of object.
If it is a dict, you could check if gi['pagemap']:, which is true when the dict is not empty. By contrast, gi['pagemap'] is True would be False if gi['pagemap'] is {'cse_image':...}
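Putting the truthiness check to work, one possible sketch uses next() with a generator expression to pick the first usable entry. The function name first_image_url and the sample data are hypothetical; the data only imitates the shape described in the question, and the code assumes that pagemap, when present, is a dict.

```python
def first_image_url(items):
    """Return the first cse_image src found in items, or None if there is none."""
    return next(
        (item['pagemap']['cse_image'][0]['src']
         for item in items
         # Skips items with no pagemap, an empty pagemap, or an empty cse_image list
         if item.get('pagemap', {}).get('cse_image')),
        None,  # default when no item qualifies
    )

# Hypothetical data shaped like the response described in the question
sample_items = [
    {'title': 'no pagemap'},
    {'pagemap': {}},
    {'pagemap': {'cse_image': [{'src': 'https://example.com/a.png'}]}},
    {'pagemap': {'cse_image': [{'src': 'https://example.com/b.png'}]}},
]
print(first_image_url(sample_items))
```

Because the generator is lazy, the search stops at the first match, much like the break in the loop version.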
I started Javascript around a year and a half ago but I started Python around a week ago so I'm still a beginner at this. So I'm trying to figure out if a user is already stored into the users table and if they aren't then it will add them to it and if they are then it skips.
import rethinkdb as r
r.db('bot').table('users').filter(r.row["id"] == "253544423110082589").run()
this code should return
<rethinkdb.net.DefaultCursor object at 0x03DAAE10> (done streaming):
[
]
So how exactly would I check if it's empty?
I tried something like
if not r.db('bot').table('users').filter(r.row["id"] == "253544423110082589").run():
    # add user to table
else:
    continue
but it continues even when the user isn't in the table. So please tell me what I'm doing wrong and how I can actually get it to work
From the docs use the is_empty function:
r.db('bot').table('users').filter(r.row["id"] == "253544423110082589").is_empty().run(conn)
You could also use count
r.db('bot').table('users').filter(r.row["id"] == "253544423110082589").count().run(conn)
I'm not familiar with the library you're using, but you may be able to check the length of the table.
It might look something like this:
if len(table) == 0:
    # add user to table
else:
    continue
I am trying to find a substring which is a basically a link to any website. The idea is that if a user posts something, the link will be extracted and assigned to a variable called web_link. My current code is following:
post = ("You should watch this video https://www.example.com if you have free time!")
web_link = post[post.find("http" or "www"):post.find(" ", post.find("http" or "www"))]
The code works perfectly if there is a space after the link. However, it fails if the link is at the very end of the post. For example:
post = ("You should definitely watch this video https://www.example.com")
Then post.find(" ") cannot find a space after the link and returns -1, which results in web_link being "https://www.example.co"
I am trying to find a solution that does not involve an if statement if possible.
Use regex. I've made a small change to the solution here.
import re
def func(post):
    # Raw string so the regex escapes (\w, \.) are not treated as string escapes
    return re.search(r"[(http|ftp|https)://]*([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?", post).group(0)
print(func("You should watch this video www.example.com if you have free time!"))
print(func("You should watch this video https://www.example.com"))
Output:
www.example.com
https://www.example.com
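But I should say, using "if" is simpler and more obvious:
def func(post):
    # "http" or "www" evaluates to just "http", so search for each prefix separately
    starts = [i for i in (post.find("http"), post.find("www")) if i != -1]
    start = min(starts)  # raises ValueError if the post contains no link
    finish = post.find(" ", start)
    return post[start:] if finish == -1 else post[start:finish]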
The reason this doesn't work is that when the space isn't found, find returns -1, and the slice interprets -1 as "stop one character before the end of the string".
As ifma pointed out the best way to achieve this would be with a regular expression. Something like:
re.search(r"(?:https?://|www\.)\S+", post).group(0)
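A minimal sketch of this regex approach, wrapped in a function that returns None instead of raising when the post has no link. The names LINK_RE and extract_link are made up for this example, and the pattern is a deliberately simple assumption (a link starts with http://, https://, or www. and runs to the next whitespace), not a full URL grammar.

```python
import re

# A link starts with http://, https:// or www. and continues up to the next whitespace
LINK_RE = re.compile(r"(?:https?://|www\.)\S+")

def extract_link(post):
    """Return the first link in post, or None if there is no link."""
    match = LINK_RE.search(post)
    return match.group(0) if match else None

print(extract_link("You should watch this video https://www.example.com if you have free time!"))
print(extract_link("You should definitely watch this video https://www.example.com"))
print(extract_link("No link in this post"))
```

Since \S+ stops at whitespace or at the end of the string, the link at the very end of a post needs no special case. Trailing punctuation directly after a link would be captured too, so stricter input may need a tighter pattern.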
This is part of my code. The id of the image won't print even when I use the find_withtag function. I think canvas.delete fails for the same reason: it seems the tag is stored as the strings "123", "456". However, the tag I expected to use and get back is 123, not '123'. I guess that is the main reason I can't get the id I want with find_withtag.
CurrentImage=Note[NoteIndexLocal]
Temp=canvas.create_image(XShow,YShow,image=CurrentImage,tag=123)
print canvas.find_withtag(123) #This Wont Work,printed()
canvas.delete(123) #This Wont Work
print canvas.gettags(Temp) #printed '123'
From: http://effbot.org/tkinterbook/canvas.htm
Tags are symbolic names attached to items. Tags are ordinary strings,
and they can contain anything except whitespace (as long as they don’t
look like item handles).
Use str(123) instead of 123
EDIT: The correct answer is in the text from the docs: "as long as they don’t look like item handles". The number 123 looks like an item handle (print Temp to see what one looks like), so it doesn't work. Use text like "a123" and it will work.