I'm trying to get a specific email from a Gmail address (or by subject).
I'm using Selenium because I can't use imaplib for this Gmail account.
I'm stuck here:
driver.find_element_by_id("identifierId").send_keys('MYEMAIL')
driver.find_element_by_id("identifierNext").click()
time.sleep(5)
driver.find_element_by_name("password").send_keys('PSWRD')
driver.find_element_by_id("passwordNext").click()
time.sleep(15)
email = driver.find_elements_by_css_selector("div.xT>div.y6")
for emailsub in email:
    if "someone@gmail.com" in emailsub:  # Error line
        emailsub.click()
        break
Error: argument of type 'WebElement' is not iterable
Idk why I'm using find_elements.
Your for loop is written correctly, and the way you call driver.find_elements_by_css_selector is fine. Inside the for loop you are dealing with individual instances of WebElement. The error comes from the in operator; I would check this post on using in in an if statement and work out what exactly you want the in test to look at.
I suspect that applying 'in' directly to a WebElement is what throws this error: Python tries to iterate over the element, which is not iterable. You need to spell out what the 'in' applies to, for example the element's text (emailsub.text).
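A minimal sketch of that idea, assuming the text you want to match is part of each element's visible text. The helper name click_first_matching is hypothetical; it relies only on the .text property and .click() method that Selenium's WebElement provides.

```python
def click_first_matching(elements, needle):
    """Click the first element whose visible text contains `needle`.

    Returns the clicked element, or None if nothing matched.
    """
    for element in elements:
        # `needle in element` raises TypeError because a WebElement is not
        # iterable; `element.text` is a plain string, so `in` works on it.
        if needle in element.text:
            element.click()
            return element
    return None


# Hypothetical usage with the driver from the question:
# rows = driver.find_elements_by_css_selector("div.xT>div.y6")
# click_first_matching(rows, "someone@gmail.com")
```

If the sender address is not in the visible text of those rows, the same loop can test an attribute with element.get_attribute(...) instead.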
Related
I'm creating a web scraper for Discord. After logging in, the bot should extract the last sent message in a Discord channel and print it out. The last sent message, however, is dynamic: it always has a different path, and it doesn't have an ID either. It only has a class name of embedFooterText-28V_Wb. As you know, Selenium returns the first element that has that class, which in this case is the first message ever sent. How can I reverse it so it gets the last message sent? This is what I have written so far:
text = driver.find_element_by_class_name('embedFooterText-28V_Wb').get_attribute('innerHTML')
print(text)
This code returns the first message sent, and I'd like to get the last message sent.
You can find all of the elements and then take the last one with:
el1 = driver.find_elements_by_xpath("//*[contains(@class, 'embedFooterText-28V_Wb')]")[-1]
But it is better to select the last one directly in XPath.
Use (//*[contains(@class,'embedFooterText-28V_Wb')])[last()]
I used Java syntax; however, in Python it will be quite similar.
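In Python, the last() variant could look roughly like the sketch below. The helper names are hypothetical; the only Selenium calls used are driver.find_element(by, value) and get_attribute, and the string "xpath" is the value of Selenium's By.XPATH constant.

```python
def last_with_class_xpath(class_name):
    """Build an XPath that selects the last element whose class contains class_name."""
    # The outer parentheses matter: last() is applied to the whole result set,
    # not to each parent's children separately.
    return f"(//*[contains(@class, '{class_name}')])[last()]"


def get_last_inner_html(driver, class_name):
    """Return the innerHTML of the last matching element on the page."""
    # "xpath" is what By.XPATH evaluates to, so no selenium import is needed here.
    element = driver.find_element("xpath", last_with_class_xpath(class_name))
    return element.get_attribute("innerHTML")


# Hypothetical usage:
# print(get_last_inner_html(driver, "embedFooterText-28V_Wb"))
```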
I have a bot I'm writing using imaplib in Python to fetch emails from Gmail and output some useful data from them. I've hit a snag on selecting the inbox, though: the existing sorting system uses custom labels to separate emails from different customers. I've partially replicated this system in my test email account, but imaplib.select() throws "imaplib.IMAP4.error: SELECT command error: BAD [b'Could not parse command']" with custom labels. Screenshot attached. My bot has no problem with the default Gmail folders, such as INBOX or [Gmail]/Spam. In that case, it hits an error later in the code that deals with a completely different problem I have yet to fix. The point, though, is that imaplib.select() is successful with the default inboxes and fails only with custom labels.
The way my code works is that it goes through all the available inboxes and compares each one to a user-inputted name. If they match, it saves the name and sets a boolean to true to signal that it found a match. If there was a match (the user-inputted inbox exists), it goes ahead; otherwise it shows an error message and resets. It then attempts to select the inbox the user entered.
I've verified that the variable the program saves the inbox name to matches what's listed as the name in the output of imap.list(). I have no idea what the issue is.
I could bypass the process by iterating through all mail to find the emails I'm looking for, but it's far more efficient to use the existing sorting system due to the sheer number of emails on the account I'll be using.
Any help is appreciated!
EDIT: Code attached after request. Thank you to the person who told me to do so.
'''
Fetches emails from the specified inbox and outputs them to a popup
'''
def fetchEmails(self):
    # Create an imap object. Must be local, otherwise we can only establish a single connection
    # imap states are kinda bad
    imap = imaplib.IMAP4_SSL(host="imap.gmail.com", port=993)
    # Log in and fetch a list of available inboxes
    imap.login(username.get(), password.get())
    typ, inboxList = imap.list()
    # Set a reference boolean and iterate through the list
    inboxNameExists = False
    for i in inboxList:
        # Find the name of the inbox
        name = self.inboxNameParser(i.decode())
        # If the given inbox name is encountered, set its existence to true and break
        if name.casefold() == inboxName.get().casefold():
            inboxNameExists = True
            break
    # If the inbox name does not exist, log out and give an error message
    if not inboxNameExists:
        self.logout(imap)
        tk.messagebox.showerror("Disconnected!", "That Inbox does not exist.")
        return
    '''
    If/else to correctly feed the imap.select() method the inbox name
    Apparently inboxes containing spaces require quotation marks before and after
    Selects the inbox and pushes it to a variable
    two actually but the first is unnecessary(?)
    imap is weird
    '''
    if name.count(" ") > 0:
        status, messages = imap.select("\"" + name + "\"")
    else:
        status, messages = imap.select(name)
    # Int containing total number of emails in inbox
    messages = int(messages[0])
    # If there are no messages, disconnect and show an infobox
    if messages == 0:
        self.logout(imap)
        tk.messagebox.showinfo("Disconnected!", "The inbox is empty.")
    self.mailboxLoop(imap, messages)
Figured the issue out after a few hours banging through it with a friend. As it turns out, the problem was that imap.select() wants quotation marks around the mailbox name if it contains spaces. So imap.select("INBOX") is fine, but with spaces you'd need imap.select("\"" + "Label Name" + "\"")
You can see this reflected in the code I posted with the last if/else statement.
Python imaplib requires mailbox names with spaces to be surrounded by double quotes. So imap.select("INBOX") is fine, but with spaces you'd need imap.select("\"" + "Label Name" + "\"").
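The quoting can be wrapped in a small helper so every call site treats labels the same way. This is only a sketch: quote_mailbox is a hypothetical name, it assumes ASCII label names, and it always quotes (a quoted "INBOX" is still a valid mailbox argument). It also escapes backslashes and embedded double quotes, which IMAP quoted strings require.

```python
def quote_mailbox(name):
    """Wrap a mailbox name in double quotes for use with imap.select()."""
    # Inside an IMAP quoted string, backslash and double quote must be escaped.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


# Hypothetical usage with an already logged-in imaplib.IMAP4_SSL object:
# status, messages = imap.select(quote_mailbox("Label Name"))
```

Labels containing non-ASCII characters need additional encoding that this sketch does not handle.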
I have a set of company names and I want to find their profiles on LinkedIn. To do this, I am using 'https://www.linkedin.com/company/'+company_name to test whether it is the company profile, together with linkedin-scraper, https://pypi.org/project/linkedin-scraper/.
The problem is that the program stops running when the first valid link is found.
actions.login(driver, email, password)
linkedin_info = []
for i in range(len(df['NAME'])):
    try:
        linkedin_info.append(Company('https://www.linkedin.com/company/'+df['NAME'][i], driver=driver, scrape=False))
        Company.scrape(close_on_complete=False)
        continue
    except:
        linkedin_info.append('info_not_found')
        continue
I'm using "try" because when no page is found we get an error.
I also tried to use a list of valid LinkedIn links, but I can scrape only one link each time I run the code.
What might be the issue?
I solved it: the problem was that I was using the scrape function incorrectly, calling it on the Company class instead of on the object I had just created.
I had to do:
linkedin_info[-1].scrape(close_on_complete=False)
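Put together, the corrected loop might look like the sketch below. The function scrape_companies and its make_company parameter are hypothetical, introduced so the loop does not depend on how the Company object is built; the scrape(close_on_complete=False) call is the one from the fix above.

```python
def scrape_companies(names, make_company):
    """Scrape one company page per name, recording failures instead of stopping."""
    results = []
    for name in names:
        url = "https://www.linkedin.com/company/" + name
        try:
            company = make_company(url)
            # Call scrape() on the instance just created, not on the class.
            company.scrape(close_on_complete=False)
            results.append(company)
        except Exception:
            # No page found (or scraping failed): keep a placeholder and move on.
            results.append("info_not_found")
    return results


# Hypothetical usage with linkedin-scraper:
# linkedin_info = scrape_companies(
#     df['NAME'],
#     lambda url: Company(url, driver=driver, scrape=False),
# )
```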
I'm experiencing a strange issue that seems to be inconsistent with the documentation for Google's Gmail API:
If you look here, you can see that Gmail's representation of an email has the keys "snippet" and "id", among others. Here's some code that I use to generate the complete list of all my emails:
response = service.users().messages().list(userId='me').execute()
messageList = []
messageList.extend(response['messages'])
while 'nextPageToken' in response:
    pagetoken = response['nextPageToken']
    response = service.users().messages().list(userId='me', pageToken=pagetoken).execute()
    messageList.extend(response['messages'])
for message in messageList:
    if 'snippet' in message:
        print(message['snippet'])
    else:
        print("FALSE")
The code works!... Except for the fact that I get output "FALSE" for every single one of the emails. 'snippet' doesn't exist! However, if I run the same code with "id" instead of snippet, I get a whole bunch of ids!
I decided to just print out the 'message' objects/dicts themselves, and each one only had an "id" and a "threadId", even though the API claims there should be more in the object... What gives?
Thanks for your help!
As @jedwards said in his comment, just because a message 'can' contain all of the fields specified in the documentation doesn't mean it will. 'list' provides the bare minimum of information for each message (its id and threadId), because it returns a lot of messages and wants to be as lazy as possible. For individual messages that I want to know more about, I'd then use 'messages.get' with the id that I got from 'list'.
Running get for each email in your inbox seems very expensive, but to my knowledge there's no way to run a batch 'get' command.
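A rough sketch of that list-then-get pattern, assuming service is an authorized Gmail API client like the one in the question. The function name fetch_snippets is hypothetical; the list, get and execute calls are the ones the question and answer already refer to.

```python
def fetch_snippets(service, user_id="me"):
    """Return the snippet of every message by following each list entry with a get."""
    messages = service.users().messages()
    snippets = []
    page_token = None
    while True:
        kwargs = {"userId": user_id}
        if page_token:
            kwargs["pageToken"] = page_token
        response = messages.list(**kwargs).execute()
        # Each list entry only carries 'id' and 'threadId'.
        for stub in response.get("messages", []):
            full = messages.get(userId=user_id, id=stub["id"]).execute()
            snippets.append(full.get("snippet", ""))
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return snippets


# Hypothetical usage:
# for snippet in fetch_snippets(service):
#     print(snippet)
```

This makes one request per message, so it is slow on a large mailbox.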
I am retrieving emails from my email server using IMAPClient (Python), by checking for emails flagged with "\Recent". After an email has been read, the email server automatically sets its flag to "\Seen".
What I want to do is reset the email flag to "\Recent" so that when I check the email directly on the server it still appears as unread.
What I'm finding is that IMAPClient throws an exception when I try to add the "\Recent" flag to an email using IMAPClient's "set_flags" method. Adding any other flag works fine.
The IMAPClient documentation says the Recent flag is read-only, but I was wondering if there is still a way to mark an email as unread.
From my understanding, email software like Thunderbird allows you to mark emails as unread, so I assume there must be a way to do it.
Thanks.
For completeness, here's an actual example using IMAPClient. The \Seen flag is updated in order to control whether messages are marked as read or unread.
from imapclient import IMAPClient, SEEN
client = IMAPClient(...)
client.select_folder('INBOX')
msg_ids = client.search(...)
# Mark messages as read
client.add_flags(msg_ids, [SEEN])
# Mark messages as unread
client.remove_flags(msg_ids, [SEEN])
Note that add_flags and remove_flags are used instead of set_flags because the latter resets the flags to just those specified. When setting the read/unread status you typically want to leave any other message flags intact.
It's also worth noting that it's possible to call fetch using the "BODY.PEEK" data item to retrieve parts of messages without affecting the \Seen flag. This can avoid the need to fix up the \Seen flag after downloading a message.
See section 6.4.5 of RFC 3501 for more details.
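As a sketch of the BODY.PEEK approach with IMAPClient: the helper name below is hypothetical, and it assumes a recent IMAPClient on Python 3, where fetch returns a dict keyed by message id whose inner keys are bytes. The server reports the data under BODY[], without the PEEK part.

```python
def fetch_bodies_without_marking_read(client, msg_ids):
    """Download full messages while leaving their \\Seen flag untouched."""
    # BODY.PEEK[] requests the whole message but does not set \Seen.
    response = client.fetch(msg_ids, ["BODY.PEEK[]"])
    # The returned data item is named BODY[] (assumed to be a bytes key).
    return {msg_id: data[b"BODY[]"] for msg_id, data in response.items()}


# Hypothetical usage, following the example above:
# client.select_folder('INBOX')
# bodies = fetch_bodies_without_marking_read(client, client.search('UNSEEN'))
```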
IMAPClient docs specifically stated the '\Recent' flag is ReadOnly:
http://imapclient.readthedocs.org/en/latest/#message-flags
This is probably a feature (or limitation) of IMAP and IMAP servers. (That is: probably not an IMAPClient limitation).
Use the '\Seen' flag instead: remove it to mark something as unread.
Disclaimer: I'm familiar with IMAP but not Python-IMAPClient specifically.
Normally the 'seen' flag determines if an email summary will be shown normal or bold.
You should be able to reset the seen flag. However, the recent flag may not be under your direct control. The IMAP server will set it if it notices new messages arriving.
@Menno Smits:
I'm having issues adding the '\Seen' flag to a mail after parsing through it.
I only want to mark a mail as READ when it contains a particular text.
I've been trying to use add_flags with the "client.add_flags(msg_ids, [SEEN])" call you gave above, but I keep getting "store failed: Command received in invalid state". What exactly goes into the [SEEN]? (Is this just a placeholder or the exact syntax?)
Here is a portion of my code:
#login and authentication
context=ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
iobj=imapclient.IMAPClient('outlook.office365.com', ssl=True,ssl_context=context)
iobj.login(uname,pwd)
iobj.select_folder('INBOX', readonly=True)
unread=iobj.search('UNSEEN')
print('There are: ',len(unread),' UNREAD emails')
for i in unread:
    mail=iobj.fetch(i,['BODY[]'])
    mail_body=html2text.html2text(mcontent.html_part.get_payload().decode(mcontent.html_part.charset))
    ## Do some regex to parse the email to check if it contains text
    meter_no=(re.findall(r'\nACCOUNT NUMBER: (\d+)', mail_body))
    req_type=(re.findall(r'Complaint:..+?\n(.+)\n', mail_body))
    if 'Key Change' in req_type:
        if meter_no in kct['Account_no'].values:
            print('Going to sendmail')  # Call a function
            sending_email(meter_no,subject,phone_no,req_type,)
            mail[b'FLAGS']=r'b\Seen'+','+''+r'b\Answered'  ## Trying to manually alter the flag but didn't work
            iobj.add_flags(i,br'\Seen')  # Didn't work either (but is 'i' my msg_id??)
            iobj.add_flags(i,[SEEN])  # Complains: name SEEN not defined
        else:
            print('KCT is yet to be generated')