I made a Python program using tkinter and pandas to select rows and send them by email.
The program let the user decides on which excel file wants to operate;
then asks on which sheet of that file you want to operate;
then it asks how many rows you want to select (using .tail function);
then the program is supposed to iterate through rows and read from a cell (within selected rows) the email address;
then it sends the correct row to the correct address.
I'm stuck at the iteration process.
Here's the code:
import pandas as pd
import smtplib
def invio_mail(my_tailed__df): #the function imports the sliced (.tail) dataframe
gmail_user = '###'
gmail_password = '###'
sent_from = gmail_user
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.ehlo()
server.login(gmail_user, gmail_password)
list = my_tailed_df
customers = list['CUSTOMER']
wrs = list['WR']
phones = list['PHONE']
streets = list['STREET']
cities = list["CITY"]
mobiles = list['MOBILE']
mobiles2 = list['MOBILE2']
mails = list['INST']
for i in range(len(mails)):
customer = customers[i]
wr = wrs[i]
phone = phones[i]
street = streets[i]
city = cities[i]
mobile = mobiles[i]
mobile2 = mobiles2[i]
"""
for i in range(len(mails)):
if mails[i] == "T138":
mail = "email_1#gmail.com"
elif mails[i] == "T139":
mail = "email_2#gmail.com"
"""
subject = 'My subject'
body = f"There you go" \n Name: {customer} \n WR: {wr} \n Phone: {phone} \n Street: {street} \n City: {city} \n Mobile: {mobile} \n Mobile2: {mobile2}"
email_text = """\
From: %s
To: %s
Subject: %s
%s
""" % (sent_from, ", ".join(mail), subject, body)
try:
server.sendmail(sent_from, [mail], email_text)
server.close()
print('Your email was sent!')
except:
print("Some error")
The program raises a KeyError: 0 after it enters the for loop, on the first line inside the loop: customer = customers[i]
I know that the commented part (the nested for loop) will raise the same error.
I'm banging my head on the wall, I think i've read and tried everything.
Where's my error?
Things start to go wrong here: list = my_tailed_df. In Python list() is a Built-in Type.
However, with list = my_tailed_df, you are overwriting the type. You can check this:
# before the line:
print(type(list))
<class 'type'>
list = my_tailed_df
# after the line:
print(type(list))
<class 'pandas.core.frame.DataFrame'> # assuming that your df is an actual df!
This is bad practice and adds no functional gain at the expense of confusion. E.g. customers = list['CUSTOMER'] is doing the exact same thing as would customers = my_tailed_df['CUSTOMER'], namely creating a pd.Series with the index from my_tailed_df. So, first thing to do, is to get rid of list = my_tailed_df and to change all those list[...] snippets into my_tailed_df[...].
Next, let's look at your error. for i in range(len(mails)): generates i = 0, 1, ..., len(mails)-1. Hence, what you are trying to do, is access the pd.Series at the index 0, 1 etc. If you get the error KeyError: 0, this must simply mean that the index of your original df does not contain this key in the index (e.g. it's a list of IDs or something).
If you don't need the original index (as seems to be the case), you could remedy the situation by resetting the index:
my_tailed_df.reset_index(drop=True, inplace=True)
print(my_tailed_df.index)
# will get you: RangeIndex(start=0, stop=x, step=1)
# where x = len(my_tailed_df)-1 (== len(mails)-1)
Implement the reset before the line customers = my_tailed_df['CUSTOMER'] (so, instead of list = my_tailed_df), and you should be good to go.
Alternatively, you could keep the original index and change for i in range(len(mails)): into for i in mails.index:.
Finally, you could also do for idx, element in enumerate(mails.index): if you want to keep track both of the position of the index element (idx) and its value (element).
I finally had to give up and ask for help. I am retrieving a document (with requests) that has a json type of format (but is not well formatted - no double quotes) and trying to extract the data as a normal dict. Here is what I have: this works and will get you the output from which I am trying to extract the data.
def test():
url = "http://www.sgx.com/JsonRead/JsonstData"
payload = {}
payload['qryId'] = 'RSTIc'
payload['timeout'] = 60
header = {'User-Agent' : 'Mozilla/5.0 (compatible; MSIE 10.0; Linux i686; Trident/2.0)', 'Content-Type': 'text/html; charset=utf-8'}
req = requests.get(url, headers = header, params = payload)
print(req.url)
prelim = req.content.decode('utf-8')
print(type(prelim))
print(prelim)
test()
What I would like to have after that is: (assuming a properly functioning dict)
for stock in prelim['items']:
print(stock['N'])
Which should give me a list of all the stocks names.
I have tried most json functions: prelim.json(), loads., load., dump., dumps., parse. None seems to work because the data is not formatted properly. I also tried ast.literal_eval() without success. I tried some examples on Stack Overflow to convert that string in a proper dict but no luck. I don't seem to be able to convert that string to make it behave as a proper dictionary. If you can point me in the right direction that would be much appreciated.
Good samaritains have asked for an example of the data. The data coming from the above request is a bit longer but I removed a few 'items' so people can see the general look of the retrieved data.
{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:[{ID:0,N:'AscendasReit',SIP:'',NC:'A17U',R:'',I:'',M:'',LT:0,C:0,VL:97.600,BV:485.300,B:'2.670',S:'2.670',SV:1009.100,O:0,H:0,L:0,V:259811.200,SC:'9',PV:2.660,P:0,P_:'X',V_:''},
{ID:1,N:'CapitaComTrust',SIP:'',NC:'C61U',R:'',I:'',M:'',LT:0,C:0,VL:126.349,BV:1467.300,B:'1.800',S:'1.800',SV:620.900,O:0,H:0,L:0,V:228691.690,SC:'9',PV:1.810,P:0,P_:'X',V_:''},
{ID:2,N:'CapitaLand',SIP:'',NC:'C31',R:'',I:'',M:'',LT:0,C:0,VL:78.000,BV:184.900,B:'3.670',S:'3.670',SV:372.900,O:0,H:0,L:0,V:286026.000,SC:'9',PV:3.660,P:0,P_:'X',V_:''},
{ID:28,N:'Wilmar Intl',SIP:'',NC:'F34',R:'CD',I:'',M:'',LT:0,C:0,VL:0.000,BV:32.000,B:'3.210',S:'3.210',SV:73.100,O:0,H:0,L:0,V:0.000,SC:'2',PV:3.220,P:0,P_:'',V_:''},
{ID:29,N:'YZJ Shipbldg SGD',SIP:'',NC:'BS6',R:'',I:'',M:'',LT:0,C:0,VL:0.000,BV:349.500,B:'1.330',S:'1.330',SV:417.700,O:0,H:0,L:0,V:0.000,SC:'2',PV:1.340,P:0,P_:'',V_:''}]}
Following the recent commentaries, I know I could do this:
def test2():
my_text = "{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:[{ID:0,N:'AscendasReit',SIP:'',NC:'A17U',R:'',I:'',M:'',LT:0,C:0,VL:97.600,BV:485.300,B:'2.670',S:'2.670',SV:1009.100,O:0,H:0,L:0,V:259811.200,SC:'9',PV:2.660,P:0,P_:'X',V_:''}, {ID:1,N:'CapitaComTrust',SIP:'',NC:'C61U',R:'',I:'',M:'',LT:0,C:0,VL:126.349,BV:1467.300,B:'1.800',S:'1.800',SV:620.900,O:0,H:0,L:0,V:228691.690,SC:'9',PV:1.810,P:0,P_:'X',V_:''}, {ID:2,N:'CapitaLand',SIP:'',NC:'C31',R:'',I:'',M:'',LT:0,C:0,VL:78.000,BV:184.900,B:'3.670',S:'3.670',SV:372.900,O:0,H:0,L:0,V:286026.000,SC:'9',PV:3.660,P:0,P_:'X',V_:''}, {ID:28,N:'Wilmar Intl',SIP:'',NC:'F34',R:'CD',I:'',M:'',LT:0,C:0,VL:0.000,BV:32.000,B:'3.210',S:'3.210',SV:73.100,O:0,H:0,L:0,V:0.000,SC:'2',PV:3.220,P:0,P_:'',V_:''}, {ID:29,N:'YZJ Shipbldg SGD',SIP:'',NC:'BS6',R:'',I:'',M:'',LT:0,C:0,VL:0.000,BV:349.500,B:'1.330',S:'1.330',SV:417.700,O:0,H:0,L:0,V:0.000,SC:'2',PV:1.340,P:0,P_:'',V_:''}]}"
prelim = my_text.split("items:[")[1].replace("}]}", "}")
temp_list = prelim.split(", ")
end_list = []
main_dict = {}
for tok1 in temp_list:
temp_dict = {}
temp = tok1.replace("{","").replace("}","").split(",")
for tok2 in temp:
my_key = tok2.split(":")[0]
my_value = tok2.split(":")[1].replace("'","")
temp_dict[my_key] = my_value
end_list.append(temp_dict)
main_dict['items'] = end_list
for stock in main_dict['items']:
print(stock['N'])
test2()
Which is the desired result. I am just asking, if there is an easier (more elegant/pythonic) way of doing this.
You need to convert the string to JSON convertible text first then use json.loads to get dictionary
prelim is not in JSON format and values are not surrounded by "
Remove '{}&& '
Surround properties with "
Apply json.loads(new_text) to get the dictionary representation
i.e
import requests, json
#replace tuples
reps = (('identifier:', '"identifier":'),
('label:', '"label":'),
('items:', '"items":'),
('NC:', '"NC":'),
('ID:', '"ID":'),
('N:', '"N":'),
('SIP:', '"SIP":'),
('SC:', '"SC":'),
('R:', '"R":'),
('I:', '"I":'),
('M:', '"M":'),
('LT:', '"LT":'),
('C:', '"C":'),
('VL:', '"VL":'),
('BV:', '"BV":'),
('BL:', '"BL":'),
('B:', '"B":'),
('S:', '"S":'),
('SV:', '"SV":'),
('O:', '"O":'),
('H:', '"H":'),
('L:', '"L":'),
('PV:', '"PV":'),
('V:', '"V":'),
('P_:', '"P_":'),
('P:', '"P":'),
('V_:', '"V_":'))
#getting rid of invalid json text
prelim = prelim.replace('{}&& ', '')
#replacing single quotes with double quotes
prelim = prelim.replace("'", "\"")
print(prelim)
#reduce to get all replacements
dict_text = fn.reduce(lambda a, kv: a.replace(*kv), reps, prelim)
dic = json.loads(dict_text)
print(dic)
get Items:
for x in dic['items']:
print(x['N'])
Output:
2ndChance W200123
3Cnergy
3Cnergy W200528
800 Super
8Telecom^
A-Smart
A-Sonic Aero^
AA
....
I have a question with using python and beautifulsoup.
My end result program basically fills out a form on a website and brings me back the results which I will eventually output to an lxml file. I'll be taking the results from https://interactive.web.insurance.ca.gov/survey/survey?type=homeownerSurvey&event=HOMEOWNERS and I want to get a list for every city all into some excel documents.
Here is my code, I put it on pastebin:
http://pastebin.com/bZJfMp2N
MY RESULTS ARE ALMOST GOOD :D except now I'm getting 355 for my "correct value" instead of 355, for example. I want to parse that and only show the number, you will see when you put this into python.
However, anything I have tried does NOT work, there is no way I can parse that values_2 variable because the results are in bs4.element.resultset when I think i need to parse a string. Sorry if I am a noob, I am still learning and have worked very long on this program.
Would anyone have any input? Anything would be appreciated! I've read up that my results are in a list or something and i can't parse lists? How would I go about doing this?
Here is the code:
__author__ = 'kennytruong'
#THE PROBLEM HERE IS TO PARSE THE RESULTS PROPERLY!!
import urllib.parse, urllib.request
import re
from bs4 import BeautifulSoup
URL = "https://interactive.web.insurance.ca.gov/survey/survey?type=homeownerSurvey&event=HOMEOWNERS"
#Goes through these locations, strips the whitespace in the string and creates a list that starts at every new line
LOCATIONS = '''
ALAMEDA ALAMEDA
'''.strip().split('\n') #strip() basically removes whitespaces
print('Available locations to choose from:', LOCATIONS)
INSURANCE_TYPES = '''
HOMEOWNERS,CONDOMINIUM,MOBILEHOME,RENTERS,EARTHQUAKE - Single Family,EARTHQUAKE - Condominium,EARTHQUAKE - Mobilehome,EARTHQUAKE - Renters
'''.strip().split(',') #strips the whitespaces and starts a newline of the list every comma
print('Available insurance types to choose from:', INSURANCE_TYPES)
COVERAGE_AMOUNTS = '''
15000,25000,35000,50000,75000,100000,150000,200000,250000,300000,400000,500000,750000
'''.strip().split(',')
print('All options for coverage amounts:', COVERAGE_AMOUNTS)
HOME_AGE = '''
New,1-3 Years,4-6 Years,7-15 Years,16-25 Years,26-40 Years,41-70 Years
'''.strip().split(',')
print('All Home Age Options:', HOME_AGE)
def get_premiums(location, coverage_type, coverage_amt, home_age):
formEntries = {'location':location,
'coverageType':coverage_type,
'coverageAmount':coverage_amt,
'homeAge':home_age}
inputData = urllib.parse.urlencode(formEntries)
inputData = inputData.encode('utf-8')
request = urllib.request.Request(URL, inputData)
response = urllib.request.urlopen(request)
responseData = response.read()
soup = BeautifulSoup(responseData, "html.parser")
parseResults = soup.find_all('tr', {'valign':'top'})
for eachthing in parseResults:
parse_me = eachthing.text
name = re.findall(r'[A-z].+', parse_me) #find me all the words that start with a cap, as many and it doesn't matter what kind.
# the . for any character and + to signify 1 or more of it.
values = re.findall(r'\d{1,10}', parse_me) #find me any digits, however many #'s long as long as btwn 1 and 10
values_2 = eachthing.find_all('div', {'align':'right'})
print('raw code for this part:\n' ,eachthing, '\n')
print('here is the name: ', name[0], values)
print('stuff on sheet 1- company name:', name[0], '- Premium Price:', values[0], '- Deductible', values[1])
print('but here is the correct values - ', values_2) #NEEDA STRIP THESE VALUES
# print(type(values_2)) DOING SO GIVES ME <class 'bs4.element.ResultSet'>, NEEDA PARSE bs4.element type
# values_3 = re.split(r'\d', values_2)
# print(values_3) ANYTHING LIKE THIS WILL NOT WORK BECAUSE I BELIEVE RESULTS ARENT STRING
print('\n\n')
def main():
for location in LOCATIONS: #seems to be looping the variable location in LOCATIONS - each location is one area
print('Here are the options that you selected: ', location, "HOMEOWNERS", "150000", "New", '\n\n')
get_premiums(location, "HOMEOWNERS", "150000", "New") #calls function get_premiums and passes parameters
if __name__ == "__main__": #this basically prevents all the indent level 0 code from getting executed, because otherwise the indent level 0 code gets executed regardless upon opening
main()
I am new to Python, and I want your advice on something.
I have a script that runs one input value at a time, and I want it to be able to run a whole list of such values without me typing the values one at a time. I have a hunch that a "for loop" is needed for the main method listed below. The value is "gene_name", so effectively, i want to feed in a list of "gene_names" that the script can run through nicely.
Hope I phrased the question correctly, thanks! The chunk in question seems to be
def get_probes_from_genes(gene_names)
import json
import urllib2
import os
import pandas as pd
api_url = "http://api.brain-map.org/api/v2/data/query.json"
def get_probes_from_genes(gene_names):
if not isinstance(gene_names,list):
gene_names = [gene_names]
#in case there are white spaces in gene names
gene_names = ["'%s'"%gene_name for gene_name in gene_names]**
api_query = "?criteria=model::Probe"
api_query= ",rma::criteria,[probe_type$eq'DNA']"
api_query= ",products[abbreviation$eq'HumanMA']"
api_query= ",gene[acronym$eq%s]"%(','.join(gene_names))
api_query= ",rma::options[only$eq'probes.id','name']"
data = json.load(urllib2.urlopen(api_url api_query))
d = {probe['id']: probe['name'] for probe in data['msg']}
if not d:
raise Exception("Could not find any probes for %s gene. Check " \
"http://help.brain- map.org/download/attachments/2818165/HBA_ISH_GeneList.pdf? version=1&modificationDate=1348783035873 " \
"for list of available genes."%gene_name)
return d
def get_expression_values_from_probe_ids(probe_ids):
if not isinstance(probe_ids,list):
probe_ids = [probe_ids]
#in case there are white spaces in gene names
probe_ids = ["'%s'"%probe_id for probe_id in probe_ids]
api_query = "? criteria=service::human_microarray_expression[probes$in%s]"% (','.join(probe_ids))
data = json.load(urllib2.urlopen(api_url api_query))
expression_values = [[float(expression_value) for expression_value in data["msg"]["probes"][i]["expression_level"]] for i in range(len(probe_ids))]
well_ids = [sample["sample"]["well"] for sample in data["msg"] ["samples"]]
donor_names = [sample["donor"]["name"] for sample in data["msg"] ["samples"]]
well_coordinates = [sample["sample"]["mri"] for sample in data["msg"] ["samples"]]
return expression_values, well_ids, well_coordinates, donor_names
def get_mni_coordinates_from_wells(well_ids):
package_directory = os.path.dirname(os.path.abspath(__file__))
frame = pd.read_csv(os.path.join(package_directory, "data", "corrected_mni_coordinates.csv"), header=0, index_col=0)
return list(frame.ix[well_ids].itertuples(index=False))
if __name__ == '__main__':
probes_dict = get_probes_from_genes("SLC6A2")
expression_values, well_ids, well_coordinates, donor_names = get_expression_values_from_probe_ids(probes_dict.keys())
print get_mni_coordinates_from_wells(well_ids)
whoa, first things first. Python ain't Java, so do yourself a favor and use a nice """xxx\nyyy""" string, with triple quotes to multiline.
api_query = """?criteria=model::Probe"
,rma::criteria,[probe_type$eq'DNA']
...
"""
or something like that. you will get white spaces as typed, so you may need to adjust.
If, like suggested, you opt to loop on the call to your function through a file, you will need to either try/except your data-not-found exception or you will need to handle missing data without throwing an exception. I would opt for returning an empty result myself and letting the caller worry about what to do with it.
If you do opt for raise-ing an Exception, create your own, rather than using a generic exception. That way your code can catch your expected Exception first.
class MyNoDataFoundException(Exception):
pass
#replace your current raise code with...
if not d:
raise MyNoDataFoundException(your message here)
clarification about catching exceptions, using the accepted answer as a starting point:
if __name__ == '__main__':
with open(r"/tmp/genes.txt","r") as f:
for line in f.readlines():
#keep track of your input data
search_data = line.strip()
try:
probes_dict = get_probes_from_genes(search_data)
except MyNoDataFoundException, e:
#and do whatever you feel you need to do here...
print "bummer about search_data:%s:\nexception:%s" % (search_data, e)
expression_values, well_ids, well_coordinates, donor_names = get_expression_values_from_probe_ids(probes_dict.keys())
print get_mni_coordinates_from_wells(well_ids)
You may want to create a file with Gene names, then read content of the file and call your function in the loop. Here is an example below
if __name__ == '__main__':
with open(r"/tmp/genes.txt","r") as f:
for line in f.readlines():
probes_dict = get_probes_from_genes(line.strip())
expression_values, well_ids, well_coordinates, donor_names = get_expression_values_from_probe_ids(probes_dict.keys())
print get_mni_coordinates_from_wells(well_ids)