Why is my code getting the wrong part of the txt string?

Why is my code getting the wrong part of the txt string? - python

The issue I have is that I'm not able to get the correct part of my txt string. The txt string also has to be in the current order, I can't switch role and name.
txt = \
'''
Company name
leader: Claire
cashier: Ole
'''
def role(name):
start= txt.find(name)-len(name)
end= txt.find('\n', start)
return txt[start:end]
If I type role("Ole") I expect the output to be 'cashier'.
The output I am getting though is 'r: Ole'.

You can create a dictionary that associates the names to the correspondent roles, so that it becomes very easy to retrieve them even in a much longer list:
txt = \
'''
Company name
leader: Claire
cashier: Ole
'''
t = txt.split()
roles = {t[i] : t[i-1].strip(':') for i in range(3, len(t), 2)}
def role(name):
return roles[name]
Once set up, it's pretty intuitive.

stringy = "cashier: Ole"
stringy.partition(':')[0]
you can use .partition and bring only the first element
this will work but if there is anything diff about your string it may break down.
stringy = """ Company name
leader: Claire
cashier: Ole"""
def role(name):
slice = stringy.partition(name)[0]
role_str = slice.split(' ')[-2]
return role_str

Related

Filling form fields in Word automatically with pywin32

I have a little problem to solve, but I don´t get an acceptable solution. I try to have a script that automaticcaly fills in FormFields in Word with entries of dictionaries. So far I used pywin32 for that task and somehow I managed to find a solution that works at all. At least as long as I update the word document.
The command that I use so far in a loop is "marks(i).Result = keyValue". Here "marks" is - as far as I understand - a formfields object, which I get by "marks = doc.FormFields".
As I said, all this works to some extent, but as soon as I update the document in Word, all entries disappear again.
Would be really great if someone could help me and knows how to make it that the entry remains permanent. Unfortunately I do not understand the documentation of pywin32 at all. Attached below is the whole function.
Thanks in advance!
def wordExport(window, excDict, docPath):
wordApp = win32com.client.Dispatch('Word.Application') #Word App Object
wordApp.Visible = False
doc = wordApp.Documents.Open(docPath) #Open .docx
marks = doc.FormFields # FormFields object
cMarks = marks.Count # FormField count
cMatches = 0 #Match counter
cOverwrt = 0 #Fields overwritten
for i in range(1, cMarks + 1):
fieldName = str(marks(i).Name) #String object of FormField
#Default Cases --------------------------------------------------------------------
if fieldName in excDict:
keyValue = excDict.get(fieldName) #Get value to key(FormField)
resultIsEmpty = not marks(i).Result #No text has been entered before
# No prior entry present
if keyValue != "None"
marks(i).Result = keyValue + " "
cMatches += 1
if not resultIsEmpty:
cOverwrt += 1
doc.Close()
wordApp.Visible = False

How to filter an input to search a database in python

I'm an intermediate python coder, but i'm still somewhat new to it. I just started using json files (kind of like a database), and my most recent projects was to use the nba database (https://data.nba.net/prod/v1/today.json) for a user input that tells about an nba character that they input in. The problem was that it would pour the entire roster database, so I added a filter. The problem with the filter was that it wouldn't let anything out. If you type Lebron James, shows nothing. Not even an error.
from requests import get
from pprint import PrettyPrinter
BASE_URL = "https://data.nba.net"
ALL_JSON = "/prod/v1/today.json"
printer = PrettyPrinter()
data = get(BASE_URL + ALL_JSON).json()
def get_links():
data = get(BASE_URL + ALL_JSON).json()
links = data['links']
return links
def ask_player():
players = get_links()['leagueRosterPlayers']
playerselected = get(
BASE_URL + players).json()['league']['standard']
'topic' == input('Type the player you wish to know more about:')
playerselected = list(filter(lambda x: x['firstName'] == "topic",playerselected ))
playerselected = list(filter(lambda y: y['lastName'] == "topic",playerselected ))
for standard in playerselected:
name = standard['firstName']
name2 = standard['lastName']
jersey = standard['jersey']
#teams = standard['teams']
college = standard['collegeName']
dob = standard['dateOfBirthUTC']
years = standard['yearsPro']
printer.pprint(f"{name} {name2}, jersey number {jersey}, went to {college} college, was born in {dob}, and has been in the nba for {years} full year(s).")
ask_player()
My code is above. I tried a lot, and I even read most of the database to see if I missed something. Can someone help? It's a project for fun, by the way.

The problem is around these lines of code where topic should be a variable, not a string. Also, you should assign the input to the topic variable, not using ==.
The filter compares the first name and last name, so we split the input to two parts.
topic = input('Type the player you wish to know more about:')
firstName = topic.split()[0]
lastName = topic.split()[1]
playerselected = list(filter(lambda x: x['firstName'] == firstName and x['lastName'] == lastName, playerselected ))
The output is like this for Steven Adams,
('Steven Adams, jersey number 4, went to Pittsburgh college, was born in '
'1993-07-20, and has been in the nba for 8 full year(s).')

Loop through a spreadsheet

I made a Python program using tkinter and pandas to select rows and send them by email.
The program let the user decides on which excel file wants to operate;
then asks on which sheet of that file you want to operate;
then it asks how many rows you want to select (using .tail function);
then the program is supposed to iterate through rows and read from a cell (within selected rows) the email address;
then it sends the correct row to the correct address.
I'm stuck at the iteration process.
Here's the code:
import pandas as pd
import smtplib
def invio_mail(my_tailed__df): #the function imports the sliced (.tail) dataframe
gmail_user = '###'
gmail_password = '###'
sent_from = gmail_user
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.ehlo()
server.login(gmail_user, gmail_password)
list = my_tailed_df
customers = list['CUSTOMER']
wrs = list['WR']
phones = list['PHONE']
streets = list['STREET']
cities = list["CITY"]
mobiles = list['MOBILE']
mobiles2 = list['MOBILE2']
mails = list['INST']
for i in range(len(mails)):
customer = customers[i]
wr = wrs[i]
phone = phones[i]
street = streets[i]
city = cities[i]
mobile = mobiles[i]
mobile2 = mobiles2[i]
"""
for i in range(len(mails)):
if mails[i] == "T138":
mail = "email_1#gmail.com"
elif mails[i] == "T139":
mail = "email_2#gmail.com"
"""
subject = 'My subject'
body = f"There you go" \n Name: {customer} \n WR: {wr} \n Phone: {phone} \n Street: {street} \n City: {city} \n Mobile: {mobile} \n Mobile2: {mobile2}"
email_text = """\
From: %s
To: %s
Subject: %s
%s
""" % (sent_from, ", ".join(mail), subject, body)
try:
server.sendmail(sent_from, [mail], email_text)
server.close()
print('Your email was sent!')
except:
print("Some error")
The program raises a KeyError: 0 after it enters the for loop, on the first line inside the loop: customer = customers[i]
I know that the commented part (the nested for loop) will raise the same error.
I'm banging my head on the wall, I think i've read and tried everything.
Where's my error?

Things start to go wrong here: list = my_tailed_df. In Python list() is a Built-in Type.
However, with list = my_tailed_df, you are overwriting the type. You can check this:
# before the line:
print(type(list))
<class 'type'>
list = my_tailed_df
# after the line:
print(type(list))
<class 'pandas.core.frame.DataFrame'> # assuming that your df is an actual df!
This is bad practice and adds no functional gain at the expense of confusion. E.g. customers = list['CUSTOMER'] is doing the exact same thing as would customers = my_tailed_df['CUSTOMER'], namely creating a pd.Series with the index from my_tailed_df. So, first thing to do, is to get rid of list = my_tailed_df and to change all those list[...] snippets into my_tailed_df[...].
Next, let's look at your error. for i in range(len(mails)): generates i = 0, 1, ..., len(mails)-1. Hence, what you are trying to do, is access the pd.Series at the index 0, 1 etc. If you get the error KeyError: 0, this must simply mean that the index of your original df does not contain this key in the index (e.g. it's a list of IDs or something).
If you don't need the original index (as seems to be the case), you could remedy the situation by resetting the index:
my_tailed_df.reset_index(drop=True, inplace=True)
print(my_tailed_df.index)
# will get you: RangeIndex(start=0, stop=x, step=1)
# where x = len(my_tailed_df)-1 (== len(mails)-1)
Implement the reset before the line customers = my_tailed_df['CUSTOMER'] (so, instead of list = my_tailed_df), and you should be good to go.
Alternatively, you could keep the original index and change for i in range(len(mails)): into for i in mails.index:.
Finally, you could also do for idx, element in enumerate(mails.index): if you want to keep track both of the position of the index element (idx) and its value (element).

Weird sequence when writing into txt document

Hi everyone this is my first time here, and I am a beginner in Python. I am in the middle of writing a program that returns a txt document containing information about a stock (Watchlist Info.txt), based on the input of another txt document containing the company names (Watchlist).
To achieve this, I have written 3 functions, of which 2 functions reuters_ticker() and stock_price() are completed as shown below:
def reuters_ticker(desired_search):
#from company name execute google search for and return reuters stock ticker
try:
from googlesearch import search
except ImportError:
print('No module named google found')
query = desired_search + ' reuters'
for j in search(query, tld="com.sg", num=1, stop=1, pause=2):
result = j
ticker = re.search(r'\w+\.\w+$', result)
return ticker.group()
Stock Price:
def stock_price(company, doc=None):
ticker = reuters_ticker(company)
request = 'https://www.reuters.com/companies/' + ticker
raw_main = pd.read_html(request)
data1 = raw_main[0]
data1.set_index(0, inplace=True)
data1 = data1.transpose()
data2 = raw_main[1]
data2.set_index(0, inplace=True)
data2 = data2.transpose()
stock_info = pd.concat([data1,data2], axis=1)
if doc == None:
print(company + '\n')
print('Previous Close: ' + str(stock_info['Previous Close'][1]))
print('Forward PE: ' + str(stock_info['Forward P/E'][1]))
print('Div Yield(%): ' + str(stock_info['Dividend (Yield %)'][1]))
else:
from datetime import date
with open(doc, 'a') as output:
output.write(date.today().strftime('%d/%m/%y') + '\t' + str(stock_info['Previous Close'][1]) + '\t' + str(stock_info['Forward P/E'][1]) + '\t' + '\t' + str(stock_info['Dividend (Yield %)'][1]) + '\n')
output.close()
The 3rd function, watchlist_report(), is where I am getting problems with writing the information in the format as desired.
def watchlist_report(watchlist):
with open(watchlist, 'r') as companies, open('Watchlist Info.txt', 'a') as output:
searches = companies.read()
x = searches.split('\n')
for i in x:
output.write(i + ':\n')
stock_price(i, doc='Watchlist Info.txt')
output.write('\n')
When I run watchlist_report('Watchlist.txt'), where Watchlist.txt contains 'Apple' and 'Facebook' each on new lines, my output is this:
26/04/20 275.03 22.26 1.12
26/04/20 185.13 24.72 --
Apple:
Facebook:
Instead of what I want and would expect based on the code I have written in watchlist_report():
Apple:
26/04/20 275.03 22.26 1.12
Facebook:
26/04/20 185.13 24.72 --
Therefore, my questions are:
1) Why is my output formatted this way?
2) Which part of my code do I have to change to make the written output in my desired format?
Any other suggestions about how I can clean my code and any libraries I can use to make my code nicer are also appreciated!

You handle two different file-handles - the file-handle inside your watchlist_report gets closed earlier so its being written first, before the outer functions file-handle gets closed, flushed and written.
Instead of creating a new open(..) in your function, pass the current file handle:
def watchlist_report(watchlist):
with open(watchlist, 'r') as companies, open('Watchlist Info.txt', 'a') as output:
searches = companies.read()
x = searches.split('\n')
for i in x:
output.write(i + ':\n')
stock_price(i, doc = output) # pass the file handle
output.write('\n')
Inside def stock_price(company, doc=None): use the provided filehandle:
def stock_price(company, output = None): # changed name here
# [snip] - removed unrelated code for this answer for brevity sake
if output is None: # check for None using IS
print( ... ) # print whatever you like here
else:
from datetime import date
output.write( .... ) # write whatever you want it to write
# output.close() # do not close, the outer function does this
Do not close the file handle in the inner function, the context handling with(..) of the outer function does that for you.
The main takeaway for file handling is that things you write(..) to your file are not neccessarily placed there immediately. The filehandler chooses when to actually persist data to your disk, the latests it does that is when it goes out of scope (of the context handler) or when its internal buffer reaches some threshold so it "thinks" it is now prudent to alter to data on your disc. See How often does python flush to a file? for more infos.

Parsing already parsed results with BeautifulSoup

I have a question with using python and beautifulsoup.
My end result program basically fills out a form on a website and brings me back the results which I will eventually output to an lxml file. I'll be taking the results from https://interactive.web.insurance.ca.gov/survey/survey?type=homeownerSurvey&event=HOMEOWNERS and I want to get a list for every city all into some excel documents.
Here is my code, I put it on pastebin:
http://pastebin.com/bZJfMp2N
MY RESULTS ARE ALMOST GOOD :D except now I'm getting            355 for my "correct value" instead of 355, for example. I want to parse that and only show the number, you will see when you put this into python.
However, anything I have tried does NOT work, there is no way I can parse that values_2 variable because the results are in bs4.element.resultset when I think i need to parse a string. Sorry if I am a noob, I am still learning and have worked very long on this program.
Would anyone have any input? Anything would be appreciated! I've read up that my results are in a list or something and i can't parse lists? How would I go about doing this?
Here is the code:
__author__ = 'kennytruong'
#THE PROBLEM HERE IS TO PARSE THE RESULTS PROPERLY!!
import urllib.parse, urllib.request
import re
from bs4 import BeautifulSoup
URL = "https://interactive.web.insurance.ca.gov/survey/survey?type=homeownerSurvey&event=HOMEOWNERS"
#Goes through these locations, strips the whitespace in the string and creates a list that starts at every new line
LOCATIONS = '''
ALAMEDA ALAMEDA
'''.strip().split('\n') #strip() basically removes whitespaces
print('Available locations to choose from:', LOCATIONS)
INSURANCE_TYPES = '''
HOMEOWNERS,CONDOMINIUM,MOBILEHOME,RENTERS,EARTHQUAKE - Single Family,EARTHQUAKE - Condominium,EARTHQUAKE - Mobilehome,EARTHQUAKE - Renters
'''.strip().split(',') #strips the whitespaces and starts a newline of the list every comma
print('Available insurance types to choose from:', INSURANCE_TYPES)
COVERAGE_AMOUNTS = '''
15000,25000,35000,50000,75000,100000,150000,200000,250000,300000,400000,500000,750000
'''.strip().split(',')
print('All options for coverage amounts:', COVERAGE_AMOUNTS)
HOME_AGE = '''
New,1-3 Years,4-6 Years,7-15 Years,16-25 Years,26-40 Years,41-70 Years
'''.strip().split(',')
print('All Home Age Options:', HOME_AGE)
def get_premiums(location, coverage_type, coverage_amt, home_age):
formEntries = {'location':location,
'coverageType':coverage_type,
'coverageAmount':coverage_amt,
'homeAge':home_age}
inputData = urllib.parse.urlencode(formEntries)
inputData = inputData.encode('utf-8')
request = urllib.request.Request(URL, inputData)
response = urllib.request.urlopen(request)
responseData = response.read()
soup = BeautifulSoup(responseData, "html.parser")
parseResults = soup.find_all('tr', {'valign':'top'})
for eachthing in parseResults:
parse_me = eachthing.text
name = re.findall(r'[A-z].+', parse_me) #find me all the words that start with a cap, as many and it doesn't matter what kind.
# the . for any character and + to signify 1 or more of it.
values = re.findall(r'\d{1,10}', parse_me) #find me any digits, however many #'s long as long as btwn 1 and 10
values_2 = eachthing.find_all('div', {'align':'right'})
print('raw code for this part:\n' ,eachthing, '\n')
print('here is the name: ', name[0], values)
print('stuff on sheet 1- company name:', name[0], '- Premium Price:', values[0], '- Deductible', values[1])
print('but here is the correct values - ', values_2) #NEEDA STRIP THESE VALUES
# print(type(values_2)) DOING SO GIVES ME <class 'bs4.element.ResultSet'>, NEEDA PARSE bs4.element type
# values_3 = re.split(r'\d', values_2)
# print(values_3) ANYTHING LIKE THIS WILL NOT WORK BECAUSE I BELIEVE RESULTS ARENT STRING
print('\n\n')
def main():
for location in LOCATIONS: #seems to be looping the variable location in LOCATIONS - each location is one area
print('Here are the options that you selected: ', location, "HOMEOWNERS", "150000", "New", '\n\n')
get_premiums(location, "HOMEOWNERS", "150000", "New") #calls function get_premiums and passes parameters
if __name__ == "__main__": #this basically prevents all the indent level 0 code from getting executed, because otherwise the indent level 0 code gets executed regardless upon opening
main()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Why is my code getting the wrong part of the txt string? - python

Related

Filling form fields in Word automatically with pywin32

How to filter an input to search a database in python

Loop through a spreadsheet

Weird sequence when writing into txt document

Parsing already parsed results with BeautifulSoup

Categories

Resources