Appending data to an existing DataFrame with pandas

I have been trying to add information to an Excel file with pandas by appending rows, but I can't get it to work. The information comes from user input.
Every time the script runs, the new data overwrites what was already in the Excel sheet instead of being appended as a new row.
FirstName = input('What is your First Name? \n')
LastName = input('What is Your Last Name? \n')
ageCustomer = int(input('What is your current Age? \n'))
genderCustomer = input('What is your biological assigned gender? \n')
socialCustomer = int(input('What is your Social Security Number or ITIN? (Must Be 6 digits) \n'))
bdDayCustomer = int(input('What is the day of your birthday? (Enter Just Number) \n'))
bdMonthCustomer = int(input('What is the month of your birthday? (Enter Just Number) \n'))
bdYearCustomer = int(input('What is the year of your birthday? (Must be 4 digits) \n'))
InAmountCustomer = int(input('What is the Initial Amount Being deposited? \n'))
df1 = pd.DataFrame({'FirstN':[''],
'LastN':[''],
'Age':[0],
'Gender':[''],
'SSN':[0],
'bdDay':[0],
'bdMonth':[0],
'bdyear':[0],
'InAmount':[0],
})
row_to_add = pd.DataFrame({'FirstN':[FirstName],
'LastN':[LastName],
'Age':[ageCustomer],
'Gender':[genderCustomer],
'SSN':[socialCustomer],
'bdDay':[bdDayCustomer],
'bdMonth':[bdMonthCustomer],
'bdyear':[bdYearCustomer],
'InAmount':[InAmountCustomer],
})
df_final = df1.append(row_to_add, ignore_index=True)
writer = pd.ExcelWriter('CustomerInfo.xlsx')
df_final.to_excel(writer)
writer.save()
print(df_final)
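
One possible reason for the overwrite is that the script never reads the existing workbook: it always starts from a fresh placeholder frame and writes that out. A minimal sketch of a read-append-write cycle follows. The helper names `append_row` and `save_customer` are hypothetical, and the sketch assumes the workbook has a single sheet with the same columns as the new row. It uses `pd.concat` because `DataFrame.append` was removed in pandas 2.0.

```python
from pathlib import Path

import pandas as pd


def append_row(existing: pd.DataFrame, row: dict) -> pd.DataFrame:
    """Return a new frame with `row` added as the last row (hypothetical helper)."""
    new_row = pd.DataFrame([row])
    if existing.empty:
        # Nothing stored yet: the new row is the whole table.
        return new_row
    # pd.concat takes over the role of the removed DataFrame.append.
    return pd.concat([existing, new_row], ignore_index=True)


def save_customer(path: str, row: dict) -> pd.DataFrame:
    """Load the workbook if it exists, append `row`, and write it back."""
    if Path(path).exists():
        existing = pd.read_excel(path)
    else:
        existing = pd.DataFrame(columns=list(row))
    updated = append_row(existing, row)
    # index=False keeps the row index from being written as an extra column.
    updated.to_excel(path, index=False)
    return updated
```

With this in place, the end of the script could call `save_customer('CustomerInfo.xlsx', {'FirstN': FirstName, 'LastN': LastName, ...})` instead of building `df1`. Reading and writing `.xlsx` files requires an Excel engine such as openpyxl to be installed.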

Related

Looking for any matching terms from file

I have a file containing a large list of countries, years, and life expectancies. I cannot figure out how to make sure the user can only enter a year that actually exists in the file. Once that works, I need to retrieve only the rows for that year (with the corresponding country name, code, and life expectancy). How can I do this?
import pathlib
cwd = pathlib.Path(__file__).parent.resolve()
data_file = f'{cwd}/life-expectancy.csv'
with open(data_file) as f:
    while True:
        user_year = input('Enter the year of interest: ')
        for lines in f:
            cat = lines.strip().split(',')
            country = cat[0]
            code = cat[1]
            year = cat[2]
            age = cat[3]
            if any( [year in user_year for year in cat[2]] ):
                print(f'Your year is {user_year}. That is one of our known years.')
                print(year)
                print()
                continue
            else:
                print('Please enter a valid year (1751-2019)')
        print('test')

Solution 1
If every year from 1751 to 2019 is in your file, you don't need to read the file to validate the input; a range check is enough:
# Ask the user for the year
prompt_text = "Enter the year of interest: "
user_year = int(input(prompt_text))
while not 1751 <= user_year <= 2019:
    print("Please enter a valid year (1751-2019)")
    user_year = int(input(prompt_text))
After that, you can read the file and keep only the rows whose year matches:
# Get the data for the asked year
# Example of final data: [("France", "FR", 45), ("Espagne", "ES", 29)]
data = []
with open(data_file, "r", encoding="utf-8") as file:
    for line in file:
        country, code, year, age = line.strip().split(",")
        if int(year) == user_year:
            data.append((country, code, int(age)))
Solution 2
If you really need to check the year against the file, e.g. because 1845 is missing from it, read the file once, store all the data in a dictionary keyed by year, and return the data for the requested year if it is present:
data = {}
with open(data_file, "r", encoding="utf-8") as file:
    for line in file:
        country, code, year, age = line.strip().split(",")
        year = int(year)
        if year in data:
            data[year].append((country, code, int(age)))
        else:
            data[year] = [(country, code, int(age))]
prompt_text = "Enter the year of interest: "
user_year = int(input(prompt_text))
while user_year not in data:
    print("The year is not present in the file")
    user_year = int(input(prompt_text))
print(data[user_year])

One could use DataFrames to handle such cases. For more information on DataFrames, see the pandas.DataFrame documentation.
To select specific columns from a DataFrame: df[[<col_1>, <col_2>]]
Applied to a dataset like this one, that could look like the following.
import pandas as pd
df = pd.read_csv("Life Expectancy Data.csv")
year = int(input("Enter the year of interest: "))
df = df[["Country", "Year", "Life expectancy "]]
if year in df["Year"].values:
    print(f'Your year is {year}. That is one of our known years.')
    # display() is available in Jupyter/IPython; use print() in a plain script
    display(df.loc[df["Year"] == year])
else:
    print("Please enter a valid year (2000-2015)")

Your question includes two questions.
1. Question and answer
I cannot figure out how to make sure the user is only allowed to
input a year that actually exists.
Your range of accepted years is 1751-2019. You could create a list of these integers and check that the user input is in it. Note that range() excludes its stop value, so the stop must be 2020 for 2019 to be included. E.g.
allowed_answers = list(range(1751, 2020))
There are multiple ways to check the user input, and the one to use depends on how you want the user interaction to work. Here are a few examples:
1. Stop the program immediately if the user input is invalid
user_year = int(input('Enter the year of interest: '))
allowed_answers = list(range(1751, 2020))
assert user_year in allowed_answers, "User input is invalid"
...
2. Ask the user for input until it is accepted
allowed_answers = list(range(1751, 2020))
user_year = 0
while int(user_year) not in allowed_answers:
    print('Please enter a valid year (1751-2019)')
    user_year = input('Enter the year of interest: ')
3. Combine the two solutions to limit the number of prompts
allowed_answers = list(range(1751, 2020))
user_year = 0
for i in range(0,5):
    print('Please enter a valid year (1751-2019)')
    user_year = input('Enter the year of interest: ')
    if int(user_year) in allowed_answers:
        input_valid = True
        break
    else:
        input_valid = False
assert input_valid, "No correct input after five tries."
Note that all of these solutions only handle input that can be converted to an integer. To handle other input, you would need try... except clauses around the string-to-integer conversion, or you could convert the items of allowed_answers to strings.
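
As a rough sketch of the try... except idea, the conversion and the range check can live in one small function that returns None for anything unusable. The names `parse_year` and `ask_year` are made up for this illustration, and the 1751-2019 bounds are taken from the question.

```python
from typing import Optional


def parse_year(text: str, first: int = 1751, last: int = 2019) -> Optional[int]:
    """Return the year as an int, or None if `text` is not a valid year."""
    try:
        year = int(text.strip())
    except ValueError:
        # Raised for input such as "abc", "" or "18.5".
        return None
    return year if first <= year <= last else None


def ask_year(max_tries: int = 5) -> int:
    """Prompt until a valid year is entered, giving up after `max_tries`."""
    for _ in range(max_tries):
        year = parse_year(input("Enter the year of interest: "))
        if year is not None:
            return year
        print("Please enter a valid year (1751-2019)")
    raise ValueError(f"No valid input after {max_tries} tries.")
```

Keeping `input()` out of `parse_year` makes the validation easy to check on its own, and raising an exception instead of using `assert` keeps the limit in force even when Python runs with assertions disabled.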
2. Question and answer
After figuring this out, I will need to call only those years (with corresponding country name, code, and living expectancies. How can I do this?
I would read the file only once and turn it into a dictionary. Then you only need to build the index once and can look years up in it for as long as your program is running. See https://docs.python.org/3/tutorial/datastructures.html#dictionaries .
With these suggestions, I would read the data and build the dictionary outside (and before) your while loop.

Pass a list to a pandas query via user input

I have this DataFrame of drug info and would like to filter the data based on user input.
If I explicitly state the search as follows, the results are OK:
df = pd.read_excel("safety.xlsx")
search = ['Lisinopril', 'Perindopril']
print(df.query('`Drug Name` in @search'))
But if I take the search from user input, I can only enter a single drug name without error:
while True:
    search = input("Enter drug name...")
    print(df.query('`Drug Name` in @search'))
    if search == "exit":
        break
So I would like the user to be able to enter a list of drugs, not one at a time. If I enter Lisinopril, Perindopril the result is 'Empty DataFrame'
Terminal:
Enter drug name...Lisinopril
Drug Name U&E C
0 Lisinopril Before commencing, at 1-2 weeks after starting... NaN
Enter drug name...Lisinopril, Perindopril
Empty DataFrame
Columns: [Drug Name, U&E, C]
Index: []
Thanks for any help!

If you would like to match against several drug names, collect the user input in a list:
search = []
user_input = input("Drug name: ")
search.append(user_input)
print(df.query('`Drug Name` in @search'))
If you would like to query a single drug at a time, use the == notation in the query:
user_input = input("Drug name: ")
subframe = df.query('`Drug Name` == @user_input')
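
The terminal output in the question suggests that "Lisinopril, Perindopril" is compared as one single string, which matches no row. A minimal sketch of one way around that is to split the input on commas before querying. The helper names `parse_names` and `filter_drugs` are hypothetical; the column name `Drug Name` comes from the question.

```python
from typing import List

import pandas as pd


def parse_names(raw: str) -> List[str]:
    """Split 'A, B ,C' into ['A', 'B', 'C'], dropping empty entries."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def filter_drugs(df: pd.DataFrame, raw: str) -> pd.DataFrame:
    """Return the rows whose drug name is one of the comma-separated names."""
    names = parse_names(raw)
    # `@names` refers to the local Python variable inside the query string;
    # the backticks are needed because the column name contains a space.
    return df.query("`Drug Name` in @names")
```

Inside the question's loop this would become `print(filter_drugs(df, search))`, with the `"exit"` check moved before the query so that the exit command is not itself searched for.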

How to correctly replace data in text files based on input

I have the following problem. I'm trying to replace the name based on the gender that is entered. If anyone could help improve my code, it would be really appreciated.
The text file (duedate.txt):
User: Tommy
Gender: Male
Date due: 2020-02-18
The code I have so far is:
with open f = ('duedate.txt).read()
z = input("Please select gender to change)
zz = input("Please select new name")
if z == 'female'
line.startswith('User'):
field, value = line.split(:)
value = zz
print (zz)
I know the code isn't 100% right but the output, if Jessica was chosen as the name, should be:
User: Jessica
Gender: Female
Date due: 2020-02-18

This should work. Code explanation is given in the comments:
import pandas as pd
import numpy as np
# Read the text file into a one-column dataframe, one row per line
# (newer pandas versions reject read_csv(..., sep="\n"))
with open('duedate.txt') as f:
    df = pd.DataFrame(f.read().splitlines())
# Split each line into a field name and its value
df[['Variable', 'Value']] = df[0].str.split(': ', n=1, expand=True)
del df[0]
# Collect inputs from user:
z = input("Please select gender to change")
zz = input("Please select new name")
# Modify dataframe based on user inputs
df.loc[0, "Value"] = zz
df.loc[1, "Value"] = z
# Construct output column (savetxt adds the newline after each entry)
df["Output"] = df["Variable"] + ": " + df["Value"]
# Save the file back to disk
np.savetxt('duedate.txt', df["Output"].values, fmt='%s')
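
For a three-line file, the same result can also be reached without pandas. The following is only a sketch: the function names `rename_user` and `update_file` are hypothetical, and it assumes the intent is to set both the User and the Gender field, as in the expected output in the question.

```python
def rename_user(lines, gender, new_name):
    """Return a copy of `lines` with the User and Gender values replaced."""
    updated = []
    for line in lines:
        field, sep, value = line.partition(": ")
        if not sep:
            # Not a "Field: value" line; keep it unchanged.
            updated.append(line)
            continue
        if field == "User":
            value = new_name
        elif field == "Gender":
            value = gender.capitalize()
        updated.append(f"{field}: {value}")
    return updated


def update_file(path, gender, new_name):
    """Rewrite the file at `path` with the new name and gender."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(rename_user(lines, gender, new_name)) + "\n")
```

A call such as `update_file('duedate.txt', z, zz)` after the two `input()` prompts would then rewrite the file in place.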

User input to control output

data['Year'] = input("Select a Year: ")
data['Month'] = input("Select a Month: ")
grouping = data.groupby(["Year", "Month"])
monthly_averages = grouping.aggregate({"Value":np.mean})
print(monthly_averages)
Guys, I'm trying to pick a year and a month, then show the mean value for that month. The last 3 lines alone show the average for every year and month, but I want to be able to select one. New to Python, not sure how to apply the choice to the grouping.

Do you have an example table you can show us? Something like this should work, but I can't test it without an example. I'd recommend reading up on the loc method.
# int() assumes the Year and Month columns hold numbers
year = int(input('Select a year: '))
month = int(input('Select a month: '))
# Build the boolean mask from the column, not from the string 'Year'
df2 = data.loc[data['Year'] == year]
df3 = df2.loc[df2['Month'] == month]
grouping = df3.groupby(["Year", "Month"])
monthly_averages = grouping.aggregate({"Value": "mean"})
print(monthly_averages)

I don't think you need grouping if you're picking one month and one year:
df.loc[(df['Year'] == input_year) & (df['Month'] == input_month), 'Value'].mean()
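
A small sketch of the mask-based selection, wrapped in a function so it can be tried on a toy table. The function name `monthly_mean` is hypothetical, and the column names `Year`, `Month` and `Value` are taken from the question; the sketch assumes all three columns are numeric.

```python
import pandas as pd


def monthly_mean(data: pd.DataFrame, year: int, month: int) -> float:
    """Mean of Value over the rows matching the given year and month."""
    mask = (data["Year"] == year) & (data["Month"] == month)
    # The mean of an empty selection is NaN, so an unknown year/month
    # combination yields NaN rather than raising an error.
    return data.loc[mask, "Value"].mean()
```

Because `input()` returns strings, the values would need converting first, e.g. `monthly_mean(data, int(input("Select a Year: ")), int(input("Select a Month: ")))`.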

Error in Code, unable to process based on user parameters (Python 3)

I am new to Python and working on a project for school. I need to find a similar user profile based on the dataset and user inputs. Essentially, a user enters his/her information, and I would like to assign the grade and interest rate of a similar existing applicant in the DataFrame. However, I am failing miserably. Could someone please help?
# loading data
df = pd.read_csv("LoanStats_2017Q1.csv")
df = df[["verification_status","loan_amnt", "term","grade","int_rate","dti","delinq_amnt","annual_inc", "emp_length" ]]
loan = int(input("What loan amount are you looking to obtain? "))
inc = int(input("What is your annual income? "))
dti = int(input("What is your current Debt-to-Equity Ratio (DTI)? "))
lst=[loan,inc,dti]
# similar user is someone within 1% of potential applicant
def simUser(a,b):
    return (abs(a-b)/b) <=0.01
if dti > 35:
    print('\n'+"Your Debt-to-Income Ratio is too High. Please lower it before proceeding.")
else:
    print('\n'+"analyzing..."+'\n')
    #create dataframe for analysis
    columns = ["loan_amnt","annual_inc","dti"]
    lst1 = list(zip(lst,columns))
    #go over the loan grades and interest rates
    for rowNum in range(len(df)):
        lamnt = df.iloc[rowNum]['loan_amnt']
        ainc = df.iloc[rowNum]['annual_inc']
        dtiu = df.iloc[rowNum]['dti']
        #scan data for a similar loan profile
        for user_input,col in lst1:
            lst2 = df[simUser(df.iloc[rowNum][col],user_input) == True]
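
The inner loop indexes `df` with a single True/False value instead of a per-row mask, which is probably where this goes wrong. One possible approach, sketched here under the assumption that the three compared columns are numeric, is to build one boolean mask over the whole frame and combine it column by column. The function name `similar_applicants` is hypothetical; the 1% tolerance mirrors `simUser` from the question.

```python
import pandas as pd


def similar_applicants(df: pd.DataFrame, applicant: dict, tol: float = 0.01) -> pd.DataFrame:
    """Rows within a relative tolerance `tol` of the applicant in every given column."""
    mask = pd.Series(True, index=df.index)
    for col, value in applicant.items():
        # Same test as simUser, |a - b| / b <= tol, rearranged so that
        # a zero input value does not cause a division by zero.
        mask &= (df[col] - value).abs() <= tol * abs(value)
    return df[mask]
```

It could be called as `similar_applicants(df, {"loan_amnt": loan, "annual_inc": inc, "dti": dti})`, after which the `grade` and `int_rate` columns of the result hold the candidates to assign. If no row matches, the result is simply empty.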
