Currently I ask the user for a column name and a row, but I don't know how to get the cell value out of the DataFrame.
import pandas as pd
data_column = input("what column do you want to choose")
print(data_column)
data_row = input("What row do you want to choose")
print(data_row)
I have tried iloc and loc, but neither returns the cell value.
You should be able to get the value of a specific cell with
data_table.iloc[data_row, data_column]
Keep in mind that iloc expects integer positions for both the row and the column; if the user types a column name, use loc (or data_table.columns.get_loc) instead. Also remember that
input("x")
returns a string, so you'd have to cast it to an int if you want to use it as a position:
data_column = int(input("what column do you want to choose"))
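A minimal end-to-end sketch; the DataFrame here is made up, since the post never shows data_table, and the hard-coded values stand in for the input() calls:

```python
import pandas as pd

# Hypothetical table standing in for data_table (the original post never shows it)
data_table = pd.DataFrame({"name": ["a", "b"], "score": [10, 20]})

# input() always returns strings; the row must be cast to int before indexing
data_row = int("1")      # stands in for int(input("What row do you want to choose"))
data_column = "score"    # stands in for input("what column do you want to choose")

# .loc takes labels: with the default RangeIndex the row label is the integer
# position, and the column label is its name
value = data_table.loc[data_row, data_column]
print(value)  # 20
```

If the user enters a column position rather than a name, data_table.iloc[data_row, int(data_column)] works instead.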
Currently I have the following Excel table that I am importing into pandas. I am trying to transpose the data so that Name 1, Name 2, etc. form the index column and the dates become the headers. Then I want to create a new table with this layout that selects, per name, the most recent data point based on the dates.
transposed_df = df.transpose()
print(transposed_df)
transposed_df.set_index('Name', inplace=True)
# Note: idxmax picks the column with the largest *value*, not the latest date,
# and DataFrame.lookup was removed in pandas 2.0
latest_date_col_index = transposed_df.idxmax(axis=1)
latest_data = [transposed_df.at[i, c] for i, c in latest_date_col_index.items()]
df_latest = pd.DataFrame(latest_data, index=transposed_df.index, columns=['Latest Data'])
print(df_latest)
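Since the screenshot is missing, here is a hedged sketch on a made-up layout (one Name column plus one column per date; the data values are invented) that takes, per name, the most recent non-null value without DataFrame.lookup, which was removed in pandas 2.0:

```python
import pandas as pd

# Made-up stand-in for the Excel sheet: one 'Name' column, one column per date
df = pd.DataFrame({
    "Name": ["Name 1", "Name 2"],
    "2024-01-01": [10.0, None],
    "2024-02-01": [12.0, 7.0],
    "2024-03-01": [None, 9.0],
})

wide = df.set_index("Name")
wide.columns = pd.to_datetime(wide.columns)  # turn the headers into real dates

# Sort the columns chronologically, forward-fill across each row, and keep the
# last column: that is the most recent non-null data point per name
df_latest = (
    wide.sort_index(axis=1)
        .ffill(axis=1)
        .iloc[:, -1]
        .rename("Latest Data")
        .to_frame()
)
print(df_latest)
```

Forward-filling row-wise means a name whose latest date is empty falls back to its most recent earlier value, which matches "most recent data point per name".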
I have the following data.
I'd like to select the biggest value in the 2nd column for each value in the 1st column.
For value 1 in the 1st column, the selected value should be 5.
The 1st column is a time (for example 06:54:11).
I can use MATLAB, Python, Excel, or bash.
Using Python, you can read your file (assuming it's an Excel file) into a pandas DataFrame, group by the first column, and take the maximum of the second column:
import pandas as pd
df = pd.read_excel('your_data.xlsx')
output = df.groupby('column1')['column2'].max()
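On made-up data shaped like the question's (the names column1/column2 are placeholders, since the real headers aren't shown), this looks like:

```python
import pandas as pd

# Placeholder data: first column is a time, second the value (names assumed)
df = pd.DataFrame({
    "column1": ["06:54:11", "06:54:11", "06:54:12"],
    "column2": [1, 5, 3],
})

# Biggest value in the 2nd column for each value in the 1st column
output = df.groupby("column1")["column2"].max()
print(output)
```

The result is a Series indexed by the first-column values, one maximum per distinct time.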
Using MATLAB you can get the maximum with the built-in max function.
Try [M,I] = max(data)
and replace data with your matrix name.
max works column-wise on a matrix, so M is a row vector of column maxima: M(2) is the maximum of the second column. With the index vector I you can grab the corresponding time out of the first column:
time = data(I(2),1)
import pandas as pd
# Raw strings keep the backslashes in the Windows paths from being read as escapes
SOheader_df = pd.read_excel(r"E:\My IDEA Documents\IDEA Projects\Sales_systemcontrols\Source Files.ILB\VBAK_SOheaderdata.XLSX")
KNVV_df = pd.read_excel(r"E:\My IDEA Documents\IDEA Projects\Sales_systemcontrols\Source Files.ILB\KNVV.xlsx", usecols=['Customer', 'Sales Organization', 'Distribution Channel', 'Division', 'Cust.Pric.Procedure', 'Acct Assmt Grp Cust.'])
# Change the data type of Sold to Party
KNVV_df['Customer'].astype(str).dtype
SOheader_df['Cust.Pric.Procedure'] = SOheader_df.merge(KNVV_df, how='inner', left_on='Sold-ToParty', right_on='Customer', indicator='True')
Sold-ToParty was int64 and Customer was object, so I changed the type, but the problem continues.
The error message suggests that the dtypes of the columns you are merging on differ.
You can cast the Customer column to int so the dtypes match. Note that astype returns a new column rather than changing it in place, so you need to re-assign the result to the original column:
KNVV_df['Customer'] = KNVV_df['Customer'].astype(int)
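A small self-contained sketch of the fix; the data is invented, and only the column names come from the post:

```python
import pandas as pd

SOheader_df = pd.DataFrame({"Sold-ToParty": [1001, 1002]})  # int64 key
KNVV_df = pd.DataFrame({"Customer": ["1001", "1002"],       # object (string) key
                        "Cust.Pric.Procedure": ["A", "B"]})

# Re-assign the cast so both merge keys share a dtype
KNVV_df["Customer"] = KNVV_df["Customer"].astype(int)

merged = SOheader_df.merge(KNVV_df, how="inner",
                           left_on="Sold-ToParty", right_on="Customer")
print(merged[["Sold-ToParty", "Cust.Pric.Procedure"]])
```

Without the re-assignment the merge still compares int64 against object and raises the same dtype error.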
Date,hrs,Count,Status
2018-01-02,4,15,SFZ
2018-01-03,5,16,ACZ
2018-01-04,3,14,SFZ
2018-01-05,5,15,SFZ
2018-01-06,5,18,ACZ
This is a fraction of the data I've been working on. The actual data is in the same format, with around 1000 entries per date. I take start_date and end_date as inputs from the user. In this case, consider:
start_date:2018-01-02
end_date:2018-01-06
So I have to display totals for hrs and Count within the selected date range. I also want to do it with an @app.callback in Dash (Plotly). Can someone help, please?
Use Series.between to build a boolean mask for the date range, select those rows and the two columns with DataFrame.loc, and then sum:
df = df.loc[df['Date'].between('2018-01-02','2018-01-06'), ['hrs','Count']].sum()
print(df)
hrs 22
Count 78
dtype: int64
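A runnable version on the sample rows from the question, parsing Date as real datetimes so the between comparison is robust rather than a string comparison:

```python
import pandas as pd
from io import StringIO

csv = """Date,hrs,Count,Status
2018-01-02,4,15,SFZ
2018-01-03,5,16,ACZ
2018-01-04,3,14,SFZ
2018-01-05,5,15,SFZ
2018-01-06,5,18,ACZ
"""
df = pd.read_csv(StringIO(csv), parse_dates=["Date"])

start_date, end_date = "2018-01-02", "2018-01-06"  # would come from user input
totals = df.loc[df["Date"].between(start_date, end_date), ["hrs", "Count"]].sum()
print(totals)  # hrs 22, Count 78
```

Inside a Dash callback you would compute totals the same way from the callback's start/end arguments and return the two numbers to an output component.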
I have a data frame with 1000 rows and 10 columns.
3 of these columns are 'total_2013', 'total_2014' and 'total_2015'
I would like to create a new column, containing the average of total over these 3 years for each row, but ignoring any 0 values.
If you are using pandas:
Use DataFrame.mean, which skips NaN values when skipna=True (the default).
First replace 0 with NaN, keeping the result; replace returns a new DataFrame rather than modifying in place, and np.nan needs import numpy as np:
columns = ['total_2013', 'total_2014', 'total_2015']
totals = df[columns].replace(0, np.nan)
Then compute the mean:
df["total"] = totals.mean(
    axis=1,      # mean across the three columns, row by row
    skipna=True  # ignore the NaNs that used to be 0
)
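Put together on a tiny made-up frame (only the three column names come from the question):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({"total_2013": [10, 0],
                   "total_2014": [20, 0],
                   "total_2015": [0, 6]})

columns = ["total_2013", "total_2014", "total_2015"]

# Turn 0 into NaN so mean() skips it (skipna defaults to True)
df["total"] = df[columns].replace(0, np.nan).mean(axis=1)
print(df["total"])  # row 0 -> 15.0 (mean of 10, 20); row 1 -> 6.0
```

One caveat: a row whose three totals are all 0 ends up with NaN, since there is nothing left to average.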