Python Transposing an excel file question: [closed] - python

Currently I have the following Excel table that I am importing into pandas. I am trying to transpose the data so that Name 1, Name 2, etc. form the index column and the dates become the headers. Afterwards, I want to create a new table with this layout that selects the most recent data point per name, based on the dates.
transposed_df = df.transpose()
print (transposed_df)
transposed_df.set_index('Name', inplace=True)
latest_date_col_index = df.idxmax(axis=1)
latest_data = df.lookup(df.index, latest_date_col_index)
df_latest = pd.DataFrame(latest_data, index=df.index, columns=['Latest Data'])
print(df_latest)
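The table itself is not shown, so the following is only a sketch under assumptions: it supposes the sheet has one row per date (in a column called Date) and one column per name, and it uses made-up values. Note also that DataFrame.lookup was removed in pandas 2.0, so this sketch forward-fills along each row and takes the last date column instead.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet: one row per date, one column per name.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2023-01-01", "2023-02-01", "2023-03-01"]),
        "Name 1": [10.0, 11.0, None],
        "Name 2": [20.0, None, None],
    }
)

# Names become the index and the dates become the column headers.
transposed_df = df.set_index("Date").T
transposed_df.index.name = "Name"

# Sort the date columns chronologically, carry the last known value forward
# along each row, then read the final column: the most recent non-missing value.
transposed_df = transposed_df.sort_index(axis=1)
latest = transposed_df.ffill(axis=1).iloc[:, -1]
df_latest = latest.to_frame("Latest Data")
print(df_latest)
```

If the real sheet is already laid out with names as rows, the transpose step would be dropped and only the sort, forward-fill and last-column steps would apply.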

Related

ValueError: You are trying to merge on object and int64 columns. Change the type to string but not yet resolved [closed]

import pandas as pd
SOheader_df = pd.read_excel(r"E:\My IDEA Documents\IDEA Projects\Sales_systemcontrols\Source Files.ILB\VBAK_SOheaderdata.XLSX")
KNVV_df = pd.read_excel(r"E:\My IDEA Documents\IDEA Projects\Sales_systemcontrols\Source Files.ILB\KNVV.xlsx", usecols=['Customer', 'Sales Organization', 'Distribution Channel', 'Division', 'Cust.Pric.Procedure', 'Acct Assmt Grp Cust.'])
# Change the data type of Sold to Party
KNVV_df['Customer'].astype(str).dtype
SOheader_df['Cust.Pric.Procedure'] = SOheader_df.merge(KNVV_df, how='inner', left_on='Sold-ToParty', right_on='Customer', indicator='True')
Sold-ToParty is int64 and Customer is object, so I changed the type, but the problem continues.
The error message indicates that the columns you are merging on have different dtypes.
You can try casting the Customer column to int so that the dtypes match. Note that astype returns a new Series, so to change a column's dtype you need to assign the cast column back to the original column.
KNVV_df['Customer'] = KNVV_df['Customer'].astype(int)
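A minimal sketch of the cast followed by the merge, using small made-up frames in place of the Excel files (the Doc column and all values are hypothetical). It assumes every Customer value is a clean integer string; astype(int) raises if any value is not.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets.
SOheader_df = pd.DataFrame(
    {"Sold-ToParty": [1001, 1002, 1003], "Doc": ["A", "B", "C"]}
)
KNVV_df = pd.DataFrame(
    {"Customer": ["1001", "1003"], "Cust.Pric.Procedure": ["1", "2"]}
)

# astype returns a new Series, so assign it back to change the column's dtype.
KNVV_df["Customer"] = KNVV_df["Customer"].astype(int)

# Both key columns are now integers, so the merge no longer raises ValueError.
merged = SOheader_df.merge(
    KNVV_df, how="inner", left_on="Sold-ToParty", right_on="Customer"
)
print(merged)
```

The merge returns a whole DataFrame, so it is kept in its own variable here rather than assigned to a single column.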

How do I get the sum of column from a csv within specified rows using dates in python? [closed]

Date,hrs,Count,Status
2018-01-02,4,15,SFZ
2018-01-03,5,16,ACZ
2018-01-04,3,14,SFZ
2018-01-05,5,15,SFZ
2018-01-06,5,18,ACZ
This is a small sample of the data I've been working on. The actual data is in the same format, with around 1000 entries for each date. I am taking start_date and end_date as inputs from the user. In this case, consider:
start_date:2018-01-02
end_date:2018-01-06
So I have to display the totals for hrs and Count within the selected date range in the output. I also want to do this using an @app.callback in Dash (plot.ly). Can someone help, please?
Use Series.between to build a boolean mask, filter the rows with DataFrame.loc while selecting the hrs and Count columns, and then sum:
df = df.loc[df['Date'].between('2018-01-02','2018-01-06'), ['hrs','Count']].sum()
print (df)
hrs 22
Count 78
dtype: int64
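To take the two dates from the user, the same filter can be wrapped in a function. This is only a sketch: totals_between is a hypothetical helper name, the sample rows are read from an in-memory string rather than a file, and the Dash callback wiring is left out.

```python
import io

import pandas as pd

CSV_TEXT = """Date,hrs,Count,Status
2018-01-02,4,15,SFZ
2018-01-03,5,16,ACZ
2018-01-04,3,14,SFZ
2018-01-05,5,15,SFZ
2018-01-06,5,18,ACZ
"""


def totals_between(df, start_date, end_date):
    """Sum hrs and Count for rows whose Date lies in [start_date, end_date]."""
    # Series.between includes both endpoints by default.
    mask = df["Date"].between(pd.Timestamp(start_date), pd.Timestamp(end_date))
    return df.loc[mask, ["hrs", "Count"]].sum()


# Parsing Date as datetime makes the comparison chronological rather than textual.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Date"])
totals = totals_between(df, "2018-01-02", "2018-01-06")
print(totals)
```

Inside a Dash callback, the function body would receive the two dates from the input components and return the totals in whatever form the output component expects.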

How to get cell value from pandas data frame [closed]

So currently, I ask the user for the column name input and the row input but I don't know how to get the cell value.
import pandas as pd
data_column = input("what column do you want to choose")
print(data_column)
data_row = input("What row do you want to choose")
print(data_row)
I have tried with iloc and loc but it doesn't return the cell value.
You should be able to get the value of a specific cell by using
data_table.iloc[data_row, data_column]
Remember that input("x") returns a string, so you have to cast it to an int if you want to use the variable directly with iloc.
data_column = int(input("what column do you want to choose"))
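A short sketch of both lookups, with a made-up data_table and two hypothetical helper functions. Strings are passed in directly here in place of the input() calls, so the example runs without prompting.

```python
import pandas as pd

# Hypothetical table; the question does not show the real one.
data_table = pd.DataFrame({"price": [1.5, 2.5, 3.5], "qty": [10, 20, 30]})


def cell_by_position(table, row_text, column_text):
    """Look up a cell when both inputs are positions typed as text."""
    return table.iloc[int(row_text), int(column_text)]


def cell_by_label(table, row_text, column_name):
    """Look up a cell by column name and row position typed as text."""
    return table[column_name].iloc[int(row_text)]


print(cell_by_position(data_table, "1", "0"))
print(cell_by_label(data_table, "2", "qty"))
```

The second helper matches the question more closely, since the user is asked for a column name rather than a column number.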

draw boxplot for data in a loop [closed]

I have a 1000×8 dataset in which each column holds the price of one stock at different times, so there are 8 stocks. I want to draw 8 boxplots, one per stock, in a loop in Python to examine the extreme values. Could you please tell me how I can do that?
As a quick alternative to using matplotlib directly, Pandas has a reasonable boxplot function that could be used.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(1000, 8), columns=list('ABCDEFGH'))
df.boxplot(column=list(df.columns))
Edit: I just realised your question asked to do this in a loop.
for c in df.columns:
    fig, ax = plt.subplots()  # one new figure per stock
    df.boxplot(column=c, ax=ax)
plt.show()
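To examine the extreme values numerically as well, the same loop can count the points that fall outside the usual 1.5 × IQR whisker range, which is matplotlib's default. This is a sketch on random data, and outlier_counts is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Random stand-in for the 1000 x 8 price table.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((1000, 8)), columns=list("ABCDEFGH"))

outlier_counts = {}
for c in df.columns:
    # Quartiles and interquartile range for this column.
    q1, q3 = df[c].quantile([0.25, 0.75])
    iqr = q3 - q1
    # Points beyond 1.5 * IQR from the quartiles, roughly what a boxplot draws as fliers.
    outside = (df[c] < q1 - 1.5 * iqr) | (df[c] > q3 + 1.5 * iqr)
    outlier_counts[c] = int(outside.sum())

print(outlier_counts)
```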

Calculating mean of each row, ignoring 0 values in python [closed]

I have a data frame with 1000 rows and 10 columns.
3 of these columns are 'total_2013', 'total_2014' and 'total_2015'
I would like to create a new column, containing the average of total over these 3 years for each row, but ignoring any 0 values.
If you are using pandas:
Use DataFrame.mean with its skipna parameter.
First replace 0 with NaN. replace returns a new object, so keep the result:
import numpy as np
columns = ['total_2013', 'total_2014', 'total_2015']
totals = df[columns].replace(0, np.nan)
Then compute the mean:
df["total"] = totals.mean(
    axis=1,  # mean across the columns, one value per row
    skipna=True  # skip NaN values (this is the default)
)
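A small worked example of the zero-skipping row mean, using made-up numbers. A row that contains only zeros has nothing left to average, so its result is NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical values for the three yearly totals.
df = pd.DataFrame(
    {
        "total_2013": [0, 5, 0],
        "total_2014": [10, 5, 0],
        "total_2015": [20, 5, 0],
    }
)
columns = ["total_2013", "total_2014", "total_2015"]

# Zeros become NaN, and mean(axis=1) skips NaN by default.
df["total"] = df[columns].replace(0, np.nan).mean(axis=1)
print(df)
```

Because the replacement is applied to a temporary copy, the original yearly columns keep their zeros.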
