Using a list to call pandas columns - python

I read in a CSV file
times = pd.read_csv("times.csv",header=0)
times.columns.values
The column names are in a list
titles = ['case', 'num_gen', 'year']
The real titles are much longer and more complex, but they are truncated here for simplicity's sake.
I want to call an index of a column of times using an index from titles.
My attempt is:
times.titles[2][0]
This is to try to get the effect of:
times.year[0]
I need to do this because there are 75 columns that I need to call in a loop, so I cannot type out each column name as in the line above.
Any ideas on how to accomplish this?

I think you need to use .iloc; let's look at the pandas docs on selection by position:
times.iloc[2, 0]  # returns the third row, first column; the indexes are zero-based.
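If the goal is to index columns by name from the titles list (rather than by position), plain bracket indexing also works; a minimal sketch with made-up data standing in for times.csv:

```python
import pandas as pd

# Hypothetical data standing in for the real times.csv
times = pd.DataFrame({'case': [1, 2], 'num_gen': [3, 4], 'year': [2010, 2011]})
titles = ['case', 'num_gen', 'year']

# Bracket indexing accepts a column name held in a variable,
# so titles[2] selects the 'year' column, and [0] its first row:
first_year = times[titles[2]][0]

# This makes looping over all 75 columns straightforward:
for name in titles:
    col = times[name]
```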

Related

Is there a built in function in pandas that does the following?

I have two dataframes, df1 and df2. They share a column with common values, but in df1['comon_values_column'] every value appears only once, while in df2['comon_values_column'] each value can appear more than once. I want to see if I can do the following in a single line and without a loop:
for value in df2['comon_values_column']:
    df2['empty_column'].loc[df2['comon_values_column']==value] = df1['other_column'].loc[df1['comon_values_column']==value]
I have tried to use merge, but because of the size of the dataframes it is very difficult to make sure that it does exactly what I want.
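One loop-free way to express this (a sketch with invented data; the real frames are much larger) is to build a lookup Series from df1 and `map` it onto df2's key column:

```python
import pandas as pd

# Hypothetical frames: each value in df1's key column appears once,
# while values in df2's key column may repeat.
df1 = pd.DataFrame({'comon_values_column': ['a', 'b', 'c'],
                    'other_column': [10, 20, 30]})
df2 = pd.DataFrame({'comon_values_column': ['a', 'a', 'b', 'c', 'c']})

# Build a Series mapping key -> other_column, then map it onto df2.
# This replaces the loop with a single vectorised lookup.
lookup = df1.set_index('comon_values_column')['other_column']
df2['empty_column'] = df2['comon_values_column'].map(lookup)
```

Unlike merge, map never changes the number of rows in df2, which makes the result easier to verify.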

How do I add every column in a pandas dataframe to a list except for the first column?

Normally, I would be able to call dataframe.columns for a list of all columns, but I don't want to include the very first column in my list. Writing each column manually is an option, but one I'd like to avoid, given the few hundred column headers I'm working with. I do need to use this column, though, so deleting it from the dataframe entirely wouldn't work. How can I put every column into a list except for the first one?
This should work:
list(df.columns[1:])
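A quick demonstration with a made-up frame (the leading ID column is an assumption for illustration):

```python
import pandas as pd

# Hypothetical frame with a leading ID column to exclude
df = pd.DataFrame({'id': [1, 2], 'a': [3, 4], 'b': [5, 6]})

# df.columns is an Index, so it can be sliced like any sequence;
# [1:] drops the first column name without touching the dataframe itself.
cols = list(df.columns[1:])
```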

Python: create new columns from rows based on multiple conditions

I've been poking around a bit and can't seem to find a close solution to this one:
I'm trying to transform a dataframe from this:
To this:
Such that remark_code_names with similar denial_amounts are provided new columns based on their corresponding har_id and reason_code_name.
I've tried a few things, including a groupby function, which gets me halfway there.
denials.groupby(['har_id','reason_code_name','denial_amount']).count().reset_index()
But this obviously leaves out the reason_code_names that I need.
Here's a minimal sample:
pd.DataFrame({'har_id':['A','A','A','A','A','A','A','A','A'],'reason_code_name':[16,16,16,16,16,16,16,22,22],
'remark_code_name':['MA04','N130','N341','N362','N517','N657','N95','MA04','N341'],
'denial_amount':[5402,8507,5402,8507,8507,8507,8507,5402,5402]})
Using groupby() is a good way to go. Use it along with transform() and overwrite the column named 'remark_code_name'. This solution puts all remark_code_names together in the same column.
denials['remark_code_name'] = denials.groupby(['har_id','reason_code_name','denial_amount'])['remark_code_name'].transform(lambda x : ' '.join(x))
denials.drop_duplicates(inplace=True)
If you really need to create each code in their own columns, you could apply another function and use .split(). However you will first need to set the number of columns depending on the max number of codes you find in a single row.
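Following up on that suggestion: the split step doesn't actually require counting codes in advance, since `str.split(expand=True)` pads shorter rows with NaN. A sketch using the question's own sample frame:

```python
import pandas as pd

# Sample data from the question
denials = pd.DataFrame({
    'har_id': ['A'] * 9,
    'reason_code_name': [16, 16, 16, 16, 16, 16, 16, 22, 22],
    'remark_code_name': ['MA04', 'N130', 'N341', 'N362', 'N517',
                         'N657', 'N95', 'MA04', 'N341'],
    'denial_amount': [5402, 8507, 5402, 8507, 8507, 8507, 8507, 5402, 5402],
})

# Join all codes per group into one string, then drop duplicate rows
keys = ['har_id', 'reason_code_name', 'denial_amount']
denials['remark_code_name'] = (denials.groupby(keys)['remark_code_name']
                               .transform(' '.join))
denials = denials.drop_duplicates()

# Split the joined codes into one column each; pandas sizes the result
# to the longest row and fills the rest with NaN automatically.
codes = denials['remark_code_name'].str.split(' ', expand=True)
```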

Pandas Dataframe delete row by repetition

I am looking to delete a row in a dataframe that is imported into python by pandas.
If you look at the sheet below, the first column has the same name multiple times. So the condition is: if the first column's value re-appears in a later row, delete that row; if not, keep that row in the dataframe.
My final output should look like the following:
Presently I am doing it by converting each column into a list and deleting rows by index values. I am hoping there is an easier way than this workaround.
df.drop_duplicates([df.columns[0]])
should do the trick.
Try the following code;
df.drop_duplicates(subset='columnName', keep='first', inplace=True)
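A small demonstration on invented data, combining both suggestions so the first column can be referenced positionally rather than by a typed-out name:

```python
import pandas as pd

# Hypothetical frame where the first column repeats
df = pd.DataFrame({'name': ['x', 'x', 'y'], 'val': [1, 2, 3]})

# Keep only the first row for each value of the first column;
# subset accepts the column label pulled from df.columns[0].
deduped = df.drop_duplicates(subset=df.columns[0], keep='first')
```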

Pandas Groupby Count Partial Strings

I am trying to get a count of how many rows within a column contain a partial string, based on an imported dataframe. In the sample data below, I want to group by Trans_type and then count how many rows contain a value.
So I would expect to see:
First, is this possible generically without passing a link to get each type's expected brand? If not, how could I pass, say, Car a list like .str.contains(['Audi','BMW'])?
Thanks for any help!
Try this one:
df.groupby([df["Trans_type"], df["Brand"].str.extract("([a-zA-Z]+)", expand=False)]).count()
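For the list-of-brands part of the question, `str.contains` takes a single regex pattern rather than a list, but joining the list with `|` gives an "or" match. A sketch on invented sample data in the spirit of the question:

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is imported
df = pd.DataFrame({'Trans_type': ['Car', 'Car', 'Truck'],
                   'Brand': ['Audi A4', 'Honda Civic', 'Volvo FH']})

# '|'.join turns the list into the regex pattern 'Audi|BMW',
# so the mask is True wherever Brand contains any listed string.
brands = ['Audi', 'BMW']
mask = df['Brand'].str.contains('|'.join(brands))

# Count matching rows per Trans_type
counts = df[mask].groupby('Trans_type').size()
```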
