Column in Pandas Series is appearing in a row above the rest - python

import pandas as pd
from IPython.display import display_html
df = pd.DataFrame({'Name': ['Mike','George', 'George']})
name_series = df.groupby("Name").size()
name_series.name = ''
name_dataframe = name_series.to_frame()
name_styler = name_dataframe.style.set_table_attributes("style='display:inline'")
display_html(name_styler.render(), raw = True)
I have this cell where I want to display a DataFrame inline so that I can place more DataFrames directly beside it, rather than displaying them one after another.
When I display it with the name set to blank, the data looks like this:
When I set the name to anything, the column name appears on the row above the rest of the table:
I checked my code on another website, and the tables all look like this if I don't set a column name, which suggests an extra row is being created for some reason:
How can I place the new column name and the existing column name on the same row?
edit:
After applying name_dataframe.reset_index(inplace=True)
edit 2:
Applying name_series.index.name=None gives:
Is there a way to combine answers to just show both titles and columns ONLY, "Name" and "Count" ?
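One possible way to get both headers onto a single row, sketched here with the same sample data as the question: let reset_index name the size column, then hide the numeric index when rendering. The helper name to_inline_html is hypothetical, and Styler.hide(axis="index") assumes pandas 1.4 or newer.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Mike", "George", "George"]})

# reset_index(name=...) turns the group labels into an ordinary "Name" column
# and names the size column "Count", so both headers sit on one row.
name_dataframe = df.groupby("Name").size().reset_index(name="Count")


def to_inline_html(frame):
    """Render a frame as an inline HTML table without the numeric index."""
    styler = frame.style.hide(axis="index")  # assumes pandas 1.4 or newer
    styler = styler.set_table_attributes("style='display:inline'")
    return styler.to_html()  # to_html() replaces the older render()
```

In a notebook, display_html(to_inline_html(name_dataframe), raw=True) should then show only the "Name" and "Count" columns.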

Related

Put a new column inside a Dataframe Python

The DataFrame that I am working on has a column called "Brand" that contains the value "SEAT " with trailing white space. I managed to strip the white space, but I don't know how to put the cleaned column back into the original DataFrame. I need to do this because I want to filter the original DataFrame by "SEAT" and show those rows.
I tried this:
import pandas as pd
brand_reviewed = df_csv2.Brand.str.rstrip()
brand_correct = 'SEAT'
brand_reviewed.loc[brand_reviewed['Brand'].isin(brand_correct)]
Thank you very much!
As far as I understand,
you're trying to return rows that match the pattern "SEAT".
You are not forced to create a new column. You can directly do the following:
df2 = df_csv2[df_csv2.Brand.str.rstrip() == "SEAT"]
print(df2)
You have done great. I will also mention another way to clean the white space. And if you just want to add a new column to your current DataFrame, the last line of this code does that.
import pandas as pd
brand_reviewed = pd.read_csv("df_csv2.csv")
data2 = brand_reviewed["Brand"].str.strip()
brand_reviewed["New Column"] = data2
If you have another query, let me know.
Octavio Velázquez
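To tie the two answers together, here is a minimal sketch that overwrites the "Brand" column in the original frame and then filters on it. The sample rows and the "Model" column are hypothetical stand-ins for the asker's CSV.

```python
import pandas as pd

# Hypothetical stand-in for df_csv2; the real frame comes from the asker's CSV.
df_csv2 = pd.DataFrame({
    "Brand": ["SEAT ", "FORD", "SEAT "],
    "Model": ["Ibiza", "Focus", "Leon"],
})

# Overwrite the column in place so the cleaned values live in the same frame.
df_csv2["Brand"] = df_csv2["Brand"].str.strip()

# isin() expects a list-like, so wrap the single brand in a list.
seat_rows = df_csv2[df_csv2["Brand"].isin(["SEAT"])]
print(seat_rows)
```

Passing a bare string to isin raises a TypeError, which is one reason the attempt in the question fails.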

Creating list from imported CSV file with pandas

I am trying to create a list from a CSV. This CSV contains a 2-dimensional table [540 rows and 8 columns], and I would like to create a list that contains the values of a specific column, column 4 to be specific.
I tried list(df.columns.values)[4]. It returns the name of the column, but I'm trying to get the values from the rows in column 4 and make them a list.
import pandas as pd
import urllib
#This is the empty list
company_name = []
#Uploading CSV file
df = pd.read_csv(r'Downloads\Dropped_Companies.csv')
#Extracting list of all companies name from column "Name of Stock"
companies_column=list(df.columns.values)[4] #This returns the name of the column.
companies_column = list(df.iloc[:,4].values)
So for this you can just add the following line after the code you've posted (with companies_column still holding the column name, not the list of values):
company_name = df[companies_column].tolist()
This will get the data in the companies column as a pandas Series (essentially a Series is just a fancy list) and then convert it to a regular Python list.
Or, if you were to start from scratch, you can also just use these two lines
import pandas as pd
df = pd.read_csv(r'Downloads\Dropped_Companies.csv')
company_name = df[df.columns[4]].tolist()
Another option: If this is the only thing you need to do with your csv file, you can also get away just using the csv library that comes with python instead of installing pandas, using this approach.
If you want to learn more about how to get data out of your pandas DataFrame (the df variable in your code), you might find this blog post helpful.
I think that you can try this for getting all the values of a specific column:
companies_column = df["column name"]
Replace "column name" with the name of the column whose values you want to access.
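The approaches above can be sketched end to end as follows. The CSV text and its column names (other than "Name of Stock") are hypothetical; an in-memory string stands in for Dropped_Companies.csv so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for Dropped_Companies.csv.
csv_text = """Id,Date,Sector,Exchange,Name of Stock
1,2020-01-01,Tech,NYSE,Acme
2,2020-01-02,Energy,NASDAQ,Globex
"""
df = pd.read_csv(io.StringIO(csv_text))

# Positional access: index 4 is the fifth column (counting starts at 0).
company_name = df.iloc[:, 4].tolist()

# Equivalent label-based access, looking the column name up first.
same_values = df[df.columns[4]].tolist()

print(company_name)
```

Both forms return a plain Python list, so either should work as long as the column position is stable.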

How to read selected row from a shown pandas dataframe as output in ipywidgets

I have a pandas dataframe displayed through a widgets.Output() in a simple dashboard. What I want to do is read what row the user clicked so I can further process data. The following is an example of the dataframe shown through the ipywidget.
Basically, I want to know either the index or the value when the user selects a given row. Say:
Is there a way I can read this?
An example of the code:
import pandas as pd
import ipywidgets as widgets
df = pd.DataFrame(index=['J21','D21','J22','M23'])
df['Last'] = [4.10,4.38,4.63,4.90]
df['Wtd'] = [-0.02,0.02,0.02,0.01]
df['Mtd'] = [-0.08,-0.02,-0.01,0.01]
df['3mth'] = [-0.10,0.25,0.50,0.73]
df['Ytd'] = [-0.19,0.16,0.38,0.52]
out = widgets.Output()
with out:
    display(df)
out
Basically, when I show out in my code, it prints the df as an ipywidget. I now want to detect when a user clicks on a given row (say M23), so I can show additional information for that particular item in another ipywidget. Is that doable?
Thanks very much!

How to update selected headers only in a CSV using Python?

I want to change my csv headers from
column_1, column_2, ABC_column, column_4, XYZ_column
To
new_column_1, new_column_2, ABC_column, new_column_4, XYZ_column
I can easily change all the columns using writer.writerow, but when a new value appears in place of ABC_column I want to keep that as well; for example, if it arrives as DEF_column, I don't want to change that either.
So it should change only the columns that are not in 3rd or 5th place and leave the ones in 3rd and 5th place as they are.
Use pandas:
import pandas as pd
df = pd.read_csv(path_to_csv)
df = df.rename(columns={'column_1': 'new_column_1', 'column_2': 'new_column_2' ... })
df.to_csv(path_to_csv, index=False)
You can apply any renaming logic you like through that dictionary.
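Since the question asks for a rename by position rather than by name, a positional variant may fit better. This is a minimal sketch using an in-memory CSV; the "new_" prefix follows the question, while the sample row and the keep_positions name are assumptions.

```python
import io

import pandas as pd

# Hypothetical one-row CSV with the headers from the question.
csv_text = "column_1,column_2,ABC_column,column_4,XYZ_column\n1,2,3,4,5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Zero-based positions of the 3rd and 5th columns, which must stay untouched
# whatever they happen to be called (ABC_column, DEF_column, ...).
keep_positions = {2, 4}

df.columns = [
    name if position in keep_positions else f"new_{name}"
    for position, name in enumerate(df.columns)
]

# index=False keeps pandas from writing the row index as an extra column.
output = df.to_csv(index=False)
print(output)
```

Because the rule looks only at position, it behaves the same if the third header arrives as DEF_column instead of ABC_column.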

Filter Excel Spreadsheet to obtain cell value with Python

I have a GUI (shown below) and with it I want to extract a specific IP address from an Excel spreadsheet which contains IP addresses (~1200 rows). I cannot find an example of how to search and filter the spreadsheet to achieve what I require.
In my Spreadsheet I want to:
Search Column E for the value I enter in the GUI, i.e. K11, which will narrow it down to ~10 rows. I then want to search Column C for the string "Telephone", which will narrow it down to 2 rows. I then want to extract the contents of these 2 rows in Column B and assign each of them to a variable.
Using the solution provided:
#Filter rows by column E (Station name) for results with cell = Asset No. (i.e "G04")
xlsx_filter1 = IP_Plan.index[IP_Plan['Station name'] == IPP2.get()].tolist()
IP_Plan=IP_Plan.loc[xlsx_filter1]
#Filter rows by column C (Type) for results with cell = Device Type (i.e "IP Telephone - Norphonic N-K1")
xlsx_filter2 = IP_Plan.index[IP_Plan['Type'] == "IP Telephone - Norphonic N-K1"].tolist()
IP_Plan=IP_Plan.loc[xlsx_filter2]
#File cell value by column B (IP address)
Output_IP_Address = IP_Plan["IP address"]
print(Output_IP_Address)
Produces this output upon the print command
I would like to use these two IP addresses in my program, so I would like to obtain these values from the list without the index and assign them to separate variables. How do I do this?
Output_IP_Address1 =
Output_IP_Address2 =
I require this so I can display the variables as a Label in the GUI (see GUI pic; the example shows 00.000.000.0) and use the variables in my ping code to test and return the result.
IP_Display_Nac = Label(IPP, text=Output_IP_Address1, anchor=W)
IP_Display_Tow = Label(IPP, text=Output_IP_Address2, anchor=W)
Try to use the pandas library to import the excel file, like so:
import pandas as pd
df = pd.read_excel("nameOfYourExcelFile.xlsx")
The variable df is now a so-called DataFrame object, which is similar to an Excel table. Here is a small head start on how to work with pandas DataFrames:
df.head()
This gives you the first few rows of the DataFrame, so you can check its structure and the names of the columns, for example.
df["Name of the desired Column"] # gives you the complete desired column as a pandas Series
This only applies if you have headers in your column, like that:
In this example df["a"] would be [5,9,2,3].
If you have no headers in your file, then just import the excel file like so:
df = pd.read_excel("nameOfYourExcelFile.xlsx", header=None)
and call your column by a numbered index starting at 0, so column A would be df[0] and column E would be df[4], etc...
Other helpful functions:
df.iloc[1,:] # gives you the complete row with index 1
df.iloc[1,2] # gives you the item in row with index 1 and column with index 2
list(df) # gives you a list of all column header, in case you are in doubt which to take
Here is some example code showing how you could achieve your result:
indices_first_check = df.index[df['Name of your column E'] == "K11"].tolist()
This gives you a list of all row indices where the value in column E is "K11".
Then you can slice all other rows off:
df = df.loc[indices_first_check,:]
Now get the indices for the "Telephone":
indices_second_check = df.index[df['Name of your column C'] == "Telephone"].tolist()
df = df.loc[indices_second_check,:]
Now you can have your 2 desired values within a list:
list_desired_values = list(df["Name of your column B"])
Hope this helps.
edit: Changed the last line so that the DataFrame column (a pandas Series object) is cast to a list.
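For the follow-up about getting the two addresses as separate variables without the index, here is a minimal sketch. The rows are hypothetical stand-ins for the spreadsheet, with column names taken from the question; it combines both filters into one boolean mask instead of filtering twice.

```python
import pandas as pd

# Hypothetical rows standing in for the IP plan spreadsheet.
IP_Plan = pd.DataFrame({
    "IP address": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
    "Type": ["IP Telephone", "Camera", "IP Telephone", "IP Telephone"],
    "Station name": ["K11", "K11", "K11", "G04"],
})

# Combine both conditions into one boolean mask instead of filtering twice.
mask = (IP_Plan["Station name"] == "K11") & (IP_Plan["Type"] == "IP Telephone")

# tolist() drops the index and leaves plain Python strings.
addresses = IP_Plan.loc[mask, "IP address"].tolist()

# Unpacking assumes exactly two matching rows; it raises ValueError otherwise.
Output_IP_Address1, Output_IP_Address2 = addresses
print(Output_IP_Address1, Output_IP_Address2)
```

The resulting strings can be passed straight to Label(..., text=Output_IP_Address1). If the number of matches can vary, indexing into addresses after checking its length is safer than unpacking.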
