Print beauty dataframe in Jupyter Notebook

I would like to prettify the printed output of a DataFrame.
import pandas as pd
d = {'Carid ': [1, 2, 3], 'Carname': ['Mercedes-Benz', 'Audi', 'BMW'], 'model': ['S-Klasse AMG 63s', 'S6', 'X6 M-Power']}
df = pd.DataFrame(data=d)
print(df.head())
df.head()
As you can see, the output of print is plain text and not nicely formatted, while the final df.head() statement is rendered as a formatted table.
Is there an option to get the same result from the print statement in Jupyter Notebook?

Use display instead of print.
display(df.head())

Sure there is. You'll want to use Jupyter's built-in display, which you'll need to import first:
from IPython.display import display
Then, instead of print, use display:
display(df.head())
Source: https://tdhopper.com/blog/printing-pandas-data-frames-as-html-in-jupyter-notebooks
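As a minimal sketch of the difference, assuming pandas is installed: print writes the plain-text representation, while display lets Jupyter pick the HTML representation and render it as a table. The try/except fallback to print is an addition for illustration only, so that the snippet also runs outside a notebook.

```python
import pandas as pd

try:
    from IPython.display import display
except ImportError:
    # Assumption for illustration: fall back to print when IPython is missing.
    display = print

d = {'Carid ': [1, 2, 3],
     'Carname': ['Mercedes-Benz', 'Audi', 'BMW'],
     'model': ['S-Klasse AMG 63s', 'S6', 'X6 M-Power']}
df = pd.DataFrame(data=d)

print(df.head())    # plain-text table
display(df.head())  # rendered as an HTML table inside Jupyter

# to_html() returns the kind of markup that the notebook renders.
html = df.head().to_html()
```

Because display is an ordinary function call, it can be used several times in one cell or inside a loop, where a bare df.head() expression would only be rendered if it is the last statement.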

Related

qgrid not showing output Python

I am trying to run the code below. It executes successfully but does not show any output, and I have no idea why.
Why is this happening?
import pandas as pd
import qgrid
df = pd.DataFrame({'A': [1.2, 'foo', 4], 'B': [3, 4, 5]})
df = df.set_index(pd.Index(['bar', 7, 3.2]))
view = qgrid.show_grid(df, grid_options={'fullWidthRows': True}, show_toolbar=True)
view
See also the attached screenshot.

Python Pandas .str.extract method fails when indexing

I'd like to set values on a slice of a DataFrame with .loc, using the pandas .str.extract() method. However, it fails with indexing errors. The same code works perfectly if I swap extract with contains.
Here is a sample frame:
import pandas as pd
df = pd.DataFrame(
{
'name': [
'JUNK-0003426', 'TEST-0003435', 'JUNK-0003432', 'TEST-0003433', 'TEST-0003436',
],
'value': [
'Junk', 'None', 'Junk', 'None', 'None',
]
}
)
Here is my code:
df.loc[df["name"].str.startswith("TEST"), "value"] = df["name"].str.extract(r"TEST-\d{3}(\d+)")
How can I set the None values to the extracted regex string?

Hmm, the problem seems to be that .str.extract returns a pd.DataFrame. You can .squeeze() it to turn it into a Series, and then it works fine:
df.loc[df["name"].str.startswith("TEST"), "value"] = df["name"].str.extract(r"TEST-\d{3}(\d+)").squeeze()
Index alignment takes care of the rest.

Instead of trying to get the group, you can replace the rest with the empty string (pass regex=True so that the pattern is treated as a regular expression):
df.loc[df['value']=='None', 'value'] = df.loc[df['value']=='None', 'name'].str.replace(r'TEST-\d{3}', '', regex=True)

Here is a way to do it:
df.loc[df["name"].str.startswith("TEST"), "value"] = df["name"].str.extract(r"TEST-\d{3}(\d+)").loc[:,0]
Output:
           name value
0  JUNK-0003426  Junk
1  TEST-0003435  3435
2  JUNK-0003432  Junk
3  TEST-0003433  3433
4  TEST-0003436  3436
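Another option, sketched here on the assumption that the pattern has exactly one capture group: passing expand=False makes .str.extract return a Series directly, so neither .squeeze() nor .loc[:, 0] is needed. The name mask is only an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['JUNK-0003426', 'TEST-0003435', 'JUNK-0003432',
             'TEST-0003433', 'TEST-0003436'],
    'value': ['Junk', 'None', 'Junk', 'None', 'None'],
})

# Rows whose name starts with "TEST" are the ones to overwrite.
mask = df["name"].str.startswith("TEST")

# expand=False returns a Series for a single capture group,
# which aligns with the selected rows by index.
df.loc[mask, "value"] = df.loc[mask, "name"].str.extract(
    r"TEST-\d{3}(\d+)", expand=False
)

print(df)
```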

How to convert datatype of the columns?

I picked up part of the code from here and expanded it a bit. However, I am not able to convert the data types of the Basket and Count columns for further processing.
For example, the Basket and Count columns are int64, and I would like to change them to float64.
import pandas as pd
import ipywidgets as widgets
from IPython.display import display, clear_output
# creating a DataFrame
df = pd.DataFrame({'Basket': [1, 2, 3],
                   'Name': ['Apple', 'Orange', 'Count'],
                   'id': [111, 222, 333]})
vardict = df.columns
select_variable = widgets.Dropdown(
    options=vardict,
    value=vardict[0],
    description='Select variable:',
    disabled=False,
    button_style=''
)
# callback: print the dtype of the column chosen in the dropdown
def get_and_plot(b):
    clear_output
    s = select_variable.value
    col_dtype = df[s].dtypes
    print(col_dtype)
display(select_variable)
select_variable.observe(get_and_plot, names='value')
Thanks in advance.
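One possible approach, as a minimal sketch rather than an answer from the original thread: DataFrame.astype accepts a mapping from column name to target dtype, so only the chosen columns are converted. The sketch assumes that the integer columns in the posted frame (Basket and id) are the ones to convert.

```python
import pandas as pd

df = pd.DataFrame({'Basket': [1, 2, 3],
                   'Name': ['Apple', 'Orange', 'Count'],
                   'id': [111, 222, 333]})

# Convert only the listed columns; all other columns keep their dtype.
df = df.astype({'Basket': 'float64', 'id': 'float64'})

print(df.dtypes)
```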

Extract the country zip code based on the full country code - DataFrame Python

My data contains each place as a full postal code, for example CZ25145. I would like to create a new column that holds only the country prefix, CZ. How can I do this?
I have this:
import pandas as pd
df = pd.DataFrame({
'CODE_LOAD_PLACE' : ['PL43100', 'CZ25905', 'DE29333', 'DE29384', 'SK92832']
},)
I would like to get it like below:
df = pd.DataFrame({
'CODE_LOAD_PLACE' : ['PL43100', 'CZ25905', 'DE29333', 'DE29384', 'SK92832'],
'COUNTRY_LOAD_PLACE' : ['PL', 'CZ', 'DE', 'DE', 'SK']
},)
I tried using .factorize and .groupby, but without success.

Use .str and select the first 2 characters:
df["COUNTRY_LOAD_PLACE"] = df["CODE_LOAD_PLACE"].str[:2]
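A short sketch of the slicing approach on the sample data, followed by a regex-based alternative. The alternative is an assumption for cases where the country prefix is not always exactly two characters; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'CODE_LOAD_PLACE': ['PL43100', 'CZ25905', 'DE29333', 'DE29384', 'SK92832']
})

# Fixed-width prefix: take the first two characters of every code.
df["COUNTRY_LOAD_PLACE"] = df["CODE_LOAD_PLACE"].str[:2]

# Alternative (assumption): capture all leading uppercase letters,
# which also handles prefixes of a different length.
by_regex = df["CODE_LOAD_PLACE"].str.extract(r"^([A-Z]+)", expand=False)

print(df)
```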

Extract data from specific format in Pandas DF

I have raw data in CSV format that looks like this:
product-name brand-name rating
["Whole Wheat"] ["bb Royal"] ["4.1"]
Expected output:
product-name brand-name rating
Whole Wheat bb Royal 4.1
I want to apply this to every entry in my dataset, which has 10,000 rows. How can I do this using pandas?
Can this be done with regular expressions? I am not sure how.
Thank you.
Edit 1:
My data looks something like this:
df = {
'product-name': [
[""'Whole Wheat'""], [""'Milk'""] ],
'brand-name': [
[""'bb Royal'""], [""'XYZ'""] ],
'rating': [
[""'4.1'""], [""'4.0'""] ]
}
df_p = pd.DataFrame(data=df)
Each value is displayed like this: ["bb Royal"]
PS: Apologies for my code. I am quite new to programming and to this community, and I really appreciate your help :)

If I understand correctly, select the first value of each list:
df = df.apply(lambda x: x.str[0])
Or, if the values are strings, remove the brackets:
df = df.replace(r'[\[\]]', '', regex=True)

You can use the explode function:
df = df.apply(pd.Series.explode)
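The two cases above can be sketched side by side. Which one applies depends on whether each cell holds a real list or the literal text '["Whole Wheat"]'. The sample frames and their names are illustrative assumptions. In the text case the pattern is extended to strip the double quotes as well, and the final float conversion is an extra step not requested in the question.

```python
import pandas as pd

# Case 1: each cell holds a one-element list.
as_lists = pd.DataFrame({
    "product-name": [["Whole Wheat"], ["Milk"]],
    "brand-name": [["bb Royal"], ["XYZ"]],
    "rating": [["4.1"], ["4.0"]],
})
from_lists = as_lists.apply(lambda col: col.str[0])

# Case 2: each cell holds the literal text '["Whole Wheat"]'.
as_text = pd.DataFrame({
    "product-name": ['["Whole Wheat"]', '["Milk"]'],
    "brand-name": ['["bb Royal"]', '["XYZ"]'],
    "rating": ['["4.1"]', '["4.0"]'],
})
# Remove square brackets and double quotes everywhere.
from_text = as_text.replace(r'[\[\]"]', '', regex=True)

# Ratings are still strings in both cases; convert them explicitly.
from_lists["rating"] = from_lists["rating"].astype(float)
from_text["rating"] = from_text["rating"].astype(float)

print(from_lists)
print(from_text)
```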
