Read columns with brackets [duplicate] - python

This question already has answers here:
Pandas column access w/column names containing spaces
(6 answers)
Closed last year.
I'm trying to read a column named Goods_Issue_Date_(GID).
How can I read it?
I tried:
Df.Goods_Issue_Date_(GID)
but it returns "Invalid Syntax".

Using the following dataframe as an example:
import pandas as pd
data = [['Carrots', "Tuesday"], ['Apples', "Monday"], ['Pears', "Sunday"]]
df = pd.DataFrame(data, columns=['Product', 'Goods_Issue_Date_(GID)'])
df.head()
   Product Goods_Issue_Date_(GID)
0  Carrots                Tuesday
1   Apples                 Monday
2    Pears                 Sunday
You can select the Goods_Issue_Date_(GID) column with bracket indexing, passing the column name as a string:
df['Goods_Issue_Date_(GID)']
0    Tuesday
1     Monday
2     Sunday
Name: Goods_Issue_Date_(GID), dtype: object
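As a rough sketch of why the attribute form fails: Python reads df.Goods_Issue_Date_(GID) as a call to something named Goods_Issue_Date_ with the argument GID, so a column whose name contains parentheses can only be reached through its string label. The snippet below rebuilds the same example frame. The short name gid in the rename step is only an illustrative choice, not something from the question.

```python
import pandas as pd

data = [["Carrots", "Tuesday"], ["Apples", "Monday"], ["Pears", "Sunday"]]
df = pd.DataFrame(data, columns=["Product", "Goods_Issue_Date_(GID)"])

col = "Goods_Issue_Date_(GID)"

# Bracket indexing works for any column label, including ones with parentheses.
by_brackets = df[col]

# .loc with a column label selects the same data.
by_loc = df.loc[:, col]

# Optional: rename to a valid Python identifier (hypothetical name "gid")
# if attribute-style access is really wanted.
renamed = df.rename(columns={col: "gid"})
by_attr = renamed.gid
```

If the name has to stay as it is, bracket indexing is the simplest option. Renaming only pays off when the column is used many times.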

Related

String slice of a column in a dataframe [duplicate]

This question already has answers here:
Pandas make new column from string slice of another column
(3 answers)
Closed 4 months ago.
data = [['Tom', '5-123g'], ['Max', '6-745.0d'], ['Bob', '5-900.0e'], ['Ben', '2-345',], ['Eva', '9-712.x']]
df = pd.DataFrame(data, columns=['Person', 'Action'])
I want to shorten the "Action" column to a length of 5. My current df has two columns:
['Person'] and ['Action']
I need it to look like this:
  Person    Action Action_short
0    Tom    5-123g        5-123
1    Max  6-745.0d        6-745
2    Bob  5-900.0e        5-900
3    Ben     2-345        2-345
4    Eva   9-712.x        9-712
What I've tried:
First, I checked the dtype of the column:
df['Action'].dtypes
The output is:
dtype('O')
Then I tried:
df['Action'] = df['Action'].map(str)
df['Action_short'] = df.Action.str.slice(start=0, stop=5)
I also tried it with:
df['Action'] = df['Action'].astype(str)
df['Action'] = df['Action'].values.astype(str)
df['Action'] = df['Action'].map(str)
df['Action'] = df['Action'].apply(str)
and with:
df['Action_short'] = df.Action.str.slice(0:5)
df['Action_short'] = df.Action.apply(lambda x: x[:5])
df['pos'] = df['Action'].str.find('.')
df['new_var'] = df.apply(lambda x: x['Action'][0:x['pos']],axis=1)
The output from all my versions was:
  Person    Action Action_short
0    Tom    5-123g         5-12
1    Max  6-745.0d        6-745
2    Bob  5-900.0e         5-90
3    Ben     2-345         2-34
4    Eva   9-712.x        9-712
The lambda function does not work with 3-222; it slices it to 3-22.
I don't understand why it works for some rows and not for others.
Try this:
df['Action_short'] = df['Action'].str.slice(0, 5)
By using .str on a single column of a DataFrame (which is a pd.Series), you can access pandas string methods that are designed to mirror the string operations on standard Python strings.
# slice by specifying the length you need
df['Action_short']=df['Action'].str[:5]
df
  Person    Action Action_short
0    Tom    5-123g        5-123
1    Max  6-745.0d        6-745
2    Bob  5-900.0e        5-900
3    Ben     2-345        2-345
4    Eva   9-712.x        9-712
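For anyone who wants to check the two answers side by side, here is a small self-contained sketch using the data from the question. It only shows that .str.slice(0, 5) and .str[:5] give the same result on this frame. It does not try to explain the odd output reported in the question.

```python
import pandas as pd

data = [
    ["Tom", "5-123g"],
    ["Max", "6-745.0d"],
    ["Bob", "5-900.0e"],
    ["Ben", "2-345"],
    ["Eva", "9-712.x"],
]
df = pd.DataFrame(data, columns=["Person", "Action"])

# Two equivalent spellings of "first five characters of each string".
short_via_slice = df["Action"].str.slice(0, 5)
short_via_index = df["Action"].str[:5]

df["Action_short"] = short_via_index
```

Strings shorter than five characters are returned unchanged, the same as with ordinary Python slicing.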

How to transfer rows to columns in a DataFrame using Python [duplicate]

This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 1 year ago.
I need some help.
I have the following CSV file with this DataFrame:
How could I move the cases data into columns week 1, week 2 (...) using Python and pandas?
It would be something like this:
x = (
    df.pivot_table(
        index=["city", "population"],
        columns="week",
        values="cases",
        aggfunc="max",
    )
    .add_prefix("week ")
    .reset_index()
    .rename_axis("", axis=1)
)
print(x)
Prints:
  city  population  week 1  week 2
0    x       50000       5      10
1    y       88000       2      15
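The input frame from the question is not visible here, so the sketch below assumes a long-format table with the columns city, population, week and cases, filled with values that would reproduce the printed result. Treat that input as a guess made for illustration only.

```python
import pandas as pd

# Assumed long-format input: one row per (city, week).
df = pd.DataFrame(
    {
        "city": ["x", "x", "y", "y"],
        "population": [50000, 50000, 88000, 88000],
        "week": [1, 2, 1, 2],
        "cases": [5, 10, 2, 15],
    }
)

wide = (
    df.pivot_table(
        index=["city", "population"],  # columns that stay as row identifiers
        columns="week",                # values that become new column headers
        values="cases",                # values that fill the new columns
        aggfunc="max",                 # how to combine duplicates, if any
    )
    .add_prefix("week ")               # 1 -> "week 1", 2 -> "week 2"
    .reset_index()                     # turn city/population back into columns
    .rename_axis(None, axis=1)         # drop the leftover "week" axis label
)
```

pivot_table needs an aggregation function because it allows several rows per city and week. If every combination is known to be unique, "max" simply passes the single value through.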

How to skip Column title row in Pandas DataFrame [duplicate]

This question already has answers here:
Prevent pandas read_csv treating first row as header of column names
(4 answers)
Closed 3 years ago.
How to skip Column title row in Pandas DataFrame
My Code:
sample = pd.DataFrame(pd.read_csv('Fremont TMY_Sample_Original.csv', low_memory=False))  # import the CSV
sample_header = sample.iloc[:1, 0:20]  # separate the first two rows, because they hold different data
sample2 = sample.iloc[:, 0:16]  # take the data required for the next step
sample2 = ('sample2', (header=False))  # my attempt to skip the column title row
print(sample2)
Expected output (this is just an example):
Data for all year (I want to remove this row and keep the rest)
   Date  Time(Hour)  WindSpeed(m/s)
0     5           1              10
1     4           2              17
2     6           3              16
3     7           4              11
This should work (header=None stops pandas from treating the first row as column names):
df = pd.read_csv("yourfile.csv", header = None)
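To make the difference between the read_csv options concrete, here is a small sketch that reads from in-memory strings instead of a file. The CSV contents are made up to resemble the example in the question: a title line followed by the real header. skiprows=1 is shown as one possible way to drop the title line, and header=None is shown on a file that has no header row at all.

```python
import io

import pandas as pd

# Hypothetical file contents: a title line, then the real header, then data.
csv_with_title = (
    "Data for all year\n"
    "Date,Time(Hour),WindSpeed(m/s)\n"
    "5,1,10\n"
    "4,2,17\n"
    "6,3,16\n"
    "7,4,11\n"
)

# skiprows=1 discards the first line, so the second line becomes the header.
with_header = pd.read_csv(io.StringIO(csv_with_title), skiprows=1)

# Hypothetical file with data only: header=None keeps the first line as data
# and labels the columns 0, 1, 2.
csv_no_header = "5,1,10\n4,2,17\n"
no_header = pd.read_csv(io.StringIO(csv_no_header), header=None)
```

Which option fits depends on whether the file's first line is a title to discard or real data to keep.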

python pandas groupby unexpected empty column [duplicate]

This question already has answers here:
How to assign a name to the size() column?
(5 answers)
Closed 3 years ago.
I want to aggregate some data to append to a dataframe. The following gives me the number of wins per name
import pandas as pd
data = [[1,'tom', 10], [1,'nick', 15], [2,'juli', 14], [2,'peter', 20], [3,'juli', 3], [3,'peter', 13]]
have = pd.DataFrame(data, columns = ['Round', 'Winner', 'Score'])
WinCount= have.groupby(['Winner']).size().to_frame('WinCount')
WinCount
However, the output does not give me two columns named Winner and WinCount. Instead, the first column has no name, and the column name then appears on the second line:
How can I get a dataframe without these two "blank" fields?
Try this:
WinCount = have.groupby(['Winner']).size().to_frame('WinCount').reset_index()
Output
  Winner  WinCount
0   juli         2
1   nick         1
2  peter         2
3    tom         1
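The "blank" fields appear because groupby puts Winner into the index rather than into a regular column, and reset_index moves it back. A slightly shorter variant of the same idea, sketched on the data from the question, passes the new column name straight to reset_index:

```python
import pandas as pd

data = [
    [1, "tom", 10],
    [1, "nick", 15],
    [2, "juli", 14],
    [2, "peter", 20],
    [3, "juli", 3],
    [3, "peter", 13],
]
have = pd.DataFrame(data, columns=["Round", "Winner", "Score"])

# size() returns a Series indexed by Winner; reset_index(name=...) turns the
# index into a column and names the counts in one step.
win_count = have.groupby("Winner").size().reset_index(name="WinCount")
```

The result has an ordinary 0..n-1 index and two named columns, which makes it easier to merge back onto another dataframe.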

Python: combining two columns [duplicate]

This question already has answers here:
Combine two columns of text in pandas dataframe
(21 answers)
Closed 5 years ago.
I have two columns, one has the year, and another has the month data, and I am trying to make one column from them (containing year and month).
Example:
click_year
-----------
2016
click_month
-----------
11
I want to have
YearMonth
-----------
201611
I tried
date['YearMonth'] = pd.concat((date.click_year, date.click_month))
but it gave me a "cannot reindex from a duplicate axis" error.
Bill's answer on the post might be what you are looking for.
import pandas as pd
df = pd.DataFrame({'click_year': ['2014', '2015'], 'click_month': ['10', '11']})
>>> df
  click_month click_year
0          10       2014
1          11       2015
df['YearMonth'] = df[['click_year','click_month']].apply(lambda x : '{}{}'.format(x[0],x[1]), axis=1)
>>> df
  click_month click_year YearMonth
0          10       2014    201410
1          11       2015    201511
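As an alternative to the row-wise apply, the two columns can also be joined with plain string concatenation, which avoids calling a Python function per row. The sketch below assumes the columns hold integers, as the example in the question suggests. Padding the month to two digits with zfill is an assumption about the desired format, added so that March becomes 03 rather than 3.

```python
import pandas as pd

# Assumed integer columns, modelled on the example in the question.
date = pd.DataFrame({"click_year": [2016, 2015], "click_month": [11, 3]})

year = date["click_year"].astype(str)
month = date["click_month"].astype(str).str.zfill(2)  # assumption: two-digit months

# Element-wise string concatenation of the two Series.
date["YearMonth"] = year + month
```

The original pd.concat attempt stacks the two Series end to end, which is why it produced duplicate index labels instead of one combined value per row.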
