DataFrame Pandas Python select data from date


df1 = df1[df1['TIME STAMP'].between('2021-01-27 00:00:00', '2021-10-10 23:59:59')]
The above code selects the rows of a dataframe that fall between two specific dates, and it works fine.
I want to select from a start date up to the end (infinity, or the last date in the dataframe), or any other way to filter using only a start date.

You can use comparison operators between timestamps:
import pandas as pd
df1 = df1[df1['TIME STAMP'] >= pd.Timestamp('2021-01-27 00:00:00')]
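As a minimal sketch of the open-ended selection, the example below builds a small hypothetical df1 (the sample rows and the value column are assumptions, not from the question) and keeps everything on or after the start date. It also shows that between() gives the same result if the column maximum is passed as the upper bound.

```python
import pandas as pd

# Hypothetical sample data; only the 'TIME STAMP' column name comes from the question.
df1 = pd.DataFrame({
    "TIME STAMP": pd.to_datetime([
        "2021-01-26 23:59:59",
        "2021-01-27 00:00:00",
        "2021-06-15 12:30:00",
        "2021-12-31 08:00:00",
    ]),
    "value": [1, 2, 3, 4],
})

start = pd.Timestamp("2021-01-27 00:00:00")

# Open-ended filter: keep every row on or after the start date.
from_start = df1[df1["TIME STAMP"] >= start]

# Same rows via between(), using the latest timestamp as the upper bound.
same = df1[df1["TIME STAMP"].between(start, df1["TIME STAMP"].max())]

print(from_start)
```

The comparison form is usually simpler, since it does not need to know the last date in advance.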

Related

Removing time from a date column in pandas

I have a pandas DataFrame with a Date column (string), which I converted and set as the index using the set_index and to_datetime functions:
usd2inr_df.set_index(pd.to_datetime(usd2inr_df['Date']), inplace=True)
but the resulting index includes a time portion that I want to remove, e.g.
2023-02-14 00:00:00
I want it to read 2023-02-14.
How do I set up the call so that the index holds the date without the time portion?
usd2inr_df['Date'] = pd.to_datetime(usd2inr_df['Date']).dt.normalize()
usd2inr_df = usd2inr_df.set_index('Date')

pd.to_datetime() converts a Series of strings to pandas datetime values.
Series.dt.date returns the date part only, as Python date objects that display as 'yyyy-mm-dd'.
Assigning to DataFrame.index sets the index of the DataFrame.
import pandas as pd
# create a dataFrame as an example
df = pd.DataFrame({'Name': ['Example'],'Date': ['2023-02-14 10:01:11']})
print(df)
# convert 'yyyy-mm-dd hh:mm:ss' to 'yyyy-mm-dd'.
df['Date'] = pd.to_datetime(df['Date']).dt.date
# set 'Date' as index
df.index = df['Date']
print(df)
Output:
      Name                 Date
0  Example  2023-02-14 10:01:11
-------------------------------------------------------
               Name        Date
Date
2023-02-14  Example  2023-02-14
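The two approaches in this thread behave differently, which the following sketch tries to make visible. It assumes a small hypothetical usd2inr_df (the Rate values are made up): .dt.normalize() keeps a datetime index with the time reset to midnight, while .dt.date produces plain Python date objects.

```python
import datetime
import pandas as pd

# Hypothetical sample data; the 'Rate' column and its values are assumptions.
usd2inr_df = pd.DataFrame({
    "Date": ["2023-02-14 10:01:11", "2023-02-15 09:30:00"],
    "Rate": [82.5, 82.7],
})

parsed = pd.to_datetime(usd2inr_df["Date"])

# Option 1: keep a DatetimeIndex, with every time set to 00:00:00.
by_normalize = usd2inr_df.set_index(parsed.dt.normalize())

# Option 2: use Python date objects; the index is no longer a DatetimeIndex.
by_date = usd2inr_df.set_index(parsed.dt.date)

print(type(by_normalize.index).__name__)
print(by_date.index[0])
```

Which one fits depends on what follows: the normalized index still supports datetime operations such as resampling, whereas the date-object index is mainly useful for display or exact-date lookups.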

How can I extract specific months from a column in Python

I have a dataframe with a column 'mon/yr' that stores the month and year in this format: Jun/19, Jan/22, etc.
I want to extract only these values from that column - ['Jul/19','Oct/19','Jan/20','Apr/20','Jul/20','Oct/20','Jan/21','Apr/21','Jul/21','Oct/21','Jan/22']
and put them into a variable called 'dates' so that I can use it for plotting.
My code, which does not work:
dates = df["mon/yr"] == ['Jul/19','Oct/19','Jan/20','Apr/20','Jul/20','Oct/20','Jan/21','Apr/21','Jul/21','Oct/21','Jan/22']
This is Python code.

This is how to filter rows:
df.loc[df['column_name'].isin(some_values)]

Using your dates list, if we wanted to extract just 'Jul/20' and 'Oct/20' we can do:
import pandas as pd
df = pd.DataFrame(['Jul/19','Oct/19','Jan/20','Apr/20','Jul/20','Oct/20','Jan/21','Apr/21','Jul/21','Oct/21','Jan/22'], columns = ['dates'])
mydates = ['Jul/20','Oct/20']
df.loc[df['dates'].isin(mydates)]
which produces:
    dates
4  Jul/20
5  Oct/20
So, for your actual use case, assuming that df is a pandas dataframe, and mon/yr is the name of the column, you can do:
dates = df.loc[df['mon/yr'].isin(['Jul/19','Oct/19','Jan/20','Apr/20','Jul/20','Oct/20','Jan/21','Apr/21','Jul/21','Oct/21','Jan/22'])]
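Since the question asks for a variable to use for plotting, a possible variation is to pull out just the matching column values rather than the whole rows. The sketch below assumes a hypothetical df with an extra sales column; only the 'mon/yr' column name comes from the question. The == comparison in the question is not a membership test, which is why isin() is used instead.

```python
import pandas as pd

# Hypothetical sample data; the 'sales' column and its values are assumptions.
df = pd.DataFrame({
    "mon/yr": ["Jun/19", "Jul/19", "Aug/19", "Oct/19", "Jan/20"],
    "sales": [10, 12, 9, 15, 11],
})

wanted = ["Jul/19", "Oct/19", "Jan/20"]

# Boolean mask: True where the row's 'mon/yr' is one of the wanted values.
mask = df["mon/yr"].isin(wanted)

# Matching labels and their values, e.g. as x and y for a plot.
dates = df.loc[mask, "mon/yr"].tolist()
values = df.loc[mask, "sales"].tolist()

print(dates)
print(values)
```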

How can I find the first entry of each day if I have a timestamp column formatted as MM/DD/YY HH:MM (24-hour clock) in Excel, using pandas/Python/Excel?

My table has a timestamp column with date entries in the format MM/DD/YY HH:MM (24-hour clock).
I'm supposed to find the salesperson who makes the first entry of each day using this column. I also need to save the first entries to an xlsx file as output.

I would use the pandas library. You can use it to filter on a column and save the result as a CSV file. It could look something like this:
import pandas as pd
# Open the file and parse the 'date' column as datetimes:
df = pd.read_csv('data.csv', parse_dates=['date'])
# Search for a specific range of dates within the date column:
df = df[(df['date'] > '2022-01-01 00:00') & (df['date'] < '2022-02-01 00:00')]
# Save the result ('result.csv' is a placeholder file name):
df.to_csv('result.csv', index=False)
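The answer above filters a date range but does not pick the first entry per day. One way to do that, sketched here under assumptions, is to parse the MM/DD/YY HH:MM strings, sort by time, and keep the first row for each calendar day. The column names timestamp and salesperson and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "timestamp": ["01/05/22 09:15", "01/05/22 08:45", "01/06/22 10:00", "01/06/22 07:30"],
    "salesperson": ["Asha", "Ben", "Asha", "Cara"],
})

# Parse MM/DD/YY HH:MM (24-hour clock) into real datetimes.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%m/%d/%y %H:%M")

# Calendar day of each entry (time set to midnight).
df["day"] = df["timestamp"].dt.normalize()

# Sort by time, then keep only the earliest row of each day.
first_entries = (
    df.sort_values("timestamp")
      .drop_duplicates(subset="day", keep="first")
      .reset_index(drop=True)
)

print(first_entries[["day", "salesperson", "timestamp"]])

# To save as xlsx (needs an Excel writer such as openpyxl installed;
# the file name is a placeholder):
# first_entries.to_excel("first_entries.xlsx", index=False)
```

If the data starts in an Excel workbook, pd.read_excel can load it in place of the inline sample frame.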

Add "days_since_epoch" column to Pandas TimeSeries DataFrame

A DataFrame has dates as its index. I need to add a column whose value is days_since_epoch. For a single date, this value can be calculated with
(date_value - datetime.datetime(1970,1,1)).days
How can this value be calculated for all rows in the dataframe?
The following code demonstrates the operation with a sample DataFrame. Is there a better way of doing this?
import pandas as pd
date_range = pd.date_range(start='1/1/1970', end='12/31/2018', freq='D')
df = pd.DataFrame(date_range, columns=['date'])
df['days_since_epoch']=range(0,len(df))
df = df.set_index('date')
Note: this is only an example; the dates in the DataFrame need not start from 1 Jan 1970.

Subtract a scalar Timestamp from the DatetimeIndex, then use TimedeltaIndex.days:
df['days_since_epoch1']= (df.index - pd.Timestamp('1970-01-01')).days
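As a quick check that this works when the dates do not start at the epoch, the sketch below uses a short hypothetical date range around the start of 2019. 2019-01-01 is 17897 days after 1970-01-01 (Unix time 1546300800 divided by 86400).

```python
import pandas as pd

# Hypothetical short range that does not start at the epoch.
date_range = pd.date_range(start="2018-12-29", end="2019-01-02", freq="D")
df = pd.DataFrame(index=date_range)

# DatetimeIndex minus a Timestamp gives a TimedeltaIndex; .days extracts whole days.
df["days_since_epoch"] = (df.index - pd.Timestamp("1970-01-01")).days

print(df)
```

If the dates live in a regular column rather than the index, the equivalent is (df['date'] - pd.Timestamp('1970-01-01')).dt.days, using the .dt accessor on the resulting Series.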

Pandas: Select rows that are null or earlier than today?

I have to implement the equivalent of the following SQL in a Pandas dataframe:
select * from table where ISNULL(date, GETDATE()) >= as_of_date
Basically, I want to select the rows where the value of date is on or after as_of_date. There are some rows where date is null, and in those cases I want to select those rows only if as_of_date is less than or equal to today's date.
Is there a way to do this in Pandas?

You might need:
import pandas as pd
# Make sure the date column and as_of_date are both datetimes before comparing:
df['date'] = pd.to_datetime(df['date'])
as_of_date = pd.to_datetime(as_of_date)
# Treat missing dates as today, then compare with as_of_date:
df[df['date'].fillna(pd.Timestamp.today().normalize()) >= as_of_date]
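A small worked sketch of the ISNULL(date, GETDATE()) >= as_of_date logic, using hypothetical data (the id column, the sample dates, and the chosen as_of_date are assumptions): nulls are replaced by today's date before the comparison, so a null row is kept exactly when as_of_date is not in the future.

```python
import pandas as pd

# Hypothetical sample data: one past date, one missing date, one future date.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "date": ["2020-01-01", None, "2030-06-01"],
})

# Convert both sides to datetimes so the comparison is well defined.
df["date"] = pd.to_datetime(df["date"])
as_of_date = pd.Timestamp("2021-01-01")

# Today's date at midnight, standing in for GETDATE().
today = pd.Timestamp.today().normalize()

# Fill missing dates with today, then keep rows on or after as_of_date.
selected = df[df["date"].fillna(today) >= as_of_date]

print(selected)
```

Here row 1 is dropped because its date precedes as_of_date, row 2 is kept because its missing date is treated as today, and row 3 is kept because its date is later than as_of_date.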

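# Rows whose date is missing or earlier than today (use isna(); == None does not match element-wise):
df[(df['date'] < pd.Timestamp.today().normalize()) | df['date'].isna()]
Note that this is just an example; if you provide some sample code and your df, I can help in greater detail.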
