Iterate through defined Datetime index range in Pandas Dataframe - python

I can see from here how to iterate through a list of dates from a datetime index. However, I would like to define the range of dates using:
my_df['Some_Column'].first_valid_index()
and
my_df['Some_Column'].last_valid_index()
My attempt looks like this:
for today_index, values in range(my_df['Some_Column'].first_valid_index(), my_df['Some_Column'].last_valid_index()):
    print(today_index)
However I get the following error:
TypeError: 'Timestamp' object cannot be interpreted as an integer
How do I inform the loop to restrict to those specific dates?

range() only accepts integers, so it cannot step through Timestamps. I think you need date_range:
s = my_df['Some_Column'].first_valid_index()
e = my_df['Some_Column'].last_valid_index()
r = pd.date_range(s, e)
Then loop over it:
for val in r:
    print(val)
If you need to select the rows of the DataFrame in that range:
df1 = my_df.loc[s:e]
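As a minimal sketch of both options on made-up data (the frame contents and the daily frequency are assumptions, not part of the question), the following shows that date_range yields every calendar day between the two labels, while .loc[s:e] yields only the rows that exist in the frame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: daily DatetimeIndex with NaN padding at both ends.
idx = pd.date_range("2021-01-01", periods=6, freq="D")
my_df = pd.DataFrame(
    {"Some_Column": [np.nan, 1.0, 2.0, np.nan, 3.0, np.nan]}, index=idx
)

s = my_df["Some_Column"].first_valid_index()  # label of the first non-NaN value
e = my_df["Some_Column"].last_valid_index()   # label of the last non-NaN value

# Option 1: every calendar day between s and e (date_range defaults to daily).
for day in pd.date_range(s, e):
    print(day.date())

# Option 2: only the rows present in the frame, together with their values.
window = my_df.loc[s:e, "Some_Column"]
for today_index, value in window.items():
    print(today_index.date(), value)
```

Option 2 is probably closer to the original attempt, since it gives both the index label and the value on each pass.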

Related

pandas Dataframe how to filter with Date as index

I would like to filter a DataFrame using its datetime index.
I wrote this code:
filt = (confirmed_df.index > '3/19/20')
confirmed_df.loc[filt]
but the result still includes rows from before '3/19/20'.
Do you know what is wrong with this code?
I think you need to create a DatetimeIndex first:
confirmed_df.index = pd.to_datetime(confirmed_df.index, format='%m/%d/%y')
Then filter with a string in YYYY-MM-DD format:
filt = (confirmed_df.index > '2020-03-19')
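A small sketch of why the conversion matters, assuming the index originally holds plain strings (the column name and values are invented for illustration). Strings compare character by character, so '3/9/20' sorts after '3/19/20'; after conversion the comparison is chronological:

```python
import pandas as pd

# Hypothetical data: the index holds month/day/two-digit-year strings.
confirmed_df = pd.DataFrame(
    {"cases": [10, 20, 30, 40]},
    index=["3/9/20", "3/18/20", "3/20/20", "4/1/20"],
)

# String comparison is lexicographic: March 9 wrongly counts as "after" March 19.
print("3/9/20" > "3/19/20")

# Convert the string index to a DatetimeIndex so comparisons are chronological.
confirmed_df.index = pd.to_datetime(confirmed_df.index, format="%m/%d/%y")

filt = confirmed_df.index > "2020-03-19"
after = confirmed_df.loc[filt]
print(after)
```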

How to get value of dataframe?

I want to get a value from a DataFrame so I can add it to MySQL. This is my dataframe.
l_id = df['ID'].str.replace('PDF-', '').item()
print(type(l_id))
It shows an error like this:
ValueError: can only convert an array of size 1 to a Python scalar
If I do not use .item(), the value cannot be added to MySQL. How do I get the value from the DataFrame?
.item() converts a Series to a scalar only when it holds exactly one element; here the column has several rows, hence the error. Select a single row first, for example by position:
l_id = df['ID'].str.replace('PDF-', '').iloc[0]
To process every ID rather than a single one, loop over the column:
df = pd.DataFrame(['PDF-0A1','PDF-02B','PDF-03C'],columns=['ID']) #small dataframe to test
for ids in df.ID:
    l_id = ids.replace('PDF-','')
    print(l_id)
#0A1
#02B
#03C
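The loop can also be replaced by a vectorised call. This is only a sketch using the same test frame; it assumes the goal is a list of plain Python strings to hand to a database driver, which the question does not show:

```python
import pandas as pd

df = pd.DataFrame(["PDF-0A1", "PDF-02B", "PDF-03C"], columns=["ID"])

# Strip the literal prefix from every row at once (regex=False: plain-text match).
clean = df["ID"].str.replace("PDF-", "", regex=False)

# tolist() gives plain Python strings, one per row.
ids = clean.tolist()
print(ids)

# .item() succeeds only once the Series is narrowed to exactly one element.
single = clean[clean == "02B"].item()
print(type(single), single)
```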

Changing class pandas.Series to a list

I am trying to change a column that has type pandas.Series to a list.
I tried converting it directly to a list, but it still comes up as a Series when I check its type.
First I take the first 4 digits so I have just the year, then I create a new column in the table called year to hold that data.
year = df['date'].str.extract(r'^(\d{4})')
df['year'] = pd.to_numeric(year)
df['year'].dtype
print(type(df['year']))
I want the type of 'year' to be a list. Thanks!
If you want a list of the year values from the date column, you could try this:
import pandas as pd
df = pd.DataFrame({'date':['2019/01/02', '2018/02/03', '2017/03/04']})
year = df.date.str.extract(r'(\d{4})')[0].to_list()
print(f'type: {type(year)}: {year}')
# type: <class 'list'>: ['2019', '2018', '2017']
df.date.str.extract returns a new DataFrame with one row per subject string and one column per capture group; [0] then selects the first (and only) group.
It seems pretty straightforward to turn a Series into a list. The built-in list function works fine:
> df = pd.DataFrame({'date':['2019/01/02', '2018/02/03', '2017/03/04']})
> dates = list(df['date'])
> type(dates)
< <class 'list'>
> dates
< ['2019/01/02', '2018/02/03', '2017/03/04']
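One point the answers leave implicit: a DataFrame column is always a Series, so df['year'] itself cannot become a list; only a copy of its values can. A short sketch under that reading, using expand=False (an option of str.extract) so that a single capture group comes back as a Series:

```python
import pandas as pd

df = pd.DataFrame({"date": ["2019/01/02", "2018/02/03", "2017/03/04"]})

# expand=False returns a Series (not a DataFrame) for a single capture group.
year = df["date"].str.extract(r"^(\d{4})", expand=False)
df["year"] = pd.to_numeric(year)

# The column stays a Series; tolist() returns its values as a Python list.
years = df["year"].tolist()
print(type(df["year"]), type(years), years)
```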

python - TypeError: 'Timestamp' object is not iterable

I have a dataframe and take the first row of it.
df = pd.DataFrame({'Date': ['2019-01-02', '2019-01-03', '2019-01-04', '01/07/2019', '01/08/2019', '01/09/2019', '01/10/2019', '01/11/2019', '01/14/2019', '24/08/2019']})
df['Date'] = pd.to_datetime(df["Date"])
a = df['Date'].iloc[0]
I am trying to iterate through df['Date'] to include only weekdays. Using an if statement with pass gives None wherever the date falls on a weekend. I am now trying to use continue, which needs a loop. However, each time I loop through it I get TypeError: 'Timestamp' object is not iterable
def weekday_end(day_now):
    d = day_now
    for row in d:
        if d.isoweekday()==6:
            continue
        elif d.isoweekday()==7:
            continue
        else:
            return d
for rows in a:
    df = weekday_end(rows)
    print (df)
You are trying to iterate over a single value when you wanted the whole column. That can be fixed by dropping the .iloc[0]:
a = df['Date']
That alone won't make your code work, because your weekday_end() function would still try to iterate over each single value. You only need one loop. Note that weekday() counts Monday as 0, so weekdays are 0 to 4 (with isoweekday() they are 1 to 5):
for value in a:
    if value.weekday() < 5:
        print(value)
However, it is much faster to use vectorised operations:
df['Date'][df['Date'].dt.weekday < 5]
That produces a Series of weekday timestamps. If you want to select rows in the DataFrame based on weekdays, apply the filter to the whole DataFrame:
df[df['Date'].dt.weekday < 5]
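The two numbering schemes are easy to mix up, so here is a small check on invented dates (a Friday through a Monday, all in ISO format): dt.weekday runs from Monday=0 to Sunday=6, while isoweekday() runs from Monday=1 to Sunday=7.

```python
import pandas as pd

# Hypothetical dates: Friday, Saturday, Sunday, Monday.
df = pd.DataFrame(
    {"Date": pd.to_datetime(["2019-01-04", "2019-01-05", "2019-01-06", "2019-01-07"])}
)

# Vectorised filter: dt.weekday is 0-4 for Monday to Friday.
weekdays = df[df["Date"].dt.weekday < 5]
print(weekdays)

# Equivalent per-timestamp test with ISO numbering (1-5 for Monday to Friday).
iso_weekdays = [d for d in df["Date"] if d.isoweekday() < 6]
print(iso_weekdays)
```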

ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series

I'm using pandas 0.20.3 with Python 3.x. I want to add a column to one pandas DataFrame from another. Both DataFrames contain 51 rows, so I used the following code:
class_df['phone']=group['phone'].values
I got the following error message:
ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
class_df.dtypes gives me:
Group_ID object
YEAR object
Terget object
phone object
age object
and type(group['phone']) returns pandas.core.series.Series
Can you suggest what changes I need to make to fix this error?
The first 5 rows of group['phone'] are given below:
0 [735015372, 72151508105, 7217511580, 721150431...
1 []
2 [735152771, 7351515043, 7115380870, 7115427...
3 [7111332015, 73140214, 737443075, 7110815115...
4 [718218718, 718221342, 73551401, 71811507...
Name: phoen, dtype: object
In most cases, this error occurs when the value being assigned is an empty DataFrame, for example when apply() runs on a DataFrame with no rows. The approach that worked for me was to check whether the DataFrame is empty before using apply():
if len(df) != 0:
    df['indicator'] = df.apply(assign_indicator, axis=1)
You have a column of ragged lists. Assign a list of lists rather than an array of lists (which is what .values gives):
class_df['phone'] = group['phone'].tolist()
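A minimal sketch of that assignment with stand-in frames (three rows instead of 51, and a made-up Group_ID column); it only illustrates that a list of lists of unequal length is accepted as one object per row:

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames.
group = pd.DataFrame({"phone": [[735015372, 7217511580], [], [735152771]]})
class_df = pd.DataFrame({"Group_ID": ["a", "b", "c"]})

# tolist() yields a plain list of lists, one entry per row.
class_df["phone"] = group["phone"].tolist()
print(class_df)
```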
The error from the question title,
"ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series"
can also occur if, for whatever reason, the table does not have any rows.
Instead of using an if-statement, you can set the result_type argument of apply() to "reduce":
df['new_column'] = df.apply(func, axis=1, result_type='reduce')
The data assigned to a DataFrame column must be a one-dimensional array. For example, consider an array num_arr to be added to a DataFrame:
num_arr.shape
(1, 126)
Before num_arr can be added as a DataFrame column, it has to be reshaped to one dimension:
num_arr = num_arr.reshape(-1, )
num_arr.shape
(126,)
Now the array can be set as a DataFrame column:
df = pd.DataFrame()
df['numbers'] = num_arr
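Put together as a runnable sketch, with np.arange standing in for the real data (the actual contents of num_arr are not given in the answer):

```python
import numpy as np
import pandas as pd

# Hypothetical 2-D array with a single row, shape (1, 126).
num_arr = np.arange(126).reshape(1, 126)

# Flatten to one dimension, shape (126,).
flat = num_arr.reshape(-1)

df = pd.DataFrame()
df["numbers"] = flat
print(flat.shape, len(df))
```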
