I am processing a dataset with a date column in it. But the date format is strange to me:
date
59:06.4
42:42.9
07:18.0
......
I have never seen this format before. Could anyone tell me what this format is? And if I use Python to process it, which functions should I use?
I think I know. This is a date + time format. When I read it in Python, it is automatically converted to a datetime.
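If the values really are stored as the text shown (for example in a CSV export), one possible reading is minutes:seconds.fraction, which is how spreadsheet programs often display a full timestamp. That interpretation is an assumption; the question does not confirm it. A minimal sketch using only the standard library, where the helper name parse_min_sec is made up for illustration:

```python
from datetime import datetime, timedelta

def parse_min_sec(text):
    """Parse 'MM:SS.f' text into a timedelta (assumes minutes:seconds.fraction)."""
    # strptime fills in a default date of 1900-01-01, so only the
    # minute, second and microsecond fields are meaningful here.
    parsed = datetime.strptime(text, "%M:%S.%f")
    return timedelta(minutes=parsed.minute,
                     seconds=parsed.second,
                     microseconds=parsed.microsecond)

for raw in ["59:06.4", "42:42.9", "07:18.0"]:
    print(raw, "->", parse_min_sec(raw))
```

If the source file actually holds full datetimes, the date and hour cannot be recovered from this text; it is better to read the original cell values instead.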
Related
I have a string in the format "MMM-YY", e.g. "jun-22", "Jan-22", etc.
I want to convert it into a date on the 1st day of the month, in the following format:
Jan-22 --> 01-Jan-22
Feb-21 --> 01-Feb-21
I have tried a few ways but couldn't get to the solution.
Can someone please advise on the quickest and most efficient way of doing this in a PySpark DataFrame?
Code used could be pyspark or Python.
Thanks for the help. I was able to add "01-" at the beginning of the date string and then convert it into a date.
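The approach described above (prepend "01-" and then parse) can be sketched in plain Python. This is only an illustration: the function name first_of_month is hypothetical, English month abbreviations are assumed, and a PySpark version would express the same idea with column functions rather than this helper.

```python
from datetime import datetime

def first_of_month(mmm_yy):
    """Turn 'Jan-22' into '01-Jan-22' after checking it is a real month/year."""
    # Prepend the day, then parse to validate. %b matches abbreviated month
    # names (English names assumed, since %b depends on the current locale).
    parsed = datetime.strptime("01-" + mmm_yy, "%d-%b-%y")
    # Format back out so the month is consistently capitalised.
    return parsed.strftime("%d-%b-%y")

print(first_of_month("Jan-22"))
print(first_of_month("jun-22"))
```

Keeping the parsed datetime instead of the formatted string is usually more convenient if the column will be sorted or filtered by date later.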
I have data with an int64 index and values like "01/11/2018". The data was imported from a CSV. I am unable to convert it to the "01-11-2018" format. How do I do this? I get this error message:
'time data 0 does not match format '%Y' (match)'
I got the data from the following website:
https://www.nasdaq.com/symbol/spy/historical
and you can find a ' Download this file in Excel Format ' icon at the bottom.
import datetime
spyderdat.index = pd.to_datetime(spyderdat.index, format='%Y')
spyderdat.head()
How do I format this correctly?
Thanks a lot.
Your format string must match exactly:
import pandas as pd
spyderdat.index = pd.to_datetime(spyderdat.index, format='%d/%m/%Y')
spyderdat.head()
Example w/o spyder:
import datetime
date = "1/11/2018"
print(datetime.datetime.strptime(date,"%d/%m/%Y"))
Output:
2018-11-01 00:00:00
You can then format this datetime with strftime however you like (see the strftime format codes in the Python documentation), or simply store the datetimes themselves.
Assuming your input is a string, simply converting the / to - won't fix the issue.
The real problem is that you've told to_datetime to expect the input string to be only a 4-digit year but you've handed it an entire date, days and months included.
If you meant to use only the year portion you should manually extract the year first with something like split.
If you meant to use the full date as a value, you'll need to change your format to something like %d/%m/%Y. (Although I can't tell if your input is days first or months first due to their values.)
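To make the point about matching formats concrete, here is a small standard-library sketch. It assumes the input is the day-first string from the question; the variable names are illustrative only.

```python
from datetime import datetime

raw = "01/11/2018"

# A format that covers only the year cannot consume the whole string,
# so strptime raises ValueError.
try:
    datetime.strptime(raw, "%Y")
    year_only_ok = True
except ValueError:
    year_only_ok = False

# A format that describes every part of the string parses cleanly
# (day-first is assumed here).
parsed = datetime.strptime(raw, "%d/%m/%Y")

print(year_only_ok, parsed.date())
```

If the data turns out to be month-first, swapping to "%m/%d/%Y" is the only change needed.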
The easy way is to try this
datetime.datetime.strptime("01/11/2018", '%d/%m/%Y').strftime('%d-%m-%Y')
I have a python script that generates a datetime string using this line of code:
data['timestamp'] = datetime.isoformat(datetime.utcnow())
That generates something like the following:
2017-05-24T04:08:09.530033
How do I convert that to "MYSQL insertable" datetime format in a clean way?
Thanks!
Try to use MySQL's STR_TO_DATE() function to parse the string that you're attempting to insert.
I hope this may help you
You can specify any format like this, depending on the one you've set in MySQL:
data['timestamp'] = pd.to_datetime(data['timestamp'], format='%Y-%m-%dT%H:%M:%S.%f')
First off, it looks like you ran from datetime import * rather than import datetime. That's tempting because it lets you type less when you want to refer to parts of the module, but it can get you into name collision issues later. An alternative with less typing is something like import datetime as dt, that way later you can just use dt.datetime. This will make your code cleaner.
MySQL accepts several date formats, which are described in detail in the MySQL documentation. In particular:
The DATETIME type is used for values that contain both date and time
parts. MySQL retrieves and displays DATETIME values in YYYY-MM-DD HH:MM:SS format.
ISO 8601 timestamps look almost exactly like that: 2017-05-24T04:19:32
So if the only difference is the "T" in the middle instead of a space, just run something like this, assuming you don't change your import statements.
timestamp = str(datetime.isoformat(datetime.utcnow()))
timestamp = timestamp.replace("T", " ")
data['timestamp'] = timestamp
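An alternative to replacing the "T" is to format the datetime directly with strftime. The sketch below is one way to do it, not the only one: the helper name to_mysql_datetime and the constant MYSQL_DATETIME are made up for illustration, and fractional seconds are deliberately dropped.

```python
from datetime import datetime, timezone

MYSQL_DATETIME = "%Y-%m-%d %H:%M:%S"  # layout MySQL displays for DATETIME

def to_mysql_datetime(value):
    """Format a datetime, or an ISO 8601 string, as 'YYYY-MM-DD HH:MM:SS'."""
    if isinstance(value, str):
        # Accepts strings such as '2017-05-24T04:08:09.530033'.
        value = datetime.fromisoformat(value)
    return value.strftime(MYSQL_DATETIME)

print(to_mysql_datetime("2017-05-24T04:08:09.530033"))
print(to_mysql_datetime(datetime.now(timezone.utc)))
```

Many database drivers also accept a datetime object as a query parameter, in which case no string formatting is needed at all.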
I have a pandas dataframe with a column containing a date; the format of the original string is YYYY/DD/MM HH:MM:SS.
I am trying to convert the string into a datetime format, by using
df['Date']=pd.to_datetime(df['Data'], errors='coerce')
but when I plot it I can see that the format is not recognized correctly.
Can you help me understand whether there is an option to tell Python the correct format for reading the column?
I have seen the format argument of the to_datetime function, but I can't get it to work.
Thanks a lot for your help!
Try this:
df['Date'] = pd.to_datetime(df['Data'], format='%Y/%d/%m %H:%M:%S')
It looks like you're using a non-standard date format; the usual order is YYYY-MM-DD. Try parsing it with the strptime() function:
import time
time.strptime('2016/15/07', '%Y/%d/%m')
If you need to get it to a string after that use time.strftime().
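The reason an explicit format matters for year/day/month data is that some values are ambiguous: a string like 2016/03/07 is valid in either order, so a parser that guesses can silently pick the wrong month. A small sketch with made-up sample rows, using the standard library rather than pandas:

```python
from datetime import datetime

FMT = "%Y/%d/%m %H:%M:%S"  # year / day / month, as described in the question

# Made-up sample values; the second one would also be valid as month/day.
rows = ["2016/15/07 08:30:00", "2016/03/07 12:00:00"]
parsed = [datetime.strptime(r, FMT) for r in rows]

for p in parsed:
    print(p.isoformat(sep=" "))
```

Both rows come out in July, which is what the YYYY/DD/MM layout means. The same format string can be passed to pd.to_datetime through its format argument.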
I'm reading a date from an Excel cell in Python (using .Value on the cell)... the result that I get is:
07/06/10 00:00:00
I thought this was a string, and so went about trying to figure out how to convert this to the format I need ("yyyyMMdd", or "20100706" in this example). However, after some playing around I realized that it is not being pulled as a string... Running type() on it returns <type 'time'> .
I then assumed that it was a Python time object, and tried using strftime on it to convert it to a string... but that didn't work either. It doesn't recognize the strftime method on the value.
Any idea on how to parse this properly and get the format I want? What am I doing wrong? (And if this clearly contains a date as well as a time, why is Python automatically considering it a time object?)
You can convert it to a time object like this:
import time
time.strptime(str(thetime), '%m/%d/%y %H:%M:%S')
And then you should be able to manipulate it to your heart's content.
HTH
Have you tried str(time)? That should give it to you in a string and then you can play around with the formatting all you like.
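Combining the two suggestions, a possible route to the "yyyyMMdd" string is to take str() of the value, parse it, and format it again. This sketch assumes that str() of the cell value produces the text shown in the question and that the text is month-first (which matches the expected "20100706"). The function name excel_text_to_yyyymmdd is hypothetical.

```python
from datetime import datetime

def excel_text_to_yyyymmdd(value):
    """Convert a value that prints as 'MM/DD/YY HH:MM:SS' into 'YYYYMMDD'."""
    # str() is assumed to give text like '07/06/10 00:00:00'.
    text = str(value)
    parsed = datetime.strptime(text, "%m/%d/%y %H:%M:%S")
    return parsed.strftime("%Y%m%d")

print(excel_text_to_yyyymmdd("07/06/10 00:00:00"))
```

Note that %y maps two-digit years 00 to 68 to 2000 to 2068, so older dates stored with a two-digit year need extra care.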