How to read time-series data in Python 3.7 - python

Today I started working with time-series data in Python. First, I tried reading the time series from a CSV file using the pandas library (imported as pd).
Unfortunately, I keep getting the error shown below. Any help on this would be highly appreciated.
PS: I am using Python 3.73
import pandas as pd
address = 'C:/Users/Anih John/Desktop/Python-workstation/ff/Superstore-Sales.csv'
Superstore = pd.read_csv(address, index_col='Order Date', parse_dates=True)
print(Superstore)
Then I get the following error:
Unable to open 'parsers.pyx': Unable to read file (Error: File not found (c:\users\anih john\desktop\python-workstation\ff\pandas\_libs\parsers.pyx)).

I would try escaping the backslashes in the path and switching to the Python parsing engine:
import pandas as pd
address = 'C:\\Users\\Anih John\\Desktop\\Python-workstation\\ff\\Superstore-Sales.csv'
Superstore = pd.read_csv(address, index_col='Order Date',parse_dates=True, engine='python',sep=',')
print(Superstore)
EDIT: Or simply pass the file path directly to the read_csv function:
import pandas as pd
Superstore = pd.read_csv('C:\\Users\\Anih John\\Desktop\\Python-workstation\\ff\\Superstore-Sales.csv', index_col='Order Date',parse_dates=True, engine='python',sep=',')
print(Superstore)
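To check that the date column really ends up as a time-series index, a minimal sketch along these lines may help. It reads a small in-memory CSV instead of the real file; only the 'Order Date' column name comes from the question, while the 'Sales' column and its values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for Superstore-Sales.csv; only 'Order Date' is taken
# from the question, the 'Sales' column and its values are invented.
csv_text = """Order Date,Sales
2014-01-03,16
2014-01-04,288
2014-01-05,19
"""

# index_col + parse_dates=True asks pandas to parse the index column as dates.
superstore = pd.read_csv(
    io.StringIO(csv_text), index_col="Order Date", parse_dates=True
)

# If parsing worked, the index is a DatetimeIndex rather than plain strings.
print(type(superstore.index).__name__)

# Rows can then be looked up by timestamp.
print(superstore.loc[pd.Timestamp("2014-01-04"), "Sales"])
```

If the index type printed for the real file is not DatetimeIndex, the dates were probably not in a format pandas could parse automatically.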

Related

Python: Reading tdms files using Python npTDMS and creating a Pandas dataframe

I'm able to read a LabVIEW .tdms file using the Python npTDMS package, and I can read the metadata and sample data for groups and channels as well.
However, the file has timestamp values with the year '9999', so I get the following error when converting to a pandas dataframe:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp:.
I went through the documentation in:
https://nptdms.readthedocs.io/en/stable/apireference.html#nptdms.TdmsFile.as_dataframe; however, couldn't find an option to deal with this data situation.
Passing errors='coerce' when calling as_dataframe() didn't work either. Any pointers or directions on reading the .tdms file into a pandas dataframe, given this data situation, would be very helpful.
Changing the data at the source is not an option.
Code snippet to read tdms file:
import numpy as np
import pandas as pd
from nptdms import TdmsFile as td
tdms_file = td.read(<tdms file name>)
tdms_file_df = tdms_file.as_dataframe()
Error while creating a pandas dataframe
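For context, pandas timestamps stored at nanosecond resolution only reach the year 2262, which is why a year-9999 value can raise OutOfBoundsDatetime. The sketch below is not an npTDMS solution; it only shows how pandas itself can coerce such values, assuming the time values have already been extracted as plain Python datetimes (the sample values are hypothetical, and how to extract them from the TDMS file would need checking against the npTDMS documentation).

```python
from datetime import datetime

import pandas as pd

# Hypothetical values standing in for a TDMS time channel; the second one
# mimics the year-9999 placeholder described in the question.
raw_times = [datetime(2020, 1, 1), datetime(9999, 12, 31)]

# errors="coerce" asks pandas to substitute NaT for any value it cannot
# represent, instead of raising OutOfBoundsDatetime. On pandas versions that
# use nanosecond resolution here, the year-9999 entry becomes NaT.
converted = pd.to_datetime(raw_times, errors="coerce")
print(converted)
```

Whether the out-of-range entry is shown as NaT or kept as a valid date depends on the timestamp resolution the installed pandas version uses.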

Basic Importing Excel Documents Into Python

I'm a new Python user and am simply trying to import an Excel (or CSV) file into Jupyter Notebook to play around with.
From Google searches, the common code I see is something like the below:
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
df = pd.read_excel('File.xlsx', sheet_name='Sheet1')  # 'sheetname' is the old spelling and is rejected by current pandas
print("Column headings:")
print(df.columns)
I tried this with a CSV file and got the below error message:
File "", line 5
df = pd.read_excel(C:\Users\dhauge1\Desktop\Python Workshop\fortune500.csv, sheetname=fortune500)
^ SyntaxError: invalid syntax
Please see above for error message. Is anyone able to help me understand what I'm doing wrong?
When reading a CSV file, use the command pd.read_csv('filename'). The path also has to be a quoted string; the SyntaxError above comes from passing it unquoted.
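As a rough illustration of reading a CSV with a quoted path, here is a small sketch. It writes a throwaway file to a temporary folder so it can run anywhere; the file contents and column names are invented, and only the file name echoes the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file written to a temporary folder so the example is runnable;
# the columns and rows are invented for illustration.
folder = tempfile.mkdtemp()
csv_path = os.path.join(folder, "fortune500.csv")
with open(csv_path, "w") as fh:
    fh.write("company,rank\nAcme,1\nGlobex,2\n")

# The path must be a string: either a quoted literal or, as here, a variable
# holding one. For a literal Windows path, a raw string (r'...') keeps the
# backslashes from being treated as escape sequences.
df = pd.read_csv(csv_path)

print("Column headings:")
print(list(df.columns))
```

read_csv takes no sheet argument, since a CSV file has no sheets; that parameter belongs to read_excel only.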

How to convert a csv file to a dataframe in Python 3.6 [duplicate]

This question already exists:
Reading CSV files in Python, using Jupyter Notebook through IntelliJ IDEA
Closed 4 years ago.
I'm trying to tackle the Kaggle Titanic challenge. Bear with me, as I'm fairly new to data science. I was previously struggling to get the following syntax to work (see my previous question, "Reading CSV files in Python, using Jupyter Notebook through IntelliJ IDEA"):
import numpy as np
import pandas as pd
from pandas import Series,Dataframe
titanic_df = pd.read_csv('train.csv')
titanic.head()
However, using the code below, I am able to open the file and read/print its contents, but I need to convert the data to a dataframe so that it can be worked with. Any suggestions?
file_path = '/Volumes/LACIE SETUP/Data_Science/Data_Analysis_Viz_InPython/Example_Projects/train.csv'
with open(file_path) as train_fp:
    for line in train_fp:
        print(line)
The code above was able to print out the data, but when I tried passing 'file_path' to:
titanic_df = pd.read_csv('file_path.csv')
I received the same error as before. I'm not sure what I'm doing wrong. I KNOW the file 'train.csv' exists in that location because 1) I put it there and 2) its contents can be printed when pointed to its location.
So what the heck am I doing wrong??? :/
read_csv will create a pandas DataFrame. So, as long as your file path is right, the following code should work. Also, make sure to pass the file_path variable and not the string "file_path.csv", which pandas treats as a literal file name.
import pandas as pd
file_path = '/Volumes/LACIE SETUP/Data_Science/Data_Analysis_Viz_InPython/Example_Projects/train.csv'
titanic_df = pd.read_csv(file_path)
titanic_df.head()
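The difference between the variable and the quoted string can be seen in a small sketch like the one below. It uses a temporary stand-in for train.csv with two made-up rows, so the path and data here are assumptions rather than the asker's actual file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for train.csv, written to a temporary folder.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "train.csv")
with open(file_path, "w") as fh:
    fh.write("PassengerId,Survived\n1,0\n2,1\n")

# Passing the variable: pandas opens the path stored in file_path.
titanic_df = pd.read_csv(file_path)
print(titanic_df.head())

# Passing the quoted text: pandas looks for a file literally named
# 'file_path.csv' in the current working directory.
try:
    pd.read_csv("file_path.csv")
except FileNotFoundError as exc:
    print("Literal string failed:", exc)
```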

Read SAS data into Python

I am trying to read a SAS dataset using Python, but it shows this error:
"IndexError: list assignment index out of range"
I am not sure what could be the reason. Can anyone help me out?
Following is the code where I am trying to read SAS data (which is in multi millions) into Python:
import pandas as pd
import numpy as np
from sas7bdat import SAS7BDAT
with SAS7BDAT('/dat_xyz/mdpqr/data_test.sas7bdat') as m:
    mdata = m.to_data_frame()
Let me know the solution.
Thanks,
Surya

Pandas excel reading buffer error (python 3)

I am having a problem reading an excel file from a download link using pandas. The excelString below loads correctly and looks like an excel file, but when trying to convert it to excel using pandas it says the file name is too long. Any assistance would be appreciated. This is a useful generic problem to solve for anyone accessing iShares index membership info.
import urllib.request
import pandas as pd
f = urllib.request.urlopen('https://www.ishares.com/us/239714/fund-download.dl')
excelString = f.read().decode('utf-8')
pd.ExcelFile(excelString)
The Error returned is OSError: [Errno 36] File name too long
Works fine for me using Python3 and pandas 0.16.2 - do you have the latest version?
