Closing file after using to_csv() - python

I am new to Python and so far I am loving the IPython notebook for learning. I am using the to_csv() function to write a pandas DataFrame out to a file. I wanted to open the CSV to see how it would look in Excel, but it would only open in read-only mode because it was still in use by another process. How do I close the file?
import pandas as pd
import numpy as np
import statsmodels.api as sm
import csv
df = pd.DataFrame(file)
path = "File_location"
df.to_csv(path+'filename.csv', mode='wb')
This writes the file without a problem, but when I open it in Excel I get the read-only warning. This also raised a larger question for me: is there a way to see which files Python is currently using or touching?

This is the better way of doing it.
With a context manager, you don't have to manage the file resource yourself; the file is closed when the block ends.
with open("thefile.csv", "w") as f:
    df.to_csv(f)

Try opening and closing the file yourself:
outfile = open(path+'filename.csv', 'wb')
df.to_csv(outfile)
outfile.close()

In recent versions of pandas, to_csv() closes the file automatically once it has finished writing to a path.
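A minimal sketch of how to check that the handle really is closed after the with block. The DataFrame contents and the temporary directory are made up for illustration and stand in for the asker's data and path.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# A temporary directory stands in for the asker's "File_location".
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "filename.csv")

# The context manager closes the handle when the block ends.
with open(csv_path, "w", newline="") as f:
    df.to_csv(f, index=False)

handle_closed = f.closed  # True once the with block has exited

# Re-open the file to confirm that the data was flushed to disk.
with open(csv_path) as check:
    written = check.read()

print(handle_closed)
print(written)
```

If Excel still reports the file as read-only after this, the lock is probably held by something other than this handle, for example another program that has the file open.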

Related

What do I have to change so that Jupyter shows columns?

I just want to import this csv file. It can read it but somehow it doesn't create columns. Does anyone know why?
This is my code:
import pandas as pd
songs_data = pd.read_csv('../datasets/spotify-top50.csv', encoding='latin-1')
songs_data.head(n=10)
Result that I see in Jupyter:
P.S.: I'm fairly new to Jupyter and programming, but from everything I have found it should work properly. I don't know why it doesn't.
To load a CSV file properly you may need to specify some parameters. For example, in your case you need to specify quotechar:
df = pd.read_csv('../datasets/spotify-top50.csv',quotechar='"',sep=',', encoding='latin-1')
df.head(10)
If you still have a problem, look at your CSV file again and consult the pandas documentation, so that you can set the parameters to match the structure of your CSV file.
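As a rough illustration of why these parameters matter, the sketch below parses the same text twice. The sample data is invented (the real Spotify file may use a different separator), and io.StringIO stands in for the file on disk.

```python
import io

import pandas as pd

# Invented sample: semicolon-separated, with one quoted field that
# itself contains a semicolon.
raw = 'Track;Artist\n"Hello; Goodbye";Artist 1\nSong B;Artist 2\n'

# With the default separator (a comma) every line becomes a single column.
wrong = pd.read_csv(io.StringIO(raw))

# Matching sep and quotechar to the file gives the expected two columns.
right = pd.read_csv(io.StringIO(raw), sep=";", quotechar='"')

print(wrong.shape)
print(right.shape)
print(right)
```

Opening the CSV in a plain text editor is usually the quickest way to see which separator and quote character it actually uses.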

Python pandas read_csv returning FileNotFoundError despite the file existing (Mac)

I am trying to read into a pandas dataframe from a csv. The data is in the format:
date,total_bytes
2018-08-27,1.84E+14
2018-08-30,1.90E+14
2018-08-31,1.93E+14
My code looks like:
from pandas import read_csv
from pandas import datetime
from matplotlib import pyplot
series = read_csv(r'/Users/taylorjewell/Desktop/dataset_size_daily.csv',
                  header=0)
print(series.head())
series.plot()
pyplot.show()
Despite that path existing (I have checked countless times), I am getting a file-not-found exception for some reason: FileNotFoundError: File b'/Users/taylorjewell/Desktop/dataset_size_daily' does not exist
I am running this on a Mac, if that is relevant. Any help you are able to offer would be much appreciated!
For file paths, I would suggest using pathlib:
from pathlib import Path
data_file = Path("/Users/taylorjewell/Desktop/dataset_size_daily.csv")
series = read_csv(data_file, header=0)
However, it also depends on where you are trying to access the file from.
I don't think you need the r prefix on a Mac. Try:
read_csv('/Users/taylorjewell/Desktop/dataset_size_daily.csv',
         header=0)
I just ran into this issue today and wanted to share. If you download a CSV file to a Mac, then open the file and save it, the file extension changes to .numbers. So make sure you move the file without opening it, and double-check that the file extension is .csv.
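One way to check both the path and the extension before calling read_csv is sketched below. The helper name describe_path is hypothetical, and a temporary directory with a placeholder .numbers file stands in for the asker's Desktop folder.

```python
import tempfile
from pathlib import Path


def describe_path(candidate):
    """Return (exists, names of files in the same folder sharing the stem)."""
    candidate = Path(candidate)
    parent = candidate.parent
    if parent.is_dir():
        siblings = sorted(p.name for p in parent.glob(candidate.stem + ".*"))
    else:
        siblings = []
    return candidate.exists(), siblings


# Demo: the folder holds a .numbers file, but the code asks for .csv.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "dataset_size_daily.numbers").write_text("placeholder")

exists, siblings = describe_path(demo_dir / "dataset_size_daily.csv")
print(exists)    # False
print(siblings)  # ['dataset_size_daily.numbers']
```

If the expected name is missing but a file with the same stem and another extension shows up, the extension is the likely culprit.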

Editing a csv file with python

OK, so I'm looking to create a program that will interact with an Excel spreadsheet. The idea that seemed to work best is converting it to a CSV file. I've managed to make a program that prints the data, but I want it to edit the data and thus change the results in the CSV file itself.
Sorry if it's a bit confusing, as my programming skills aren't great.
Here's the code:
import csv
with open('wert.csv') as csvfile:
    freq = csv.reader(csvfile, delimiter=',')
    for row in freq:
        print(row[0], row[1], row[2])
If anyone has a better idea on how to make this program work then it would be greatly appreciated.
Thanks
You could try using the pandas package, a widely used data analysis/manipulation library.
import pandas as pd
data = pd.read_csv('foo.csv')
#change data here, see pandas documentation
data.to_csv('bar.csv')
You can find the docs here
If your CSV file is composed of just numbers (floats), or numbers and a header, you can try reading it with:
import numpy as np
data=np.genfromtxt('name.csv',delimiter=',',skip_header=1)
Then modify your data in Python, and save it with:
data_modified=data**2 #for example
np.savetxt('name_modified.csv',data_modified,delimiter=',',header='whateverheader,you,want')
You can also read the Excel file directly with pandas and do the processing there:
import pandas
measured_data = pandas.read_excel(filename)
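Staying with the csv module from the question, an edit can be sketched as read, modify, write back. The function name scale_column, the column layout and the sample values are all invented for illustration; writing to a second file rather than over the original is a cautious default, not a requirement.

```python
import csv
import os
import tempfile


def scale_column(src_path, dst_path, column, factor):
    """Multiply one numeric column by factor and write the result out."""
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    header, body = rows[0], rows[1:]
    for row in body:
        row[column] = str(float(row[column]) * factor)
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(body)


# Demo with a throwaway file in place of the asker's wert.csv.
demo_dir = tempfile.mkdtemp()
src_path = os.path.join(demo_dir, "wert.csv")
dst_path = os.path.join(demo_dir, "wert_edited.csv")
with open(src_path, "w", newline="") as f:
    f.write("name,value\na,1\nb,2.5\n")

scale_column(src_path, dst_path, column=1, factor=2)

with open(dst_path, newline="") as f:
    edited = list(csv.reader(f))
print(edited)
```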

CParserError: Error tokenizing data

I'm having some trouble reading a csv file
import pandas as pd
df = pd.read_csv('Data_Matches_tekha.csv', skiprows=2)
I get
pandas.io.common.CParserError: Error tokenizing data. C error: Expected 1 fields in line 526, saw 5
and when I add sep=None to the read_csv call I get another error
Error: line contains NULL byte
I tried adding unicode='utf-8', and I even tried the csv reader, but nothing works with this file.
The CSV file is totally fine; I checked it and I see nothing wrong with it.
Here are the errors I get:
In your actual code, the line is:
>>> pandas.read_csv("Data_Matches_tekha.xlsx", sep=None)
You are trying to read an Excel file and not a plain-text CSV, which is why things are not working.
Excel files (xlsx) are in a special binary format which cannot be read as simple text files (like CSV files).
You need to either convert the Excel file to a CSV file (note: if you have multiple sheets, each sheet should be converted to its own CSV file) and then read that, or use read_excel or a library like xlrd, which is designed to read the binary format of Excel files; see Reading/parsing Excel (xls) files with Python for more information on that.
Use read_excel instead of read_csv if the file is an Excel file:
import pandas as pd
df = pd.read_excel("Data_Matches_tekha.xlsx")
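A quick way to tell whether a file with a .csv name is really an Excel workbook is sketched below. It relies on the fact that .xlsx files are zip archives; the helper name looks_like_xlsx and the demo files are made up, and any zip archive will pass the check, so treat it as a hint rather than proof.

```python
import os
import tempfile
import zipfile


def looks_like_xlsx(path):
    """Return True if the file is a zip archive, as .xlsx workbooks are."""
    return zipfile.is_zipfile(path)


demo_dir = tempfile.mkdtemp()

# A genuine plain-text CSV.
text_path = os.path.join(demo_dir, "real.csv")
with open(text_path, "w", newline="") as f:
    f.write("a,b\n1,2\n")

# A zip archive that has merely been given a .csv name.
zip_path = os.path.join(demo_dir, "renamed.csv")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("placeholder.txt", "stand-in for workbook contents")

print(looks_like_xlsx(text_path))  # False
print(looks_like_xlsx(zip_path))   # True
```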
I encountered the same error when I used to_csv to write some data and then read it in another script. I found an easy solution that avoids pandas' read function: a module named pickle.
pickle is part of the Python standard library, so there is nothing to install with pip.
First, write your data with the code below:
import pickle
with open(path, 'wb') as output:
    pickle.dump(variable_to_save, output)
Finally, load your data in another script using:
import pickle
with open(path, 'rb') as infile:
    data = pickle.load(infile)
Note that if you want to read the saved data with a different Python version than the one that wrote it, you can choose a compatible pickle protocol in the writing step by passing protocol=x to pickle.dump. For example, protocol=2 is the highest protocol that Python 2 can read.
I hope this is useful.

How do I open a CSV file in IPython on Windows

When I use Python for data analysis I want to open a csv file in IPython. When I use this statement:
In [92]: open('ch06/ex1.csv').read()
Out[92]: 'something,a,b,c,d,message\none,1,2,3,4,NA\ntwo,5,6,,8,world\nthree,9,10,11,12,foo'
This only gives me the raw contents of the file. How do I open the file as a table?
I'm not sure what you are asking, but if you're into data analysis you should learn pandas:
import pandas as pd
myFile = pd.read_csv('ch06/ex1.csv')
myFile.head()
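To show what "as a table" looks like here, the sketch below feeds the exact text from Out[92] to read_csv. io.StringIO stands in for the file ch06/ex1.csv so that the snippet runs on its own.

```python
import io

import pandas as pd

# The raw contents shown in Out[92] of the question.
raw = ("something,a,b,c,d,message\n"
       "one,1,2,3,4,NA\n"
       "two,5,6,,8,world\n"
       "three,9,10,11,12,foo")

table = pd.read_csv(io.StringIO(raw))

# Three rows and six columns; the empty field and the literal "NA"
# are both read as missing values by default.
print(table)
print(table.shape)
```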
