Editing a csv file with python

Ok, so I'm looking to create a program that will interact with an Excel spreadsheet. The approach that seemed most workable is converting it to a CSV file. I've managed to write a program that prints the data, but I want it to edit the data and thus change the results in the CSV file itself.
Sorry if it's a bit confusing, as my programming skills aren't great.
Here's the code:
import csv
with open('wert.csv') as csvfile:
    freq = csv.reader(csvfile, delimiter=',')
    for row in freq:
        print(row[0], row[1], row[2])
If anyone has a better idea on how to make this program work then it would be greatly appreciated.
Thanks
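Staying with the csv module the question already uses, a read-modify-write round trip could look like the sketch below (the file contents are placeholders, since the real wert.csv isn't shown):

```python
import csv

# Hypothetical stand-in for wert.csv, created so the sketch is runnable
with open('wert.csv', 'w', newline='') as f:
    f.write('a,b,c\n1,2,3\n')

# Read everything into memory, edit, then write the rows back out
with open('wert.csv', newline='') as csvfile:
    rows = list(csv.reader(csvfile, delimiter=','))

rows[1][0] = '99'  # change one cell

with open('wert.csv', 'w', newline='') as csvfile:
    csv.writer(csvfile, delimiter=',').writerows(rows)
```

Reading the whole file into a list first keeps the logic simple; for very large files a row-by-row copy to a second file would be safer.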

You could try using the pandas package, a widely used data analysis/manipulation library.
import pandas as pd
data = pd.read_csv('foo.csv')
# change the data here; see the pandas documentation
data.to_csv('bar.csv', index=False)  # index=False skips writing the row index
You can find the docs on the pandas website.
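To make the "change data here" step concrete, a minimal sketch (the column names and values are made up, not from the question):

```python
import pandas as pd
from io import StringIO

# Hypothetical data standing in for the question's wert.csv
csv_text = 'name,score\nalice,10\nbob,20\n'
data = pd.read_csv(StringIO(csv_text))

data['score'] = data['score'] + 1     # edit a whole column at once
data.to_csv('bar.csv', index=False)   # write the result back to disk
```

Any edit you can express on a column or on selected rows works the same way before the final to_csv call.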

If your csv file is composed of just numbers (floats), or numbers and a header, you can try reading it with:
import numpy as np
data = np.genfromtxt('name.csv', delimiter=',', skip_header=1)
Then modify your data in Python, and save it with:
data_modified = data**2  # for example
np.savetxt('name_modified.csv', data_modified, delimiter=',', header='whatever,header,you,want')

You can read the Excel file directly using pandas and do the processing there:
import pandas
measured_data = pandas.read_excel(filename)
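A runnable round-trip sketch of that idea (it assumes openpyxl is installed for .xlsx support, and the file and column names are placeholders, not from the question):

```python
import pandas as pd

# Create a tiny stand-in workbook so the sketch is self-contained
pd.DataFrame({'value': [1.0, 2.5]}).to_excel('measurements.xlsx', index=False)

measured_data = pd.read_excel('measurements.xlsx')
measured_data['value'] = measured_data['value'] * 2   # process directly
measured_data.to_excel('processed.xlsx', index=False)
```

This avoids the CSV conversion step entirely, at the cost of needing an Excel engine (openpyxl for .xlsx) installed.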

Related

While parsing a .ods file with pandas using read_excel() (and odf), how to drop comments in cells?

I'm trying to parse .ods files with pandas, using the pd.read_excel() function, which uses odf under the hood. The problem I face is simple: some cells have comments, and pandas treats them as if they were regular content.
Here is a basic example; given a very simple .ods file with a single comment:
Importing that file into a dataframe using
import pandas as pd
df = pd.read_excel("example_with_comment.ods")
gives a dataframe that includes the comment text, while I would have liked to retrieve the content of the cell only.
Does anyone know how to drop the comments during parsing?
I'm using pandas 1.3.4.
Thanks a lot to anyone who can give me a hint!
It seems like a bug. Instead of read_excel, you may try this module:
https://pypi.org/project/pandas-ods-reader/

Options for creating a csv structure I can work with

My task is to take the output from a machine and convert that data to JSON. I am using Python, but the issue is the structure of the output.
From my research, a CSV usually has the keys in the first row, with the values in the same order underneath. Example: https://e.nodegoat.net/CMS/upload/guide-import_person_csv_notepad.png
However, the output from my machine doesn't look like this.
Mine looks like:
Date:,10/10/2015
Name:,"Company name"
Location:,"Company location"
Serial num:,"Serial number"
So the machine I'm working with writes each result to a new .dat file instead of appending to a single CSV with a row of keys. Technically the data is comma-separated, but I'm not sure how to work with it.
How should I go about turning this kind of data into JSON? Should I look into restructuring the data into the standard CSV layout, or is there a way to work with it as-is without any cleanup? In either case, any direction is appreciated.
You can try transposing it with pandas:
import pandas as pd
from io import StringIO
data = '''\
Date:,10/10/2015
Name:,"Company name"
Location:,"Company location"
Serial num:,"Serial number"
'''
f = StringIO(data)
df = pd.read_csv(f, header=None)   # header=None: the first row is data, not column names
t = df.set_index(df.columns[0]).T  # the keys become column names after the transpose
print(t['Location:'].iloc[0])
print(t['Serial num:'].iloc[0])
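Since the end goal of the question is JSON, the transposed single-row frame converts directly; a sketch on the same kind of data (toy values):

```python
import json
import pandas as pd
from io import StringIO

# Same key,value layout as the machine output
data = 'Date:,10/10/2015\nName:,Company name\n'
df = pd.read_csv(StringIO(data), header=None)  # no header row in this format
t = df.set_index(0).T                          # keys become column names

record = t.iloc[0].to_dict()                   # one dict per .dat file
print(json.dumps(record))
```

Running this over each .dat file and collecting the dicts into a list gives a JSON array without restructuring the source files.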

What do I have to change so that Jupyter shows columns?

I just want to import this CSV file. Pandas can read it, but somehow it doesn't create columns. Does anyone know why?
This is my code:
import pandas as pd
songs_data = pd.read_csv('../datasets/spotify-top50.csv', encoding='latin-1')
songs_data.head(n=10)
Result that I see in Jupyter:
P.S.: I'm kind of new to Jupyter and programming, but from everything I found this should work. I don't know why it doesn't.
To load a CSV file properly you should specify some parameters; for example, in your case you need to specify quotechar:
df = pd.read_csv('../datasets/spotify-top50.csv',quotechar='"',sep=',', encoding='latin-1')
df.head(10)
If you still have a problem, take another look at your CSV file and at the pandas documentation, so you can set the parameters to match your CSV file's structure.
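To see why quoting parameters matter, a small sketch with toy data (not the actual Spotify file): a field containing a comma stays intact only while the quote character is honored.

```python
import csv
import pandas as pd
from io import StringIO

# Toy data: the title field contains a comma, so quoting matters
raw = 'title,artist\n"Hello, World",Adele\n'

# With the default quotechar the embedded comma stays inside one field...
ok = pd.read_csv(StringIO(raw), quotechar='"', sep=',')

# ...but with quoting disabled the same row splits into three pieces
bad = pd.read_csv(StringIO(raw), quoting=csv.QUOTE_NONE, sep=',',
                  skiprows=1, names=['a', 'b', 'c'])

print(ok.shape, bad.shape)
```

A mismatch in the other direction has the same effect: if the file quotes whole lines and pandas doesn't interpret those quotes the way the writer intended, everything collapses into the wrong number of columns.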

How to convert a csv file to a dataframe in Python 3.6 [duplicate]

This question already exists:
Reading CSV files in Python, using Jupyter Notebook through IntelliJ IDEA
Closed 4 years ago.
I'm trying to tackle the Kaggle Titanic challenge. Bear with me, as I'm fairly new to data science. I was previously struggling to get the following syntax to work; see my previous question (Reading CSV files in Python 3.6, using IntelliJ IDEA):
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
titanic_df = pd.read_csv('train.csv')
titanic_df.head()
However, using the code below, I am able to open the file and read/print its contents, but I need to convert the data to a dataframe so that it can be worked with. Any suggestions?
file_path = '/Volumes/LACIE SETUP/Data_Science/Data_Analysis_Viz_InPython/Example_Projects/train.csv'
with open(file_path) as train_fp:
    for line in train_fp:
        print(line)
The code above was able to print out the data, but when I tried passing file_path to:
titanic_df = pd.read_csv('file_path.csv')
I received the same error as before. Not sure what I'm doing wrong. I KNOW the file 'train.csv' exists in that location because 1) I put it there and 2) its contents can be printed when pointed to its location.
So what the heck am I doing wrong??? :/
read_csv will create a pandas DataFrame, so as long as your file path is right, the following code should work. Also, make sure to use the file_path variable and not the string 'file_path.csv':
import pandas as pd
file_path = '/Volumes/LACIE SETUP/Data_Science/Data_Analysis_Viz_InPython/Example_Projects/train.csv'
titanic_df = pd.read_csv(file_path)
titanic_df.head()
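When a path keeps failing, checking it explicitly before the read narrows the problem down; a small self-contained sketch (it creates a tiny stand-in file, since the question's real path can't be reproduced here):

```python
from pathlib import Path
import pandas as pd

# Tiny stand-in for train.csv so the sketch runs anywhere
file_path = Path('train.csv')
file_path.write_text('a,b\n1,2\n')

if file_path.exists():                 # fail early with a clear message
    titanic_df = pd.read_csv(file_path)
else:
    raise FileNotFoundError(file_path)

print(titanic_df.shape)
```

If exists() returns False for a path you believe is right, the mismatch is usually in spaces, case, or the working directory rather than in read_csv itself.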

Closing file after using to_csv()

I am new to Python and so far I am loving the IPython notebook for learning. I am using the to_csv() function to write a pandas dataframe out to a file. I wanted to open the CSV to see how it would look in Excel, but it would only open in read-only mode because it was still in use by another process. How do I close the file?
import pandas as pd
import numpy as np
import statsmodels.api as sm
import csv
df = pd.DataFrame(file)
path = "File_location"
df.to_csv(path + 'filename.csv', mode='wb')
This writes out the file with no problem, but when I check it in Excel I get the read-only warning. This also brought up a larger question for me: is there a way to see which files Python is currently using/touching?
A better way is to use a context manager, so you don't have to handle the file resource yourself:
with open("thefile.csv", "w") as f:
    df.to_csv(f)
Try opening and closing the file yourself:
outfile = open(path + 'filename.csv', 'wb')
df.to_csv(outfile)
outfile.close()
The newest pandas to_csv closes the file automatically when it's done.
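The side question about seeing which files Python is touching was never answered directly. One way to inspect this on Linux, without any extra packages, is to read /proc/self/fd; this is a Linux-only sketch (on other platforms a third-party tool such as psutil would be needed):

```python
import os

# Linux-only: a process's open files show up as symlinks under
# /proc/self/fd, so listing them reveals what Python currently holds open.
def open_files():
    paths = []
    for fd in os.listdir('/proc/self/fd'):
        try:
            paths.append(os.readlink(os.path.join('/proc/self/fd', fd)))
        except OSError:
            pass  # the listing's own fd can vanish mid-loop
    return paths

f = open('demo.csv', 'w')
still_open = any(p.endswith('demo.csv') for p in open_files())
f.close()
print(still_open)
```

After the close() call the same check returns False, which is exactly the signal that Excel will no longer see the file as locked.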
