How to insert data into a specific cell in csv with python? - python

I am trying to insert data into a specific cell in a CSV file.
In the desired output, the data in cell A1 ("Customer") is replaced with new data ("Name").
My code is as follows.
import pandas as pd
#The existing CSV file
file_source = r"C:\Users\user\Desktop\Customer.csv"
#Read the existing CSV file
df = pd.read_csv(file_source)
#Insert"Name"into cell A1 to replace "Customer"
df[1][0]="Name"
#Save the file
df.to_csv(file_source, index=False)
But it doesn't work. Please help me find the bug.

"Customer" is the column header, not a data cell, so you need to rename the column:
df = df.rename(columns={'Customer': 'Name'})
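To see why renaming is the right tool here: read_csv treats the first line as the column header by default, so "Customer" never shows up in the data rows. A minimal sketch, using an in-memory string in place of Customer.csv (the sample rows are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical stand-in for Customer.csv; the data rows are made up.
csv_text = "Customer\nAlice\nBob\n"

df = pd.read_csv(io.StringIO(csv_text))
# The first line became the header, not a data row.
print(list(df.columns))  # ['Customer']

# Rename the header and write the result back out as CSV text.
df = df.rename(columns={"Customer": "Name"})
out = df.to_csv(index=False)
print(out)
```

With a real file you would pass the path to read_csv and to_csv instead of the in-memory string.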

I am assuming you want to work with a header-less CSV. If that's the case, add header=None when reading the file so the first line is treated as data, and address cell A1 by position (row 0, column 0). Note that df[1][0] would point at the second column, not at A1.
import pandas as pd
# The existing CSV file
file_source = r"C:\Users\user\Desktop\Customer.csv"
# Read the existing CSV file without treating the first line as a header
df = pd.read_csv(file_source, header=None)  # notice this line is now different
# Insert "Name" into cell A1 (row 0, column 0) to replace "Customer"
df.iloc[0, 0] = "Name"
# Save the file, again without writing a header row
df.to_csv(file_source, index=False, header=False)
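The header-less approach can be tried without touching a real file. This is only a sketch: the two-column sample data is an assumption, and it uses .iloc so that cell A1 is addressed by position (row 0, column 0) whatever the column labels are.

```python
import io
import pandas as pd

# Hypothetical stand-in for Customer.csv; the data rows are made up.
csv_text = "Customer,City\nAlice,Paris\nBob,Oslo\n"

# header=None keeps the first line as ordinary data (row 0).
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Cell A1 is row 0, column 0; .iloc addresses it by position.
df.iloc[0, 0] = "Name"

# header=False stops pandas from writing the 0, 1, ... column labels.
result = df.to_csv(index=False, header=False)
print(result)
```

Only the first cell changes; every other cell is written back as it was read.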

Related

Attempting to add a column heading to the newly created csv file

I'm trying to add a header to the CSV file that I create in the code given below.
There's only one column in the CSV file that I'm trying to create.
The data frame is built from an array, which is
[0.6999346, 0.6599296, 0.69770324, 0.71822715, 0.68585426, 0.6738229, 0.70231324, 0.693281, 0.7101939, 0.69629824]
I just want to create a CSV file with a header, like this:
Desired CSV file (this is the format I want)
Please help me with detailed code, I'm new to coding.
I tried this
df = pd.DataFrame(c)
df.columns = ['Confidence values']
pd.DataFrame(c).to_csv('/Users/sunny/Desktop/objectdet/final.csv',header= True , index= True)
But I'm getting this CSV file instead
Try this
import pandas as pd
array = [0.6999346, 0.6599296, 0.69770324, 0.71822715, 0.68585426, 0.6738229, 0.70231324, 0.693281, 0.7101939, 0.69629824]
df = pd.DataFrame(array)
df.columns = ['Confidence values']
df.to_csv('final.csv', index=True, header=True)
Your call to pd.DataFrame(c) creates a new dataframe with no header, while df is the dataframe that has the header.
You are writing the dataframe with no header to the CSV, which is why the header is missing from the file. All you need to do is replace pd.DataFrame(c) with df.
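The difference between the two dataframes is easy to check by comparing the first line each one writes. A small sketch, using only the first three values of the array and index=False to keep the output short (both choices are just for illustration):

```python
import pandas as pd

c = [0.6999346, 0.6599296, 0.69770324]  # first three values from the question

df = pd.DataFrame(c)
df.columns = ["Confidence values"]

# A fresh DataFrame ignores the rename above; its only column is labelled 0.
unnamed = pd.DataFrame(c).to_csv(index=False)
# Writing df itself keeps the column name.
named = df.to_csv(index=False)

print(unnamed.splitlines()[0])  # 0
print(named.splitlines()[0])    # Confidence values
```

Building the frame as pd.DataFrame({"Confidence values": c}) would set the name in one step.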

How to preserve complicated excel header formats when manipulating data using Pandas Python?

I am parsing a large Excel data file into another one, but the headers are very irregular. I tried to use the skiprows argument of read_excel and that did not work. I also tried to include the header in
df = pd.read_excel(user_input, header= [1:3], sheet_name = 'PN Projection'), but then I get the error "ValueError: cannot join with no overlapping index names." To get around this I tried to name the columns by location, and that did not work either.
When I run the code as shown below everything works fine, but past cell "U" the header titles become "unnamed1, 2, ...". I understand this is because pandas treats the first row (which is empty) as the header, but how do I fix this? Is there a way to preserve the headers without manually typing in the format for each cell? Any and all help is appreciated, thank you!
A small section of the Excel file header
The code I am trying to run
#!/usr/bin/env python
import sys
import os
import pandas as pd
#load source excel file
user_input = input("Enter the path of your source excel file (omit 'C:'): ")
#reads the source excel file
df = pd.read_excel(user_input, sheet_name = 'PN Projection')
#Filtering dataframe
#Filters out rows with 'EOL' in column 'item status' and 'xcvr' in 'description'
df = df[~(df['Item Status'] == 'EOL')]
df = df[~(df['Description'].str.contains("XCVR", na=False))]
#Filters in rows with "XC" or "spartan" in 'description' column
df = df[(df['Description'].str.contains("XC", na=False) | df['Description'].str.contains("Spartan", na=False))]
print(df)
#Saving to a new spreadsheet called Filtered Data
df.to_excel('filtered_data.xlsx', sheet_name='filtered_data')
If you do not need the top 2 rows, then:
df = pd.read_excel(user_input, sheet_name='PN Projection', skiprows=range(0, 2))
This has worked for me when handling several strangely formatted files. Let me know if this isn't what you're looking for, or if there are additional issues.
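If the header rows need to be kept rather than skipped, one option is to pass a list of row positions as header, which gives MultiIndex columns. The sketch below uses read_csv on an in-memory string so that it runs without a spreadsheet; read_excel accepts skiprows and header with the same meaning. The layout (one title row followed by two header rows) and all names are assumptions, not the asker's actual file.

```python
import io
import pandas as pd

# Hypothetical layout: a title row, two header rows, then data.
text = (
    "Report title,,\n"
    "Part,Forecast,Forecast\n"
    "Number,Q1,Q2\n"
    "XC100,5,7\n"
    "XC200,3,4\n"
)

# Skip the title row, then use the next two rows together as the header.
df = pd.read_csv(io.StringIO(text), skiprows=1, header=[0, 1])

print(df.columns.nlevels)            # number of header levels
print(df[("Forecast", "Q1")].tolist())
```

Columns are then selected with tuples such as ("Forecast", "Q1"), one entry per header level.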

Insert a single cell of string above header row in python pandas

I have my dataframe ready to be written to an excel file but I need to add a single cell of string above it. How do I do that?
You can save the dataframe starting from the second row and then use other tools to write the first cell of your excel file.
Note that writing from pandas to Excel overwrites the file, so the steps have to be done in this order (although there are also ways to write to an existing Excel file without overwriting its data).
1. Save the dataframe, specifying startrow=1:
df.to_excel("filename.xlsx", startrow=1, index=False)
2. Write a cell value.
For example, using openpyxl (from a GeeksforGeeks tutorial):
from openpyxl import load_workbook
# load excel file
workbook = load_workbook(filename="filename.xlsx")
# open workbook
sheet = workbook.active
# modify the desired cell
sheet["A1"] = "A60983A Register"
# save the file
workbook.save(filename="filename.xlsx")
Try this MRE, so you can change your data as well.
import pandas as pd
df = pd.DataFrame({'label':['first','second','first','first','second','second'],
                   'first_text':['how is your day','the weather is nice','i am feeling well','i go to school','this is good','that is new'],
                   'second_text':['today is warm','this is cute','i am feeling sick','i go to work','math is hard','you are old'],
                   'third_text':['i am a student','the weather is cold','she is cute','ii am at home','this is bad','this is trendy']})
# Copy the column names into a new row labelled -1
df.loc[-1] = df.columns.values
# Sort so that row -1 comes first, then renumber the rows from 0
df.sort_index(inplace=True)
df.reset_index(drop=True, inplace=True)
# Put the extra string in the first header cell and blank out the others
df.rename(columns=
          {"label": "Register", 'first_text':'', 'second_text':'', 'third_text':''},
          inplace=True)

how to remove header from excel file using python

Experts, I want to remove the very first row from an Excel file using Python. I am sharing a screenshot of my source Excel file.
I want the output as
I am using the Python code below to remove the first row from the Excel file, but when I read it as a data frame and print it, the data is read as shown in the screenshot below,
and the code I am using is
import pandas as pd
import os
def remove_header():
    file_name = "AV Clients.xlsx"
    os.chmod(file_name, 0o777)
    df = pd.read_excel(file_name) #Read Excel file as a DataFrame
    #df = df.drop([0])
    print(df)
    #df.to_excel("AV_Clients1.xlsx", index=False)
remove_header()
Please suggest how I can remove the very first row from the Excel file whose screenshot I have shared at the top.
Thanks in advance
Kawaljeet
Just add the skiprows argument when reading the Excel file.
import pandas as pd
import os
def remove_header():
    file_name = "AV Clients.xlsx"
    os.chmod(file_name, 0o777)
    df = pd.read_excel(file_name, skiprows = 1) #Read Excel file as a DataFrame, skipping the first row
    print(df)
    df.to_excel("AV_Clients1.xlsx", index=False)
remove_header()

How to process only specific lines from a .csv file?

I need to do some calculations using a .csv file. The first 4 rows of the file are header information, so actual data starts at row 5 down to row 80,000+, and I will be calculating averages for specific columns. How do I only process lines after the header information?
This is part of my code so far:
for datafile in datafolder:
    # open file in read mode
    o_csvFile = open(datafile)
    # get the 5th line
    fifthLine = linecache.getline(roverFile, 5)
    # use while loop to read each line in file
    startReading >= fifthLine
    while startReading:
        line = o_csvFile.readline()
With Pandas you can use the skiprows argument of read_csv() to begin after a set of header rows:
import pandas as pd
pd.read_csv("data.csv", skiprows=4)
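One detail to watch: after skipping, read_csv still treats the next line as the column header by default. If row 5 is already data, as the question describes, adding header=None (and optionally names) keeps that row in the data. A minimal sketch with a made-up file layout and hypothetical column names:

```python
import io
import pandas as pd

# Hypothetical file: four lines of header information, then data rows.
text = (
    "Rover log\n"
    "Site: test\n"
    "Units: m, C\n"
    "Exported by hand\n"
    "1.0,20.0\n"
    "2.0,22.0\n"
    "3.0,24.0\n"
)

# Skip the four information lines; header=None keeps row 5 as data.
# The column names are assumptions made for this example.
df = pd.read_csv(
    io.StringIO(text),
    skiprows=4,
    header=None,
    names=["distance", "temperature"],
)

# Average only the columns of interest.
averages = df[["distance", "temperature"]].mean()
print(averages)
```

If the line right after the skipped rows does hold column names, drop header=None and names so that pandas uses that line as the header.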
