Hi, I am new to Python and I am trying to read from a CSV file using the following code:
dataFrame = pd.read_csv(path, header=None)
X = dataFrame.loc[:,1:93]
y = dataFrame.loc[:,94]
print(X)
print(y)
But I get the following error
KeyError: 'the label [94] is not in the [columns]'
But when I copy the contents of the same CSV file into another file and run the code, it works. Can anyone help me with this? I cannot keep copying the CSV files, as there is a huge number of them.
The CSV file read from 'path' was created using the following code:
criterion = dataFrame[93].map(lambda x: x==some_value)
with open(temp_file, 'a') as f:
    dataFrame[criterion].to_csv(f, sep='\t', encoding='utf-8', header=False)
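My problem has been solved. The file is tab-separated, so with the default comma separator all my columns were being read as a single column with \t between the values. So I specified the delimiter as follows (sep and delimiter are aliases, so only one of them should be given):
dataFrame = pd.read_csv(temp_file, sep='\t', header=None)
The code works now.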
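For anyone hitting the same KeyError, here is a minimal sketch of the effect. It assumes a small made-up frame and an in-memory buffer standing in for temp_file; none of the values come from the original data.

```python
import io
import pandas as pd

# Hypothetical 3x4 frame standing in for the real data.
original = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Write tab-separated text without header or index (stand-in for temp_file).
buffer = io.StringIO()
original.to_csv(buffer, sep="\t", header=False, index=False)
text = buffer.getvalue()

# With the default comma separator, every line ends up in a single column.
wrong = pd.read_csv(io.StringIO(text), header=None)

# With the matching separator, all four columns come back.
right = pd.read_csv(io.StringIO(text), sep="\t", header=None)

X = right.loc[:, 0:2]  # label-based slice, inclusive of column 2
y = right.loc[:, 3]
print(wrong.shape, right.shape)
```

Checking dataFrame.shape right after read_csv is a quick way to spot a separator mismatch before indexing by column label.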
Related
I am pretty new to Python and I am trying to filter some rows in a dataframe based on whether they contain certain strings. I want the script to automatically use the input name when saving the filtered dataframe to a text file.
Suppose I run my script with python3 code.py input.txt and my code looks like this:
#!/usr/bin/python3
import pandas as pd
import sys
data = pd.read_csv(sys.argv[1], sep='\t', header=0)
selectedcols = data['Func.refGene']
selectedrows = selectedcols.str.contains("exonic|splicing")
selecteddata = data[selectedrows]
selecteddata.to_csv(f'{sys.argv[1][:-4]}_exonic.splicing.txt', index=None, sep='\t', mode = 'a')
Here 'Func.refGene' is the column I want to search for the strings "exonic" and "splicing". I wrote this code and it worked before, but now when I try to run it the following error occurs:
File "code.py", line 12
selecteddata.to_csv(f'{sys.argv[1][:-4]}_exonic.splicing.txt', index=None, sep='\t', mode = 'a')
^
SyntaxError: invalid syntax
Would anyone know what could be wrong? I have searched for this syntax and haven't had any success.
For Python versions below 3.6, try this:
selecteddata.to_csv('{0}_exonic.splicing.txt'.format(sys.argv[1][:-4]), index=None, sep='\t', mode = 'a')
f-strings are only supported from Python 3.6 onward: https://docs.python.org/3/whatsnew/3.6.html#pep-498-formatted-string-literals
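As a side note, slicing with [:-4] assumes a three-letter extension. Below is a small sketch that derives the output name with os.path.splitext and str.format, so it also runs on interpreters older than 3.6. The helper name output_name is made up for illustration.

```python
import os

def output_name(input_path, suffix="_exonic.splicing.txt"):
    """Build the output file name from the input path (hypothetical helper)."""
    # splitext drops only the final extension, whatever its length.
    stem, _ext = os.path.splitext(input_path)
    # str.format works on Python versions that predate f-strings.
    return "{0}{1}".format(stem, suffix)

print(output_name("input.txt"))
```

The result can be passed to to_csv in place of the f-string.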
Is it possible to read a CSV file using a string taken from another dataframe?
Normally, to read a CSV file, I would use code like this:
df = pd.read_csv("C:/Users/Desktop/file_name.csv")
However, I would like to automate reading a CSV file using a string from another dataframe:
df1_string = df1.iloc[0]['file_name']
df2 = pd.read_csv("C:/Users/Desktop/df1_string.csv")
I got a FileNotFoundError when I tried the above code:
FileNotFoundError: [Errno 2] File b'C:/Users/Desktop/df1_string.csv' does not exist
Kindly advise, many thanks.
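Inside plain quotes, df1_string is just literal text. Use Python string formatting (here an f-string) so that the value of the variable is inserted into the path:
df1_string = df1.iloc[0]['file_name']
df2 = pd.read_csv(f"C:/Users/Desktop/{df1_string}.csv")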
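The same idea can be sketched with pathlib. This assumes the bare file name is already held in a string; the helper csv_path and the value "file_name" are placeholders, not taken from the question.

```python
from pathlib import PurePosixPath

def csv_path(folder, name):
    """Join a folder and a bare file name into a .csv path (hypothetical helper)."""
    # The value of `name` is interpolated, not the literal text "name".
    return PurePosixPath(folder) / f"{name}.csv"

df1_string = "file_name"  # stand-in for df1.iloc[0]['file_name']
path = csv_path("C:/Users/Desktop", df1_string)
print(path)
```

str(path) can then be handed to pd.read_csv.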
I am using the code below to create a file using Python. I don't get any error message when I run it, but no file gets created.
df_csv = pd.read_csv(r'X:\Google Drive\Personal_encrypted\Training\Ex_Files_Python_Excel\Exercise Files\names.csv', header=None)
df_csv.to_csv = (r"C:\temp\modified_names.csv")
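You are assigning a string to df_csv.to_csv, which overwrites the method instead of calling it; the parentheses around the string do not turn the assignment into a call. That is why no error is raised and no file is written.
Solution: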
df_csv.to_csv(r"C:\temp\modified_names.csv")
DataFrame.to_csv documentation here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
Edit: I also noticed the title says "Create Excel File"
To do that you would do the following:
df_csv.to_excel(r"C:\temp\modified_names.xlsx")
Documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
I usually write a .csv file like this:
import csv
with open(FILENAME, 'w', newline='') as file:
    csv_write = csv.writer(file, delimiter='\t')
    csv_write.writerow(LINE)
LINE is a list holding the values of the row you want to write.
I am reading csv files into python using:
df = pd.read_csv(r"C:\csvfile.csv")
But the file has some summary data, and the raw data starts where the value "valx" is found. If "valx" is not found, the file is useless. I would like to create new dataframes that start where "valx" is found. I have been trying for a while with no success. Any help on how to achieve this is greatly appreciated.
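Unfortunately, the skiprows parameter in pandas takes a number of lines, a list of line indices, or a callable on the line index; it cannot search for a value. You might want to parse the file before creating the dataframe.
As an example:
import csv
with open(r"C:\csvfile.csv", "r", newline='') as f:
    rows = list(csv.reader(f))
# index of the first row that contains the cell 'valx', or None if there is none
start = next((i for i, row in enumerate(rows) if 'valx' in row), None)
data = rows[start:] if start is not None else None
Using the standard library csv module, you can read the file and check whether valx is in it. If it is found, the data variable holds the rows from that point on; otherwise it is None.
From there you can use the data variable to create your dataframe.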
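Once the position of the marker is known, it can also be passed to skiprows. Below is a rough sketch on an in-memory sample. It assumes "valx" sits on the header line of the raw data; the helper read_from_marker and the sample text are invented for illustration.

```python
import io
import pandas as pd

def read_from_marker(text, marker="valx"):
    """Return a DataFrame starting at the first line containing marker, or None."""
    lines = text.splitlines()
    # Index of the first line that mentions the marker (substring match).
    start = next((i for i, line in enumerate(lines) if marker in line), None)
    if start is None:
        return None  # marker absent: the file is not usable
    # skiprows=start drops the summary lines; the marker line becomes the header.
    return pd.read_csv(io.StringIO(text), skiprows=start)

sample = "summary,1\ntotal,2\nvalx,valy\n10,20\n30,40\n"
df = read_from_marker(sample)
print(df)
```

For a file on disk, the same index could be found by looping over the open file and then calling pd.read_csv(path, skiprows=start).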
I created a CSV file in my Python code and want to append more data to it, but I get this error:
io.UnsupportedOperation: not readable
The code I tried is:
df.to_csv('timepass.csv', index=False)
with open(r'timepass.csv', 'a') as f:
    writer = csv.reader(f)
    your_list = list(writer)
print(your_list)
I want to append the next batch of data and store it in the same CSV file, so that the file contains both the previous and the current data.
Please help me figure this out.
Thanks in advance.
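It is simple; try this:
import pandas as pd
df = pd.read_excel("NSTT.xlsx", sheet_name="Sheet1")  # read the Excel sheet
print(df)  # print the data frame
df.to_excel("new.xlsx")  # write the data frame to a new Excel file
Note that to_excel has no append flag: its second argument is the sheet name, so df.to_excel("new.xlsx", "a") does not append anything. To add the data to an existing workbook, open it with pd.ExcelWriter in append mode (this needs the openpyxl engine; the sheet name "Sheet2" is only a placeholder):
with pd.ExcelWriter("new.xlsx", mode="a", engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Sheet2")
There is no need to copy the data into a list: a data frame can be indexed directly, much like a list, as long as you specify the location.
Please check this.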
You can use pandas in Python to read and write CSV files:
import pandas as pd
df = pd.read_csv("csv file")
print(df)
Try:
with open(r'timepass.csv', 'r') as f:
    reader = list(csv.reader(f))
print(reader)
Here the file is opened in mode 'r', which means read-only, and the rows are collected into a list with list(csv.reader(f)). Your earlier code used mode 'a', which opens the file for appending only. The documentation describes it as:
'a' opens the file for appending; any data written to the file is
automatically added to the end
so a file opened this way does not support read().
If you want to append data to the CSV file from a different list, open the file with mode 'a' and write the row with a csv.writer:
with open('lake.csv', 'a', newline='') as f:
    csv.writer(f).writerow([1, 2, 3])  # dummy list [1, 2, 3]
Or append directly with the pandas.DataFrame.to_csv method of your new dataframe, with header=False so that the headers are not written a second time:
df.to_csv('timepass.csv', index=False)
df_new.to_csv(r'timepass.csv', mode='a', header=False, index=False)  # once you have updated your dataframe, you can append it directly to the same csv file
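To see the three file modes side by side, here is a small self-contained sketch using only the standard library. It writes to a temporary directory, and the rows are made-up sample values.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "timepass.csv")

# Create the file with a header and one row ('w' truncates any existing file).
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["a", "b"])
    writer.writerow([1, 2])

# Append a further row; mode 'a' is write-only, so nothing is read here.
with open(path, "a", newline="") as f:
    csv.writer(f).writerow([3, 4])

# Reopen in read mode to see both the earlier and the appended rows.
with open(path, "r", newline="") as f:
    rows = list(csv.reader(f))

print(rows)
```

The csv module reads every value back as a string, so numeric columns need converting if they are used for arithmetic afterwards.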
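You can use pandas to combine two CSV files quickly. DataFrame.append was removed in pandas 2.0, so use pd.concat:
import pandas as pd
dataframe1 = pd.read_csv("a.csv")
dataframe2 = pd.read_csv("b.csv")
dataframe1 = pd.concat([dataframe1, dataframe2], ignore_index=True)  # ignore_index renumbers the rows from 0
dataframe1.to_csv("a.csv", index=False)  # index=False avoids writing an extra index column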
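A quick way to check the combined result without touching any files is to build two small frames in memory. The column names and values below are invented for the example.

```python
import pandas as pd

# Hypothetical stand-ins for the contents of a.csv and b.csv.
dataframe1 = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
dataframe2 = pd.DataFrame({"id": [3], "name": ["c"]})

# Stack the rows and renumber the index from 0.
combined = pd.concat([dataframe1, dataframe2], ignore_index=True)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = combined.to_csv(index=False)
print(csv_text)
```

Both frames need the same column names for the rows to line up; columns missing from one frame are filled with NaN.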