Read a CSV file using a string from another DataFrame (pandas, Python)

Is it possible to read a CSV file using a string taken from another DataFrame?
Normally, to read a CSV file, I'd use code like this:
df = pd.read_csv("C:/Users/Desktop/file_name.csv")
However, I'd like to automate this by taking the file name from another DataFrame:
df1_string = df1.iloc[0]['file_name']
df2 = pd.read_csv("C:/Users/Desktop/df1_string.csv")
I got a FileNotFoundError when I tried the code above:
FileNotFoundError: [Errno 2] File b'C:/Users/Desktop/df1_string,csv' does not exist
Please advise. Many thanks.

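The path in your code contains the literal text df1_string, not the value of the variable. Use Python string formatting (an f-string) to insert the value:
df1_string = df1.iloc[0]['file_name']
df2 = pd.read_csv(f"C:/Users/Desktop/{df1_string}.csv")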
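If you need to do this for several rows, it may be cleaner to build the paths with pathlib. The following is only a minimal sketch: the helper name csv_path_for and the sample file names are made up for illustration, and the list stands in for values taken from df1['file_name'].

```python
from pathlib import Path


def csv_path_for(name, base_dir="C:/Users/Desktop"):
    """Return base_dir/<name>.csv, ignoring stray whitespace around the name."""
    return Path(base_dir) / f"{str(name).strip()}.csv"


# Hypothetical values, standing in for df1["file_name"].tolist()
file_names = ["sales_2020", " sales_2021 "]

paths = [csv_path_for(name) for name in file_names]
print([p.as_posix() for p in paths])
```

Each resulting path can then be passed to pd.read_csv in place of the hard-coded string.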

Related

Python - create CSV file

I am using the code below to create a file using Python. I don't get any error message when I run it, but no file gets created:
df_csv = pd.read_csv (r'X:\Google Drive\Personal_encrypted\Training\Ex_Files_Python_Excel\Exercise Files\names.csv', header=None)
df_csv.to_csv = (r"C:\temp\modified_names.csv")

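You are assigning a string to df_csv.to_csv, which replaces the method on that object instead of calling it, so nothing is written. The parentheses around the string do not turn the line into a call.
Solution: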
df_csv.to_csv(r"C:\temp\modified_names.csv")
DataFrame.to_csv documentation here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
Edit: I also noticed the title says "Create Excel File"
To do that you would do the following:
df_csv.to_excel(r"C:\temp\modified_names.xlsx")
Documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
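To check that calling the method really produces a file, something like the sketch below can help. The two-row DataFrame and the temporary output folder are made up for illustration; they stand in for names.csv and C:\temp.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up data standing in for the contents of names.csv
df_csv = pd.DataFrame([["Ann", 31], ["Bob", 27]])

# Temporary folder used here instead of C:\temp
out_path = Path(tempfile.mkdtemp()) / "modified_names.csv"

# Calling to_csv(...) writes the file; assigning to df_csv.to_csv would not.
df_csv.to_csv(out_path, header=False, index=False)

print(out_path.exists())
written = out_path.read_text()
print(written)
```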

I usually create the .csv file like this:
import csv
with open(FILENAME, 'w', newline='') as file:
    csv_write = csv.writer(file, delimiter='\t')
    csv_write.writerow(LINE)
LINE is a list holding the values of the row you want to write.
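A slightly fuller sketch of the same csv.writer approach, writing several rows at once and reading them back to verify. The rows and the temporary file name are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical rows; the first one acts as a header
rows = [["name", "age"], ["Ann", 31], ["Bob", 27]]
filename = Path(tempfile.mkdtemp()) / "people.tsv"

# newline="" lets the csv module control the line endings itself
with open(filename, "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerows(rows)

# Read the file back; every value comes back as a string
with open(filename, newline="") as f:
    read_back = list(csv.reader(f, delimiter="\t"))

print(read_back)
```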

Load data into pandas dataframe from zip file

I am trying to load data into a pandas DataFrame from a zip file downloaded through a REST API. I can load the file into a DataFrame if I know its name, using the following code:
z=zipfile.ZipFile(io.BytesIO(request_download.content))
df=pd.read_csv(z.open('Test1-RB.csv'))
print(df)
Is there a way that I can load the data into the dataframe without specifying the filename?
I am trying to do something like this:
z=zipfile.ZipFile(io.BytesIO(request_download.content))
df=pd.read_csv(z,compression="zip")
print(df)
but I get the following error when I try that:
Invalid file path or buffer object type: <class 'zipfile.ZipFile'>

I was able to get around this with the following code, which no longer needs any file name:
z=zipfile.ZipFile(io.BytesIO(request_download.content))
dfs=[]
for file in z.namelist():
    dfs.append(pd.read_csv(z.open(file),sep=';'))
df1 = pd.concat(dfs)
print(df1)
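If the archive may also contain non-CSV members, a variant that filters by extension could look like the sketch below. The in-memory archive is built here only so the example runs on its own; it stands in for request_download.content, and the member names are made up.

```python
import io
import zipfile

import pandas as pd

# Build a small in-memory archive standing in for request_download.content
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("a.csv", "id;value\n1;10\n2;20\n")
    zf.writestr("b.csv", "id;value\n3;30\n")
    zf.writestr("readme.txt", "not a table")

frames = []
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as z:
    for name in z.namelist():
        # Skip anything that is not a CSV member
        if name.lower().endswith(".csv"):
            with z.open(name) as member:
                frames.append(pd.read_csv(member, sep=";"))

# ignore_index=True gives the combined frame a fresh 0..n-1 index
combined = pd.concat(frames, ignore_index=True)
print(combined)
```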

How to read the data with special characters in xlsx file using pandas dataframe?

I want to read the xlsx file in the pandas data frame and perform some operations on the data. I am able to read the file with the command:
df = pd.read_excel('file.xlsx')
but when I am trying to perform some operation on the data, I am getting the following error:
ValueError: could not convert string to float:''disc abc r14jt mt cxp902 5 r2eu fail''
How can I resolve this problem? I already tried encoding='utf-8', but I still get the error.
I actually have an xlsx file, 'original.xlsx'. I filter some data from it and save the result as 'file.xlsx' with the command below:
original.to_excel("file.xlsx",index=False,header=['a','b','c'],engine='xlsxwriter')
Now, when I read 'file.xlsx' and perform an operation on it, I get that error. Is there an issue with the way I am saving the file or reading it?

xl_file = pd.ExcelFile(file_name)
dfs = {sheet_name: xl_file.parse(sheet_name)
       for sheet_name in xl_file.sheet_names}

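You can try the following. Note that encoding is not listed among read_excel's parameters in recent pandas releases, so there the call may raise a TypeError; if it does, drop the argument:
import pandas as pd
df = pd.read_excel('file.xlsx', encoding='latin1')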

If a float value is written as a = "3.300,144", you can convert it as follows:
a = a.replace(".", "")
a = a.replace(",", ".")
float(a)
Output:
3300.144
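The same conversion can be wrapped in a small function. This is just a sketch: the name parse_european_number is made up, and it assumes the dot is always a thousands separator and the comma is always the decimal mark.

```python
def parse_european_number(text):
    """Convert a string such as '3.300,144' to a float.

    Assumes '.' separates thousands and ',' marks the decimals.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


print(parse_european_number("3.300,144"))  # 3300.144
```

On a DataFrame column of strings you could apply it with df['col'].map(parse_european_number), where 'col' is a placeholder for your column name.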

KeyError while trying to read from Python dataframe

Hi I am new to Python and I am trying to read from a csv file using the following code
dataFrame = pd.read_csv(path, header=None)
X = dataFrame.loc[:,1:93]
y = dataFrame.loc[:,94]
print(X)
print(y)
But I get the following error
KeyError: 'the label [94] is not in the [columns]'
But when I copy the contents of the same CSV file into another file and run the code, it works. Can anyone help me with this? I cannot keep copying CSV files, as there is a huge number of them.
The CSV file read from 'path' was created using the following code:
criterion = dataFrame[93].map(lambda x: x==some_value)
with open(temp_file, 'a') as f:
    dataFrame[criterion2].to_csv(f, sep='\t', encoding='utf-8',header=False)

My problem has been solved. The file is tab-separated, so all the columns were being read as a single column with \t characters embedded. I specified the delimiter as follows (sep and delimiter are aliases, so only one of them should be passed):
dataFrame = pd.read_csv(temp_file, delimiter='\t')
The code works now.
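The effect is easy to reproduce on a tiny example. In this sketch the tab-separated text is made up and read from memory instead of from temp_file.

```python
import io

import pandas as pd

# Made-up tab-separated content standing in for the real file
raw = "1\t2\t3\n4\t5\t6\n"

# Default separator is a comma, so each line lands in a single column
wrong = pd.read_csv(io.StringIO(raw), header=None)

# Passing the tab separator splits the line into three columns
right = pd.read_csv(io.StringIO(raw), sep="\t", header=None)

print(wrong.shape)  # (2, 1)
print(right.shape)  # (2, 3)
```

With only one column present, a lookup such as dataFrame.loc[:, 94] has nothing to find, which matches the KeyError in the question.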

CParserError: Error tokenizing data

I'm having some trouble reading a csv file
import pandas as pd
df = pd.read_csv('Data_Matches_tekha.csv', skiprows=2)
I get
pandas.io.common.CParserError: Error tokenizing data. C error: Expected 1 fields in line 526, saw 5
and when I add sep=None to read_csv I get another error:
Error: line contains NULL byte
I tried adding unicode='utf-8', and I even tried the csv reader, but nothing works with this file.
The CSV file looks totally fine; I checked it and I see nothing wrong with it.
Here are the errors I get:

In your actual code, the line is:
>>> pandas.read_csv("Data_Matches_tekha.xlsx", sep=None)
You are trying to read an Excel file, and not a plain text CSV which is why things are not working.
Excel files (xlsx) are in a special binary format which cannot be read as simple text files (like CSV files).
You need to either convert the Excel file to CSV (note: if it has multiple sheets, each sheet should be converted to its own CSV file) and then read those files, or read the Excel file directly.
To read it directly, you can use read_excel, or a library like xlrd, which is designed to read the binary format of Excel files; see "Reading/parsing Excel (xls) files with Python" for more information on that.
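If you are not sure what a file really is, you can check its content instead of trusting the extension. The sketch below relies on two facts: an .xlsx file is a ZIP container, and a legacy .xls file starts with the OLE2 signature. The function name looks_like_excel and the demo files are made up, and this is only a heuristic, since any ZIP archive will also pass the check.

```python
import os
import tempfile
import zipfile

# First 8 bytes of OLE2 compound files, the container used by legacy .xls
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_excel(path):
    """Heuristic: True for ZIP containers (.xlsx) and OLE2 files (.xls)."""
    if zipfile.is_zipfile(path):
        return True
    with open(path, "rb") as f:
        return f.read(8) == OLE2_MAGIC


# Demo files: a ZIP container with a misleading .csv name, and a real CSV
tmp_dir = tempfile.mkdtemp()
misnamed = os.path.join(tmp_dir, "data.csv")
with zipfile.ZipFile(misnamed, "w") as zf:
    zf.writestr("xl/workbook.xml", "<workbook/>")

plain = os.path.join(tmp_dir, "plain.csv")
with open(plain, "w") as f:
    f.write("a,b\n1,2\n")

print(looks_like_excel(misnamed))  # True
print(looks_like_excel(plain))     # False
```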

Use read_excel instead of read_csv if the file is an Excel file:
import pandas as pd
df = pd.read_excel("Data_Matches_tekha.xlsx")

I encountered the same error when I used to_csv to write some data and then read it in another script. I found an easy solution that bypasses pandas' read function: the pickle module.
pickle is part of Python's standard library, so there is nothing to install.
First, write your data with the code below:
import pickle
with open(path, 'wb') as output:
    pickle.dump(variable_to_save, output)
Then load your data in another script using:
import pickle
with open(path, 'rb') as infile:
    data = pickle.load(infile)
Note that if you want to read the saved data with an older Python version than the one that wrote it, you can pass protocol=x to pickle.dump in the writing step, where x is a pickle protocol number that the reading version supports (for example, protocol 2 can be read by Python 2).
I hope this is of some use.
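A complete round trip might look like the sketch below. The dictionary and the temporary file are made up for illustration, and pickle.HIGHEST_PROTOCOL is just one possible choice; pick a lower protocol number if an older Python has to read the file. Keep in mind that you should only unpickle data you trust, because loading a pickle can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path

# Made-up object standing in for variable_to_save
data = {"rows": [[1, "a"], [2, "b"]]}
path = Path(tempfile.mkdtemp()) / "data.pkl"

# Write step: binary mode is required for pickle files
with open(path, "wb") as out_file:
    pickle.dump(data, out_file, protocol=pickle.HIGHEST_PROTOCOL)

# Read step, typically done in another script
with open(path, "rb") as in_file:
    restored = pickle.load(in_file)

print(restored == data)  # True
```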
