I have multiple scripts, each of which has a DataFrame. I want to export one column from each script's DataFrame into a single CSV.
I can create a CSV from my "first" script with one column:
Vorhersagen = pd.DataFrame(columns=["solar_prediction"])
Vorhersagen["solar_prediction"] = Erzeugung["erzeugung_predicted"]
Vorhersagen.to_csv(r"H:/.../Vorhersagen_2017.csv")
Now I have a CSV (called "Vorhersagen_2017") with the column "solar_prediction". But how can I add another column (from another script) to the same CSV as a second column? (The columns have the same length.)
If I understood correctly, you want to update the CSV file by running different scripts. If that is the case, I would just read the file, append the new column, and save the file again. Something like:
import pandas as pd

df = pd.read_csv('Vorhersagen_2017.csv', ...)
df2 = pd.concat([df, df1], axis=1)  # df1 is the DataFrame created by your second script
df2.to_csv(...)
Then you would have to run this iteratively in all your scripts.
However, I think it is more efficient to import all your scripts as modules in a main script and run them from there. From that main script you could easily concatenate the various columns and save them as a CSV in one go.
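A minimal sketch of that main-script approach. The module and function names (script1.get_predictions, script2.get_predictions) are assumptions; the idea is to wrap each script's DataFrame creation in a function so it can be imported:

```python
import pandas as pd

# Assumed pattern: each script exposes a function returning its column, e.g.
#   from script1 import get_predictions as get_solar
#   from script2 import get_predictions as get_wind
# Stand-in Series for illustration:
solar = pd.Series([1.0, 2.0, 3.0], name="solar_prediction")
wind = pd.Series([0.5, 0.7, 0.9], name="wind_prediction")

# Concatenate the columns side by side and write the csv once
Vorhersagen = pd.concat([solar, wind], axis=1)
Vorhersagen.to_csv("Vorhersagen_2017.csv")
```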
So basically I have multiple Excel files (with different names) in a folder, and I want to copy the same cell (for example B3) from all of them, create a column in a new Excel file, and put all the values there.
The file above is what I want to import (multiple files like that). I want to copy the names and emails and save them to a new file like the one below.
So you want to read multiple files, get a specific cell from each, then create a new data frame and save it as a new Excel file:
import glob
import pandas as pd

cells = []
for f in glob.glob("*.xlsx"):
    data = pd.read_excel(f, 'Sheet1')
    cells.append(data.iloc[3, 5])
pd.Series(cells).to_excel('file.xlsx')
In my particular example I took cell F4 (row=3, col=5). You can obviously take any other cell you like, or even more than one cell per file, saving each to a different list and combining the lists at the end. You could also have more complex logic, for example checking one cell to decide which other cell to look at next.
The key point is that you want to iterate through a bunch of files and for each of them:
read the file
extract whatever data you are interested in
set this data aside somewhere
Once you've gone through all the files combine all the data in any way that you like and then save it to disk in a format of your choice.
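For the names-and-emails case mentioned above, a hedged sketch of that loop. The cell positions (B3 for the name, B4 for the email) and the output file name are assumptions; adjust the iloc indices to your actual layout:

```python
import glob
import pandas as pd

names, emails = [], []
for f in glob.glob("*.xlsx"):
    data = pd.read_excel(f, "Sheet1", header=None)
    names.append(data.iloc[2, 1])   # cell B3 (row index 2, column index 1)
    emails.append(data.iloc[3, 1])  # cell B4

# Combine both lists into one frame and write it out once
combined = pd.DataFrame({"Name": names, "Email": emails})
if not combined.empty:
    combined.to_excel("combined.xlsx", index=False)
```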
I'm running a python script to automate some of my day-to-day tasks at work. One task I'm trying to do is simply add a row to an existing ods sheet that I usually open via LibreOffice.
This file has multiple sheets and depending on what my script is doing, it will add data to different sheets.
The thing is, I'm having trouble finding a simple and easy way to just add some data to the first unpopulated row of the sheet.
Reading about odslib3, pyexcel and other packages, it seems that to write a row I need to specify the exact row number and column, and opening the ods file just to see which cell to write and then telling the python script seems unproductive.
Is there a way to easily add a row of data to an ods sheet without specifying the row number and column?
If I understand the question, I believe that using .remove() and .append() will do the trick. It will create and populate data on the last row (though I can't say it's the most efficient approach).
For example, if:

from pyexcel_ods3 import get_data, save_data

data = get_data("info.ods")
print(data["Sheet1"])
# [['first_row', 'first_row'], []]

if [] in data["Sheet1"]:
    data["Sheet1"].remove([])  # remove the unpopulated row
data["Sheet1"].append(["second_row", "second_row"])  # add the new row
save_data("info.ods", data)  # write the change back to disk

print(data["Sheet1"])
# [['first_row', 'first_row'], ['second_row', 'second_row']]
I have a problem with saving a pandas DataFrame to csv. I run the code in a jupyter notebook and everything works fine. After running the same code on a server, column values are saved to random columns…
csvPath = r''+str(pathlib.Path().absolute())+ '/products/'+brand['file_name']+'_products.csv'
productFrame.to_csv(csvPath,index=True)
I've printed the DataFrame before saving and it looks as it should. After saving, I open the file and the values are mixed up.
How to make it always work in the proper way?
If you want to force the column order when exporting to csv, use
df[cols].to_csv()
where cols is a list of column names in the desired order.
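A minimal example (with placeholder column names) showing how the explicit column list fixes the order in the written file:

```python
import pandas as pd

# Columns deliberately given out of order; substitute your own names
df = pd.DataFrame({"b": [2], "a": [1], "c": [3]})
cols = ["a", "b", "c"]  # desired column order
df[cols].to_csv("ordered.csv", index=False)
```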
I have a CSV file that has a table with information that I'd like to reference in another table. To give you a better perspective, I have the following example:
"ID","Name","Flavor"
"45fc754d-6a9b-4bde-b7ad-be91ae60f582","account1-test1","m1.medium"
"83dbc739-e436-4c9f-a561-c5b40a3a6da5","account3-test2","m1.tiny"
"ef68fcf3-f624-416d-a59b-bb8f1aa2a769","account1-test3","m1.medium"
I would like to add more columns that reference the Name column, pulling the customer name into one column and the rest of the info into another, for example:
"ID","Name","Flavor","Customer","Misc"
"45fc754d-6a9b-4bde-b7ad-be91ae60f582","account1-test1","m1.medium","account1","test1"
"83dbc739-e436-4c9f-a561-c5b40a3a6da5","account3-test2","m1.tiny","account3","test2"
"ef68fcf3-f624-416d-a59b-bb8f1aa2a769","account1-test3","m1.medium","account1","test3"
The task here is to have a python script that opens the original CSV file and creates a new CSV file with the added columns. Any ideas? I've been having trouble parsing the Name column successfully.
import pandas as pd

data = pd.read_csv('your_file.csv')
# Split e.g. "account1-test1" on the hyphen into two new columns
data[['Customer', 'Misc']] = data.Name.str.split("-", expand=True)
Now you can save it to a csv file again with:
data.to_csv('another_file.csv')
Have you tried opening your csv file as a pandas DataFrame? This can be done with:
df = pd.read_csv('input_data.csv')
If the Customer and Misc columns are part of another csv file, you can load it with the same method as above (naming it df2) and then append the column with the following:
df['Customer'] = df2['Customer']
You can then output the DataFrame as a csv file with the following:
df.to_csv('output_data_name.csv')
I've created a csv file with the column names and saved it using the pandas library. This file will be used as a historic record, where rows will be added one by one at different moments. What I'm doing to add rows to this previously created csv is to transform the record into a DataFrame and then, using to_csv() with mode='a' as a parameter, append the record to the existing file. The problem is that I would like to see an index automatically generated in the file every time I add a new row. I already know that when I import this file as a DataFrame an index is generated automatically, but that is only within the IDE... when I open the csv with Excel, for example, the file doesn't have an index.
While writing your files to csv, you can set index=True in the to_csv method. This ensures that the index of your DataFrame is written explicitly to the csv file.
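One caveat: index=True (the default) writes each chunk's own 0-based index, so rows appended with mode='a' would all start counting from 0 again. A hedged sketch that continues the index across appends; the file name, column name, and append_row helper are examples, not part of the original code:

```python
import pandas as pd

path = "historic_record.csv"

# Create the file with headers once (index=True writes the index column)
pd.DataFrame(columns=["value"]).to_csv(path, index=True)

def append_row(value):
    # Count existing data rows so the new index continues where it left off
    n = len(pd.read_csv(path, index_col=0))
    row = pd.DataFrame({"value": [value]}, index=[n])
    row.to_csv(path, mode="a", header=False, index=True)

append_row(10)
append_row(20)
```

Opening the file afterwards shows a continuous index column (0, 1, …) next to the values, visible in Excel as well.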