Is there a way to open a .csv file right after using DataFrame.to_csv?
Currently, I am using os.startfile to open a .csv file in a folder (search for a .csv file and open it) - but I want to open the specific .csv I just created using df.to_csv.
Here is my current code using os.startfile:
import os

dirName3 = r"\\xx\xx\SourceFolder"
# Take the first .csv file that os.listdir happens to return
fn2 = [f2 for f2 in os.listdir(dirName3)
       if f2.endswith('.csv') and os.path.isfile(os.path.join(dirName3, f2))][0]
path3 = os.path.join(dirName3, fn2)
os.startfile(path3)
The above code opens the .csv file I've created, but only if it happens to be the first .csv in the folder listing. If there are other .csv files in the folder, it may open a different file.
I also can't hard-code the path, because the .csv name (passed to df.to_csv) changes day to day based on user input. Searching by date alone won't work either, because there may be multiple files from the same day in the folder.
Any help appreciated.
Answering my own question after discussion with others in comments above.
Came up with this to solve the problem:
import os
from datetime import datetime

dirName3 = r"\\xx\xx\Source Folder"
# Qname1 is the user input that was also used to name the file in to_csv
suffix = datetime.now().strftime('%d_%m_%y_') + Qname1 + '.csv'
fn2 = [f2 for f2 in os.listdir(dirName3)
       if f2.endswith(suffix) and os.path.isfile(os.path.join(dirName3, f2))][0]
path3 = os.path.join(dirName3, fn2)
os.startfile(path3)
With f2.endswith, instead of matching only '.csv' as in my original code above, I match the same date stamp and name that I used to write the csv (with to_csv, which isn't shown here). This only works because the file names include a date stamp: Qname1 (user input) can be similar on different days, so I need the date to tell the files apart.
Cheers stackoverflow.
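A simpler route may be to build the file path once and reuse it for both writing and opening, so no folder search is needed. The sketch below assumes the same '%d_%m_%y_' + Qname1 + '.csv' naming as the answer above; build_csv_path and save_and_open are hypothetical helper names, and df is assumed to be a pandas DataFrame (anything with a to_csv(path) method works).

```python
import os
from datetime import datetime


def build_csv_path(folder, qname, now=None):
    """Return <folder>/<dd_mm_yy_><qname>.csv for the given (or current) time."""
    now = now or datetime.now()
    return os.path.join(folder, now.strftime('%d_%m_%y_') + qname + '.csv')


def save_and_open(df, folder, qname, open_file=True):
    """Write df to a dated CSV and open that exact file (hypothetical helper)."""
    path = build_csv_path(folder, qname)
    df.to_csv(path)  # df is assumed to be a pandas DataFrame
    # os.startfile exists only on Windows, so guard the call.
    if open_file and hasattr(os, 'startfile'):
        os.startfile(path)
    return path
```

With the variables from the question, this would be called as path3 = save_and_open(df, dirName3, Qname1).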
Related
I have a number of csv files (6) in a Linux folder that I need to rename and relocate to a new folder on the same server.
The files are named <entity_name>_yyyymmdd_hhmmss.csv - bearing in mind that <entity_name> is a string that varies from file to file.
I need to keep the original <entity_name> but replace yyyymmdd_hhmmss with today's date in the format yyyymmdd, so that we end up with <entity_name>_yyyymmdd.csv
If this could be done using Python, thanks.
Being new to Python, I found the Internet awash with ideas; some were close, but none helped me achieve what I am after.
I have successfully managed to loop through the folder and read each file name, but am stuck renaming the files.
It's just straight-line coding. For each file, extract the base, add the date and extension, and rename.
import os
import glob
import datetime

today = datetime.date.today().strftime('%Y%m%d')
newpath = "wherever/we/go/"
for name in glob.glob('*.csv'):
    # Drop the trailing _yyyymmdd_hhmmss.csv; rsplit keeps underscores inside the entity name
    base = name.rsplit('_', 2)[0]
    newname = f'{base}_{today}.csv'
    os.rename(name, newpath + newname)
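As a variation on the loop above, here is a sketch that separates the name computation from the move, which makes it easier to check before touching real files. The function names dated_name and move_and_rename are made up for illustration, and shutil.move is swapped in for os.rename so that the target folder may sit on a different filesystem.

```python
import shutil
from datetime import date
from pathlib import Path


def dated_name(filename, today=None):
    """Turn '<entity>_yyyymmdd_hhmmss.csv' into '<entity>_<today as yyyymmdd>.csv'."""
    today = today or date.today()
    stem = Path(filename).stem          # drop the .csv extension
    entity = stem.rsplit('_', 2)[0]     # drop the trailing date and time parts
    return f"{entity}_{today.strftime('%Y%m%d')}.csv"


def move_and_rename(src_dir, dst_dir, today=None):
    """Move every CSV in src_dir to dst_dir under its new dated name."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(Path(src_dir).glob('*.csv')):
        target = dst / dated_name(src.name, today)
        shutil.move(str(src), str(target))
        moved.append(target.name)
    return moved
```

Note that two source files with the same entity name would map to the same target name, so the later one would overwrite the earlier one; with one file per entity that does not arise.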
I have .pdf files in a folder and an .xls file with two columns. The first column holds the filename without the .pdf extension and the second column holds a value.
I need to open the .xls file, match the values in the first column against the filenames in the folder, and rename each .pdf file to the value in the second column.
Is it possible?
Thank you for your support
Angelo
You'll want to use the pandas library within python. It has a function called pandas.read_excel that is very useful for reading excel files. This will return a dataframe, which will allow you to use iloc or other methods of accessing the values in the first and second columns. From there, I'd recommend using os.rename(old_name, new_name), where old_name and new_name are the paths to where your .pdf files are kept. A full example of the renaming part looks like this:
import os
# Absolute path of a file
old_name = r"E:\demos\files\reports\details.txt"
new_name = r"E:\demos\files\reports\new_details.txt"
# Renaming the file
os.rename(old_name, new_name)
I've purposely left out a full explanation because you simply asked if it is possible to achieve your task, so hopefully this points you in the right direction! I'd recommend asking questions with specific reproducible code in the future, in accordance with stackoverflow guidelines.
I would encourage you to do this with a .csv file instead of an .xls, as it is a much simpler format (no borders, colors, or other formatting to deal with).
You can use the os.listdir() function to list all files and folders in a directory; check the docs for the built-in os library. Then take the name of each file, remove the .pdf, read your .csv file with the names and values, and rename the file.
All the utilities needed are built into Python. Most are in the os module; the rest come from the csv module and ordinary file opening:
with open(filename) as f:
    # do whatever you need with the file here
    # you may need to pass a mode (e.g. 'r' or 'w') to open
    pass
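The steps described above (list the folder, strip the .pdf, look the name up in the .csv, rename) could be sketched roughly as follows. This assumes the spreadsheet has been exported to a two-column .csv without quoting surprises; rename_pdfs is a hypothetical function name, not something from the original answers.

```python
import csv
import os


def rename_pdfs(folder, mapping_csv):
    """Rename '<old>.pdf' to '<new>.pdf' for every 'old,new' row in mapping_csv."""
    with open(mapping_csv, newline='') as f:
        # Column 1: current name without .pdf, column 2: new name.
        mapping = {row[0].strip(): row[1].strip()
                   for row in csv.reader(f) if len(row) >= 2}

    renamed = []
    for entry in os.listdir(folder):
        base, ext = os.path.splitext(entry)
        if ext.lower() == '.pdf' and base in mapping:
            new_name = mapping[base] + ext
            os.rename(os.path.join(folder, entry),
                      os.path.join(folder, new_name))
            renamed.append((entry, new_name))
    return renamed
```

Files with no matching row are left untouched, and the returned list of (old, new) pairs can be printed to review what was changed.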
I have an Excel source file in a source folder (*.xlsm) and another file (also *.xlsm) that contain some data. I have to create a third file, that has to be a *.xls file, that is basically the Excel source file that contains some data of the second file. In order to do that I have written this code:
from openpyxl import load_workbook
file1 = "C:\\Users\Desktop\file1.xlsm"
file2 = "C:\\Users\Desktop\file2.xlsm"
file3 = "C:\\Users\Desktop\file3.xls"
wb1 = load_workbook(file1)
sheet1 = wb1["Sheet1"]
wb2 = load_workbook(file2)
sheet2 = wb2["Sheet1"]
sheet1["A1"].value = sheet2["A1"].value
wb1.save(file3)
The code seems to be OK and doesn't return any error, but I cannot open the created file3.
I don't understand why. I tried changing the extension of the third file, but both *.xlsx and *.xlsm show the same problem. I also tried deleting the line
sheet1["A1"].value = sheet2["A1"].value
to see whether the problem was linked to writing to the sheet, but the problem remains.
First of all, please note that your code is not creating a new file from scratch; it is just re-saving an existing one under a new name.
It is also not clear what you want: do you want to create file3? With what information? Your code does not do that.
However, I tried to run a short version of your code and got this error:
openpyxl.utils.exceptions.InvalidFileException: openpyxl does not
support .xlsm' file format, please check you can open it with Excel
first. Supported formats are: .xlsx,.xlsm,.xltx,.xltm
Most likely your file format is unsupported. Try re-saving your files in the xlsx format. I think the problem is the macros: if your files don't contain any, changing the format should not be an issue. If they do, I am not sure openpyxl will work that way (without a workaround, at least).
This answer might help. It proposes extracting the xlsm files (they are zip files), working on the parts that represent the sheet contents (not the macros), and then putting everything back together.
One error might be that the backslashes (\) in the file path strings need to be escaped.
Thus, the correct version would be:
file1 = "C:\\Users\\Desktop\\file1.xlsm"
file2 = "C:\\Users\\Desktop\\file2.xlsm"
file3 = "C:\\Users\\Desktop\\file3.xls"
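To see why the single backslashes matter, here is a small illustration using only string literals (nothing is opened or saved). In a normal Python string, "\f" is the form-feed escape, so a path segment like \file1 does not survive intact; a raw string is an alternative to doubling every backslash. The variable names are just for this demonstration.

```python
# "\f" is the form-feed escape, so this path silently loses the "f" of "file1".
broken = "C:\\Users\\Desktop\file1.xlsm"

# Two equivalent ways to keep every backslash literal.
escaped = "C:\\Users\\Desktop\\file1.xlsm"
raw = r"C:\Users\Desktop\file1.xlsm"

print(repr(broken))    # the repr shows '\x0c' where '\f' was written
print(escaped == raw)  # both spellings produce the same string
```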
I am working on saving text to different files. So far I have created several files, and each text file has some text/paragraphs in it. Now I just want to save these files to a directory. I have already created my own directory, but it is currently empty. I want to save these text files into that directory.
The partial code is below:
for doc in root:
    docID = doc.find('DOCID').text.strip()
    text = doc.find('TEXT').text.strip()
    f = open("%s" % docID, 'w')
    f.write(str(text))
    f.close()
So I have created all the files with text in them, and I also have an empty folder/directory. I just don't know how to put these files into the directory.
I would appreciate any help.
========================================================================
[Solved] Thank you all for your help! I figured it out and am editing my summary here. I had a couple of problems.
1. My docID was saved as a tuple. I needed to convert it to a string without any extra symbols. Here is the reference I used: https://stackoverflow.com/a/17426417/9387211
2. I just created a new path and wrote the text to it. I used this method: https://stackoverflow.com/a/8024254/9387211
Now I can share my updated code, and there are no more problems. Thanks again, everyone!
import os

for doc in root:
    docID = doc.find('DOCID').text.strip()
    did = ''.join(map(str, docID))  # make sure the ID is a plain string
    text = doc.find('TEXT').text.strip()  # .strip(), not ,strip()
    # Build the full path inside the destination folder
    filename = os.path.join(dst_folder_path, did)
    with open(filename, 'w') as f:
        f.write(str(text))
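For a version that can be run end to end, the sketch below wraps the same idea in a function. It assumes the documents come from XML parsed with xml.etree.ElementTree and that each child of the root has DOCID and TEXT elements, as the loop above suggests; the parser choice and the write_docs name are assumptions, since the original does not show how root is built.

```python
import os
import xml.etree.ElementTree as ET


def write_docs(xml_text, dst_folder_path):
    """Write each document's TEXT into dst_folder_path, one file per DOCID."""
    os.makedirs(dst_folder_path, exist_ok=True)  # create the folder if needed
    root = ET.fromstring(xml_text)
    written = []
    for doc in root:
        doc_id = doc.find('DOCID').text.strip()
        text = doc.find('TEXT').text.strip()
        filename = os.path.join(dst_folder_path, doc_id)
        # The with-block closes the file even if the write fails.
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(text)
        written.append(filename)
    return written
```

Writing straight into the destination folder avoids the extra step of creating the files elsewhere and moving them afterwards.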
Suppose you have all the text files in your home directory (~/) and you want to move them to the /path/to/folder folder.
import os
from shutil import copyfile

docid_list = ['docid-1', 'docid-2']
dst_folder = '/path/to/folder'
for did in docid_list:
    # copyfile needs the full destination file path, not just the folder
    copyfile(did, os.path.join(dst_folder, did))
    os.remove(did)
This copies the docid files into /path/to/folder and removes them from the home directory (assuming you run it from the home directory).
You can build the file path for open like
doc_file = open(<file path>, 'w')
I'm trying to convert all the PDFs stored in one folder, say 60 PDFs, into text documents and store them in different folders. The folders should have unique names.
I tried this code. The folders were created, but the pdftotext conversion command doesn't work in the loop:
import os
def listfiles(path):
    for root, dirs, files in os.walk(path):
        for f in files:
            print(f)
            newpath = r'/home/user/files/'
            p=f.replace("pdf","")
            newpath=newpath+p
            if not os.path.exists(newpath): os.makedirs(newpath)
            os.system("pdftotext f f.txt")
f=listfiles("/home/user/reports")
One problem here is the os.system("pdftotext f f.txt") call. I assume you want the f's here replaced with the current file in the loop. If that is the case you need to change this to os.system("pdftotext {0} {0}.txt".format(f))
Another issue may be that the working directory is not being set up so the call to system is looking for the file in the wrong place. Try using os.chdir every time you change folders.
To place the text file in a different folder, try:
os.system("pdftotext {0} {1}/{0}.txt".format(f, newpath))
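Putting those suggestions together, one possible shape for the loop is sketched below. It swaps os.system for subprocess.run with an argument list, so file names containing spaces are passed through intact, and it joins each file name with root because os.walk yields bare names. It assumes the pdftotext command-line tool is installed and on PATH; pdftotext_command and convert_all are hypothetical names.

```python
import os
import subprocess


def pdftotext_command(pdf_path, out_root):
    """Build the pdftotext call that writes <out_root>/<name>/<name>.txt."""
    name = os.path.splitext(os.path.basename(pdf_path))[0]
    out_dir = os.path.join(out_root, name)
    return out_dir, ['pdftotext', pdf_path, os.path.join(out_dir, name + '.txt')]


def convert_all(src_root, out_root):
    """Convert every PDF under src_root, one output folder per PDF."""
    for root, dirs, files in os.walk(src_root):
        for f in files:
            if not f.lower().endswith('.pdf'):
                continue
            # os.walk yields bare file names, so join them with root.
            out_dir, cmd = pdftotext_command(os.path.join(root, f), out_root)
            os.makedirs(out_dir, exist_ok=True)
            subprocess.run(cmd, check=True)  # requires pdftotext on PATH
```

With the paths from the question, this would be called as convert_all("/home/user/reports", "/home/user/files").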
I don't know Python, but I think I can clearly see a mistake there. It looks like you are just replacing the ".pdf" with a ".txt". Since a PDF isn't just plain text, this won't work.
For the conversion, look at the top answer of this post:
Python module for converting PDF to text