I have six csv files in a Linux folder that I need to rename and move to a new folder on the same server.
The files are named <entity_name>_yyyymmdd_hhmmss.csv, where <entity_name> is a string that varies from file to file.
I need to keep the original <entity_name> but replace yyyymmdd_hhmmss with today's date in the format yyyymmdd, so that we end up with <entity_name>_yyyymmdd.csv.
Ideally this would be done in Python.
I am new to Python. The Internet is awash with ideas; some were close, but none helped me achieve what I am after.
I have successfully managed to loop through the folder I need and read each file name, but am stuck renaming the files.
It's just straight-line coding. For each file, extract the base, add the date and extension, and rename.
import os
import glob
import datetime
today = datetime.date.today().strftime('%Y%m%d')
newpath = "wherever/we/go/"
for name in glob.glob('*.csv'):
    # Drop the trailing _yyyymmdd_hhmmss; rsplit keeps any underscores inside the entity name
    base = name.rsplit('_', 2)[0]
    newname = f'{base}_{today}.csv'
    os.rename(name, newpath + newname)
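If the folder might also hold csv files that do not follow the <entity_name>_yyyymmdd_hhmmss.csv pattern, a regular expression can guard the rename. This is only a sketch: the helper name dated_name and the sample file name are made up for illustration, and it assumes the timestamp is always 8 digits, an underscore, and 6 digits.

```python
import re
from datetime import date

# Hypothetical pattern for <entity_name>_yyyymmdd_hhmmss.csv (assumed layout)
STAMPED = re.compile(r'^(?P<entity>.+)_\d{8}_\d{6}\.csv$')

def dated_name(filename, today):
    """Return <entity_name>_yyyymmdd.csv, or None if the name does not match."""
    match = STAMPED.match(filename)
    if match is None:
        return None
    return f"{match.group('entity')}_{today:%Y%m%d}.csv"

# Illustrative file name; the entity name itself contains an underscore
print(dated_name('sales_orders_20220105_093000.csv', date(2022, 1, 12)))
```

Files for which dated_name() returns None can simply be skipped in the loop.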
I am trying to run this script so that it creates a .txt file based on the date.
#Importing libraries
import os
import sys
#Selecting time and methods
from time import gmtime, strftime
from datetime import datetime
#Formatting the date
date = datetime.now().strftime("%Y_%m_%d-%I:%M_%p")
#Creating the name of the file, plus the date we created above
newFile = ("Summary_" + date) + '.txt'
#Creating the filepath where we want to save this file
filepath = os.path.join('C:\\Users\\myUser\\Desktop\\myFolder', newFile)
#If the path doesn't exist
if not os.path.exists('C:\\Users\\myUser\\Desktop\\myFolder'):
    os.makedirs('C:\\Users\\myUser\\Desktop\\myFolder')
f = open(filepath, "w")
Everything works, however, the file gets created as a File and not a .txt file.
I am not sure if I am not concatenating the strings correctly, or if there is an additional step I need to take.
I noticed that if I type anything after the date string, it doesn't get added to the end of the file.
If I type the string manually like this, then it works just fine:
filepath = os.path.join('C:\\Users\\myUser\\Desktop\\myFolder', "Summary_DateGoesHere.txt")
Am I missing something in my script?
I appreciate the assistance.
The issue is that Windows does not allow a colon (:) in a file name.
Here, your file name is "Summary_2022_05_05-11:24_AM.txt". Because the colon is not allowed, the name is truncated after 11, and the file is created as just Summary_2022_05_05-11 (on NTFS, the text after the colon is treated as the name of an alternate data stream rather than as part of the file name).
To fix this, use an underscore or hyphen before the minutes field, like this:
date = datetime.now().strftime("%Y_%m_%d-%I_%M_%p")
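As a quick check, the corrected format can be wrapped in a small function and tested for colons before any file is created. The function name summary_filename is hypothetical, and the fixed timestamp is just sample input.

```python
from datetime import datetime

def summary_filename(now):
    """Build the Summary_<date>.txt name using a colon-free timestamp."""
    # %I_%M separates hours and minutes with an underscore instead of a colon
    stamp = now.strftime("%Y_%m_%d-%I_%M_%p")
    return "Summary_" + stamp + ".txt"

name = summary_filename(datetime(2022, 5, 5, 11, 24))
print(name)
print(':' in name)  # False, so Windows accepts the full name including .txt
```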
I'm trying to solve a problem at work. I am not a developer; I work in general IT operations and am trying to learn a bit here and there, so I may be way off with what I'm trying to do. I've just been using online resources, this site, and a little bit of the book Automate the Boring Stuff with Python. Here is my objective:
I have two files that a post processor automatically places in a folder on my computer every morning at the same time. I need to add yesterday's date to the end of the file names before I upload them to an FTP server, which I have to do each morning around the same time. I am trying to write a Python script that I can schedule to run each morning right after the files are placed in the folder, which will append yesterday's date in MMDDYYYY format. For example, if the files are called "holdings.CSV" and "transactions.CSV" when they are placed in the folder, I need to rename them to "holdings01112022.CSV" and "transactions01112022.CSV". I only want to rename the new files in the folder; the files from previous days, with the dates already appended, will remain in the folder. Again, I'm a total beginner, so my code may not make sense and there may be superfluous or redundant lines. I'd love corrections. Am I going down the right path here, or am I off altogether? Any suggestions?
import os, re
from datetime import date
from datetime import timedelta
directory = 'C:\Users\me\main folder\subfolder'
filePattern = re.compile('%s.CSV', re.VERBOSE)
for originalName in os.listdir('.'):
    mo = filePattern.search(originalName)
    if mo == None:
        continue
today = date.today()
yesterday = today - timedelta(days = 1), '%M%D%Y'
for originalName in directory:
    newName = originalName + yesterday
    os.rename(os.path.join(directory, originalName), os.path.join(directory, newName))
Any help is appreciated. Thanks.
Here's a short example on how to code your algorithm.
import pathlib
from datetime import date, timedelta
if __name__ == '__main__':
    directory = pathlib.Path('/Users/cesarv/Downloads/tmp')
    yesterday = date.today() - timedelta(days=1)
    # Only files whose stem does not already end in a digit (no date appended yet)
    for file in directory.glob('*[!0123456789].csv'):
        # with_stem() requires Python 3.9 or later; the date is MMDDYYYY as requested
        new_path = file.with_stem(file.stem + yesterday.strftime('%m%d%Y'))
        if not new_path.exists():
            print(f'Renaming {file.name} to {new_path.name}')
            file.rename(new_path)
        else:
            print(f'File {new_path.name} already exists.')
By using pathlib, you'll simplify the handling of filenames and paths and, if you run this program on Linux or macOS, it will work just the same.
Note that:
We limit the list of files to process with glob(), where the pattern *[!0123456789].csv means "all filenames that do not end with a digit before the suffix" (the [!0123456789] stands for one character that must not, because of the !, equal any of the characters in the brackets). This way only files that do not already have a date in their names are processed.
The elements yielded by the for loop, referenced with the file variable, are objects of class Path, which give us methods and properties we can work with, such as with_stem() and stem.
We create a new Path object, new_path, by calling with_stem(): its stem (the file's name without the suffix .csv) is the original file's stem (file.stem) plus yesterday's date in the format you require.
Since this new Path object has the same parent directory as the original file, we can use it to rename the original file.
As you can see, we can also use the exists() method to check that no file with the new name exists before renaming.
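To see which names the pattern *[!0123456789].csv selects without touching any files, it can be tried with fnmatch, which uses the same wildcard syntax. This is only an illustration; the file names in the list are sample values.

```python
from fnmatch import fnmatchcase

PATTERN = '*[!0123456789].csv'

# Sample names: two fresh files and one that already carries a date
names = ['holdings.csv', 'transactions.csv', 'holdings01112022.csv']

# Keep only names whose last character before .csv is not a digit
undated = [name for name in names if fnmatchcase(name, PATTERN)]
print(undated)  # ['holdings.csv', 'transactions.csv']
```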
You can also review the documentation on pathlib.
If you have any doubts, ask away!
Are you running into any issue? If all you want is to pick out the csv files, you can simplify your code by using one if statement instead of a regex. You also need to convert your date to a string before adding it to your file name (e.g. newName = originalName[:-4] + yesterday.strftime("%m%d%Y") + '.csv').
If this code doesn't solve your problem, please mention the error you are receiving to make it easier for others to help you.
Here is my suggestion to eliminate possible errors without knowing what type of error you are getting.
import os
from datetime import date
from datetime import timedelta
directory = 'C:/Users/me/main folder/subfolder'
for originalName in os.listdir(directory):
    print(f'original name is = {originalName}')
    # Compare the extension case-insensitively so .CSV and .csv both match
    if originalName[-4:].lower() == '.csv':
        today = date.today()
        yesterday = today - timedelta(days = 1)
        yesterday_file_name = originalName[:-4]+yesterday.strftime("%m%d%Y")+'.csv'
        os.rename(os.path.join(directory, originalName), os.path.join(directory, yesterday_file_name))
        #print('formatted file names are')
        #print(yesterday_file_name)
#here is the sample output
#original name is = holdings.csv
#formatted file names are
#holdings01112022.csv
#original name is = transactions.CSV
#formatted file names are
#transactions01112022.csv
I have files being put into a folder. Each day I would like to take those files and move them into a folder with that day's date as the folder name. I've been able to create the folder using the current date:
import shutil, os
import time
date = time.strftime("%Y%m%d")
parent_dir = r"C:\dfolder"
path = os.path.join(parent_dir, date)
os.mkdir(path)
This successfully created the folder. My problem is the part of the code that will find that newly created folder each day and move the files into it. I have been able to use shutil.move to move the files into the folder, but I have to specify the name of the destination. Is there a way to automate this each day? Maybe by having it put the files for that day into the most recently created folder, or something of the sort?
How about this: It makes a new folder for tomorrow and renames today's folder:
import shutil, os
import time
date = time.strftime("%Y%m%d")
parent_dir = r"C:\dfolder"
path = os.path.join(parent_dir, date)
os.rename("New folder", path)  # Renames today's folder to its proper name
os.mkdir("New folder")  # Makes the new folder for tomorrow
This is better than checking the dates of folders being made. It just does it for you for even more automation.
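Another option, sketched here under assumptions, is to rebuild the folder name from today's date every time the script runs, so the destination never has to be looked up. The function name move_into_dated_folder and both parameters are hypothetical; it assumes the incoming files sit directly in one folder.

```python
import os
import shutil
import time

def move_into_dated_folder(incoming_dir, parent_dir):
    """Move every file in incoming_dir into parent_dir/<yyyymmdd>."""
    # The destination is derived from today's date, so no search is needed
    target = os.path.join(parent_dir, time.strftime("%Y%m%d"))
    os.makedirs(target, exist_ok=True)  # no error if the folder already exists
    for name in os.listdir(incoming_dir):
        source = os.path.join(incoming_dir, name)
        if os.path.isfile(source):  # skip subfolders, including dated ones
            shutil.move(source, os.path.join(target, name))
    return target
```

Running it once a day (for example from a scheduled task) would file that day's arrivals under the matching date.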
Using os.rename works, but it is also moving my file.
import shutil, os
import time
date = time.strftime("%Y%m%d")
parent_dir = r"C:\wfolder"
os.rename (parent_dir , date)
Any thoughts?
I had a similar problem a while back. If this folder is not used for anything else, the best solution is to keep a list of the files currently in the folder. The folder can be checked with os.listdir(). Then you can compare the two lists and take the difference, which leaves you with the most recently added files. The result of os.listdir() becomes your new baseline. I will try to find the exact code I used and place it here.
Code:
old_list= os.listdir(your_path)
## Your function for creating folder here ##
new_list= os.listdir(your_path)
dl = []
for item in new_list:
    if item not in old_list:
        dl.append(item)
# dl is now the list of all new files in the directory
In my program I import a certain file, at the end I create a file in which I save the outputs. I would like to use the name of the imported file (with some additions) as the name of the newly created file.
Since I have a large amount of files to import, I would like to do this in order to have a better overview of which files were created based on which files.
For example: I open a file with file=open('summer.txt') and I would like to use the 'summer' of the original file and just add the year, so the name of the new file would be 'summer2020.txt'.
I would like to automatically detect the name of the imported file, add the year and that would be the name of my new file. (the addition to the name ('2020') would be the same for every opened file)
I'm still new to Python, so I don't know if this is possible or if there is a better way of doing it. Thank you very much for any help!
If you are using Python 3.9 or later, you can use pathlib and the new with_stem method:
from pathlib import Path
file_path = Path('summer.txt')
new_file_path = file_path.with_stem(file_path.stem + '2020')
An advantage of pathlib is that it supports multiple platforms (Windows/macOS/Linux) and their different ways of specifying paths. It also lets you easily split the file name into the stem summer and the suffix .txt.
For older versions of Python, such as 3.6.5, you can do the following instead:
new_file_path = file_path.with_name(f"{file_path.stem}2020{file_path.suffix}")
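The with_name() variant can be wrapped in a small helper so that the same addition is applied to every opened file. The helper name add_to_stem and the data folder are illustrative assumptions, not part of the original program.

```python
from pathlib import Path

def add_to_stem(path, addition):
    """Return path with addition inserted between the stem and the suffix."""
    path = Path(path)
    # with_name() also works on older Python versions that lack with_stem()
    return path.with_name(f"{path.stem}{addition}{path.suffix}")

source = Path('data') / 'summer.txt'  # hypothetical input file
print(add_to_stem(source, '2020').name)  # summer2020.txt
```

The directory part is preserved, so the new file lands next to the one it was derived from.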
Here's an example to automatically get the current year:
from datetime import datetime
curdate = datetime.now()
curyear = str(curdate.year)
file = open('summer' + curyear + '.txt', 'w')  # 'w' creates the new output file
As you have many files, you may want something like:
import glob, shutil
myfiles = glob.glob("*.txt")
for fname in myfiles:
    basename = fname[:-4] # drop .txt at the end
    newname = f"{basename}2020.txt"
    shutil.copy(fname, newname)
I am looking to pull in a csv file that is downloaded to my downloads folder into a pandas dataframe. Each time it is downloaded it adds a number to the end of the string, as the filename is already in the folder. For example, 'transactions (44).csv' is in the folder, the next time this file is downloaded it is named 'transactions (45).csv'.
I've looked into the glob library and the os library to open the most recent file in my downloads folder, but I was unable to produce a solution. I think I need some way to connect to the downloads path, find all csv files with the string 'transactions' in the name, and grab the one with the max number in the full filename string.
list(csv.reader(open(path + '/transactions (45).csv')))
I'm hoping for something like path + '/%transactions%' + 'max()' + '.csv'. I know the final answer will be completely different, but I hope this makes sense.
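Assuming the format "transactions (number).csv", try the following:
import os
import numpy as np
import pandas as pd
files=os.listdir('Downloads/')
# Keep only the numbered downloads so the split on '(' below always succeeds
tranfiles=[f for f in files if 'transactions (' in f]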
Now, your target file is as below:
target_file=tranfiles[np.argmax([int(t.split('(')[1].split(')')[0]) for t in tranfiles])]
Read that desired file as below:
df=pd.read_csv('Downloads/'+target_file)
One option is to use regular expressions to extract the numerically largest file ID and then construct a new file name:
import re
import glob
# Match only the numbered downloads, then take the largest number
last_id = max(int(re.findall(r" \(([0-9]+)\)\.csv", x)[0])
              for x in glob.glob("transactions (*).csv"))
name = f'transactions ({last_id}).csv'
Alternatively, find the newest file directly by its modification time, for example with os.path.getmtime.
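A minimal sketch of the modification-time approach, assuming the most recently downloaded file is also the most recently modified one. The function name newest_file is made up for illustration.

```python
import glob
import os

def newest_file(pattern):
    """Return the matching path with the latest modification time, or None."""
    matches = glob.glob(pattern)
    if not matches:
        return None
    # os.path.getmtime gives the last-modified timestamp used for comparison
    return max(matches, key=os.path.getmtime)
```

It could then be called as newest_file('Downloads/transactions*.csv') and the result passed to pd.read_csv().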
Note that you should not use a CSV reader to read CSV files in Pandas. Use pd.read_csv() instead.