I'm trying to solve a problem at work. I am not a developer but work in general IT operations and am trying to learn a bit here and there, so I may be way off with what I'm trying to do here. I've just been using online resources, this site, and a little of the book Automate the Boring Stuff with Python. Here is my objective:
I have two files that are automatically placed in a folder on my computer every morning at the same time by a post processor, and I need to add yesterday's date to the end of the file names before I upload them to an FTP server, which I have to do each morning around the same time. I am trying to write a Python script that I can somehow schedule to run each morning right after the files are placed in the folder, and which will append yesterday's date in MMDDYYYY format. For example, if the files are called "holdings.CSV" and "transactions.CSV" when they are placed in the folder, I need to rename them to "holdings01112022.CSV" and "transactions01112022.CSV". I only want to rename the new files in the folder; the files from previous days, with the dates already appended, will remain in the folder. Again, I'm a total beginner, so my code may not make sense and there may be superfluous or redundant lines. I'd love corrections. Am I going down the right path here, or am I off altogether? Any suggestions?
import os, re
from datetime import date
from datetime import timedelta
directory = 'C:\Users\me\main folder\subfolder'
filePattern = re.compile('%s.CSV', re.VERBOSE)
for originalName in os.listdir('.'):
    mo = filePattern.search(originalName)
    if mo == None:
        continue
today = date.today()
yesterday = today - timedelta(days = 1), '%M%D%Y'
for originalName in directory:
    newName = originalName + yesterday
    os.rename(os.path.join(directory, originalName), os.path.join(directory, newName))
Any help is appreciated. Thanks.
Here's a short example on how to code your algorithm.
import pathlib
from datetime import date, timedelta
if __name__ == '__main__':
    directory = pathlib.Path('/Users/cesarv/Downloads/tmp')
    yesterday = date.today() - timedelta(days=1)
    # Only files whose stem does not already end in a digit (no date yet)
    for file in directory.glob('*[!0123456789].csv'):
        # MMDDYYYY, as requested in the question (with_stem needs Python 3.9+)
        new_path = file.with_stem(file.stem + yesterday.strftime('%m%d%Y'))
        if not new_path.exists():
            print(f'Renaming {file.name} to {new_path.name}')
            file.rename(new_path)
        else:
            print(f'File {new_path.name} already exists.')
By using pathlib, you'll simplify the handling of filenames and paths and, if you run this program on Linux or macOS, it will work just the same.
Note that:
We're limiting the list of files to process with glob(), where the pattern *[!0123456789].csv means "all filenames that do not end with a digit before the suffix (the [!0123456789] represents one character, before the suffix, that should not - ! - equal any of the characters in the brackets)." This allows you to process only files that do not contain a date in their names.
The elements yielded by the for loop, referenced with the file variable, are objects of class Path, which give us methods and properties we can work with, such as with_stem() and stem.
We create a new Path object, new_path, whose stem (the file's name without the suffix .csv) is set to the original file's stem (file.stem) plus yesterday's date in the format you require, by using the method with_stem().
Since this new Path object contains the same path of the original file, we can use it to rename the original file.
As you can see, we can also check that a file with the new name does not exist before any renaming by using the exists() method.
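One caveat with the glob pattern: on a case-sensitive file system `*.csv` will not match `holdings.CSV`, and `with_stem()` needs Python 3.9 or later. Below is a minimal sketch of the same idea that compares the suffix case-insensitively and uses `with_name()` instead. The helper name `dated_path` is made up for illustration, and the loop only prints what it would do.

```python
from datetime import date, timedelta
from pathlib import Path


def dated_path(path, day):
    """Return path with day appended to the stem as MMDDYYYY, or None to skip.

    Hypothetical helper: skips non-CSV files and files whose stem already
    ends in a digit (assumed to carry a date from an earlier run).
    """
    if path.suffix.lower() != '.csv' or path.stem[-1:].isdigit():
        return None
    return path.with_name(f"{path.stem}{day.strftime('%m%d%Y')}{path.suffix}")


if __name__ == '__main__':
    yesterday = date.today() - timedelta(days=1)
    # Dry run over the current directory: nothing is renamed here
    for file in Path('.').iterdir():
        new_path = dated_path(file, yesterday)
        if new_path is not None and not new_path.exists():
            print(f'Would rename {file.name} to {new_path.name}')
```

Replacing the `print` call with `file.rename(new_path)` would perform the rename once the dry-run output looks right.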
You can also review the documentation on pathlib.
If you have any doubts, ask away!
Are you running into any issue? If all you want is to pick out the CSV files, you can simplify your code by using a single if statement instead of a regex. You also need to convert your date to a string before adding it to the file name (e.g. newName = originalName[:-4] + yesterday.strftime('%m%d%Y') + '.csv').
If this code doesn't solve your problem, please mention the error you are receiving, to make it easier for others to help you.
Without knowing which error you are getting, here is my suggestion to eliminate the likely ones.
import os
from datetime import date
from datetime import timedelta
directory = 'C:/Users/me/main folder/subfolder'
for originalName in os.listdir(directory):
    print(f'original name is = {originalName}')
    # only CSV files that do not already end in a date (digit before the extension)
    if originalName[-4:].lower() == '.csv' and not originalName[-5:-4].isdigit():
        today = date.today()
        yesterday = today - timedelta(days = 1)
        yesterday_file_name = originalName[:-4]+yesterday.strftime("%m%d%Y")+'.csv'
        os.rename(os.path.join(directory, originalName), os.path.join(directory, yesterday_file_name))
        #print('formatted file names are')
        #print(yesterday_file_name)
#here is the sample output
#original name is = holdings.csv
#formatted file names are
#holdings01112022.csv
#original name is = transactions.CSV
#formatted file names are
#transactions01112022.csv
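To illustrate the point about converting the date to a string: `str()` on a `date` gives the ISO form with hyphens, whereas `strftime('%m%d%Y')` gives the MMDDYYYY form asked for in the question. This small sketch uses a fixed date, chosen to match the sample output above, so the result is reproducible.

```python
from datetime import date, timedelta

today = date(2022, 1, 12)                # fixed date for illustration only
yesterday = today - timedelta(days=1)

iso_text = str(yesterday)                # '2022-01-11'
mmddyyyy = yesterday.strftime('%m%d%Y')  # '01112022'

original_name = 'holdings.CSV'
stem, extension = original_name[:-4], original_name[-4:]
new_name = stem + mmddyyyy + extension
print(new_name)                          # holdings01112022.CSV
```

Note that the format codes are case-sensitive: `%m` is the month, while `%M` is the minute.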
I have a number of CSV files (6) in a Linux folder that I need to rename and relocate to a new folder on the same server.
The files are currently named <entity_name>_yyyymmdd_hhmmss.csv, bearing in mind that <entity_name> is a string that varies from file to file.
I need to keep the original <entity_name> but replace the yyyymmdd_hhmmss with today's date in the format yyyymmdd, so that we end up with <entity_name>_yyyymmdd.csv.
If this could be done using Python, that would be great, thanks.
Being new to Python, I found the Internet awash with ideas; some were close, but none helped me achieve what I am after.
I have successfully managed to loop through the folder I need and read each file name, but am stuck renaming the files.
It's just straight-line coding. For each file, extract the base, add the date and extension, and rename.
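import os
import glob
import datetime
today = datetime.date.today().strftime('%Y%m%d')
newpath = "wherever/we/go/"
for name in glob.glob('*.csv'):
    # drop the trailing _yyyymmdd_hhmmss.csv; the entity name may contain underscores
    base = name.rsplit('_', 2)[0]
    newname = f'{base}_{today}.csv'
    # os.rename also moves the file; it may fail if newpath is on a different filesystem
    os.rename(name, os.path.join(newpath, newname))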
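If the entity name may itself contain underscores, or the folder holds other CSV files, matching the trailing timestamp explicitly is a little safer than splitting on an underscore. Here is a sketch under the assumption that every file of interest ends in `_yyyymmdd_hhmmss.csv`. The function name `renamed` is an illustrative choice, not part of the answer above.

```python
import re
from datetime import date

# <entity_name>_yyyymmdd_hhmmss.csv, where entity_name may contain underscores
TIMESTAMPED = re.compile(r'^(?P<entity>.+)_\d{8}_\d{6}\.csv$')


def renamed(filename, today):
    """Return '<entity_name>_yyyymmdd.csv', or None if the name does not match."""
    match = TIMESTAMPED.match(filename)
    if match is None:
        return None
    return f"{match.group('entity')}_{today.strftime('%Y%m%d')}.csv"


print(renamed('sales_eu_20220101_093000.csv', date(2022, 6, 23)))  # sales_eu_20220623.csv
```

Files for which `renamed` returns `None` can simply be skipped. The others can be moved with `os.rename` or `shutil.move` into the target folder.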
I've only just recently started to code with Python, and I have hit my first problem, which I can't seem to figure out after days of research. Hopefully someone on this forum can help me out.
The situation: In our company I have multiple folders and subfolders. Within those subfolders we have Excel files called:
Item Supply Demand "date".xlsx
Backorder report"date".xlsx
Product available report"date".xlsx
Every morning our IT downloads a new file with these names and today's date. For example, today it will look like this: Item Supply Demand 23-06-22.xlsx
The goal: I want to find the most recent Excel file within our subfolders whose name contains "Item Supply Demand". I already know how to find the most recent Excel file with the glob.glob function. However, I cannot seem to add an extra filter on part of the name. Below is the code that I already have:
import sys
import csv
import pandas as pd
import glob
import os.path
import pathlib
import re
#search for all Excel files
files = glob.glob(r"Pathname\**\*.xlsx", recursive = True)
#find most recent Item Supply Demand report
text_files = str(files)
if 'Item Supply Demand' in text_files:
    max_file = max(files, key=os.path.getctime)
#Add the file to the dataframe
df = pd.read_excel(max_file)
df
Does anyone know what is currently missing or wrong in my code?
Many thanks in advance for helping out!
Cheers,
Kav
Try this; you're already 99% of the way there.
files = glob.glob(r"Pathname\**\*Item Supply Demand*.xlsx", recursive = True)
Then I suppose the code block underneath can drop the conditional to become
# find most recent Item Supply Demand report
max_file = max(files, key=os.path.getctime)
Note: I haven't checked whether the rest of that syntax does what you want, or even works at all. I'm assuming it is working for you, as it's not the focus of your question.
edit: Just checked that - nice - it will give you exactly what you want.
The variable "files" is already a list of strings. You can build a list of only the strings that contain the substring, then use that list.
wanted_file_substring = "Item Supply Demand"
matching_files = [specific_file for specific_file in files if wanted_file_substring in specific_file]
max_file = max(matching_files, key=os.path.getctime)
Edit to my answer:
Whichever answer you choose, you need to either initialize the variable outside of the "if" statement or move the read_excel line into the if statement. If the file you want is not found, your program will raise an error, because read_excel is given a variable that was never assigned.
Change the if statement to:
if files:
    max_file = max(.....)
    pd.read_excel(max_file)
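Because the report date is part of the file name, another option is to pick the newest report by that date rather than by `os.path.getctime`, which can change if a file is copied or re-downloaded. This sketch assumes the names end in a `dd-mm-yy` date, as in the example `Item Supply Demand 23-06-22.xlsx`. `newest_report` is a hypothetical helper and the file list is made up.

```python
import re
from datetime import datetime
from pathlib import Path

DATE_IN_NAME = re.compile(r'(\d{2}-\d{2}-\d{2})$')  # assumed dd-mm-yy at end of stem


def newest_report(paths, name_part='Item Supply Demand'):
    """Return the path whose file name holds name_part and the latest date, or None."""
    dated = []
    for path in map(Path, paths):
        found = DATE_IN_NAME.search(path.stem)
        if name_part in path.name and found:
            dated.append((datetime.strptime(found.group(1), '%d-%m-%y'), path))
    if not dated:
        return None  # nothing matched, so there is nothing to read
    return max(dated, key=lambda pair: pair[0])[1]


# Made-up example paths; in practice this would be the list returned by glob.glob
files = [
    'reports/Item Supply Demand 23-06-22.xlsx',
    'reports/Item Supply Demand 01-07-22.xlsx',
    'reports/Backorder report 05-07-22.xlsx',
]
print(newest_report(files))
```

Checking the result for `None` before calling `pd.read_excel` covers the case where no matching file exists.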
I have the below code which downloads a sheet to a folder on my computer. How do I have it download the excel sheet to a newly created folder with the current day's datestamp? So for example, I want the file to download to a folder called:
C:/Users/E29853/OneDrive/Smartsheets/Templates/20220610/
for any files downloaded on June 10, 2022.
This is the code I have:
import os, smartsheet
token=os.environ['SMARTSHEET_ACCESS_TOKEN']
smartsheet_client = smartsheet.Smartsheet(token)
smartsheet_client.errors_as_exceptions(True)
smartsheet_client.Sheets.get_sheet_as_excel(
    8729488427892475,
    'C:/Users/E29853/OneDrive/Smartsheets/Templates',
    'Region.xlsx'
)
In order to augment your existing code to achieve your stated objective, you need to know how to achieve the following two things with Python:
how to get the current date (string) in yyyymmdd format
how to create a new directory if it doesn't already exist
I'm fairly new to Python myself, but was able to figure this out thanks to Google. In case it's helpful for you in the future, here was my process for figuring this out.
Step 1: Determine how to get the current date (yyyymmdd) in Python
Google search for python get current date yyyymmdd
The top search result was a Stack Overflow answer with > 1000 upvotes (which indicates a broadly approved answer that should be reliable).
Note that the date format was slightly different in this question/answer (yyyy-mm-dd) -- I omitted the hyphens in my code, to get the desired format yyyymmdd.
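A minimal sketch of that step, using only the standard library: `strftime('%Y%m%d')` produces the date without hyphens.

```python
from datetime import datetime

# get current date in yyyymmdd format (no hyphens)
current_date = datetime.today().strftime('%Y%m%d')
print(current_date)
```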
Now that I know how to get the date string in the desired format, I'll be able to concatenate it with the string that represents my base path, to get my target path:
# specify path
path = 'c:/users/kbrandl/desktop/' + current_date
Step 2: Determine how to create a directory (if it doesn't already exist) in Python
Google search for python create folder if not exists
Once again, the top search result provided the sample code I was looking for.
With this info, I now know how to create my target directory (folder) if it doesn't yet exist:
# create directory if it doesn't exist
if not os.path.exists(path):
    os.mkdir(path)
Putting this all together now...the following code achieves your stated objective.
import os, smartsheet
from datetime import datetime
# initialize the client as in your original code
token = os.environ['SMARTSHEET_ACCESS_TOKEN']
smartsheet_client = smartsheet.Smartsheet(token)
smartsheet_client.errors_as_exceptions(True)
sheetId = 3932034054809476
# get current date in yyyymmdd format
current_date = datetime.today().strftime('%Y%m%d')
# specify path
path = 'c:/users/kbrandl/desktop/' + current_date
# create directory if it doesn't exist
if not os.path.exists(path):
    os.mkdir(path)
# download file to specified path
smartsheet_client.Sheets.get_sheet_as_excel(
    sheetId,
    path,
    'MyFileName.xlsx'
)
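As a variation, `os.makedirs` with `exist_ok=True` creates the folder (and any missing parents) in one call, without a separate existence check. The sketch below uses a temporary directory as a stand-in for the real base path, and `dated_folder` is a hypothetical helper name.

```python
import os
import tempfile
from datetime import datetime


def dated_folder(base_path):
    """Create base_path/<yyyymmdd> if needed and return its path."""
    path = os.path.join(base_path, datetime.today().strftime('%Y%m%d'))
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path


base = tempfile.mkdtemp()    # stand-in for the real base folder
first = dated_folder(base)
second = dated_folder(base)  # calling it again is harmless
print(os.path.isdir(first))
```

The returned path can then be passed as the download folder argument in place of the hard-coded one.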
I am trying to run this script so that it creates a .txt file based on the date.
#Importing libraries
import os
import sys
#Selecting time and methods
from time import gmtime, strftime
from datetime import datetime
#Formatting date & printing it
date = datetime.now().strftime("%Y_%m_%d-%I:%M_%p")
#Creating the name of the file, plus the date we created above
newFile = ("Summary_" + date) + '.txt'
#Creating the filepath where we want to save this file
filepath = os.path.join('C:\\Users\\myUser\\Desktop\\myFolder', newFile)
#If the path doesn't exist
if not os.path.exists('C:\\Users\\myUser\\Desktop\\myFolder'):
    os.makedirs('C:\\Users\\MyUser\\Desktop\\myFolder')
f = open(filepath, "w")
Everything works, however, the file gets created as a File and not a .txt file.
I am not sure if I am not concatenating the strings correctly, or if there is an additional step I need to take.
I noticed that if I type anything after the date string, it doesn't get added to the end of the file.
If I type the string manually like this, then it works just fine:
filepath = os.path.join('C:\\Users\\myUser\\Desktop\\myFolder', "Summary_DateGoesHere.txt")
Am I missing something in my script?
I appreciate the assistance.
The issue here is that Windows does not allow a colon (:) in a file name.
Here, your file name is "Summary_2022_05_05-11:24_AM.txt". Because the colon is not allowed, the name is cut off after 11, and a file named just Summary_2022_05_05-11 is created (on NTFS, the text after the colon is typically interpreted as the name of an alternate data stream rather than as part of the file name).
To fix this, use an underscore or hyphen before the minutes field, like this:
date = datetime.now().strftime("%Y_%m_%d-%I_%M_%p")
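More generally, a generated name can be checked for the characters Windows reserves in file names before the file is opened. In this sketch, the helper `safe_filename` and the choice to replace each reserved character with an underscore are assumptions made for illustration.

```python
import re
from datetime import datetime

# Characters that Windows does not allow in file names
RESERVED = re.compile(r'[<>:"/\\|?*]')


def safe_filename(name):
    """Replace characters that are reserved on Windows with underscores."""
    return RESERVED.sub('_', name)


# Fixed timestamp for illustration, using the original format with the colon
stamp = datetime(2022, 5, 5, 11, 24).strftime('%Y_%m_%d-%I:%M_%p')
print(safe_filename('Summary_' + stamp + '.txt'))
```

This should only be applied to the file name itself, not to a full path, since it would also replace the path separators and the colon after the drive letter.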
In my program I import a certain file, at the end I create a file in which I save the outputs. I would like to use the name of the imported file (with some additions) as the name of the newly created file.
Since I have a large amount of files to import, I would like to do this in order to have a better overview of which files were created based on which files.
For example: I open a file with file = open('summer.txt') and I would like to use the 'summer' of the original file and just add the year, so the name of the new file would be 'summer2020.txt'.
I would like to automatically detect the name of the imported file, add the year and that would be the name of my new file. (the addition to the name ('2020') would be the same for every opened file)
I'm still new to Python, so I don't know if this is possible or if there is a better way of doing it. Thank you very much for any help!
If you are using Python 3.9 or later, you can use pathlib and the new with_stem method:
from pathlib import Path
file_path = Path('summer.txt')
new_file_path = file_path.with_stem(file_path.stem + '2020')
An advantage of pathlib is that it supports multiple platforms (Windows/macOS/Linux) and their different ways of specifying paths. It also lets you easily split the file name into the stem summer and the suffix .txt.
For older versions of Python, such as 3.6.5, you can do the following instead:
new_file_path = file_path.with_name(f"{file_path.stem}2020{file_path.suffix}")
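Put together, the output name can be derived from whatever input path is opened, so nothing needs to be typed twice. This sketch uses `with_name()` so that it also runs on Python versions before 3.9. The helper `output_path_for` and the default addition `'2020'` are illustrative.

```python
from pathlib import Path


def output_path_for(input_path, addition='2020'):
    """Return the input path with `addition` inserted before the file suffix."""
    input_path = Path(input_path)
    return input_path.with_name(f'{input_path.stem}{addition}{input_path.suffix}')


print(output_path_for('summer.txt'))            # summer2020.txt
print(output_path_for('data/winter.txt').name)  # winter2020.txt
```

The returned path can be passed straight to `open(..., 'w')` when saving the outputs.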
Here's an example to automatically get the current year:
from datetime import datetime
curdate = datetime.now()
curyear = str(curdate.year)
file = open('summer' + curyear + '.txt', 'w')  # 'w' creates the new output file
As you have many files, you may want something like:
import glob, shutil
myfiles = glob.glob("*.txt")
for fname in myfiles:
    basename = fname[:-4] # drop .txt at the end
    newname = f"{basename}2020.txt"
    shutil.copy(fname, newname)