I am implementing a logging process which appends to a log file.
I want to check whether the log file exists: if it does, append more lines to it; if not, create a new file and then append. But I keep getting an error saying: No such file or directory
try:
    f = open(os.path.join(
        BASE_DIR, '/app/logs/log-' + current_date + '.csv'), "a+")
    f.write(message + "\n")
except IOError:
    f = open(os.path.join(
        BASE_DIR, '/app/logs/log-' + current_date + '.csv'), "w+")
    f.write(message + "\n")
finally:
    f.close()
What mistake am I making here?
============ Update
This code works:
try:
    f = open('log-' + current_date + '.csv', "a+")
    f.write(message + "\n")
except IOError:
    f = open('log-' + current_date + '.csv', "w+")
    f.write(message + "\n")
finally:
    f.close()
If I open the file like this, it works. But as soon as I add the path, it keeps saying no such file or directory.
=============== Update
Never mind, it has been working all along. I forgot to rebuild my Docker image to see the results. :DD
So the problem was the incorrect path.
The output of os.path.join will be /app/logs/log-<current_date>.csv, with BASE_DIR discarded entirely. This is not what you want. Remove the leading / from that second argument and it will work as you want. This happens because you passed an absolute path as the second input: os.path.join throws away every component before an absolute one. See the os.path.join documentation for an explanation.
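A quick sketch of that behaviour, assuming a hypothetical BASE_DIR of /code:

import os

BASE_DIR = '/code'  # hypothetical project root, for illustration only

# An absolute second argument makes join discard BASE_DIR:
print(os.path.join(BASE_DIR, '/app/logs/log.csv'))  # -> /app/logs/log.csv

# A relative second argument is appended as intended:
print(os.path.join(BASE_DIR, 'app/logs/log.csv'))   # -> /code/app/logs/log.csv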
Why not do something like this:
import os

check = os.path.isfile("file.txt")
with open("file.txt", "a+") as f:
    if not check:
        f.write("oi")
    else:
        f.write("oi again")
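One caveat worth knowing: "a+" creates a missing file, but it still raises an error if the parent directory does not exist, which is a common cause of "No such file or directory" inside a fresh container. A minimal sketch, reusing current_date and message from the question (os.makedirs with exist_ok needs Python 3.2+):

import os

log_dir = '/app/logs'  # directory from the question; adjust to your setup
os.makedirs(log_dir, exist_ok=True)  # create the directory tree if it is missing
with open(os.path.join(log_dir, 'log-' + current_date + '.csv'), 'a') as f:
    f.write(message + "\n")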
I've written a simple Python script to search for a log file in a folder (which has approx. 4 million files) and read the file.
Currently, the average time taken for the entire operation is 20 seconds. I was wondering if there is a way to get the response faster.
Below is my script
import re
import os
import timeit
from datetime import date

log_path = "D:\\Logs Folder\\"
rx_file_name = r"[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}"
log_search_script = True
today = str(date.today())

while log_search_script:
    try:
        log_search = input("Enter image file name: ")
        file_name = re.search(rx_file_name, log_search).group()
        log_file_name = str(file_name) + ".log"
        print(f"\nLooking for log file '{log_file_name}'...\n")
    except:
        print("\n ***** Invalid input. Try again! ***** \n")
        continue
    start = timeit.default_timer()
    if log_file_name in os.listdir(log_path):
        log_file = open(log_path + "\\" + log_file_name, 'r', encoding="utf8")
        print('\n' + "--------------------------------------------------------" + '\n')
        print(log_file.read())
        log_file.close()
        print('\n' + "--------------------------------------------------------" + '\n')
        print("Time Taken: " + str(timeit.default_timer() - start) + " seconds")
        print('\n' + "--------------------------------------------------------" + '\n')
    else:
        print("Log File Not Found")
    search_again = input('\nDo you want to search for another log ("y" / "n") ?').lower()
    if search_again[0] == 'y':
        print("======================================================\n\n")
        continue
    else:
        log_search_script = False
Your problem is the line:
if log_file_name in os.listdir(log_path):
This has two problems:
- os.listdir will create a huge list, which can take a lot of time (and space...).
- the "... in ..." part will then go over that huge list linearly, searching for the file.
Instead, let your OS do the hard work and "ask for forgiveness, not permission". Just assume the file is there and try to open it; if it is not actually there, an error will be raised, which we catch:
try:
    with open(log_path + "\\" + log_file_name, 'r', encoding="utf8") as file:
        print(file.read())
except FileNotFoundError:
    print("Log File Not Found")
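If you ever do need to scan such a directory, os.scandir is worth knowing: it yields entries lazily instead of building the whole list up front. A small sketch, reusing log_path and log_file_name from the question:

import os

with os.scandir(log_path) as entries:
    for entry in entries:
        if entry.name == log_file_name:
            print("found", entry.path)
            break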
You can use glob, which returns a list of paths matching a pattern:

import glob

# e.g. all .log files under log_path
print(glob.glob(log_path + "*.log"))
I know there's a lot of content about reading & writing out there, but I'm still not quite finding what I need specifically.
I have 5 files (i.e. in1.txt, in2.txt, in3.txt....), and I want to open/read, run the data through a function I have, and then output the new returned value to corresponding new files (i.e. out1.txt, out2.txt, out3.txt....)
I want to do this in one program run. I'm not sure how to write the loop to process all the numbered files in one run.
If you want them to be processed serially, you can use a for loop as follows:
inpPrefix = "in"
outPrefix = "out"
for i in range(1, 6):
    inFile = inpPrefix + str(i) + ".txt"
    with open(inFile, 'r') as f:
        fileLines = f.readlines()
    # process content of each file
    processedOutput = process(fileLines)
    # write to file
    outFile = outPrefix + str(i) + ".txt"
    with open(outFile, 'w') as f:
        f.write(processedOutput)
Note: This assumes that the input and output files are in the same directory as the script is in.
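If you want it to work from any current directory, not just the script's folder, you can anchor the file names to the script's own location. A small sketch of that idea, using the same in/out naming:

import os

script_dir = os.path.dirname(os.path.abspath(__file__))
for i in range(1, 6):
    inFile = os.path.join(script_dir, "in" + str(i) + ".txt")
    outFile = os.path.join(script_dir, "out" + str(i) + ".txt")
    # ... read, process, and write as in the loop above ...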
If you are looking just to run them one by one separately, you can do:

import os

count = 0
directory = "dir/where/your/files/are/"
for filename in os.listdir(directory):
    if filename.endswith(".txt"):
        count += 1
        with open(directory + filename, "r") as read_file:
            # your processing function
            return_of_your_function = do_something_with_data(read_file.read())
        with open(directory + str(count) + filename, "w") as write_file:
            write_file.write(return_of_your_function)
Here you go! I would do something like this:
(Assuming all the input .txt files are in the same input folder)

input_path = '/path/to/input/folder/'
output_path = '/path/to/output/folder/'
for count in range(1, 6):
    input_file = input_path + 'in' + str(count) + '.txt'
    output_file = output_path + 'out' + str(count) + '.txt'
    with open(input_file, 'r') as f:
        content = f.readlines()
    output = process_input(content)
    with open(output_file, 'w') as f:
        f.write(output)
I'm trying to write some data to a file, but I'm having problems with the path I'm using.
This is my code:
import csv

my_path = r'c:\data\XYM\Desktop\MyFolder 7-sep'
with open(my_path + '\\' + 'Vehicles_MM' + '\\' + name_vehicile + '-AB.txt', 'w') as output:
    writer = csv.writer(output, delimiter='\t')
    writer.writerow(headers)
    writer.writerow(data)
    for vehicle_loc_list in vehicle_loc_dict.values():
        for record_group in group_records(vehicle_loc_list):
            writer.writerow(output_record(record_group))
This is the error I receive:
FileNotFoundError: [Errno 2] No such file or directory: 'c:\\data\\XYM\\Desktop\\MyFolder 7-sep\\Vehicles_MM\\20200907-AB.txt'
Based on revelations in comments, the problem is that you are trying to write to a subdirectory c:\data\XYM\Desktop\MyFolder 7-sep\Vehicle_MM\ which doesn't exist, and which actually you don't want to write into.
The fix is to remove the directory separator \\; maybe use a different separator instead. For example,
with open(my_path + '\\' + 'Vehicles_MM-' + name_vehicile + '-AB.txt', 'w') as output:
If you did want to write to this subdirectory, you have to make sure it exists before you attempt to open a file inside it.
os.makedirs(my_path + '\\' + 'Vehicles_MM', exist_ok=True)
with open(...
The same thing is somewhat more readable with pathlib.Path:

from pathlib import Path

my_path = Path(r'c:\data\XYM\Desktop\MyFolder 7-sep')
vehicles_mm = my_path / 'Vehicles_MM'
vehicles_mm.mkdir(parents=True, exist_ok=True)
filename = vehicles_mm / (name_vehicile + '-AB.txt')
with filename.open('w') as output:
    ...
You should use one of the builtins to work with paths: either os.path or pathlib.Path.

# with os.path:
import os.path as p

filename = p.join(my_path, "Vehicles_MM", name_vehicile + "-AB.txt")
assert p.exists(p.dirname(filename))

# with pathlib.Path:
from pathlib import Path

my_path = Path(r"c:\data\XYM\Desktop\MyFolder 7-sep")
filename = my_path.joinpath("Vehicles_MM", name_vehicile + "-AB.txt")
assert filename.parent.exists()
I have a script where I download a bunch of Zip files (150+) from a website and unzip them. I just noticed that the Zip files weren't completely extracting - i.e., there should be 68 files in each directory and there are only 62. The script ran fine with no errors.
Any thoughts? I tried running one Zip file through by itself and it extracted fine. Could the operation be timing out or something? Please forgive my code, I'm new.
I'm running Python 2.7.
import csv, urllib, urllib2, zipfile
from datetime import date

dlList = []
dloadUrlBase = r"https://websoilsurvey.sc.egov.usda.gov/DSD/Download/Cache/SSA/"
dloadLocBase = r"Z:/Shared/Corporate/Library/GIS_DATA/Soils/"
stateDirList = []
countyDirList = []
fileNameList = []
unzipList = []
extractLocList = []
logfile = 'log_{}.txt'.format(date.today())

with open(r'N:\Shared\Service Areas\Geographic Information Systems\Tools and Scripts\Soil_Downloads\FinalListforDownloads.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        stateDirList.append(row['StateDir'])
        countyDirList.append(row['CountyDir'])
        fileNameList.append(row['File_Name'])

for state, county, fileName in zip(stateDirList, countyDirList, fileNameList):
    dloadDir = dloadLocBase + state + r"/" + county + "/" + fileName
    requestURL = dloadUrlBase + fileName
    extractLocList.append(dloadLocBase + state + r"/" + county + "/")
    try:
        urllib.urlretrieve(requestURL, dloadDir)
        print requestURL + " found"
        urllib.urlcleanup()
        unzipList.append(dloadDir)
        f = open(logfile, 'a+')
        f.write(dloadDir + " has been downloaded")
        f.close()
    except:
        pass

for zFile, uzDir in zip(unzipList, extractLocList):
    zip_ref = zipfile.ZipFile(zFile, "r")
    zip_ref.extractall(uzDir)
    zip_ref.close()
Instead of just passing when an error is raised, log or print what the error is. That should point you at the issue (or set of issues):

except Exception as e:
    print e  # or print e.message
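If the message alone is not enough, you can print the whole traceback instead; a minimal sketch around the download call (Python 2, to match the question):

import traceback

try:
    urllib.urlretrieve(requestURL, dloadDir)
except Exception:
    traceback.print_exc()  # prints the full stack trace of the current exception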
Turns out this was a syncing issue with my network. We use a cloud-based network that syncs with our offices, and somehow not all of the files were being synced; some were getting left in a queue.
So I'm writing a script to take large CSV files and divide them into chunks. Each line in these files is formatted as follows:
01/07/2003,1545,12.47,12.48,12.43,12.44,137423
Where the first field is the date. The next field to the right is a time value. These data points are at minute granularity. My goal is to fill each output file with 8 days' worth of data, so I want to write all the lines covering an 8-day span into each new file.
Right now, I'm only seeing the program write one line per "chunk," rather than all the lines. Code is shown below, along with screenshots showing how the chunk directories are made and one file's contents.
For reference, the screenshot shows day 8, and the time 1559 means it stored the last line right before the mod condition became true. So I'm thinking everything is getting overwritten somehow, since only the last values are being stored.
import os
import time

CWD = os.getcwd()
WRITEDIR = CWD + "/Divided Data/"
if not os.path.exists(WRITEDIR):
    os.makedirs(WRITEDIR)
FILEDIR = CWD + "/SP500"
os.chdir(FILEDIR)
valid_files = []
filelist = open("filelist.txt", 'r')

for file in filelist:
    cur_file = open(file.rstrip() + ".csv", 'r')
    cur_file.readline()  # skip first line
    prev_day = ""
    count = 0
    chunk_count = 1
    for line in cur_file:
        day = line[3:5]
        WDIR = WRITEDIR + "Chunk"
        cur_dir = os.getcwd()
        path = WDIR + " " + str(chunk_count)
        if not os.path.exists(path):
            os.makedirs(path)
        if(day != prev_day):
            # print(day)
            prev_day = day
            count += 1
            # Create new directory
            if(count % 8 == 0):
                chunk_count += 1
                PATH = WDIR + " " + str(chunk_count)
                if not os.path.exists(PATH):
                    os.makedirs(PATH)
                print("Chunk count: " + str(chunk_count))
                print("Global count: " + str(count))
        temp_path = WDIR + " " + str(chunk_count)
        os.chdir(temp_path)
        fname = file.rstrip() + str(chunk_count) + ".csv"
        with open(fname, 'w') as f:
            try:
                f.write(line + '\n')
            except:
                print("Could not write to file. \n")
        os.chdir(cur_dir)
        if(chunk_count >= 406):
            continue
    cur_file.close()
    # count += 1
The answer is in the comment but let me give it here so that your question is answered.
You're opening your file in 'w' mode which overwrites all the previously written content. You need to open it in the 'a' (append) mode:
fname = file.rstrip() + str(chunk_count) + ".csv"
with open(fname, 'a') as f:
See more on the open function and its modes in the Python documentation. It specifically says that 'w' opens the file for writing, truncating the file first.
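A tiny self-contained demonstration of the difference, using a hypothetical demo file name:

# 'w' truncates the file on every open, so only the last write survives:
for line in ["first", "second", "third"]:
    with open("demo_w.csv", 'w') as f:
        f.write(line + '\n')
print(open("demo_w.csv").read())  # prints just: third

# 'a' appends on every open, so all three writes survive:
for line in ["first", "second", "third"]:
    with open("demo_a.csv", 'a') as f:
        f.write(line + '\n')
print(open("demo_a.csv").read())  # prints: first, second, third (one per line)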