I have a Python function that, when first called, reads the contents of a file into a list and checks whether an element is in that list.
def is_in_file(element, path):
    with open(path, 'r') as f:
        lines = [line.strip() for line in f.readlines()]
    return element in lines
When the function is called again, however, the file should not be read again; the function should instead remember the value of lines from the first call.
Is there a way to preserve the context of a function between calls? I don't want to make lines global, to avoid littering the enclosing namespace. I guess it's quite similar to the use of a generator and the yield statement...
My opinion is that the correct way is to encapsulate this in a class. The path is set at instance creation, and method calls use the list of lines. That way you can even have different files at the same time:
class finder:
    def __init__(self, path):
        with open(path, 'r') as f:
            self.lines = [line.strip() for line in f]
    def is_in_file(self, element):
        # the lines live on the instance, so they must be reached through self
        return element in self.lines
That is not exactly what you have asked for, but is much more OO.
Dirty hack: add variable to function object and store value there.
def is_in_file(element, path):
    if not hasattr(is_in_file, "__lines__"):
        with open(path, 'r') as f:
            setattr(is_in_file, "__lines__", [line.strip() for line in f.readlines()])
    return element in is_in_file.__lines__
You could save the lines in a keyword argument declared with a mutable default value:
def is_in_file(element, path, lines=[]):
    if lines:
        return element in lines
    with open(path, 'r') as f:
        lines += [line.strip() for line in f.readlines()]
    return element in lines
Caveat:
you must be sure that this function is only called with one file; if you call it with a second file, it will not open it and continue to return values based on the first file opened.
A more flexible solution:
A more flexible solution is to use a dictionary of lines keyed by path: each file is opened once and its contents stored, so you can call the function with different files and get correct results while memoizing the contents.
def is_in_file(element, path, all_lines={}):
    try:
        return element in all_lines[path]
    except KeyError:
        with open(path, 'r') as f:
            all_lines[path] = [line.strip() for line in f.readlines()]
        # look the lines up in the cache; there is no local name "lines" here
        return element in all_lines[path]
OO solution:
Create a class to encapsulate the content of a file, like what @SergeBallesta proposed; although it does not address exactly what you requested, it is likely the better solution in the long run.
Use the functools.lru_cache decorator to set up a helper function that reads in any given file only once and then stores the result.
from functools import lru_cache
@lru_cache(maxsize=1)
def read_once(path):
    with open(path) as f:
        print('reading {} ...'.format(path))
        return [line.strip() for line in f]
def in_file(element, path):
    return element in read_once(path)
Demo:
>>> in_file('3', 'file.txt')
reading file.txt ...
True
>>> in_file('3', 'file.txt')
True
>>> in_file('3', 'anotherfile.txt')
reading anotherfile.txt ...
False
>>> in_file('3', 'anotherfile.txt')
False
This has the serious advantage that in_file does not have to be called with the same file name every time.
You can adjust the maxsize argument to a higher number if you want more than one file to be cached at any given time.
Lastly: consider a set for the return value of read_once if all you are interested in are membership tests.
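The set suggestion could look roughly like the sketch below. The name read_once_as_set is hypothetical, maxsize=None (cache every file ever read) is an assumption, and a frozenset is used so that callers cannot mutate the cached value.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # assumption: keep every file that is ever read
def read_once_as_set(path):
    """Read the file once and cache its stripped lines as a frozenset."""
    with open(path) as f:
        return frozenset(line.strip() for line in f)


def in_file(element, path):
    # Set membership is a hash lookup instead of a linear scan of a list.
    return element in read_once_as_set(path)
```

Duplicate lines and line order are lost in a set, which does not matter for pure membership tests.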
This answer proposes a class similar to Serge Ballesta's idea.
The difference is that it totally feels like a function, because we use its __call__ method instead of dot notation to conduct the search.
In addition, you can add as many searchable files as you want.
Setup:
class in_file:
    def __init__(self):
        self.files = {}
    def add_path(self, path):
        with open(path) as f:
            self.files[path] = {line.strip() for line in f}
    def __call__(self, element, path):
        if path not in self.files:
            self.add_path(path)
        return element in self.files[path]
in_file = in_file()
Usage
$ cat file1.txt
1
2
3
$ cat file2.txt
hello
$ python3 -i demo.py
>>> in_file('1', 'file1.txt')
True
>>> in_file('hello', 'file1.txt')
False
>>> in_file('hello', 'file2.txt')
True
I am looking to use .yaml to manage several global parameters for a program. I would prefer to manage this from within a function, something like the below. However, it seems globals().update() does not work when included inside a function. Additionally, given the need to load an indeterminate number of variables with unknown names, using the basic global approach is not appropriate. Ideas?
.yaml
test:
  - 12
  - 13
  - 14
  - stuff:
      john
test2: yo
Python
import os
import yaml
def load_config():
    with open(os.path.join(os.getcwd(), {file}), 'r') as reader:
        vals = yaml.full_load(reader)
    globals().update(vals)
Desired output
load_config()
test
---------------
[12,13,14,{'stuff':'john'}]
test2
---------------
yo
What I get
load_config()
test
---------------
NameError: name 'test' is not defined
test2
---------------
NameError: name 'test2' is not defined
Please note: {file} is for you, the code is not actually written that way. Also note that I understand the use of global is not normally recommended, however it is what is required for the answer of this question.
You had {file} in your code, I've assumed that was intended to just be a string of the actual filename. I certainly hope you weren't looking to .format() and then eval() this code? That would be a very bad and unsafe way to run code.
Just return the dictionary vals itself, and access it as needed:
import os
import yaml
def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        # only returning the value, so doing it in one step:
        return yaml.full_load(reader)
cfg = load_config('test.yaml')
print(cfg)
print(cfg['test2'])
Output:
{'test': [12, 13, 14, {'stuff': 'john'}], 'test2': 'yo'}
yo
You should definitely never just update globals() with content from an external file. Use of globals() is only for very specific use cases anyway.
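If the real goal is attribute-style access to the loaded values, one alternative is to wrap the dictionary in a namespace object. This is only a sketch and not part of the answer above: it assumes every top-level key is a valid Python identifier, and vals is a hard-coded stand-in for what yaml.full_load() returns for the sample file.

```python
from types import SimpleNamespace

# Stand-in for the dictionary returned by yaml.full_load() for the sample file.
vals = {'test': [12, 13, 14, {'stuff': 'john'}], 'test2': 'yo'}

# Expose the top-level keys as attributes without touching globals().
cfg = SimpleNamespace(**vals)

print(cfg.test2)    # yo
print(cfg.test[0])  # 12
```

The values stay contained in one object, so nothing from the external file can shadow a name in your module.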
Getting the exact desired output is just a matter of formatting the contents of the dictionary:
import os
import yaml
def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        return yaml.full_load(reader)
def print_config(d):
    for k, v in d.items():
        print(f'{k}\n---------------\n{v}\n')
cfg = load_config('test.yaml')
print_config(cfg)
Which gives exactly the output you described.
Note that this is technically superfluous:
os.path.join(os.getcwd(), fn)
By default, file operations are executed on the current working directory, so you'd achieve the same with:
def load_config(fn):
    with open(fn, 'r') as reader:
        return yaml.full_load(reader)
If you wanted to open the file in the same folder as the script itself, consider this instead:
def load_config(fn):
    with open(os.path.join(os.path.dirname(__file__), fn), 'r') as reader:
        return yaml.full_load(reader)
I'm trying to keep a sort of logbook in a text file to avoid redoing work. I have the following function that performs this task:
def write_to_logbook(target_name):
with open('C:\Documents\logbook.txt', 'a+') as f:
for lines in f:
if target_name not in lines:
f.write(target_name + '\n')
f.close() #when I didn't have f.close() here, it also wasn't writing to the txt file
When I check the text file after I run the script, it remains empty. I'm not sure why.
I call it as such (in reality target name is pulled down from a unique ID, but since I don't want to put everything here, this is the gist):
target_name = 'abc123'
write_to_logbook(target_name)
You need to (potentially) read the entire file before you can decide if target_name has to be added to the file.
def write_to_logbook(target_name):
    fname = r'C:\Documents\logbook.txt'
    with open(fname) as f:
        if any(target_name in line for line in f):
            return
    with open(fname, 'a') as f:
        print(target_name, file=f)
any will return True as soon as any line containing target_name is found, at which point the function itself will return.
If the target name isn't found after reading the entire file, then the second with statement will append the target name to the file.
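A variant of the same two-step idea, as a sketch under two assumptions that are not in the answer above: the logbook may not exist yet, and a name should match a whole line rather than a substring of one. The fname parameter and the boolean return value are added here purely for illustration.

```python
def write_to_logbook(target_name, fname):
    """Append target_name to fname unless a line equal to it already exists."""
    try:
        with open(fname) as f:
            # any() stops reading at the first matching line.
            if any(line.rstrip('\n') == target_name for line in f):
                return False
    except FileNotFoundError:
        pass  # assumption: a missing logbook is treated as an empty one

    with open(fname, 'a') as f:
        print(target_name, file=f)
    return True
```

With whole-line matching, logging 'abc' still works after 'abc123' has been logged, which a substring test would refuse.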
I got it sorted. I used chepner's solution as a jumping off point, since it didn't exactly work (only wrote one target_name for some reason) and kind of did a hybrid of the two:
def write_to_logbook(target_name):
    fname = 'filepath'
    with open(fname) as f:
        for lines in f:
            if target_name in lines:
                return
    with open(fname, 'a+') as f:
        f.write(target_name + '\n')
Thanks for the solution, it helped.
def Delete_con():
contact_to_delete= input("choose name to delete from contact")
to_Delete=list(contact_to_delete)
with open("phonebook1.txt", "r+") as file:
content = file.read()
for line in content:
if not any(line in line for line in to_Delete):
content.write(line)
I get no errors, but the line is not deleted. This function asks the user which name they want to delete from the text file.
This should help.
def Delete_con():
    contact_to_delete = input("choose name to delete from contact")
    contact_to_delete = contact_to_delete.lower()  # Convert input to lower case
    with open("phonebook1.txt", "r") as file:
        content = file.readlines()  # Read lines from text
    content = [line for line in content if contact_to_delete not in line.lower()]  # Keep only lines that do not contain the user input
    with open("phonebook1.txt", "w") as file:  # Write back content to text
        file.writelines(content)
Assuming that:
you want the user to supply just the name, and not the full 'name:number' pair
your phonebook stores one name:number pair per line
I'd do something like this:
import os
from tempfile import NamedTemporaryFile
def delete_contact():
    contact_name = input('Choose name to delete: ')
    # You probably want to pass path in as an argument
    path = 'phonebook1.txt'
    base_dir = os.path.dirname(path)
    with open(path) as phonebook, \
            NamedTemporaryFile(mode='w+', dir=base_dir, delete=False) as tmp:
        for line in phonebook:
            # rsplit instead of split supports names containing ':'
            # if numbers can also contain ':' you need something smarter
            name, number = line.rsplit(':', 1)
            if name != contact_name:
                tmp.write(line)
    os.replace(tmp.name, path)
Using a tempfile like this means that if something goes wrong while processing the file you aren't left with a half-written phonebook, you'll still have the original file unchanged. You're also not reading the entire file into memory with this approach.
os.replace() is Python 3.3+ only. If you're using something older, you can use os.rename() as long as you're not on Windows.
Here's the tempfile documentation. In this case, you can think of NamedTemporaryFile(mode='w+', dir=base_dir, delete=False) as something like open('tmpfile.txt', mode='w+'). NamedTemporaryFile saves you from having to find a unique name for your tempfile (so that you don't overwrite an existing file). The dir argument creates the tempfile in the same directory as phonebook1.txt which is a good idea because os.replace() can fail when operating across two different filesystems.
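The same write-to-tempfile-then-replace pattern can be pulled into a reusable helper. The sketch below is one possible shape, not part of the answer above: filter_lines and its keep callback are hypothetical names, and it assumes the file fits the line-by-line model used for the phonebook.

```python
import os
from tempfile import NamedTemporaryFile


def filter_lines(path, keep):
    """Rewrite path, keeping only the lines for which keep(line) is true."""
    # An empty dirname means the current directory; make that explicit.
    base_dir = os.path.dirname(path) or '.'
    with open(path) as src, \
            NamedTemporaryFile(mode='w', dir=base_dir, delete=False) as tmp:
        for line in src:
            if keep(line):
                tmp.write(line)
    # Both files are closed at this point, so the swap can proceed.
    os.replace(tmp.name, path)
```

If an exception is raised while filtering, the original file is untouched, but the temporary file is left behind in the same directory and would need to be cleaned up separately.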
Instead of this:
FILE = open(f)
do_something(FILE)
FILE.close()
it's better to use this:
with open(f) as FILE:
    do_something(FILE)
What if I have something like this?
if f is not None:
    FILE = open(f)
else:
    FILE = None
do_something(FILE)
if FILE is not None:
    FILE.close()
Where do_something also has an "if FILE is None" clause, and still does something useful in that case - I don't want to just skip do_something if FILE is None.
Is there a sensible way of converting this to with/as form? Or am I just trying to solve the optional file problem in a wrong way?
If you were to just write it like this:
if f is not None:
    with open(f) as FILE:
        do_something(FILE)
else:
    do_something(f)
(file is a builtin btw )
Update
Here is a funky way to do an on-the-fly context with an optional None that won't crash:
from contextlib import contextmanager
none_context = contextmanager(lambda: iter([None]))()
# <contextlib.GeneratorContextManager at 0x1021a0110>
with (open(f) if f is not None else none_context) as FILE:
    do_something(FILE)
It creates a context manager that yields None. The with statement binds FILE to either a file object or None, and in the None case the context manager still provides a proper __exit__.
Update
If you are using Python 3.7 or higher, then you can declare the null context manager for stand-in purposes in a much simpler way:
import contextlib
none_context = contextlib.nullcontext()
You can read more about these here:
https://docs.python.org/3.7/library/contextlib.html#contextlib.nullcontext
Since Python 3.7, you can also do
from contextlib import nullcontext
with (open(file) if file else nullcontext()) as FILE:
    # Do something with `FILE`
    pass
See the official documentation for more details.
This seems to solve all of your concerns.
if file_name is not None:
    with open(file_name) as fh:
        do_something(fh)
else:
    do_something(None)
something like:
if file:  # also false for None and other falsy values, so no need for "if file is None"
    with open(file) as FILE:
        do_something(FILE)
else:
    FILE = None
In Python 3.3 and above, you can use contextlib.ExitStack to handle this scenario nicely
import contextlib
with contextlib.ExitStack() as stack:
    FILE = stack.enter_context(open(f)) if f else None
    do_something(FILE)
Python 3.7 supports contextlib.nullcontext, which can be used to avoid creating your own dummy context manager.
This example shows how you can conditionally open a file or use stdout:
import contextlib
import sys
def write_to_file_or_stdout(data, filepath=None):
    # parameters with defaults must come after parameters without them
    with (
        open(filepath, 'w') if filepath is not None else
        contextlib.nullcontext(sys.stdout)
    ) as file_handle:
        file_handle.write(data)
contextlib.nullcontext() can be called without any arguments if the value can be None.
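A short sketch of both call forms, assuming Python 3.7 or newer. The names no_arg_value, marker and passed_value exist only for this illustration.

```python
from contextlib import nullcontext

# Without an argument, the with statement binds None.
with nullcontext() as handle:
    no_arg_value = handle

# With an argument, the with statement binds that same object.
marker = object()
with nullcontext(marker) as handle:
    passed_value = handle
```

In both cases leaving the block does nothing, so an object passed in (such as sys.stdout) is not closed.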
While all of the other answers are excellent, and preferable, note that the with expression may be any expression, so you can do:
with (open(file) if file is not None else None) as FILE:
    pass
Note that if the else clause is evaluated and yields None, this raises an exception, because NoneType does not implement the context manager protocol.
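The failure can be observed directly with a small sketch. The helper name try_none_as_context is hypothetical, and both exception types are caught because which one is raised depends on the Python version.

```python
def try_none_as_context():
    """Return the name of the exception raised by `with None`, if any."""
    try:
        with None as handle:
            pass
    except (AttributeError, TypeError) as exc:
        # None has neither __enter__ nor __exit__, so the with statement fails.
        return type(exc).__name__
    return None
```

The error is raised when the with statement is entered, before the body runs.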
I'm using Python, and would like to insert a string into a text file without deleting or copying the file. How can I do that?
Unfortunately there is no way to insert into the middle of a file without re-writing it. As previous posters have indicated, you can append to a file or overwrite part of it using seek but if you want to add stuff at the beginning or the middle, you'll have to rewrite it.
This is an operating system thing, not a Python thing. It is the same in all languages.
What I usually do is read from the file, make the modifications and write it out to a new file called myfile.txt.tmp or something like that. This is better than reading the whole file into memory because the file may be too large for that. Once the temporary file is completed, I rename it the same as the original file.
This is a good, safe way to do it because if the file write crashes or aborts for any reason, you still have your untouched original file.
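The approach described above could be sketched as follows. This is a rough illustration: insert_line is a hypothetical helper, the '.tmp' suffix is assumed to be free to use, and the file is assumed to end with a newline.

```python
import os


def insert_line(path, index, text):
    """Insert text as a new line before the line at 0-based position index."""
    tmp_path = path + '.tmp'  # assumption: no other file uses this name
    inserted = False
    with open(path) as src, open(tmp_path, 'w') as dst:
        for i, line in enumerate(src):
            if i == index:
                dst.write(text + '\n')
                inserted = True
            dst.write(line)
        if not inserted:
            # index is at or past the end of the file: append instead
            dst.write(text + '\n')
    # The original stays untouched until this final swap.
    os.replace(tmp_path, path)
```

Only one line is held in memory at a time, so the size of the file does not matter.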
Depends on what you want to do. To append you can open it with "a":
with open("foo.txt", "a") as f:
    f.write("new line\n")
If you want to prepend something you have to read from the file first:
with open("foo.txt", "r+") as f:
    old = f.read()  # read everything in the file
    f.seek(0)  # rewind
    f.write("new line\n" + old)  # write the new line before
The fileinput module of the Python standard library will rewrite a file inplace if you use the inplace=1 parameter:
import sys
import fileinput
# replace all occurrences of 'sit' with 'SIT' and insert a line after the 5th
for i, line in enumerate(fileinput.input('lorem_ipsum.txt', inplace=1)):
    sys.stdout.write(line.replace('sit', 'SIT'))  # replace 'sit' and write
    if i == 4: sys.stdout.write('\n')  # write a blank line after the 5th line
Rewriting a file in place is often done by saving the old copy with a modified name. Unix folks add a ~ to mark the old one. Windows folks do all kinds of things -- add .bak or .old -- or rename the file entirely or put the ~ on the front of the name.
import shutil
shutil.move(aFile, aFile + "~")
destination = open(aFile, "w")
source = open(aFile + "~", "r")
for line in source:
    destination.write(line)
    if <some condition>:
        destination.write(<some additional line> + "\n")
source.close()
destination.close()
Instead of shutil, you can use the following.
import os
os.rename(aFile, aFile + "~")
Python's mmap module will allow you to insert into a file. The following sample shows how it can be done in Unix (Windows mmap may be different). Note that this does not handle all error conditions and you might corrupt or lose the original file. Also, this won't handle unicode strings.
import os
from mmap import mmap
def insert(filename, str, pos):
    # on Python 3, str must be a bytes object
    if len(str) < 1:
        # nothing to insert
        return
    f = open(filename, 'r+')
    m = mmap(f.fileno(), os.path.getsize(filename))
    origSize = m.size()
    # or this could be an error
    if pos > origSize:
        pos = origSize
    elif pos < 0:
        pos = 0
    m.resize(origSize + len(str))
    m[pos+len(str):] = m[pos:origSize]
    m[pos:pos+len(str)] = str
    m.close()
    f.close()
It is also possible to do this without mmap with files opened in 'r+' mode, but it is less convenient and less efficient as you'd have to read and temporarily store the contents of the file from the insertion position to EOF - which might be huge.
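A minimal sketch of that 'r+' alternative, to make the trade-off concrete. The name insert_bytes is hypothetical, the data is assumed to be bytes, and pos is assumed to lie between 0 and the file size.

```python
def insert_bytes(filename, data, pos):
    """Insert data (bytes) at byte offset pos, shifting the rest of the file."""
    with open(filename, 'r+b') as f:
        f.seek(pos)
        tail = f.read()       # everything from pos to EOF is held in memory
        f.seek(pos)
        f.write(data + tail)  # overwrite from pos with the new data plus the old tail
```

The tail is the part that "might be huge": inserting near the start of a large file pulls almost the whole file into memory.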
As Adam mentioned, you have to take your system's limitations into account before deciding on an approach: do you have enough memory to read the whole file, replace parts of it, and rewrite it?
If you're dealing with a small file or have no memory issues this might help:
Option 1)
Read entire file into memory, do a regex substitution on the entire or part of the line and replace it with that line plus the extra line. You will need to make sure that the 'middle line' is unique in the file or if you have timestamps on each line this should be pretty reliable.
import re
# open file with r+b (allow write and binary mode)
f = open("file.log", 'r+b')
# read entire content of file into memory
f_content = f.read()
# basically match middle line and replace it with itself and the extra line
# (bytes patterns are needed because the file is opened in binary mode)
f_content = re.sub(rb'(middle line)', rb'\1\nnew line', f_content)
# return pointer to top of file so we can re-write the content with replaced string
f.seek(0)
# clear file content
f.truncate()
# re-write the content with the updated content
f.write(f_content)
# close file
f.close()
Option 2)
Figure out middle line, and replace it with that line plus the extra line.
# open file with r+b (allow write and binary mode)
f = open("file.log", 'r+b')
# get array of lines
f_content = f.readlines()
# get index of the middle line (integer division, so it can be used as an index)
middle_line = len(f_content) // 2
# insert the extra line right after the middle line
f_content.insert(middle_line + 1, b"new line\n")
# return pointer to top of file so we can re-write the content with replaced string
f.seek(0)
# clear file content
f.truncate()
# re-write the content with the updated content
f.write(b''.join(f_content))
# close file
f.close()
Wrote a small class for doing this cleanly.
import tempfile
class FileModifierError(Exception):
    pass
class FileModifier(object):
    def __init__(self, fname):
        self.__write_dict = {}
        self.__filename = fname
        # text mode, so the buffered lines are str like the inserted strings
        self.__tempfile = tempfile.TemporaryFile(mode='w+')
        with open(fname, 'r') as fp:
            for line in fp:
                self.__tempfile.write(line)
        self.__tempfile.seek(0)

    def write(self, s, line_number = 'END'):
        if line_number != 'END' and not isinstance(line_number, (int, float)):
            raise FileModifierError("Line number %s is not a valid number" % line_number)
        try:
            self.__write_dict[line_number].append(s)
        except KeyError:
            self.__write_dict[line_number] = [s]

    def writeline(self, s, line_number = 'END'):
        self.write('%s\n' % s, line_number)

    def writelines(self, s, line_number = 'END'):
        for ln in s:
            # write each item, not the whole sequence
            self.writeline(ln, line_number)

    def __popline(self, index, fp):
        try:
            ilines = self.__write_dict.pop(index)
            for line in ilines:
                fp.write(line)
        except KeyError:
            pass

    def close(self):
        self.__exit__(None, None, None)

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        with open(self.__filename,'w') as fp:
            for index, line in enumerate(self.__tempfile.readlines()):
                self.__popline(index, fp)
                fp.write(line)
            # 'END' cannot be sorted together with ints on Python 3, so handle it last
            end_lines = self.__write_dict.pop('END', [])
            for index in sorted(self.__write_dict):
                for line in self.__write_dict[index]:
                    fp.write(line)
            for line in end_lines:
                fp.write(line)
        self.__tempfile.close()
Then you can use it this way:
with FileModifier(filename) as fp:
    fp.writeline("String 1", 0)
    fp.writeline("String 2", 20)
    fp.writeline("String 3") # To write at the end of the file
If you know some unix you could try the following:
Notes: $ means the command prompt
Say you have a file my_data.txt with content as such:
$ cat my_data.txt
This is a data file
with all of my data in it.
Then using the os module you can use the usual sed commands
import os
# Identifiers used are:
my_data_file = "my_data.txt"
command = "sed -i 's/all/none/' " + my_data_file
# Execute the command
os.system(command)
If you aren't aware of sed, check it out, it is extremely useful.