I'm writing a function that recursively traverses the file system, and returns a list of all files with the .txt extension.
The pass_test_func parameter is a predicate that is run on each file's size (e.g. is the file larger than 100 bytes?). Its default, the nothing function, simply returns its argument.
My implementation:
def visit(dname, pass_test_func=nothing):
    directory = os.listdir(dname)
    byte_list = []
    for file in directory:
        file_dir = os.path.join(dname, file)
        if os.path.isfile(file_dir) and file_dir.lower().endswith('.txt'):
            size = os.path.getsize(file_dir)
            if pass_test_func(size):
                byte_list.append(str(size) + ' ' + file_dir)
        elif os.path.isdir(file_dir):
            visit(file_dir, pass_test_func)
    return byte_list
My problem is that when I recursively call visit in the following lines
        elif os.path.isdir(file_dir):
            visit(file_dir, pass_test_func)
the byte_list is cleared to empty again. I understand why this is happening, but have no idea how I would fix it. The list has to be defined within the definition of visit, so whenever I use recursion it will always be reset no matter what right? Maybe some other data structure is better suited, like a tuple or dictionary?
Your function returns byte_list, so add the returned list to byte_list when you make the recursive call, instead of throwing it away as you currently do:
        elif os.path.isdir(file_dir):
            byte_list += visit(file_dir, pass_test_func)
Add an optional argument that can be used in the recursive case:
# Using * makes byte_list keyword-only, so it can't be passed by normal callers by accident
def visit(dname, pass_test_func=nothing, *, byte_list=None):
    directory = os.listdir(dname)
    # When not passed explicitly, initialize as empty list
    if byte_list is None:
        byte_list = []
    for file in directory:
        file_dir = os.path.join(dname, file)
        if os.path.isfile(file_dir) and file_dir.lower().endswith('.txt'):
            size = os.path.getsize(file_dir)
            if pass_test_func(size):
                byte_list.append(str(size) + ' ' + file_dir)
        elif os.path.isdir(file_dir):
            # Pass explicitly to recursive call
            visit(file_dir, pass_test_func, byte_list=byte_list)
    return byte_list
As an alternative, as suggested by Blorgbeard, since you return byte_list, use that for your visit calls, changing only a single line in your original code:
visit(file_dir, pass_test_func)
to:
byte_list += visit(file_dir, pass_test_func)
This creates additional temporary lists, but that's usually not a big deal.
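For comparison, the standard library's os.walk performs the directory recursion itself, so no list has to be threaded through recursive calls. The following is only a sketch: the name visit_walk is hypothetical, and its default predicate accepts every size (unlike the nothing default in the question, which would reject empty files).

```python
import os


def visit_walk(dname, pass_test_func=lambda size: True):
    """Collect 'size path' strings for every .txt file under dname."""
    byte_list = []
    # os.walk yields (directory, subdirectory names, file names) per level
    for root, _dirs, files in os.walk(dname):
        for name in files:
            if not name.lower().endswith('.txt'):
                continue
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if pass_test_func(size):
                byte_list.append(f'{size} {path}')
    return byte_list
```

A call such as visit_walk('.', lambda size: size > 100) would then keep only the .txt files larger than 100 bytes.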
I'm not sure why I'm seeing this error message: AttributeError: 'generator' object has no attribute 'replace' (on the line: modified_file = hex_read_file.replace(batch_to_amend, batch_amendment)).
import binascii, os, re, time
os.chdir(...)
files_to_amend = os.listdir(...)
joiner = "00"
# Allow user to input the text to be replaced, and with what
while True:
    batch_to_amend3 = input("\n\nWhat number would you like to amend? \n\n >>> ")
    batch_amendment3 = input("\n\nWhat is the new number? \n\n >>> ")
    batch_to_amend2 = batch_to_amend3.encode()
    batch_to_amend = joiner.encode().join(binascii.hexlify(bytes((i,))) for i in batch_to_amend2)
    batch_amendment2 = batch_amendment3.encode()
    batch_amendment = joiner.encode().join(binascii.hexlify(bytes((i,))) for i in batch_amendment2)
    # Function to translate label files
    def lbl_translate(files_to_amend):
        with open(files_to_amend, 'rb') as read_file:
            read_file2 = read_file.read()
            hex_read_file = (binascii.hexlify(bytes((i,))) for i in read_file2)
            print(hex_read_file)
            modified_file = hex_read_file.replace(batch_to_amend, batch_amendment)
        with open(files_to_amend, 'wb') as write_file:
            write_file.write(modified_file)
            write_file.close()
            print("Amended: " + files_to_amend)
    # Calling function to modify labels
    for label in files_to_amend:
        lbl_translate(label)
hex_read_file is a generator expression (note the round brackets around the statement), defined here:
hex_read_file = (binascii.hexlify(bytes((i,))) for i in read_file2)
As many already pointed out in the comments, generators don't have a replace method as strings and bytes do, so you have two possibilities, depending on your specific use case:
Turn the generator into a bytestring and call replace on that (considering how you use write_file.write(modified_file) afterwards, this is the option that works with it directly):
hex_read_file = b''.join(binascii.hexlify(bytes((int(i),))) for i in read_file2) # note: I added the additional int() call to fix the issue mentioned in the comments
Filter and replace directly in the comprehension (and modify how you write out the result):
def lbl_translate(files_to_amend, replacement_map):
    with open(files_to_amend, 'rb') as read_file:
        read_file2 = read_file.read()
    hex_read_file = (replacement_map.get(binascii.hexlify(bytes((int(i),))), binascii.hexlify(bytes((int(i),)))) for i in read_file2) # see Note below
    with open(files_to_amend, 'wb') as write_file:
        for b in hex_read_file:
            write_file.write(b)
    print("Amended: " + files_to_amend)
where replacement_map is a dict that you fill in with batch_to_amend as the key and batch_amendment as the value (you can specify multiple amendments too and it will work just the same). The call would then be:
for label in files_to_amend:
    lbl_translate(label, {batch_to_amend: batch_amendment})
NOTE: With a standard Python dict, because of how comprehensions work, you need to call binascii.hexlify(bytes((int(i),))) twice here.
A better option would use collections.defaultdict, if it were implemented in a sensible way (see here for more context on why I say that). A defaultdict expects a factory with no parameters that generates the value for unknown keys, so it cannot return the missing key itself. Instead, you need to create your own subclass of dict and implement the __missing__ method to obtain the desired behaviour:
hex_read_file = (replacement_map[binascii.hexlify(bytes((int(i),)))] for i in read_file2) # replacement_map is a dict subclass that defines __missing__
and you define replacement_map as:
class dict_with_key_as_default(dict): # find a better name for the type
    def __missing__(self, key):
        '''if a value is not in the dictionary, return the key value instead.'''
        return key
replacement_map = dict_with_key_as_default()
replacement_map[batch_to_amend] = batch_amendment
for label in files_to_amend:
    lbl_translate(label, replacement_map)
(class dict_with_key_as_default taken from this answer and renamed for clarity)
Edit note: As mentioned in the comments, the OP has an error in the comprehension where they call hexlify() on some binary string instead of integer values. The solution adds a cast to int for the bytes where relevant, but it's far from the best solution to this problem. Since the OP's intent is not clear, I left it as close to the original as possible, but an alternative solution should be used instead.
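To make the difference between the generator and the joined bytestring concrete, here is a small self-contained sketch. The sample data b'AB' and the class name KeyAsDefault are illustrative assumptions, not taken from the question.

```python
import binascii

data = b'AB'  # stand-in for the file contents

# A generator expression has no replace method
gen = (binascii.hexlify(bytes((i,))) for i in data)
has_replace = hasattr(gen, 'replace')

# Joining the pieces gives a bytes object, which does have replace
joined = b''.join(binascii.hexlify(bytes((i,))) for i in data)
replaced = joined.replace(b'41', b'5a')


class KeyAsDefault(dict):
    """dict that returns the key itself when the key is missing."""

    def __missing__(self, key):
        return key


# Per-byte mapping: b'41' is replaced, every other byte passes through
replacement_map = KeyAsDefault({b'41': b'5a'})
mapped = b''.join(replacement_map[binascii.hexlify(bytes((i,)))] for i in data)

print(has_replace, joined, replaced, mapped)
```

Note that the per-byte mapping can only replace single bytes, whereas replace on the joined bytestring can match longer sequences.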
Beginner issue here. I am sure there is a more Pythonic way of doing this. Basically, I need to create a string from the contents of a list and a dict, and I need to insert constants. I am trying to produce an end product for a function call via eval. My current code works, but it smells of 1980s BASIC (which ages me). I've looked at .join, zip and itertools, but with no luck.
The list (argumentlist) contains argument names (such as open, close, length) and the dict (self.indicator_paremeters) contains all potential argument names along with their default value. So for example within the dict there is a key 'length' and its default value. In addition I need to concatenate '+' and commas, to create the end string.
Here is code sample to date.
def loop_thru_arguement_list_to_create_end_string(self, argument, resultant_argument_string):
    if resultant_argument_string == "":
        resultant_argument_string = argument + ' = ' + str(self.indicator_paremeters.get(
            argument))
    else:
        resultant_argument_string = resultant_argument_string + ", " + argument + ' = ' + str(
            self.indicator_paremeters.get(
                argument))
    return resultant_argument_string
This function is called from the loop here (need to rename that function):
def extract_argument_list_from_function(self, fullarguments) -> str:
    resultant_argument_string = ""
    argumentlist = fullarguments[0]
    for argument in argumentlist:
        resultant_argument_string = self.loop_thru_arguement_list_to_create_end_string(argument,
                                                                                       resultant_argument_string)
    return resultant_argument_string
fullarguments is the FullArgSpec named tuple returned by an inspect.getfullargspec call. I only want the args, hence the [0].
All methods above are in a wider class.
self.indicator_paremeters is the dict holding all potential arguments.
The code above works fine but just doesn't feel right. It's the if statement in particular that doesn't feel Pythonic.
In my opinion you just need this:
def extract_argument_list_from_function(self, fullarguments) -> str:
    res = ''
    for arg in fullarguments[0]:
        param = self.indicator_paremeters.get(arg)
        res = f'{res}, {arg} = {param}' if res else f'{arg} = {param}'
    return res
You can then delete the loop_thru_arguement_list_to_create_end_string method.
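Since the question mentions .join, it may help to see that str.join removes the need for the empty-string check altogether, because it only places the separator between items. This is a sketch written as a plain function; format_arguments and the sample defaults dict are hypothetical stand-ins for the method and self.indicator_paremeters.

```python
def format_arguments(arg_names, defaults):
    """Build 'name = value' pairs separated by ', '."""
    # dict.get returns None for names that have no default
    return ', '.join(f'{name} = {defaults.get(name)}' for name in arg_names)


defaults = {'open': None, 'close': None, 'length': 14}
result = format_arguments(['close', 'length'], defaults)
print(result)  # close = None, length = 14
```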
I have three similar functions in tld_list.py. I am working out of mainBase.py file.
I am trying to build a string variable that names the function to call, looping through a list of all the function names. My code reads from a list of function names, iterates through the list, and runs the named function on each iteration. Each function returns 10 pieces of information from separate websites.
I have tried 2 variations annotated as Option A and Option B below
# This is mainBase.py
import tld_list # I use this in conjunction with Option A
from tld_list import * # I use this with Option B
functionList = ["functionA", "functionB", "functionC"]
tldIterator = 0
while tldIterator < len(functionList):
    # This will determine which function is called first
    # In the first case, the function is functionA
    currentFunction = str(functionList[tldIterator])
    # Option A
    currentFunction = "tld_list." + currentFunction
    websiteName = currentFunction(x, y)
    print(websiteName[1])
    print(websiteName[2])
    ...
    print(websiteName[10])
    # Option B
    websiteName = currentFunction(x, y)
    print(websiteName[1])
    print(websiteName[2])
    ...
    print(websiteName[10])
Even though it is not seen, I continue to loop through the iteration by ending each loop with tldIterator += 1
Both options fail for the same reason stating TypeError: 'str' object is not callable
I am wondering what I am doing wrong, or if it is even possible to call a function in a loop with a variable
You have the function names but what you really want are the function objects bound to those names in tld_list. Since function names are attributes of the module, getattr does the job. Also, it seems like list iteration rather than keeping track of your own tldIterator index would suffice.
import tld_list
function_names = ["functionA", "functionB", "functionC"]
functions = [getattr(tld_list, name) for name in function_names]
for fctn in functions:
    website_name = fctn(x, y)
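Because tld_list is not available here, the same getattr lookup can be tried against a standard-library module. In this sketch, math stands in for tld_list and the function names are chosen only for illustration.

```python
import math

# Names as strings, exactly as they would come from a config or a list
function_names = ["sqrt", "floor", "ceil"]

# getattr turns each name into the function object bound to it in the module
functions = [getattr(math, name) for name in function_names]

# The objects are callable; the original strings are not
results = [fn(10.5) for fn in functions]
print(results)
```

A misspelled name raises AttributeError at lookup time, which is usually easier to diagnose than a failure inside a string-based call.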
You can create a dictionary to provide a name to function conversion:
def funcA(...): pass
def funcB(...): pass
def funcC(...): pass
func_find = {"Huey": funcA, "Dewey": funcB, "Louie": funcC}
Then you can call them, e.g.
result = func_find["Huey"](...)
You should avoid this type of code. Try using ifs or function references instead. But you can try the following (with eval rather than exec, because exec always returns None):
websiteName = eval('{}(x, y)'.format(currentFunction))
UPDATED:
How can I use a function variable within nested function? I've simplified my problem in the following example:
def build():
n = 1 # code to parse rpm package minor version from file
f = min_ver(n) # update minor version
return
def min_ver(n):
n = 2 # this is defined by another process, not set intentionally
s = 1 + n # still need the original value from build()
return s
The actual use case is that I'm grabbing a parsed rpm package minor version value in ex1() from disk called 'n'. When ex2() is executed from ex1(), it deletes the old package, builds a new rpm package with a new minor version. So when it calls for ex1()'s value within the nested function, it's not changed to the new version.
How can I maintain the original 'n' value within the nested function, before passing onto a new value of 'n' post nested function?
A simple way to do this would be to pass the variable as an argument to the second function.
def build():
    n = int(1)
    f = min_ver(n) # pass the value to the next function
    n = int(5)
    return
def min_ver(n_old):
    n = 2
    s = 1 + n_old # use the n that was passed in
    return s
If you make ex2() actually nested, then it can read the outer function's variables.
def ex1():
    n = int(1)
    def ex2():
        s = 1 + n
        return s
    f = ex2()
    n = int(5) # done with nested value, rewrite new value
    return()
Also, you probably want to return f or n instead of an empty tuple, I would imagine.
And you don't need to say int(1); you can just say 1. Everything, including integers and strings, is an object in Python.
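One detail of the nested version is worth seeing in isolation: a nested function looks up the outer variable when it is called, not when it is defined. The sketch below uses the hypothetical name bump for the inner function.

```python
def build():
    n = 1

    def bump():
        # n is looked up in build()'s scope each time bump() runs
        return 1 + n

    before = bump()  # n is still 1 here
    n = 5            # rebinding n changes what later calls see
    after = bump()
    return before, after


print(build())  # (2, 6)
```

So if the original value is needed, either call the nested function before reassigning n, or pass the value in as an argument as in the first approach.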
I am trying to make a decrypter that decrypts code from the encrypter I made. I am getting this TypeError when I run the code, though:
getcrypt = ''.join(map(Decrypt.get,split_up_into_sixteen_chars(x_str)))
TypeError: split_up_into_sixteen_chars() takes 0 positional arguments but 1 was given
I'm fairly new to programming and not sure what's causing this.
Here's my code:
Decrypt = {'1s25FF5ML10IF7aC' : 'A', '1s2afF5ML10I7ac' : 'a'} #I obviously have more than this but I'm trying to make it as simplified as possible
def split_up_into_sixteen_chars():
    while len(x_str)>0:
        v = x_str[:16]
        print(v)
x_str = (input())
getcrypt = ''.join(map(Decrypt.get,split_up_into_sixteen_chars(x_str)))
print(getcrypt)
You have defined a function that takes no parameters:
def split_up_into_sixteen_chars():
yet you are passing it one:
split_up_into_sixteen_chars(x_str)
You need to tell Python that the function takes one parameter here, and name it:
def split_up_into_sixteen_chars(x_str):
The name used does not have to match the name that you pass in for the function call, but it does have to match what you use inside the function. The following function would also work; all I did was rename the parameter:
def split_up_into_sixteen_chars(some_string):
    while len(some_string) > 0:
        v = some_string[:16]
        print(v)
This works because the parameter some_string becomes a local name, local to the function. It only exists inside of the function, and is gone again once the function completes.
Note that your function creates an infinite loop; the length of some_string will either always be 0, or always be longer than 0. The length does not change in the body of the loop.
The following would work better:
def split_up_into_sixteen_chars(some_string):
    while len(some_string) > 0:
        v = some_string[:16]
        print(v)
        some_string = some_string[16:]
because then we replace some_string with a shorter version of itself each time.
Your next problem is that the function doesn't return anything; Python then takes a default return value of None. Printing is something else entirely, print() writes the data to your console or IDE, but the caller of the function does not get to read that information.
In this case, you really want a generator function, and use yield. Generator functions return information in chunks; you can ask a generator for the next chunk one by one, and that is exactly what map() would do. Change the function to:
def split_up_into_sixteen_chars(some_string):
    while len(some_string) > 0:
        v = some_string[:16]
        yield v
        some_string = some_string[16:]
or even:
def split_up_into_sixteen_chars(some_string):
    while some_string:
        yield some_string[:16]
        some_string = some_string[16:]
because an empty string is 'false-y' when it comes to boolean tests as used by while and if.
As your map(Decrypt.get, ...) stands, if split_up_into_sixteen_chars() yields anything that is not present as a key in Decrypt, a None is produced (the default value for dict.get() if the key is not there), and ''.join() won't like that. The latter method can only handle strings.
One option would be to return a string default instead:
''.join(map(lambda chunk: Decrypt.get(chunk, ''), split_up_into_sixteen_chars(x_str)))
Now '', the empty string, is returned for chunks that are not present in Decrypt. This makes the whole script work for whatever string input you have:
>>> x_str='Hello world!'
>>> ''.join(map(lambda chunk: Decrypt.get(chunk, ''), split_up_into_sixteen_chars(x_str)))
''
>>> x_str = '1s25FF5ML10IF7aC'
>>> ''.join(map(lambda chunk: Decrypt.get(chunk, ''), split_up_into_sixteen_chars(x_str)))
'A'
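As a variation on the generator above, the chunking can also be written with range and a step, which avoids rebuilding the remaining string on every iteration. This is only a sketch; chunks, decrypt and encoded are illustrative names, and the one-entry table is a cut-down stand-in for the full Decrypt dict.

```python
def chunks(text, size=16):
    """Yield consecutive slices of text that are size characters long."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


decrypt = {'1s25FF5ML10IF7aC': 'A'}
encoded = '1s25FF5ML10IF7aC' * 2

# Unknown chunks map to '' so that str.join only ever sees strings
decoded = ''.join(decrypt.get(chunk, '') for chunk in chunks(encoded))
print(decoded)  # AA
```

The last chunk is simply shorter when the input length is not a multiple of size.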