I have a directory of CSV files that I want to import into MySQL. There are about 100 files, and doing a manual import is painful.
My command line is this:
mysqlimport -u root -ppassword --local --fields-terminated-by="|" data PUBACC_FR.dat
The files are all of the form XX.dat, e.g. AC.dat, CP.dat, etc. I actually rename them first before processing them (via rename 's/^/PUBACC_/' *.dat). Ideally I'd like to accomplish both tasks in one script: rename the files, then run the command.
From what I've found while reading, something like this should handle the renaming:
import os

for filename in os.listdir("."):
    if filename.endswith(".dat"):
        os.rename(filename, "PUBACC_" + filename)  # prepend the prefix, like rename 's/^/PUBACC_/'
Can anyone help me get started with a script that will accomplish this, please? Read the file names, rename them, then for each one run the mysqlimport command?
Thanks!
I suppose something like the Python code below could be used:
import subprocess
import os

if __name__ == "__main__":
    for f in os.listdir("."):
        if f.endswith(".dat"):
            subprocess.call("echo %s" % f, shell=True)
Obviously, you should change the command from echo to your mysqlimport command.
See http://docs.python.org/2/library/subprocess.html for more details on using subprocess.
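Putting both steps together, an untested sketch along these lines should rename each XX.dat and import it in one pass (it reuses the exact flags from your command line; mysqlimport derives the table name from the file name, which is why the prefix matters):
import os
import subprocess

for f in os.listdir("."):
    if f.endswith(".dat") and not f.startswith("PUBACC_"):
        renamed = "PUBACC_" + f
        os.rename(f, renamed)  # PUBACC_FR.dat imports into table PUBACC_FR
        subprocess.call(["mysqlimport", "-u", "root", "-ppassword",
                         "--local", "--fields-terminated-by=|",
                         "data", renamed])
Passing the arguments as a list avoids any shell quoting issues with the | delimiter.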
I wrote a Python script that executes some DB queries and saves the results to different CSV files.
The files have to be renamed with the production timestamp, so every hour I get a new file with a new name.
The script runs on a task scheduler every hour, and after saving my CSV files I need to automatically open the command prompt and execute a command that includes the CSV file name in its path.
Is it possible to run the cmd and pass it the path of the CSV file as a variable? In Python I save the file like this:
date_time_str_csv1 = now.strftime("%Y%m%d_%H%M_csv1")
I don't know how to pass the changing file name automatically when I call the cmd.
If I understand your question correctly, one solution would be to simply execute the command-line command directly from the Python script.
You can use the subprocess module from Python (as also explained here: How do I execute a program or call a system command?).
For example, this could look as follows:
csv_file_name = date_time_str_csv1 + ".csv"
subprocess.run(["cat", csv_file_name])
You can run a system cmd from within Python using os.system:
import os
os.system('command filename.csv')
Since the argument to os.system is a string, you can build it with your created filename above.
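For example, with the timestamped name built the same way as in the question (command here is just a placeholder for your real executable):
import os
from datetime import datetime

now = datetime.now()
date_time_str_csv1 = now.strftime("%Y%m%d_%H%M_csv1")
os.system('command %s.csv' % date_time_str_csv1)  # splice the fresh name into the command string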
You can try using the subprocess library to get a list of the files in the folder. This example uses the Linux shell:
import subprocess

output = subprocess.check_output('ls', shell=True)
arr = output.decode('utf-8').split('\n')
print(arr)
After this you can iterate to find the newest file and use that one as the variable.
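For instance, instead of parsing the ls output, the newest file can be picked directly with os.path.getmtime (a sketch, assuming "newest" means most recently modified):
import os

files = [f for f in os.listdir('.') if os.path.isfile(f)]
newest = max(files, key=os.path.getmtime)  # the file with the latest modification time
print(newest)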
I am a newbie to Python and Linux. I want a solution for listing files and folders based on their timestamp. I know this has already been asked, but I cannot get insight into what I am doing. I want the code to return the latest created file or folder. I have a script that identifies the type of content (file or folder), and I need it to get the latest created content. The content identifier goes like this:
import os

dirlist = []
dirlist2 = []
for filename in os.listdir('/var/www/html/secure_downloads'):
    if os.path.isdir(os.path.join('/var/www/html/secure_downloads', filename)):
        dirlist.append(filename)
    else:
        dirlist2.append(filename)
print "For Folders", dirlist
print "For Files", dirlist2
Option 1: Use glob
I found this great article here: https://janakiev.com/blog/python-filesystem-analysis/
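A minimal sketch of the glob idea: list the directory and take the entry with the newest creation timestamp (on Linux, st_ctime is really the inode change time, which is usually the closest thing to a creation time):
import glob
import os

paths = glob.glob('/var/www/html/secure_downloads/*')
latest = max(paths, key=os.path.getctime)  # entry with the newest ctime
print(latest)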
Option 2: Pipe the output of ls -l to Python
The way I initially thought about solving this was the following:
You can list all the directories and their timestamps with ls -l.
Then you can pipe that output using subprocess like this:
import subprocess

# run `ls -l` on the directory and capture its stdout
proc = subprocess.Popen('ls -l /var/www/html/secure_downloads', shell=True, stdout=subprocess.PIPE)
output = proc.communicate()[0]
print(output)
I am trying, in a Python script, to import a tar.gz file from HDFS and then untar it. The file comes as follows: 20160822073413-EoRcGvXMDIB5SVenEyD4pOEADPVPhPsg.tar.gz, and it always has the same structure.
In my Python script, I would like to copy it locally and then extract the file. I am using the following commands to do this:
import subprocess
import os
import datetime
import time
today = time.strftime("%Y%m%d")
#Copy tar file from HDFS to local server
args = ["hadoop","fs","-copyToLocal", "/locationfile/" + today + "*"]
p=subprocess.Popen(args)
p.wait()
#Untar the CSV file
args = ["tar","-xzvf",today + "*"]
p=subprocess.Popen(args)
p.wait()
The import works perfectly, but I am not able to extract the file; I get the following error:
['tar', '-xzvf', '20160822*.tar']
tar (child): 20160822*.tar: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
put: `reportResults.csv': No such file or directory
Can anyone help me?
Thanks a lot!
Try with the shell option:
p=subprocess.Popen(args, shell=True)
From the docs:
If shell is True, the specified command will be executed through the
shell. This can be useful if you are using Python primarily for the
enhanced control flow it offers over most system shells and still want
convenient access to other shell features such as shell pipes,
filename wildcards, environment variable expansion, and expansion of ~
to a user’s home directory.
And notice:
However, note that Python itself offers implementations of many
shell-like features (in particular, glob, fnmatch, os.walk(),
os.path.expandvars(), os.path.expanduser(), and shutil).
In addition to @martriay's answer, you also have a typo: you wrote "20160822*.tar", while your file's pattern is "20160822*.tar.gz".
When applying shell=True, the command should be passed as a whole string (see documentation), like so:
p=subprocess.Popen('tar -xzvf 20160822*.tar.gz', shell=True)
If you don't need p, you can simply use subprocess.call:
subprocess.call('tar -xzvf 20160822*.tar.gz', shell=True)
But I suggest you use more standard libraries, like so:
import glob
import tarfile

today = "20160822"    # compute your common prefix here
target_dir = "/tmp"   # choose wherever you want to extract the content
for targz_file in glob.glob('%s*.tar.gz' % today):
    with tarfile.open(targz_file, 'r:gz') as opened_targz_file:
        opened_targz_file.extractall(target_dir)
I found a way to do what I needed: instead of using an os command, I used Python's tarfile module and it works!
import os
import tarfile
import glob

os.chdir("/folder_to_scan/")
for file in glob.glob("*.tar.gz"):
    print(file)
    tar = tarfile.open(file)
    tar.extractall()
    tar.close()
Hope this helps.
Regards
Majid
The short of it is that I need a program to upload all txt files from a local directory via SFTP to a specific remote directory. If I run mput *.txt from the sftp command line, while I'm already in the right local directory, then that is what I was shooting for.
Here is the code I'm trying. No errors when I run it, but no results either: when I sftp to the server and ls the upload directory, it's empty. I may be barking up the wrong tree altogether. I see other solutions like lftp using mget in bash, but I really want this to work with Python. Either way I have a lot to learn still. This is what I've come up with after a few days of reading what some Stack Overflow users suggested, and a few libraries that might help. I'm not sure I can do the "for i in allfiles:" with subprocess.
import os
import glob
import subprocess

os.chdir('/home/submitid/Local/Upload')  # change pwd so I can use mget *.txt and glob similarly
pwd = '/Home/submitid/Upload'  # remote directory to upload all txt files to
allfiles = glob.glob('*.txt')  # get a list of txt files in lpwd
target = "user#sftp.com"
sp = subprocess.Popen(['sftp', target], shell=False, stdin=subprocess.PIPE)
sp.stdin.write("chdir %s\n" % pwd)  # change directory to pwd
for i in allfiles:
    sp.stdin.write("put %s\n" % allfiles)  # for each file in allfiles, do a put %filename to pwd
sp.stdin.write("bye\n")
sp.stdin.close()
When you iterate over allfiles, you are not passing the iteration variable to sp.stdin.write, but allfiles itself. It should be:
for i in allfiles:
    sp.stdin.write("put %s\n" % i)  # for each file in allfiles, do a put %filename to pwd
You may also need to wait for sftp to authenticate before issuing commands. You could read stdout from the process, or just put some time.sleep delays in your code.
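One way to avoid hand-tuned delays is to build the whole command batch first and hand it to communicate(), which writes stdin, closes it, and waits for sftp to exit (a sketch, assuming the same target, pwd and allfiles as in the question, and key-based authentication so no password prompt appears):
import subprocess

batch = "chdir %s\n" % pwd
for name in allfiles:
    batch += "put %s\n" % name
batch += "bye\n"

sp = subprocess.Popen(['sftp', target], stdin=subprocess.PIPE,
                      universal_newlines=True)
sp.communicate(batch)  # feeds the batch to sftp and waits for it to finish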
But why not just use scp and build the full command line, then check if it executes successfully? Something like:
result = os.system('scp %s %s:%s' % (' '.join(allfiles), target, pwd))
if result != 0:
    print 'error!'
You don't need to iterate over allfiles:
sp.stdin.write("put *.txt\n")
is enough. You instruct sftp to put all the files at once, instead of one by one.
In my current working directory I have the directory ROOT/ with some files inside.
I know I can run cp -r ROOT/* /dst with no problems.
But if I open my Python console and write this:
import subprocess
subprocess.call(['cp', '-r', 'ROOT/*', '/dst'])
It doesn't work!
I have this error: cp: cannot stat ROOT/*: No such file or directory
Can you help me?
Just came across this while trying to do something similar.
The * will not be expanded to filenames
Exactly. If you look at the man page of cp, you can call it with any number of source arguments, and you can easily change the order of the arguments with the -t switch.
import glob
import subprocess
subprocess.call(['cp', '-rt', '/dst'] + glob.glob('ROOT/*'))
Try
subprocess.call('cp -r ROOT/* /dst', shell=True)
Note the use of a single string rather than an array here.
Or build up your own implementation with listdir and copy
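That last suggestion could look roughly like this (a sketch with shutil, using the same ROOT and /dst paths):
import os
import shutil

src = 'ROOT'
dst = '/dst'
for name in os.listdir(src):
    path = os.path.join(src, name)
    if os.path.isdir(path):
        # copytree requires that the destination subdirectory not exist yet
        shutil.copytree(path, os.path.join(dst, name))
    else:
        shutil.copy(path, dst)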
The * will not be expanded to filenames. That expansion is a function of the shell, so as written you are actually asking cp to copy a file literally named *. Use subprocess.call() with the parameter shell=True.
Provide the command as a single list, not as a string plus a list.
The following two commands are the same:
First command:
test = subprocess.Popen(['rm', 'aa', 'bb'])
Second command:
list1 = ['rm', 'aa', 'bb']
test = subprocess.Popen(list1)
So to copy multiple files, you need to get the list of files using glob, then add 'cp' to the front of the list and the destination to the end, and pass the list to subprocess.Popen().
Like:
import glob

list1 = glob.glob("*.py")
list1 = ['cp'] + list1 + ['/home/rahul']
xx = subprocess.Popen(list1)
It will do the job.