I have working code that prints random lines from a CSV column.
#!/usr/bin/python
import csv
from random import shuffle
filename = 'example.csv'
col = 2
sample = 100
with open(filename, 'r') as f:
    reader = csv.reader(f)
    data = [row[col] for row in reader]
shuffle(data)
print '\n'.join(data[:sample])
How can I parameterize this script by passing filename, col & sample (e.g. 100 values)?
Use the argparse module:
The argparse module makes it easy to write user-friendly command-line
interfaces. The program defines what arguments it requires, and
argparse will figure out how to parse those out of sys.argv. The
argparse module also automatically generates help and usage messages
and issues errors when users give the program invalid arguments.
It's pretty powerful: you can specify help messages, add validation, provide defaults, and do most other things you might need when working with command-line arguments.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-p", "--position", type=int)
parser.add_argument("-s", "--sample", type=int)
args = parser.parse_args()
col = args.position
sample = args.sample
print col
print sample
Here's what it looks like on the command line:
$ python test.py --help
usage: test.py [-h] [-p POSITION] [-s SAMPLE]
optional arguments:
  -h, --help            show this help message and exit
  -p POSITION, --position POSITION
  -s SAMPLE, --sample SAMPLE
$ python test.py -p 10 -s 100
10
100
$ python test.py --position 10 --sample 100
10
100
Speaking about the code you've provided:
unused import random statement
move from random import shuffle to the top of the script
no need to call f.close() (especially with ;) - with handles closing the file automagically
Here's what the code would look like after the fixes:
#!/usr/bin/python
import argparse
import csv
from random import shuffle
parser = argparse.ArgumentParser()
parser.add_argument("-p", "--position", type=int)
parser.add_argument("-s", "--sample", type=int)
args = parser.parse_args()
with open('<filename>', 'r') as f:
    reader = csv.reader(f)
    data = [row[args.position] for row in reader]
shuffle(data)
print '\n'.join(data[:args.sample])
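The question also asks for the filename to be a parameter. A minimal Python 3 sketch of that, assuming the filename is passed as a positional argument and that the defaults (column 0, 100 values) are acceptable; the function names build_parser, sample_column and main are made up for illustration:

```python
import argparse
import csv
import random


def build_parser():
    parser = argparse.ArgumentParser(
        description="Print random values from one column of a CSV file.")
    # Positional argument: the CSV file to read.
    parser.add_argument("filename", help="path to the CSV file")
    parser.add_argument("-p", "--position", type=int, default=0,
                        help="zero-based column index (default: 0)")
    parser.add_argument("-s", "--sample", type=int, default=100,
                        help="number of values to print (default: 100)")
    return parser


def sample_column(path, position, size):
    with open(path, newline="") as f:
        data = [row[position] for row in csv.reader(f)]
    # Never ask for more values than the column holds.
    return random.sample(data, min(size, len(data)))


def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    print("\n".join(sample_column(args.filename, args.position, args.sample)))
```

To run it as a script, call main() under the usual if __name__ == "__main__": guard, then invoke it as python test.py example.csv -p 2 -s 100.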
You can use the sys module like this to pass command line arguments to your Python script.
import sys
name_of_script = sys.argv[0]
position = int(sys.argv[1])  # arguments arrive as strings, so convert them
sample = int(sys.argv[2])
and then your command line would be...
./myscript.py 10 100
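One thing to watch with sys.argv is that every element is a string and nothing checks how many arguments were given. A small Python 3 sketch of handling both, assuming exactly two integer arguments; the helper name parse_argv is hypothetical:

```python
def parse_argv(argv):
    # argv[0] is the script name; the real arguments start at index 1.
    if len(argv) != 3:
        raise SystemExit("usage: {} POSITION SAMPLE".format(argv[0]))
    # Convert the strings before using them as a column index or slice bound.
    return int(argv[1]), int(argv[2])
```

In the script you would then write import sys followed by position, sample = parse_argv(sys.argv).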
I could not install argparse because I lack the permissions, so in the meantime I used optparse.
Here's the solution:
#!/usr/bin/python
import csv  # helps us read CSV-formatted files
import sys
import optparse
from random import shuffle
parser = optparse.OptionParser()
# define the options
parser.add_option('-f', '--filename', help='Pass the csv filename')
parser.add_option('-p', '--position', help='column position in the file', type=int)
parser.add_option('-s', '--sample', help='sample size', type=int)
(opts, args) = parser.parse_args()  # parse the command line
# Program to select random values
with open('<filepath>' + opts.filename, 'r') as f:
    reader = csv.reader(f)
    data = [row[opts.position] for row in reader]
shuffle(data)
#print '\n'.join(data[:opts.sample])
# create the output file; the with block closes it for us
with open("<opfilename>.txt", "w") as out:
    out.write('\n'.join(data[:opts.sample]))
This is a simple ask but I can't find any information on how to do it: I have a python script that is designed to take in a text file of a specific format and perform functions on it--how do I pipe a test file into the python script such that it is recognized as input()? More specifically, the Python is derived from skeleton code I was given that looks like this:
def main():
    N = int(input())
    lst = [[int(i) for i in input().split()] for _ in range(N)]
    intervals = solve(N, lst)
    print_solution(intervals)
if __name__ == '__main__':
    main()
I just need to understand how to, from the terminal, input one of my test files to this script (and see the print_solution output)
Use the fileinput module
input.txt
...input.txt contents
script.py
#!/usr/bin/python3
import fileinput
def main():
    for line in fileinput.input():
        print(line, end='')  # each line already ends with a newline
if __name__ == '__main__':
    main()
pipe / input examples:
$ cat input.txt | ./script.py
...input.txt contents
$ ./script.py < input.txt
...input.txt contents
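Note that the skeleton from the question should not need changing for this: input() reads from standard input, so the same pipe or redirect feeds it. A short Python 3 sketch that simulates the redirect with an in-memory file; the function name read_problem and the sample data are made up for illustration:

```python
import io
import sys


def read_problem():
    # input() reads one line at a time from standard input, so the data can
    # come from the keyboard, a pipe, or a redirected file.
    n = int(input())
    return [[int(i) for i in input().split()] for _ in range(n)]


# Simulate `python script.py < test.txt` by swapping stdin for an in-memory file.
original_stdin = sys.stdin
sys.stdin = io.StringIO("2\n1 3\n2 5\n")
try:
    rows = read_problem()
finally:
    sys.stdin = original_stdin
print(rows)
```

With a real test file you would skip the StringIO part and just run python script.py < test.txt.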
You can read an absolute or relative path with input() and then open that path with open():
filename = input('Please input absolute filename: ')
with open(filename, 'r') as file:
    # Do your stuff
    pass
Please let me know if I misunderstood your question.
You can either:
A) Use sys.stdin (import sys at the top of course)
or
B) Use the ArgumentParser (from argparse import ArgumentParser) and pass the file as an argument.
Assuming A it would look something like this:
python script.py < file.extension
Then in the script it would look like:
import sys
fData = []
for line in sys.stdin.readlines():
    fData.append(line)
# manipulate fData
There are a number of ways to achieve what you want. This is what I came up with off the top of my head. It may not be the best / efficient way, but it should work. I do a lot of file I/O with python at work and this is one of the ways I've achieved it in the past.
Note: If you want to write the manipulated lines back to the file use the argparse library.
Edit:
from argparse import ArgumentParser
def parseInput():
    parser = ArgumentParser(description = "Takes input file to read")
    parser.add_argument('-f', type = str, default = None, required = True,
                        help = "File to perform I/O on")
    args = parser.parse_args()
    return args
def main():
    args = parseInput()
    fData = []
    # read the file
    with open(args.f, 'r') as f:
        for line in f.readlines():
            fData.append(line)
    # Perform data manipulations
    # write the lines back to the same file
    with open(args.f, 'w') as f:
        for line in fData:
            f.write(line)
if __name__ == "__main__":
    main()
Then on command line it would look like:
python yourScript.py -f fileToInput.extension
I currently have the following code, which works and produces a PDF output. Is there a better way of writing the content for the PDF than what is done here? This is a basic PDF, but I am hoping to include multiple variables in later versions. I have inserted the variable x, defined before the PDF content, into the LaTeX source. Many thanks for any advice you can give.
import argparse
import os
import subprocess
x = 7
content = \
r'''\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[margin=1cm,landscape]{geometry}
\title{Spreadsheet}
\author{}
\date{}
\begin{document}''' + \
r'This is document version: ' + str(x) +\
r'\end{document}'
parser = argparse.ArgumentParser()
parser.add_argument('-c', '--course')
parser.add_argument('-t', '--title')
parser.add_argument('-n', '--name',)
parser.add_argument('-s', '--school', default='My U')
args = parser.parse_args()
with open('doc.tex','w') as f:
    f.write(content%args.__dict__)
cmd = ['pdflatex', '-interaction', 'nonstopmode', 'doc.tex']
proc = subprocess.Popen(cmd)
proc.communicate()
retcode = proc.returncode
if not retcode == 0:
    os.unlink('doc.pdf')
    raise ValueError('Error {} executing command: {}'.format(retcode, ' '.join(cmd)))
os.unlink('doc.tex')
os.unlink('doc.log')
As explained in this video, I think a better approach would be to export the variables from Python and save them into a .dat file using the following function.
def save_var_latex(key, value):
    import csv
    import os
    dict_var = {}
    file_path = os.path.join(os.getcwd(), "mydata.dat")
    try:
        with open(file_path, newline="") as file:
            reader = csv.reader(file)
            for row in reader:
                dict_var[row[0]] = row[1]
    except FileNotFoundError:
        pass
    dict_var[key] = value
    with open(file_path, "w") as f:
        for key in dict_var.keys():
            f.write(f"{key},{dict_var[key]}\n")
Then you can call the above function and save all the variables into mydata.dat. For example, in Python, you could save a variable and call it document_version using the following line of code:
save_var_latex("document_version", 21)
In LaTeX (in the preamble of your main file), you just have to import the following packages:
% package to open file containing variables
\usepackage{datatool, filecontents}
\DTLsetseparator{,}% Set the separator between the columns.
% import data
\DTLloaddb[noheader, keys={thekey,thevalue}]{mydata}{../mydata.dat}
% Loads mydata.dat with column headers 'thekey' and 'thevalue'
\newcommand{\var}[1]{\DTLfetch{mydata}{thekey}{#1}{thevalue}}
Then in the body of your document just use the \var{} command to import the variable, as follows:
This is document version: \var{document_version}
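If you would rather keep everything in Python, another option (not the approach above, just a sketch) is to fill placeholders with string.Template from the standard library. Since $ is common in LaTeX, this assumes a subclass that uses @ as the delimiter and a template that contains no other @ characters; the names LatexTemplate, TEMPLATE, render and the author field are made up for illustration:

```python
from string import Template


class LatexTemplate(Template):
    # Use @name / @{name} placeholders so LaTeX's $ and % stay untouched.
    delimiter = "@"


TEMPLATE = LatexTemplate(r"""\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
This is document version: @version

Prepared by: @{author}
\end{document}
""")


def render(**values):
    # substitute() raises KeyError if a placeholder has no value.
    return TEMPLATE.substitute(**values)


print(render(version=7, author="A. N. Other"))
```

The rendered string can then be written to doc.tex and compiled with pdflatex exactly as in the question.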
I am new to Python. I am using a Python script that loads a JSON file to set certain values, but I would like to pass that file in more dynamically using arguments (I think that is the correct term), so I don't always need to hard-code the name of the JSON file in the script. Here is my code example:
import json
from pprint import pprint
with open("VariableSettings.json") as json_data:
    data = json.load(json_data)
So my idea is to replace the line with open("VariableSettings.json") as json_data with one that takes the JSON file from the arguments.
I think that on the command prompt I can use the command py test.py arg1 (where arg1 represents the file path).
I know my explanation is probably a bit confusing, but I would appreciate any help.
You can use sys to do that. In the example below I created a file test.json with the content
{"foo": "bar"}
And modified your code as
import json
import sys
with open(sys.argv[1]) as json_data:
    data = json.load(json_data)
    print(data)
Run the script as
python test.py test.json
and the output will be
{'foo': 'bar'}
More details can be found in this other post
You can also use argparse:
import json
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--filename", required=True, type=str)
args = parser.parse_args()
with open(args.filename) as json_data:
    data = json.load(json_data)
    print(data)
Which can be called with the short option:
python test.py -f test.json
Or full argument name:
python test.py --filename test.json
And if you don't supply a file, you get:
usage: test.py [-h] -f FILENAME
test.py: error: the following arguments are required: -f/--filename
since I passed required=True. You can remove this if you want the argument to be optional.
Additionally, you could extend your program to check that the JSON file is well formed by catching json.JSONDecodeError with try/except:
with open(args.filename) as json_data:
    try:
        data = json.load(json_data)
        print(data)
    except json.JSONDecodeError:
        print('Invalid JSON format')
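The same idea can be wrapped in a helper that also covers a missing file. This is only a sketch: the name load_json and the choice to return None on failure are assumptions, and a file containing the JSON value null would also come back as None.

```python
import json
import sys


def load_json(path):
    """Return the parsed JSON document at `path`, or None on failure."""
    try:
        with open(path) as json_data:
            return json.load(json_data)
    except FileNotFoundError:
        print("File not found: {}".format(path), file=sys.stderr)
    except json.JSONDecodeError as exc:
        # exc.lineno and exc.colno give the position where parsing failed.
        print("Invalid JSON in {} (line {}, column {})".format(
            path, exc.lineno, exc.colno), file=sys.stderr)
    return None
```

You would call it as data = load_json(args.filename) and stop early if data is None.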
Use the sys module
Ex:
import sys
import json
from pprint import pprint
if len(sys.argv) < 2:
    print("Input File Missing")
    sys.exit()
with open(sys.argv[1]) as json_data:
    data = json.load(json_data)
    print(data)
To Run Use
python yourScriptName.py full_path_to.json
I am reading the book Violent Python and one example is a zip file cracker, which tests a dictionary file of potential passwords (a text file) against a zip file.
I am trying to use the docopt library to parse the command line and give me the filenames for these two files. Here is my code.
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
Basic zip bruteforcer

Usage:
  your_script.py (-f <file>) (-z <zip> )
  your_script.py -h | --help

Options:
  -h --help     Show this screen.
  -f --file     specify dictionary file. (Required argument)
  -z --zip      specify zip file.(Required argument)
  -t --thread   Thread count. (Optional)
  -v --verbose  Turn debug on. (Optional)
"""
from docopt import docopt
import zipfile
from threading import Thread
def extractzip(zfile, password):
    try:
        zfile.extractall(pwd = password)
        print 'password found: ', password
    except:
        return
def main():
    zfile = zipfile.ZipFile(zip)
    with open(file, 'r') as pass_file:
        for line in pass_file.readlines():
            password = line.strip('\n')
            t = Thread(target = extractzip, args = (zfile, password))
            t.start()
if __name__ == "__main__":
    arguments = docopt(__doc__, version='0.1a')
    extractzip(arguments['<file>'], ['<zip>'])
You can do something like
print arguments['<file>']
to get the file name, and similarly arguments['<zip>'] for the zip file. I personally haven't used extractzip and am not sure how that works. But since arguments is simply a dictionary, you can easily get the values by looking up the required key directly.
I'm trying to open a txt file and read it, but I'm getting a type error and I'm unsure why. If you could please provide the reasoning along with the correct syntax, that would help; I'm trying to get a better grasp of what's going on underneath. Here's the code, it's pretty simple I think:
from sys import argv
script = argv
filename = argv
txt = open(filename)
print "Here's your file %r" %filename
print txt.read()
Many thanks
argv is a list, not a string. You want
script = argv[0]
filename = argv[1]
Consider using argparse instead of handling sys.argv directly:
>>> import argparse
>>> parser = argparse.ArgumentParser(description="Print a file.")
>>> parser.add_argument("path", type=str, nargs=1, help="Path to the file to be printed.")
_StoreAction(option_strings=[], dest='path', nargs=1, const=None, default=None, type=<type 'str'>, choices=None, help='Path to the file to be printed.', metavar=None)
>>> args = parser.parse_args()
>>> print args
Namespace(path=[<...>])
It looks much more complicated, but it will make your command-line utility much more flexible, easier to expand, and it will ensure you get proper documentation on the command line.
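One detail worth knowing about the session above: nargs=1 stores a one-element list, so open(args.path) would hit the same type error as in the question. A short Python 3 sketch comparing the two forms; the parser names listed and plain and the file name notes.txt are made up for illustration:

```python
import argparse

# nargs=1 stores a one-element list ...
listed = argparse.ArgumentParser()
listed.add_argument("path", type=str, nargs=1)
print(listed.parse_args(["notes.txt"]).path)   # ['notes.txt']

# ... while leaving nargs out stores the plain string.
plain = argparse.ArgumentParser()
plain.add_argument("path", type=str, help="Path to the file to be printed.")
print(plain.parse_args(["notes.txt"]).path)    # notes.txt
```

So either drop nargs=1 or use args.path[0] when opening the file.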
First of all, argv is a list of arguments. Open doesn't take a list. That's why you're getting a type error.
Second, open (should) take 2 parameters, the filename and the mode (yes, mode is optional, but get in the habit of putting it there). Replace your code with
import sys
script = sys.argv[0]
filename = sys.argv[1]
txt = open(filename, 'r')
print "Here's your file %r" %filename
print txt.read()
argv will be a list, while filename should be a string.
You should probably be using filename = argv[1]
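Putting the answers together, here is a minimal Python 3 sketch that picks the filename out of the argv list and fails with a usage message when it is missing; the helper name read_file is hypothetical:

```python
def read_file(argv):
    # argv is a list: argv[0] is the script name, argv[1] the first argument.
    if len(argv) < 2:
        raise SystemExit("usage: {} FILENAME".format(argv[0]))
    filename = argv[1]  # a string, which is what open() expects
    with open(filename, "r") as txt:
        return filename, txt.read()
```

In the script you would write import sys and then filename, text = read_file(sys.argv) before printing the two values.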