Good morning everyone,
I'm trying to write a small program that reads a CSV, stores its contents in a variable, and then passes that variable to the check_call function.
The CSV is a list of databases:
cat test_db.csv
andreadb
billing
fabiodb
This is what I have written so far:
from subprocess import *
import csv
# Load the CSV into the variable data
with open('test_db.csv', 'r') as csvfile:
    data = list(csv.reader(csvfile))
# For each database, show its tables and save the output to risultato.txt
for line in data:
    database = line
    check_call[("beeline", "-e", "\"SHOW TABLES FROM \"", database, ";" , ">>" , "risultato.txt")]
When I execute it, I get the following error:
Traceback (most recent call last):
  File "test_query.py", line 10, in <module>
    check_call[("beeline", "-e", "\"SHOW TABLES FROM \"", database, ";")]
TypeError: 'function' object has no attribute '__getitem__'
I'm relatively new to Python and this is my first project, so any help would be great.
If I didn't explain something correctly, please tell me and I'll edit the post.
Thanks!!
You have mistyped the function call. It should be
check_call(["beeline", "-e", "\"SHOW TABLES FROM \"", database, ";" , ">>" , "risultato.txt"])
In your question the ( was placed after the [. The ( must come first, followed by a list containing the command and its parameters.
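One caveat with the list form: without shell=True, check_call hands every list element to beeline as a literal argument, so ">>" and "risultato.txt" would not redirect anything. Note also that csv.reader yields each row as a list, so the database name is row[0]. The following is only a minimal sketch of one way around both points, appending through an open file handle. build_command and show_tables are hypothetical helper names, and the only beeline option used is the -e flag from the question.

```python
import subprocess


def build_command(database):
    # One list element per argument; the whole query is a single argument to -e.
    return ["beeline", "-e", "SHOW TABLES FROM {};".format(database)]


def show_tables(databases, output_path, run=subprocess.check_call):
    # ">>" is shell syntax; here the output is appended by passing an open
    # file as stdout instead. `run` is a parameter only so that the call can
    # be replaced by a stand-in when beeline is not installed.
    with open(output_path, "a") as out:
        for database in databases:
            run(build_command(database), stdout=out)
```

With the CSV from the question this could be called as show_tables([row[0] for row in data], "risultato.txt").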
After a lot of tinkering I found a way to concatenate the variable into this check_call:
for line in data:
    # each CSV row is a list; the database name is its first field
    database = str(line[0])
    # ">>" is shell syntax, so the command is passed as one string with shell=True
    check_call("beeline -e \"SHOW TABLES FROM " + database + "\" >> risultato.txt", shell=True)
After execution, risultato.txt contains the expected output:
+-----------+
| tab_name  |
+-----------+
| result    |
| results   |
+-----------+
+------------------------------------+
| tab_name                           |
+------------------------------------+
| tab_example_1                      |
| tab_example_2                      |
+------------------------------------+
+---------------------------------------+
| tab_name                              |
+---------------------------------------+
| tab_example_3                         |
| tab_example_4                         |
+---------------------------------------+
I need the Ghostscript arguments to convert a two-up PDF page (two pages side by side) into single-column PDF pages.
The input:
+--------+--------+
|        |        |
|        |        |
|        |        |
|   1    |   2    |
|        |        |
|        |        |
+--------+--------+
The output:
+--------+
|        |
|   1    |
|        |
|        |
|        |
|        |
+--------+
+--------+
|        |
|   2    |
|        |
|        |
|        |
|        |
+--------+
Based on these two posts (post1 and post2), I wrote this code:
import sys
import locale
import ghostscript
args = [
    "-ooutput.pdf",
    "-sDEVICE=pdfwrite",
    "-g2807x5950"
    "-fpdfFile.pdf"
]
# arguments have to be bytes, encode them
encoding = locale.getpreferredencoding()
args = [a.encode(encoding) for a in args]
ghostscript.Ghostscript(*args)
I expected a 2-page PDF file, but a fatal error was raised.
Edit: the error message was posted as a screenshot.
If you read the text, it says "Device pdfwrite requires an output file but no file was specified". That tells you that -o was ignored, or that there was some other problem with it.
I suspect you are using the Ghostscript DLL, rather than forking a process, in which case you have to set argv[0] to a dummy value. The reason is that, when running a C program, argv[0] is the name of the executable. So the args processing skips over the 0th element of the args array.
This is covered in the Ghostscript documentation here
NB: there also appears to be a missing ',' in your argument list (after "-g2807x5950"), but I don't speak Python so I could be wrong.
You probably need to change your args to something like:
args = [
    "MyApp",
    "-o output.pdf",
    "-sDEVICE=pdfwrite",
    "-g2807x5950",
    "-fpdfFile.pdf"
]
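On the missing separator: in Python, two adjacent string literals with no comma between them are joined into a single string, so the list in the question silently ends up one element short. A small sketch showing the effect, using only the argument strings from the question (the names broken and fixed are just for illustration):

```python
# Adjacent string literals are concatenated, so the missing comma
# merges the last two arguments into one list element.
broken = [
    "-ooutput.pdf",
    "-sDEVICE=pdfwrite",
    "-g2807x5950"        # no comma here
    "-fpdfFile.pdf",
]

fixed = [
    "-ooutput.pdf",
    "-sDEVICE=pdfwrite",
    "-g2807x5950",
    "-fpdfFile.pdf",
]

print(len(broken), broken[-1])  # 3 -g2807x5950-fpdfFile.pdf
print(len(fixed), fixed[-1])    # 4 -fpdfFile.pdf
```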
I am extracting certain data from a CSV file using Ruby and I want to clean up the extracted strings by removing the unwanted characters.
This is how I extract the data so far:
CSV.foreach(data_file, :encoding => 'windows-1251:utf-8', :headers => true) do |row|
  # create an array for each page
  page_data = []
  # for each page, get the data we are interested in and save it to page_data
  page_data.push(row['dID'])
  page_data.push(row['xTerm'])
  pages_to_import.push(page_data)
end
Then I output a CSV file with the extracted data.
The extracted output is exactly as it appears in the CSV data file:
| ID | Term |
|-------|-----------------------------------------|
| 13241 | ##106#107#my##106#term## |
| 13345 | ##63#hello## |
| 11436 | ##55#rock##20#my##10015#18#world## |
However, the result I want to achieve is:
| ID | Term |
|-------|-----------------------------------------|
| 13241 | my, term |
| 13345 | hello |
| 11436 | rock, my, world |
Any suggestions on how to achieve this?
Libraries that I'm using:
require 'nokogiri'
require 'cgi'
require 'csv'
Using a regular expression, I'd do:
%w[
  ##106#107#term1##106#term2##
  ##63#term1##
  ##55#term1##20#term2##10015#18#term3##
  ##106#107#my##106#term##
  ##63#hello##
  ##55#rock##20#my##10015#18#world##
].map{ |str|
  str.scan(/[^#\d][^#]*(?=#)/)
}
# => [["term1", "term2"], ["term1"], ["term1", "term2", "term3"], ["my", "term"], ["hello"], ["rock", "my", "world"]]
My str is the equivalent of the contents of your row['xTerm'].
The regular expression /[^#\d][^#]*(?=#)/ searches for runs in str that start with a character that is neither # nor a digit, contain no #, and are followed by #.
From the garbage in the string, and your comment that you're using Nokogiri and CSV, and because you didn't show your input data as CSV or HTML, I have to wonder if you're not mangling the incoming data somehow, and trying to wiggle out of it in post-processing. If so, show us what you're actually doing and maybe we can help you get clean data to start.
I'm assuming your terms are bookended and separated by ## and consist of one or more numbers followed by the actual term separated by #. To get the terms into an array:
row['xTerm'].split('##')[1..-1].map { |term| term.split(?#)[-1] }
Then you can join or do whatever you want with it.
I have the following string
string = "OGC Number | LT No | Job /n 9625878 | EPP3234 | 1206545/n" and continues on
I am trying to write it to a .CSV file where it will look like this:
OGC Number | LT No | Job
------------------------------
9625878 | EPP3234 | 1206545
9708562 | PGP43221 | 1105482
9887954 | BCP5466 | 1025454
where each newline in the string starts a new row
and each "|" in the string starts a new column.
I am having trouble getting the formatting.
I think I need to use:
string.split('/n')
string.split('|')
Thanks.
Windows 7, Python 2.6
Untested:
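import csv
text="""
OGC Number | LT No | Job
------------------------------
9625878 | EPP3234 | 1206545
9708562 | PGP43221 | 1105482
9887954 | BCP5466 | 1025454"""
# strip() drops the leading blank line so that lines[0] is the header
lines = text.strip().splitlines()
# 'wb' is the Python 2 mode for csv; on Python 3 use open('outputfile.csv', 'w', newline='')
with open('outputfile.csv', 'wb') as fout:
    csvout = csv.writer(fout)
    # header plus content; lines[1] is the dashed separator and is skipped
    for row in [lines[0]] + lines[2:]:
        csvout.writerow([col.strip() for col in row.split('|')])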
If you are open to using a third-party module, PrettyTable is very useful and has a nice set of features for handling and printing tabular data.
EDIT: Oops, I misunderstood your question!
The code below will use two regular expressions to do the modifications.
import re
text="""OGC Number | LT No | Job
------------------------------
9625878 | EPP3234 | 1206545
9708562 | PGP43221 | 1105482
9887954 | BCP5466 | 1025454
"""
# just setup above
# remove all lines with at least 4 dashes
text=re.sub( r'----+\n', '', text )
# replace all pipe symbols with their
# surrounding spaces by single semicolons
text=re.sub( r' +\| +', ';', text )
print(text)
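For comparison, here is a minimal sketch of the same split-and-strip idea, assuming Python 3 (the question mentions Python 2.6) and assuming the rows are separated by real "\n" newlines rather than the literal "/n" shown in the question. parse_table and to_csv are hypothetical helper names; the CSV text is built in memory so the result can be inspected before it is written to a file.

```python
import csv
import io


def parse_table(text):
    """Split pipe-delimited text into rows of stripped cells."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip empty lines and separator lines made only of dashes.
        if not stripped or set(stripped) == {"-"}:
            continue
        rows.append([cell.strip() for cell in stripped.split("|")])
    return rows


def to_csv(text):
    """Return the parsed rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parse_table(text))
    return buffer.getvalue()


sample = "OGC Number | LT No | Job \n 9625878 | EPP3234 | 1206545\n"
print(to_csv(sample))
```

To write a file instead, the same rows could be passed to csv.writer on a file opened with open('outputfile.csv', 'w', newline='').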
I just started learning Python scripting yesterday and I've already gotten stuck. :(
So I have a data file with a lot of different information in various fields.
Formatted basically like...
Name (tab) Start# (tab) End# (tab) A bunch of fields I need but do not do anything with
Repeat
I need to write a script that takes the start and end numbers, and add/subtract a number accordingly depending on whether another field says + or -.
I know that I can replace words with something like this:
x = open("infile")
y = open("outfile","a")
while 1:
    line = x.readline()
    if not line: break
    line = line.replace("blah","blahblahblah")
    y.write(line + "\n")
y.close()
But I've looked at all sorts of different places and I can't figure out how to extract specific fields from each line, read one field, and change other fields. I read that you can read the lines into arrays, but can't seem to find out how to do it.
Any help would be great!
EDIT:
Example of a line from the data here: (Each | represents a tab character)
           |          |
           V          V
chr21 | 33025905 | 33031813 | ENST00000449339.1 | 0 | **-** | 33031813 | 33031813 | 0 | 3 | 1835,294,104, | 0,4341,5804,
chr21 | 33036618 | 33036795 | ENST00000458922.1 | 0 | **+** | 33036795 | 33036795 | 0 | 1 | 177, | 0,
The second and third columns (indicated by arrows) would be the ones that I'd need to read/change.
You can use csv to do the splitting, although for these sorts of problems, I usually just use str.split:
with open(infile) as fin,open('outfile','w') as fout:
    for line in fin:
        #use line.split('\t',3) if the name of the field can contain spaces
        name,start,end,rest = line.split(None,3)
        #do something to change start and end here.
        #Note that `start` and `end` are strings, but they can easily be changed
        #using `int` or `float` builtins.
        fout.write('\t'.join((name,start,end,rest)))
csv is nice if you want to split lines like this:
this is a "single argument"
into:
['this','is','a','single argument']
but it doesn't seem like you need that here.
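As a sketch of the "change start and end" step, assuming the +/- sign sits in the sixth tab-separated column as in the example lines, and assuming the rule is "add on +, subtract on -" (the question does not spell out the exact rule). shift_coordinates and offset are hypothetical names:

```python
def shift_coordinates(line, offset):
    """Return the line with the start and end columns shifted by offset."""
    fields = line.rstrip("\n").split("\t")
    strand = fields[5]
    # Assumed rule: add the offset on "+", subtract it on "-".
    delta = offset if strand == "+" else -offset
    fields[1] = str(int(fields[1]) + delta)
    fields[2] = str(int(fields[2]) + delta)
    return "\t".join(fields) + "\n"


example = "chr21\t33025905\t33031813\tENST00000449339.1\t0\t-\t33031813\n"
print(shift_coordinates(example, 100))
```

Inside the loop above this would replace the manual handling of name, start, end and rest: fout.write(shift_coordinates(line, 100)).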
Could somebody help me figure out a simple way of doing this in any scripting language? I will be running the script on Linux.
1) I have a file, file1, which contains the following lines:
(Bank8GntR[3] | Bank8GntR[2] | Bank8GntR[1] | Bank8GntR[0] ),
(Bank7GntR[3] | Bank7GntR[2] | Bank7GntR[1] | Bank7GntR[0] ),
(Bank6GntR[3] | Bank6GntR[2] | Bank6GntR[1] | Bank6GntR[0] ),
(Bank5GntR[3] | Bank5GntR[2] | Bank5GntR[1] | Bank5GntR[0] ),
2) I need the contents of file1 to be modified as follows and written to file2:
(Bank15GntR[3] | Bank15GntR[2] | Bank15GntR[1] | Bank15GntR[0] ),
(Bank14GntR[3] | Bank14GntR[2] | Bank14GntR[1] | Bank14GntR[0] ),
(Bank13GntR[3] | Bank13GntR[2] | Bank13GntR[1] | Bank13GntR[0] ),
(Bank12GntR[3] | Bank12GntR[2] | Bank12GntR[1] | Bank12GntR[0] ),
So I have to:
read each line from the file1,
use "search" using regular expression,
to match Bank[0-9]GntR,
replace \1 with "7 added to number matched",
insert it back into the line,
write the line into a new file.
How about something like this in Python:
import re
# a function that adds 7 to the matched number.
# groups 1 and 2: we capture (Bank) so that the digits in brackets are not matched.
def plus7(matchobj):
    return '%s%d' % (matchobj.group(1), int(matchobj.group(2)) + 7)
# iterate over the input file and write each modified line to the output file.
with open('in.txt') as fhi, open('out.txt', 'w') as fho:
    for line in fhi:
        fho.write(re.sub(r'(Bank)(\d+)', plus7, line))
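To check the substitution without touching any files, the same idea can be tried on a single line from file1. This is only a quick sketch that repeats the callback so it runs on its own:

```python
import re


def plus7(matchobj):
    # group(1) is the literal "Bank", group(2) the number that follows it.
    return "%s%d" % (matchobj.group(1), int(matchobj.group(2)) + 7)


line = "(Bank8GntR[3] | Bank8GntR[2] | Bank8GntR[1] | Bank8GntR[0] ),"
result = re.sub(r"(Bank)(\d+)", plus7, line)
print(result)
# (Bank15GntR[3] | Bank15GntR[2] | Bank15GntR[1] | Bank15GntR[0] ),
```

The bracketed indices are left alone because the pattern only matches digits that directly follow "Bank".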
Assuming you don't have to use Python, you can do this using awk (the three-argument form of match is a GNU awk extension, so this needs gawk):
cat test.txt | awk 'match($0, /Bank([0-9]+)GntR/, nums) { d=nums[1]+7; gsub(/Bank[0-9]+GntR\[/, "Bank" d "GntR["); print }'
This gives the desired output.
The point here is that match matches your data and supports capturing groups, which you can use to extract the number. As awk supports arithmetic, you can then add 7 within awk and do a replacement on all the values in the rest of the line. Note that I've assumed all the values in a line contain the same number.