Before explaining my question, I want to mention that I have looked at various other questions on Stack Overflow but couldn't find any solution related to my problem, so please don't mark this as a duplicate!
I'm working on a Python (3.6) project in which I need to run a terminal command and parse a value from the output, which is formatted in columns.
Here's the command I ran:
output = subprocess.call('kubectl get svc', shell=True)
And here's the output:
b'NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.35.240.1 <none> 443/TCP 28m
node-app1 LoadBalancer 10.35.245.164 35.239.29.239 8000:32249/TCP 26m
Now, I need to get EXTERNAL-IP from the second row and 4th column.
How can I get this value?
You can extract the specific column in the shell itself; that way you avoid doing the text processing in Python. Note that subprocess.check_output (unlike subprocess.call) actually returns the command's output. The EXTERNAL-IP you want is column 4:
out = subprocess.check_output("kubectl get svc | awk '{print $4}'", shell=True)
result = out.decode().split('\n')
print(result[2])
output:
35.239.29.239
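If you only ever want the node-app1 row, awk can select the row as well as the column. A minimal sketch, assuming the service name node-app1 from your sample output:

import subprocess

# Let awk match the row by service name and print only the EXTERNAL-IP column
out = subprocess.check_output(
    "kubectl get svc | awk '$1 == \"node-app1\" {print $4}'",
    shell=True,
)
print(out.decode().strip())  # 35.239.29.239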
The shell is nice for that. How about (using check_output so the result is captured rather than just printed):
output = subprocess.check_output('kubectl get svc | tr "\t" " " | tr -s " " | cut -d " " -f 4 | tail -1', shell=True)
You could also omit the tail -1, which gives the last line, and do that splitting/filtering in Python.
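For example, a rough sketch of that variant, filtering in Python instead (assuming the node-app1 service from the question):

import subprocess

out = subprocess.check_output(
    'kubectl get svc | tr "\t" " " | tr -s " " | cut -d " " -f 4',
    shell=True,
)
# The last non-empty line holds node-app1's EXTERNAL-IP
lines = [l for l in out.decode().split("\n") if l]
print(lines[-1])  # 35.239.29.239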
You can parse the output yourself in Python as well. (Note that subprocess.call only returns the exit code; use subprocess.check_output so that output actually holds the command's bytes.)
# Step 1, convert the bytes output into string
output = output.decode('utf-8')
# Step 2, split the string based on the newline character
output = output.split('\n')
# Step 3, split all lines on any whitespace character
output = [o.split() for o in output]
# Step 4, get the correct value as [row][column]
value = output[2][3]
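Putting those steps together, a self-contained sketch might look like this (using subprocess.check_output, since subprocess.call only returns the exit code):

import subprocess

# check_output returns the command's stdout as bytes
raw = subprocess.check_output("kubectl get svc", shell=True)
rows = [line.split() for line in raw.decode('utf-8').split('\n') if line]
value = rows[2][3]  # row 2 = node-app1, column 3 = EXTERNAL-IP
print(value)  # 35.239.29.239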
You can use pandas to read the data. Here's a self-contained example:
from io import StringIO
import pandas

x = """NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.35.240.1 <none> 443/TCP 28m
node-app1 LoadBalancer 10.35.245.164 35.239.29.239 8000:32249/TCP 26m"""
dataframe = pandas.read_csv(StringIO(x), sep=r"\s+")
# print rows
for index, row in dataframe.iterrows():
    print(row['NAME'], row['CLUSTER-IP'], row['PORT(S)'])
# search for the row with name node-app1 and print the value in the PORT(S) column:
print(dataframe.loc[dataframe['NAME'] == 'node-app1']['PORT(S)'].to_string(index=False))
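To answer the original question with the same dataframe, the EXTERNAL-IP cell can be selected directly; a short usage sketch:

# Select node-app1's EXTERNAL-IP from the dataframe built above
external_ip = dataframe.loc[dataframe['NAME'] == 'node-app1', 'EXTERNAL-IP'].iloc[0]
print(external_ip)  # 35.239.29.239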
Using some string manipulation.
Demo:
output = b"""NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.35.240.1 <none> 443/TCP 28m
node-app1 LoadBalancer 10.35.245.164 35.239.29.239 8000:32249/TCP 26m"""
output = iter(output.decode().split("\n"))  # decode the bytes before splitting
next(output)  # skip header
for i in output:
    print(i.split()[3])  # str.split and take index 3
Output:
<none>
35.239.29.239
Related
Experts, I have a simple pipe-delimited file from a source system which has a free-flow text field, and for one of the records I see that a "|" character is coming in as part of the data. This breaks the file unevenly, so it doesn't get parsed into the correct number of fields. I want to replace the "|" in the data field with a "#".
Record coming in from the source system; there are 9 fields in total in the file:
OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow|Text|20191029|X|X|X|3456
If you notice the 4th field, Free"flow|Text, this is the complete value from the source, which has a pipe in it.
I want to change it to Free"flow#Text and then read the file with a pipe delimiter.
Desired Outcome-
OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow#Text|20191029|X|X|X|3456
I tried a few awk/sed combinations but didn't get the desired output.
Thanks
Since you know there are 9 fields, and the 4th is a problem: take the first 3 fields and the last 5 fields and whatever is left over is the 4th field.
You did tag shell, so here's some bash; I'm sure the Python equivalent is close:
line='OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow|Text|20191029|X|X|X|3456'
IFS='|'
read -ra fields <<<"$line"
first3=( "${fields[#]:0:3}" )
last5=( "${fields[#]: -5}" )
tmp=${line#"${first3[*]}$IFS"} # remove the first 3 joined with pipe
field4=${tmp%"$IFS${last5[*]}"} # remove the last 5 joined with pipe
data=( "${first3[#]}" "$field4" "${last5[#]}" )
newline="${first3[*]}$IFS${field4//$IFS/#}$IFS${last5[*]}"
# .......^^^^^^^^^^^^....^^^^^^^^^^^^^^^^^....^^^^^^^^^^^
printf "%s\n" "$line" "$newline"
OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow|Text|20191029|X|X|X|3456
OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow#Text|20191029|X|X|X|3456
With awk it's simpler: if there are 10 fields, join fields 4 and 5, and shift the rest down one.
echo "$line" | awk '
BEGIN { FS = OFS = "|" }
NF == 10 {
$4 = $4 "#" $5
for (i=5; i<NF; i++)
$i = $(i+1)
NF--
}
1
'
OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow#Text|20191029|X|X|X|3456
You tagged your question with Python so I assume a Python-based answer is acceptable.
I assume not all records in your file have the additional "|" in them; only some records have the "|" in the free-text column.
For a more realistic example, I create an input with some correct records and some erroneous records.
I use StringIO to simulate the file, in your environment read the real file with 'open'.
from io import StringIO
sample = 'OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow|Text|20191029|X|X|X|3456\nOutboundManualCall|J|LALALA HTREDFST|FreeHalalText|20191029|X|X|X|3456\nOutboundManualCall|J|LALALA HTREDFST|FrulaalText|20191029|X|X|X|3456\nOutboundManualCall|H|RTYEHLA HTREDFST|Free"flow|Text|20191029|X|X|X|3456'
infile = StringIO(sample)
outfile = StringIO()
for line in infile.readlines():
    cols = line.split("|")
    if len(cols) > 9:
        print(f"bad column {cols[3:5]}")
        line = "|".join(cols[:3] + ["#".join(cols[3:5])] + cols[5:])
    outfile.write(line)
print("Corrected file:")
print(outfile.getvalue())
Results in:
> bad column ['Free"flow', 'Text']
> bad column ['Free"flow', 'Text']
> Corrected file:
> OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow#Text|20191029|X|X|X|3456
> OutboundManualCall|J|LALALA HTREDFST|FreeHalalText|20191029|X|X|X|3456
> OutboundManualCall|J|LALALA HTREDFST|FrulaalText|20191029|X|X|X|3456
> OutboundManualCall|H|RTYEHLA HTREDFST|Free"flow#Text|20191029|X|X|X|3456
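If a record could ever contain more than one stray pipe in the free-text field, a slightly more general variant of the join (a sketch under that assumption) collapses everything between the first 3 and the last 5 fields:

cols = line.rstrip('\n').split('|')
if len(cols) > 9:
    # everything between the first 3 and the last 5 fields is part of field 4
    line = '|'.join(cols[:3] + ['#'.join(cols[3:len(cols) - 5])] + cols[-5:]) + '\n'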
Python - PING a list of IP Address from database
I have a list of IP addresses covering 200 locations; each location has 4 IP addresses that I need to ping-test. I want to make a command such that, when I type the name or code of a particular location, it pings the 4 IP addresses at that location directly. I have learned a bit about creating a list containing IP addresses entered through input(), like this:
import os
import socket

ip = []
y = ['IP 1 : ', 'IP 2 : ', 'IP 3 : ', 'IP 4 : ']
while True:
    for x in y:
        server_ip = input(x)
        ip.append(server_ip)
    break
for x in ip:
    print("\n")
    rep = os.system('ping ' + x + " -c 3")
Please give me a little advice about the command I want to make, so that I no longer need to enter the IP addresses one by one. What still confuses me is how to turn the entries in the database into the variable x that gets inserted into this command:
rep = os.system('ping ' + x + " -c 3")
EDIT: It now iterates over a CSV file rather than a hard-coded Python dictionary.
I believe you will be better off reading the addresses from a file rather than hard-coding Python lists. Assuming you are using Python 3.x, this is what you want to run:
import os
import csv
# Save the IPs you want to ping inside YOURFILE.csv
# Then iterate over the CSV rows using a For Loop
# Ensure your ip addresses are under a column titled ip_address
with open('YOURFILE.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        rep = os.system("ping " + row['ip_address'] + " -c 3")
I am fairly new to Python. I have a text file containing many blocks of data in the following format, along with other unnecessary blocks:
NOT REQUIRED :: 123
Connected Part-1:: A ~$
Connected Part-3:: B ~$
Connector Location:: 100 200 300 ~$
NOT REQUIRED :: 456
Connected Part-2:: C ~$
I wish to extract the info (A, B, C, 100 200 300) corresponding to each property (Connected Part-1, Connector Location) and store it as a list to use later. I have prepared the following code, which reads the file, cleans each line, and stores the result as a list.
import fileinput

with open('C:/Users/file.txt') as f:
    content = f.readlines()
for line in content:
    if 'Connected Part-1' in line or 'Connected Part-3' in line:
        if 'Connected Part-1' in line:
            connected_part_1 = [s.strip(' \n ~ $ Connected Part -1 ::') for s in content]
            print('PART_1:', connected_part_1)
        if 'Connected Part-3' in line:
            connected_part_3 = [s.strip(' \n ~ $ Connected Part -3 ::') for s in content]
            print('PART_3:', connected_part_3)
    if 'Connector Location' in line:
        # removing unwanted characters and converting into the list
        content_clean_1 = [s.strip('\n ~ $ Connector Location::') for s in content]
        # converting a single string item in list to a string
        s = " ".join(content_clean_1)
        # splitting the string and converting into a list
        weld_location = s.split(" ")
        print('POSITION', weld_location)
here is the output
PART_1: ['A', '\t\tConnector Location:: 100.00 200.00 300.00', '\t\tConnected Part-3:: C~\t']
POSITION ['d', 'Part-1::', 'A', '\t\tConnector', 'Location::', '100.00', '200.00', '300.00', '\t\tConnected', 'Part-3::', 'C~\t']
PART_3: ['1:: A', '\t\tConnector Location:: 100.00 200.00 300.00', '\t\tConnected Part-3:: C~\t']
From the output of this program, I may conclude that since content is the string consisting of all the characters in the file, the program is not reading an individual line; instead, it is treating all the text as a single string. Could anyone please help in this case?
I am expecting following output:
PART_1: ['A']
PART_3: ['C']
POSITION: ['100.00', '200.00','300.00']
(Note: when I use individual files containing a single line of data, it works fine. Sorry for such a long question.)
I will try to make it clear and show how I would do it without regex. First of all, the biggest issue with the code presented is that each string.strip call processes the entire content list:
connected_part_1 = [s.strip(' \n ~ $ Connected Part -1 ::') for s in content]
content is the entire list of file lines; I think you simply want something like:
connected_part_1 = [line.strip(' \n ~ $ Connected Part -1 ::')]
How to parse the file is a bit subjective, but given the file format posted as input, I would do it like this:
templatestr = "{}: {}"
with open('inputreadlines.txt') as f:
    content = f.readlines()
for line in content:
    label, value = line.split('::')
    ltokens = label.split()
    if ltokens[0] == 'Connected':
        print(templatestr.format(
            ltokens[-1],          # the last word on the label
            value.split()[:-1]))  # the split value without the last word '~$'
    elif ltokens[0] == 'Connector':
        print(value.split()[:-1])  # the split value without the last word '~$'
    else:  # NOT REQUIRED
        pass
You can use the string.strip function to remove the funny characters '~$' instead of removing the last token as in the example.
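To get something close to the output shape you listed (PART_1, PART_3, POSITION), a small variant of the same idea collects the values into a dict instead of printing (a sketch, assuming all lines follow your sample format):

results = {}
with open('inputreadlines.txt') as f:
    for line in f:
        if '::' not in line:
            continue
        label, value = line.split('::')
        tokens = value.split()[:-1]  # drop the trailing '~$'
        if label.strip() == 'Connected Part-1':
            results['PART_1'] = tokens
        elif label.strip() == 'Connected Part-3':
            results['PART_3'] = tokens
        elif label.strip() == 'Connector Location':
            results['POSITION'] = tokens
print(results)
# {'PART_1': ['A'], 'PART_3': ['B'], 'POSITION': ['100', '200', '300']}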
I need to remove the trailing zeros from an export.
The code reads the original tempFile; I need columns 2 and 6, which contain:
12|9781624311390|1|1|0|0.0000
13|9781406273687|1|1|0|99.0000
14|9781406273717|1|1|0|104.0000
15|9781406273700|1|1|0|63.0000
The awk command changes the form to comma-separated and dumps columns 2 and 6 into tempFile2. I need to remove the trailing zeros from column 6 so the end result looks like this:
9781624311390,0
9781406273687,99
9781406273717,104
9781406273700,63
I believe this should do the trick but have had no luck implementing it:
awk '{sub("\\.*0+$",""); print}'
Below is the code I need to adjust ($6 is the column to strip zeros from):
if not isError:
    print "Translating SQL output to tab delimited format"
    awkRunSuccess = os.system(
        "awk -F\"|\" '{print $2 \"\\,\" $6}' %s > %s" %
        (tempFile, tempFile2)
    )
    if awkRunSuccess != 0: isError = True
You can use gsub("\\.*0+$","",$2) to do this, as per the following transcript:
pax> echo '9781624311390|0.0000
9781406273687|99.0000
9781406273717|104.0000
9781406273700|63.0000' | awk -F'|' '{gsub("\\.*0+$","",$2);print $1","$2}'
9781624311390,0
9781406273687,99
9781406273717,104
9781406273700,63
However, given you're already within Python (and it's no slouch when it comes to regexes), you'd probably want to use it natively rather than starting up an awk process.
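For instance, a minimal native sketch with the re module (assuming tempFile and tempFile2 hold the same paths used in your snippet):

import re

with open(tempFile) as src, open(tempFile2, 'w') as dst:
    for line in src:
        fields = line.rstrip('\n').split('|')
        # drop an all-zero decimal part such as '.0000' from column 6
        price = re.sub(r'\.0+$', '', fields[5])
        dst.write('%s,%s\n' % (fields[1], price))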
Try this awk command:
awk -F '[|.]' '{print $2","$(NF-1)}' FileName
Output:
9781624311390,0
9781406273687,99
9781406273717,104
9781406273700,63
I'm still new to Python.
I have a text file with a list of numbers, and each number has two 'attributes' along with it:
250 121 6000.654
251 8472 650.15614
252 581 84.2
I want to search for a value in the 1st column and get the 2nd and 3rd columns back as separate variables so I can use them later.
cmd = """ cat new.txt | nawk '/'251'/{print $2}' """
os.system(cmd)
This works in that it prints the $2 column, but I want to assign the output to a variable. Something like this doesn't do it, however, since os.system returns the command's exit status, not its output:
cmdOutput = os.system(cmd)
I would also like to change the value nawk searches for based on a variable, something like this:
cmd = """ cat new.txt | nawk '/'$input'/{print $2}' """
If anyone can help, thanks.
Don't use cat and nawk. Please.
Just use Python
import sys

target = raw_input('target: ')  # or target = sys.argv[1]
with open('new.txt', 'r') as source:
    for columns in (raw.strip().split() for raw in source):
        if columns[0] == target: print columns[1]
No cat. No nawk.
First of all, to format the cmd string, use
input = '251'
cmd = """ cat new.txt | nawk '/'{input}'/{{print $2}}' """.format(input=input)
But really, you don't need an external command at all.
input = '251'
with open('new.txt', 'r') as f:
    for line in f:
        lst = line.split()
        if lst[0] == input:
            column2, column3 = int(lst[1]), float(lst[2])
            break
    else:  # the input wasn't found
        column2, column3 = None, None
print(column2, column3)
I think what you're looking for is:
subprocess.Popen(["cat", "new.txt","|","nawk","'/'$input/{print $2}'"], stdout=subprocess.PIPE).stdout