I have an output file from a script that parses iwlist scan, and it looks something like:
Cell: 01 -
Address: XX:XX:XX:XX:XX
ESSID: "My Network Name"
Frequency: 2.412 GHz (Channel 2)
Quality: =XX/100
Signal Level: XX/100
Cell: 02 -
Address:
ESSID:
etc. for as many WLANs as show up in the scan.
My question is: how would I go about parsing this list further, perhaps into a new file, to give it a tabulated view in the output (using Python)?
for example, the output would be:
Cell Address ESSID Frequency Quality Signal Level
01 - XX:XX:XX:XX:XX:XX "My Network Name" 2.412 GHz (Channel 2) =XX/100 XX/100
and so on for the rest of the WLANs in the scan, preferably without repeating the headers.
This would work, for example.
iwlist = '''Cell: 01 -
Address: XX:XX:XX:XX:XX
ESSID: "My Network Name"
Frequency: 2.412 GHz (Channel 2)
Quality: =XX/100
Signal Level: XX/100
Cell: 02 -
Address:
ESSID:
'''
options = []
values = []
for line in iwlist.split('\n'):
    if not line.strip():
        continue
    line = line.split(':')
    options.append(line[0])
    values.append(':'.join(line[1:]))

for o in options:
    print('{:^20}'.format(o), end="")
print()
for v in values:
    print('{:^20}'.format(v), end="")
print()
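To get one header row and one row per cell (so the headers are not repeated), a slight variation is to start a new record whenever a line beginning with "Cell" appears and collect the remaining fields into that record. Here is a minimal sketch along those lines, reusing the iwlist string defined in the snippet above and assuming the field names from the sample; the column width of 25 is an arbitrary choice:

headers = ['Cell', 'Address', 'ESSID', 'Frequency', 'Quality', 'Signal Level']
cells = []       # one dict per cell
current = None
for line in iwlist.split('\n'):
    if not line.strip():
        continue
    # split only on the first ':' so MAC addresses keep their colons
    key, _, value = line.partition(':')
    key, value = key.strip(), value.strip()
    if key == 'Cell':            # a new cell starts a new record
        current = {'Cell': value}
        cells.append(current)
    elif current is not None:
        current[key] = value

# print the headers once, then one row per cell
print(''.join('{:<25}'.format(h) for h in headers))
for cell in cells:
    print(''.join('{:<25}'.format(cell.get(h, '')) for h in headers))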
Before explaining my question, I want to mention that I have looked at various other questions on Stack Overflow but couldn't find any solution related to my problem, so please don't mark this as a duplicate!
I'm working on a Python (3.6) project in which I need to run a terminal command and parse a value from the output, which is in the form of columns.
Here's the command I ran:
output = subprocess.call('kubectl get svc', shell=True)
And here's the output:
b'NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.35.240.1 <none> 443/TCP 28m
node-app1 LoadBalancer 10.35.245.164 35.239.29.239 8000:32249/TCP 26m
Now, I need to get EXTERNAL-IP from the second row and 4th column.
How can I get this value?
You can extract the specific column in the shell itself; this way we avoid the overhead of doing the text processing in Python.
out = subprocess.check_output(["kubectl get svc | awk '{print $4}'"], shell=True)
result = out.decode().split('\n')
print(result[2])
output:
35.239.29.239
The shell is nice for that. How about:
output = subprocess.check_output('kubectl get svc | tr "\t" " " | tr -s " " | cut -d " " -f 4 | tail -1', shell=True)
You could also omit the tail -1, which gives the last line, and do that splitting/filtering in Python.
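For completeness, here is a minimal sketch of that variant with the splitting done in Python instead, assuming kubectl is on the PATH and prints the table shown above (check_output is used rather than call, since call only returns the exit code):

import subprocess

out = subprocess.check_output('kubectl get svc', shell=True).decode('utf-8')
rows = [line.split() for line in out.splitlines() if line.strip()]
# rows[0] is the header; the EXTERNAL-IP sits in the 4th column of the last row
external_ip = rows[-1][3]
print(external_ip)   # e.g. 35.239.29.239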
You can parse the output yourself in python as well:
# Step 1, convert the bytes output into string
output = output.decode('utf-8')
# Step 2, split the string based on the newline character
output = output.split('\n')
# Step 3, split all lines on any whitespace character
output = [o.split() for o in output]
# Step 4, get the correct value as [row][column]
value = output[2][3]
You can use pandas to read the data. Here's a self-contained example (Python 3):
from io import StringIO
import pandas

x = """NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.35.240.1 <none> 443/TCP 28m
node-app1 LoadBalancer 10.35.245.164 35.239.29.239 8000:32249/TCP 26m"""

dataframe = pandas.read_csv(StringIO(x), sep=r"\s+")
# print rows
for index, row in dataframe.iterrows():
    print(row['NAME'], row['CLUSTER-IP'], row['PORT(S)'])
# search for the row with name node-app1 and print the value in the PORT(S) column:
print(dataframe.loc[dataframe['NAME'] == 'node-app1']['PORT(S)'].to_string(index=False))
Using some string manipulation
Demo:
output = b"""NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.35.240.1 <none> 443/TCP 28m
node-app1 LoadBalancer 10.35.245.164 35.239.29.239 8000:32249/TCP 26m"""
output = iter(output.decode().split("\n"))  # decode the bytes, then split into lines
next(output)  # Skip the header line
for i in output:
    print(i.split()[3])  # str.split and get index 3
Output:
<none>
35.239.29.239
I have packets flowing through the network and the monitoring output in a text file. At the start of each block I store the "header time" in a variable, but I only save it to a list if a specific line that says "UI SERVICE MATCH (HJ)" follows roughly 13 lines after it, so it is a find-the-match-after-the-header case.
The data is
******* DCS = 5 ************** 2016-02-05 13:29:13.58 ****
From PC19 to PC02
Network layer link
ESTABLISH INDICATION (88H)
Channel class
- power number : 3
- Timeslot : 0
Link supplier
- Shapi : 0
- Channel type : Duplex
- Normal prio
L3 Information
UI SERVICE MATCH (HJ)
UI SERVICE Type
- channel establishment
******* DCS = 5 ************** 2016-02-05 13:29:18.79 ****
From PC19 to PC02
Network layer link
ESTABLISH INDICATION (88H)
Channel class
- power number : 4
- Timeslot : 0
Call Load
- Slot:32 Busy:1 Access:1
The code is
fh = open("moni.txt")
elements = []
for line in fh:
    line = line.rstrip()
    if "******* DCS = 5" in line:
        U = line.split()
        Y = U[6]
        if 'UI SERVICE MATCH (HJ)' in line:
            elements.append(Y)
print elements
The output:
[]
The desired output:
['13:29:13.58']
You can try to simplify your code with something like:
fh = open("moni.txt")
elements = []
# First, split your file into blocks of data that belong together:
for block in fh.read().split("******* DCS = 5 **************"):
    # Check if the block contains the desired string:
    if 'UI SERVICE MATCH (HJ)' in block:
        # Save the time:
        elements.append(block.split()[1])
print(elements)
# ['13:29:13.58']
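An alternative, in case the header line can vary (a different DCS number, say), is to go line by line and remember the time from the most recent header, appending it only when the match line shows up. A small sketch of that idea, assuming the header format shown in the sample data:

import re

elements = []
last_time = None
with open("moni.txt") as fh:
    for line in fh:
        # remember the time from the most recent "******* DCS = ..." header
        m = re.search(r'DCS = \d+ \*+ \d{4}-\d{2}-\d{2} (\d{2}:\d{2}:\d{2}\.\d{2})', line)
        if m:
            last_time = m.group(1)
        elif 'UI SERVICE MATCH (HJ)' in line and last_time is not None:
            elements.append(last_time)

print(elements)   # ['13:29:13.58']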
Text File
• I.D.: AN000015544
DESCRIPTION: 6 1/2 DIGIT DIGITAL MULTIMETER
MANUFACTURER: HEWLETT-PACKARD MODEL NUM.: 34401A CALIBRATION - DUE DATE:6/1/2016 SERIAL NUMBER: MY45027398
• I.D.: AN000016955
DESCRIPTION: TEMPERATURE CALIBRATOR
MANUFACTURER: FLUKE MODEL NUM.: 724 CALIBRATION - DUE DATE:6/1/2016 SERIAL NUMBER: 1189063
• I.D.: AN000017259
DESCRIPTION: TRUE RMS MULTIMETER
MANUFACTURER: AGILENT MODEL NUM.: U1253A CALIBRATION - DUE DATE:6/1/2016 SERIAL NUMBER: MY49420076
Objective
To read the text file and save the ID number and Serial number of each part into the part_data data structure.
Data Structure
part_data = {'ID': [],
'Serial Number': []}
Code
with open("textfile", 'r') as part_info:
    lineArray = part_info.read().split('\n')
    print(lineArray)
    if "• I.D.: AN000015544 " in lineArray:
        print("I have found the first line")
        ID = [s for s in lineArray if "AN" in s]
        print(ID[0])
My code isn't finding the I.D. or the serial number value. I know it is wrong; I was trying to use the method I got from this website, Text File Reading and Printing Data, to parse the data. Can anyone move me in the right direction for collecting the values?
Update
This solution works with Python 2.7.9 but not 3.4, thanks to domino - https://stackoverflow.com/users/209361/domino:
with open("textfile", 'r') as part_info:
    lineArray = part_info.readlines()
    for line in lineArray:
        if "I.D.:" in line:
            ID = line.split(':')[1]
            print ID.strip()
However, when I initially asked the question I was using Python 3.4, and the solution did not work properly.
Does anyone understand why it doesn't work in Python 3.4? Thank you!
This should print out all your IDs. I think it should move you in the right direction.
with open("textfile", 'r') as part_info:
    lineArray = part_info.readlines()
    for line in lineArray:
        if "I.D.:" in line:
            ID = line.split(':')[1]
            print ID.strip()
It won't work in Python 3 because in Python 3 print is a function. It should end with:
print(ID.strip())
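To actually fill the part_data structure from the question (rather than only printing the IDs), the same split-on-':' idea extends to the SERIAL NUMBER field, which is the last ':'-separated piece of its line in the sample file. A minimal sketch under that assumption; the print call works in both Python 2 and 3:

part_data = {'ID': [],
             'Serial Number': []}

with open("textfile", 'r') as part_info:
    for line in part_info:
        if "I.D.:" in line:
            # the ID follows the last ':' on the line
            part_data['ID'].append(line.split(':')[-1].strip())
        elif "SERIAL NUMBER:" in line:
            # likewise, the serial number follows the last ':'
            part_data['Serial Number'].append(line.split(':')[-1].strip())

print(part_data)
# {'ID': ['AN000015544', ...], 'Serial Number': ['MY45027398', ...]}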
I am parsing an inventory log that looks like the one below, and I want to read the "data" into a list and pair it up with the "info", for example:
+Hardware information
Processor : Intel(R) Xeon(R) CPU E5-2420 v2 # 2.20GHz (24 cores/threads)
Memory : 81877MB
Controller Slot : 0
BIOS : 3.0a 11/12/2013 3.1
IPMI FW rev : 2.20
Canister firmware : 2.2.26^M
Canister firmware date : Feb 5 2013 20:54:00^M
I need to pair up each info entry with its data, for example 'BIOS' being the info and '3.0a 11/12/2013 3.1' being the data.
So, using a zip function, I want to have the info and the data side by side. I need to find a way to parse the inventory log, put the data into the Data list, and pair each entry up with the correct info category (BIOS, Memory, etc.). Any ideas?
Info = ['IPMI FW rev','BIOS','Canister Firmware','Memory','Controller Slot']
Data = ['','','','','']
for I, D in zip(Info, Data):
    print('{0}:{1}'.format(I, D))
How about this:
data = {}
with open('log.txt', 'r') as file_obj:
    for line in file_obj:
        if ':' in line:
            pos = line.index(':')
            data[line[:pos].strip()] = line[pos + 1:].strip()

for key in data:
    print key, ':', data[key]
EDIT:
As the comment below says, this code assumes the file has a consistent structure. If a ':' appears in the field name before the separating ':', this will fail.
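Tying this back to the zip example from the question: once the dict is built, the Data list can be produced in the same order as the Info list and the two zipped together. A short sketch, assuming the log is in 'log.txt' (the filename used above) and noting that the lookup is case-sensitive, so the log's spelling 'Canister firmware' has to be used:

Info = ['IPMI FW rev', 'BIOS', 'Canister firmware', 'Memory', 'Controller Slot']

# parse the log into an {info: data} dict, as in the answer above
data = {}
with open('log.txt', 'r') as file_obj:
    for line in file_obj:
        if ':' in line:
            pos = line.index(':')
            data[line[:pos].strip()] = line[pos + 1:].strip()

# build Data in the same order as Info, then pair the two lists with zip
Data = [data.get(info, '') for info in Info]
for I, D in zip(Info, Data):
    print('{0}: {1}'.format(I, D))
# e.g. BIOS: 3.0a 11/12/2013 3.1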
I have a txt file (which is basically a log file) having blocks of text. Each block or paragraph has certain information about the event. What I need is to extract only a certain information from each block and save it as an array or list.
Each paragraph has the following format:
id: [id] Name: [name] time: [timestamp] user: [username] ip: [ip_address of the user] processing_time: [processing time in seconds]
A sample paragraph can be:
id: 23455 Name: ymalsen time: 03:20:20 user: ymanlls ip: 230.33.45.32 processing_time: 05
What I need to extract from each block is:
id:[]
Name:[]
processing_time: []
So that my resulting array for each block's result would be:
array = [id, name, processing_time]
An issue is that my text files are fairly large and contain thousands of these records. What is the best way to do this in Python (2.7 to be precise)? Once I have each array (corresponding to each record), I will save all of them in a single ND numpy array, and that is it. Any help will be greatly appreciated.
Here is something I am using to plainly extract all the lines starting with ID:
import string
log = 'log_1.txt'
file = open(log, 'r')
name_array = []
line = file.readlines()
for a in line:
    if a.startswith('Name: '):
        ' '.join(a.split())
        name_array.append(a)
But it simply extracts all of those lines and puts them into a single array, which is kind of useless given that I need to follow the parameters of id, name, etc.
If the Name field can contain whitespace, you could extract the data with a regular expression. However, you will then have to convert the values to the corresponding Python types yourself. The following program:
import numpy as np
import re
PAT = re.compile(r"""id:\s*(?P<id>\d+)\s*
                     Name:\s*(?P<name>[0-9A-Za-z ]+?)\s+time:.*
                     processing_time:\s*(?P<ptime>\d+)""", re.VERBOSE)

values = []
fp = open("proba.txt", "r")
for line in fp:
    match = PAT.match(line)
    if match:
        values.append((int(match.group("id")),
                       match.group("name"),
                       int(match.group("ptime"))))
fp.close()
print values
would print as result:
[(23455, 'y malsen', 5), (23455, 'ymalsen', 5)]
for a file "proba.txt" with the content
id: 23455 Name: y malsen time: 03:20:20 user: ymanlls ip: 230.33.45.32 processing_time: 05
id: 23455 Name: ymalsen time: 03:20:20 user: ymanlls ip: 230.33.45.32 processing_time: 05
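Since the stated goal is to end up with a single numpy array, the list of tuples produced by the regex loop can be converted into a structured array directly. A small sketch of that step; the 'a100' string size mirrors the format used in the loadtxt answer below and is an arbitrary choice:

import numpy as np

# values as produced by the regex loop above
values = [(23455, 'y malsen', 5), (23455, 'ymalsen', 5)]

records = np.array(values,
                   dtype=[('id', 'i4'), ('name', 'a100'), ('ptime', 'i4')])
print(records['id'])      # [23455 23455]
print(records['ptime'])   # [5 5]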
You could load your data using numpy's great loadtxt routine into a record array, and extract it from there:
import numpy as np
aa = np.loadtxt("proba.txt", usecols=(1, 3, 11),
                dtype={"names": ("id", "name", "proctime"),
                       "formats": ("i4", "a100", "i4")})
print aa["name"]
print aa["id"]
print aa["proctime"]
The example loads your data from proba.txt and stores it in aa. The appropriate elements (aa["name"], aa["id"], aa["proctime"]) give you a list for each of your columns if you need them separately; otherwise, you already have them in one numpy array. The code above produces:
['ymalsen' 'ymalsen']
[23455 23455]
[5 5]
for the file proba.txt with the following content:
id: 23455 Name: ymalsen time: 03:20:20 user: ymanlls ip: 230.33.45.32 processing_time: 05
id: 23455 Name: ymalsen time: 03:20:20 user: ymanlls ip: 230.33.45.32 processing_time: 05
However, please note that this assumes that no whitespace appears within the field contents. Whitespace between the fields is fine, though.