Reading log files in Python

I have a log file (it is named data.log) containing data that I would like to read and manipulate.
The file is structured as follows:
'''
#Comment line 1
#Comment line 2
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182
'''
I would like to extract the numbers from the last two columns and store them in separate arrays or lists. I tried doing this by creating an empty list, but I do not know how to fill it from a log file with a given name. Could someone help me with this? I am a beginner programmer.
The expected output I would like to obtain is:
list1 = [-7.70828175, -7.71506536, -7.72185550]
list2 = [-0.45288215, -0.45123853, -0.44959182]
Thank you in advance!

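Try it this way. The check makes sure each row splits into exactly 6 fields, which also skips the comment lines.
list1 = []
list2 = []
with open('data.log') as f:
    for line in f:
        fields = line.strip().split(',')
        # Keep only data rows; comment lines do not split into 6 fields
        if len(fields) == 6:
            # Convert the text to numbers before storing it
            list1.append(float(fields[4]))
            list2.append(float(fields[5]))
print(list1)
print(list2)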
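A variant of the same idea, sketched with the standard csv module. It assumes comment lines start with # and that data rows have exactly six comma-separated fields, as in the question; the function name last_two_columns is made up for illustration, and the sample lines are the first rows from the question.

```python
import csv


def last_two_columns(lines):
    """Return columns 5 and 6 of comma-separated lines as two float lists.

    `lines` can be any iterable of strings, including an open file object.
    """
    list1, list2 = [], []
    for row in csv.reader(lines):
        # Skip blank lines, '#' comment lines, and rows of unexpected width.
        if not row or row[0].startswith("#") or len(row) != 6:
            continue
        list1.append(float(row[4]))
        list2.append(float(row[5]))
    return list1, list2


# Sample input copied from the question, as the file would yield it.
sample = [
    "#Comment line 1\n",
    "#Comment line 2\n",
    "1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215\n",
    "2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853\n",
]
list1, list2 = last_two_columns(sample)
print(list1)
print(list2)
```

With the real file, the open file object can be passed in directly, for example inside a with open('data.log', newline='') as f: block.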

Related

Python - Get item from a list under a list

I have a list like below.
list = [[Name,ID,Age,mark,subject],[karan,2344,23,87,Bio],[karan,2344,23,87,Mat],[karan,2344,23,87,Eng]]
I need to get only the name 'karan' as output.
How can I get that?
This is a 2D list.
list[i][j]
gives you the j-th item within the i-th inner list.
So to get karan you want list[1][0].
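A runnable sketch of that indexing, assuming the names and subjects in the question are meant to be strings. The variable is called rows here (an illustrative name) so that it does not shadow the built-in list.

```python
# The nested list from the question, with the text values quoted as strings.
rows = [
    ["Name", "ID", "Age", "mark", "subject"],
    ["karan", 2344, 23, 87, "Bio"],
    ["karan", 2344, 23, 87, "Mat"],
    ["karan", 2344, 23, 87, "Eng"],
]

# rows[1] is the first data row (rows[0] holds the headers);
# [0] then picks the first item of that row.
name = rows[1][0]
print(name)
```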
I upvoted Lio Elbammalf, but decided to provide an answer that made a couple of assumptions that should have been clarified in the question:
1. The first item of the list is the header row. It is actually in the list (and not just there as part of the question), and it is included because there is no guarantee that the headers will always be in the same order.
2. This is probably a CSV file.
Ignoring 2 for the moment, what you would want to do is remove the "headers" from the list (so that the rest of the list is uniform), and then find the index of "Name" (your desired output).
myinput = [["Name","ID","Age","mark","subject"],
           ["karan",2344,23,87,"Bio"],
           ["karan",2344,23,87,"Mat"],
           ["karan",2344,23,87,"Eng"]]
def getnames(myinput):
    ## Remove the headers from the list to simplify everything
    headers = myinput.pop(0)
    ## Figure out where to find the person's Name
    nameindex = headers.index("Name")
    ## Return a list of the Name in each row
    return [stats[nameindex] for stats in myinput]
If the name is guaranteed to be the same in each row, then you can just return myinput[0][nameindex] like is suggested in the other answer
Now, if 2 is true, I'm assuming you're using the csv module, in which case load the file using the DictReader class and then just access each row using the 'Name' key:
import csv

def loadfile(myfile):
    with open(myfile, newline='') as f:
        reader = csv.DictReader(f)
        return list(reader)

def getname(rows):
    ## This is the same return as above, and again you can just
    ## return rows[0]['Name'] if you know you only need the first one
    return [row['Name'] for row in rows]
In Python 3 you can do this
_, [x, _, _, _, _], *_ = ls
Now x will be karan.

in my for loop .split() only works once

So I have this app. It records data from the accelerometer. Here's the format of the data. It's saved in a .csv file.
time, x, y, z
0.000000,0.064553,-0.046095,0.353776
Here's the variables declared at the top of my program.
length = sum(1 for line in open('couchDrop.csv'))
wholeList = [length]
timeList = [length]#holds a list of all the times
xList = [length]
yList = [length]
zList = [length]
I'm trying to create four lists; time, x, y, z. Currently the entire data file is stored in one list. Each line in the list contains 4 numbers representing time, x, y, and z. I need to split wholeList into the four different lists. Here's the subroutine where this happens:
def easySplit():
    print "easySplit is go "
    for i in range (0, length):
        current = wholeList[i]
        current = current[:-2] #gets rid of a symbol that may be messing things up
        print current
        print i
        timeList[i], xList[i], yList[i], zList[i] = current.split(',')
        print timeList[i]
        print xList[i]
        print yList[i]
        print zList[i]
        print ""
Here's the error I get:
Traceback (most recent call last):
  File "/home/william/Desktop/acceleration plotter/main.py", line 105, in <module>
    main()
  File "/home/william/Desktop/acceleration plotter/main.py", line 28, in main
    easySplit()
  File "/home/william/Desktop/acceleration plotter/main.py", line 86, in easySplit
    timeList[i], xList[i], yList[i], zList[i] = current.split(',')
IndexError: list assignment index out of range
Another weird thing is that my dot split seems to work fine the first time through the loop.
Any help would be greatly appreciated.
For data manipulation like this, I'd recommend using Pandas. In one line, you can read the data from the CSV into a Dataframe.
import pandas as pd
df = pd.read_csv("couchDrop.csv")
Then, you can easily select each column by name. You can work with the data as a pd.Series, as an np.array, or convert it to a list. For example:
x_list = list(df.x)
You can find more information about pandas at http://pandas.pydata.org/pandas-docs/stable/10min.html.
EDIT: The error with your original code is that syntax like xList = [length] does not create a list of length length. It creates a list of length one containing a single int element with the value of length.
The line wholeList = [length] doesn't create a list with length elements. It creates a list with only one element, the integer length, so if you were to print(wholeList) you would see only that one number, e.g. [3].
Since lists are mutable in Python, you can just write wholeList = [] and keep appending elements to it. It doesn't have to be of a specific length.
The same applies to timeList, xList, yList and zList. Each of them has only index 0, which is why the assignment in the loop works on the first iteration and raises IndexError on the second.
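The fix described in these answers can be sketched in plain Python as follows: start from empty lists and append to them instead of assigning by index. The sketch assumes the file starts with the time, x, y, z header row shown in the question; the function name split_columns is illustrative only.

```python
def split_columns(lines):
    """Split 'time,x,y,z' text lines into four lists of floats."""
    time_list, x_list, y_list, z_list = [], [], [], []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the header row (assumed to start with "time").
        if not line or line.startswith("time"):
            continue
        t, x, y, z = (float(field) for field in line.split(","))
        # append() grows each list as needed, so no length is fixed up front.
        time_list.append(t)
        x_list.append(x)
        y_list.append(y)
        z_list.append(z)
    return time_list, x_list, y_list, z_list


# Header and data row copied from the question.
sample = ["time, x, y, z\n", "0.000000,0.064553,-0.046095,0.353776\n"]
times, xs, ys, zs = split_columns(sample)
print(times, xs, ys, zs)
```

An open file object can be passed in place of sample, since iterating over a file yields its lines.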

read data from file to a 2d array [Python]

I have a file .txt like this:
8.3713312149,0.806817531586,0.979428482338,0.20179159543
5.00263547897,2.33208847046,0.55745770379,0.830205341157
0.0087910592556,4.98708152771,0.56425779093,0.825598658777
and I want data to be saved in a 2d array such as
array = [[8.3713312149,0.806817531586,0.979428482338,0.20179159543],[5.00263547897,2.33208847046,0.55745770379,0.830205341157],[0.0087910592556,4.98708152771,0.56425779093,0.825598658777]]
I tried with this code
#!/usr/bin/env python
checkpoints_from_file[][]
def read_checkpoints():
    global checkpoints_from_file
    with open("checkpoints.txt", "r") as f:
        lines = f.readlines()
        for line in lines:
            checkpoints_from_file.append(line.split(","))
    print checkpoints_from_file
if __name__ == '__main__':
    read_checkpoints()
but it does not work.
Can you guys tell me how to fix this? thank you
You have two errors in your code. The first is that checkpoints_from_file[][] is not a valid way to initialize a multidimensional array in Python. Instead, you should write
checkpoints_from_file = []
This initializes a one-dimensional array, and you then append arrays to it in your loop, which creates a 2D array with your data.
You are also storing the entries in your array as strings, but you likely want them as floats. You can use the float function as well as a list comprehension to accomplish this.
checkpoints_from_file.append([float(x) for x in line.split(",")])
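Putting both fixes together, a minimal sketch might look like this. The function name parse_checkpoints is hypothetical, and it takes the lines as an argument rather than using a global so that it can be tried without the file; the sample rows are the ones from the question.

```python
def parse_checkpoints(lines):
    """Turn comma-separated text lines into a list of lists of floats."""
    checkpoints = []
    for line in lines:
        line = line.strip()
        if not line:  # ignore blank lines, e.g. an empty line at the end
            continue
        checkpoints.append([float(x) for x in line.split(",")])
    return checkpoints


# The three rows from the question, as readlines() would return them.
sample = [
    "8.3713312149,0.806817531586,0.979428482338,0.20179159543\n",
    "5.00263547897,2.33208847046,0.55745770379,0.830205341157\n",
    "0.0087910592556,4.98708152771,0.56425779093,0.825598658777\n",
]
checkpoints = parse_checkpoints(sample)
print(checkpoints)
```

To read the real data, the file object can be passed in directly inside a with open("checkpoints.txt") as f: block.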
Reading from your file,
def read_checkpoints():
    checkpoints_from_file = []
    with open("checkpoints.txt", "r") as f:
        lines = f.readlines()
        for line in lines:
            # strip() removes the trailing newline before splitting
            checkpoints_from_file.append(line.strip().split(","))
    print(checkpoints_from_file)
if __name__ == '__main__':
    read_checkpoints()
Or assuming you can read this data successfully, using a string literal,
lines = """8.3713312149,0.806817531586,0.979428482338,0.20179159543
5.00263547897,2.33208847046,0.55745770379,0.830205341157
0.0087910592556,4.98708152771,0.56425779093,0.825598658777"""
and a list comprehension,
list_ = [[decimal for decimal in line.split(",")] for line in lines.split("\n")]
Expanded,
checkpoints_from_file = []
for line in lines.split("\n"):
    list_of_decimals = []
    for decimal in line.split(","):
        list_of_decimals.append(decimal)
    checkpoints_from_file.append(list_of_decimals)
print(checkpoints_from_file)
Your errors:
Unlike in some languages, in Python you don't initialize a list with checkpoints_from_file[][]. Instead, you can initialize a one-dimensional list with checkpoints_from_file = [] and then insert more lists inside of it with Python's list.append().

Python Dynamic Data Structures

I am going to read the lines of a given text file and select several chunks of data whose format is (int, int\n). The number of lines is different every time, so I need a dynamically sized data structure in Python. I would also like to store those chunks in a 2D data structure. If you are familiar with MATLAB programming, I'd like to have something like a structure A{n}, where n is the number of chunks of data and each chunk includes several lines of the data mentioned above.
Which type of data structure would you recommend? and how to implement with it?
i.e. A{0} = ([1,2],[2,3],[3,4]) A{1} = ([1,1],[2,2],[5,5],[7,4]) and so on.
Thank you
A Python list can contain other lists as well as items of any other data type.
l = []
l.append(2) # l is now [2]
l.extend([3,2]) # l is now [2, 3, 2]
l.append([4,5]) # l is now [2, 3, 2, [4, 5]]
list.append adds whatever it is given as a single item at the end of the list,
while list.extend adds each element of its argument to the end of the list.
I guess your required list would look something like this:
l = [[[1,2],[2,3],[3,4]],[[1,1],[2,2],[5,5],[7,4]]]
PS: Here's a link to jump-start learning Python:
https://learnxinyminutes.com/docs/python/
Just keep in mind that if you are reading data from a text file, each value is a string, so you need to use int() to convert it to an int.
The issue was resolved by appending to the list in two steps.
file = 'data.txt'
str2 = '.PEN_DOWN\n'
str3 = '.PEN_UP\n'
A = []  # lines of the chunk currently being read
B = []  # all completed chunks
with open(file) as f:
    for line in f:
        if line == str2:
            # Start of a chunk: begin with a fresh list
            A = []
        elif line == str3:
            # End of a chunk: store it and start a new one
            B.append(A)
            A = []
        elif line[:1].isdigit():
            # Data line: store it without the trailing newline
            A.append(line.rstrip('\n'))
if A:
    # Keep a final chunk that was not closed by .PEN_UP
    B.append(A)
print(len(B))
print([len(chunk) for chunk in B])
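A sketch that combines the chunking with the int() conversion mentioned earlier. It assumes the .PEN_DOWN and .PEN_UP marker lines used in the code above and data lines of the form "1, 2"; the function name read_chunks and its parameters are illustrative, and a malformed data line would raise ValueError. The sample reproduces the A{0} and A{1} example from the question.

```python
def read_chunks(lines, start=".PEN_DOWN", end=".PEN_UP"):
    """Group 'int, int' lines between start/end markers into nested lists."""
    chunks = []
    current = None  # None while we are outside a chunk
    for line in lines:
        line = line.strip()
        if line == start:
            current = []
        elif line == end:
            if current is not None:
                chunks.append(current)
            current = None
        elif current is not None and line:
            a, b = line.split(",")
            # int() tolerates the space left over after the comma
            current.append([int(a), int(b)])
    return chunks


# Sample lines matching the two chunks given in the question.
sample = [
    ".PEN_DOWN\n", "1, 2\n", "2, 3\n", "3, 4\n", ".PEN_UP\n",
    ".PEN_DOWN\n", "1, 1\n", "2, 2\n", "5, 5\n", "7, 4\n", ".PEN_UP\n",
]
chunks = read_chunks(sample)
print(chunks)
```

Each chunk is an ordinary list, so chunks of different lengths are not a problem, and chunks[0] plays the role of A{0} in the MATLAB notation.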

Python list index not found in loading list from text file

The assignment was to get a user to input 4 numbers, then store them in a text file, open that text file, show the 4 numbers on different lines, then get the average of those numbers and display it to the user.
Here is my code so far:
__author__ = 'Luca Sorrentino'
numbers = open("Numbers", 'r+')
numbers.truncate() #OPENS THE FILE AND DELETES THE PREVIOUS CONTENT
# Otherwise it prints out all the inputs into the file ever
numbers = open("Numbers", 'a') #Opens the file so that it can be added to
liist = list() #Creates a list called liist
def entry(): #Defines a function called entry, to enable the user to enter numbers
    try:
        inputt = float(input("Please enter a number")) #Stores the users input as a float in a variable
        liist.append(inputt) #Appends the input into liist
    except ValueError: #Error catching that loops until input is correct
        print("Please try again. Ensure your input is a valid number in numerical form")
        entry() #Runs entry function again to enable the user to retry.
x = 0
while x < 4: # While loop so that the program collects 4 numbers
    entry()
    x = x + 1
for inputt in liist:
    numbers.write("%s\n" % inputt) #Writes liist into the text file
numbers.close() #Closes the file
numbers = open("Numbers", 'r+')
output = (numbers.readlines())
my_list = list()
my_list.append(output)
print(my_list)
print(my_list[1])
The problem is loading the numbers back from the text file and then storing each one as a variable so that I can get the average of them.
I can't seem to find a way to specifically locate each number, just each byte which is not what I want.
Your list (my_list) has only 1 item - a list with the items you want.
You can see this if you try print(len(my_list)), so your print(my_list[1]) is out of range because the item with index = 1 does not exist.
When you create an empty list and append output, you are adding one item to the list, which is what the variable output holds for a value.
To get what you want just do
my_list = list(output)
You'll have two main problems.
First, .append() is for adding an individual item to a list, not for adding one list to another. Because you used .append() you've ended up with a list containing one item, and that item is itself a list... not what you want, and the explanation for your error message. For concatenating one list to another .extend() or += would work, but you should ask yourself whether that is even necessary in your case.
Second, your list elements are strings and you want to work with them as numbers. float() will convert them for you.
In general, you should investigate the concept of "list comprehensions". They make operations like this very convenient. The following example creates a new list whose members are the respectively float()ed versions of your .readlines() output:
my_list = [float(x) for x in output]
The ability to add conditionals into a list comprehension is also a real complexity-saver. For example, if you wanted to skip any blank/whitespace lines that had crept into your file:
my_list = [float(x) for x in output if len(x.strip())]
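To finish the assignment, the converted list can be averaged with sum() and len(). This is only a sketch: the function name average_from_lines is made up, and the sample list stands in for what numbers.readlines() would return after entering 1, 2, 3 and 4.

```python
def average_from_lines(lines):
    """Convert text lines to floats and return their arithmetic mean."""
    # Skip blank or whitespace-only lines, as in the comprehension above.
    values = [float(x) for x in lines if x.strip()]
    if not values:
        raise ValueError("no numbers found")
    return sum(values) / len(values)


# Stand-in for the result of numbers.readlines().
output = ["1.0\n", "2.0\n", "3.0\n", "4.0\n"]
average = average_from_lines(output)
print(average)  # prints 2.5
```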
You can change the end of your program a little and it will work:
output = numbers.readlines()
# this line uses a list comprehension to make
# a new list without newlines
output = [i.strip() for i in output]
for num in output:
    print(num)
# prints:
# 1.0
# 2.0
# 3.0
# 4.0
print(sum(float(i) for i in output))
# prints: 10.0
