in my for loop .split() only works once


So I have this app. It records data from the accelerometer. Here's the format of the data. It's saved in a .csv file.
time, x, y, z
0.000000,0.064553,-0.046095,0.353776
Here are the variables declared at the top of my program.
length = sum(1 for line in open('couchDrop.csv'))
wholeList = [length]
timeList = [length]#holds a list of all the times
xList = [length]
yList = [length]
zList = [length]
I'm trying to create four lists; time, x, y, z. Currently the entire data file is stored in one list. Each line in the list contains 4 numbers representing time, x, y, and z. I need to split wholeList into the four different lists. Here's the subroutine where this happens:
def easySplit():
    print "easySplit is go "
    for i in range (0, length):
        current = wholeList[i]
        current = current[:-2] #gets rid of a symbol that may be messing things up
        print current
        print i
        timeList[i], xList[i], yList[i], zList[i] = current.split(',')
        print timeList[i]
        print xList[i]
        print yList[i]
        print zList[i]
        print ""
Here's the error I get:
Traceback (most recent call last):
File "/home/william/Desktop/acceleration plotter/main.py", line 105, in <module>
main()
File "/home/william/Desktop/acceleration plotter/main.py", line 28, in main
easySplit()
File "/home/william/Desktop/acceleration plotter/main.py", line 86, in easySplit
timeList[i], xList[i], yList[i], zList[i] = current.split(',')
IndexError: list assignment index out of range
Another weird thing is that my dot split seems to work fine the first time through the loop.
Any help would be greatly appreciated.

For data manipulation like this, I'd recommend using pandas. In one line, you can read the data from the CSV into a DataFrame.
import pandas as pd
# skipinitialspace=True strips the space after each comma, in case the header really is "time, x, y, z"
df = pd.read_csv("couchDrop.csv", skipinitialspace=True)
Then you can easily select each column by name and work with it as a pd.Series, as a NumPy array, or as a plain list. For example:
x_list = list(df.x)
You can find more information about pandas at http://pandas.pydata.org/pandas-docs/stable/10min.html.
EDIT: The error with your original code is that syntax like xList = [length] does not create a list of length length. It creates a list of length one containing a single int element with the value of length.

The line wholeList = [length] doesn't create a list of length length. It creates a list with a single element, the integer length, so if your file has three lines, print(wholeList) shows only [3].
Since lists are mutable in Python, you can just write wholeList = [] and keep appending elements to it. It doesn't have to be a specific length.
The same applies to timeList, xList, yList, and zList: each one has only index 0, which is why the split appears to work on the first pass through the loop and the assignment at i = 1 raises IndexError.
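To make the difference concrete, here is a minimal sketch in Python 3. It assumes a few hard-coded rows in place of couchDrop.csv (only the first data row comes from the question; the others are invented), shows that [length] is a one-element list, and then builds the four lists by appending with the standard csv module rather than pre-sizing them.

```python
import csv
import io

# Hypothetical sample standing in for couchDrop.csv (header plus three rows).
sample = io.StringIO(
    "time, x, y, z\n"
    "0.000000,0.064553,-0.046095,0.353776\n"
    "0.010000,0.061234,-0.044321,0.351111\n"
    "0.020000,0.059876,-0.041234,0.349999\n"
)

length = 4
one_element = [length]        # a list holding the single int 4
presized = [None] * length    # a list with 4 slots, if pre-sizing is really wanted
print(len(one_element), len(presized))

time_list, x_list, y_list, z_list = [], [], [], []
reader = csv.reader(sample)
next(reader)                  # skip the header row
for t, x, y, z in reader:
    # csv yields strings; convert to float before plotting or doing arithmetic
    time_list.append(float(t))
    x_list.append(float(x))
    y_list.append(float(y))
    z_list.append(float(z))

print(time_list)
print(z_list)
```

Appending avoids having to know the number of rows up front, so the separate line-counting pass over the file is no longer needed.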

Related

How to iterate a command (loop) through files in a list in Python

I'm new to Python and I'm trying to write a brief script. I want to run a loop that reads many files and runs a command on each one. In particular, I want to do a calculation across the two rows of every file and return an output whose name refers to the corresponding file.
I was able to load the files into a list ('work'). I wrote the inner loop for the calculation and it runs correctly with one of the files in the list. The problem is that I can't iterate it over all the files and obtain each 'integr' value from the corresponding file.
Let me show what I tried to do:
import numpy as np
#I'm loading the files that contain the values with which I want to do my calculation in a loop
work = {}
for i in range(0,100):
    work[i] = np.loadtxt('work{}.txt'.format(i), float).T
#Now I'm trying to write a double loop in which I want to iterate the second loop (the calculation) over the files (that don't have the same length) in the list
integr = 0
for k in work:
    for i in range(1, len(k[1,:])):
        integr = integr + k[1,i]*(k[0,i] - k[0,i-1])
    #I would like to print every 'integr' which comes from the calculation over each file
    print(integr)
When I try to run this, I obtain this message error:
Traceback (most recent call last):
File "lavoro.py", line 11, in <module>
for i in range(1, len(k[1,:])):
TypeError: 'int' object has no attribute '__getitem__'
Thank you in advance.

I'm guessing a bit, but if I understood correctly, you want work to be a list and not a dictionary. Even if that wasn't your intent, a list works just as well as a dictionary in this context.
This is how you can create your work list:
work = []
for i in range(0,100):
    work.append(np.loadtxt('work{}.txt'.format(i), float).T)
Or using the equivalent list comprehension of the above loop (usually the list comprehension is faster):
work = [np.loadtxt('work{}.txt'.format(i), float).T for i in range(100)]
Now you can loop over the work list to do your calculations (I assume they are correct, no way for me to check this):
for k in work:
    integr = 0
    for i in range(1, len(k[1,:])):
        integr = integr + k[1,i]*(k[0,i] - k[0,i-1])
Note that I moved integr = 0 inside the outer loop so that it is reinitialized to 0 for each file; otherwise each inner loop would add to the result of the previous ones.
However, if that was the desired behaviour, move integr = 0 back outside the loop as in your original code.

Guessing from the context, you wanted:
for k in work.values():
Iterating over a dictionary produces only its keys, not its values.
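As a small illustration of the point about dictionary iteration, the sketch below uses plain nested lists as hypothetical stand-ins for the arrays returned by np.loadtxt (the numbers are invented). It shows that iterating the dictionary directly yields keys, then uses .items() to compute one integr per entry and store it under the same key.

```python
# Hypothetical stand-in for the loaded files: key -> [x_row, y_row].
work = {
    0: [[0.0, 1.0, 2.0], [5.0, 1.0, 2.0]],
    1: [[0.0, 0.5, 1.0, 1.5], [9.0, 4.0, 4.0, 4.0]],
}

keys_seen = [k for k in work]          # iterating a dict yields its keys

results = {}
for name, (xs, ys) in work.items():    # .items() yields (key, value) pairs
    integr = 0.0                       # reset for every file
    for i in range(1, len(ys)):
        integr += ys[i] * (xs[i] - xs[i - 1])
    results[name] = integr             # keep each result tied to its file

print(keys_seen)
print(results)
```

Keeping the results in a dictionary keyed like work is one way to relate each value back to its file; a list of results would do equally well if work is a list.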

Python: My directory is not giving individual value output

I have written code that uses xlrd to import data into two dictionaries in Python.
Code:
import xlrd
#category.clear()
#term.clear()
# raw string so that the backslashes in the Windows path are not treated as escapes
book = xlrd.open_workbook(r"C:\Users\Koen\Google Drive\etc...etc..")
sheet = book.sheet_by_index(0)
num_rows = sheet.nrows
for i in range(1,num_rows,1):
    category = {i:( sheet.cell_value(i, 0))}
    term = {i:( sheet.cell_value(i, 1))}
When I open one of the two dictionaries (category or term), it presents me with a list of values.
print(category[i])
So far, so good.
However, when I try to open an individual value with
print(category["2"])
it consistently gives me an error:
Traceback (most recent call last):
File "testfile", line 15, in <module>
print(category["2"])
KeyError: '2'
The keys are indeed numbered (as determined by i).
I've already tried [], {}, "", '', etc. Nothing works.
As I need those values later on in the code, I would like to know what the cause of the key-error is.
Thanks in advance for taking a look!

First off, you are reassigning category and term in every iteration of the for loop, so each dictionary only ever holds one key, ending with the last index. If your sheet has 100 rows, the dict will only have the key 99. To fix this, define the dictionaries outside the loop and assign the keys inside it, as follows:
category = {}
term = {}
for i in range(1, num_rows, 1):
    category[i] = (sheet.cell_value(i, 0))
    term[i] = (sheet.cell_value(i, 1))
Second, because the keys come from for i in range(1, num_rows, 1):, they are integers, so you have to access them like category[1], not category["2"]. If you want string keys, convert them when storing, for example by assigning to category[str(i)].
I hope this clarifies the problem.
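Both points can be reproduced without a spreadsheet. The sketch below is only an illustration: it assumes a hypothetical list of tuples in place of the xlrd sheet, builds the dictionaries outside the loop, and shows that the integer key 2 works while the string key "2" raises KeyError.

```python
# Hypothetical rows standing in for the spreadsheet: (category, term) per row.
# Row 0 plays the role of the header, which the original loop skips.
rows = [
    ("Category", "Term"),
    ("fruit", "apple"),
    ("fruit", "pear"),
    ("vegetable", "leek"),
]

category = {}
term = {}
for i in range(1, len(rows)):
    category[i] = rows[i][0]   # integer key
    term[i] = rows[i][1]

print(category[2])             # the integer key works

try:
    category["2"]              # a string key was never stored
except KeyError as exc:
    missing = exc.args[0]
    print("KeyError for key", repr(missing))

# If string keys are preferred, convert when storing.
category_by_str = {str(i): rows[i][0] for i in range(1, len(rows))}
print(category_by_str["2"])
```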

Python Array error: "list indices must be integers or slices, not list"

I'm new to Python and trying to add two strings to a key-value array.
Here's my code:
import os
from numpy import genfromtxt
import re
script_dir = os.path.dirname(r'C:/Users/Kenny/Desktop/pythonReports/')
my_data = genfromtxt('allreports.csv', delimiter=',', dtype=None)
pattern_id = re.compile(r'(?<=eventid\=)(.*)(?=&key)', flags=re.DOTALL)
pattern_key = re.compile(r'(?<=key\=)(.*)(?=&cb)', flags=re.DOTALL)
id_key = {}
for row in my_data:
    eventid = pattern_id.findall(row.decode('utf-8'))
    eventkey = pattern_key.findall(row.decode('utf-8'))
    id_key[eventid] = eventkey
print(id_key)
This basically takes a url, and extracts two things from it. I want to then take those two things and create an associative array (key/value) with those two pieces of information.
Example data is: {123456, 412F5BFE1D8A33BC}
There are hundreds of URLs, hence the need for an array.
The error I'm getting is:
Traceback (most recent call last):
File "script.py", line 20, in <module>
id_key[eventid] = [eventkey]
TypeError: list indices must be integers or slices, not list
Thanks for any help with this, and in case it's needed, I'm using Python3.

First of all, you want an associative array, so use a dict instead of a list. Second, findall returns a list and you want the element.
id_key = {} # replaced [] with {}
for row in my_data:
    eventid = pattern_id.findall(row.decode('utf-8'))[0] # note added [0]
    eventkey = pattern_key.findall(row.decode('utf-8'))[0]
    id_key[eventid] = eventkey
But if you're going for style points, I'd recommend a dict comprehension:
id_key = {pattern_id.findall(row.decode('utf-8'))[0]:
          pattern_key.findall(row.decode('utf-8'))[0] for row in my_data}
or one more way:
def id_and_key(line):
    return (pattern_id.findall(line)[0],
            pattern_key.findall(line)[0])
id_key = dict(id_and_key(row.decode('utf-8')) for row in my_data)
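One caveat with findall(...)[0] is that it raises IndexError on any row where the pattern does not match. The sketch below is a possible variation, not the answer's method: it reuses the question's two patterns with re.search and skips rows without a match. The URLs are hypothetical; only the eventid/key/cb parameter order is taken from the patterns.

```python
import re

pattern_id = re.compile(r'(?<=eventid\=)(.*)(?=&key)', flags=re.DOTALL)
pattern_key = re.compile(r'(?<=key\=)(.*)(?=&cb)', flags=re.DOTALL)

# Hypothetical URLs used only for illustration.
urls = [
    "http://example.com/report?eventid=123456&key=412F5BFE1D8A33BC&cb=1",
    "http://example.com/report?eventid=654321&key=0A1B2C3D4E5F6071&cb=2",
    "http://example.com/report?other=1",   # no match: skipped below
]

id_key = {}
for url in urls:
    id_match = pattern_id.search(url)
    key_match = pattern_key.search(url)
    if id_match is None or key_match is None:
        continue                 # findall(...)[0] would raise IndexError here
    id_key[id_match.group(1)] = key_match.group(1)

first_ids = pattern_id.findall(urls[0])   # findall returns a list, not a string
print(first_ids)
print(id_key)
```

Because findall returns a list, using its result directly as a dictionary key or list index fails; taking a single string (via [0] or group(1)) is what makes the assignment work.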

.split from a file and putting it in an array

I'm reading a file with some information in which each part is separated by a #. I want each line to become a separate array, so I did this, and I'm not sure why it's not working.
main_file = open("main_file.txt","r")
main_file_info=main_file.readlines()
test=[]
n=0
for line in main_file_info:
    test[n]=line.split("#")
    test=test[n][1:len(test)-1] # to get rid of empty strings at the start and the end
    print(test)# see what comes out
main_file.close()

The way you are inserting the output of line.split("#") into your list is wrong. Your list is empty, so there is no element at index n to assign to. What you need to do is this:
test.append(line.split("#"))
Or you can pre-size your list as below, so that every index you assign to already exists (a comprehension avoids the [[]] * n pitfall of every slot sharing one list object):
test = [[] for _ in range(len(main_file_info))]

Pre-size the list, where total is the number of lines you expect:
test = [None for _ in range(total)]
# instead of test = []
or simply append to test:
test.append( line.split("#") )
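For reference, here is a minimal sketch of the append approach in Python 3. It assumes two hypothetical lines in place of main_file.txt, with every field wrapped in # as the question describes, and also shows the IndexError that indexed assignment into an empty list produces.

```python
# Hypothetical file contents; each field is wrapped in '#'.
lines = [
    "#alice#30#london#\n",
    "#bob#25#paris#\n",
]

test = []
error_message = None
try:
    test[0] = lines[0].split("#")    # fails: an empty list has no index 0
except IndexError as exc:
    error_message = str(exc)

for line in lines:
    parts = line.strip().split("#")  # strip() drops the trailing newline first
    test.append(parts[1:-1])         # drop the empty strings at both ends

print(error_message)
print(test)
```

Slicing a separate variable (parts) instead of reassigning test also avoids overwriting the outer list on every pass, which the question's loop does.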

Python list index not found in loading list from text file

The assignment was to get a user to input 4 numbers, then store them in a text file, open that text file, show the 4 numbers on different lines, then get the average of those numbers and display it to the user.
Here is my code so far:
__author__ = 'Luca Sorrentino'
numbers = open("Numbers", 'r+')
numbers.truncate() #OPENS THE FILE AND DELETES THE PREVIOUS CONTENT
# Otherwise it prints out all the inputs into the file ever
numbers = open("Numbers", 'a') #Opens the file so that it can be added to
liist = list() #Creates a list called liist
def entry(): #Defines a function called entry, to enable the user to enter numbers
    try:
        inputt = float(input("Please enter a number")) #Stores the users input as a float in a variable
        liist.append(inputt) #Appends the input into liist
    except ValueError: #Error catching that loops until input is correct
        print("Please try again. Ensure your input is a valid number in numerical form")
        entry() #Runs entry function again to enable the user to retry.
x = 0
while x < 4: # While loop so that the program collects 4 numbers
    entry()
    x = x + 1
for inputt in liist:
    numbers.write("%s\n" % inputt) #Writes liist into the text file
numbers.close() #Closes the file
numbers = open("Numbers", 'r+')
output = (numbers.readlines())
my_list = list()
my_list.append(output)
print(my_list)
print(my_list[1])
The problem is loading the numbers back from the text file and then storing each one as a variable so that I can get the average of them.
I can't seem to find a way to specifically locate each number, just each byte which is not what I want.

Your list (my_list) has only one item: a list containing the items you want.
You can see this if you try print(len(my_list)). Your print(my_list[1]) is out of range because the item with index 1 does not exist.
When you create an empty list and append output, you add a single item to the list, namely the whole list that output refers to.
To get what you want, just do
my_list = list(output)

You have two main problems.
First, .append() is for adding an individual item to a list, not for adding one list to another. Because you used .append() you've ended up with a list containing one item, and that item is itself a list... not what you want, and the explanation for your error message. For concatenating one list to another .extend() or += would work, but you should ask yourself whether that is even necessary in your case.
Second, your list elements are strings and you want to work with them as numbers. float() will convert them for you.
In general, you should investigate the concept of "list comprehensions". They make operations like this very convenient. The following example creates a new list whose members are the respectively float()ed versions of your .readlines() output:
my_list = [float(x) for x in output]
The ability to add conditionals into a list comprehension is also a real complexity-saver. For example, if you wanted to skip any blank/whitespace lines that had crept into your file:
my_list = [float(x) for x in output if len(x.strip())]
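Putting the two points together, a minimal sketch might look like this. It assumes output already holds what numbers.readlines() would return after the user entered 1, 2, 3 and 4 (hard-coded here so the example runs without the file), contrasts append with extend, and then computes the average the assignment asks for.

```python
# Hypothetical result of numbers.readlines() after the user entered 1, 2, 3, 4.
output = ["1.0\n", "2.0\n", "3.0\n", "4.0\n"]

nested = []
nested.append(output)      # one item, which is itself a list
flat = []
flat.extend(output)        # four items, one per line

# float() ignores surrounding whitespace, so the trailing newline is harmless
values = [float(x) for x in output if x.strip()]
average = sum(values) / len(values)

print(len(nested), len(flat))
print(average)
```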

You can change the end of your program a little and it will work:
output = numbers.readlines()
# this line uses a list comprehension to make
# a new list without new lines
output = [i.strip() for i in output]
for num in output:
    print(num)
# prints:
# 1.0
# 2.0
# 3.0
# 4.0
print(sum(float(i) for i in output))
# prints: 10.0
