How to access an array that contains float variables in Python

I have a defined function:
import random

def makeRandomList(values):
    length = len(values)
    new_list = []
    for i in range(length):
        random_num = random.randint(0, length-1)
        new_list.append(values[random_num]*1.0)  # take random samples
    return new_list
which should just take some samples from an input array values. I have imported such an array as a .csv spreadsheet. Two problems occur:
The array should look like this:
['0', '0']
['1.200408', '29629.0550890999947']
['2.438112', '322162.385751669993']
['3.443816', '511142.915559189975']
['4.500272', '703984.472568470051']
['5.505976', '579295.304300419985']
['6.562432', '703984.472568470051']
['7.568136', '579295.304300419985']
['8.624592', '703984.472568470051']
Which I know through these lines:
import csv
with open('ThruputCSV.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter = ',')
    v = []
    for row in readCSV:
        print(row)
If I replace print(row) with v.append(row[1]), the resulting v looks like this:
['',
'0',
'29629.0550890999947',
'322162.385751669993',
'511142.915559189975',
'703984.472568470051',
'579295.304300419985',
'703984.472568470051',
'579295.304300419985',
'703984.472568470051']
which is correct except for the first entry. Why is the first entry empty?
Now, when I run some code (if you're interested, it was kindly shared by one user here), the makeRandomList function, given v as the values parameter, throws an error: can't multiply sequence by non-int of type float
I cannot figure out what the error is. To me, v seems to be an array that contains float values, and that should be fine. The error occurs in this line: new_list.append(values[random_num]*1.0), in which random_num, an integer, is just the index of the element of v that I want to access. Does this mean I am not allowed to use append with an array that contains float variables?

You are reading that error wrong. The float value is not the issue.
The error says that a 'sequence' cannot be multiplied by a float. Floats and ints can be multiplied with each other, but a sequence cannot be multiplied by a float.
The actual problem is that your array values are strings. Note the ' around each one of them. Strings are sequences. Convert them to floats and your code will work.
for i in range(length):
    random_num = random.randint(0, length-1)
    new_list.append(float(values[random_num])*1.0)
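Putting the conversion into the whole function, a minimal sketch might look like the following. The snake_case name make_random_list and the step that skips blank cells are assumptions added here (the latter to cope with the empty first entry shown in the question); neither is part of the original answer. random.choice picks a random element directly, which has the same effect as indexing with random.randint.

```python
import random

def make_random_list(values):
    """Sample len(values) items with replacement, converting each to float."""
    # Assumption: blank cells (such as the leading '') should be ignored.
    numbers = [float(v) for v in values if v.strip()]
    # Draw as many samples as there are usable numbers.
    return [random.choice(numbers) for _ in numbers]

v = ['', '0', '29629.0550890999947', '322162.385751669993']
sample = make_random_list(v)
print(sample)
```

Every element of sample is a float, so any later arithmetic on it behaves as expected.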
Edit:
It was pointed out that I originally said sequences cannot be multiplied by floats or ints. To clarify: you cannot multiply each element of a sequence by an int or float just by multiplying the sequence itself. Multiplying a whole sequence by an int repeats its elements (for example, [1, 2] * 2 gives [1, 2, 1, 2]), and multiplying it by a float raises the error above. That is useful knowledge in some cases, but it does not solve this particular question.
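The difference between the three cases can be checked in a few lines. This is only an illustrative sketch; the variable names are made up for the example.

```python
text = '1.5'

# Multiplying a sequence by an int repeats it.
repeated_text = text * 2      # '1.51.5'
repeated_list = [1.5] * 3     # [1.5, 1.5, 1.5]

# Multiplying a sequence by a float is not defined and raises TypeError.
try:
    text * 1.0
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Converting the string first gives ordinary arithmetic.
product = float(text) * 1.0   # 1.5
```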

Related

Strip all string and make Numpy array from list

I have a list that contains dictionaries holding string and float data, e.g. [{a:'DOB', b:'weight', c:height}, {a:12.2, b:12.5, c:11.5}, {a:'DOB', b:33.5, c:33.2}].
I want to convert this to a NumPy array, stripping all keys and string values so that only the float values pass into the array, and then use it to work out some stats, e.g. [[12.2,12.5,11.5], ['', 33.5, 33.2]].
Where a whole row consists of strings it should be omitted, but where a single item in a row is a string it should be kept as a null value.
I'm not sure how to achieve this.
This answer combines all the suggestions in the comments. The procedure loops through the initial list of dictionaries and does the following:
Creates a new list using a list comprehension, saving each dictionary value as a float, or None for non-floats.
Counts the number of float values in the new list and appends the list if there is at least 1 float.
Creates an array from the new list using np.array().
I added the missing quotes to the starting code in the original post.
Also, in future posts you should at least make an effort to code something, then ask for help where you get stuck.
import numpy as np

test_d = [{'a':'DOB', 'b':'weight', 'c':'height'},
          {'a':12.2, 'b':12.5, 'c':11.5},
          {'a':'DOB', 'b':33.5, 'c':33.2}]
arr_list = []
for d in test_d:
    d_list = [x if isinstance(x, float) else None for x in d.values()]
    check = sum(isinstance(x, float) for x in d_list)
    if check > 0:
        arr_list.append(d_list)
print(arr_list)
arr = np.array(arr_list)
print(arr)
For reference, here is the list comprehension converted to a standard loop with if/else logic:
for d in test_d:
    # List comprehension converted to a loop with if/else below:
    d_list = []
    for x in d.values():
        if isinstance(x, float):
            d_list.append(x)
        else:
            d_list.append(None)
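If the goal is to compute statistics, one possible variation (not part of the answer above) is to store missing entries as np.nan instead of None. The array then gets a float dtype rather than object, and NaN-aware functions such as np.nanmean can skip the gaps. A minimal sketch under that assumption, using the same sample data:

```python
import numpy as np

test_d = [{'a': 'DOB', 'b': 'weight', 'c': 'height'},
          {'a': 12.2, 'b': 12.5, 'c': 11.5},
          {'a': 'DOB', 'b': 33.5, 'c': 33.2}]

rows = []
for d in test_d:
    # Replace every non-float value with NaN.
    row = [x if isinstance(x, float) else np.nan for x in d.values()]
    # Keep the row only if at least one entry is a real number.
    if not all(np.isnan(x) for x in row):
        rows.append(row)

arr = np.array(rows, dtype=float)
column_means = np.nanmean(arr, axis=0)  # per-column mean, ignoring NaN
print(arr)
print(column_means)
```

Whether NaN or None is the better placeholder depends on what the later code expects; NaN is simply the more convenient choice for numeric work.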

I am writing a mergesort3 function that is supposed to take an array of numbers or values and divide them into 3 separate arrays for sorting

I am having an issue with my code, as it will not execute and displays a TypeError message in the format:
list indices must be integers or slices, not list.
I believe that my mergesort function is functional, but I am not entirely sure.
I believe that the issue comes from the part of the code where I am trying to read from a file that contains lists of numbers that are unsorted.
with open(filelocation) as fl:
    line = fl.readline()
    while line:
        line = line.split()
        for i in range(1, len(line)):
            # converting read elements into integer values for sorting
            line[i] = int(line[i])
        input.append(line)
        line = fl.readline()
with open('merge3.txt', 'w') as f:
    for i in input:
        mergeSort3(input[i], 0, 1 / 3 * len(input), 2 / 3 * len(input), len(input), input())
    print(input)
The error pops up on the last 2 lines of this code, or rather, the mergeSort3 statement when I call the parameters of my initial function: def mergeSort3(arr1, low, mid1, mid2, high, arr2).
I am also writing my code in Python, although that is probably not the issue.
Any help is appreciated!
Essentially what I expect from this is to be able to open the file data.txt and read each of the values, and then sort that data via merge sort /3 (dividing the arrays into thirds) and write that data into a new text file called merge3.txt.
When you append line to input, line is a list of integers. Therefore, input is a list of lists.
When you loop through input, i will be one of those lists.
You then attempt to index input with the list. Hence the error telling you that you can't use a list as an index.
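The point can be illustrated without the sorting function itself. The sketch below makes several assumptions: the file is replaced by an in-memory stand-in, every token on a line is treated as a number, and the list is called rows rather than input so that the built-in input() is not shadowed. It also uses integer division for the split points, since 1 / 3 * len(...) produces a float, which cannot be used as a list index either.

```python
import io

# Stand-in for the data file; the real file name and layout are assumptions.
fake_file = io.StringIO("3 1 2\n9 8 7 6\n")

rows = []
for line in fake_file:
    rows.append([int(token) for token in line.split()])

# Looping over a list yields its elements, not their positions.
for row in rows:
    n = len(row)
    mid1, mid2 = n // 3, 2 * n // 3  # integer division keeps the indices ints
    print(row, mid1, mid2, n)

# Using an element as an index reproduces the error from the question.
try:
    rows[rows[0]]
except TypeError as exc:
    message = str(exc)
    print(message)
```

Inside the loop, row is already the list to be sorted, so it can be passed to the sorting function directly instead of input[i].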

How to select a list element and convert it to "float"?

So, I'm currently in the middle of a Python exercise, and I'm struggling with something. I have to make a multiplication between two variables, a and X, but I select a inside a list, and a is returned as a list with a single element (for example [0.546] instead of 0.546.)
I'd like to convert it into a float element (so 0.546, without the brackets, in my example), but float() doesn't accept a list as an argument. It's probably simple to do, but I'm kind of a Python beginner, and I can't find the answer I want on the Internet (it doesn't help that I'm French.)
Here's my code :
for (i,individual) in iris.iterrows():
    if pd.isnull(individual["petal_width"]):
        a = coeffs["case 1"]['a'] #here I want to select the element 'a' as a float
        b = coeffs["case 1"]['b'] #same thing for b
        X = individual["petal_length"]
        Y = a*X + b
By using different print commands, I know that X is a float, so the problem comes from a and b.
Also, when I want to do the multiplication, it says "can't multiply sequence by non-int of type float"
Thanks in advance for any help, cheers!
Just wrap the value in float(), like this:
a = float(coeffs["case 1"]['a'])
If you want to convert a whole list, not a dict, then you can do this:
my_float_list = [float(x) for x in my_list]
or
my_float_list = list(map(float, my_list))
EDIT: It seems that coeffs["case 1"]['a'] is a list of floats. In that case, do this:
# if you are sure that there is at least one value
a = coeffs["case 1"]['a'][0]
# if you are unsure whether there is at least one value (falls back to 1.0)
a = coeffs["case 1"]['a'][0] if len(coeffs["case 1"]['a']) > 0 else 1.0
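A small self-contained check of the indexing approach. The contents of coeffs and the value of X below are made up for illustration; only the shape (a dict of dicts whose values are one-element lists) follows the question.

```python
# Hypothetical data shaped like the structure described in the question.
coeffs = {"case 1": {"a": [0.546], "b": [1.2]}}
X = 2.0

# A one-element list times a float fails, as in the question.
try:
    coeffs["case 1"]["a"] * X
except TypeError as exc:
    error_message = str(exc)

# Taking element 0 yields plain floats, so the arithmetic works.
a = coeffs["case 1"]["a"][0]
b = coeffs["case 1"]["b"][0]
Y = a * X + b
print(Y)
```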

Converting list with 1 variable to float in Python

I have a list variable with one element,
x=['2']
and I want to convert it to a float:
x=2.0
I tried float(x), or int(x) - without success.
Can anyone please help me?
You need to convert the first item in your one-item list to a float. The approaches you tried already are trying to convert the whole list to a float (or an int - not sure where you were going with that!).
Python is zero-indexed (index numbers start from zero) which means that the first item in your list is referred to as x[0].
So the snippet you need is:
x = float(x[0])
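As a sketch of the difference, the snippet below shows the failing call next to the working one. The tuple-unpacking variant at the end is an optional extra, not something the answer requires: it raises ValueError if the list does not contain exactly one item, which can be a useful safeguard.

```python
x = ['2']

# Passing the whole list to float() fails.
try:
    float(x)
except TypeError as exc:
    error_message = str(exc)

# Converting the first (and only) item works.
value = float(x[0])

# Optional: unpacking insists on exactly one item in the list.
(only_item,) = x
strict_value = float(only_item)
print(value, strict_value)
```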

List index out of range, with split()

I am learning Python, and am trying to learn data.split(). I found the following in another StackOverflow question (link here), discussing appending a file in Python.
I have created biki.txt per the above link. Here's my code:
import re
import os
import sys
with open("biki.txt","r") as myfile:
    mydata = myfile.read()
    data = mydata.replace("http","%http")
for m in range (1,1000):
    dat1 = data.split("%")[m]
    f = open ("new.txt", "a")
    f.write(dat1)
    f.close()
But when I run the above, I get the error:
dat1 = data.split("%")[m]
IndexError: list index out of range
How come? I can't find documentation as to what that [m] does, but removing it doesn't fix the issue. (If I remove [m], then the error changes and says that the argument to f.write(dat1) must be a string or read-only character buffer.)
Thank you for any help or ideas!
First, you need to understand what is happening with m in your code. Consider:
for m in range(1,1000):
    print(m)
In the first iteration, the value of m will be 1.
In each following iteration (as long as m is less than 1000), m increases by 1: if m was 1 in the previous iteration, it will be 2 in this one.
Second, you need to understand that the expression data.split('%') splits a string wherever it finds a '%' character and returns a list.
For example, assuming:
data = "one%two%three%four%five"
numbers = data.split('%')
numbers will be a list with five elements like this:
numbers = ['one','two','three','four','five']
To get an element of a list, you subscript the list, which means using the [] operator with an index number (you can do a lot more, such as slicing):
numbers[0] # will return 'one'
numbers[1] # will return 'two'
...
numbers[4] # will return 'five'
Note that the first element on a list has index 0.
The list numbers has 5 elements, and the indexing starts with 0, so, the last element will have index 4. If you try to subscript with an index higher than 4, the Python Interpreter will raise an IndexError since there is no element at such index.
Your code generates a list with fewer elements than the range you created, so the valid indexes run out before the for loop is done. For example, if data.split("%") returns 500 elements, the last valid index is 499 (don't forget that list indexes start at 0), and an IndexError is raised as soon as m reaches 500.
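The behaviour described above can be reproduced with the five-element example. This is an illustrative sketch; the names error_message and collected are made up for it.

```python
data = "one%two%three%four%five"
numbers = data.split('%')

# Index 5 does not exist in a five-element list (valid indexes are 0 to 4).
try:
    numbers[5]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)

# Bounding the range by the actual length avoids the error.
collected = [numbers[m] for m in range(1, len(numbers))]
print(collected)

# enumerate gives both index and element without manual indexing.
for index, word in enumerate(numbers):
    print(index, word)
```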
If I understood what you want to do, you can achieve it with this code:
with open("input.txt","r") as file_input:
    raw_text = file_input.read()
formated_text = raw_text.replace("http","%http")
data_list = formated_text.split("%")
with open("output.txt","w") as file_output:
    for data in data_list:
        file_output.write(data+'\n') # writing one URL per line ;)
You should just iterate over data.split("%"):
for dat1 in data.split("%"):
    f.write(dat1)
Now you only split once (rather than on every iteration), the list doesn't have to contain 1000+ items (which was the cause of the IndexError), and f.write() receives a string rather than a list (the source of the other error).
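Here is a sketch of the same idea on an in-memory string, so it can be run without any files. The sample text and the example.com-style URLs are invented for illustration. Note that when the text starts with "http", the first piece returned by split is an empty string, which is why the original loop started at index 1; filtering out empty pieces handles that case without relying on the index.

```python
# Made-up sample standing in for the contents of the input file.
raw_text = "http://a.example/onehttp://b.example/two"

# Mark the start of every URL, then split once on the marker.
marked = raw_text.replace("http", "%http")
pieces = marked.split("%")

# Drop empty pieces (the text before the first marker is empty here).
urls = [piece for piece in pieces if piece]

# One URL per line, ready to be written to a file.
output = "\n".join(urls)
print(output)
```

One caveat: URLs that contain percent-encoded characters (such as %20) would also be split at those points, so a marker that cannot occur in the data would be safer.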
