Splitting integers from string in python - python

Suppose I have a NumPy array like this:
A = ['83.56%' '2.74%' '2.74%' '4.11%' '4.11%' '19.18%' '76.71%' '20.55%'
'34.25%' '54.79%']
and I want to strip the percent signs so that only the numbers remain, like this:
B = ['83.56' '2.74' '2.74' '4.11' '4.11' '19.18' '76.71' '20.55'
'34.25' '54.79']
How can I do this in Python?

Use-case for str.rstrip:
B = [item.rstrip('%') for item in A]
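If the result should stay a NumPy array, or if actual numbers are needed rather than strings, the same idea can be sketched with np.char.rstrip followed by a cast to float. The sample array below is a shortened copy of the one in the question, and the name B_float is only illustrative.

```python
import numpy as np

# Shortened sample of the array from the question.
A = np.array(['83.56%', '2.74%', '19.18%'])

# Strip the trailing '%' from every element; the result is still an array of strings.
B = np.char.rstrip(A, '%')

# Cast to float when numeric values are needed (B_float is an illustrative name).
B_float = B.astype(float)

print(B)
print(B_float)
```

The list comprehension above gives a plain Python list; this variant keeps everything as arrays.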

Related

How can I iterate over a numpy multidimensional array of dataframes?

I'm trying to iterate over a multidimensional array in Python, but I'm having problems because my array is full of dataframes instead of integers.
I have a multidimensional numpy array (12, 11) which contains 12 x 11 different dataframes.
nombres_df = np.array([[df_01_ID4034, df_02_ID4034, df_03_ID4034, df_04_ID4034, df_05_ID4034, df_06_ID4034, df_07_ID4034, df_08_ID4034, df_09_ID4034, df_10_ID4034, df_11_ID4034, df_12_ID4034],
[df_01_ID4035, df_02_ID4035, df_03_ID4035, df_04_ID4035, df_05_ID4035, df_06_ID4035, df_07_ID4035, df_08_ID4035, df_09_ID4035, df_10_ID4035, df_11_ID4035, df_12_ID4035],
[df_01_ID4039, df_02_ID4039, df_03_ID4039, df_04_ID4039, df_05_ID4039, df_06_ID4039, df_07_ID4039, df_08_ID4039, df_09_ID4039, df_10_ID4039, df_11_ID4039, df_12_ID4039],
[df_01_ID4040, df_02_ID4040, df_03_ID4040, df_04_ID4040, df_05_ID4040, df_06_ID4040, df_07_ID4040, df_08_ID4040, df_09_ID4040, df_10_ID4040, df_11_ID4040, df_12_ID4040],
[df_01_ID4041, df_02_ID4041, df_03_ID4041, df_04_ID4041, df_05_ID4041, df_06_ID4041, df_07_ID4041, df_08_ID4041, df_09_ID4041, df_10_ID4041, df_11_ID4041, df_12_ID4041],
[df_01_ID4042, df_02_ID4042, df_03_ID4042, df_04_ID4042, df_05_ID4042, df_06_ID4042, df_07_ID4042, df_08_ID4042, df_09_ID4042, df_10_ID4042, df_11_ID4042, df_12_ID4042],
[df_01_ID4047, df_02_ID4047, df_03_ID4047, df_04_ID4047, df_05_ID4047, df_06_ID4047, df_07_ID4047, df_08_ID4047, df_09_ID4047, df_10_ID4047, df_11_ID4047, df_12_ID4047],
[df_01_ID4049, df_02_ID4049, df_03_ID4049, df_04_ID4049, df_05_ID4049, df_06_ID4049, df_07_ID4049, df_08_ID4049, df_09_ID4049, df_10_ID4049, df_11_ID4049, df_12_ID4049],
[df_01_ID4056, df_02_ID4056, df_03_ID4056, df_04_ID4056, df_05_ID4056, df_06_ID4056, df_07_ID4056, df_08_ID4056, df_09_ID4056, df_10_ID4056, df_11_ID4056, df_12_ID4056],
[df_01_ID4059, df_02_ID4059, df_03_ID4059, df_04_ID4059, df_05_ID4059, df_06_ID4059, df_07_ID4059, df_08_ID4059, df_09_ID4059, df_10_ID4059, df_11_ID4059, df_12_ID4059],
[df_01_ID4075, df_02_ID4075, df_03_ID4075, df_04_ID4075, df_05_ID4075, df_06_ID4075, df_07_ID4075, df_08_ID4075, df_09_ID4075, df_10_ID4075, df_11_ID4075, df_12_ID4075]], dtype="object")
for j in range(len(nombres_df)):
    for i in range(len(nombres_df[i])):
        print (nombres_df[i][j])
I need to iterate over it and make operations with values inside each dataframe.
The problem is that when I try to iterate as usual, I cannot do it because I'm getting this error:
5
6 for j in range(len(nombres_df)):
7 for i in range(len(nombres_df[i])): <--------
8 print (nombres_df[i][j], end = " ")
IndexError: arrays used as indices must be of integer (or boolean) type
I know the problem is here, in len(nombres_df[i]), but I don't know how to solve it.
Thank you very much
I think the problem is that you are iterating over the wrong index in the second line of your code.
The i inside range(len(nombres_df[i])) should be j. At that point the inner loop has not assigned i yet, so presumably a leftover value of i from earlier code is used as the index, which would explain the IndexError.
You also inverted the indexes in nombres_df[i][j]; it should be nombres_df[j][i].
This should do the trick:
for j in range(len(nombres_df)):
    for i in range(len(nombres_df[j])):
        print(nombres_df[j][i])
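As an alternative to nested range(len(...)) loops, a 2-D object array can also be walked with np.ndindex, which yields every (row, column) pair for a given shape. This is only a sketch: small dicts stand in for the DataFrames so that it runs without pandas, and the names grid and visited are made up for the example.

```python
import numpy as np

# Stand-ins for the DataFrames: an object array can hold any Python object.
grid = np.empty((2, 3), dtype=object)
for r in range(2):
    for c in range(3):
        grid[r, c] = {"row": r, "col": c}

visited = []
# np.ndindex yields every (row, col) index pair of the given shape, row by row.
for r, c in np.ndindex(grid.shape):
    cell = grid[r, c]          # with real data this would be one DataFrame
    visited.append((cell["row"], cell["col"]))

print(visited)
```

Because the index pairs come from the array's own shape, the row and column indexes cannot be swapped by accident.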

Converting a list into a matrix in Python

Suppose you have a list of elements, say:
res =
['(18,430)', '(19,430)', '(19,429)', '(19,428)', '(19,427)', '(18,426)', '(17,426)', '(17,425)', '(17,424)', '(17,423)', '(17,422)', '(17,421)', '(17,420)', '(16,421)', '(14,420)', '(11,419)', '(9,417)', '(7,416)', '(4,414)', '(3,414)', '(2,412)', '(1,412)', '(-1,410)', '(-2,409)', '(-2,408)', '(-3,407)', '(-3,406)', '(-3,405)', '(-3,404)', '(-3,403)', '(-3,402)', '(-3,401)', '(-3,400)', '(-4,399)', '(-4,398)', '(-5,398)', '(-6,398)', '(-7,397)', '(-7,396)', '(-6,395)', '(-5,395)', '(-4,393)', '(-3,391)', '(6,384)', '(12,378)', '(24,370)', '(42,358)', '(107,304)', '(151,255)', '(207,196)', '(259,121)', '(389,-28)', '(456,-84)', '(515,-134)', '(569,-182)', '(650,-260)', '(688,-294)', '(723,-317)', '(740,-328)', '(762,-342)', '(767,-347)', '(768,-349)', '(769,-352)', '(769,-357)', '(769,-359)', '(768,-361)', '(768,-364)', '(766,-370)', '(765,-371)', '(764,-374)', '(763,-376)', '(761,-378)', '(760,-381)', '(758,-385)', '(752,-394)', '(747,-401)', '(742,-407)', '(735,-413)', '(724,-421)', '(719,-424)', '(718,-425)', '(717,-425)'], ['(18,430)', '(19,430)', '(19,429)', '(19,428)', '(19,427)', '(18,426)', '(17,426)', '(17,425)', '(17,424)', '(17,423)', '(17,422)', '(17,421)', '(17,420)', '(16,421)', '(14,420)', '(11,419)', '(9,417)', '(7,416)', '(4,414)', '(3,414)', '(2,412)', '(1,412)', '(-1,410)', '(-2,409)', '(-2,408)', '(-3,407)', '(-3,406)', '(-3,405)', '(-3,404)', '(-3,403)', '(-3,402)', '(-3,401)', '(-3,400)', '(-4,399)', '(-4,398)', '(-5,398)', '(-6,398)', '(-7,397)', '(-7,396)', '(-6,395)', '(-5,395)', '(-4,393)', '(-3,391)', '(6,384)', '(12,378)', '(24,370)', '(42,358)', '(107,304)', '(151,255)', '(207,196)', '(259,121)', '(389,-28)', '(456,-84)', '(515,-134)', '(569,-182)', '(650,-260)', '(688,-294)', '(723,-317)', '(740,-328)', '(762,-342)', '(767,-347)', '(768,-349)', '(769,-352)', '(769,-357)', '(769,-359)', '(768,-361)', '(768,-364)', '(766,-370)', '(765,-371)', '(764,-374)', '(763,-376)', '(761,-378)', '(760,-381)', '(758,-385)', '(752,-394)', 
'(747,-401)', '(742,-407)', '(735,-413)', '(724,-421)', '(719,-424)', '(718,-425)', '(717,-425)']
and we want to turn these values into a matrix whose entries we can update.
All the values in the list should serve as both the row labels and the column labels of the matrix.
Basically:
row1 = '(18,430)', row2 = '(19,430)', row3 = '(19,429)',.....,rown='(717,-425)', column1 = '(18,430)', column2 = '(19,430)', column3 = '(19,429)', ..... ,columnn= '(717,-425)'
How can we do that in Python, so that I can later update values in the rows and columns? I tried repeating the list and turning the result into a matrix,
but it does not give me what I want.
Res_List = [res,res]
print(np.array(Res_List))
So I am still wondering how we can do this in Python.
I also tried:
mat = np.array([res,res]).T
print(mat)
and it kind of gives me what I want but not quite.
This gives me:
[['(18,430)' '(18,430)']
['(19,430)' '(19,430)']
['(19,429)' '(19,429)']
['(19,428)' '(19,428)']
['(19,427)' '(19,427)']
['(18,426)' '(18,426)']
['(17,426)' '(17,426)']
['(17,425)' '(17,425)']
['(17,424)' '(17,424)']
['(17,423)' '(17,423)']
['(17,422)' '(17,422)']
['(17,421)' '(17,421)']
['(17,420)' '(17,420)']
['(16,421)' '(16,421)']
['(14,420)' '(14,420)']
['(11,419)' '(11,419)']
['(9,417)' '(9,417)']
['(7,416)' '(7,416)']
['(4,414)' '(4,414)']
['(3,414)' '(3,414)']
['(2,412)' '(2,412)']
['(1,412)' '(1,412)']
['(-1,410)' '(-1,410)']
['(-2,409)' '(-2,409)']
['(-2,408)' '(-2,408)']
['(-3,407)' '(-3,407)']
['(-3,406)' '(-3,406)']
['(-3,405)' '(-3,405)']
['(-3,404)' '(-3,404)']
['(-3,403)' '(-3,403)']
['(-3,402)' '(-3,402)']
['(-3,401)' '(-3,401)']
['(-3,400)' '(-3,400)']
['(-4,399)' '(-4,399)']
['(-4,398)' '(-4,398)']
['(-5,398)' '(-5,398)']
['(-6,398)' '(-6,398)']
['(-7,397)' '(-7,397)']
['(-7,396)' '(-7,396)']
['(-6,395)' '(-6,395)']
['(-5,395)' '(-5,395)']
['(-4,393)' '(-4,393)']
['(-3,391)' '(-3,391)']
['(6,384)' '(6,384)']
['(12,378)' '(12,378)']
['(24,370)' '(24,370)']
['(42,358)' '(42,358)']
['(107,304)' '(107,304)']
['(151,255)' '(151,255)']
['(207,196)' '(207,196)']
['(259,121)' '(259,121)']
['(389,-28)' '(389,-28)']
['(456,-84)' '(456,-84)']
['(515,-134)' '(515,-134)']
['(569,-182)' '(569,-182)']
['(650,-260)' '(650,-260)']
['(688,-294)' '(688,-294)']
['(723,-317)' '(723,-317)']
['(740,-328)' '(740,-328)']
['(762,-342)' '(762,-342)']
['(767,-347)' '(767,-347)']
['(768,-349)' '(768,-349)']
['(769,-352)' '(769,-352)']
['(769,-357)' '(769,-357)']
['(769,-359)' '(769,-359)']
['(768,-361)' '(768,-361)']
['(768,-364)' '(768,-364)']
['(766,-370)' '(766,-370)']
['(765,-371)' '(765,-371)']
['(764,-374)' '(764,-374)']
['(763,-376)' '(763,-376)']
['(761,-378)' '(761,-378)']
['(760,-381)' '(760,-381)']
['(758,-385)' '(758,-385)']
['(752,-394)' '(752,-394)']
['(747,-401)' '(747,-401)']
['(742,-407)' '(742,-407)']
['(735,-413)' '(735,-413)']
['(724,-421)' '(724,-421)']
['(719,-424)' '(719,-424)']
['(718,-425)' '(718,-425)']
['(717,-425)' '(717,-425)']]
but what I want is to keep the columns as they are and have the rows be the same as the columns, so that we can update and put values into the matrix.
Maybe what you want is a dict:
matrix = {
    k: {l: 0 for l in res}
    for k in res
}
All the values are initialized to 0.
You can easily update values in matrix; for example, you can increase the value of a 'cell' by one:
matrix['(18,430)']['(19,430)'] += 1
or set it to a specific value:
matrix['(18,430)']['(19,430)'] = 10
and retrieve it:
val = matrix['(18,430)']['(19,430)']
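If a real NumPy array is preferred over nested dicts, one possible sketch is to keep a label-to-position mapping next to a square array of zeros. The three labels below are a shortened stand-in for res, and labels and index are hypothetical helper names, not part of the question.

```python
import numpy as np

# Shortened stand-in for the `res` list from the question.
res = ['(18,430)', '(19,430)', '(19,429)']

# dict.fromkeys drops duplicate labels while keeping their order.
labels = list(dict.fromkeys(res))
index = {label: pos for pos, label in enumerate(labels)}

# Square matrix of zeros: one row and one column per label.
matrix = np.zeros((len(labels), len(labels)), dtype=int)

# Update and read cells by label instead of by integer position.
matrix[index['(18,430)'], index['(19,430)']] += 1
matrix[index['(19,429)'], index['(18,430)']] = 10
val = matrix[index['(18,430)'], index['(19,430)']]

print(matrix)
```

The dict of dicts is simpler for occasional lookups; the array form is more convenient once whole rows or columns need arithmetic.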
You can use NumPy.
To convert a list to a matrix-like array, you should write it as a list of lists (or tuples). Your list contains strings, so we first convert the strings to tuples as follows:
new_list = [eval(i) for i in res]
I used eval because your strings are in tuple form, so we can tell Python to treat them as a chunk of code.
Then let's convert this new_list to an array as follows:
import numpy as np
matrix = np.array(new_list)
Now you can access your matrix elements as matrix[i, j], where i and j are the row and column respectively. To change the value at a certain location, just assign it as usual:
matrix[i, j] = new_value
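A note on eval: it executes whatever code is in the string, so for strings that are not fully trusted ast.literal_eval is usually a safer swap, since it only accepts Python literals such as tuples and numbers. A minimal sketch with a shortened stand-in for res (the name pairs is illustrative):

```python
import ast
import numpy as np

# Shortened stand-in for the `res` list from the question.
res = ['(18,430)', '(-1,410)', '(389,-28)']

# literal_eval parses each string as a Python literal without executing code.
pairs = [ast.literal_eval(item) for item in res]

# Each tuple becomes one row, giving an array with two columns.
matrix = np.array(pairs)

# Cells can then be updated by position as usual.
matrix[0, 1] = 431

print(matrix)
```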

Reading a line with scientific numbers (like 0.4E-03)

I would like to process the following line (output of a Fortran program) from a file, with Python:
74 0.4131493371345440E-03 -0.4592776407685850E-03 -0.1725046324754540
and obtain an array such as:
[74,0.4131493371345440e-3,-0.4592776407685850E-03,-0.1725046324754540]
My previous attempts do not work. In particular, if I do the following :
with open(filename,"r") as myfile:
    line=np.array(re.findall(r"[-+]?\d*\.*\d+",myfile.readline())).astype(float)
I have the following error :
ValueError: could not convert string to float: 'E-03'
Steps:
Get list of strings (str.split(' '))
Get rid of "\n" (del arr[-1])
Turn list of strings into numbers (Converting a string (with scientific notation) to an int in Python)
Code:
import decimal # you may also leave this out and use `float` instead of `decimal.Decimal()`
arr = "74 0.4131493371345440E-03 -0.4592776407685850E-03 -0.1725046324754540 \n"
arr = arr.split(' ')
del arr[-1]
arr = [decimal.Decimal(x) for x in arr]
# do your np stuff
Result:
>>> print(arr)
[Decimal('74'), Decimal('0.0004131493371345440'), Decimal('-0.0004592776407685850'), Decimal('-0.1725046324754540')]
PS:
I don't know if you wrote the file that gives the output in the first place, but if you did, you could just think about outputting an array of float() / decimal.Decimal() from that file instead.
@ant.kr Here is a possible solution:
import numpy as np
# Initial data
a = "74 0.4131493371345440E-03 -0.4592776407685850E-03 -0.1725046324754540 \n"
# Given the structure of the initial data, we can proceed as follows:
# - split the string at each space; this produces a list whose last
#   element is "\n", which the [:-1] slice drops
# - convert each remaining element to a float and store the values in a
#   NumPy array.
line = np.array([float(i) for i in a.split(" ")[:-1]])
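The regex in the question fails because it has no exponent part, so 'E-03' is matched separately as '-03' style fragments and cannot be converted cleanly. Two possible sketches follow: splitting on any whitespace (which also copes with several spaces between columns and the trailing newline), and a regex extended with an optional exponent. The sample line is assumed to have extra spaces, as Fortran output often does; the names values, number and values_re are illustrative.

```python
import re
import numpy as np

# Sample line; the extra spaces are an assumption about the real file layout.
line = "  74  0.4131493371345440E-03 -0.4592776407685850E-03 -0.1725046324754540\n"

# Option 1: split() with no argument splits on any run of whitespace and
# ignores the trailing newline; float() understands E notation.
values = np.array([float(tok) for tok in line.split()])

# Option 2: a regex that also matches an optional exponent such as E-03.
number = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
values_re = np.array([float(tok) for tok in re.findall(number, line)])

print(values)
print(values_re)
```

Option 1 is usually enough when the line contains nothing but numbers; the regex is only worth it when other text is mixed in.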

Convert python string into numpy array [duplicate]

I am hoping someone could help me convert python string into numpy array. Essentially, given that I have a Python string like this:
'[ 0.11591 0.044932 0.66926 -0.67844 0.47253 -0.84737\n 1.0734 -0.075396 -0.22688 0.84021 -0.46608 0.019941\n -0.0020394 -0.13038 0.8911 -0.40015 0.52048 0.69283\n -0.10257 0.54296 -0.416 0.36585 0.96078 0.50816\n 0.50144 0.66489 -0.79224 0.44567 0.90822 -0.67522\n 0.047322 0.48399 -0.53316 0.76157 -0.86072 0.091377\n 0.30159 -1.194 0.8679 -0.58691 0.48712 -0.66167\n -0.24265 -0.18849 -0.19353 0.0014832 0.88768 0.36672\n 0.16211 0.56235 ]'
I want to convert it into a 1x50 dimensional array in Python. Is there any efficient way of doing it? Thanks in advance.
EDIT: How did I get that string? It was initially a numpy array stored as a value in a dictionary. I saved it into the database in a column of type TEXT, and afterward I loaded the text containing the numpy array back from the database.
Given you have such a string:
line = '[ 0.11591 0.044932 0.66926 -0.67844 0.47253 -0.84737\n 1.0734 -0.075396 -0.22688 0.84021 -0.46608 0.019941\n -0.0020394 -0.13038 0.8911 -0.40015 0.52048 0.69283\n -0.10257 0.54296 -0.416 0.36585 0.96078 0.50816\n 0.50144 0.66489 -0.79224 0.44567 0.90822 -0.67522\n 0.047322 0.48399 -0.53316 0.76157 -0.86072 0.091377\n 0.30159 -1.194 0.8679 -0.58691 0.48712 -0.66167\n -0.24265 -0.18849 -0.19353 0.0014832 0.88768 0.36672\n 0.16211 0.56235 ]'
Just remove the opening and closing brackets from it, split it, and convert the elements to numbers (in Python 3, map returns an iterator, so wrap it in list):
list(map(float, line[1:-2].split()))
Or just use the numpy.fromstring function:
numpy.fromstring(line[1:-2], dtype=float, sep=' ')
This is one way to solve it:
import numpy as np
import re
txt = '[ 0.11591 0.044932 0.66926 -0.67844 0.47253 -0.84737\n 1.0734 -0.075396 -0.22688 0.84021 -0.46608 0.019941\n -0.0020394 -0.13038 0.8911 -0.40015 0.52048 0.69283\n -0.10257 0.54296 -0.416 0.36585 0.96078 0.50816\n 0.50144 0.66489 -0.79224 0.44567 0.90822 -0.67522\n 0.047322 0.48399 -0.53316 0.76157 -0.86072 0.091377\n 0.30159 -1.194 0.8679 -0.58691 0.48712 -0.66167\n -0.24265 -0.18849 -0.19353 0.0014832 0.88768 0.36672\n 0.16211 0.56235 ]'
txt = re.sub(r'\n','', txt)
myList = txt.split()[1:-1]
myList2 = list(map(float,myList))
n_arr = np.array(myList2)
print(n_arr)
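A more compact variant of the same idea, as a sketch: strip the brackets and surrounding whitespace with str.strip, split on whitespace (which also swallows the embedded newlines), and reshape to a single row. The string below is a shortened stand-in for the 50-value string in the question, and values and row are illustrative names.

```python
import numpy as np

# Shortened stand-in for the string loaded from the database.
txt = '[ 0.11591 0.044932 0.66926\n -0.67844 0.47253 -0.84737 ]'

# strip() removes brackets, spaces and newlines from both ends;
# split() then breaks on any whitespace, including the inner '\n'.
values = np.array([float(tok) for tok in txt.strip('[] \n').split()])

# reshape(1, -1) gives a single row; with the full string it would be 1x50.
row = values.reshape(1, -1)

print(row.shape)
```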

numpy matrix string with python3.4

I'm having trouble using numpy with Python 3.4. How can I get a numpy matrix of plain strings instead of byte strings?
import numpy as np

def res(data):
    M = np.zeros(data.shape).astype(dtype='|S20')
    lines, columns = M.shape
    for l in range(lines):
        M[l][0] = data[l][1]
        M[l][1] = data[l][2]
        M[l][2] = data[l][3]
    return M
Result in Python 2.7:
[['Ann' '38.72' '-9.133']
['John' '55.68' '12.566']
['Richard' '52.52' '13.411']
['Alex' '40.42' '-3.703']]
Result in Python 3.4:
[[b'Ann' b'38.72' b'-9.133']
[b'John' b'55.68' b'12.566']
[b'Richard' b'52.52' b'13.411']
[b'Alex' b'40.42' b'-3.703']]
In Python 3.4, how can I get my matrix as plain strings, as in the Python 2.7 example? This matters because I have functions that expect string values and not byte strings.
Any help would be great. Thanks.
In my case the solution was simply to change dtype='|S20' to dtype=str. I hope this helps.
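For reference, the 'S' dtypes hold bytes in Python 3 while the 'U' dtypes hold str. Two possible sketches: decoding an existing byte-string array with np.char.decode, and allocating a unicode array with an explicit width. The sample data and the width of 20 are assumptions chosen to mirror the question. One caveat worth checking in your own code: np.zeros(shape, dtype=str) creates width-1 strings, so an explicit width such as 'U20' avoids silent truncation on assignment.

```python
import numpy as np

# Sample byte-string data resembling the Python 3.4 output in the question.
data = np.array([[b'Ann', b'38.72', b'-9.133'],
                 [b'John', b'55.68', b'12.566']])

# Option 1: decode an existing bytes array into a str (unicode) array.
text = np.char.decode(data, 'utf-8')

# Option 2: allocate a unicode array up front; 'U20' holds up to 20 characters.
M = np.zeros(data.shape, dtype='U20')
M[0, 0] = 'Richard'

print(text)
print(M)
```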
