Extract a specific number from a string - python

I have this string 553943040 21% 50.83MB/s 0:00:39
The length of the numbers can vary
The percent can contain one or two numbers
The spaces between the start of the string and the first number may vary
I need to extract the first number, in this case 553943040
I was thinking that the method could be to:
1) Replace the percent with a separator. something like:
string=string.replace("..%","|") # where each "." stands for any character, even a space.
2) Get the first part of the new string by cutting everything after the separator.
string=string.split("|")
string=string[0]
3) Remove the spaces.
string=string.strip()
I know that steps 2 and 3 work, but I'm stuck on the first. Also, if there is a better way of getting the number, it would be great to know it!

Too much work.
>>> '553943040 21% 50.83MB/s 0:00:39'.split()[0]
'553943040'
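For reference, str.replace treats its first argument as literal text, so "..%" never acts as a wildcard pattern. A minimal sketch of two regex-based alternatives, assuming the line always begins with optional spaces followed by the number; first_number and before_percent are hypothetical helper names, not part of the question:

```python
import re

def first_number(line):
    # Skip leading whitespace, then capture the first run of digits.
    match = re.match(r'\s*(\d+)', line)
    return match.group(1) if match else None

def before_percent(line):
    # Closer to the original plan: cut at the one- or two-digit
    # percentage and strip the surrounding whitespace.
    return re.split(r'\s*\d{1,2}%', line)[0].strip()

line = '   553943040 21% 50.83MB/s 0:00:39'
print(first_number(line))
print(before_percent(line))
```

If the format is as regular as the sample, the plain split() shown above is still the simplest option.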


Converting String Data Values with two commas from csv or txt files into float in python

I just received a dataset from an HPLC run. The problem I ran into is that the txt export from the software writes values with two dot separators, for instance "31.456.234 min". When I plot the data with matplotlib and numpy, I can only see the data points whose values are not listed with two commas. This is because every value smaller than 1 is written with one comma, like "0.765298", while the rest of the values are, as mentioned, listed with two commas.
I tried to solve this with the .split() and .find() methods, but that is rather inconvenient. I was wondering whether there is a more elegant way to solve this, since in the end I need x and y values again for plotting.
Many thanks for any helping answers in advance.
The question is not very clear about commas versus dots.
For the decimal number you say that you have a comma, but you show a dot: 0.765298
I guess you cannot have dots as both the thousands separator and the decimal separator...
If you have English notation, I guess the numbers are:
"31,456,234 min" and "0.765298"
In that case you can use the replace method:
output = "31,456,234"
number = float(output.replace(',',''))
# result : 31456234.0
EDIT
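I am not sure I have understood what you are looking for or the format of the numbers...
However, if the second separator in 31.456.234 is unwanted, here is a solution:
def conv(n):
    # Keep the first dot as the decimal point and drop any later dots.
    i = n.find('.')
    if i == -1:
        return float(n)  # no dot at all: plain integer string
    return float(n[:i] + '.' + n[i:].replace('.', ''))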
x = '31.456.234'
y = '0.765298'
print(conv(x)) # 31.456234
print(conv(y)) # 0.765298
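To get plottable values, the same idea can be applied to every raw entry. This is only a sketch under the assumption that each entry looks like "31.456.234 min", with the number first and a unit after a space; to_float and parse_values are hypothetical names:

```python
def to_float(text):
    # Keep the first dot as the decimal point and drop any later dots.
    i = text.find('.')
    if i == -1:
        return float(text)
    return float(text[:i] + '.' + text[i:].replace('.', ''))

def parse_values(raw_values):
    # Assumption: the number is the first whitespace-separated token,
    # and anything after it (such as "min") is a unit to ignore.
    return [to_float(value.split()[0]) for value in raw_values]

raw = ['31.456.234 min', '0.765298 min']
print(parse_values(raw))  # [31.456234, 0.765298]
```

The resulting list of floats can then be passed to numpy or matplotlib as usual.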

I need clarification for the following code in python and what :^38

I need clarification on the following Python code, in particular what :^38 does:
for leaf in [*range(10)]+[2]:
    print(f'{"x"*(leaf*2+1):^38}')
Firstly, you want to use backticks ``` to make your snippet readable
for leaf in [*range(10)]+[2]:
    print(f'{"x" + str(leaf*2+1):^38}')
Now to the code itself. It iterates over the concatenation of two lists. The first is built from a range object covering 0 to 9, and the second contains only the number 2. The star unpacks the range into its elements, so you get a list of 0 to 9 followed by the number 2.
The part in the curly braces before the colon is the expression to print. In your original, "x"*(leaf*2+1) repeats the string "x" that many times. In the variant above, (leaf*2 + 1) is a number, so it has to be converted with str() before it can be concatenated to "x".
The part after the colon gives the string in the curly braces a field width of 38 and centers it within that field. See the alignment section of PEP 3101.
First line:
for leaf in [*range(10)]+[2]:
creates the list [0, 1, 2, ..., 9, 2].
Second line:
    print(f'{"x"*(leaf*2+1):^38}')
prints x (leaf*2+1) times, padded with spaces on both sides so that the whole printed string has a length of 38.
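A small sketch that makes the format spec visible. In ^38, the ^ means "center" and 38 is the minimum field width. The tree_rows helper and its width parameter are hypothetical, added only so the rows can be inspected instead of printed directly:

```python
def tree_rows(width=38):
    rows = []
    # 0..9 builds the widening crown, the trailing 2 adds a narrow trunk.
    for leaf in [*range(10)] + [2]:
        # "x" * n repeats the string; ^{width} centers it in `width` columns.
        rows.append(f'{"x" * (leaf * 2 + 1):^{width}}')
    return rows

for row in tree_rows():
    # The bars show where each 38-character field starts and ends.
    print(f'|{row}|')
```

Every row has exactly 38 characters, so the x runs line up around the same center column and the output looks like a tree.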

Moving for loop into a reduce method

I am trying to plot the locations of ~4k postcodes on a UK map. I am using a library that takes a postcode and returns latitude, longitude, etc. However, the postcode must always contain a space before the last 3 characters of the string, for example:
'AF23 4FR' would be valid, as the space is before the last 3 characters of the string.
'AF234FR' would not be allowed, as there is no space.
I have to go over each item in my list and check that there is a space before the n-3 position in the string. I can do this with a simple for loop, but I would prefer to do it with a reduce function. I am struggling to work out how to rework the check and logic of the following into a reduce call. Would it even be worth it in this scenario?
for index, p in enumerate(data_set):
    if p.find(' ') == -1:
        first = p[:-3]   # everything before the last 3 characters
        second = p[-3:]  # the last 3 characters
        data_set[index] = first + ' ' + second
You're pretty much there... Create a generator with the spaces removed from each string, then apply slicing and formatting, and use a list comprehension to generate a new list of formatted values, e.g.:
pcs_without_spaces = (pc.replace(' ', '') for pc in data_set)
formatted = ['{} {}'.format(pc[:-3], pc[-3:]) for pc in pcs_without_spaces]
That way, you don't need additional logic to check whether a postcode already has a space in it. As long as your postcode is going to be valid after slicing, removing the spaces and treating everything with the same logic is enough.
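For completeness, this is roughly what a reduce version could look like. It is a sketch, not a recommendation: add_formatted is a hypothetical helper, the sample postcodes are made up, and reduce brings no real benefit here because each postcode is transformed independently of the others.

```python
from functools import reduce

def add_formatted(acc, pc):
    # Remove any existing spaces, then re-insert one before the last 3 chars.
    compact = pc.replace(' ', '')
    acc.append(f'{compact[:-3]} {compact[-3:]}')
    return acc

data_set = ['AF23 4FR', 'AF234FR', 'B1 1AA']  # made-up sample values
formatted = reduce(add_formatted, data_set, [])
print(formatted)  # ['AF23 4FR', 'AF23 4FR', 'B1 1AA']
```

reduce is meant for folding a sequence into a single value, so the list comprehension above expresses the intent more directly.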

Extract dollar amount at multiple places in sentence in python

I have a sentence as below, from which I need to extract the dollar amounts (with commas) to populate a dictionary. I have tried a few options but couldn't succeed. Please guide.
For par\n
$3,500 single /$7,000 group
For nonpar\n
$7,000 single /$14,000 group
Expected output is :
"rates":{
"single" : "$3,500 (par) / $7,000 (nonpar)",
"group" : "$7,000 (par) / $14,000 (nonpar)"
}
\n here means the text continues on a new line.
Amounts might have decimal points, and commas after every 3 digits, as in the example below.
I was able to write a regex for the amount alone, but I can't find the right approach to extend it to my requirement.
re.search(r'^\$\d{1,3}(,\d{3})*(\.\d+)?$','$1,212,500.23')
Edit1:
Went ahead with one more step:
re.findall(r'\$\d{1,3}(?:\,\d{3})*(?:\.\d{2})?', str)
This returns all the values in a list, but I still need a strategy to know which value corresponds to what.
Edit2:
re.findall(r'For par\W*(\$\d{1,3}(?:\,\d{3})*(?:\.\d{2})?\W*single)\s*\W*(\$\d{1,3}(?:\,\d{3})*(?:\.\d{2})?\W*group)', str)
Please help me to refine this and make it more generic.
Thanks
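One possible direction, sketched under the assumption that every block has the shape "For <label>" followed by "<amount> single /<amount> group", as in the sample. AMOUNT, SECTION and extract_rates are hypothetical names. Named groups keep track of which amount belongs to which label:

```python
import re

# Dollar amount: $, 1-3 digits, optional ",ddd" groups, optional decimals.
AMOUNT = r'\$\d{1,3}(?:,\d{3})*(?:\.\d+)?'

# One labelled block: "For <label>", then the single and group amounts.
SECTION = re.compile(
    rf'For\s+(?P<label>\w+)\s*'
    rf'(?P<single>{AMOUNT})\s*single\s*/\s*'
    rf'(?P<group>{AMOUNT})\s*group'
)

def extract_rates(text):
    singles, groups = [], []
    for m in SECTION.finditer(text):
        label = m.group('label')
        singles.append(f"{m.group('single')} ({label})")
        groups.append(f"{m.group('group')} ({label})")
    return {"rates": {"single": " / ".join(singles),
                      "group": " / ".join(groups)}}

text = "For par\n$3,500 single /$7,000 group\nFor nonpar\n$7,000 single /$14,000 group"
print(extract_rates(text))
```

Because the label is captured rather than hard-coded, the same pattern covers "par", "nonpar", and any other single-word label that follows "For".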

Breaking 1 String into 2 Strings based on special characters using python

I am working with python and I am new to it. I am looking for a way to take a string and split it into two smaller strings. An example of the string is below
wholeString = '102..109'
And what I am trying to get is:
a = '102'
b = '109'
The information will always be separated by two periods like shown above, but the number of characters before and after can range anywhere from 1 - 10 characters in length. I am writing a loop that counts characters before and after the periods and then makes a slice based on those counts, but I was wondering if there was a more elegant way that someone knew about.
Thanks!
Try this:
a, b = wholeString.split('..')
It'll put each value into the corresponding variables.
Look at the string.split method.
split_up = [s.strip() for s in wholeString.split("..")]
This code will also strip off leading and trailing whitespace so you are just left with the values you are looking for. split_up will be a list of these values.
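If the separator might be missing, tuple unpacking of split('..') raises a ValueError that is easy to misread. A minimal sketch using str.partition, which always returns three parts; split_pair is a hypothetical helper name:

```python
def split_pair(whole_string, sep='..'):
    # partition returns (before, separator, after); the separator part is
    # an empty string when `sep` does not occur in the input.
    before, found, after = whole_string.partition(sep)
    if not found:
        raise ValueError(f'separator {sep!r} not found in {whole_string!r}')
    return before.strip(), after.strip()

a, b = split_pair('102..109')
print(a, b)  # 102 109
```

Unlike split, partition only cuts at the first occurrence of the separator, so anything after it stays in the second value.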
