Convert re.match/re.search to string

I'm trying to use re.match/re.search to find a certain int in my file. The int will vary, which is why I'm using regex in the first place. Here is the file:
Money:
*1,000 coins
*2 dollars
And my code:
import re
amount = 2
price = 500 * amount
with open("money.txt", "r") as money:
    moneyc = money.read()
    moneyc = moneyc.strip("Money:")
    moneyc = re.search("(\*[^0,][0-9]{0,3})?(,[0-9]{3})?(,[0-9]{3})?", moneyc)
    moneyleft = re.sub("(\*[^0,][0-9]{0,3})?(,[0-9]{3})?(,[0-9]{3})? coins", "*"+str(int(moneyc.replace("*", "").replace(",", "")) - price)+" coins")
    money.write("Money\n"+moneyleft)
Returns the error:
Traceback (most recent call last):
File "C:/***/money.py", line 8, in <module>
moneyleft = re.sub("(\*[^0,][0-9]{0,3})?(,[0-9]{3})?(,[0-9]{3})? coins", "*"+str(int(moneyc.replace("*", "").replace(",", "")) - price)+" coins")
AttributeError: '_sre.SRE_Match' object has no attribute 'replace'
I understand this happens because a regex match isn't a string. Since I need it as a string, how do I convert it?
What I want the file to be afterwards is:
Money:
*0 coins
*2 dollars
This is because the price is 500 * amount, and amount is 2. I keep "coins" in my re.sub because the file also contains dollars.

You have a couple of issues there:
You open the file read-only with mode r; to read it and then write it back, use r+.
Use the locale.atoi function to validate and convert comma-separated integers.
Take a look at this code:
import locale
import re
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
recoins = re.compile(r'\*(\S+) coins')
amount = 2
price = 500 * amount
with open('money.txt', 'r+') as money:
    text = money.read()
    coins = recoins.search(text).group(1)
    newcoins = locale.atoi(coins) - price
    money.seek(0)
    money.truncate()
    money.write(recoins.sub('*{:,} coins'.format(newcoins), text))
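If the en_US.UTF-8 locale is not installed on your machine, setlocale raises locale.Error. Here is a minimal locale-free sketch of the same idea, working on a string instead of the file. The helper name spend_coins is made up for illustration, and it assumes commas are only ever thousands separators:

```python
import re

# Matches the coins line and captures the number, e.g. "1,000" in "*1,000 coins".
COINS_RE = re.compile(r"\*([0-9][0-9,]*) coins")


def spend_coins(text, price):
    """Return text with the coin amount reduced by price (hypothetical helper)."""
    match = COINS_RE.search(text)
    if match is None:
        raise ValueError("no '*<number> coins' line found")
    # group(1) is a plain str, so str methods such as replace() work on it.
    coins = int(match.group(1).replace(",", ""))
    # "{:,}" puts the thousands separators back when formatting.
    return COINS_RE.sub("*{:,} coins".format(coins - price), text, count=1)


sample = "Money:\n*1,000 coins\n*2 dollars\n"
print(spend_coins(sample, 500 * 2))
```

The dollars line is left alone because the pattern requires the word "coins" after the number.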

def Money1_to_Money2(i):
    Amount = i * 5
    print(Amount)
Money1_to_Money2(10)
This is a simple currency-to-currency converter.
Pass the amount of money you want converted as the argument in the call on the last line, and put the conversion factor where the 5 is. To make it more organised, replace Money1 and Money2 with your currency names; i is the amount you pass in, which is multiplied by the conversion factor.

The object returned by the re.search function is a match object, not a string.
That's why you get the error:
AttributeError: '_sre.SRE_Match' object has no attribute 'replace'
To get the matching string after using re.search try:
moneyc = moneyc.group()
Then moneyc.replace will work.
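A small sketch of the difference, using the file contents from the question as a string. Two caveats: a pattern whose groups are all optional (like the one in the question) can match the empty string at position 0, in which case group() returns ''. The second pattern below is a simplified one of my own that requires the leading *. On Python 3.7 and later the same error names the type as 're.Match' rather than '_sre.SRE_Match'.

```python
import re

text = "\n*1,000 coins\n*2 dollars"

# The question's pattern: every group is optional, so it can match nothing.
loose = re.search(r"(\*[^0,][0-9]{0,3})?(,[0-9]{3})?(,[0-9]{3})?", text)
print(repr(loose.group()))  # the empty match at position 0 wins

# Simplified pattern (an assumption, not the asker's): "*" and a digit are required.
match = re.search(r"\*[0-9][0-9,]*", text)
matched_text = match.group()  # the matched text as a str
coins = int(matched_text.replace("*", "").replace(",", ""))
print(matched_text, coins)
```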

Related

Formatting strings with integers

I'm trying to increment the video file names every time a new file arrives in my folder. I tried + and the join() method but can't figure it out. I tried integers without quotation marks, but join() won't accept an integer, so I put the number in quotation marks, and now it won't increment.
Here is my code
VideoNumber += "99"
folderLocation = ("C:/Users/someone/Documents", VideoNumber, ".mp4")
x = "/".join(folderLocation)
print(x)

You can format integers into a string using an f-string or the format() method on strings.
video_number += 99
video_path = f"C:/Users/someone/Documents/{video_number}.mp4"
print(video_path)
Just as an example of how to make your original code work, you could keep your number as an integer and then convert it to a string using str() (though note this has a bug because you will have an extra / between the number and .mp4).
VideoNumber += 99
folderLocation = ("C:/Users/someone/Documents", str(VideoNumber), ".mp4")
x = "/".join(folderLocation)
print(x)
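One way to avoid the stray / mentioned above is to join only the folder and the complete file name. A sketch, where the helper name video_path and the starting number are made up for illustration:

```python
def video_path(folder, number):
    """Build '<folder>/<number>.mp4' (hypothetical helper)."""
    # Format the integer into the file name first, then join folder and name.
    return "/".join((folder, f"{number}.mp4"))


video_number = 99
for _ in range(3):
    video_number += 1  # stays an int, so += really increments
    print(video_path("C:/Users/someone/Documents", video_number))
```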

You can convert the integer to a string with str(), so your code will look like this:
folderLocation = ("C:/Users/someone/Documents", str(VideoNumber), ".mp4")

Read only the float in a file

I am working on a file handling exercise.
My txt file has this content:
List of Sales
Day 1 : 1250.25
Day 2 : 2560.25
Day 3 : 3241.10
Day 4 : 1530.20
Day 5 : 1247.27
Day 6 : 1646.22
Day 7 : 850.25
I want to get only the amount for each day and sum the amounts.
OFile = open('sales.txt','r')
file_content = OFile.read()
print(file_content)
import re
get = re.findall(r'[.]', file_content)
amount = []
for n in range(7):
    amount.append(get)
total = sum(amount)
print("Total sales Amount: ", "Php", total)
I keep getting Total sales Amount 0

Keep it simple and use str.split and str.strip instead of regex.
With the input file you posted, two things can go wrong on a line. The conversion to float can raise ValueError if the text after the colon is not a number. A line with no ":" (e.g. the first line in the file) makes split() return a list containing only the line itself, so indexing [1] raises IndexError. In both cases you want to skip the line and continue with the next one:
total_sum = 0
with open('sales.txt','r') as fp:
    for line in fp:
        try:
            current_float_num = line.strip().split(":")[1]
            current_float_num = float(current_float_num)
            # do work on float_num
            # for example add it to the accumulative total_sum
            total_sum += current_float_num
        except (IndexError,ValueError):
            continue
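If you do want to stay with regex as in the question, note that r'[.]' matches only the literal dots, so findall never returns the numbers. A sketch that captures the number after each colon; the pattern is my assumption about the file layout, and the sample text is inlined so the snippet runs without sales.txt:

```python
import re

file_content = """List of Sales
Day 1 : 1250.25
Day 2 : 2560.25
Day 3 : 3241.10
Day 4 : 1530.20
Day 5 : 1247.27
Day 6 : 1646.22
Day 7 : 850.25
"""

# Capture digits with an optional fractional part that follow a colon.
amounts = [float(s) for s in re.findall(r":\s*(\d+(?:\.\d+)?)", file_content)]
total = sum(amounts)
print("Total sales Amount: Php {:.2f}".format(total))
```

Requiring the colon keeps the day numbers ("Day 1") out of the result.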

How to strip a comma in the middle of a large number?

I want to convert a number stored as a str into a float or int. The conversion throws an error because of the comma, so I am trying to remove the comma first. strip() does not remove it, so I need a way to remove a character at a given position inside the number, say the fourth.
power4 = power[power.get('Number of Customers Affected') != 'Unknown']
power5 = power4[pd.notnull(power4['Number of Customers Affected'])]
power6 = power5[power5.get('NERC Region') == 'RFC']
power7 = power6.get('Number of Customers Affected').loc[1]
power8 = power7.strip(",")
power9 = float(power8)
ValueError Traceback (most recent call last)
<ipython-input-70-32ca4deb9734> in <module>
6 power7 = power6.get('Number of Customers Affected').loc[1]
7 power8 = power7.strip(",")
----> 8 power9 = float(power8)
9
10
ValueError: could not convert string to float: '127,000'

Use replace()
float('127,000'.replace(',',''))

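Have you tried pandas.to_numeric? It converts numeric strings to numbers, but it does not strip thousands separators itself, so the commas still have to be removed first.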
import pandas as pd
a = '1234'
type(a)
a = pd.to_numeric(a)
type(a)

In the
power8 = power7.strip(",")
line, do
power8 = power7.replace(',', '')
strip() will not work here because it only removes characters from the beginning and end of the string. What is required is the string's replace() method. You may also try
''.join(e for e in s if e.isdigit())
Or,
s = ''.join(s.split(','))
Regex can also solve this, or you can have a look at this answer: https://stackoverflow.com/a/266162/9851541
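A short sketch of why strip() leaves the value unchanged while replace() works, using the string from the traceback. The variable names are illustrative only:

```python
raw = "127,000"

print(raw.strip(","))        # strip() only trims the ends, so the comma stays
print(raw.replace(",", ""))  # replace() removes every comma

customers = int(raw.replace(",", ""))

# Caveat: keeping only digits also drops signs and decimal points.
digits_only = "".join(ch for ch in "-1,234.5" if ch.isdigit())
print(customers, digits_only)
```

The isdigit() approach is therefore only safe when the values are known to be non-negative whole numbers.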

how do I differentiate between str and int and is this what ValueError: invalid literal for int() with base 10: error means?

choice = input (" ")
choice = int(choice)
if choice == 2:
    print ("What class are you in? Please choose (class) 1, 2 or 3.")
    Class = int(input ())
    #class 1 file
    if Class == 1:
        c1 = open('class1.csv', 'a+')
        ScoreCount = str(ScoreCount)
        c1.write(myName + "-" + ScoreCount)
        c1.write("\n")
        c1.close()
        read_c1 = open('class1.csv', 'r')
        print (read_c1)
if choice == 3:
    row[1]=int(row[1]) #converts the values into int.
    row[2]=int(row[2])
    row[3]=int(row[3])
    row[4]=int(row[4])
if choice == 4:
    WMCI= 1
    print ("Thank You. Bye!")
So when this code is actually run it outputs an error which I don't understand:
ValueError: invalid literal for int() with base 10:#(myName-score)
How do I fix this, and what does this error mean in simple terms?

You've got 2 bugs.
The first is when you store your score into the csv, you're converting ScoreCount into a string, and keeping it that way. You need to let the conversion be temporary for just the job:
#class 1 file
if Class == 1:
    c1 = open('class1.csv', 'a+')
    c1.write(myName + "-" + str(ScoreCount))
    c1.write("\n")
    c1.close()
    read_c1 = open('class1.csv', 'r')
    print (read_c1)
That'll fix it with Class 1, you'll need to do 2 & 3. Your second bug is when you're reading the scores from the file, you've stored them as: "Name-5" if the person called Name had scored 5. That means you can't convert them as a whole entity into a number. You'll need to split the number part off. So in min max, where you've got:
row[0] = int (row[0])
It needs to become:
row[0] = int(row[0].split("-")[1])
But from there I can't figure out your logic or what you're trying to achieve in that section of code. It'll get rid of the current error, but that part of your code needs more work.
Explaining the right hand side of the above line of code by building it up:
row[0] # For our example, this will return 'Guido-9'
row[0].split("-") # Splits the string to return ['Guido','9']
row[0].split("-")[1] # Takes the second item and returns '9'
int(row[0].split("-")[1]) # Turns it into a number and returns 9
split("-") is the part you have probably not met before. It breaks a string up into a list, splitting at each "-" in our example. With empty brackets, split(), it splits at whitespace; otherwise it splits at whatever string you put in the brackets.
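Putting the pieces together, a sketch of reading "Name-score" lines back into numbers. The helper name parse_score is hypothetical. It uses rsplit with a limit of one split, a small variation on split that I chose so that a name containing a hyphen still parses:

```python
def parse_score(line):
    """Split 'Name-9' into ('Name', 9) (hypothetical helper)."""
    # rsplit("-", 1) splits once, at the last hyphen only.
    name, score_text = line.strip().rsplit("-", 1)
    return name, int(score_text)


rows = ["Guido-9\n", "Mary-Jane-7\n"]
for row in rows:
    print(parse_score(row))
```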

You are probably trying to convert something that can't be converted to an int, for example:
int("h")
This would give the base 10 error that you are getting.
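In simple terms, the error means int() was given text that is not a whole number written in decimal digits. To tell the two cases apart, a common approach is to attempt the conversion and catch ValueError. A sketch; the helper name to_int_or_none is made up:

```python
def to_int_or_none(text):
    """Return int(text), or None if text is not a valid base-10 integer."""
    try:
        return int(text)
    except ValueError:
        return None


for candidate in ["4", " 12 ", "h", "Guido-9", "2100.00"]:
    print(repr(candidate), "->", to_int_or_none(candidate))
```

Surrounding whitespace is accepted by int(), but letters, hyphenated names and decimal points are not.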

Python import txt formatting

I have an Excel file with a list of numbers and I saved it as a .txt file and then went to say:
open_file = open('list_of_numbers.txt','r')
for number in open_file:
    number = int(number)
    while x < 20000:
        if (x > number):
            print number
        x = x + 100
        y = y + 100
And I received this error message:
ValueError: invalid literal for int() with base 10: '2100.00\r\n'
How can I strip the ' and the \r\n'?
My ultimate goal is to create another column next to the column of numbers and, if the number is 145 for example,
145, '100-199'
167, '100-199'
1167, '1100-1199'
that sort of output.

Let's put it as an answer. The problem is not \r\n. The problem is that you are trying to parse a string that contains a float value as an integer. See (without the carriage return and newline characters):
>>> int("2100.00")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '2100.00'
(as you can see, the quotation marks ' are not part of the value, they just indicate that you are dealing with a string)
whereas
>>> int("2100\r\n")
2100
The documentation says:
If the argument is a string, it must contain a possibly signed decimal number representable as a Python integer, possibly embedded in whitespace.
The definition of a Python integer literal is given in the Python language reference.
Solution:
Use float:
>>> float("2100.00\r\n")
2100.0
then you can convert it to an integer if you want to (also consider round):
>>> int(float("2100.00\r\n"))
2100
Converting a float value to integer works (from the documentation):
Conversion of floating point numbers to integers truncates (towards zero).
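Building on this, a sketch for the range labels from the question (145 -> '100-199'). The function name bucket_label and the width parameter are my own, and it assumes non-negative numbers:

```python
def bucket_label(raw, width=100):
    """Map a numeric string such as '145' to a range label such as '100-199'."""
    # float() accepts '2100.00\r\n'; int() then truncates towards zero.
    value = int(float(raw))
    low = (value // width) * width
    return "{}-{}".format(low, low + width - 1)


for raw in ["145", "167", "1167", "2100.00\r\n"]:
    print(int(float(raw)), bucket_label(raw))
```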

To address your immediate problem, go with the answer by @Felix Kling.
If you are interested in your FUTURE problems, please read on.
(1) That \r is not part of the problem IN THIS PARTICULAR CASE, but is intriguing: are you creating the file on Windows and reading it on Linux/OSX/etc.? If so, you should open your text file with "rU" (universal newlines), so that each input line in Python ends with only \n. (In Python 3, universal newlines are the default in text mode, and the "U" flag was removed in Python 3.11.)
(2) In any case, it's a very good idea to do line = line.rstrip('\n') ... otherwise, depending on how you split up the lines, you may end up with your last field containing an unwanted \n.
(3) You may prefer to use xlrd to read from an Excel file directly -- this saves all sorts of hassles. [Dis]claimer: I'm the author of xlrd.

Try this:
number = int(number.strip(string.whitespace + "'"))
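You will need to add import string at the beginning of your script. Note that this only removes the surrounding whitespace and quotes; int() will still reject '2100.00' because of the decimal point. See also: http://docs.python.org/library/stdtypes.html#str.strip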
