Extract Date and Currency value(separated by comma) from file - python

Objective:
Extract the string data, the currency value, the [type of currency] and the date.
Content of file:
[["1234567890","Your previous month subscription point is <RS|$|QR|#> 5,200.33.Your current month month subscription point is <RS|$|QR|#> 1,15,200.33, Last Year total point earned <RS|$|QR|#> 5589965.26 and point lost in game is <RS|$|QR|#> 11520 your this year subscription will expire on 19-04-2013. 9. Back"],["1234567890","Your previous month subscription point is <RS|$|QR|#> 5,200.33.Your current month month subscription point is <RS|$|QR|#> 1,15,200.33, Last Year total point earned <RS|$|QR|#> 5589965.26 and point lost in game is <RS|$|QR|#> 11520 your this year subscription will expire on 19-04-2013. 9. Back"]]
What I have done so far:
def read_file():
    fp = open('D:\\ReadData2.txt', 'rb')
    content = fp.read()
    data = eval(content)
    l1 = ["%s" % x[1] for x in data]
    return l1
def check_currency(l2):
    import re
    for i in range(l2.__len__()):
        newstr2 = l2[i]
        val_currency = []
        val_currency.extend(re.findall(r'([+-]?\d+(?:\,\d+)*?\d+(?:\.\d+)?)',newstr2))
        print " List %s " % val_currency
        for i in range(len(val_currency)):
            val2 = val_currency[i]
            remove_commas = re.compile(r',(?=\d+)*?')
            val3 = remove_commas.sub('', val2)
            print val3
if __name__=="__main__":main()
EDIT
I am able to extract the currency values, but negative currency values conflict with the date format (dd-mm-yyyy): the parts of the date are picked up as numbers. Also, when extracting the string values, the characters [.|,|] are extracted as well. How can I avoid reading these characters?
Output of check_currency:
>List ['5,200.33', '1,15,200.33', '5589965.26', '11520', '19', '-04', '-2013']
>5200.33
>115200.33
>5589965.26
>11520
>19
>-04
>-2013
Expected output of check_currency:
>List ['5,200.33', '1,15,200.33', '5589965.26', '11520']
>5200.33
>115200.33
>5589965.26
>11520

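I added <RS|$|QR|#>\s* to the start of your regular expression so that it acts as a prefix for the currency values you want to match. The date is not preceded by that prefix, so its parts are no longer picked up.
Note that the prefix is not escaped, so | acts as alternation and re.findall also returns empty strings for the alternatives that contain no capture group; the list comprehension below filters those out.
You can change your code to this: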
def check_currency(l2):
    import re
    for i in range(l2.__len__()):
        newstr2 = l2[i]
        val_currency = []
        val_currency.extend(re.findall(r'<RS|$|QR|#>\s*([+-]?\d+(?:\,\d+)*?\d+(?:\.\d+)?)',newstr2))
        # skip empty strings and remove comma characters
        val_currency = [v.replace(',', '') for v in val_currency if v]
        print " List %s " % val_currency
        for i in range(len(val_currency)):
            val2 = val_currency[i]
            remove_commas = re.compile(r',(?=\d+)*?')
            val3 = remove_commas.sub('', val2)
            print val3
Output:
List ['5200.33', '115200.33', '5589965.26', '11520']
5200.33
115200.33
5589965.26
11520
Additions to the code:
val_currency.extend(re.findall(r'<RS|$|QR|#>\s*([+-]?\d+(?:\,\d+)*?\d+(?:\.\d+)?)',newstr2))
val_currency = [v.replace(',', '') for v in val_currency if v]
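If you would rather not rely on filtering out empty matches, here is a minimal Python 3 sketch that escapes the prefix and also pulls out the date separately. It assumes each amount is preceded either by the literal placeholder <RS|$|QR|#> (as in the sample file) or by one of the symbols it lists; the names CURRENCY_RE, DATE_RE and extract_amounts_and_dates are made up for this example.

```python
import re

# Assumption: each amount is preceded either by the literal placeholder
# "<RS|$|QR|#>" (as in the sample file) or by one of the symbols it lists.
CURRENCY_PREFIX = r'(?:<RS\|\$\|QR\|#>|RS|\$|QR|#)'
AMOUNT = r'([+-]?\d+(?:,\d+)*(?:\.\d+)?)'
CURRENCY_RE = re.compile(CURRENCY_PREFIX + r'\s*' + AMOUNT)
# Dates are matched on their own, in the dd-mm-yyyy form used by the file.
DATE_RE = re.compile(r'\b(\d{2}-\d{2}-\d{4})\b')


def extract_amounts_and_dates(text):
    """Return (amounts, dates): amounts without commas, dates as dd-mm-yyyy."""
    amounts = [m.replace(',', '') for m in CURRENCY_RE.findall(text)]
    dates = DATE_RE.findall(text)
    return amounts, dates


sample = (
    "Your previous month subscription point is <RS|$|QR|#> 5,200.33."
    "Your current month month subscription point is <RS|$|QR|#> 1,15,200.33, "
    "Last Year total point earned <RS|$|QR|#> 5589965.26 and point lost in "
    "game is <RS|$|QR|#> 11520 your this year subscription will expire on "
    "19-04-2013. 9. Back"
)
amounts, dates = extract_amounts_and_dates(sample)
print(amounts)
print(dates)
```

Because an amount is only accepted directly after a currency prefix, the pieces of the date can no longer be mistaken for negative numbers.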

Related

How to order a python dictionary containing a list of values

I'm not sure I am approaching this in the right way.
Scenario:
I have two SQL tables that contain rent information. One table contains rent due, and the other contains rent received.
I'm trying to build a rent book which takes the data from both tables for a specific lease and generates a date ordered statement which will be displayed on a webpage.
I'm using Python, Flask and SQL Alchemy.
I am currently learning Python, so I'm not sure if my approach is the best.
I've created a dictionary which contains the keys 'Date', 'Payment type' and 'Payment Amount', and in each of these keys I store a list which contains the data from my SQL queries. The bit I'm struggling with is how to sort the dictionary by the date key while keeping the values in the other keys aligned to their dates.
lease_id = 5
dates_list = []
type_list = []
amounts_list = []
rentbook_dict = {}
payments_due = Expected_Rent_Model.query.filter(Expected_Rent_Model.lease_id == lease_id).all()
payments_received = Rent_And_Fee_Income_Model.query.filter(Rent_And_Fee_Income_Model.lease_id == lease_id).all()
for item in payments_due:
    dates_list.append(item.expected_rent_date)
    type_list.append('Rent Due')
    amounts_list.append(item.expected_rent_amount)
for item in payments_received:
    dates_list.append(item.payment_date)
    type_list.append(item.payment_type)
    amounts_list.append(item.payment_amount)
rentbook_dict.setdefault('Date',[]).append(dates_list)
rentbook_dict.setdefault('Type',[]).append(type_list)
rentbook_dict.setdefault('Amount',[]).append(amounts_list)
I was then going to use a for loop within the flask template to iterate through each value and display it in a table on the page.
Or am I approaching this in the wrong way?
So I managed to get this working using zipped lists. I'm sure there is a better way to accomplish this, but I'm pleased I've got it working.
lease_id = 5
payments_due = Expected_Rent_Model.query.filter(Expected_Rent_Model.lease_id == lease_id).all()
payments_received = Rent_And_Fee_Income_Model.query.filter(Rent_And_Fee_Income_Model.lease_id == lease_id).all()
total_due = 0
for debit in payments_due:
    total_due = total_due + int(debit.expected_rent_amount)
total_received = 0
for income in payments_received:
    total_received = total_received + int(income.payment_amount)
balance = total_received - total_due
if balance < 0 :
    arrears = "This account is in arrears"
else:
    arrears = ""
dates_list = []
type_list = []
amounts_list = []
for item in payments_due:
    dates_list.append(item.expected_rent_date)
    type_list.append('Rent Due')
    amounts_list.append(item.expected_rent_amount)
for item in payments_received:
    dates_list.append(item.payment_date)
    type_list.append(item.payment_type)
    amounts_list.append(item.payment_amount)
payment_data = zip(dates_list, type_list, amounts_list)
sorted_payment_data = sorted(payment_data)
tuples = zip(*sorted_payment_data)
list1, list2, list3 = [ list(tuple) for tuple in tuples]
return(render_template('rentbook.html',
    payment_data = zip(list1,list2,list3),
    total_due = total_due,
    total_received = total_received,
    balance = balance))
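An alternative that avoids zipping and unzipping is to keep one record per payment and sort the records by date. This is only a sketch: the namedtuples Due and Received are hypothetical stand-ins for the rows returned by the two SQLAlchemy queries, and build_rentbook is a name invented for this example.

```python
from collections import namedtuple
from datetime import date
from operator import itemgetter

# Hypothetical stand-ins for the rows returned by the two SQLAlchemy queries.
Due = namedtuple("Due", "expected_rent_date expected_rent_amount")
Received = namedtuple("Received", "payment_date payment_type payment_amount")


def build_rentbook(payments_due, payments_received):
    """Merge both sources into one list of row dicts, ordered by date."""
    rows = [
        {"date": d.expected_rent_date, "type": "Rent Due",
         "amount": d.expected_rent_amount}
        for d in payments_due
    ]
    rows += [
        {"date": r.payment_date, "type": r.payment_type,
         "amount": r.payment_amount}
        for r in payments_received
    ]
    # Sorting on the date key alone keeps each row's fields together.
    return sorted(rows, key=itemgetter("date"))


payments_due = [Due(date(2020, 2, 1), 500), Due(date(2020, 1, 1), 500)]
payments_received = [Received(date(2020, 1, 15), "Bank transfer", 500)]
rentbook = build_rentbook(payments_due, payments_received)
for row in rentbook:
    print(row["date"], row["type"], row["amount"])
```

The resulting list can be passed to the template as is, and the template loop can read row.date, row.type and row.amount for each table row.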

How to find the average of a split float list from a text file?

Here is my code:
chosen_year = int(input("Enter the year of interest: "))
print("")
min_life_exp = 999.9
max_life_exp = -1.1
min_life_entity = ""
max_life_entity = ""
min_year = 9999
max_year = -1
chosen_min_life_exp = 999.9
chosen_max_life_exp = -1.1
chosen_min_life_entity = ""
chosen_max_life_entity = ""
with open("life-expectancy.csv") as life_file:
    for line in life_file:
        parts = line.split(",")
        entity = parts[0]
        code = parts[1]
        year = int(parts[2])
        life_exp = float(parts[3])
        if max_life_exp < life_exp:
            max_life_exp = life_exp
            max_life_entity = entity
            max_year = year
        if min_life_exp > life_exp:
            min_life_exp = life_exp
            min_life_entity = entity
            min_year = year
        if chosen_year == year:
            avg_life_exp = sum(life_exp) / len(life_exp)
            if chosen_max_life_exp < life_exp:
                chosen_max_life_exp = life_exp
                chosen_max_life_entity = entity
            if chosen_min_life_exp > life_exp:
                chosen_min_life_exp = life_exp
                chosen_min_life_entity = entity
print(f"The overall max life expectancy is: {max_life_exp} from {max_life_entity} in {max_year}")
print(f"The overall min life expectancy is: {min_life_exp} from {min_life_entity} in {min_year}")
print("")
print(f"For the year {chosen_year}:")
print(f"The average life expectancy across all countries was {avg_life_exp:.2f}")
print(f"The max life expectancy was in {chosen_max_life_entity} with {chosen_max_life_exp}")
print(f"The min life expectancy was in {chosen_min_life_entity} with {chosen_min_life_exp}")
I get this error when I run it:
line 38, in <module>
avg_life_exp = sum(life_exp) / len(life_exp)
TypeError: 'float' object is not iterable
How do I change avg_life_exp to get the average of the life expectancies in the year the user asks for?
The error occurs because life_exp is a single float at that point, while sum() and len() both expect a collection such as a list. Removing the sum() and len() calls makes the error go away, but I'm guessing you want to collect all of the life_exp values for the chosen year in a list and then average them. To do this, append each matching life_exp to a list inside the loop and move the "sum" part of the code to after the loop that reads the file, since you can only add everything up after every line has been read. If you'd like more specifics, please clarify your question and the expected output.
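A minimal sketch of that idea follows. The sample rows and entity names are made up, the column layout (entity, code, year, life expectancy) is taken from the question's code, and average_for_year is a hypothetical helper; a real file may also have a header row that needs skipping.

```python
import io

# Made-up sample rows in the entity,code,year,life_exp layout used by the
# question's code; the real file may also start with a header row.
sample_csv = io.StringIO(
    "Country A,AAA,2000,70.0\n"
    "Country B,BBB,2000,80.0\n"
    "Country A,AAA,2001,71.0\n"
)


def average_for_year(lines, chosen_year):
    """Return the mean life expectancy for chosen_year, or None if no rows match."""
    values = []
    for line in lines:
        parts = line.strip().split(",")
        year = int(parts[2])
        life_exp = float(parts[3])
        if year == chosen_year:
            values.append(life_exp)  # collect now, average after the loop
    if not values:
        return None
    return sum(values) / len(values)


avg_life_exp = average_for_year(sample_csv, 2000)
print(f"The average life expectancy across all countries was {avg_life_exp:.2f}")
```

Returning None when no row matches avoids a division by zero if the user enters a year that is not in the file.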

Function send a mail Python

Let me explain my problem. I want to write a program that compares user absences recorded in two different tools. Each tool exports its absences to an Excel file. For each file I sum the absences per user into a Python dictionary, then I compare the two dictionaries to find discrepancies, and the program prints the name of every user whose number of absences is not equal in both. I would like to know how to make my program send an email to each user concerned.
Sum of absences:
for row in range(1,253):
    id2.append(feuille_2.cell_value(row, 2))
    absence2.append(float(feuille_2.cell_value(row, 9)))
result = {}
for name in set(id2):
    result[name] = 0
for i in range(len(id2)):
    hours = float(absence2[i])
    name = id2[i]
    result[name] += hours
for name, hours in result.items():
    print(name + ":\t" + str(hours))
id4 = [id1]
absence = []
for row in range(1,361):
    absence.append(feuille_1.cell_value(row, 10))
    id4.append(id1)
print(absence)
result2 = {}
for name2 in set(id4):
    result2[name2] = 0
for i in range(len(id4)):
    hours2 = absence[i]
    name2 = id4[i]
    result2[name2] += hours2
print(result2)
Comparison of the two dictionaries:
print("Only in Sugar:", ", ".join(set(result).difference(result2)))
print("Only in Chrnos:", ", ".join(set(result2).difference(result)))
for key in set(result).intersection(result2):
    if result[key]!=result2[key]:
        print("%s has not declared their leave"% (key))
I would like help writing a function that sends an email to each user concerned after the comparison.
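One possible way to do this with the standard library (smtplib and email.message) is sketched below. Several things here are assumptions rather than part of the question: the addresses dictionary mapping user names to email addresses, the sender address, the message wording, and an SMTP server that accepts unauthenticated mail on localhost port 25. The function names are made up for this example.

```python
import smtplib
from email.message import EmailMessage


def find_mismatches(result, result2):
    """Return the names present in both dictionaries whose totals differ."""
    return sorted(k for k in set(result) & set(result2) if result[k] != result2[k])


def build_absence_email(name, to_addr, hours_a, hours_b,
                        from_addr="absences@example.com"):
    """Build (but do not send) the reminder for one user."""
    msg = EmailMessage()
    msg["Subject"] = "Absence records do not match"
    msg["From"] = from_addr  # hypothetical sender address
    msg["To"] = to_addr
    msg.set_content(
        "Hello %s,\n\nYour absences differ between the two tools: "
        "%s hours versus %s hours. Please check your declarations.\n"
        % (name, hours_a, hours_b)
    )
    return msg


def send_absence_emails(result, result2, addresses, host="localhost", port=25):
    """Send one email per mismatched user that has a known address."""
    messages = [
        build_absence_email(name, addresses[name], result[name], result2[name])
        for name in find_mismatches(result, result2)
        if name in addresses
    ]
    if messages:
        # Assumption: an SMTP server accepts unauthenticated mail at host:port.
        with smtplib.SMTP(host, port) as server:
            for msg in messages:
                server.send_message(msg)
    return messages


# Small offline demonstration with made-up totals; nothing is sent here.
result = {"alice": 8.0, "bob": 16.0}
result2 = {"alice": 8.0, "bob": 12.0}
print(find_mismatches(result, result2))
msg = build_absence_email("bob", "bob@example.com", 16.0, 12.0)
print(msg["To"])
```

Calling send_absence_emails(result, result2, addresses) after the comparison would then deliver the messages. If your mail server requires a login or TLS, the connection part needs adapting (for example with starttls() and login() on the SMTP object).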

IMDBpy getting screen information

import imdb
ia = imdb.IMDb()
avatar = ia.get_movie("0120667")
ia.update(avatar, 'business')
print avatar['business']
That returns the whole list of grosses as well as the screenings for each country. But how do I get only the screening information, and for only one country? In this example the information I want is (USA) (10 July 2005) (3,602 Screens).
import imdb
import re
ia = imdb.IMDb()
avatar = ia.get_movie("0120667")
ia.update(avatar, 'business')
opening_weekends = avatar['business']['opening weekend']
def parseDate(date):
    result = {}
    if re.match(r".*\d{4}$", date):
        result['year'] = date[-4:]
    m = re.match(r".*(?P<month>January|February|March|April|May|June|July|"
                 r"August|September|October|November|December).*", date, re.I)
    if m:
        result['month'] = m.group('month').lower()
        # try to grab date too then
        daymatch = re.match(r"^(?P<day>\d{1,2}).*", date)
        if daymatch:
            result['day'] = daymatch.group('day')
    return result
def parseBudget(amount):
    """
    assumptions:
    - currency is always before the number
    - no fractions
    """
    # find index first number
    for i in range(len(amount)):
        if amount[i] in "0123456789":
            amount_idx = i
            break
    currency = amount[:amount_idx].strip()
    amount = re.sub(r"\D", "", amount[amount_idx:])
    return amount, currency
def parseWeekendGross(gross_text):
    g = gross_text.split(' (')
    if not len(g) == 4:
        # not in the expected "amount (country) (date) (screens)" form
        return None
    amount, currency = parseBudget(g[0])
    country = g[1].lstrip('(').rstrip(')')
    date = parseDate(g[2].lstrip('(').rstrip(')'))
    # use .get so a date without a day or month does not raise KeyError
    day, month, year = date.get('day', ''), date.get('month', ''), date.get('year', '')
    screens = re.sub(r"\D", "", g[3])
    if not screens:
        screens = "''"
    return amount, currency, country, day, month, year, screens
for entry in opening_weekends:
    parsed = parseWeekendGross(entry)
    if parsed is None:
        # skip entries that could not be parsed
        continue
    amount, currency, country, day, month, year, screens = parsed
    if country == "USA":
        print("Country: %s" % country)
        print("Date: %s %s %s" % (day, month, year))
        print("Screens: %s" % screens)
        break
The above code gives me the following result:
Country: USA
Date: 10 july 2005
Screens: 3602
The functions to parse the data are copied from this project: pyIRDG
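If only the country, date and screen count are needed, a single regular expression may be enough. This is a sketch that assumes every entry looks like "amount (country) (date) (n Screens)", which is the layout the splitting code above relies on. The sample entries are made up apart from the USA date and screen count quoted in the question, and ENTRY_RE and screens_for_country are names invented for this example.

```python
import re

# Assumed entry layout: "<amount> (<country>) (<date>) (<n> Screens)".
ENTRY_RE = re.compile(
    r"^(?P<gross>.+?) \((?P<country>[^)]+)\) \((?P<date>[^)]+)\)"
    r" \((?P<screens>[\d,]+) Screens?\)$"
)


def screens_for_country(entries, country):
    """Return (date, screens) for the first entry of the given country, else None."""
    for entry in entries:
        m = ENTRY_RE.match(entry)
        if m and m.group("country") == country:
            return m.group("date"), int(m.group("screens").replace(",", ""))
    return None


# Made-up sample entries; only the USA date and screen count come from the question.
sample_entries = [
    "GBP 1,000 (UK) (24 July 2005) (100 Screens)",
    "$1,000 (USA) (10 July 2005) (3,602 Screens)",
]
print(screens_for_country(sample_entries, "USA"))
```

Entries that do not fit the assumed layout are simply skipped, so the function returns None rather than raising when nothing matches.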

returning different time frames from datetime

I am parsing a file this way:
for d in csvReader:
    print datetime.datetime.strptime(d["Date"]+"-"+d["Time"], "%d-%b-%Y-%H:%M:%S.%f").date()
date() returns 2000-01-08, which is correct.
time() returns 06:20:00, which is also correct.
How would I go about returning information like "date+time" or "date+hours+minutes"?
EDIT
Sorry, I should have been more precise. Here is what I am trying to achieve:
lmb = lambda d: datetime.datetime.strptime(d["Date"]+"-"+d["Time"], "%d-%b-%Y-%H:%M:%S.%f").date()
daily_quotes = {}
for k, g in itertools.groupby(csvReader, key = lmb):
    lowBids = []
    highBids = []
    openBids = []
    closeBids = []
    for i in g:
        lowBids.append(float(i["Low Bid"]))
        highBids.append(float(i["High Bid"]))
        openBids.append(float(i["Open Bid"]))
        closeBids.append(float(i["Close Bid"]))
    dayMin = min(lowBids)
    dayMax = max(highBids)
    open = openBids[0]
    close = closeBids[-1]
    daily_quotes[k.strftime("%Y-%m-%d")] = [dayMin,dayMax,open,close]
As you can see, right now I'm grouping values by day. I would like to group them by hour (for which I would need date + hour) or by minute (date + hour + minute).
Thanks in advance!
Don't call the date() method on the datetime object you get from strptime. Instead, apply strftime directly to the value returned by strptime, which gives you access to all of its fields: year, month, day, hour, minute, second, and so on.
import datetime
d = {"Date": "01-Jan-2000", "Time": "01:02:03.456"}
dt = datetime.datetime.strptime(d["Date"]+"-"+d["Time"], "%d-%b-%Y-%H:%M:%S.%f")
print dt.strftime("%Y-%m-%d-%H-%M-%S")
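For the grouping itself, one option is to truncate each parsed datetime to the hour and use that as the groupby key. The sketch below is written for Python 3 and uses made-up rows in place of csvReader. It assumes the rows are already in time order (itertools.groupby only merges adjacent rows) and that month abbreviations are parsed in an English locale. hour_key and hourly_quotes are names invented for this example.

```python
import datetime
import itertools

# Made-up rows standing in for csvReader, already ordered by time.
rows = [
    {"Date": "08-Jan-2000", "Time": "06:20:00.000", "Low Bid": "1.0", "High Bid": "2.0"},
    {"Date": "08-Jan-2000", "Time": "06:45:00.000", "Low Bid": "0.5", "High Bid": "2.5"},
    {"Date": "08-Jan-2000", "Time": "07:05:00.000", "Low Bid": "1.5", "High Bid": "3.0"},
]


def parse(row):
    """Combine the Date and Time columns into one datetime object."""
    return datetime.datetime.strptime(
        row["Date"] + "-" + row["Time"], "%d-%b-%Y-%H:%M:%S.%f")


def hour_key(row):
    # Truncate to the hour; leave minute untouched in replace() to group by minute.
    return parse(row).replace(minute=0, second=0, microsecond=0)


hourly_quotes = {}
for k, g in itertools.groupby(rows, key=hour_key):
    group = list(g)
    low = min(float(r["Low Bid"]) for r in group)
    high = max(float(r["High Bid"]) for r in group)
    hourly_quotes[k.strftime("%Y-%m-%d %H:00")] = [low, high]

print(hourly_quotes)
```

Using a truncated datetime as the key, rather than a formatted string, keeps the keys sortable and lets you pick the output format only when storing the result.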
