I'm building a bot that should receive Telegram messages and forward them to another group after applying a regex to the received message.
The regex was tested in PHP and on several sites around the web, so in theory it should work.
regex = r"⏰ Timeframe M5((\s+(.*)\s+){1,})⏰ Timeframe M15"
text_example = '''⏰ Timeframe M5
04:05 - EURUSD - PUT
05:15 - EURJPY - PUT
06:35 - EURGBP - PUT
07:10 - EURUSD - PUT
08:15 - EURJPY - PUT
⏰ Timeframe M15
06:35 - EURGBP - PUT
07:10 - EURUSD - PUT
08:15 - EURJPY - PUT '''
reg = re.findall(regex, example_text)
print(reg)
This returns:
[]
I've run out of ideas.
I've used regex in other situations and had no problems; in this one I don't know why it doesn't work.
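The pattern does not work for the example data because the repeated group (\s+(.*)\s+) requires at least one whitespace character both before and after .* on every repetition. Two consecutive lines are separated only by a single newline, so once the trailing \s+ has consumed it, nothing is left for the leading \s+ of the next repetition. The pattern only matches when the lines carry extra whitespace, for example the trailing spaces that you do seem to have in this regex example: https://regex101.com/r/gvulvK/3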
Note that the name of the initial variable is text_example
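The whitespace requirement is easy to reproduce. The sketch below runs the original pattern against two made-up two-line samples (the names plain and padded are hypothetical, not from the question): one where each line ends directly in a newline, and one where each line has a trailing space.

```python
import re

# Original pattern from the question
regex = r"⏰ Timeframe M5((\s+(.*)\s+){1,})⏰ Timeframe M15"

# Hypothetical samples: identical except for a trailing space on each data line
plain = "⏰ Timeframe M5\n04:05 - EURUSD - PUT\n05:15 - EURJPY - PUT\n⏰ Timeframe M15"
padded = "⏰ Timeframe M5\n04:05 - EURUSD - PUT \n05:15 - EURJPY - PUT \n⏰ Timeframe M15"

# Only one whitespace character separates the lines, so the second
# repetition has nothing left for its leading \s+
print(re.search(regex, plain) is not None)

# The trailing space plus the newline give both \s+ parts something to match
print(re.search(regex, padded) is not None)
```

The first call finds no match and the second one does, which is consistent with the pattern passing on a tester where the pasted sample happened to contain trailing spaces.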
What you might do instead is match the start and the end of the pattern, and in between match all lines that do not start with ⏰:
⏰ Timeframe M5\s*((?:\n(?!⏰).*)+)\n⏰ Timeframe M15\b
See this regex101 demo and this regex101 demo
import re
regex = r"⏰ Timeframe M5\s*((?:\n(?!⏰).*)+)\n⏰ Timeframe M15\b"
text_example = '''⏰ Timeframe M5
04:05 - EURUSD - PUT
05:15 - EURJPY - PUT
06:35 - EURGBP - PUT
07:10 - EURUSD - PUT
08:15 - EURJPY - PUT
⏰ Timeframe M15
06:35 - EURGBP - PUT
07:10 - EURUSD - PUT
08:15 - EURJPY - PUT '''
reg = re.findall(regex, text_example)
print(reg)
Output
['\n04:05 - EURUSD - PUT\n05:15 - EURJPY - PUT\n06:35 - EURGBP - PUT\n07:10 - EURUSD - PUT\n08:15 - EURJPY - PUT\n']
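Once the block is captured, it still has to be broken into individual entries before it can be forwarded. A minimal sketch, assuming every line has the form "time - pair - direction" as in the sample data; the shortened sample text and the names block and signals are illustrative only:

```python
import re

regex = r"⏰ Timeframe M5\s*((?:\n(?!⏰).*)+)\n⏰ Timeframe M15\b"
# Shortened, hypothetical sample in the same shape as the question's data
text_example = (
    "⏰ Timeframe M5\n"
    "04:05 - EURUSD - PUT\n"
    "05:15 - EURJPY - PUT\n"
    "⏰ Timeframe M15\n"
    "06:35 - EURGBP - PUT"
)

signals = []
match = re.search(regex, text_example)
if match:
    block = match.group(1)
    # strip() drops the leading newline; splitlines() yields one entry per line
    for line in block.strip().splitlines():
        when, pair, direction = (part.strip() for part in line.split(" - "))
        signals.append((when, pair, direction))

print(signals)
```

re.search is used here instead of re.findall because only the first M5 block is needed; a line that does not split into exactly three parts would raise a ValueError, so real messages may need extra validation.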
If you slightly change your regex and also run it in dot all mode, it should work:
regex = r'⏰ Timeframe M5.*?⏰ Timeframe M15'
matches = re.findall(regex, text_example, flags=re.S)
print(matches)
This prints a list containing a single match:
⏰ Timeframe M5
04:05 - EURUSD - PUT
05:15 - EURJPY - PUT
06:35 - EURGBP - PUT
07:10 - EURUSD - PUT
08:15 - EURJPY - PUT
⏰ Timeframe M15
Good evening everyone.
I wrote a small piece of code (with help from a Stack Overflow search).
I am able to get the list of files from the directory.
I tried to store the list in an Excel spreadsheet, but only two of the file names end up in the Excel file, not all of them.
Please check the code and output.
Please help me find a way to list all the file names in the Excel file.
Thanks a million.
code:-
import os
import pandas as pd
path = "//home//halovivek//Downloads//education//Jimi Kwik - Super Brain//"
list = []
for (root,dirs, file) in os.walk(path):
    for f in file:
        print(f)
my_data = pd.DataFrame(file)
my_data.to_excel("outputfile.xlsx", index = False, header= False)
output:-
/home/halovivek/PycharmProjects/yearcoding/venv/bin/python
/home/halovivek/PycharmProjects/yearcoding/26082002_listfilesfromdirectory.py
Day 12 - Implementation Day - Juggling Exercise.mp4 Day 12 -
Implementation Day - Juggling Exercise.MP3 Part 2 - Preparing For The
Quest.mp4 Part 1 - Welcome To Your 30-Day Superbrain Quest.mp4 Part 5
10 Morning Habits Geniuses Use To Jump Start The Brain.mp4 Part 3 - The FAST Method For Learning Anything.mp4 Part 2 - Preparing For The
Quest.MP3 Part 1 - Welcome To Your 30-Day Superbrain Quest.MP3 Part 5
10 Morning Habits Geniuses Use To Jump Start The Brain.MP3 Part 4 - How To Take Notes.mp4 Part 3 - The FAST Method For Learning
Anything.MP3 Part 4 - How To Take Notes.MP3 Day 27 - The Ancient
Alphanumeric Code Of Memory Part 2 - Application.MP3 Day 27 - The
Ancient Alphanumeric Code Of Memory Part 2 - Application.mp4 Day 29 -
The 5 Levels Of Transformation.MP3 Day 29 - The 5 Levels Of
Transformation.mp4 Day 14 - Memory Is As Easy As PIE.MP3 Day 14 -
Memory Is As Easy As PIE.mp4 Day 15 - The FDR Technique.MP3 Day 15 -
The FDR Technique.mp4 Day 31 - Overcoming Procrastination.mp4 Day 31 -
Overcoming Procrastination.MP3 Day 9 - Chain Linking - Part 1.mp4 Day
9 - Chain Linking - Part 1.MP3 Day 8 - Implementation Day - Morning
Routine.mp4 Day 8 - Implementation Day - Morning Routine.MP3 Day 24 -
Implementation Day - Crossovers.MP3 Day 24 - Implementation Day -
Crossovers.mp4 Day 19 - Learning Foreign Languages.mp4 Day 19 -
Learning Foreign Languages.MP3 Day 13 - BE SUAVE Remembering Names.MP3
Day 13 - BE SUAVE Remembering Names.mp4 Day 34 - Speed Reading.MP3 Day
34 - Speed Reading.mp4 Day 22 - The Location Method.MP3 Day 22 - The
Location Method.mp4 Day 5 - Nutrition _ Your Body Folders.MP3 Day 5 -
Nutrition _ Your Body Folders.mp4 Day 23 - Memorize Word For Word.MP3
Day 23 - Memorize Word For Word.mp4 Day 30 - Q&A Session with Jim.mp4
Day 30 - The 5 Levels Of Learning.mp4 Day 30 - The 5 Levels Of
Learning.MP3 Day 30 - Q&A with Jim.MP3 Day 28 - Implementation Day -
Phonetic Number Code.MP3 Day 28 - Implementation Day - Phonetic Number
Code.mp4 How To Become A Super Learner Masterclass - Jim Kwik.mp4
how_to_become_a_super_learner_by_jim_kwik_workbook_nsp.pdf Day 18 -
Keyword Substitution Method.MP3 Day 18 - Keyword Substitution
Method.mp4 Day 32 - Your 8 C's To Muscle Memory.mp4 Day 32 - Your 8
C's To Muscle Memory.MP3 Day 4 - Implementation Day - Spaced
Repetition Concept.mp4 Day 4 - Implementation Day - Spaced Repetition
Concept.MP3 Day 26 - The Ancient Alphanumeric Code Of Memory Part 1 -
The Sounds.MP3 Day 26 - The Ancient Alphanumeric Code Of Memory Part 1
The Sounds.mp4 Day 2 - The Sun Is Up.mp4 Day 2 - The Sun Is Up.MP3 Day 16 - Implementation Day - Superbrain Yoga.MP3 Day 16 -
Implementation Day - Superbrain Yoga.mp4 Day 10 - Chain Linking - Part
2.MP3 Day 10 - Chain Linking - Part 2.mp4 .getxfer.14810.259.mega Day 7 - Sleep _ Stress Management.MP3 Day 7 - Sleep _ Stress
Management.mp4 Day 33 - Remembering Your Dreams.mp4 Day 33 -
Remembering Your Dreams.MP3 .getxfer.14709.10.mega
.getxfer.28947.5.mega Day 21 - How To Give A Speech Without Notes.MP3
.getxfer.27388.5.mega Day 3 - The 10 Keys To Unlocking Your
Superbrain.mp4 .getxfer.10040.5.mega Day 6 - Environments _ Killing
ANTs.MP3
_Groupinsiders.com.url Day 6 - Environments _ Killing ANTs.mp4 Day 25 - Numbers - The Basics.mp4 Day 25 - Numbers - The Basics.MP3 Day 1 - M.O.M. Can Help You Remember.MP3 Day 1 - M.O.M. Can Help You
Remember.mp4 Day 20 - Implementation Day - Counting To 10 In
Japanese.MP3 Day 20 - Implementation Day - Counting To 10 In
Japanese.mp4 Day 11 - The Peg Memory Method.MP3 Day 11 - The Peg
Memory Method.mp4 Day 17 - The Ultimate TIP To Remembering
Anything.MP3 Day 17 - The Ultimate TIP To Remembering Anything.mp4
Process finished with exit code 0
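There are a few issues here:
'list' is the name of a built-in type in Python, so using it as a variable name shadows the built-in and is best avoided.
You're not actually doing anything with the values inside your loop; the DataFrame is built from file after the loop has finished, so it only sees the file list of the last directory visited.
glob is slightly easier for working through directories.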
import glob
import os
import pandas as pd

list_of_files = []
# "**" together with recursive=True also descends into subdirectories
for file_or_dir in glob.glob("//home//halovivek//Downloads//education//Jimi Kwik - Super Brain//**", recursive=True):
    # keep files only, skip the directories themselves
    if os.path.isfile(file_or_dir):
        list_of_files.append(file_or_dir)
my_data = pd.DataFrame(list_of_files)
my_data.to_excel("outputfile.xlsx", index=False, header=False)
import os
import pandas as pd
# path of the directory you want to enumerate
path = "//home//halovivek//Downloads//"
directory = []
filename = []
for (root, dirs, file) in os.walk(path):
    for f in file:
        directory.append(root)
        filename.append(f)
        print(f)
# column names of the sheet
df = pd.DataFrame(list(zip(directory, filename)), columns=['Directory', "filename"])
# change the name of the output file as needed
df.to_csv("all.csv")
It lists all files and stores them in a spreadsheet in CSV format.
I tested it and it works fine.
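Both answers come down to the same fix: accumulate the names inside the loop rather than using the loop variable after it has finished. The sketch below shows that idea with the standard library only, writing a CSV via the csv module instead of pandas (a swapped-in technique, not what either answer uses). The helper name collect_files and the temporary demo files are hypothetical.

```python
import csv
import os
import tempfile

def collect_files(path):
    """Return (directory, filename) pairs for every file below path."""
    rows = []
    for root, dirs, files in os.walk(path):
        for name in files:
            # append inside the loop so files from every directory are kept
            rows.append((root, name))
    return rows

# Demo on a throwaway directory tree with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.mp4", os.path.join("sub", "b.MP3")):
        open(os.path.join(tmp, rel), "w").close()

    rows = collect_files(tmp)

    # one row per file, with a header line
    with open(os.path.join(tmp, "all.csv"), "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["Directory", "filename"])
        writer.writerows(rows)

print(sorted(name for _, name in rows))
```

The rows list can be passed to pd.DataFrame(rows, columns=[...]) in the same way if an .xlsx file is required.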
I've been trying to write a script that takes date inputs like 3/14/2015, 03-14-2015,
and 2015/3/14 (using pyperclip to copy and paste) and converts them to a single format. So far this is what I've accomplished:
import re,pyperclip
dateRegex_0 = re.compile(r'''(
#0) 3/14/2015
(\d{1,2})
(-|\/|\.)
(\d{2})
(-|\/|\.)
(\d{4})
)''',re.VERBOSE)
dateRegex_1 = re.compile(r'''(
#1)03-14-2015
(\d{2})
(-|\/|\.)
(\d{2})
(-|\/|\.)
(\d{4})
)''',re.VERBOSE)
dateRegex_2 = re.compile(r'''(
#2)2015/3/14 , format YYYY/MM/DD
(\d{4})
(-|\/|\.)
(\d{1,2})
(-|\/|\.)
(\d{1,2})
)''',re.VERBOSE)
text=str(pyperclip.paste())
matches = []
for groups in dateRegex_0.findall(text):
    cleanDate = '-'.join([groups[3],groups[1],groups[5]])
    matches.append(cleanDate)
for groups in dateRegex_1.findall(text):
    cleanDate = '-'.join([groups[3],groups[1],groups[5]])
    matches.append(cleanDate)
for groups in dateRegex_2.findall(text):
    cleanDate = '-'.join([groups[5],groups[3],groups[1]])
    matches.append(cleanDate)
if len(matches)>0:
    pyperclip.copy('\n'.join(matches))
    print('Copied to clipboard:')
    print('\n'.join(matches))
else:
    print('There are no dates in your text!')
I managed to create a regex for each date type, and the code transforms the dates to this format:
DD-MM-YYYY.
However, I have 2 problems:
1. When I clean dates like 3/14/2015 and 03-14-2015, I get 14-3-2015 and 14-03-2015. I want to either drop the 0 before single-digit months or add a 0 before every one of them (basically, I want all of my cleaned dates to have the same format).
2. How can I write one regex that identifies all of the date types, instead of needing dateRegex_0, dateRegex_1 and dateRegex_2?
One idea...
import re
#pip install dateparser (if required)
import dateparser
# quite crude pattern; just 1-4 number, then either / or -, then repeated a couple of times
pattern = r'(\d{1,4}(?:/|-)\d{1,4}(?:/|-)\d{1,4})'
# this is just seen as text (could be from the clipboard)...
data = '''
import dateparser
dates = ['1/14/2016', '05-14-2017', '2014/3/18', '2015-06-14 00:00:00', '13-13-2000000']
for date in dates:
print(dateparser.parse(date))
'''
# pull out a list of dates matching the above pattern to a list
extracted_dates = re.findall(pattern, data)
# print out the matched strings if dateparser thinks they are a date
# '13-13-2000' (the part of '13-13-2000000' that the regex matches) is not a valid date, so dateparser returns None for it
for date in extracted_dates:
    parsed = dateparser.parse(date)
    if parsed is not None:
        print(parsed)
Outputs:
2016-01-14 00:00:00
2017-05-14 00:00:00
2014-03-18 00:00:00
2015-06-14 00:00:00
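If an extra dependency is not wanted, the two questions can also be approached with the standard library alone. The following is only a sketch, assuming the input contains just the month-first and year-first layouts from the question; the group names (m1, d1, y1, ...) and the helper clean_dates are made up for this example.

```python
import re

# One pattern with two alternatives: month first or year first.
# The backreference (?P=s1) / (?P=s2) forces the same separator in both positions.
date_regex = re.compile(r'''
    \b(?:
        (?P<m1>\d{1,2})(?P<s1>[-/.])(?P<d1>\d{1,2})(?P=s1)(?P<y1>\d{4})   # 3/14/2015, 03-14-2015
      |
        (?P<y2>\d{4})(?P<s2>[-/.])(?P<m2>\d{1,2})(?P=s2)(?P<d2>\d{1,2})   # 2015/3/14
    )\b
''', re.VERBOSE)

def clean_dates(text):
    """Return every date found in text as zero-padded DD-MM-YYYY."""
    cleaned = []
    for m in date_regex.finditer(text):
        # only one alternative matched, so the other alternative's groups are None
        day = m.group('d1') or m.group('d2')
        month = m.group('m1') or m.group('m2')
        year = m.group('y1') or m.group('y2')
        # zfill(2) pads single-digit days and months with a leading zero
        cleaned.append(f"{day.zfill(2)}-{month.zfill(2)}-{year}")
    return cleaned

print(clean_dates("3/14/2015, 03-14-2015 and 2015/3/14"))
```

Unlike dateparser, this does not check that the numbers form a real calendar date, so something like 99/99/2015 would still be reformatted.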
I have scraped some data from the yellow pages using scrapy.
The hours of the business provided from scraping are in a 12-hour format and I need to convert it into 24 hours.
The format of the business hours I scraped is:
Mon - Fri:,10:00 am - 7:00 pm.
I need to extract the two values for opening and closing time, convert them both into 24-hour format and then concatenate the string back together again.
As a result, I need to devise a regex that will extract each time so that it can be changed into 24-hour format.
The final string (as per the previous example) should be:
Mon - Fri:,10:00 - 19:00
I have tried several regexes, including the following:
import re
txt = 'Mon - Fri:,10:00 am - 7:00 pm'
data = re.findall(r'\s(\d{2}\:\d{2}\s?(?:AM|PM|am|pm))', txt)
print(data)
I am not a Python developer, but this is how it can be done in JavaScript; you can port the logic to Python.
This converts the times to military (24-hour) time using moment.js:
https://jsfiddle.net/1hxojLdf/2/
let text='Mon - Fri:,10:00 am - 7:00 pm';
const regex=/(\w+\s-\s\w+:.)(\d{1,2}:\d{1,2}\s(am|pm))\s-\s(\d{1,2}:\d{1,2}\s(am|pm))/;
const result=text.match(regex);
let timeone=result[2];
let timetwo=result[4];
timeone= moment(timeone,"h:mm A").format('HH:mm');
timetwo= moment(timetwo,"h:mm A").format('HH:mm');
text=result[1]+timeone+"-"+timetwo;
alert(text)
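The same logic can be ported to Python without any third-party library. The sketch below is one possible port, not the answerer's code: it finds each 12-hour time with a regex and converts the hour arithmetically, which avoids locale-dependent AM/PM parsing. The names TIME_12H and to_24_hour are hypothetical.

```python
import re

# hour, minutes, optional single whitespace, then am/pm in any case
TIME_12H = re.compile(r"(\d{1,2}):(\d{2})\s?(am|pm)", re.IGNORECASE)

def to_24_hour(text):
    """Rewrite every 12-hour time in text (e.g. '7:00 pm') as HH:MM."""
    def convert(match):
        hour, minute = int(match.group(1)), match.group(2)
        # 12 am becomes 00, 12 pm stays 12, every other pm hour gains 12
        hour = hour % 12 + (12 if match.group(3).lower() == "pm" else 0)
        return f"{hour:02d}:{minute}"
    return TIME_12H.sub(convert, text)

print(to_24_hour("Mon - Fri:,10:00 am - 7:00 pm"))
```

Because re.sub replaces the times in place, the rest of the string (day range, separators) is preserved and no manual concatenation is needed. Note that \d{1,2} also allows single-digit hours such as 7:00, which the \d{2} in the question's attempt does not.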
I want to insert thousands separators into the prices, as shown below. Do you know a good way to do this?
import re
if __name__ == '__main__':
    sample = "eventA 12:30 - 14:00 5200yen / eventB 15:30 - 17:00 10200yen enjoy!"
    i_want_to_generate = "eventA 12:30 - 14:00 5,200yen / eventB 15:30 - 17:00 10,200yen enjoy!"
    replaced = re.sub("(\d{1,3})(\d{3})", "$1,$2", sample)  # Wrong.
    print(replaced)  # eventA 12:30 - 14:00 $1,$2yen / eventB 15:30 - 17:00 $1,$2yen enjoy!
You're not using the correct notation for your backreferences. You could also add a positive lookahead assertion containing the currency to ensure that only numbers followed by 'yen' are changed:
replaced = re.sub(r"(\d{1,3})(\d{3})(?=yen)", r"\1,\2", sample)
print(replaced)
# eventA 12:30 - 14:00 5,200yen / eventB 15:30 - 17:00 10,200yen enjoy!
Use \1 instead of $1 for substitution
Check: https://regex101.com/r/T2sbD2/1
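The two-group pattern inserts exactly one comma, so a larger value such as 1234567yen would come out as 1234,567yen. A possible alternative, shown here only as a sketch, is to match the whole number and let Python's "," format specifier do the grouping through a replacement function:

```python
import re

sample = "eventA 12:30 - 14:00 5200yen / eventB 15:30 - 17:00 10200yen enjoy!"

# Match the complete number in front of "yen" and format it with thousands separators
replaced = re.sub(r"\d+(?=yen)", lambda m: f"{int(m.group()):,}", sample)
print(replaced)
```

The conversion through int() drops any leading zeros, which is assumed to be acceptable for prices.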
I have this string:
Sun 10:00am - 10:00pm<br>Mon 10:00am - 10:00pm<br>Tue 10:00am - 10:00pm<br>Wed 10:00am - 10:00pm<br>Thu 10:00am - 10:00pm<br>Fri 10:00am - 10:00pm<br>Sat 10:00am - 10:00pm
And I want to extract only the 2 first hours appearing (which would be 10:00am and 10:00pm)
I am trying with slicing and with splitting, but without success.
Regex:
(?<=\s)\d{2}:\d{2}[ap]m
will get all the HH:MM matches; take the first two with list slicing, e.g. [:2], on the result of re.findall.
Without Regex:
Split on the <br> tag, split the first piece on whitespace, then take the second and last elements:
out = str_.split('<br>')[0].split()
[out[1], out[-1]]
Example:
In [56]: str_ = 'Sun 10:00am - 10:00pm<br>Mon 10:00am - 10:00pm<br>Tue 10:00am - 10:00pm<br>Wed 10:00am - 10:00pm<br>Thu 10:00am - 10:00pm<br>Fri 10:00am - 10:00pm<br>Sat 10:00am - 10:00pm'
In [57]: re.findall(r'(?<=\s)\d{2}:\d{2}[ap]m', str_)[:2]
Out[57]: ['10:00am', '10:00pm']
In [58]: out = str_.split('<br>')[0].split()
In [59]: [out[1], out[-1]]
Out[59]: ['10:00am', '10:00pm']
I thought this regex would do:
import re
s= 'Sun 10:00am - 10:00pm<br>Mon 10:00am - 10:00pm<br>Tue 10:00am - 10:00pm<br>Wed 10:00am - 10:00pm<br>Thu 10:00am - 10:00pm<br>Fri 10:00am - 10:00pm<br>Sat 10:00am - 10:00pm'
pattern = r'\d{2}:\d{2}[AaPp][Mm]'
timestamps = re.findall(pattern, s)[:2]
print(timestamps)
No need for regex:
s = "Sun 10:00am - 10:00pm<br>Mon 10:00am - 10:00pm<br>Tue 10:00am - 10:00pm<br>Wed 10:00am - 10:00pm<br>Thu 10:00am - 10:00pm<br>Fri 10:00am - 10:00pm<br>Sat 10:00am - 10:00pm"
spl = s.split("<br>")  # split at <br> into the days
d = {}  # empty dict
for day in spl:  # for each day
    parts = day.split(" ")
    # the weekday is the key; everything after it except '-' goes into the list of times
    d.setdefault(parts[0], []).extend([x for x in parts[1:] if x != '-'])
print(d)
Output:
{'Wed': ['10:00am', '10:00pm'],
'Sun': ['10:00am', '10:00pm'],
'Fri': ['10:00am', '10:00pm'],
'Tue': ['10:00am', '10:00pm'],
'Mon': ['10:00am', '10:00pm'],
'Thu': ['10:00am', '10:00pm'],
'Sat': ['10:00am', '10:00pm']}
It splits your data into days (at <br>), then splits each day into its weekday (used as the dict key) and a list of the two times, omitting the weekday itself and the - between the times.
You get the list of times for Tuesday with tueTime = d['Tue'] and can access the entries with [0] or [1], or by unpacking: open, close = tueTime.
If you only need the first day, use spl = s.split("<br>")[0] instead: before Python 3.7 a dict does not guarantee insertion order, so you cannot rely on it to tell which day came first in your data string.