Trigger to automatically remove EOL whitespace? - python

Can one write a Perforce trigger to automatically remove EOL whitespace at submission time? Preferably in Python? What would that look like? Or can you not modify files as they're being submitted?

To my knowledge this cannot be done, because a trigger cannot put modified file content back on the server. The only two trigger types that let you see the file content with p4 print are change-content and change-commit. With change-commit, the files have already been submitted to the server; with change-content, you can see the (unsubmitted) file content, but there is no way to modify it and put it back on the server.
The only workable trigger is one that rejects submissions containing files with EOL whitespace, so that the submitters fix the files themselves. Here is an excerpt of a similar trigger that checks for tabs in files; please read the documentation on triggers and look at the Perforce site for examples:
import marshal
import os
import sys
# Python 2 excerpt; sChangeNr is the changelist number passed to the trigger
def fail(sComment):
    print sComment
    sys.exit(1)
# @=<change> selects the files of the changelist that is being submitted
sCmd = "p4 -G files //sw/...@=%s" % sChangeNr
stream = os.popen(sCmd, 'rb')
dictResult = []
try:
    while 1:
        dictResult.append(marshal.load(stream))
except EOFError:
    pass
stream.close()
failures = []
# check all files for tabs
for element in dictResult:
    depotFile = element['depotFile']
    sCmd = "p4 print -q %s@=%s" % (depotFile, sChangeNr)
    content = os.popen(sCmd, 'rb').read()
    if content.find('\t') != -1:
        failures.append(depotFile)
if len(failures) != 0:
    error = "Files contain tabs (instead of spaces):\n"
    for i in failures:
        error = error + str(i) + "\n"
    fail(error)
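For the EOL whitespace case, the tab test in the excerpt above could be swapped for a per-line check. The following is only a sketch in Python 3: the function name lines_with_trailing_whitespace is made up for illustration, and it assumes the file content has already been fetched as bytes (for example from p4 print -q).

```python
def lines_with_trailing_whitespace(content: bytes) -> list:
    """Return the 1-based numbers of lines that end in a space or a tab."""
    bad = []
    # bytes.splitlines() handles LF, CRLF and CR line endings
    for number, line in enumerate(content.splitlines(), start=1):
        if line != line.rstrip(b" \t"):
            bad.append(number)
    return bad


# Hypothetical file content standing in for the output of "p4 print -q"
sample = b"ok\nbad \r\n\tindented\nalso bad\t\n"
offending = lines_with_trailing_whitespace(sample)
```

A trigger would then reject the submission whenever the returned list is non-empty, and could include the line numbers in the message so the submitter knows what to fix.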

Can you help me figure out why this function's execution time is so bad?

I am trying to use threads in Python to read some big files (some of them might be over a gigabyte in size) and parse them to find specific info, using the re module.
The problem is that I'm seeing very high execution times.
Reading over 4 files, then parsing them for my data, takes over 30 seconds. Is this expected, or is there anything you can recommend to improve these times?
I apologize in advance; I'm sure this has already been asked in the forum. I really tried to find it myself but could not find the right words to search for this problem.
Below is my current code:
import re
import threading
import time
def get_hostname(file: str) -> str:
    """
    Get the hostname from show tech/show running file
    :param file: show tech/show running string
    :return: the hostname as a string
    """
    hostname = re.findall('hostname.*', file, flags=re.IGNORECASE)
    if len(hostname) > 0:
        return hostname[0].split(' ')[1]
    else:
        print('Could not find a valid hostname on file ' + file)
def set_file_dictionary():
    threads_list = []
    def set_file_dictionary_thread(file_name: str):
        thread_set_file_dict_time = time.time()
        current_file = open(path + file_name, encoding='utf8', errors='ignore').read()
        files_dir[get_hostname(current_file)] = current_file
        print('set_file_dictionary_thread is ' + str(time.time() - thread_set_file_dict_time))
    for file in list_files:
        threads_list.append(threading.Thread(target=set_file_dictionary_thread, args=(file, )))
    for thread in threads_list:
        thread.start()
    for thread in threads_list:
        thread.join()
The result is
set_file_dictionary_thread is 12.55484390258789
set_file_dictionary_thread is 13.184206008911133
set_file_dictionary_thread is 16.15609312057495
set_file_dictionary_thread is 16.19360327720642
Main exec time is 16.1940469741821
Thanks for reading me
NOTE - The indentation is OK in my source; for some reason it gets messed up when copying from PyCharm.
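Firstly, running regex matching in multiple Python threads won't help much: in CPython the re module holds the global interpreter lock (GIL) while it matches, so the threads take turns instead of running in parallel. (see https://stackoverflow.com/a/9984414/14035728)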
Secondly, you can improve your get_hostname function by:
compiling regex beforehand
using search instead of findall, since apparently you only need the first match
using groups to capture the hostname, instead of string split
Here's my suggested get_hostname function:
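import re
# \S+ stops at any whitespace, so the capture cannot run past the end of the line
hostname_re = re.compile(r'hostname (\S+)', flags=re.IGNORECASE)
def get_hostname(file: str) -> str:
    match = hostname_re.search(file)
    if match:
        return match.group(1)
    else:
        # `file` holds the whole file contents, so it is not printed here
        print('Could not find a valid hostname')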
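Since only the first match is needed, another option is to scan the file line by line and stop at the first hit, rather than searching one huge string. This is a rough sketch, not a measured improvement: first_hostname is a made-up name, io.StringIO stands in for an opened file, and it only pays off if the hostname is all you need from that pass.

```python
import io
import re
from typing import Iterable, Optional

HOSTNAME_RE = re.compile(r"hostname (\S+)", flags=re.IGNORECASE)


def first_hostname(lines: Iterable[str]) -> Optional[str]:
    """Return the first hostname found, reading one line at a time."""
    for line in lines:
        match = HOSTNAME_RE.search(line)
        if match:
            return match.group(1)  # stop as soon as the first match is seen
    return None


# io.StringIO stands in for open(path, encoding='utf8', errors='ignore')
sample = io.StringIO("version 15.2\nhostname core-sw-01\ninterface Gi0/1\n")
found = first_hostname(sample)
```

With a real file you would pass the open file object itself, since iterating over it yields one line at a time without loading the whole file into memory.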

Changing output of speedtest.py and speedtest-cli to include IP address in output .csv file

I added a line to the Python script "speedtest.py" that I found at pimylifeup.com. I hoped it would let me track the internet provider and IP address along with all the other speed information the script provides. But when I run it, the findall call only grabs the next word after "from". I would also like it to return the IP address that appears after the provider. I have attached the code below. Can you help me modify it to return what I am looking for?
Here is an example of what is returned by speedtest-cli:
$ speedtest-cli
Retrieving speedtest.net configuration...
Testing from Biglobe (111.111.111.111)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by GLBB Japan (Naha) [51.24 km]: 118.566 ms
Testing download speed................................................................................
Download: 4.00 Mbit/s
Testing upload speed......................................................................................................
Upload: 13.19 Mbit/s
$
And this is an example of what speedtest.py writes to my .csv file:
Date,Time,Ping,Download (Mbit/s),Upload(Mbit/s),myip
05/30/20,12:47,76.391,12.28,19.43,Biglobe
This is what I want it to return.
Date,Time,Ping,Download (Mbit/s),Upload (Mbit/s),myip
05/30/20,12:31,75.158,14.29,19.54,Biglobe 111.111.111.111
Or maybe:
05/30/20,12:31,75.158,14.29,19.54,Biglobe,111.111.111.111
Here is the code that I am using. And thank you for any help you can provide.
import os
import re
import subprocess
import time
response = subprocess.Popen('/usr/local/bin/speedtest-cli', shell=True, stdout=subprocess.PIPE).stdout.read().decode('utf-8')
ping = re.findall(r'km]:\s(.*?)\s', response, re.MULTILINE)
download = re.findall(r'Download:\s(.*?)\s', response, re.MULTILINE)
upload = re.findall(r'Upload:\s(.*?)\s', response, re.MULTILINE)
myip = re.findall(r'from\s(.*?)\s', response, re.MULTILINE)
ping = ping[0].replace(',', '.')
download = download[0].replace(',', '.')
upload = upload[0].replace(',', '.')
myip = myip[0]
try:
    f = open('/home/pi/speedtest/speedtestz.csv', 'a+')
    if os.stat('/home/pi/speedtest/speedtestz.csv').st_size == 0:
        f.write('Date,Time,Ping,Download (Mbit/s),Upload (Mbit/s),myip\r\n')
except:
    pass
f.write('{},{},{},{},{},{}\r\n'.format(time.strftime('%m/%d/%y'), time.strftime('%H:%M'), ping, download, upload, myip))
Let me know if this works for you; it should do everything you're looking for.
#!/usr/bin/env python3
import os
import csv
import time
import subprocess
from decimal import Decimal, ROUND_UP
file_path = '/home/pi/speedtest/speedtestz.csv'
def format_speed(bits_string):
    """ changes string bit/s to megabits/s and rounds up to two decimal places """
    return (Decimal(bits_string) / 1000000).quantize(Decimal('.01'), rounding=ROUND_UP)
def write_csv(row):
    """ writes a header row if one does not exist and test result row """
    # straight from the csv module documentation
    # see: https://docs.python.org/3/library/csv.html
    with open(file_path, 'a+', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"')
        if os.stat(file_path).st_size == 0:
            writer.writerow(['Date','Time','Ping','Download (Mbit/s)','Upload (Mbit/s)','myip'])
        writer.writerow(row)
response = subprocess.run(['/usr/local/bin/speedtest-cli', '--csv'], capture_output=True, encoding='utf-8')
# if speedtest-cli exited with no errors / ran successfully
if response.returncode == 0:
    # from the csv module documentation:
    # "And while the module doesn't directly support parsing strings, it can easily be done"
    # this will remove quotes and spaces vs doing a string split on ','
    # csv.reader returns an iterator, so we turn that into a list
    cols = list(csv.reader([response.stdout]))[0]
    # rounds the ping to a whole number, e.g. 13.45 becomes 13
    ping = Decimal(cols[5]).quantize(Decimal('1.'))
    # speedtest-cli --csv returns speed in bits/s, convert to Mbit/s
    download = format_speed(cols[6])
    upload = format_speed(cols[7])
    ip = cols[9]
    # named date_str/time_str so the time module is not shadowed
    date_str = time.strftime('%m/%d/%y')
    time_str = time.strftime('%H:%M')
    write_csv([date_str, time_str, ping, download, upload, ip])
else:
    print('speedtest-cli returned error: %s' % response.stderr)
$ /usr/local/bin/speedtest-cli --csv-header > speedtestz.csv
$ /usr/local/bin/speedtest-cli --csv >> speedtestz.csv
output:
Server ID,Sponsor,Server Name,Timestamp,Distance,Ping,Download,Upload,Share,IP Address
Does that not get you what you're looking for? Run the first command once to create the CSV file with its header row. Subsequent runs use the append operator '>>', which adds a test result row each time you run it.
Parsing the human-readable output with all of those regexes will bite you if speedtest-cli, or a library it depends on, changes its output format.
There are plenty of ways to do it, though. Hope this helps.
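If you would still rather parse the plain-text output, as the script in the question does, both the provider and the IP address can be captured from the "Testing from" line with two groups. This is only a sketch: provider_and_ip is a made-up name, it handles IPv4 addresses only, and it assumes the line keeps the "Testing from NAME (ADDRESS)..." shape shown in the question.

```python
import re

# Matches e.g. "Testing from Biglobe (111.111.111.111)..."
TESTING_FROM = re.compile(r"Testing from (.+?) \((\d{1,3}(?:\.\d{1,3}){3})\)")


def provider_and_ip(response: str):
    """Return (provider, ip) from speedtest-cli's plain-text output, or None."""
    match = TESTING_FROM.search(response)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = (
    "Retrieving speedtest.net configuration...\n"
    "Testing from Biglobe (111.111.111.111)...\n"
    "Retrieving speedtest.net server list...\n"
)
result = provider_and_ip(sample)
```

The two values can then be written as one column ("Biglobe 111.111.111.111") or as two separate columns, whichever layout you prefer for the CSV file.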

Using win32com to download attachments through outlook with python

I've written a short script to download and rename files from a specific folder in my Outlook account. The script works great; the only problem is that I typically need to run it several times to actually download all of the messages. It seems to skip some of the messages, and there are no errors when I run it.
I've tried a few things, like walking through each line step by step in the Python window, running the code with Outlook closed or open, and printing the files after they're successfully saved to see if specific messages are causing the problem.
Here's my code:
#! python3
# downloadAttachments.py - Downloads all of the weight tickets from Bucky
# Currently saves to desktop due to instability of I: drive connection
import win32com.client, os, re
#This line opens the outlook application
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
#6 is the olFolderInbox constant, i.e. the default inbox
inbox = outlook.GetDefaultFolder(6)
#box where the messages are to save
TicketSave = inbox.Folders('WDE').Folders('SAVE').Folders('TicketSave')
#box where the messages are moved to
done = inbox.Folders('WDE').Folders('CHES').Folders('Weight Tickets')
ticketMessages = TicketSave.Items
#Key is used to verify the subject line is correct. This script only works if the person sends
# their emails with a consistent subject line (can be altered for other cases)
key = re.compile(r'wde load \d{3}') #requires regular expressions (i.e. 'import re')
for message in ticketMessages:
    #will skip any message that does not match the correct subject line format (non-case sensitive)
    check = str(message.Subject).lower()
    if key.search(check) == None:
        continue
    attachments = message.Attachments
    tic = attachments.item(1)
    ticnum = str(message.Subject).split()[2]
    name = str(tic).split()[0] + ' ticket ' + ticnum + '.pdf' #changes the filename
    tic.SaveAsFile('C:\\Users\\bhalvorson\\Desktop\\Attachments' + os.sep + str(name))
    if message.UnRead == True:
        message.UnRead = False
    message.Move(done)
    print('Ticket pdf: ' + name + ' save successfully')
Alright I found the answer to my own question. I'll post it here in case any other youngster runs into the same problem as me.
The main problem is the "message.Move(done)" line, second from the bottom.
Apparently Move alters the folder's Items collection while the for loop is still iterating over it, which changes the number of iterations the loop goes through. So, the way it's written above, the code only ever processes about half of the items in the folder.
An easy workaround is to change the for statement to "for message in list(ticketMessages):". The list is a snapshot that is not affected by Move, so the loop visits every message.
Hope this helps someone.
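The same effect can be reproduced without Outlook. In the sketch below a plain Python list stands in for the folder's Items collection and folder.remove(item) stands in for message.Move(done); the function name drain is made up for illustration, and a real COM collection may differ in detail.

```python
def drain(folder, snapshot):
    """Remove every item from `folder`; return the items that were processed."""
    processed = []
    # list(folder) takes a copy, so removals do not disturb the iteration
    source = list(folder) if snapshot else folder
    for item in source:
        folder.remove(item)  # stands in for message.Move(done)
        processed.append(item)
    return processed


live = drain(["a", "b", "c", "d"], snapshot=False)  # skips every other item
safe = drain(["a", "b", "c", "d"], snapshot=True)   # visits all four items
```

Iterating over the live collection skips an item after each removal, which matches the "only half of the messages" symptom, while the snapshot version sees every item once.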

Message sent over socket missing the \n

I am designing a protocol for a TCP/IP socket between Python and MATLAB. While setting it up, I ran into a problem with the code below:
from pathlib import Path
FPath = Path('c:/test/dogg.jpg')
HASH = Commons.get_file_md5_hash(FPath)
msg = 'IDINFO' + FPath.name + 'HASH' + HASH + '\n'
which generates the message
IDINFOdogg.jpgHASH7ad1a930dab3c099939b66267b5c57f8
The message contains two tags: IDINFO, which tells the server the name of the file, and HASH, which tells it the file's MD5 hash.
After this I open the file using
f = open(FPath,"rb")
chunk = f.read(1020)
and build a packet with the tag DATA in front
msg = b'DATA' + chunk + b'\n'
The problem is that the b'\n' does not seem to be the same as the '\n' in the first message: MATLAB cannot read the delimiter and won't continue grabbing data chunks.
The MATLAB code is below for reference. This isn't the entire object, just the part that is potentially causing trouble.
To set up a callback:
set(gh.tcpipServer, 'BytesAvailableFcnMode','Terminator');
set(gh.tcpipServer, 'BytesAvailableFcn', @(h,e)gh.Serverpull(h,e));
The function for looking at the bytes:
function Serverpull(gh,h,e)
    gh.msg = fread(gh.tcpipServer,gh.tcpipServer.BytesAvailable);
    gh.msgdecode = char(transpose(gh.msg));
    if strfind(gh.msgdecode,'IDINFO')
        Hst = strfind(gh.msgdecode,'HASH');
        gh.Fname = gh.msgdecode(7:Hst-1);
        gh.HASH = gh.msgdecode(Hst+4:end);
        fwrite(gh.tcpipServer,'GoodToGo');
        gh.PrepareforDataAq()
    elseif strfind(gh.msgdecode,'DATA')
        fwrite(gh.fileID,gh.msg(5:end),'double');
    elseif strfind(gh.msgdecode,'EOF')
        fclose(gh.fileID);
        display('File Transfer Complete')
    end
end
function PrepareforDataAq(gh)
    path = fullfile('c:\temp\',gh.Fname);
    gh.fileID = fopen(path,'w');
end
TL;DR:
How do I make the b'\n' in a TCP message built from bytes the same as the '\n' in a message built from a string and then encoded?
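A quick check on the Python side may help narrow this down. The sketch below assumes the first message is encoded as ASCII before sending (the encoding step is not shown above) and uses a made-up six-byte chunk in place of f.read(1020). It compares the last byte of both messages and counts newline bytes inside the binary chunk.

```python
# Text message from the question, encoded the way it would be sent (assumed ASCII)
text_msg = ("IDINFO" + "dogg.jpg" + "HASH" + "7ad1a930dab3c099939b66267b5c57f8" + "\n").encode("ascii")

# Made-up stand-in for f.read(1020); real image data can contain any byte value
chunk = bytes([0xFF, 0xD8, 0x0A, 0x00, 0x0A, 0x10])
data_msg = b"DATA" + chunk + b"\n"

# Both messages end with the same single byte, 0x0A
same_terminator = text_msg[-1:] == data_msg[-1:] == b"\n"

# Newline bytes that appear inside the binary payload itself
embedded_newlines = chunk.count(b"\n")
```

If the terminators compare equal, as they do here, the delimiter itself is probably not the difference. One thing worth checking instead is whether the binary payload contains 0x0A bytes of its own, since a terminator-based reader could then split a DATA message before its real end.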

How do I run python '__main__' program file from bash prompt in Windows10?

I am trying to run a python3 program file and am getting some unexpected behaviors.
I'll start off first with my PATH and env setup configuration. When I run:
which Python
I get:
/c/Program Files/Python36/python
From there, I cd into the directory where my python program is located to prepare to run the program.
Roughly speaking this is how my python program is set up:
import modulesNeeded
print('1st debug statement to show program execution')
# variables declared as needed
def aFunctionNeeded():
    print('2nd debug statement to show fxn exe, never prints')
    ... function logic...
if __name__ == '__main__':
    aFunctionNeeded() # Never gets called
Here is a link to the repository with the code I am working with in case you would like more details as to the implementation. Keep in mind that API keys are not published, but API keys are in local file correctly:
https://github.com/lopezdp/API.Mashups
My question revolves around why my 1st debug statements inside the files are printing to the terminal, but not the 2nd debug statements inside the functions?
This is happening in both of the findRestaurant.py file and the geocode.py file.
I know I have written my if __name__ == '__main__': program entry point correctly as this is the same exact way I have done it for other programs, but in this case I may be missing something that I am not noticing.
If this is my output when I run my program in my bash terminal:
$ python findRestaurant.py
inside geo
inside find
then, why does it appear that my aFunctionNeeded() method shown in my pseudo code is not being called from the main?
Why do both programs seem to fail immediately after the first debug statements are printed to the terminal?
findRestaurant.py File that can also be found in link above
from geocode import getGeocodeLocation
import json
import httplib2
import sys
import codecs
print('inside find')
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)
foursquare_client_id = "..."
foursquare_client_secret = "..."
def findARestaurant(mealType,location):
    print('inside findFxn')
    #1. Use getGeocodeLocation to get the latitude and longitude coordinates of the location string.
    latitude, longitude = getGeocodeLocation(location)
    #2. Use foursquare API to find a nearby restaurant with the latitude, longitude, and mealType strings.
    #HINT: format for url will be something like https://api.foursquare.com/v2/venues/search?client_id=CLIENT_ID&client_secret=CLIENT_SECRET&v=20130815&ll=40.7,-74&query=sushi
    url = ('https://api.foursquare.com/v2/venues/search?client_id=%s&client_secret=%s&v=20130815&ll=%s,%s&query=%s' % (foursquare_client_id, foursquare_client_secret,latitude,longitude,mealType))
    h = httplib2.Http()
    result = json.loads(h.request(url,'GET')[1])
    if result['response']['venues']:
        #3. Grab the first restaurant
        restaurant = result['response']['venues'][0]
        venue_id = restaurant['id']
        restaurant_name = restaurant['name']
        restaurant_address = restaurant['location']['formattedAddress']
        address = ""
        for i in restaurant_address:
            address += i + " "
        restaurant_address = address
        #4. Get a 300x300 picture of the restaurant using the venue_id (you can change this by altering the 300x300 value in the URL or replacing it with 'original' to get the original picture)
        url = ('https://api.foursquare.com/v2/venues/%s/photos?client_id=%s&v=20150603&client_secret=%s' % ((venue_id,foursquare_client_id,foursquare_client_secret)))
        result = json.loads(h.request(url, 'GET')[1])
        #5. Grab the first image
        if result['response']['photos']['items']:
            firstpic = result['response']['photos']['items'][0]
            prefix = firstpic['prefix']
            suffix = firstpic['suffix']
            imageURL = prefix + "300x300" + suffix
        else:
            #6. if no image available, insert default image url
            imageURL = "http://pixabay.com/get/8926af5eb597ca51ca4c/1433440765/cheeseburger-34314_1280.png?direct"
        #7. return a dictionary containing the restaurant name, address, and image url
        restaurantInfo = {'name':restaurant_name, 'address':restaurant_address, 'image':imageURL}
        print ("Restaurant Name: %s" % restaurantInfo['name'])
        print ("Restaurant Address: %s" % restaurantInfo['address'])
        print ("Image: %s \n" % restaurantInfo['image'])
        return restaurantInfo
    else:
        print ("No Restaurants Found for %s" % location)
        return "No Restaurants Found"
if __name__ == '__main__':
    findARestaurant("Pizza", "Tokyo, Japan")
geocode.py File that can also be found in link above
import httplib2
import json
print('inside geo')
def getGeocodeLocation(inputString):
    print('inside of geoFxn')
    # Use Google Maps to convert a location into Latitude/Longitude coordinates
    # FORMAT: https://maps.googleapis.com/maps/api/geocode/json?address=1600+Amphitheatre+Parkway,+Mountain+View,+CA&key=API_KEY
    google_api_key = "..."
    locationString = inputString.replace(" ", "+")
    url = ('https://maps.googleapis.com/maps/api/geocode/json?address=%s&key=%s' % (locationString, google_api_key))
    h = httplib2.Http()
    result = json.loads(h.request(url,'GET')[1])
    latitude = result['results'][0]['geometry']['location']['lat']
    longitude = result['results'][0]['geometry']['location']['lng']
    return (latitude,longitude)
The reason you're not seeing the output of the later parts of your code is that you've rebound the standard output and error streams with these lines:
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)
I'm not exactly sure why those lines are breaking things for you; perhaps your console does not expect UTF-8 encoded output. But because they don't work as intended, you're not seeing anything from the rest of your code, including error messages, since you rebound the stderr stream along with the stdout stream.
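One plausible explanation, assuming the script runs under Python 3 (the Python36 path in the question suggests so): codecs.getwriter('utf8') returns a writer that encodes text to bytes, but sys.stdout in Python 3 is a text stream that only accepts str. The sketch below reproduces that mismatch with in-memory streams; io.StringIO stands in for sys.stdout and io.BytesIO for sys.stdout.buffer, and the helper name is made up.

```python
import codecs
import io


def write_via_utf8_writer(stream, text):
    """Wrap `stream` in a UTF-8 StreamWriter and try to write `text` to it."""
    writer = codecs.getwriter("utf8")(stream)
    try:
        writer.write(text)  # encodes to bytes, then calls stream.write(bytes)
    except TypeError:
        return "TypeError"
    return "ok"


# A text stream (like sys.stdout in Python 3) rejects the encoded bytes
text_result = write_via_utf8_writer(io.StringIO(), "inside findFxn\n")
# A byte stream (like sys.stdout.buffer) accepts them
bytes_result = write_via_utf8_writer(io.BytesIO(), "inside findFxn\n")
```

If this is what is happening, every print after the rebinding raises an error, and the traceback is lost too because stderr was wrapped the same way. Removing the two lines, or wrapping sys.stdout.buffer and sys.stderr.buffer instead, would be the things to try.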
