I am recording responses during a simple calculation task in Python, and I am storing these in a string. I would like to use the numeric keypad, but its keys give, for instance, 'num_1' instead of '1'. It probably has something to do with the fact that I store the input as a Text Stimulus in PsychoPy. Is there any way to get around this?
CapturedResponseString = visual.TextStim(myWin,
units='norm',height = 0.2,
pos=(0,-0.40), text='',
alignHoriz = 'center',alignVert='center', color=[-1,-1,-1])
captured_string = '' #key presses will be captured in this string
If all your responses are preceded by "num_" you can just strip that prefix. Apply this to the key name string (not to the TextStim object itself): for example, if key holds 'num_1', int(key[4:]) will grab the numerical portion and turn it into an integer.
Python has lots of string processing tools that are much more sophisticated than this, and they are all available to you when using PsychoPy. For example, you could also split at the underscore. key.split('_') will return a list with the stuff before the underscore in the first position and the rest in the second (assuming only one underscore).
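As a minimal sketch of the prefix-stripping idea, assuming the key names arrive as plain strings such as 'num_1' or '2' (the helper name key_to_char and the sample key list are hypothetical, standing in for whatever your key-reading call returns):

```python
def key_to_char(key_name):
    """Map a keypad name such as 'num_1' to '1'; leave other names unchanged."""
    prefix = "num_"
    if key_name.startswith(prefix):
        return key_name[len(prefix):]
    return key_name


# Hypothetical sequence of key names, mixing keypad and top-row digits.
captured_string = ""
for key in ["num_1", "2", "num_3"]:
    captured_string += key_to_char(key)
```

Checking for the prefix first means top-row digits, which have no "num_" prefix, pass through untouched.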
I have a script that processes the output of a command (the aws help cli command).
I step through the output line-by-line and don't start the actual real parsing until I encounter the text "AVAILABLE COMMANDS" at which point I set a flag to true and start further processing on each line.
I've had this working fine - BUT on Ubuntu we encounter a problem which is this :
The CLI highlights the text in a way I have not seen before:
The output is very long, so I've grep'd the particular line in question - see below:
># aws ec2 help | egrep '^A'
>AVAILABLE COMMANDS
># aws ec2 help | egrep '^A' | cat -vet
>A^HAV^HVA^HAI^HIL^HLA^HAB^HBL^HLE^HE C^HCO^HOM^HMM^HMA^HAN^HND^HDS^HS$
What I haven't seen before is that each letter that is highlighted is in the format X^HX.
I'd like to apply a simple transformation of the type X^HX --> X (for all a-zA-Z).
What have I tried so far:
well my workaround is this - first I remove control characters like this:
String = re.sub(r'[\x00-\x1f\x7f-\x9f]','',String)
but I still have to search for 'AAVVAAIILLAABBLLEE' which is totally ugly. I considered using a further regex to turn doubles to singles but that will catch true doubles and get messy.
I started writing a function with an iteration across a constructed list of alpha characters to translate as described, and I used hexdump to try to figure out the exact \x code of the control characters in question but could not get it working - I could remove H but not the ^.
I really don't want to use any additional modules because I want to make this available to people without them having to install extras. In conclusion, I have a workaround that is quite ugly, but I'm sure someone must know a quick and easy way to do this translation. It's odd that it only seems to show up on Ubuntu.
After looking at this a little further I was able to put in place a solution:
import re
from string import ascii_lowercase
from string import ascii_uppercase

def RemoveUbuntuHighlighting(String):
    # Collapse each "X<backspace>X" triple to a single "X", one letter at a time.
    for Char in ascii_uppercase + ascii_lowercase:
        Match = Char + '\x08' + Char
        String = re.sub(Match, Char, String)
    return String
I'm still a little confounded to see characters highlighted in the format (X\x08X), the arrangement does seem to repeat the same information unnecessarily.
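The other thing I would advise to anyone not familiar with reading hexdump output is that, in hexdump's default format, each pair of bytes appears swapped with respect to their order in the data, because it prints 16-bit words (little-endian on most machines). hexdump -C shows the bytes in their actual order.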
A much simpler and more reliable fix is to replace a backspace and duplicate of any character.
I have also augmented this to handle underscores using the same mechanism (character, backspace, underscore).
String = re.sub(r'(.)\x08(\1|_)', r'\1', String)
Demo: https://ideone.com/yzwd2V
This highlighting was standard back when output was to a line printer; backspacing and printing the same character again would add pigmentation to produce boldface. (Backspacing and printing an underscore would produce underlining.)
Probably the AWS CLI can be configured to disable this by setting the TERM variable to something like dumb. There is also a utility col which can remove this formatting (try col -b; maybe see also colcrt). Though perhaps really the best solution would be to import the AWS Python code and extract the help message natively.
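To see the backspace regex at work, here is a small sketch that rebuilds the overstruck header from the cat -vet output above and cleans it. The function name strip_overstrike is only illustrative, and the sample string is constructed by hand rather than taken from real aws output:

```python
import re


def strip_overstrike(text):
    """Collapse 'X<backspace>X' (bold) and 'X<backspace>_' to a plain 'X'."""
    return re.sub(r'(.)\x08(\1|_)', r'\1', text)


# Rebuild "A^HAV^HV... C^HCO^HO..." by overstriking every letter of each word.
raw = " ".join(
    "".join(c + "\x08" + c for c in word)
    for word in ["AVAILABLE", "COMMANDS"]
)
clean = strip_overstrike(raw)
```

Because the backspace has to sit between the two copies, genuine double letters (as in "book") are left alone.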
A few weeks ago I needed a crawler for data collection and sorting, so I started learning Python.
The same day I wrote a simple crawler, but the code looked ugly as hell, mainly because I don't know how to do certain things and I don't know how to properly google them.
Example:
Instead of deleting [, ] and ' in one line I did
extra_nr = extra_nr.replace("'", '')
extra_nr = extra_nr.replace("[", '')
extra_nr = extra_nr.replace("]", '')
extra_nr = extra_nr.replace(",", '')
Because I couldn't do stuff to list object and when I did str(list object) It looked like ['this', 'and this'].
Now I'm creating a Discord bot that will upload data that I feed to it to a Google spreadsheet. The code is long and ugly. And it takes like 2-3 secs to start the bot (idk if this is normal; I think the more I write the more time it takes to start, which makes me think that the code is garbage). Sometimes it works, sometimes it doesn't.
My question is how do I know that I wrote something good? And if I just keep adding stuff like in the example, how will it affect my program? If I have a really long code do I split it and call the parts of it only when they are needed or how does it work?
tl;dr to get good at Python and write good code, write a lot of Python and read other people's code. Learn multiple approaches to different problem types and get a feel for which to use and when. It's something that comes over time with a lot of practice. As far as resources, I highly recommend the book "Automate the Boring Stuff with Python".
As for your code sample, you could use translate for this:
def strip(my_string):
    bad_chars = [*"[],'"]
    return my_string.translate({ord(c): None for c in bad_chars})
translate does a character by character translation of the string given a translation table, so you create a small translation table with the characters you don't want set to None.
The list of characters you don't want is created by unpacking (splatting) a string of the characters.
>>> [*"abc"] == ["a", "b", "c"]
True
Another option would be using comprehensions:
def strip(my_string):
    bad_chars = [*"[],'"]
    return "".join(c for c in my_string if c not in bad_chars)
Here we use a generator expression, which has the same shape as a list comprehension [x for x in y] without the brackets, to produce the characters of my_string one by one, dropping any that appear in bad_chars. We then join the remaining characters into a string that doesn't have the specified characters in it.
You will definitely improve quickly from reading (or listening) up on Python best practices from resources like Real Python and Talk Python To Me.
Meanwhile, I'd recommend starting using some code analysers like pylint and bandit as part of your regular workflow.
In any case, welcome to the world of Python and enjoy! :-)
You can use maketrans() to define characters to remove (3rd parameter):
def clean(S): return S.translate(str.maketrans("","","[],'"))
clean("A['23']") # 'A23'
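Since the original problem came from calling str() on a list, it may be simpler to join the list items directly instead of cleaning up the repr afterwards. A small sketch, using a hypothetical two-item list, comparing the two routes:

```python
items = ["this", "and this"]  # hypothetical list, as in the question

# Route 1: stringify the list, then delete the punctuation the repr added.
as_text = str(items)
cleaned = as_text.translate(str.maketrans("", "", "[],'"))

# Route 2: never create the punctuation in the first place.
joined = " ".join(items)
```

Both give the same text here, but the join version also survives items that legitimately contain commas or quotes.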
A quick question. I have no idea how to google for an answer.
I have a python program for taking input from users and generating string output.
I then use these string variables to populate text boxes in other software (Illustrator).
The string is made out of: number + '%' + text, e.g.
'50% Cool', '25% Alright', '25% Decent'.
These three elements are entered into one text box (next to one another), and, as it is with text boxes, if one line does not fit the whole text, the text is moved down to another line as soon as it finds a white space ' '. Like so:
50% Cool 25% Alright 25%
Decent
I need to keep this feature in (Where text gets moved down to a lower line if it does not fit) but I need it to move the whole element and not split it.
Like So:
50% Cool 25% Alright
25% Decent
The only way I can think of to stop this from happening is to use some sort of invisible ASCII code which connects each element together (while still retaining human-visible white spaces).
Does anyone know of such ASCII connector that could be used?
So, understand first of all that what you're asking about is encoding-specific. Plain ASCII has no non-breaking space; in extended encodings such as Latin-1 it is character 160, and in some older DOS code pages it is character 255. (See this discussion for details.) E.g.:
non_breaking_space = chr(160)
HOWEVER, that's for encoded ASCII binary strings. Assuming you're using Python 3 (which you should consider if you're not), your strings are all Unicode strings, not encoded binary strings.
And this also raises the question of how you plan on getting these strings into Illustrator. Does the user copy and paste them? Are they encoded into a file? That will affect how you want to transmit a non-breaking space.
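Assuming you're using Python 3 and not worrying about encoding, this is what you want:
'25%\u00a0Alright'
\u00a0 inserts a Unicode non-breaking space (\u0020 would be an ordinary space), so the text box cannot break the line between "25%" and "Alright".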
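A minimal sketch of how the whole line could be assembled, assuming Python 3 and assuming Illustrator honours U+00A0 when the text reaches it (that depends on how the strings are transferred). The helper name join_elements is hypothetical:

```python
NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def join_elements(elements):
    """Glue the words inside each element with NBSP; separate elements with
    ordinary spaces so a line may only break between whole elements."""
    return " ".join(element.replace(" ", NBSP) for element in elements)


line = join_elements(["50% Cool", "25% Alright", "25% Decent"])
```

The result looks identical on screen, but the only breakable spaces left are the ones between elements.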
I'm a beginner in Python ... I'd like to format characters in Python using the basic concepts and operations of tuples and lists, as below ...
I enter a 10-digit number, and all digits except the last 4 should be replaced by 'X'. For example:
number = 1234567890
Expecting output as -
number = XXXXXX7890
How can I mask entered characters / numbers in Python using tuples/lists, without importing any modules or using existing high-level functions? Is it possible?
For e.g. entered some characters , those should be masked using * (asterisk) or # (hashed) while entering. For e.g.
password : pa55w0rd
Expecting output while entering password as -
password : ********
OR
password: ########
It is always better to use built-in modules for sensitive things like passwords. One way of doing it is the following:
import getpass

number = 1234567890
first = 'X' * max(0, len(str(number)[:-4]))
last = str(number)[-4:]
n = first + last
print(n)

# part 2
p = getpass.getpass(prompt='Enter the number : ')
if int(p) == 123:
    print('Welcome..!!!')
else:
    print('Please enter correct number..!!!')
If you don't want to display typed password just print:
print('######')
It does not have to be of the same length you just have to print something.
Break down what's needed: you need to convert to a string, to figure out how many characters to replace, generate a replacement string of that length, then include the tail of the original string. Also you need to be robust against, eg, strings too short to have any characters replaced.
'X' * max(0, len(str(number)) - 4) + str(number)[-4:]
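The breakdown above can be wrapped in a small function. This is only a sketch; the name mask_number and its visible and mask_char parameters are illustrative additions, not part of the original answer:

```python
def mask_number(number, visible=4, mask_char="X"):
    """Replace all but the last `visible` characters with `mask_char`."""
    digits = str(number)
    # Never negative, so short inputs are returned unmasked.
    hidden = max(0, len(digits) - visible)
    return mask_char * hidden + digits[hidden:]


masked = mask_number(1234567890)
short = mask_number(123)
```

Slicing from hidden rather than from -visible keeps the function correct even when visible is 0.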
For the second part: use a library.
Doing this directly is more complicated than it might seem to a beginner, because you're having to communicate with the systems which take text entry. It's going to depend upon the operating system, Windows vs "roughly everything else". For text entry outside of a web-browser or a GUI, most systems are emulating ancient text-only terminal devices because there's not yet enough reason to change that. Those devices have modes of text input (character at a time, line at a time, raw, etc) and changing them to not immediately "echo" the character typed involves some intricate system calls, and then other programming to echo a different character instead.
Thus you're going to want to use a library to take care of all those intricate details for you. Something around password entry. Given the security implications, using tested and hardened code instead of rolling your own is something I strongly encourage. Be aware that there are all sorts of issues around password handling too (constant time comparisons, memory handling, etc) such that as much as possible, you should avoid doing it at all, or move it to another program, and when you do handle it, use the existing libraries.
If you can, stick to the Python standard library and use getpass which won't echo anything for passwords, instead of printing stars.
If you really want the stars, then search https://pypi.org/ for getpass and see all the variants people have produced. Most of the ones I saw in a quick look didn't inspire confidence; pysectools seemed better than the others, but I've not used it.
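On the constant-time comparison point mentioned above, the standard library offers hmac.compare_digest. A minimal sketch, assuming both values are ordinary strings; the function name secrets_match is hypothetical, and a real system would normally compare password hashes rather than plaintext:

```python
import hmac


def secrets_match(entered, expected):
    """Compare two secrets without short-circuiting at the first
    differing character, unlike the == operator."""
    return hmac.compare_digest(entered.encode("utf-8"),
                               expected.encode("utf-8"))


ok = secrets_match("pa55w0rd", "pa55w0rd")
bad = secrets_match("pa55w0rd", "password")
```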
I have a large set of strings, and am looking to extract a certain part of each of the strings. Each string contains a sub string like this:
my_token:[
"key_of_interest"
],
This is the only part in each string it says my_token. I was thinking about getting the end index position of ' my_token:[" ' and after that getting the beginning index position of ' "], ' and getting all the text between those two index positions.
Is there a better or more efficient way of doing this? I'll be doing this for string of length ~10,000 and sets of size 100,000.
Edit: The file is a .ion file. From my understanding it can be treated as a flat file - as it is text based and used for describing metadata.
How can this can possibly be done the "dumbest and simplest way"?
1. find the starting position
2. look on for the ending position
3. grab everything indiscriminately between the two
This is indeed what you're doing. Thus any further improvement can only come from the optimization of each step. Possible ways include:
- narrow down the search region (requires additional constraints/assumptions as per comment56995056)
- speed up the search operation bits, which include:
  - extracting raw data from the format: you already did this by disregarding the format altogether, so you have to make sure there'll never be any incorrect parsing (e.g. your search terms embedded in strings elsewhere or matching a part of a token) as per comment56995034
  - elementary pattern comparison operation: a speedup is unlikely to be attainable in pure Python, since str.index is implemented in C already and the implementation is probably already as simple as can possibly be
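The three steps can be sketched with str.find, which returns -1 instead of raising when a marker is missing. The function name extract_between and the sample text are hypothetical; the markers follow the question's snippet:

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between the first start_marker and the next
    end_marker, or None if either marker is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)  # only search after the start marker
    if end == -1:
        return None
    return text[start:end]


sample = 'header\nmy_token:[\n"key_of_interest"\n],\nfooter'
raw = extract_between(sample, "my_token:[", "]")
key = raw.strip().strip('"')  # drop surrounding whitespace and quotes
```

Passing start as the second argument to find is what narrows the second search to the region after the token.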
The underlying requirement shows through when you clarify:
I was thinking about getting the end index position of ' my_token:[" ' and after that getting the beginning index position of ' "], ' and getting all the text between those two index positions.
That sounds like you're trying to avoid the correct approach: use a parser for whatever language is in the string.
There is no good reason to build directly on top of string primitives for parsing, unless you are interested in writing yet another parsing framework.
So, use libraries written by people who have dealt with the issues before you.
If it's JSON, use the standard library json module; ditto if it's some other language with a parser already in the Python standard library.
If it's some other widely-implemented standard: get whichever already-existing third-party Python library knows how to parse that properly.
If it's not already implemented: write a custom parser using pyparsing or some other well-known solid library.
So to make a good choice you need to know what is the data format (this is not answered by “what are the file names”; rather, you need to know what is the data format of the content of those files). Then you'll be able to search for a parser library that knows about that data format.
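For illustration only, here is what the JSON case would look like. This assumes the content is valid JSON with a quoted key, which the snippet in the question as written is not, so the sample document below is hypothetical:

```python
import json

# Hypothetical JSON document containing the token of interest.
raw = '{"my_token": ["key_of_interest"], "other": 1}'

data = json.loads(raw)              # the parser handles quoting and escapes
key_of_interest = data["my_token"][0]
```

The parser deals with whitespace, escapes, and nesting, so no index arithmetic is needed.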
Well, as already mentioned - a parser seems the best option.
But to answer your question without all this extra advice ... if you're just looking at speed, a parser isn't really the best method of doing this. The faster method, if you already have a string like this, would be to use a regex.
import re
# text is one of your strings; re.search is used because the token is not at the start
matches = re.search(r'my_token:\[\s*"(.*?)"\s*\]', text)
key_of_interest = matches.group(1)
There are other issues that come up. For example, what if your key has a " inside it? Stringified JSON will automatically use an escape character there, and that will be captured by the regex too. And therefore this gets a bit too complicated.
And JSON is not regex parsable in itself (is-json-a-regular-language). So, use at your own risk. But with the appropriate restrictions and assumptions regex would be faster than a json parser.
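A sketch of the regex route with those restrictions made explicit: it assumes the key contains no double quotes or backslashes, and that my_token occurs once per string. The names TOKEN_RE and extract_key are hypothetical. Compiling the pattern once is a reasonable choice when it is reused across 100,000 strings:

```python
import re

# Assumption: the key itself contains neither '"' nor '\'.
TOKEN_RE = re.compile(r'my_token:\[\s*"([^"\\]*)"\s*\]')


def extract_key(text):
    """Return the quoted key inside my_token:[ ... ], or None if absent."""
    match = TOKEN_RE.search(text)
    return match.group(1) if match else None


sample = 'header\nmy_token:[\n"key_of_interest"\n],\nfooter'
found = extract_key(sample)
missing = extract_key("no token here")
```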