Constructing a long Python string using variables - python

I have a python string which is basically a concatenation of 3 variables.I am using f-strings to make it a string. It looks like this now:
my_string = f'{getattr(RequestMethodVerbMapping, self.request_method).value} {self.serializer.Meta.model.__name__} {self.request_data['name']}'
which gives me the output:
Create IPGroup test-create-demo-098
Exactly the output that I want. However, as is obvious the line is too long and now Pylint starts complaining so I try to break up the line using multiline f-strings as follows:
my_string = f'''{getattr(RequestMethodVerbMapping, self.request_method).value}
{self.serializer.Meta.model.__name__} {self.request_data['name']}'''
Pylint is now happy but my string looks like this:
Create
IPGroup test-create-demo-098
What is the best way of doing this so that I get my string in one line and also silence Pylint for the line being longer than 120 chars?

This is a long line by itself and, instead of trying to fit a lot into a single line, I would break it down and apply the "Extract Variable" refactoring method for readability:
request_method_name = getattr(RequestMethodVerbMapping, self.request_method).value
model_name = self.serializer.Meta.model.__name__
name = self.request_data['name']
my_string = f'{request_method_name} {model_name} {name}'
I think the following pieces of wisdom from the Zen of Python fit our problem here well:
Sparse is better than dense.
Readability counts.

It is possible to concat f-strings with just whitespace in between, just like ordinary strings. Just parenthesize the whole thing.
my_string = (f'{getattr(RequestMethodVerbMapping, self.request_method).value}'
f' {self.serializer.Meta.model.__name__}
f' {self.request_data["name"]}')
It will compile to exactly the same code as your one-line string.
It is even possible to interleave f-strings and non-f-strings:
>>> print(f"'foo'" '"{bar}"' f'{1 + 42}')
'foo'"{bar}"43

Related

How to convert a regular string to a raw string? [duplicate]

I have a string s, its contents are variable. How can I make it a raw string? I'm looking for something similar to the r'' method.
i believe what you're looking for is the str.encode("string-escape") function. For example, if you have a variable that you want to 'raw string':
a = '\x89'
a.encode('unicode_escape')
'\\x89'
Note: Use string-escape for python 2.x and older versions
I was searching for a similar solution and found the solution via:
casting raw strings python
Raw strings are not a different kind of string. They are a different way of describing a string in your source code. Once the string is created, it is what it is.
Since strings in Python are immutable, you cannot "make it" anything different. You can however, create a new raw string from s, like this:
raw_s = r'{}'.format(s)
As of Python 3.6, you can use the following (similar to #slashCoder):
def to_raw(string):
return fr"{string}"
my_dir ="C:\data\projects"
to_raw(my_dir)
yields 'C:\\data\\projects'. I'm using it on a Windows 10 machine to pass directories to functions.
raw strings apply only to string literals. they exist so that you can more conveniently express strings that would be modified by escape sequence processing. This is most especially useful when writing out regular expressions, or other forms of code in string literals. if you want a unicode string without escape processing, just prefix it with ur, like ur'somestring'.
For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:
a = 'hello\nbobby\nsally\n'
a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(a)
Which gives a value that can be written as CSV:
hello\nbobby\nsally\n
There doesn't seem to be a solution for other special characters, however, that may get a single \ before them. It's a bummer. Solving that would be complex.
For example, to serialize a pandas.Series containing a list of strings with special characters in to a textfile in the format BERT expects with a CR between each sentence and a blank line between each document:
with open('sentences.csv', 'w') as f:
current_idx = 0
for idx, doc in sentences.items():
# Insert a newline to separate documents
if idx != current_idx:
f.write('\n')
# Write each sentence exactly as it appared to one line each
for sentence in doc:
f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '\n')
This outputs (for the Github CodeSearchNet docstrings for all languages tokenized into sentences):
Makes sure the fast-path emits in order.
#param value the value to emit or queue up\n#param delayError if true, errors are delayed until the source has terminated\n#param disposable the resource to dispose if the drain terminates
Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{#code amb} does not operate by default on a particular {#link Scheduler}.
#param the common element type\n#param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
#return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n#see ReactiveX operators documentation: Amb
...
Just format like that:
s = "your string"; raw_s = r'{0}'.format(s)
With a little bit correcting #Jolly1234's Answer:
here is the code:
raw_string=path.encode('unicode_escape').decode()
s = "hel\nlo"
raws = '%r'%s #coversion to raw string
#print(raws) will print 'hel\nlo' with single quotes.
print(raws[1:-1]) # will print hel\nlo without single quotes.
#raws[1:-1] string slicing is performed
The solution, which worked for me was:
fr"{orignal_string}"
Suggested in comments by #ChemEnger
I suppose repr function can help you:
s = 't\n'
repr(s)
"'t\\n'"
repr(s)[1:-1]
't\\n'
Just simply use the encode function.
my_var = 'hello'
my_var_bytes = my_var.encode()
print(my_var_bytes)
And then to convert it back to a regular string do this
my_var_bytes = 'hello'
my_var = my_var_bytes.decode()
print(my_var)
--EDIT--
The following does not make the string raw but instead encodes it to bytes and decodes it.

Nested f-strings

Thanks to David Beazley's tweet, I've recently found out that the new Python 3.6 f-strings can also be nested:
>>> price = 478.23
>>> f"{f'${price:0.2f}':*>20s}"
'*************$478.23'
Or:
>>> x = 42
>>> f'''-{f"""*{f"+{f'.{x}.'}+"}*"""}-'''
'-*+.42.+*-'
While I am surprised that this is possible, I am missing on how practical is that, when would nesting f-strings be useful? What use cases can this cover?
Note: The PEP itself does not mention nesting f-strings, but there is a specific test case.
I don't think formatted string literals allowing nesting (by nesting, I take it to mean f'{f".."}') is a result of careful consideration of possible use cases, I'm more convinced it's just allowed in order for them to conform with their specification.
The specification states that they support full Python expressions* inside brackets. It's also stated that a formatted string literal is really just an expression that is evaluated at run-time (See here, and here). As a result, it only makes sense to allow a formatted string literal as the expression inside another formatted string literal, forbidding it would negate the full support for Python expressions.
The fact that you can't find use cases mentioned in the docs (and only find test cases in the test suite) is because this is probably a nice (side) effect of the implementation and not it's motivating use-case.
Actually, with two exceptions: An empty expression is not allowed, and a lambda expression must be surrounded by explicit parentheses.
I guess this is to pass formatting parameters in the same line and thus simplify f-strings usage.
For example:
>>> import decimal
>>> width = 10
>>> precision = 4
>>> value = decimal.Decimal("12.34567")
>>> f"result: {value:{width}.{precision}}"
'result: 12.35'
Of course, it allows programmers to write absolutely unreadable code, but that's not the purpose :)
I've actually just come across something similar (i think) and thought i'd share.
My specific case is a big dirty sql statement where i need to conditionally have some very different values but some fstrings are the same (and also used in other places).
Here is quick example of what i mean. The cols i'm selecting are same regardless (and also used in other queries elsewhere) but the table name depends on the group and is not such i could just do it in a loop.
Having to include mycols=mycols in str2 each time felt a little dirty when i have multiple such params.
I was not sure this would work but was happy it did. As to how pythonic its is i'm not really sure tbh.
mycols='col_a,col_b'
str1 = "select {mycols} from {mytable} where group='{mygroup}'".format(mycols=mycols,mytable='{mytable}',mygroup='{mygroup}')
group = 'group_b'
if group == 'group_a':
str2 = str1.format(mytable='tbl1',mygroup=group)
elif group == 'group_b':
str2 = str1.format(mytable='a_very_different_table_name',mygroup=group)
print(str2)
Any basic use case is where you need a string to completely describe the object you want to put inside the f-string braces {}. For example, you need strings to index dictionaries.
So, I ended up using it in an ML project with code like:
scores = dict()
scores[f'{task}_accuracy'] = 100. * n_valid / n_total
print(f'{task}_accuracy: {scores[f"{task}_accuracy"]}')
Working on a pet project I got sidetracked by writing my own DB library. One thing I discovered was this:
>>> x = dict(a = 1, b = 2, d = 3)
>>> z = f"""
UPDATE TABLE
bar
SET
{", ".join([ f'{k} = ?' for k in x.keys() ])} """.strip()
>>> z
'UPDATE TABLE
bar
SET
a = ?, b = ?, d = ? '
I was also surprised by this and honestly I am not sure I would ever do something like this in production code BUT I have also said I wouldn't do a lot of other things in production code.
I found nesting to be useful when doing ternaries. Your opinion will vary on readability, but I found this one-liner very useful.
logger.info(f"No program name in subgroups file. Using {f'{prg_num} {prg_orig_date}' if not prg_name else prg_name}")
As such, my tests for nesting would be:
Is the value reused? (Variable for expression re-use)
Is the expression clear? (Not exceeding complexity)
I use it for formatting currencies. Given values like:
a=1.23
b=45.67
to format them with a leading $ and with the decimals aligned. e.g.
$1.23
$45.67
formatting with a single f-string f"${value:5.2f}" you can get:
$ 1.23
$45.67
which is fine sometimes but not always. Nested f-strings f"${f'${value:.2f}':>6}" give you the exact format:
$1.23
$45.67
A simple example of when it's useful, together with an example of implementation: sometimes the formatting is also a variable.
num = 3.1415
fmt = ".2f"
print(f"number is {num:{fmt}}")
Nested f-strings vs. evaluated expressions in format specifiers
This question is about use-cases that would motivate using an f-string inside of some evaluated expression of an "outer" f-string.
This is different from the feature that allows evaluated expressions to appear within the format specifier of an f-string. This latter feature is extremely useful and somewhat relevant to this question since (1) it involves nested curly braces so it might be why people are looking at this post and (2) nested f-strings are allowed within the format specifier just as they are within other curly-expressions of an f-string.
F-string nesting can help with one-liners
Although certainly not the motivation for allowing nested f-strings, nesting may be helpful in obscure cases where you need or want a "one-liner" (e.g. lambda expressions, comprehensions, python -c command from the terminal). For example:
print('\n'.join([f"length of {x/3:g}{'.'*(11 - len(f'{x/3:g}'))}{len(f'{x/3:g}')}" for x in range(10)]))
If you do not need a one-liner, any syntactic nesting can be replaced by defining a variable previously and then using the variable name in the evaluated expression of the f-string (and in many if not most cases, the non-nested version would likely be more readable and easier to maintain; however it does require coming up with variable names):
for x in range(10):
to_show = f"{x/3:g}"
string_length = len(to_show)
padding = '.' * (11 - string_length)
print(f"length of {to_show}{padding}{string_length}")
Nested evaluated expressions (i.e. in the format specifier) are useful
In contrast to true f-string nesting, the related feature allowing evaluated expressions within the "format specifier" of an f-string can be extremely useful (as others have pointed out) for several reasons including:
formatting can be shared across multiple f-strings or evaluated expressions
formatting can include computed quantities that can vary from run to run
Here is an example that uses a nested evaluated expression, but not a nested f-string:
import random
results = [[i, *[random.random()] * 3] for i in range(10)]
format = "2.2f"
print("category,precision,recall,f1")
for cat, precision, recall, f1 in results:
print(f"{cat},{precision:{format}},{recall:{format}},{f1:{format}}")
However, even this use of nesting can be replaced with more flexible (and maybe cleaner) code that does not require syntactic nesting:
import random
results = [[i, *[random.random()] * 3] for i in range(10)]
def format(x):
return f"{x:2.2f}"
print("category,precision,recall,f1")
for cat, precision, recall, f1 in results:
print(f"{cat},{format(precision)},{format(recall)},{format(f1)}")
The following nested f-string one-liner does a great job in constructing a command argument string
cmd_args = f"""{' '.join([f'--{key} {value}' for key, value in kwargs.items()])}"""
where the input
{'a': 10, 'b': 20, 'c': 30, ....}
gets elegantly converted to
--a 10 --b 20 --c 30 ...
`
In F-string the open-paren & close-paren are reserved key characters.
To use f-string to build json string you have to escape the parentheses characters.
in your case only the outer parens.
f"\{f'${price:0.2f}':*>20s\}"
If some fancy formatting is needed, such nesting could be, perhaps, useful.
for n in range(10, 1000, 100):
print(f"{f'n = {n:<3}':<15}| {f'|{n:>5}**2 = {n**2:<7_}'} |")
You could use it for dynamicism. For instance, say you have a variable set to the name of some function:
func = 'my_func'
Then you could write:
f"{f'{func}'()}"
which would be equivalent to:
'{}'.format(locals()[func]())
or, equivalently:
'{}'.format(my_func())

Python insert a line break in a string after character "X"

What is the python syntax to insert a line break after every occurrence of character "X" ? This below gave me a list object which has no split attribute error
for myItem in myList.split('X'):
myString = myString.join(myItem.replace('X','X\n'))
myString = '1X2X3X'
print (myString.replace ('X', 'X\n'))
You can simply replace X by "X\n"
myString.replace("X","X\n")
Python 3.X
myString.translate({ord('X'):'X\n'})
translate() allows a dict, so, you can replace more than one different character at time.
Why translate() over replace() ? Check translate vs replace
Python 2.7
myString.maketrans('X','X\n')
A list has no split method (as the error says).
Assuming myList is a list of strings and you want to replace 'X' with 'X\n' in each once of them, you can use list comprehension:
new_list = [string.replace('X', 'X\n') for string in myList]
Based on your question details, it sounds like the most suitable is str.replace, as suggested by #DeepSpace. #levi's answer is also applicable, but could be a bit of a too big cannon to use.
I add to those an even more powerful tool - regex, which is slower and harder to grasp, but in case this is not really "X" -> "X\n" substitution you actually try to do, but something more complex, you should consider:
import re
result_string = re.sub("X", "X\n", original_string)
For more details: https://docs.python.org/2/library/re.html#re.sub

Python : splitting and splitting

I need some help;
I'm trying to program a sort of command prompt with python
I need to split a text file into lines then split them into strings
example :
splitting
command1 var1 var2;
command2 (blah, bleh);
command3 blah (b bleh);
command4 var1(blah b(bleh * var2));
into :
line1=['command1','var1','var2']
line2=['command2']
line2_sub1=['blah','bleh']
line3=['blah']
line3_sub1=['b','bleh']
line4=['command4']
line4_sub1=['blah','b']
line4_sub2=['bleh','var2']
line4_sub2_operand=['*']
Would that be possible at all?
If so could some one explain how or give me a piece of code that would do it?
Thanks a lot,
It's been pointed out, that there appears to be no reasoning to your language. All I can do is point you to pyparsing, which is what I would use if I were solving a problem similar to this, here is a pyparsing example for the python language.
Like everyone else is saying, your language is confusingly designed and you probably need to simplify it. But I'm going to give you what you're looking for and let you figure that out the hard way.
The standard python file object (returned by open()) is an iterator of lines, and the split() method of the python string class splits a string into a list of substrings. So you'll probably want to start with something like:
for line in command_file
words = line.split(' ')
http://docs.python.org/3/library/string.html
you could use this code to read the file line by line and split it by spaces between words.
a= True
f = open(filename)
while a:
nextline=f.readline()
wordlist= nextline.split("")
print(wordlist)
if nextline=="\n":
a= False
What you're talking about is writing a simple programming language. It's not extraordinarily difficult if you know what you're doing, but it is the sort of thing most people take a full semester class to learn. The fact that you've got multiple different types of lexical units with what looks to be a non-trivial, recursive syntax means that you'll need a scanner and a parser. If you really want to teach yourself to do this, this might not be a bad place to start.
If you simplify your grammar such that each command only has a fixed number of arguments, you can probably get away with using regular expressions to represent the syntax of your individual commands.
Give it a shot. Just don't expect it to all work itself out overnight.

Munging non-printable characters to dots using string.translate()

So I've done this before and it's a surprising ugly bit of code for such a seemingly simple task.
The goal is to translate any non-printable character into a . (dot). For my purposes "printable" does exclude the last few characters from string.printable (new-lines, tabs, and so on). This is for printing things like the old MS-DOS debug "hex dump" format ... or anything similar to that (where additional whitespace will mangle the intended dump layout).
I know I can use string.translate() and, to use that, I need a translation table. So I use string.maketrans() for that. Here's the best I could come up with:
filter = string.maketrans(
string.translate(string.maketrans('',''),
string.maketrans('',''),string.printable[:-5]),
'.'*len(string.translate(string.maketrans('',''),
string.maketrans('',''),string.printable[:-5])))
... which is an unreadable mess (though it does work).
From there you can call use something like:
for each_line in sometext:
print string.translate(each_line, filter)
... and be happy. (So long as you don't look under the hood).
Now it is more readable if I break that horrid expression into separate statements:
ascii = string.maketrans('','') # The whole ASCII character set
nonprintable = string.translate(ascii, ascii, string.printable[:-5]) # Optional delchars argument
filter = string.maketrans(nonprintable, '.' * len(nonprintable))
And it's tempting to do that just for legibility.
However, I keep thinking there has to be a more elegant way to express this!
Here's another approach using a list comprehension:
filter = ''.join([['.', chr(x)][chr(x) in string.printable[:-5]] for x in xrange(256)])
Broadest use of "ascii" here, but you get the idea
>>> import string
>>> ascii="".join(map(chr,range(256)))
>>> filter="".join(('.',x)[x in string.printable[:-5]] for x in ascii)
>>> ascii.translate(filter)
'................................ !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~.................................................................................................................................'
If I were golfing, probably use something like this:
filter='.'*32+"".join(map(chr,range(32,127)))+'.'*129
for actual code-golf, I imagine you'd avoid string.maketrans entirely
s=set(string.printable[:-5])
newstring = ''.join(x for x in oldstring if x in s else '.')
or
newstring=re.sub('[^'+string.printable[:-5]+']','',oldstring)
I don't find this solution ugly. It is certainly more efficient than any regex based solution. Here is a tiny bit shorter solution. But only works in python2.6:
nonprintable = string.maketrans('','').translate(None, string.printable[:-5])
filter = string.maketrans(nonprintable, '.' * len(nonprintable))

Categories

Resources