I often work with groups of materials and my file/materials are named as alphanumeric strings. Is there a library to turn a string like r"Mxene - Ti3C2" to latex styled r"Mxene - Ti$_\mathrm{3}$C$_\mathrm{2}$"?
I usually use a dictionary but going through every name is a hassle and prone to error because materials can always be added or removed from the study.
I know that I can use str.maketrans() to generate subscripts but I haven't had very consistent results using the output with matplotlib so I'd much rather use latex.
I've ultimately created this solution in case anyone else needs it. Since my problem is mostly to create subscripts, the following code will look for numbers and replace them with a latex equivalent to create one.
def latexify(s):
import re
nums = re.findall(r'\d+', s)
pos = [[m.start(0), m.end(0)] for m in re.finditer(r'\d+', s)]
numpos = list(zip(nums, pos))
for num, pos in numpos:
string = f"$_\mathrm{{{num}}}$"
s = s[:pos[0]] + string + s[pos[1]:]
for ind, (n, [p_st, p_end]) in enumerate(numpos):
if p_st > pos[1]:
numpos[ind][1][0] += len(string)-len(num)
numpos[ind][1][1] += len(string)-len(num)
pass
return s
latexify("Ti32C2")
Returns:
'Ti$_\\mathrm{32}$C$_\\mathrm{2}$'
Related
I'm writing this little program learning Python and I have faced a problem. I use tabulate with number formatting set to 5 numbers after separator, to make everything look nice, and it works, until I print text in the table. After text is printed (stating that you cannot divide by 0), formatting on that column seems to be gone.
The code is:
if skaiciuoti == True:
while bp <= bg:
if (bp-a != 0):
y = float(a / (bp - a))
sk1.append(a)
sk2.append(bp)
sk3.append(y)
bp = bp + bz
elif (bp - a == 0):
sk1.append(a)
sk2.append(bp)
sk3.append('Veiksmas negalimas (dalyba is 0)')
bp = bp + bz
lentele = ['A reiksme', 'B reiksme', 'Y reiksme']
duomenys = zip(sk1, sk2, sk3)
print(tabulate (duomenys, headers=lentele, floatfmt=".5f", tablefmt="grid"))
Here are pictures to better illustrate my problem:
Working one
Broken one
I have tried formatting the number before appending it to the list, but it didn't work.
Any suggestions and ideas are welcome.
Okay, I've managed to fix it myself. It was really not that hard, I just didn't think about it.
The solution that worked for me was to format number when appending it to the list (not before the list) like so
sk3.append(format(y, '.5f'))
floatfmt must be a list or a tuple, specifying one format for each column.
In your case, use e.g.:
floatfmts = ('.5f', '.5f', '.5f')
print(tabulate (duomenys, headers=lentele, floatfmt=floatfmts, tablefmt="grid"))
(See e.g. https://bitbucket.org/astanin/python-tabulate/issues/96/floatfmt-option-is-ignored-in-python3)
I'm writing a program which needs a user input for an polynomial function of x. I'm using Tkinter and python 2.5.
I have a parser method which so far takes the inputted equation and splits it into terms without dropping the signs.
I want to take each term and parse it to get a tuple of the (coefficient, degree). For example, -2x^3 returns (-2,3). I can then add these to an array and manipulate them accordingly in the program.
Is there a way or standard module that can do this?
Here is the beginning of the parse method.
def parse(a):
termnum=[]
terms=[]
hi=[]
num1=0
num=0
f=list(a)
count=0
negative=False
coef=0.0
deg=0.0
codeg=[]
for item in f:
if (item=='-' or item=='+') and count!=0:
termnum.append(count)
count+=1
for item in termnum:
num1=num
num=item
current=''
while num1<num:
current=current+f[num1]
num1+=1
terms.append(current)
num1=num
num=len(f)
current=''
while num1<num:
current=current+f[num1]
num1+=1
terms.append(current)
print terms
parse('-x^2+3x+2x^3-x')
Thanks!
P.S I don't want to use external packages.
you can use regular expressions,
import re
test = '-x^2+3x+2x^3-x'
for m in re.finditer( r'(-{0,1}\d*)x\^{0,1}(-{0,1}\d*)', test ):
coef, expn = list( map( lambda x: x if x != '' and x != '-' else x + '1' ,
m.groups( ) ))
print ( 'coef:{}, exp:{}'.format( coef, expn ))
output:
coef:-1, exp:2
coef:3, exp:1
coef:2, exp:3
coef:-1, exp:1
Look for "recursive descent parser". It's the canonical method for analysis of strings where some operator precedence is involved.
It looks like you're implementing something that already exists, in python and other math languages. See for example:
http://www.gnu.org/software/octave/doc/interpreter/Solvers.html
http://stat.ethz.ch/R-manual/R-devel/library/base/html/solve.html
I'm writing a program which needs a user input for an polynomial function of x. I'm using Tkinter and python 2.5.
I have a parser method which so far takes the inputted equation and splits it into terms without dropping the signs.
I want to take each term and parse it to get a tuple of the (coefficient, degree). For example, -2x^3 returns (-2,3). I can then add these to an array and manipulate them accordingly in the program.
Is there a way or standard module that can do this?
Here is the beginning of the parse method.
def parse(a):
termnum=[]
terms=[]
hi=[]
num1=0
num=0
f=list(a)
count=0
negative=False
coef=0.0
deg=0.0
codeg=[]
for item in f:
if (item=='-' or item=='+') and count!=0:
termnum.append(count)
count+=1
for item in termnum:
num1=num
num=item
current=''
while num1<num:
current=current+f[num1]
num1+=1
terms.append(current)
num1=num
num=len(f)
current=''
while num1<num:
current=current+f[num1]
num1+=1
terms.append(current)
print terms
parse('-x^2+3x+2x^3-x')
Thanks!
P.S I don't want to use external packages.
you can use regular expressions,
import re
test = '-x^2+3x+2x^3-x'
for m in re.finditer( r'(-{0,1}\d*)x\^{0,1}(-{0,1}\d*)', test ):
coef, expn = list( map( lambda x: x if x != '' and x != '-' else x + '1' ,
m.groups( ) ))
print ( 'coef:{}, exp:{}'.format( coef, expn ))
output:
coef:-1, exp:2
coef:3, exp:1
coef:2, exp:3
coef:-1, exp:1
Look for "recursive descent parser". It's the canonical method for analysis of strings where some operator precedence is involved.
It looks like you're implementing something that already exists, in python and other math languages. See for example:
http://www.gnu.org/software/octave/doc/interpreter/Solvers.html
http://stat.ethz.ch/R-manual/R-devel/library/base/html/solve.html
I'm currently transitioning from Java to Python and have taken on the task of trying to create a calculator that can carry out symbolic operations on infix-notated mathematical expressions (without using custom modules like Sympy). Currently, it's built to accept strings that are space delimited and can only carry out the (, ), +, -, *, and / operators. Unfortunately, I can't figure out the basic algorithm for simplifying symbolic expressions.
For example, given the string '2 * ( ( 9 / 6 ) + 6 * x )', my program should carry out the following steps:
2 * ( 1.5 + 6 * x )
3 + 12 * x
But I can't get the program to ignore the x when distributing the 2. In addition, how can I handle 'x * 6 / x' so it returns '6' after simplification?
EDIT: To clarify, by "symbolic" I meant that it will leave letters like "A" and "f" in the output while carrying out the remaining calculations.
EDIT 2: I (mostly) finished the code. I'm posting it here if anyone stumbles on this post in the future, or if any of you were curious.
def reduceExpr(useArray):
# Use Python's native eval() to compute if no letters are detected.
if (not hasLetters(useArray)):
return [calculate(useArray)] # Different from eval() because it returns string version of result
# Base case. Returns useArray if the list size is 1 (i.e., it contains one string).
if (len(useArray) == 1):
return useArray
# Base case. Returns the space-joined elements of useArray as a list with one string.
if (len(useArray) == 3):
return [' '.join(useArray)]
# Checks to see if parentheses are present in the expression & sets.
# Counts number of parentheses & keeps track of first ( found.
parentheses = 0
leftIdx = -1
# This try/except block is essentially an if/else block. Since useArray.index('(') triggers a KeyError
# if it can't find '(' in useArray, the next line is not carried out, and parentheses is not incremented.
try:
leftIdx = useArray.index('(')
parentheses += 1
except Exception:
pass
# If a KeyError was returned, leftIdx = -1 and rightIdx = parentheses = 0.
rightIdx = leftIdx + 1
while (parentheses > 0):
if (useArray[rightIdx] == '('):
parentheses += 1
elif (useArray[rightIdx] == ')'):
parentheses -= 1
rightIdx += 1
# Provided parentheses pair isn't empty, runs contents through again; else, removes the parentheses
if (leftIdx > -1 and rightIdx - leftIdx > 2):
return reduceExpr(useArray[:leftIdx] + [' '.join(['(',reduceExpr(useArray[leftIdx+1:rightIdx-1])[0],')'])] + useArray[rightIdx:])
elif (leftIdx > -1):
return reduceExpr(useArray[:leftIdx] + useArray[rightIdx:])
# If operator is + or -, hold the first two elements and process the rest of the list first
if isAddSub(useArray[1]):
return reduceExpr(useArray[:2] + reduceExpr(useArray[2:]))
# Else, if operator is * or /, process the first 3 elements first, then the rest of the list
elif isMultDiv(useArray[1]):
return reduceExpr(reduceExpr(useArray[:3]) + useArray[3:])
# Just placed this so the compiler wouldn't complain that the function had no return (since this was called by yet another function).
return None
You need much more processing before you go into operations on symbols. The form you want to get to is a tree of operations with values in the leaf nodes. First you need to do a lexer run on the string to get elements - although if you always have space-separated elements it might be enough to just split the string. Then you need to parse that array of tokens using some grammar you require.
If you need theoretical information about grammars and parsing text, start here: http://en.wikipedia.org/wiki/Parsing If you need something more practical, go to https://github.com/pyparsing/pyparsing (you don't have to use the pyparsing module itself, but their documentation has a lot of interesting info) or http://www.nltk.org/book
From 2 * ( ( 9 / 6 ) + 6 * x ), you need to get to a tree like this:
*
2 +
/ *
9 6 6 x
Then you can visit each node and decide if you want to simplify it. Constant operations will be the simplest ones to eliminate - just compute the result and exchange the "/" node with 1.5 because all children are constants.
There are many strategies to continue, but essentially you need to find a way to go through the tree and modify it until there's nothing left to change.
If you want to print the result then, just walk the tree again and produce an expression which describes it.
If you are parsing expressions in Python, you might consider Python syntax for the expressions and parse them using the ast module (AST = abstract syntax tree).
The advantages of using Python syntax: you don't have to make a separate language for the purpose, the parser is built in, and so is the evaluator. Disadvantages: there's quite a lot of extra complexity in the parse tree that you don't need (you can avoid some of it by using the built-in NodeVisitor and NodeTransformer classes to do your work).
>>> import ast
>>> a = ast.parse('x**2 + x', mode='eval')
>>> ast.dump(a)
"Expression(body=BinOp(left=BinOp(left=Name(id='x', ctx=Load()), op=Pow(),
right=Num(n=2)), op=Add(), right=Name(id='x', ctx=Load())))"
Here's an example class that walks a Python parse tree and does recursive constant folding (for binary operations), to show you the kind of thing you can do fairly easily.
from ast import *
class FoldConstants(NodeTransformer):
def visit_BinOp(self, node):
self.generic_visit(node)
if isinstance(node.left, Num) and isinstance(node.right, Num):
expr = copy_location(Expression(node), node)
value = eval(compile(expr, '<string>', 'eval'))
return copy_location(Num(value), node)
else:
return node
>>> ast.dump(FoldConstants().visit(ast.parse('3**2 - 5 + x', mode='eval')))
"Expression(body=BinOp(left=Num(n=4), op=Add(), right=Name(id='x', ctx=Load())))"
How would I write a function in Python to determine if a list of filenames matches a given pattern and which files are missing from that pattern? For example:
Input ->
KUMAR.3.txt
KUMAR.4.txt
KUMAR.6.txt
KUMAR.7.txt
KUMAR.9.txt
KUMAR.10.txt
KUMAR.11.txt
KUMAR.13.txt
KUMAR.15.txt
KUMAR.16.txt
Desired Output-->
KUMAR.5.txt
KUMAR.8.txt
KUMAR.12.txt
KUMAR.14.txt
Input -->
KUMAR3.txt
KUMAR4.txt
KUMAR6.txt
KUMAR7.txt
KUMAR9.txt
KUMAR10.txt
KUMAR11.txt
KUMAR13.txt
KUMAR15.txt
KUMAR16.txt
Desired Output -->
KUMAR5.txt
KUMAR8.txt
KUMAR12.txt
KUMAR14.txt
You can approach this as:
Convert the filenames to appropriate integers.
Find the missing numbers.
Combine the missing numbers with the filename template as output.
For (1), if the file structure is predictable, then this is easy.
def to_num(s, start=6):
return int(s[start:s.index('.txt')])
Given:
lst = ['KUMAR.3.txt', 'KUMAR.4.txt', 'KUMAR.6.txt', 'KUMAR.7.txt',
'KUMAR.9.txt', 'KUMAR.10.txt', 'KUMAR.11.txt', 'KUMAR.13.txt',
'KUMAR.15.txt', 'KUMAR.16.txt']
you can get a list of known numbers by: map(to_num, lst). Of course, to look for gaps, you only really need the minimum and maximum. Combine that with the range function and you get all the numbers that you should see, and then remove the numbers you've got. Sets are helpful here.
def find_gaps(int_list):
return sorted(set(range(min(int_list), max(int_list))) - set(int_list))
Putting it all together:
missing = find_gaps(map(to_num, lst))
for i in missing:
print 'KUMAR.%d.txt' % i
Assuming the patterns are relatively static, this is easy enough with a regex:
import re
inlist = "KUMAR.3.txt KUMAR.4.txt KUMAR.6.txt KUMAR.7.txt KUMAR.9.txt KUMAR.10.txt KUMAR.11.txt KUMAR.13.txt KUMAR.15.txt KUMAR.16.txt".split()
def get_count(s):
return int(re.match('.*\.(\d+)\..*', s).groups()[0])
mincount = get_count(inlist[0])
maxcount = get_count(inlist[-1])
values = set(map(get_count, inlist))
for ii in range (mincount, maxcount):
if ii not in values:
print 'KUMAR.%d.txt' % ii