Is there a way to add page numbers to the lower right corner of a Word document using Python win32com? I am able to add headers and footers, but I can't find a way to add page numbers in the format PageNumber of TotalPages (for example: 1 of 5)
Below is the code to add centered headers and footers to a page
from win32com.client import Dispatch as MakeDoc

filename = name + '.doc'  # name is defined earlier in my script
WordApp = MakeDoc("Word.Application")
WordDoc = WordApp.Documents.Add()
WordDoc.Sections(1).Headers(1).Range.Text = name
WordDoc.Sections(1).Headers(1).Range.ParagraphFormat.Alignment = 1
WordDoc.Sections(1).Footers(1).Range.Text = filename
WordDoc.Sections(1).Footers(1).Range.ParagraphFormat.Alignment = 1
Thanks
To insert page numbers, use the following statements:
WordDoc.Sections(1).Footers(1).PageNumbers.Add(2, True)   # 2 = wdAlignPageNumberRight
WordDoc.Sections(1).Footers(1).PageNumbers.NumberStyle = 57  # 57 = wdNumberStyleNumberInDash
However, with this number style the page number is rendered as -page number-. Documentation for inserting the page number is here, and the one for the number style is here.
I know this is an old question, but I was banging my head against a wall trying to figure out the same thing and ended up working out a solution that's rather ugly, but gets the job done. Note that I had to redefine activefooter after inserting wdFieldPage, or else the resulting footer would look like "of 12" rather than "1 of 2".
The answer to this vba question was helpful when I was trying to figure out the formatting.
I'm using Python 3.4, testdocument.doc is just an existing .doc file with some random text spread across two pages and no existing footer.
import win32com.client

w = win32com.client.gencache.EnsureDispatch("Word.Application")
w.Visible = 0
adoc = w.Documents.Open("C:\\temp1\\testdocument.doc")
activefooter = adoc.Sections(1).Footers(win32com.client.constants.wdHeaderFooterPrimary).Range
activefooter.ParagraphFormat.Alignment = win32com.client.constants.wdAlignParagraphRight
activefooter.Collapse(0)
activefooter.Fields.Add(activefooter,win32com.client.constants.wdFieldPage)
activefooter = adoc.Sections(1).Footers(win32com.client.constants.wdHeaderFooterPrimary).Range
activefooter.Collapse(0)
activefooter.InsertAfter(Text = ' of ')
activefooter.Collapse(0)
activefooter.Fields.Add(activefooter,win32com.client.constants.wdFieldNumPages)
adoc.Save()
adoc.Close()
w.Quit()
I can scrape a Wikipedia page using the wikipedia API:
import wikipedia
import re
page = wikipedia.page("Albert Einstein")
text = page.content
regex_result = re.findall("==\s(.+?)\s==", text)
print(regex_result)
and now, from every element in regex_result (the Wikipedia headers), I want to get the text below it and append it to another list. I dug through the internet and I can't find a function in the Wikipedia API that does that.
My second idea was to get the whole text and extract the text between headers with some module, as in: find some text in a string between some specific characters.
I have tried this:
l = 0
for n in regex_result:
    try:
        regal = re.findall(f"==\s{regex_result[l]}\s==(.+?)\s=={regex_result[l+1]}\s==", text)
        l += 2
    except Exception:
        continue
But it is not working: the output is only [].
You don't want to call re twice, but rather iterate directly through the results provided by regex_result. Named groups in the form of (?P<name>...) make it even easier to extract the header name without the surrounding markup.
import wikipedia
import re
page = wikipedia.page("Albert Einstein")
text = page.content
# using the number 2 for '=' means you can easily find sub-headers too by increasing the value
regex_result = re.findall(r"\n={2}\s(?P<header>.+?)\s={2}\n", text)
regex_result will then be a list of strings of the all the top-level section headers.
Here's what I use to make a table of contents from a wiki page. (Note: f-strings require Python 3.6)
def get_wikiheader_regex(level):
    '''The top wikiheader level has two = signs, so add 1 to the level to get the correct number.'''
    assert isinstance(level, int) and level > -1
    header_regex = rf"^={{{level+1}}}\s(?P<section>.*?)\s={{{level+1}}}$"
    return header_regex

def get_toc(raw_page, level=1):
    '''For a single raw wiki page, return the level-1 section headers as a table of contents.'''
    toc = []
    header_regex = get_wikiheader_regex(level=level)
    for line in raw_page.splitlines():
        if line.startswith('=') and re.search(header_regex, line):
            toc.append(re.search(header_regex, line).group('section'))
    return toc
>>> get_toc(text)
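Since the original question also asked for the text below each header, not just the header names, here is a minimal sketch (the function name is my own) that pairs each top-level header with the body that follows it. It uses re.split with a capturing group so that header names and section bodies alternate in the result:

```python
import re

def split_sections(text):
    """Return (header, body) pairs for top-level '== Header ==' sections."""
    # The capturing group keeps the header names in the split result,
    # so the list alternates: lead text, header, body, header, body, ...
    parts = re.split(r"\n==\s(.+?)\s==\n", text)
    headers = parts[1::2]
    bodies = [body.strip() for body in parts[2::2]]
    return list(zip(headers, bodies))

sample = "Intro.\n== Life ==\nBorn in Ulm.\n== Work ==\nPhysics."
print(split_sections(sample))  # [('Life', 'Born in Ulm.'), ('Work', 'Physics.')]
```

The same idea extends to sub-headers by increasing the number of = signs in the pattern.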
I am trying to take a display name / keypad code from an Excel document and add it into my company's website. My problem is that the Excel document shows 4240, but when my script parses it and adds it into the website, it comes out as 4240.0. How can I remove the ".0" when I parse the data?
Below is the code I currently have. The only remaining problem is that it does not pick up a "0" at the front or the end of a code.
For example, if the code is 0420, it only picks up 42 and drops the leading and trailing 0. I tried changing the Excel format to text so the value wouldn't be read as a number, but that didn't help either.
I think the best method would be to remove the last two characters with indexing?
def addCodesA():
    workbook = xlrd.open_workbook(path)
    sheet = workbook.sheet_by_index(0)
    for y in range(sheet.nrows):
        names = []
        codes = []
        convertedcodes = []
        names.append(str(sheet.cell_value(y, 0)))
        codes.append(str(sheet.cell_value(y, 1)))
        for strippedcode in codes:
            convertedcodes.append(strippedcode.strip('.0'))
        print(names)
        print(codes)
        driver.find_element_by_xpath('//*[@id="device_keypad_relay"][@value="0"]').click()
        time.sleep(1)
        codeadd = driver.find_element_by_name('keypad_code_1')
        nameadd = driver.find_element_by_name('keypad_code_1_display')
        codeadd.clear()
        nameadd.clear()
        codeadd.send_keys(convertedcodes)
        nameadd.send_keys(names)
        driver.find_element_by_class_name('btn-form-end').send_keys(Keys.SHIFT, Keys.ENTER)
        time.sleep(6)
        driver.get(customercodes)
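For what it's worth, strip('.0') removes every leading and trailing '0' and '.' character, which is why 0420 comes out as 42. Below is a minimal sketch of a safer conversion (the zero-padding width of 4 is my assumption; note that Excel numeric cells discard leading zeros before xlrd ever sees them, so keeping the column formatted as text in Excel is still the robust fix):

```python
def normalize_code(value, width=4):
    """Turn an xlrd cell value like 4240.0 into the code string '4240'."""
    # xlrd returns numeric cells as floats, so 4240 is read back as 4240.0
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    s = str(value)
    # re-pad purely numeric codes to the expected width, e.g. '420' -> '0420'
    return s.zfill(width) if s.isdigit() else s

print(normalize_code(4240.0))  # '4240'
print(normalize_code(420.0))   # '0420'
print(normalize_code("0420"))  # '0420'
```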
I need to create a database, using Python and a .txt file.
Creating new items is no problem; the inside of the database .txt file looks like this:
Index Objektname Objektplace Username
i.e:
1 Pen Office Daniel
2 Saw Shed Nic
6 Shovel Shed Evelyn
4 Knife Room6 Evelyn
I get the index from a QR scanner (OpenCV) and the other information via Tkinter Entry widgets, and if an object is already saved in the database, you should be able to rewrite its Objektplace and Username.
My problems now are the following:
If I scan the code with index 6, how do I navigate to that entry, even if it's not in line 6, without clashing with the Room6 value?
How do I, for example, replace only the "Shed" of index 4 when that object is moved to, e.g., Room6?
Same goes for the usernames.
Up until now I've tried different methods, but nothing has worked so far.
The last try looked something like this
def DBChange():
    # Removes unwanted bits from the scanned code
    data2 = data.replace("'", "")
    Index = data2.replace("b", "")
    # Gets the data from the Entry widgets
    User = Nutzer.get()
    Einlagerungsort = Ort.get()
    # Adds a whitespace at the end of the entries to separate them
    Userlen = len(User)
    User2 = User.ljust(Userlen)
    Einlagerungsortlen = len(Einlagerungsort) + 1
    Einlagerungsort2 = Einlagerungsort.ljust(Einlagerungsortlen)
    # Navigate to the exact line of the scanned index and replace the words
    # for the place and the user ONLY in this line
    file = open("Datenbank.txt", "r+")
    lines = file.readlines()
    for word in lines[Index].split():
        List.append(word)
    checkWords = (List[2], List[3])
    repWords = (Einlagerungsort2, User2)
    for line in file:
        for check, rep in zip(checkWords, repWords):
            line = line.replace(check, rep)
        file.write(line)
    file.close()
    Return()
Thanks in advance
I'd suggest using Pandas to read and write your text file. That way you can just use the index to select the appropriate line. And if there is no specific reason to keep your text format, I would switch to CSV for ease of use.
import pandas as pd

def DBChange():
    # Removes unwanted bits from the scanned code
    # I haven't changed this part, since I guess you need this for some input data
    data2 = data.replace("'", "")
    Indexnr = data2.replace("b", "")
    # Gets the data from the Entry widgets
    User = Nutzer.get()
    Einlagerungsort = Ort.get()
    # I removed the padding lines here; they aren't necessary with csv and Pandas
    # Read in the csv file
    df = pd.read_csv("Datenbank.csv")
    # Select the line with the index and replace the values
    df.loc[Indexnr, 'Username'] = User
    df.loc[Indexnr, 'Objektplace'] = Einlagerungsort
    # Write back to csv
    df.to_csv("Datenbank.csv")
    Return()
Since I can't reproduce your specific problem, I haven't tested it. But something like this should work.
Edit
To read and write the text file, use ' ' as the separator. (I assume the values do not contain spaces, and your text file uses one space between values.)
reading:
df = pd.read_csv('Datenbank.txt', sep=' ')
Writing:
df.to_csv('Datenbank.txt', sep=' ')
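A self-contained sketch of the approach (io.StringIO stands in for the real file here, and setting the Index column as the DataFrame index makes df.loc look up rows by scanned index rather than by row position, so index 6 works even when it is not in line 6):

```python
import io
import pandas as pd

raw = ("Index Objektname Objektplace Username\n"
       "1 Pen Office Daniel\n"
       "6 Shovel Shed Evelyn\n")

# Read the space-separated table and use the Index column for lookups
df = pd.read_csv(io.StringIO(raw), sep=' ', index_col='Index')

# Update place and user for the scanned index
df.loc[6, 'Objektplace'] = 'Room6'
df.loc[6, 'Username'] = 'Daniel'

# Write back in the same space-separated format
out = io.StringIO()
df.to_csv(out, sep=' ')
print(out.getvalue())
```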
First of all, this is a terrible way to store data. My suggestion is not particularly well-written code either; don't do this in production!
newlines = []
for line in lines:
    entry = line.split()
    if entry[0] == Index:
        # line now is the correct line
        # index 2 is the place, index 0 the ID, etc.
        entry[2] = Einlagerungsort2
    newlines.append(" ".join(entry))
# Now write newlines back to the file
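A complete, runnable version of that idea (the function name and in-memory list of lines are mine; the column layout is the one from the question). Comparing only the first column means index 6 never collides with a value like Room6 elsewhere in the line:

```python
def update_entry(lines, index, new_place, new_user):
    """Rewrite Objektplace and Username for the line whose ID column equals index."""
    newlines = []
    for line in lines:
        entry = line.split()
        # Compare the ID column only, so index 6 never matches 'Room6'
        if entry and entry[0] == str(index):
            entry[2] = new_place
            entry[3] = new_user
        newlines.append(" ".join(entry))
    return newlines

db = ["Index Objektname Objektplace Username",
      "4 Knife Room6 Evelyn",
      "6 Shovel Shed Evelyn"]
print(update_entry(db, 6, "Room6", "Daniel"))
```

Writing newlines back with "\n".join(...) to the file completes the update.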
Dec 2020 update:
I have:
Achieved full automation: minute-level data collection for the entire FnO universe.
Auto-adapts to the changing FnO universe, both exits and new entries.
Shuts down during non-market hours.
Shuts down on holidays, including newly declared holidays.
Starts automatically for the yearly Muhurat Trading session.
I am a bit new to web scraping and not used to the 'tr' & 'td' stuff, hence this doubt. I am trying to replicate this Python 2.7 code in my Python 3 setup, from this thread: 'https://www.quantinsti.com/blog/option-chain-extraction-for-nse-stocks-using-python'.
This old code uses .ix for indexing, which I can correct easily using .iloc. However, the line tr = tr.replace(',' , '') raises the error 'a bytes-like object is required, not 'str'', even if I put it before tr = utf_string.encode('utf8').
I have checked this other link from Stack Overflow and couldn't solve my problem.
I think I have spotted why this is happening: it's because of the earlier for loop that defines the variable tr. If I omit this line, I get a DataFrame of the numbers with some text attached to them. I could filter that with a loop over the entire DataFrame, but a better way must be to use the replace() function properly; I can't figure this bit out.
Here is my full code. I have marked the critical sections of the code I refer to with ######################### on its own line, so each one can be found quickly (even with Ctrl + F):
import requests
import pandas as pd
from bs4 import BeautifulSoup

Base_url = ("https://nseindia.com/live_market/dynaContent/"+
            "live_watch/option_chain/optionKeys.jsp?symbolCode=2772&symbol=UBL&"+
            "symbol=UBL&instrument=OPTSTK&date=-&segmentLink=17&segmentLink=17")

page = requests.get(Base_url)
#page.status_code
#page.content

soup = BeautifulSoup(page.content, 'html.parser')
#print(soup.prettify())

table_it = soup.find_all(class_="opttbldata")
table_cls_1 = soup.find_all(id="octable")

col_list = []

# Pulling heading out of the Option Chain Table
#########################
for mytable in table_cls_1:
    table_head = mytable.find('thead')
    try:
        rows = table_head.find_all('tr')
        for tr in rows:
            cols = tr.find_all('th')
            for th in cols:
                er = th.text
                #########################
                ee = er.encode('utf8')
                col_list.append(ee)
    except:
        print('no thead')

col_list_fnl = [e for e in col_list if e not in ('CALLS', 'PUTS', 'Chart', '\xc2\xa0')]
#print(col_list_fnl)

table_cls_2 = soup.find(id="octable")
all_trs = table_cls_2.find_all('tr')
req_row = table_cls_2.find_all('tr')

new_table = pd.DataFrame(index=range(0, len(req_row)-3), columns=col_list_fnl)

row_marker = 0
for row_number, tr_nos in enumerate(req_row):
    if row_number <= 1 or row_number == len(req_row)-1:
        continue  # To ensure we only choose non-empty rows
    td_columns = tr_nos.find_all('td')
    # Removing the graph column
    select_cols = td_columns[1:22]
    cols_horizontal = range(0, len(select_cols))
    for nu, column in enumerate(select_cols):
        utf_string = column.get_text()
        utf_string = utf_string.strip('\n\r\t": ')
        #########################
        tr = tr.replace(',' , '')  # Commenting this out makes the code partially work: I get the numbers, but with text attached to them in the table.
        tr = utf_string.encode('utf8')
        new_table.iloc[row_marker, [nu]] = tr
    row_marker += 1

print(new_table)
For the first section:
er = th.text should be er = th.get_text()
Link to get_text documentation
For the latter section:
Looking at it, your tr variable at this point is the last tr tag found in the soup by for tr in rows. This means the tr you are trying to call replace on is a bs4 tag object, not a string.
tr = tr.get_text().replace(',' , '') should work for the first iteration; however, since you then overwrite tr, it will break on the next iteration.
Additionally, thank you for the depth of your question. While you did not pose it as a question, the length you went to describe the trouble you are having as well as the code you have tried is greatly appreciated.
If you replace the below lines of codes
tr = tr.replace(',' , '')
tr = utf_string.encode('utf8')
new_table.iloc[row_marker,[nu]] = tr
with the following code then it should work.
new_table.iloc[row_marker,[nu]] = utf_string.replace(',' , '')
The replace call fails because it is applied to the encoded bytes rather than a str. You can also consider using the code below to decode the column names:
col_list_fnl = [e.decode('utf8') for e in col_list if e not in ('CALLS', 'PUTS', 'Chart', '\xc2\xa0')]
col_list_fnl
I hope this helps.
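Not part of the answer above, but the underlying error is easy to reproduce in isolation: str arguments can't be passed to bytes methods, which produces exactly the reported TypeError. A tiny sketch:

```python
s = "1,234"
print(s.replace(',', ''))    # str method with str arguments: '1234'

b = s.encode('utf8')         # now a bytes object
try:
    b.replace(',', '')       # str arguments on a bytes method
except TypeError as e:
    print(e)                 # a bytes-like object is required, not 'str'

print(b.replace(b',', b''))  # bytes arguments work: b'1234'
```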
I am writing a Python script to convert old DayLite contacts into CSV format to be imported into Outlook. The script works almost perfectly except for one small issue, and because of the amount of data, fixing it by hand in the file would take far too long.
The list of contacts is very long, 1,100+ rows in the spreadsheet. When the text gets written into the CSV file everything is good, except that certain seemingly random phone numbers lose their leading 0 and gain a '.0' at the end. The majority of the phone numbers are left in the exact original format.
This is my script code:
import xlrd
import xlwt
import csv
import numpy

##########################
# Getting XLS data sheet #
##########################
oldFormatContacts = xlrd.open_workbook('DayliteContacts_Oct16.xls')
ofSheet = oldFormatContacts.sheet_by_index(0)

##################################
# Storing values in array medium #
##################################
rowVal = [''] * ofSheet.nrows
for x in range(ofSheet.nrows):
    rowVal[x] = ofSheet.row_values(x)

######################
# Getting CSV titles #
######################
csvTemp = xlrd.open_workbook('Outlook.xls')
csvSheet = csvTemp.sheet_by_index(0)
csv_title = csvSheet.row_values(0)
rowVal[0] = csv_title

##############################################################
# Append and padding data to contain commas for empty fields #
##############################################################
q = '"'
for x in range(ofSheet.nrows):
    temporaryRow = rowVal[x]
    temporaryRow = str(temporaryRow).strip('[]')
    if x > 0:
        rowVal[x] = (','+str(q+temporaryRow.split(',')[0]+q)+',,'+str(q+temporaryRow.split(',')[1]+q)+',,'+str(q+temporaryRow.split(',')[2]+q)+',,,,,,,,,,,,,,,,,,,,,,,,,,'+str(q+temporaryRow.split(',')[4]+q)+','+str(q+temporaryRow.split(',')[6]+q)+',,,,,,,,,,,,,,,,,,,,,,,,,'+str(q+temporaryRow.split(',')[8])+q)
    for j in range(0, 21):
        rowVal[x] += ','
    tempString = str(rowVal[x])
    tempString = tempString.replace("'", "")
    #tempString = tempString.replace('"', '')
    #tempString = tempString.replace(" ", "")
    rowVal[x] = tempString

######################################
# Open and write values to new file  #
######################################
csv_file = open('csvTestFile.csv', 'w')
for rownum in range(ofSheet.nrows):
    csv_file.write(rowVal[rownum])
    csv_file.write("\n")
csv_file.close()
Sorry if my code is incoherent; I am a beginner at Python scripting.
Unfortunately I cannot show or provide the contact details due to privacy reasons however I will give some examples in the exact format that it occurs.
So in the DayLite document a contact is saved as "First name, Second name, Company, phone number 1, phone number 2, email", for example:
"Joe, Black, Stack Overflow, 07472329584,"
but when written into the CSV file it will be
"Joe","Black","Stack Overflow","7472329584.0".
This is odd, because for each occurrence of the problem there will be 10 or so fine numbers that get saved exactly as entered, e.g. in DayLite: "+446738193583", when written to CSV: "+446738193583".
I forgot to mention (this is an edit) that many phone numbers keep their leading 0 and do not gain a trailing .0. Probably 1 in 20 phone numbers gets messed up.
It seems to me a very weird error, and this is why I have come here for help! If anyone has any ideas I'd be more than happy to hear them. Cheers guys.
The issue lay within the Excel document, but I had assumed it was in my script. I placed an ' before each number that was causing a format error. This meant there were no format issues when reading from the sheet, and the values were written back to the file successfully.