In this program, I am trying to write the index out to a text file named "index.txt", as well as print it. However, whenever I run the program, I get an error saying "word" is not defined, and my index.txt file contains only "Word/tLine Numbers".
Code:
from string import punctuation

def makeIndex(filename):
    wordIndex = {}
    with open(filename) as f:
        lineNum = 1
        for line in f:
            words = line.lower().split()
            for word in words:
                for char in punctuation:
                    word = word.replace(char, '')
                if word.isalpha():
                    if word in wordIndex.keys():
                        if lineNum not in wordIndex[word]:
                            wordIndex[word].append(lineNum)
                    else:
                        wordIndex[word] = [lineNum]
            lineNum += 1
    return wordIndex

def output(wordIndex):
    print("Word\tLine Numbers")
    for key in sorted(wordIndex.keys()):
        print(key, '\t', end=" ")
        for lineNum in wordIndex[key]:
            print(lineNum, end=" ")
        print()

def main():
    filename = input("What is the file name to be indexed?")
    index = makeIndex(filename)
    output(index)
    with open('index.txt', 'w') as writefile:
        writefile.write("Word/tLine Numbers")
        print('t', end= "")
        for index in range(len(word)):
            print(word[index])
            writefile.write(word[index] + '/n')

main()
Output:
What is the file name to be indexed?test.txt
Word Line Numbers
a 8 12 38 70 78
all 85 101
also 91
an 34 96
anagrams 93 104
as 84
ask 28
blocks 4
called 61
create 69
different 59
difficulties 47
each 74
employed 65
figure 32
file 9
find 100
finds 92
following 22
for 18 73
given 37
has 80
have 56
here 66
in 7 48
interesting 19
is 52 67
it 103
its 42 87
jumble 25
large 3
letters 43
long 54
many 58
new 14
of 5 16 41 45 86 102
one 44
opens 10
out 33
permutations 62 88
possibilities 17
problem 51
program 23 90
programs 20
puzzles 26
range 15
reorderings 60
same 82
scrambled 39
set 40
signature 72 83
since 94
so 57 76
solver 30
solves 24
solving 49
strategy 64
text 6
that 53 77
the 21 29 46 63 81
this 50 89
to 31 68
typing 95
unique 71
unknown 35
unscrambled 97
up 11
which 27
whole 13
will 99
with 2
word 36 75 79 98
words 55
working 1
tTraceback (most recent call last):
File "C:\Users\jp19p_000\Desktop\wordIndex(1).py", line 46, in <module>
main()
File "C:\Users\jp19p_000\Desktop\wordIndex(1).py", line 41, in main
for index in range(len(word)):
NameError: name 'word' is not defined
This is the index.txt file:
Word/tLine Numbers
from collections import defaultdict
import string
import sys

# convert to lowercase, remove all digits and punctuation
trans = str.maketrans(string.ascii_uppercase, string.ascii_lowercase, string.digits + string.punctuation)

def get_unique_words(s, trans=trans):
    return set(s.translate(trans).split())

def make_index(seq, start=1):
    index = defaultdict(list)
    for i,s in enumerate(seq, start):
        for word in get_unique_words(s):
            index[word].append(i)
    return index

def write_index(index, file=sys.stdout):
    print("Word\tLines", file=file)
    for word in sorted(index.keys()):
        lines = " ".join(str(i) for i in index[word])
        print("{}\t{}".format(word, lines), file=file)

def main():
    fname = input("What is the name of the file to be indexed? ")
    with open(fname) as inf:
        index = make_index(inf)
    with open("index.txt", "w") as outf:
        write_index(index, outf)

if __name__=="__main__":
    main()
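For reference, the NameError in the question comes from main() looping over `word`, a name that only exists inside makeIndex, and "/t" and "/n" are ordinary characters rather than the escapes "\t" and "\n". Below is a minimal sketch of writing out a dictionary shaped like the one makeIndex returns (word mapped to a list of line numbers). The helper name `write_word_index` and the sample dictionary are made up for illustration, and an in-memory buffer stands in for the real index.txt file.

```python
import io

def write_word_index(word_index, out):
    """Write a header, then one 'word<TAB>line numbers' row per word, sorted."""
    out.write("Word\tLine Numbers\n")
    for word in sorted(word_index):
        numbers = " ".join(str(n) for n in word_index[word])
        out.write(f"{word}\t{numbers}\n")

# Hypothetical sample in the same shape as makeIndex's return value
sample_index = {"word": [36, 75], "a": [8, 12]}

buffer = io.StringIO()  # stands in for open('index.txt', 'w')
write_word_index(sample_index, buffer)
print(buffer.getvalue(), end="")
```

In the original program, the same function could probably be called as `write_word_index(index, writefile)` inside the `with open('index.txt', 'w') as writefile:` block.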
Related
I am trying to print the numbers from 1 to 100 using Unicode (Bengali) digits in Python. I have searched Stack Overflow extensively, but no existing question answers my query.
So basically I want to print Bengali numbers from ১ to ১০০. The corresponding English number is 1 to 100.
What I have tried is to get the Unicode number of ১ which is '\u09E7'. Then I have tried to increase this number by 1 as depicted in the following code:
x = '\u09E7'
print(x+1)
But the above code gives me the following error:
TypeError: can only concatenate str (not "int") to str
So what I want is to get a number series as following:
১, ২, ৩, ৪, ৫, ৬, ৭, ৮, ৯, ১০, ১১, ১২, ১৩, ............, ১০০
I hope there is a solution to this. Thank you.
Make a translation table. The function str.maketrans() takes a string of characters and a string of replacements and builds a translation dictionary of Unicode ordinals to Unicode ordinals. Then, convert a counter variable to a string and use the translate() function on the result to convert the string:
#coding:utf8
xlat = str.maketrans('0123456789','০১২৩৪৫৬৭৮৯')
for i in range(1,101):
    print(f'{i:3d} {str(i).translate(xlat)}',end=' ')
Output:
1 ১ 2 ২ 3 ৩ 4 ৪ 5 ৫ 6 ৬ 7 ৭ 8 ৮ 9 ৯ 10 ১০ 11 ১১ 12 ১২ 13 ১৩ 14 ১৪ 15 ১৫ 16 ১৬ 17 ১৭ 18 ১৮ 19 ১৯ 20 ২০ 21 ২১ 22 ২২ 23 ২৩ 24 ২৪ 25 ২৫ 26 ২৬ 27 ২৭ 28 ২৮ 29 ২৯ 30 ৩০ 31 ৩১ 32 ৩২ 33 ৩৩ 34 ৩৪ 35 ৩৫ 36 ৩৬ 37 ৩৭ 38 ৩৮ 39 ৩৯ 40 ৪০ 41 ৪১ 42 ৪২ 43 ৪৩ 44 ৪৪ 45 ৪৫ 46 ৪৬ 47 ৪৭ 48 ৪৮ 49 ৪৯ 50 ৫০ 51 ৫১ 52 ৫২ 53 ৫৩ 54 ৫৪ 55 ৫৫ 56 ৫৬ 57 ৫৭ 58 ৫৮ 59 ৫৯ 60 ৬০ 61 ৬১ 62 ৬২ 63 ৬৩ 64 ৬৪ 65 ৬৫ 66 ৬৬ 67 ৬৭ 68 ৬৮ 69 ৬৯ 70 ৭০ 71 ৭১ 72 ৭২ 73 ৭৩ 74 ৭৪ 75 ৭৫ 76 ৭৬ 77 ৭৭ 78 ৭৮ 79 ৭৯ 80 ৮০ 81 ৮১ 82 ৮২ 83 ৮৩ 84 ৮৪ 85 ৮৫ 86 ৮৬ 87 ৮৭ 88 ৮৮ 89 ৮৯ 90 ৯০ 91 ৯১ 92 ৯২ 93 ৯৩ 94 ৯৪ 95 ৯৫ 96 ৯৬ 97 ৯৭ 98 ৯৮ 99 ৯৯ 100 ১০০
You can try this. Convert the character to an integer, do the addition, and then convert it back to a character. If the number is 10 or greater, you have to convert each digit to a character separately, which is why we use the modulo operator %.
if num < 10:
    x = ord('\u09E6')
    print(chr(x+num))
elif num < 100:
    mod = num % 10
    num = int((num -mod) / 10)
    x = ord('\u09E6')
    print(''.join([chr(x+num), chr(x+mod)]))
else:
    x = ord('\u09E6')
    print(''.join([chr(x+1), '\u09E6', '\u09E6']))
You can try running it here
https://repl.it/repls/GloomyBewitchedMultitasking
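The branches above cover 1 to 100 only. As a hedged sketch, the same ord/chr idea can be applied digit by digit so that any non-negative integer works; the function name `to_bengali` and the constant `BENGALI_ZERO` are made up for this example.

```python
BENGALI_ZERO = ord('\u09E6')  # code point of the Bengali digit zero

def to_bengali(num):
    """Convert a non-negative integer to a string of Bengali digits."""
    if num < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Offset each decimal digit from the Bengali zero code point
    return ''.join(chr(BENGALI_ZERO + int(d)) for d in str(num))

print(', '.join(to_bengali(n) for n in range(1, 101)))
```

This relies on the ten Bengali digits occupying consecutive code points starting at U+09E6.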
EDIT:
Also providing JavaScript code, as asked in the comments.
function getAsciiNum(num){
    zero = "০".charCodeAt(0)
    if (num < 10){
        return(String.fromCharCode(zero+num))
    }
    else if (num < 100) {
        mod = num % 10
        num = Math.floor((num -mod) / 10)
        return(String.fromCharCode(zero+num) + String.fromCharCode(zero+mod))
    }
    else {
        return(String.fromCharCode(zero+1) + "০০")
    }
}
console.log(getAsciiNum(88))
Hi, when I try to print a list, it prints out the file path and not the contents of win.txt. I'm trying to enumerate the lines of the text file, split each one, and append it to a, then do other things once I get a to print. What am I doing wrong?
import os
win_path = os.path.join(home_dir, 'win.txt')

def roundedStr(num):
    return str(int(round(num)))

a=[] # i declares outside the loop for recover later
for i,line in enumerate(win_path):
    # files are iterable
    if i==0:
        t=line.split(' ')
    else:
        t=line.split(' ')
        t[1:6]= map(int,t[1:6])
    a.append(t) ## a have all the data
a.pop(0)
print a
This prints out the path, for example c:\workspace\win.txt, which is NOT what I want.
I want it to print the contents of win.txt, taking t[1:6] as integers, like
11 21 31 41 59 21
and print them out in that same way.
win.txt contains this
05/06/2017 11 21 31 41 59 21 3
05/03/2017 17 18 49 59 66 9 2
04/29/2017 22 23 24 45 62 5 2
04/26/2017 01 15 18 26 51 26 4
04/22/2017 21 39 41 48 63 6 3
04/19/2017 01 19 37 40 52 15 3
04/15/2017 05 22 26 45 61 13 3
04/12/2017 08 14 61 63 68 24 2
04/08/2017 23 36 51 53 60 15 2
04/05/2017 08 20 46 53 54 13 2
I just want fields [1] through [6].
I think what you want is to open the file 'win.txt' and read its content. Use the open function to create a file object, and a with block to scope it. See my example below. This will read the file and take the first 6 numbers of each line.
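import os
# home_dir is assumed to be defined earlier (the directory that contains win.txt)
win_path = os.path.join(home_dir, 'win.txt')
a=[] # declared outside the loop so it can be used afterwards
with open(win_path, 'r') as file:
    for i,line in enumerate(file):
        line = line.strip()
        print(line)
        if i==0:
            # the first line is kept as strings and dropped again below
            t=line.split(' ')
        else:
            t=line.split(' ')
            t[1:7]= map(int,t[1:7])
            t = t[1:7]  # keep only the six numbers after the date
        a.append(t) ## a has all the data
a.pop(0)  # drop the first line, as in the question's code
print (a)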
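A slightly more compact sketch of the same parsing step, shown on two of the sample lines from the question rather than on a real file. The helper name `parse_draw` is made up here, and calling split() with no argument is a substitution that tolerates repeated spaces or tabs between fields.

```python
sample_lines = [
    "05/06/2017 11 21 31 41 59 21 3",
    "05/03/2017 17 18 49 59 66 9 2",
]

def parse_draw(line):
    """Return fields [1] through [6] of a whitespace-separated line as integers."""
    fields = line.split()  # no argument: splits on any run of whitespace
    return [int(x) for x in fields[1:7]]

# Skip blank lines so an empty trailing line does not produce an empty row
rows = [parse_draw(line) for line in sample_lines if line.strip()]
for row in rows:
    print(*row)
```

With a real file, the list comprehension could iterate over the open file object instead of `sample_lines`.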
I have a file whose first column contains repeated values, as below:
1999.2222 50 100
1999.2222 42 15
1999.2222 24 35
1999.2644 10 25
1999.2644 10 26
1999.3564 65 98
1999.3564 45 685
1999.3564 54 78
1999.3564 78 98
and I want to split this file into three files:
file1:
1999.2222 50 100
1999.2222 42 15
1999.2222 24 35
file2:
1999.2644 10 25
1999.2644 10 26
file3:
1999.3564 65 98
1999.3564 45 685
1999.3564 54 78
1999.3564 78 98
How could I split like this? Thanks:)
itertools.groupby is probably the most suitable choice for what you're after.
import itertools

with open('file.txt', 'r') as fin:
    # group each line in input file by first part of split
    for i, (k, g) in enumerate(itertools.groupby(fin, lambda l: l.split()[0]), 1):
        # create file to write to suffixed with group number - start = 1
        with open('file{0}.txt'.format(i), 'w') as fout:
            # for each line in group write it to file
            for line in g:
                fout.write(line.strip() + '\n')
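Note that itertools.groupby only groups consecutive lines with the same key, so the approach above assumes equal first-column values are adjacent, as they are in the sample. If they might be interleaved, one possible alternative is to collect lines in a dictionary first. The sketch below uses made-up, deliberately interleaved sample lines and only prints which output file each group would go to; the function name `group_lines_by_first_column` is hypothetical.

```python
from collections import defaultdict

def group_lines_by_first_column(lines):
    """Map each first-column value to the list of lines that start with it."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():  # ignore blank lines
            groups[line.split()[0]].append(line.rstrip('\n'))
    return groups

# Interleaved on purpose: groupby alone would yield four groups here
sample = [
    "1999.2222 50 100\n",
    "1999.2644 10 25\n",
    "1999.2222 42 15\n",
    "1999.3564 65 98\n",
    "1999.2222 24 35\n",
]

groups = group_lines_by_first_column(sample)
# Dictionaries keep insertion order, so files are numbered by first appearance
for i, (key, members) in enumerate(groups.items(), 1):
    print(f"file{i}.txt <- {key}: {len(members)} line(s)")
```

Each group's lines could then be written to its file with a loop like the one in the answer above.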
import csv
import output
fill = input("Enter File name:")
f = open(fill)
csv_f = csv.reader(f)
m = open('data.csv', "w")
dict_out = {}
for row in csv_f:
    if row[1] in dict_out:
        dict_out[row[1]] += row[3]
    else:
        dict_out[row[1]] = row[3]
for title, value in dict_out.items():
    m.write('{},'.format(title))
    m.write ('{} \n'.format(value))
m.close()
Prints my csv as
Title,Detail
Siding, 50 63 22 68 138 47 123 107 107 93 117
Asphalt, 49 8 72 19 125 95 33 83 123 144
Rail, 82 98 89 62 58 66 24 77 120 93
Grinding, 127 47 20 66 29 137 33 145 3 98
Concrete, 130 75 12 88 22 137 114 88 143 16
I would like to put a comma in between the numbers. I have tried m.write(',') after m.write('{} \n'.format(value)), but it only adds a comma after the last one. How can I format it so it will output as
Title,Detail
Siding, 50,63,22,68,138,47,123,107,107,93,117
Asphalt, 49,8,72,19,125,95,33,83,123,144
Rail, 82,98,89,62,58,66,24,77,120,93
Grinding, 127,47,20,66,29,137,33,145,3,98
Concrete, 130,75,12,88,22,137,114,88,143,16
Not the best way, but you can:
for title, value in dict_out.items():
    m.write('{},'.format(title))
    # strip() first so a leading space does not turn into an empty field
    m.write('{} \n'.format(value.strip().replace(' ', ',')))
But you should definitely use csv.writer:
import csv
import output
fill = input("Enter File name:")
f = open(fill)
csv_f = csv.reader(f)
# newline='' is what the csv module recommends for files passed to csv.writer
c = open('data.csv', "w", newline='')
m = csv.writer(c)
dict_out = {}
for row in csv_f:
    if row[1] in dict_out:
        dict_out[row[1]].append(row[3])
    else:
        dict_out[row[1]] = [row[3]]
for title, value in dict_out.items():
    m.writerow([title] + value)
c.close()
If value is a string then you need to use value.split(). If it is already a list then you don't need to use the split method.
with open('data.csv', "w") as m:
    for title, value in dict_out.items():
        m.write(title + "," + ",".join(value.split()) + "\n")
88 90 94 98 100 110 120
75 77 80 86 94 103 113
80 83 85 94 111 111 121
68 71 76 85 96 122 125
77 84 91 102 105 112 119
81 85 90 96 102 109 134
Hi, I am very new to computer programming in general and I need some help with my current project. I need to read numbers from a text file into a table and calculate the averages and maximums. This is what I currently have.
def main():
    intro()
    #sets variables
    n1=[]
    n2=[]
    n3=[]
    n4=[]
    n5=[]
    n6=[]
    n7=[]
    numlines = 0
    filename = input("Enter the name of the data file: ")
    print() #turnin
    infile = open(filename,"r")
    for line in infile:
        #splits the lines
        data = line.split()
        #takes vertical lines individually and converts them to integers
        n1.append(int(data[0]))
        n2.append(int(data[1]))
        n3.append(int(data[2]))
        n4.append(int(data[3]))
        n5.append(int(data[4]))
        n6.append(int(data[5]))
        n7.append(int(data[6]))
    datalist = n1,n2,n3,n4,n5,n6
    #calculates the average speeds
    n1av = (sum(n1))/len(n1)
    n2av = (sum(n2))/len(n2)
    n3av = (sum(n3))/len(n3)
    n4av = (sum(n4))/len(n4)
    n5av = (sum(n5))/len(n5)
    n6av = (sum(n6))/len(n6)
    n7av = (sum(n7))/len(n7)
    #calculates the max speeds
    n1max = max(n1)
    n2max = max(n2)
    n3max = max(n3)
    n4max = max(n4)
    n5max = max(n5)
    n6max = max(n6)
    n7max = max(n7)
    #Calculates the average of the average speeds
    Avgav = (n1av + n2av + n3av + n4av + n5av + n6av + n7av) / 7
    #Calculates the average of the average max
    Avmax = (n1max + n2max + n3max + n4max + n5max + n6max + n7max) / 7
    #creates table
    print(aver_speed)
    print()
    print(" "* 27, "Speed (MPH)")
    print(" "*3,"Car :", "{:6}".format(30),"{:6}".format(40),"{:6}".format(50)
          ,"{:6}".format(60),"{:6}".format(70),"{:6}".format(80),
          "{:6}".format(90)," :","{:14}".format ("Average Noise"))
    print("-"*77)
    for i in range(0,len(datalist)):
        print("{:6}".format(int("1")+1)," "*2,":", "{:6}".format (n1[i]), "{:6}".format (n2[i]), "{:6}".format (n3[i]),
              "{:6}".format (n4[i]),"{:6}".format (n5[i]),"{:6}".format (n6[i]),"{:6}".format (n7[i])," :", )
    print("-"*77)
    print(" ","Average","{:1}".format(":"), "{:8.1f}".format(n1av),"{:6.1f}".format(n2av),
          "{:6.1f}".format(n3av),"{:6.1f}".format(n4av),"{:6.1f}".format(n5av),"{:6.1f}".format(n6av),
          "{:6.1f}".format(n7av), "{:9.1f}".format(Avgav))
    print()
    print(" ","Maximum","{:1}".format(":"), "{:6}".format(n1max), "{:6}".format(n2max), "{:6}".format(n3max), "{:6}".format(n4max)
          , "{:6}".format(n5max), "{:6}".format(n6max), "{:6}".format(n7max),"{:11.1f}".format(Avmax))
Any help would be appreciated.
Now that I have updated my code, my table looks like this:
    Car :     30     40     50     60     70     80     90  : Average Noise
      2 :     88     90     94     98    100    110    120  :
      2 :     75     77     80     86     94    103    113  :
      2 :     80     83     85     94    111    111    121  :
      2 :     68     71     76     85     96    122    125  :
      2 :     77     84     91    102    105    112    119  :
      2 :     81     85     90     96    102    109    134  :
Average :   78.2   81.7   86.0   93.5  101.3  111.2  122.0       96.3
Maximum :     88     90     94    102    111    122    134      105.9
I've been trying to figure out the calculations for average noise and how to list the cars 1 through 6. I was unable to fi
You have a lot of code now. You can do this more simply. If you want to calculate by row:
with open(filename, 'r') as f:
    for line in f:
        # build a real list: in Python 3, map() returns a one-shot iterator with no len()
        list_of_speed = [int(x) for x in line.split()]
        max_speed = max(list_of_speed)
        aver_speed = float(sum(list_of_speed))/len(list_of_speed)
If by column:
with open(filename, 'r') as f:
    # list of rows, each row a list of ints
    l = [[int(x) for x in line.split()] for line in f]
for n in range(len(l[0])):
    list_of_speed = [value[n] for value in l]
    max_speed = max(list_of_speed)
    aver_speed = float(sum(list_of_speed))/len(list_of_speed)
You can use sum() function on a list and len() function gives the number of elements in the list. So for average calculation you can simply do sum(n1)/float(len(n1)).
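As a rough sketch of the two things the question still asks for (a per-car average noise and cars numbered 1 through 6), the rows can be kept as a list of lists, with enumerate supplying the car number and zip(*rows) giving the columns. The data is copied from the question into the script so the example runs without a file; the variable names and the exact column widths are illustrative only.

```python
# Noise readings from the question: one row per car, one column per speed
rows = [
    [88, 90, 94, 98, 100, 110, 120],
    [75, 77, 80, 86, 94, 103, 113],
    [80, 83, 85, 94, 111, 111, 121],
    [68, 71, 76, 85, 96, 122, 125],
    [77, 84, 91, 102, 105, 112, 119],
    [81, 85, 90, 96, 102, 109, 134],
]

row_averages = [sum(r) / len(r) for r in rows]         # average noise per car
columns = list(zip(*rows))                             # transpose rows into columns
column_averages = [sum(c) / len(c) for c in columns]   # average per speed
column_maxima = [max(c) for c in columns]              # maximum per speed

# enumerate(..., start=1) numbers the cars 1 through 6
for car, (r, avg) in enumerate(zip(rows, row_averages), start=1):
    speeds = "".join(f"{v:7}" for v in r)
    print(f"{car:7} :{speeds}  :{avg:8.1f}")
print("Average :" + "".join(f"{v:7.1f}" for v in column_averages))
print("Maximum :" + "".join(f"{v:7}" for v in column_maxima))
```

When reading from a file instead, `rows` could be built with a list comprehension over the lines, converting each field with int.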
Try to use a dynamic way of keeping track of the data you read, or calculate the sum and average on the fly and keep track of those. Not to discourage you, but using seven separate lists isn't very elegant. Something similar to this might work:
from pprint import pprint

def main():
    # intro()
    filename = input("Enter the name of the data file:")
    infile = open(filename,"r")
    n = {} # a dictionary
    for line in infile:
        # apply typecasting on each element
        data = map(int, line.split())
        # add speeds into a dictionary of lists
        # supports any number of data sets
        for i,d in enumerate(data):
            if i+1 in n:
                n[i+1].append(d)
            else:
                n[i+1] = [d]
    pprint (n)
    # do whatever you want with the dictionary
    for d in n:
        print ("-" * 10)
        print (d)
        print (sum(n[d]))
        print (sum(n[d])/float(len(n[d])))

main()
For printing purposes you may want to use something like https://pypi.python.org/pypi/PTable