I have a file in the below format
.aaa b/b
.ddd e/e
.fff h/h
.lop m/n
I'm trying to read this file. What I want is a lookup: if I find ".aaa" I should get b/b, if I find ".ddd" I should get e/e, and so on.
I know how to fetch 1st column and 2nd column but I don't know how to compare them and fetch the value. This is what I've written.
file = open('some_file.txt')
for line in file:
    fields = line.strip().split()
    print(fields[0])  # This will give the 1st column
    print(fields[1])  # This will give the 2nd column
This is not the right way of doing things. What approach should I follow?
Any time you want to do lookups, a dictionary is going to be your friend.
You could write a function to load the data into a dictionary:
def load_data(filename):
    result = dict()
    with open(filename, 'r') as f:
        for line in f:
            k, v = line.strip().split()  # will fail if not exactly 2 fields
            result[k] = v
    return result
And then use it to perform your lookups like this:
data = load_data('foo.txt')
print(data['.aaa'])
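If a key might be missing, or a line might not have exactly two fields, a slightly more defensive variant can help. The sketch below is only an illustration: `parse_pairs` and the sample lines are made-up names, and it works on an in-memory list of lines rather than a real file so that it runs on its own.

```python
def parse_pairs(lines):
    """Build a dict from lines of the form '<key> <value>' (hypothetical helper)."""
    result = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:  # skip blank or malformed lines instead of failing
            continue
        key, value = fields
        result[key] = value
    return result

# Sample lines standing in for the file from the question
sample = [".aaa b/b\n", ".ddd e/e\n", "\n", ".fff h/h\n", ".lop m/n\n"]
data = parse_pairs(sample)
print(data[".aaa"])                 # b/b
print(data.get(".zzz", "missing"))  # dict.get avoids a KeyError for unknown keys
```

Passing an open file object instead of the list should work the same way, since both yield one line per iteration.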
It sounds like what you may want is to build a dictionary mapping column 1 to column 2. You could try:
file = open('some_file.txt')
field_dict = {}
for line in file:
    fields = line.strip().split()
    field_dict[fields[0]] = fields[1]
Then in your other code, when you see '.ddd' you can simply get the reference from the dictionary (e.g. field_dict['.ddd'] should return 'e/e')
Split each line on whitespace and check whether the first item matches the word you are looking for. If it does, print the second item of the list.
word = input("Enter the word to search : ")
with open('some_file.txt') as f:
    for line in f:
        m = line.strip().split()
        if m and m[0] == word:
            print(m[1])
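When only a single lookup is needed, the scan described above can stop at the first match. This is a rough sketch, not the answerer's code: `find_value` is a hypothetical helper, and `io.StringIO` stands in for the real file.

```python
import io

def find_value(lines, word):
    """Return the second field of the first line whose first field equals word, else None."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == word:
            return fields[1]  # stop scanning as soon as the key is found
    return None

# io.StringIO stands in for an open file here (an assumption for the demo)
fake_file = io.StringIO(".aaa b/b\n.ddd e/e\n.fff h/h\n.lop m/n\n")
result = find_value(fake_file, ".ddd")
print(result)  # e/e
```

For repeated lookups against the same file, building a dictionary once is usually the better fit.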
Related
highest_score = 0
g = open("grades_single.txt","r")
arrayList = []
for line in highest_score:
    if float(highest_score) > highest_score:
        arrayList.extend(line.split())
g.close()
print(highest_score)
Hello, I wondered if anyone could help me; I'm having problems here. I have to read in a file which contains 3 lines. The first line is of no use and neither is the third. The second contains a list of letters, which I have to pull out (for instance all the As, all the Bs, all the Cs, all the way up to G); there are multiple occurrences of each letter. I have to be able to count how many of each there are with this program. I'm very new to this, so please bear with me if the code I've written is wrong. I just wondered if anyone could point me in the right direction on how to pull out the letters on the second line and count them. I then have to apply a mathematical function to these letters, but I hope to work that out for myself.
Sample of the data:
GTSDF60000
ADCBCBBCADEBCCBADGAACDCCBEDCBACCFEABBCBBBCCEAABCBB
*
You do not read the contents of the file. To do so, use the .read() or .readlines() method on your opened file. .readlines() reads each line of a file separately, like so:
g = open("grades_single.txt","r")
filecontent = g.readlines()
Since it is good practice to close your file as soon as you have read its contents, follow directly with:
g.close()
Another option would be:
with open("grades_single.txt","r") as g:
    content = g.readlines()
The with statement closes the file for you (so you don't need to call .close() this way).
Since you need the contents of the second line only you can choose that one directly:
content = g.readlines()[1]
.readlines() doesn't strip the trailing newline (usually \n) from a line, so you still have to do so:
content = g.readlines()[1].strip('\n')
The .count()-method lets you count items in a list or in a string. So you could do:
dct = {}
for item in content:
    dct[item] = content.count(item)
This can be written more concisely with a dictionary comprehension:
dct = {item: content.count(item) for item in content}
Finally, you can get the highest score and print it:
highest_score = max(dct.values())
print(highest_score)
.values() returns the values of a dictionary and max, well, returns the maximum value in a list.
Thus the code that does what you're looking for could be:
with open("grades_single.txt","r") as g:
    content = g.readlines()[1].strip('\n')
dct = {item: content.count(item) for item in content}
highest_score = max(dct.values())
print(highest_score)
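Calling content.count(item) inside a loop rescans the whole string for every character. As an alternative sketch, the standard library's `collections.Counter` tallies everything in a single pass. The string below is the second line of the sample data from the question, hard-coded so the example runs without the file.

```python
from collections import Counter

# Second line of the sample data from the question
content = "ADCBCBBCADEBCCBADGAACDCCBEDCBACCFEABBCBBBCCEAABCBB"

counts = Counter(content)              # counts every character in one pass
highest_score = max(counts.values())   # largest count among all letters
print(counts["A"], counts["B"], highest_score)  # 9 15 15
```

A Counter behaves like a dictionary, so counts["C"] or counts.items() can be used for any later calculation.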
highest_score = 0
arrayList = []
with open("grades_single.txt") as f:
    # a file object cannot be indexed directly, so read the lines first
    arrayList.extend(f.readlines()[1].strip())
print(arrayList)
This prints the characters of the second line of that file. It extends arrayList with them, and then you can do whatever you want with that list.
import re

# opens the file in read mode (and closes it automatically when done)
with open('my_file.txt', 'r') as opened_file:
    # Temporarily stores all lines of the file here.
    all_lines_list = []
    for line in opened_file.readlines():
        all_lines_list.append(line)

# This is the selected pattern.
# It basically means "match a single character from a to g"
# and ignores upper or lower case
pattern = re.compile(r'[a-g]', re.IGNORECASE)

# Which line I want to choose (assuming you only need one line chosen)
line_num_i_need = 2

# (1 is deducted since the first element in python has index 0)
matches = re.findall(pattern, all_lines_list[line_num_i_need-1])

print('\nMatches found:')
print(matches)
print('\nTotal matches:')
print(len(matches))
You might want to check regular expressions in case you need some more complex pattern.
To count the occurrences of each letter I used a dictionary instead of a list. With a dictionary, you can access each letter count later on.
d = {}
g = open("grades_single.txt", "r")
for i, line in enumerate(g):
    if i == 1:
        holder = list(line.strip())
g.close()
for letter in holder:
    d[letter] = holder.count(letter)
for key, value in d.items():
    print("{},{}".format(key, value))
Outputs (on Python 3.7+, where dictionaries keep insertion order)
A,9
D,5
C,15
B,15
E,4
G,1
F,1
One can treat the first line specially (and in this case ignore it) with next inside try: except StopIteration:. In this case, where you only want the second line, follow with another next instead of a for loop.
with open("grades_single.txt") as f:
    try:
        next(f)  # discard 1st line
        line = next(f)
    except StopIteration:
        raise ValueError('file does not even have two lines')
# now use line
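The same idea generalises to "give me line n". One possible sketch uses `itertools.islice` together with the default argument of `next`, which avoids the try/except. `nth_line` is a hypothetical helper name, and `io.StringIO` stands in for the opened file.

```python
import io
from itertools import islice

def nth_line(lines, n):
    """Return the n-th line (1-based) without its newline, or None if there are fewer lines."""
    line = next(islice(lines, n - 1, n), None)  # None is returned when the iterator runs out
    return None if line is None else line.rstrip("\n")

sample = "GTSDF60000\nADCB\n*\n"
second = nth_line(io.StringIO(sample), 2)
missing = nth_line(io.StringIO(sample), 5)
print(second, missing)  # ADCB None
```

Returning None instead of raising is a design choice for this sketch; raising ValueError as in the answer above is just as reasonable when a missing line is an error.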
In Python, how would I select a single character from a txt document that contains the following:
A#
M*
N%
(on separate lines) ...and then update a dictionary with the letter as the key and the symbol as the value.
The closest I have got is:
ftwo = open("clues.txt", "r")
for lines in ftwo.readlines():
    for char in lines:
I'm pretty new to coding so I can't work it out!
Supposing that each line contains exactly two characters (first the key, then the value):
with open('clues.txt', 'r') as f:
    myDict = {a[0]: a[1] for a in f}
If you have empty lines in your input file, you can filter these out:
with open('clues.txt', 'r') as f:
    myDict = {a[0]: a[1] for a in f if a.strip()}
First, you'll want to read each line one at a time:
my_dict = {}
with open("clues.txt", "r") as ftwo:
    for line in ftwo:
        # Then, you'll want to put your elements in a dict
        my_dict[line[0]] = line[1]
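Indexing line[0] and line[1] raises an IndexError on short or empty lines. A cautious variant, sketched here with a hypothetical `parse_clues` helper and a hard-coded list of lines in place of clues.txt, checks the length first:

```python
def parse_clues(lines):
    """Map the first character of each line to the second (hypothetical helper)."""
    clues = {}
    for line in lines:
        pair = line.strip()
        if len(pair) != 2:   # ignore blank lines and anything unexpected
            continue
        letter, symbol = pair  # a 2-character string unpacks into two names
        clues[letter] = symbol
    return clues

clues = parse_clues(["A#\n", "M*\n", "\n", "N%\n"])
print(clues)   # {'A': '#', 'M': '*', 'N': '%'}
```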
I'm using a for loop to read a file, but I only want to read specific lines, say lines that start with "af" and "apn". Is there any built-in feature to achieve this?
How do I split such a line after reading it?
How do I store the elements from the split in a dictionary?
Let's say the first element of the line after the split is the employee ID, which I store in the dictionary, and the second element is the employee's full name, which I want to store in the dictionary too.
So when I use "employee_dict[employee_ID]", will I get the full name?
Thank you.
You can do so very easily
f = open('file.txt', 'r')
employee_dict = {}
for line in f:
    if line.startswith("af") or line.startswith("apn"):
        emprecords = line.split()  # assuming the default separator is a space character
        # assuming that all your records follow a common format, you can then create an employee dict
        employee = {}
        # the first element after the split is employee id
        employee_id = int(emprecords[0])
        # enter name-value pairs within the employee object - for e.g. let's say the second element after the split is the emp name, the third the age
        employee['name'] = emprecords[1]
        employee['age'] = emprecords[2]
        # store this in the global employee_dict
        employee_dict[employee_id] = employee
To retrieve the name of employee id 1 after having done the above, use something like:
print(employee_dict[1]['name'])
Hope this gives you an idea of how to go about it.
If your file looks like
af, 1, John
ggg, 2, Dave
you could create a dict like
d = {z[1].strip() : z[2].strip() for z in [y for y in [x.split(',') for x in open(r"C:\Temp\test1.txt")] if y[0] in ('af', 'apn')]}
More readable version
d = {}
for l in open(r"C:\Temp\test1.txt"):
    x = l.split(',')
    if x[0] not in ('af', 'apn'):
        continue
    d[x[1].strip()] = x[2].strip()
Both solutions give you d = {'1': 'John'} on this example. To get the name from the dict, you can do name = d['1']
prefixes = ("af", "apn")
with open('file.txt', 'r') as f:
    employee_dict = dict((line.split()[:2]) for line in f if any(line.startswith(p) for p in prefixes))
dictOfNames = {}
file = open("filename", "r")
for line in file:
    if line.startswith('af') or line.startswith('apn'):
        line = line.strip().split(',')  # split using delimiter of ','
        dictOfNames[line[1]] = line[2]  # take 2nd element of line as id and 3rd as name
The program above reads the file and, for each line that starts with 'af' or 'apn', stores the second element as the id and the third as the name, assuming a comma is the delimiter.
Now you can use dictOfNames[id] to get the name.
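On the "built-in feature" part of the question: str.startswith also accepts a tuple of prefixes, so the two checks can be combined into one call. The snippet below is a small sketch on made-up comma-separated records (prefix, id, name), with `io.StringIO` standing in for the file.

```python
import io

# Hypothetical comma-separated records: prefix, id, name
sample = io.StringIO("af, 1, John\nggg, 2, Dave\napn, 3, Mary\n")

names_by_id = {}
for line in sample:
    if not line.startswith(("af", "apn")):   # a tuple means "any of these prefixes"
        continue
    _prefix, emp_id, name = (part.strip() for part in line.split(","))
    names_by_id[emp_id] = name

print(names_by_id)   # {'1': 'John', '3': 'Mary'}
```

Note that the ids stay strings here; wrap emp_id in int() if numeric keys are wanted.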
I'm trying to figure out how to open up a file and then store its contents into a dictionary using the Part no. as the key and the other information as the value. So I want it to look something like this:
{Part no.: "Description,Price", 453: "Sperving_Bearing,9900", 1342: "Panametric_Fan,23400",9480: "Converter_Exchange,93859"}
I was able to store the text from the file into a list, but I'm not sure how to assign more than one value to a key. I'm trying to do this without importing any modules. I've been using the basic str methods, list methods and dict methods.
For a txt file like so
453 Sperving_Bearing 9900
1342 Panametric_Fan 23400
9480 Converter_Exchange 93859
You can just do
>>> newDict = {}
>>> with open('testFile.txt', 'r') as f:
...     for line in f:
...         splitLine = line.split()
...         newDict[int(splitLine[0])] = ",".join(splitLine[1:])
...
>>> newDict
{9480: 'Converter_Exchange,93859', 453: 'Sperving_Bearing,9900', 1342: 'Panametric_Fan,23400'}
You can get rid of the ----... line by just checking if line.startswith('-----').
EDIT - If you are sure that the first two lines contain the same stuff, then you can just do
>>> testDict = {"Part no.": "Description,Price"}
>>> with open('testFile.txt', 'r') as f:
...     _ = next(f)
...     _ = next(f)
...     for line in f:
...         splitLine = line.split()
...         testDict[int(splitLine[0])] = ",".join(splitLine[1:])
...
>>> testDict
{9480: 'Converter_Exchange,93859', 'Part no.': 'Description,Price', 453: 'Sperving_Bearing,9900', 1342: 'Panametric_Fan,23400'}
This adds the first line to the testDict in the code and skips the first two lines and then continues on as normal.
You can read a file into a list of lines like this:
lines = thetextfile.readlines()
You can split a single line by spaces using:
items = somestring.split()
Here's a basic example of how to store a list in a dictionary:
>>> mylist = [1, 2, 3]
>>> mydict = {}
>>> mydict['hello'] = mylist
>>> mydict['world'] = [4, 5, 6]
>>> print(mydict)
Containers like tuples, lists and dictionaries can be nested inside each other as items.
To iterate over a list, use a for statement like this:
for item in somelist:
    # do something with the item like printing it
    print(item)
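Putting those pieces together for the parts file, one option is to store each value as a (description, price) tuple instead of a joined string, so the fields stay separately accessible. This is only a sketch: the lines are hard-coded from the sample data, and it uses nothing beyond basic str and dict methods.

```python
lines = [
    "453 Sperving_Bearing 9900",
    "1342 Panametric_Fan 23400",
    "9480 Converter_Exchange 93859",
]

parts = {}
for line in lines:
    part_no, description, price = line.split()
    # store a (description, price) tuple so each field stays separately accessible
    parts[int(part_no)] = (description, int(price))

description, price = parts[1342]
print(description, price)   # Panametric_Fan 23400
```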
Here's my stab at it, tested on Python 2.x/3.x:
import re

def str2dict(filename="temp.txt"):
    results = {}
    with open(filename, "r") as cache:
        # read file into a list of lines
        lines = cache.readlines()
        # loop through lines
        for line in lines:
            # skip lines starting with "--".
            if not line.startswith("--"):
                # replace runs of two or more whitespace characters (\s\s+) with a tab (\t),
                # strip the trailing return (\n), split into list using
                # "\t" as the split pattern
                line = re.sub(r"\s\s+", "\t", line).strip().split("\t")
                # use first item in list for the key, join remaining list items
                # with ", " for the value.
                results[line[0]] = ", ".join(line[1:])
    return results

print(str2dict("temp.txt"))
You should store the values as a list or a tuple. Something like this:
textname = input("Enter a file: ")
thetextfile = open(textname, 'r')
print("The file has been successfully opened!")
thetextfile = thetextfile.read()
file_s = thetextfile.splitlines()  # one entry per line, not per word
wordlist = {}
for c in file_s:
    fields = c.split()
    if fields:  # skip blank lines
        wordlist[fields[0]] = fields[1:]
Your file should look like this:
Part no.;Description,Price
453;Sperving_Bearing,9900
1342;Panametric_Fan,23400
9480;Converter_Exchange,93859
Then you just need to add a bit of code:
import csv
from collections import OrderedDict
with open('your_file.txt', 'r') as f:
    reader = csv.reader(f, delimiter=';')
    d = OrderedDict((row[0], row[1].strip()) for row in reader)
for x, y in d.items():
    print(x)
    print(y)
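To try the csv approach without creating a file, the reader can be fed from memory. In this sketch the text mirrors the semicolon-separated layout suggested above, `io.StringIO` stands in for the file, and the header row is pulled off separately so the remaining keys can be converted to int.

```python
import csv
import io

# In-memory stand-in for the semicolon-separated file (assumed layout)
text = "Part no.;Description,Price\n453;Sperving_Bearing,9900\n1342;Panametric_Fan,23400\n"

reader = csv.reader(io.StringIO(text), delimiter=";")
header = next(reader)                      # ['Part no.', 'Description,Price']
parts = {int(row[0]): row[1] for row in reader}
print(parts[453])   # Sperving_Bearing,9900
```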
This dictionary is supposed to take the three-letter country code of a country, e.g. GRE for Great Britain, and then take the four consecutive numbers after it as a tuple. It should be something like this:
{GRE:(204,203,112,116)} and continue doing that for every single country in the list. The txt file looks like this:
Country,Games,Gold,Silver,Bronze
AFG,13,0,0,2
ALG,15,5,2,8
ARG,40,18,24,28
ARM,10,1,2,9
ANZ,2,3,4,5 etc.
This isn't actual code; I just wanted to show how the file is formatted.
I need my program to skip the first line because it's a header. Here's what my code looks like thus far:
def medals(goldMedals):
    infile = open(goldMedals, 'r')
    medalDict = {}
    for line in infile:
        if infile[line] != 0:
            key = line[0:3]
            value = line[3:].split(',')
            medalDict[key] = value
    print(medalDict)
    infile.close()
    return medalDict

medals('GoldMedals.txt')
Your for loop should be like:
next(infile)  # Skip the first line
for line in infile:
    words = line.split(',')
    medalDict[words[0]] = tuple(map(int, words[1:]))
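For reference, here is how that loop might look inside a complete function. It is a sketch under a few assumptions: the function takes any iterable of lines rather than a filename, `io.StringIO` stands in for GoldMedals.txt, and rows that do not have exactly five fields are skipped.

```python
import io

def medals(lines):
    """Map each country code to a tuple of its four integer columns."""
    lines = iter(lines)
    next(lines, None)              # skip the header row, if any
    medal_dict = {}
    for line in lines:
        words = line.strip().split(",")
        if len(words) != 5:        # ignore blank or malformed rows
            continue
        medal_dict[words[0]] = tuple(map(int, words[1:]))
    return medal_dict

sample = io.StringIO("Country,Games,Gold,Silver,Bronze\nAFG,13,0,0,2\nALG,15,5,2,8\n")
result = medals(sample)
print(result["ALG"])   # (15, 5, 2, 8)
```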
A variation on a theme, I'd convert all the remaining cols to ints, and I'd use a namedtuple:
from collections import namedtuple

with open('file.txt') as fin:
    # The first line names the columns
    lines = iter(fin)
    columns = next(lines).strip().split(',')
    row = namedtuple('Row', columns[1:])
    results = {}
    for line in lines:
        columns = line.strip().split(',')
        results[columns[0]] = row(*(int(c) for c in columns[1:]))
# Results is now a dict to named tuples
This has the nice feature of 1) skipping the first line and 2) providing both offset and named access to the rows:
# These both work to return the 'Games' column
results['ALG'].Games
results['ALG'][0]
with open('path/to/file') as infile:
    next(infile)  # skip the header line
    answer = {}
    for line in infile:
        k, v = line.strip().split(',', 1)
        answer[k] = tuple(int(i) for i in v.split(','))
I think inspectorG4dget's answer is the most readable... but for those playing code golf:
with open('medals.txt', 'r') as infile:
    headers = infile.readline()
    dict([(i[0], tuple(i[1:])) for i in [list(line.strip().split(',')) for line in infile]])
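At the other end of the spectrum from code golf, the standard csv module can use the header row to name the columns. A brief sketch with `csv.DictReader`, using an in-memory sample in place of medals.txt and picking out just the Gold column as an illustration:

```python
import csv
import io

sample = io.StringIO("Country,Games,Gold,Silver,Bronze\nARG,40,18,24,28\nARM,10,1,2,9\n")

golds = {}
for row in csv.DictReader(sample):      # the header row supplies the keys
    golds[row["Country"]] = int(row["Gold"])

print(golds)   # {'ARG': 18, 'ARM': 1}
```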