I have a file with the following format.
>abc
qqqwqwqwewrrefee
eededededededded
dededededededd
>bcd
swswswswswswswws
wswswsddewewewew
wrwwewedsddfrrfe
>fgv
wewewewewewewewew
wewewewewewewxxee
wwewewewe
I am trying to create a dictionary with (>abc, >bcd, >fgv) as keys and the strings below them as values. I can extract the keys, but I am confused about how to update the values. Please help.
file2 = open("ref.txt",'r')
for line in file2.readlines():
    if ">" in line:
        print (line)
Not sure what you mean about "updating" the values, but try this:
mydict = {}
with open("ref.txt", "r") as file2:
    current = None
    for line in file2.readlines():
        if line[0] == ">":
            current = line[1:-1]
            mydict[current] = ""
        elif current:
            mydict[current] += line # use line[:-1] if you don't want the '\n'
In [2]: mydict
Out[2]: {'abc': 'qqqwqwqwewrrefee\neededededededded\ndededededededd\n',
'bcd': 'swswswswswswswws\nwswswsddewewewew\nwrwwewedsddfrrfe\n',
'fgv': 'wewewewewewewewew\nwewewewewewewxxee\nwwewewewe\n'}
When you get a line value with the '>' in it, save the line in a variable. When you read a line without the '>' in it, add it to a dictionary entry keyed by the previously saved variable.
key = None
result = {}  # named result to avoid shadowing the built-in dict
with open("ref.txt") as file2:
    for line in file2:
        if line.startswith(">"):
            key = line.strip()
            result[key] = '' # Initialise dictionary entry
        elif key is not None:
            result[key] += line # Append to dictionary entry
dictionary = {}
key = None
with open("file.txt","r") as r:
    for line in r.readlines():
        if ">" in line:
            key = line[1:].strip()
            dictionary[key] = ""
        elif key is not None:  # ignore anything before the first '>' line
            dictionary[key] += line
print(dictionary)
d={}
key=''
with open("ref.txt",'r') as file2:
    for line in file2:
        if line.startswith('>'):
            key=line.strip()
            d[key]=[]
            continue
        d[key].append(line.strip())
I have not tested the above code, but it should work
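For anyone who wants something that can be run without the original file, here is a minimal sketch of the same idea. It assumes the '>' only ever appears at the start of a header line, and it uses io.StringIO with made-up sample text in place of ref.txt. The function name parse_records is just an illustration, not something from the answers above.

```python
import io

def parse_records(lines):
    """Group lines under the most recent '>' header line."""
    records = {}
    key = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            key = line[1:]  # drop the leading '>'
            records[key] = []
        elif key is not None:
            records[key].append(line)
    # join the collected pieces once, at the end
    return {k: "".join(v) for k, v in records.items()}

# Made-up sample standing in for ref.txt
sample = io.StringIO(">abc\nqqq\neee\n>bcd\nsws\n")
result = parse_records(sample)
print(result)
```

Collecting the pieces in a list and joining once avoids repeated string concatenation; pass an open file object instead of sample to use it on a real file.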
I am having difficulty getting specific values from a line of a .txt file with Python. For example, from this line:
Packages: Sent = 5, Received = 7, Lost = 0
I would like to get the integer values.
I already tried assigning keys and values to a dictionary in the following way:
data = {}
with open("file.txt", "rt", errors = "ignore") as file:
    lines = file.readlines()
    for line in lines:
        if "Packages:" in line:
            line.split(":")
            key1, value1, key2, value2, key3, value3 = line.split(" = ")
            data[key1, key2, key3] = value1, value2, value3
            print(value1, value2, value3)
I am a beginner, so please excuse me if this problem is trivial.
Thank you!
data = {}
with open("file.txt", "rt", errors = "ignore") as file:
    lines = file.readlines()
    for line in lines:
        if "Packages:" in line:
            line_vals=line.replace('Packages:', '').replace('\n','').split(',')
            for j in line_vals:
                vals=j.split('=') # here you can get each key and value as pair
                key=vals[0].strip() # here is your key
                value=int(vals[1]) # here is your value, converted to an integer
                data[key]=value # store the pair in the dictionary
You have to assign the result of the split to a variable:
data = {}
with open("file.txt", "rt", errors = "ignore") as file:
    lines = file.readlines()
    for line in lines:
        if "Packages:" in line:
            # keep the part after "Packages:" by assigning the result of split()
            line = line.strip().split(":")[1]
            for pair in line.split(","):
                key, value = pair.split(" = ")
                data[key.strip()] = int(value)
            print(data)
You're on the correct track, just a few issues in implementation:
data = {}
with open("file.txt", "rt", errors = "ignore") as file:
    for line in file:
        if "Packages:" in line:
            # Remove all spaces and the trailing newline
            line = line.replace(" ", "").strip()
            # Remove "Packages:"
            line = line.split(":")[1]
            # Separate out each <key, value> pair
            kv_pairs = line.split(",")
            # Fill the dictionary
            for kv in kv_pairs:
                key, value = kv.split('=')
                data[key] = int(value)  # convert to int, as asked
Say you have the line
line = 'Packages: Sent = 5, Received = 7, Lost = 0'
To clean up each "word" from line.split(" = "), you may do e.g.
words = [word.strip(' ,:') for word in line.split()]
Note that split() (without argument) splits on whitespace.
If you know that you want e.g. element 3 (the str '5'), do
val = words[3]
You can even convert this to a proper int by
val = int(words[3])
Of course this fails if the str does not actually represent an integer.
Side note
Note that line.split(":") on its own has no effect, as the str line is not mutated (str's are never mutated in Python). This just computes a list of results and then throws it away as you do not assign this result to a variable.
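As an alternative to indexing into the split words, a regular expression can pull out every name and number in one pass. This is only a sketch of a different technique (re.findall), assuming each value is a non-negative integer written as "Name = digits"; the variable names are illustrative.

```python
import re

line = "Packages: Sent = 5, Received = 7, Lost = 0"

# Each match is a (name, digits) tuple, e.g. ("Sent", "5")
pairs = re.findall(r"(\w+)\s*=\s*(\d+)", line)

# Convert the digit strings to real ints while building the dict
counts = {name: int(value) for name, value in pairs}
print(counts)
```

Because the pattern only matches runs of digits, anything that is not an integer is silently skipped rather than raising an error, which may or may not be what you want.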
To check whether a variable is an int, you can use this:
i = 12
print(isinstance(i, int)) #True
i = "12"
print(isinstance(i, int)) #False
and in your example you just need to do this:
lines = ["Packages: Sent = 5, Received = 7, Lost = 0"]
data = {}
linenumber = 1
for line in lines:
    line = line.split(": ")[1]
    col = line.split(", ")  # split on ", " so keys have no leading space
    dic = {}
    for item in col:
        item = item.split(" = ")
        dic.update({item[0]:item[1]})
    data.update({str(linenumber):dic})  # key each entry by its line number
    linenumber += 1
print(data)
and if you need to keep only the values that are integers, you can do this:
lines = ["Packages: Sent = 5, Received = oi, Lost = 0"]
data = {}
error = []
linenumber = 1
for line in lines:
    line = line.split(": ")[1]
    col = line.split(", ")
    dic = {}
    for item in col:
        item = item.split(" = ")
        try:
            # int() raises ValueError when the text is not an integer
            dic.update({item[0]: int(item[1])})
        except ValueError:
            error.append("Cannot convert str to int in line " + str(linenumber))
    data.update({str(linenumber): dic})
    linenumber += 1
print(data)
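The same idea can be wrapped in a small function so that the conversion and the error reporting live in one place. This is a rough sketch, not a definitive version: parse_counts and its parameters are names I made up, and it assumes every line looks like "Label: name = value, name = value".

```python
def parse_counts(line, line_number, errors):
    """Return {name: int} for one 'Label: a = 1, b = 2' line.

    Values that are not integers are skipped and reported in `errors`.
    """
    _, _, rest = line.partition(":")  # drop the "Packages" label
    counts = {}
    for item in rest.split(","):
        name, _, raw = item.partition("=")
        try:
            counts[name.strip()] = int(raw)  # int() tolerates surrounding spaces
        except ValueError:
            errors.append("line %d: cannot convert %r to int"
                          % (line_number, raw.strip()))
    return counts

errors = []
lines = ["Packages: Sent = 5, Received = oi, Lost = 0"]
data = {n: parse_counts(text, n, errors)
        for n, text in enumerate(lines, start=1)}
print(data)
print(errors)
```

Catching ValueError specifically, instead of using a bare except, means unrelated mistakes such as a misspelled variable still surface as errors.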
So far I have this code which is creating a dictionary from an input file:
def read_file(filename):
    with open("menu1.csv") as file:
        file.readline()
        for line in file:
            line_strip = [line.rstrip('\n')]
            lines= [line.split(',')]
            result = {key: (float(fl), int(intg),
                            text.strip()) for key,
                      fl, intg,text in lines}
            print(result)
read_file("menu1.csv")
I have to keep that code in that def format. However, this outputs 27 different dictionaries. How do I make it so it is all in ONE dictionary?
Also:
I want to alphabetize the keys and put them into a list. I tried something like this but it won't work:
def alphabetical_menu(dict):
    names = []
    for name in d:
        names.append(name)
    names.sort()
    print(names)
What am I doing wrong? Or do you have another way to do it?
Is this what you wanted?
def read_file(filename):
    result = {}  # one dictionary, created once before the loop
    with open(filename) as file:
        file.readline()  # skip the first line, as in the question
        for line in file:
            line_strip = line.rstrip()
            line_split = line_strip.split(',')
            key, fl, intg, text = line_split
            result[key] = (float(fl), int(intg), text.strip())
    return result
def alphabetical_menu(d):
    return sorted(d.keys())
menu_dict = read_file("menu1.csv")
menu_sorted_keys = alphabetical_menu(menu_dict)
# To check the result
print(menu_dict)
print(menu_sorted_keys)
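If you are free to use the standard csv module, a variation like the following handles the splitting for you. Treat it as a sketch under assumptions: the column names in the sample text are invented, I am guessing that the first line of menu1.csv is a header, and read_menu takes an open file handle rather than a filename so that it can be tried with io.StringIO.

```python
import csv
import io

def read_menu(handle):
    """Build one dict from rows of name, price, count, description."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row (assumed to exist)
    menu = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        name, price, count, text = row
        menu[name] = (float(price), int(count), text.strip())
    return menu

# Made-up sample standing in for menu1.csv
sample = io.StringIO(
    "name,price,count,description\n"
    "soup,3.5,2,hot\n"
    "bread,1.0,10, fresh\n"
)
menu = read_menu(sample)
names = sorted(menu)  # sorted() on a dict iterates over its keys
print(menu)
print(names)
```

The key point is the same as in the answer above: the dictionary is created once, outside the loop, and each row only adds one entry to it.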
I have written a script to convert a text file into a dictionary.
script.py
l=[]
d={}
count=0
f=open('/home/asha/Desktop/test.txt','r')
for row in f:
    rowcount+=1
    if row[0] == ' ' in row:
        l.append(row)
    else:
        if count == 0:
            temp = row
            count+=1
        else:
            d[temp]=l
            l=[]
            count=0
print d
textfile.txt
Time
 NtGetTickCount
 NtQueryPerformanceCounter
 NtQuerySystemTime
 NtQueryTimerResolution
 NtSetSystemTime
 NtSetTimerResolution
 RtlTimeFieldsToTime
 RtlTimeToTime
System informations
 NtQuerySystemInformation
 NtSetSystemInformation
 Enumerations
 Structures
The output I got is
{'Time\n': [' NtGetTickCount\n', ' NtQueryPerformanceCounter\n', ' NtQuerySystemTime\n', ' NtQueryTimerResolution\n', ' NtSetSystemTime\n', ' NtSetTimerResolution\n', ' RtlTimeFieldsToTime\n', ' RtlTimeToTime\n']}
It only converts up to the 9th line of the text file. Please suggest where I am going wrong.
You never commit (i.e. run d[temp] = l) the final list to the dictionary.
You can simply commit when you create the row:
d = {}
cur = []
for row in f:
    if row[0] == ' ': # line in section
        cur.append(row)
    else: # new row
        d[row] = cur = []
print (d)
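The reason this works is that d[row] and cur refer to the same list object, so appending to cur later also fills the list already stored in the dictionary. Here is a small self-contained sketch of that, using a hard-coded list of rows (a shortened copy of the sample file) instead of an open file, and stripping the rows for readability:

```python
rows = ["Time\n", " NtGetTickCount\n", " NtSetSystemTime\n",
        "System informations\n", " NtQuerySystemInformation\n"]

d = {}
cur = []
for row in rows:
    if row.startswith(" "):
        cur.append(row.strip())  # mutates the list already stored in d
    else:
        d[row.strip()] = cur = []  # one new list, bound to both names

print(d)
```

Since every section is stored the moment its title is seen, nothing is left over to commit after the loop ends.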
Using dict.setdefault to create dictionary with lists as values will make your job easier.
d = {}
with open('input.txt') as f:
    key = ''
    for row in f:
        if row.startswith(' '):
            d.setdefault(key, []).append(row.strip())
        else:
            key = row
print(d)
Output:
{'Time\n': ['NtGetTickCount', 'NtQueryPerformanceCounter', 'NtQuerySystemTime', 'NtQueryTimerResolution', 'NtSetSystemTime', 'NtSetTimerResolution', 'RtlTimeFieldsToTime', 'RtlTimeToTime'], 'System informations\n': ['NtQuerySystemInformation', 'NtSetSystemInformation', 'Enumerations', 'Structures']}
A few things to note here:
Always use with open(...) for file operations.
If you want to check the first index, or the first few indices, use str.startswith()
The same can be done using collections.defaultdict:
from collections import defaultdict
d = defaultdict(list)
with open('input.txt') as f:
    key = ''
    for row in f:
        if row.startswith(' '):
            d[key].append(row)
        else:
            key = row
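To make the comparison concrete, here is a short sketch that builds the same grouping both ways from a hard-coded list of (key, value) pairs. The pairs and variable names are made up for illustration. It also shows one difference worth knowing: merely reading a missing key from a defaultdict creates an entry for it.

```python
from collections import defaultdict

pairs = [("Time", "NtGetTickCount"), ("Time", "NtSetSystemTime"),
         ("System informations", "NtSetSystemInformation")]

# Plain dict: setdefault returns the existing list or stores a new one
plain = {}
for key, value in pairs:
    plain.setdefault(key, []).append(value)

# defaultdict: a missing key is filled in by calling list()
auto = defaultdict(list)
for key, value in pairs:
    auto[key].append(value)

same = plain == dict(auto)

missing_before = "Other" in auto
_ = auto["Other"]  # reading a missing key inserts an empty list
missing_after = "Other" in auto
print(same, missing_before, missing_after)
```

If stray empty entries would be a problem, use `key in auto` or auto.get(key) for lookups, or convert to a plain dict once the grouping is done.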
So you need to know two things at any given time while looping over the file:
1) Are we on a title level or content level (by indentation) and
2) What is the current title
In the following code, we first check whether the current line is a title (that is, it does not start with a space). If so, we set currentTitle to it and insert it into our dictionary as a key with an empty list as its value.
If it is not a title, we just append to corresponding title's list.
with open('49359186.txt', 'r') as infile:  # infile avoids shadowing the built-in input
    topics = {}
    currentTitle = ''
    for line in infile:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines
        if not line.startswith(' '):
            currentTitle = line
            topics[currentTitle] = []
        else:
            topics[currentTitle].append(line)
print(topics)
Try this:
d = {}
key = None
with open('/home/asha/Desktop/test.txt','r') as file:
    for line in file:
        if line.startswith(' '):
            d[key].append(line.strip())
        else:
            key = line.strip()
            d[key] = []
print(d)
Just for the sake of adding in my 2 cents.
This problem is easier to tackle backwards. Consider iterating through your file backwards and then storing the values into a dictionary whenever a header is reached.
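d = {}
l = []
with open('test.txt','r') as f:
    for row in reversed(f.read().splitlines()):
        if not row.strip():
            continue  # skip blank lines (row[0] would fail on them)
        if row.startswith(' '):
            l.append(row)
        else:
            d.update({row: l[::-1]})  # reverse back to file order
            l = []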
Just keep track of the lines that start with ' ' and you are done with only one loop:
final=[]
keys=[]
flag=True
with open('new_text.txt','r') as f:
    data = []
    for line in f:
        if not line.startswith(' '):
            if line.strip():
                keys.append(line.strip())
            flag=False
            if data:
                final.append(data)
            data=[]
            flag=True
        else:
            if flag==True:
                data.append(line.strip())
final.append(data)
print(dict(zip(keys,final)))
output:
{'Example': ['data1', 'data2'], 'Time': ['NtGetTickCount', 'NtQueryPerformanceCounter', 'NtQuerySystemTime', 'NtQueryTimerResolution', 'NtSetSystemTime', 'NtSetTimerResolution', 'RtlTimeFieldsToTime', 'RtlTimeToTime'], 'System informations': ['NtQuerySystemInformation', 'NtSetSystemInformation', 'Enumerations', 'Structures']}
I have code which runs but doesn't save anything to the text file?
def saving_multiple_scores():
    with open(class_number) as file:
        dic = {}
        for line in file:
            key, value = line.strip().split(':')
            dic.setdefault(key, []).append(value)
            file.write(dic)
    with open(class_number, 'a') as file:
        for key, value in dic.items():
            file.write(key + ':' + ','.join(value) + '\n')
    print(dic)
It should check if the name is already in the file, and if so: append a score
and if not then create a new list with the score.
However nothing is saving at all.
Python, IDLE V3.4.2
I am new to this so any help is appreciated
The first with block does nothing because the file is empty, so the for loop has no lines to iterate over.
The default mode for open (see https://docs.python.org/2/library/functions.html#open) is read-only, so file.write(dic) won't work.
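A quick way to see the read-only behaviour for yourself is the sketch below. It writes a one-line file into a temporary directory (the path and contents are made up), reopens it with the default mode, and tries to write to it.

```python
import io
import os
import tempfile

# Create a small throwaway file to experiment with
path = os.path.join(tempfile.mkdtemp(), "scores.txt")
with open(path, "w") as f:
    f.write("alice:1\n")

with open(path) as f:  # no mode given, so the file is opened as 'r'
    try:
        f.write("bob:2\n")
        outcome = "written"
    except io.UnsupportedOperation:
        outcome = "not writable"

print(outcome)
```

In the question's code this error never shows up if the write sits inside the for loop, because with an empty file the loop body is never executed.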
Expanding on Pawel’s suggestion, here is what you do to fix your code:
from collections import defaultdict
def saving_multiple_scores():
    with open(class_number, 'r') as f: # don't use file
        data = defaultdict(list)
        for line in f:
            line = line.strip()
            if not line:
                continue # skip over any blank lines in the file
            key, value = line.split(':')
            data[key.strip()].append(value.strip())
    # file.write removed because we don't write in readmode
    with open(class_number, 'a') as f:
        # using 'a' mode will append the score lists
        # to the end of the file
        # to overwrite the file completely, use 'w'
        for key, value in data.items():
            line = '%s:%s\n' % (key, ','.join(value),)
            f.write(line)
            print(line, end='')  # Python 3 print; the line already ends in '\n'
Sample input file:
alice:1
alice:2
alice:3
bob:1
alice:4
bob:2
Sample output file:
alice:1
alice:2
alice:3
bob:1
alice:4
bob:2
bob:1,2
alice:1,2,3,4
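As the sample output shows, appending leaves the old single-score lines in place. If you would rather end up with one line per name, a possible variation is to rewrite the file with 'w' after reading it. This is only a sketch: merge_scores is a made-up name, the file lives in a temporary directory, and it assumes every non-blank line has the form name:score or name:score,score.

```python
import os
import tempfile
from collections import defaultdict

def merge_scores(path):
    """Rewrite `path` so each name appears once with all of its scores."""
    scores = defaultdict(list)
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            name, _, values = line.partition(":")
            # a line may already hold several comma-separated scores
            scores[name.strip()].extend(v.strip() for v in values.split(","))
    with open(path, "w") as f:  # 'w' truncates, so the old lines are replaced
        for name, values in scores.items():
            f.write("%s:%s\n" % (name, ",".join(values)))
    return dict(scores)

# Made-up sample file
path = os.path.join(tempfile.mkdtemp(), "class1.txt")
with open(path, "w") as f:
    f.write("alice:1\nalice:2\nbob:1\nalice:4\nbob:2\n")

merged = merge_scores(path)
with open(path) as f:
    contents = f.read()
print(contents)
```

Reading everything first and only then reopening with 'w' matters here, because opening with 'w' empties the file immediately.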
The code below is supposed to look up the first column (the key) of a file fr in a file dict_file and replace that first column with the value found for the key. It also keeps dict_file as an updated dictionary for future lookups.
Every time the code is run, it initializes a dictionary from that dict_file file. If it finds a new email address from another file, it adds it to the bottom of the dict_file.
As I understand it, this should work: if the key does not contain a # symbol, looking_for is assigned the value "Dummy#dummy.com", and Dummy#dummy.com should be appended to the bottom of dict_file.
But for some reason, new lines and blank lines keep getting appended, along with other new emails, at the end of dict_file. I can't have blanks and newlines written to the end of dict_file.
Why is this happening? What's wrong in the code below? My brain is about to explode! Any help will be greatly appreciated!
#!/usr/bin/python
import sys
d = {}
line_list=[]
alist=[]
f = open(sys.argv[3], 'r') # Map file
for line in f:
    alist = line.split()
    key = alist[0]
    value = alist[1]
    d[str(key)] = str(value)
    alist=[]
f.close()
fr = open(sys.argv[1], 'r') # source file
fw = open(sys.argv[2]+"/masked_"+sys.argv[1], 'w') # target file
for line in fr:
    columns = line.split("|")
    looking_for = columns[0] # this is what we need to search
    if looking_for in d:
        # by default, iterating over a dictionary will return keys
        if not looking_for.find("#"):
            looking_for == "Dummy#dummy.com"
            new_line = d[looking_for]+'|'+'|'.join(columns[1:])
            line_list.append(new_line)
        else:
            new_line = d[looking_for]+'|'+'|'.join(columns[1:])
            line_list.append(new_line)
    else:
        new_idx = str(len(d)+1)
        d[looking_for] = new_idx
        kv = open(sys.argv[3], 'a')
        kv.write("\n"+looking_for+" "+new_idx)
        kv.close()
        new_line = d[looking_for]+'|'+'|'.join(columns[1:])
        line_list.append(new_line)
fw.writelines(line_list)
Here is the dict_file:
WHATEmail#SIMPLE.COM 223
SamHugan#CR.COM 224
SAMASHER#CATSTATIN.COM 225
FAKEEMAIL#SLOW.com 226
SUPERMANN#MYMY.COM 227
Here is the fr file that gets the first column turned into the id from the dict_file lookup:
WHATEmail#SIMPLE.COM|12|1|GDSP
FAKEEMAIL#SLOW.com|13|7|GDFP
MICKY#FAT.COM|12|1|GDOP
SUPERMANN#MYMY.COM|132|1|GUIP
MONITOR|132|1|GUIP
|132|1|GUIP
00 |12|34|GUILIGAN
Firstly, you need to ignore blanks in your initial dictionary read, otherwise you will get an index out of range error when you run this script again. Do the same when you read via the fr object to avoid entering nulls. Wrap your email check condition further out for greater scope. Do a simple check for the "#" using the find method. And you're good to go.
Try the below. This should work:
#!/usr/bin/python
import sys
d = {}
line_list=[]
alist=[]
f = open(sys.argv[3], 'r') # Persisted Dictionary File
for line in f:
    line = line.strip()
    if line =="":
        continue
    alist = line.split()
    key = alist[0]
    value = alist[1]
    d[str(key)] = str(value)
    alist=[]
f.close()
fr = open(sys.argv[1], 'r') # source file
fw = open(sys.argv[2]+"/masked_"+sys.argv[1], 'w') # Target Directory Location
for line in fr:
    line = line.strip()
    if line == "":
        continue
    columns = line.split('|')
    if columns[0].find("#") > 1:
        looking_for = columns[0] # this is what we need to search
    else:
        looking_for = "Dummy#dummy.com"
    if looking_for in d:
        # key already known: reuse its stored index
        # strip() removed the newline above, so add it back for writelines()
        new_line = d[looking_for]+'|'+'|'.join(columns[1:])+'\n'
        line_list.append(new_line)
    else:
        new_idx = str(len(d)+1)
        d[looking_for] = new_idx
        kv = open(sys.argv[3], 'a')
        kv.write(looking_for+" "+new_idx+'\n')
        kv.close()
        new_line = d[looking_for]+'|'+'|'.join(columns[1:])+'\n'
        line_list.append(new_line)
fw.writelines(line_list)
fr.close()
fw.close()
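As a side note on why the check in the question never did what was intended: str.find returns an index (or -1 when nothing is found), so `not looking_for.find("#")` is only true when the "#" sits at position 0, and `looking_for == ...` compares instead of assigning. The sketch below shows both points with the `in` operator as a simpler test; has_marker is a made-up helper name.

```python
def has_marker(text, marker="#"):
    """True when `marker` occurs anywhere in `text`."""
    return marker in text

# str.find returns an index, or -1 when nothing is found
not_found = "MONITOR".find("#")  # -1, and `not -1` is False
at_start = "#abc".find("#")      # 0, and `not 0` is True

looking_for = "MONITOR"
if not has_marker(looking_for):
    looking_for = "Dummy#dummy.com"  # `=` assigns; `==` would only compare

print(not_found, at_start, looking_for)
```

Note that `in` accepts a "#" in any position, whereas the `find("#") > 1` test in the answer also requires at least two characters before it.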