Here is the text file1 content
name = test1,description=None,releaseDate="2020-02-27"
name = test2,description=None,releaseDate="2020-02-28"
name = test3,description=None,releaseDate="2020-02-29"
I want a nested dictionary like this. How to create this?
{ 'test1': {'description':'None','releaseDate':'2020-02-27'},
'test2': {'description':'None','releaseDate':'2020-02-28'},
'test3': {'description':'None','releaseDate':'2020-02-29'}}
After this, I want to pass these values to the following call in a "for" loop over a list of projects.
Example: for project="IJTP2", I want to go through each name in the dictionary, like below:
project.create(name="test1", project="IJTP2", description=None, releaseDate="2020-02-27")
project.create(name="test2", project="IJTP2", description=None, releaseDate="2020-02-28")
project.create(name="test3", project="IJTP2", description=None, releaseDate="2020-02-29")
Then the same again for the next project.
The list of projects is stored in another file, as below:
IJTP1
IJTP2
IJTP3
IJTP4
I just started working with Python and have never worked with nested dictionaries.
I assume that:
each file line has comma-separated columns
each column has only one = and key on its left, value on its right
only the first column is special (name)
Of course, as @Alex Hall mentioned, I would recommend JSON or CSV too.
Anyway, here is code for your case.
d = {}
with open('test-200229.txt') as f:
    for line in f:
        (_, name), *rest = (
            tuple(value.strip() for value in column.split('='))
            for column in line.split(',')
        )
        d[name] = dict(rest)
print(d)
output:
{'test1': {'description': 'None', 'releaseDate': '"2020-02-27"'}, 'test2': {'description': 'None', 'releaseDate': '"2020-02-28"'}, 'test3': {'description': 'None', 'releaseDate': '"2020-02-29"'}}
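The output above still carries the double quotes around each date, and the question also asks for a loop over several projects. Here is a minimal sketch of both steps, assuming the same line format as in the question. The helper names parse_line and build_calls are hypothetical, and since the API behind project.create is not shown in the question, the sketch only builds the keyword arguments instead of calling it.

```python
def parse_line(line):
    """Split 'name = test1,description=None,...' into (name, {field: value})."""
    pairs = []
    for column in line.strip().split(','):
        key, _, value = column.partition('=')
        # Drop surrounding whitespace and any double quotes around the value.
        pairs.append((key.strip(), value.strip().strip('"')))
    (_, name), *rest = pairs
    return name, dict(rest)


def build_calls(lines, projects):
    """Return one kwargs dict per (project, name) pair, projects first."""
    versions = dict(parse_line(line) for line in lines if line.strip())
    return [
        {'name': name, 'project': project, **fields}
        for project in projects
        for name, fields in versions.items()
    ]


sample_lines = [
    'name = test1,description=None,releaseDate="2020-02-27"',
    'name = test2,description=None,releaseDate="2020-02-28"',
]
calls = build_calls(sample_lines, ['IJTP1', 'IJTP2'])
for kwargs in calls:
    print(kwargs)  # with a real client this could become project.create(**kwargs)
```

Note that description stays the string 'None' here, not the None object; convert it yourself if the receiving API expects a real None.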
So I'm new to Python and I'm having difficulty understanding how to manipulate files and such. Currently I've been trying to assign the lines in my file into a list by splitting it at commas. I'm using this code:
with open('grades.txt','r') as f:
    data=f.read()
    data=data.split(',')
print(data)
The problem I have now is the output is this:
['22223333', ' Michael Gill', ' 49\n23232323', ' Nicholas Smith', ' 62\n18493214', ' Kerri Morgan', ' 75\n00015542', ' Donald Knuth', ' 90\n00000001', ' Alan Turing', ' 100']
My question is: how do I remove the \n from my output, and how would I go about splitting the values joined by the \n (for example, I would like '49\n23232323' to be split into '49', '23232323')? It is my understanding (which is not a lot) that you can't split a list, nor can you give two separators when splitting a file, so how would I split the file by both commas and '\n'?
The ideal output would be:
['22223333', 'Michael Gill', '49', '23232323', 'Nicholas Smith', '62', '18493214', 'Kerri Morgan', '75', '00015542', 'Donald Knuth', '90', '00000001', 'Alan Turing', '100']
The grades.txt file consists of:
22223333, Michael Gill, 49
23232323, Nicholas Smith, 62
18493214, Kerri Morgan, 75
00015542, Donald Knuth, 90
00000001, Alan Turing, 100
Also, is it possible to split only certain lines/words in a file into a list? (i.e. a file containing (1,2,3,4,a,b,c,d,5,4,3,d,r) and splitting the numbers into one list and the letters into another?)
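I'd do something like this:
with open('grades.txt','r') as f:
    data=f.read()
    data=data.replace("\n", ",").split(',')
print(data)
This replaces every \n with a comma before splitting. The entries keep their leading spaces (' Michael Gill'), and a trailing newline in the file leaves an empty string at the end, so call .strip() on each entry and drop empty ones if you need a clean list.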
If you want the numbers in one list and the words in another, create two lists and sort the elements into them using the string method .isdigit(), like this:
words = []
numbers = []
for element in data:
    if element.replace(" ", "").isdigit():
        numbers.append(element)
    else:
        words.append(element)
Another way to do it is using try and except:
for element in data:
    try:
        int(element.replace(" ", ""))
        numbers.append(element)
    except ValueError:
        words.append(element)
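As an alternative to replacing newlines first, the split on commas and newlines can be done in one step with re.split. This is only a sketch on an inline sample string (standing in for f.read()), and it assumes that no field contains a comma of its own.

```python
import re

# Stand-in for the text returned by f.read() on grades.txt.
text = "22223333, Michael Gill, 49\n23232323, Nicholas Smith, 62\n"

# Split on a comma or a newline, then drop padding and empty leftovers.
fields = [part.strip() for part in re.split(r'[,\n]', text) if part.strip()]

# Partition the cleaned fields into digit-only strings and everything else.
numbers = [item for item in fields if item.isdigit()]
words = [item for item in fields if not item.isdigit()]

print(fields)
print(numbers)
print(words)
```

Because the fields are stripped before the check, isdigit() can be applied directly without the replace(" ", "") step.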
As someone mentioned in the comments, perhaps the better approach would be to use the csv module. But that requires you to learn/understand Python dictionaries - however dictionaries are a great data structure and very useful in many cases.
from csv import DictReader as dr
data_from_file = []
with open('my_file.csv', newline='') as fh:
    my_reader = dr(fh)
    column_headings = my_reader.fieldnames
    for row in my_reader:
        data_from_file.append(row)
The result is a list of dictionaries. Each item in the list corresponds to a row in the original file. Instead of each row being an anonymous sequence of values, every value is labelled with its column heading: assuming you have the column headings id, name and age in your original file, the results would look like
[{'id': '22223333', 'name': 'Michael Gill', 'age': '49'}, ...]
The column_headings object is a list of the original column headings from the file, in case you want to manipulate or explore those. Of course, the next question is how to save your data as a CSV file. There are a number of Q&As here on how to use the DictWriter class.
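The grades.txt file in the question has no header row, so DictReader needs the column names passed in. A small sketch under that assumption follows; the names id, name and grade are made up for illustration, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Stand-in for open('grades.txt', newline=''); the file has no header row.
sample = io.StringIO(
    "22223333, Michael Gill, 49\n"
    "23232323, Nicholas Smith, 62\n"
)

# Column names are assumptions; skipinitialspace drops the space after each comma.
reader = csv.DictReader(sample, fieldnames=['id', 'name', 'grade'],
                        skipinitialspace=True)
records = list(reader)
for record in records:
    print(record['id'], record['name'], record['grade'])
```

All values come back as strings, so convert the grade with int() if you need to do arithmetic on it.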
You can do it in this way as well
list1 = ['22223333', ' Michael Gill', ' 49\n23232323', ' Nicholas Smith', ' 62\n18493214', ' Kerri Morgan', ' 75\n00015542', ' Donald Knuth', ' 90\n00000001', ' Alan Turing', ' 100']
list2 = []
for x in range(len(list1)):
    list1[x] = list1[x].split('\n')
list2 = sum(list1, [])
print(list2)
output would be
['22223333', ' Michael Gill', ' 49', '23232323', ' Nicholas Smith', ' 62', '18493214', ' Kerri Morgan', ' 75', '00015542', ' Donald Knuth', ' 90', '00000001', ' Alan Turing', ' 100']
You could use itertools.chain as follows:
from itertools import chain
with open('grades.txt', 'r') as f:
    data = list(chain.from_iterable(
        [field.strip() for field in line.split(',')] for line in f))
print(data)
This would display data as:
['22223333', 'Michael Gill', '49', '23232323', 'Nicholas Smith', '62', '18493214', 'Kerri Morgan', '75', '00015542', 'Donald Knuth', '90', '00000001', 'Alan Turing', '100']
Iterating over the file yields one line at a time, each still ending in its newline. Each line is split on commas into a list of entries, strip() removes the surrounding spaces and the trailing newline from each entry, and chain.from_iterable() flattens the per-line lists into a single list.
I suspect those newlines are there to separate rows, and you would be better off handling one row at a time:
with open('grades.txt', 'r') as f:
    for row in f.readlines():
        data = row.split(',')
        print(data)
If you want a single, long list, you can get that instead by concatenating the results of each split.
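To make the row-by-row idea concrete, here is a sketch that keeps each row as a tuple and converts the grade to an integer. The sample lines stand in for the file, and the three-column layout (ID, name, grade) is taken from the question's grades.txt.

```python
# Stand-in for the lines of grades.txt.
sample_lines = [
    "22223333, Michael Gill, 49\n",
    "00000001, Alan Turing, 100\n",
]

rows = []
for line in sample_lines:
    student_id, name, grade = (field.strip() for field in line.split(','))
    # Keep the ID as text because it can have leading zeros.
    rows.append((student_id, name, int(grade)))

# One long list, if that is what you need.
flat = [value for row in rows for value in row]
print(rows)
print(flat)
```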
I have a text file like this.
1 firm A Manhattan (company name) 25,000
SK Ventures 25,000
AEA investors 10,000
2 firm B Tencent collaboration 16,000
id TechVentures 4,000
3 firm C xxx 625
(and so on)
I want to make a matrix form and put each item into the matrix.
For example, the first row of matrix would be like:
[[1,Firm A,Manhattan,25,000],['','',SK Ventures,25,000],['','',AEA investors,10,000]]
or,
[[1,'',''],[Firm A,'',''],[Manhattan,SK Ventures,AEA Investors],[25,000,25,000,10,000]]
For doing so, I wanna parse texts from each line of the text file. For example, from the first line, I can create [1,firm A, Manhattan, 25,000]. However, I can't figure out how exactly to do it. Every text starts at the same position, but ends at different positions. Is there any good way to do this?
Thank you.
Well if you know all of the start positions:
# 0123456789012345678901234567890123456789012345678901234567890
# 1 firm A Manhattan (company name) 25,000
# SK Ventures 25,000
# AEA investors 10,000
# 2 firm B Tencent collaboration 16,000
# id TechVentures 4,000
# 3 firm C xxx 625
# Field #1 is 8 wide (0 -> 7)
# Field #2 is 15 wide (8 -> 22)
# Field #3 is 19 wide (23 -> 41)
# Field #4 is arbitrarily wide (42 -> end of line)
field_lengths = [8, 15, 19]
data = []
with open('/path/to/file', 'r') as f:
    for row in f:
        # Only remove the newline: leading spaces are part of the layout.
        row = row.rstrip('\n')
        pieces = []
        for x in field_lengths:
            piece = row[:x].strip()
            pieces.append(piece)
            row = row[x:]
        pieces.append(row.strip())
        data.append(pieces)
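The same slicing idea can be written with start offsets instead of widths. The sketch below is self-contained; the offsets 0, 8, 23 and 51 are assumptions chosen for the sample rows it builds, not the real layout of your file, and split_fixed is a hypothetical helper name.

```python
# Assumed layout for illustration only: columns start at 0, 8, 23 and 51.
STARTS = [0, 8, 23, 51]


def split_fixed(line, starts=STARTS):
    """Cut a line at the given start offsets and strip each piece."""
    ends = starts[1:] + [None]  # the last field runs to the end of the line
    return [line[a:b].strip() for a, b in zip(starts, ends)]


# Build sample rows with format widths matching STARTS (8, 15, 28, then the rest).
layout = "{:<8}{:<15}{:<28}{}"
sample = [
    layout.format("1", "firm A", "Manhattan (company name)", "25,000"),
    layout.format("", "", "SK Ventures", "25,000"),
    layout.format("2", "firm B", "Tencent collaboration", "16,000"),
]

table = [split_fixed(line) for line in sample]
for row in table:
    print(row)
```

Continuation lines keep their empty first two columns, which matches the ['', '', ...] shape asked for in the question.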
From what you've given as data*, the layout differs depending on whether a line starts with a number or a space, and the data can be separated as
(numbers)(spaces)(letters with 1 space)(spaces)(letters with 1 space)(spaces)(numbers+commas)
or
(spaces)(letters with 1 space)(spaces)(numbers+commas)
That's what the two regexes below look for, and they build a dictionary with indexes from the leading numbers, each having a firm name and a list of company and value pairs.
I can't really tell what your matrix arrangement is.
import pprint
import re

data = {}
with open('data.txt') as f:
    for line in f:
        if re.match(r'^\d', line):
            matches = re.findall(r'^(\d+)\s+((\S\s|\s\S|\S)+)\s\s+((\S\s|\s\S|\S)+)\s\s+([0-9,]+)', line)
            idx, firm, x, company, y, value = matches[0]
            data[idx] = {}
            data[idx]['firm'] = firm.strip()
            data[idx]['company'] = [(company.strip(), value)]
        else:
            matches = re.findall(r'\s+((\S\s|\s\S|\S)+)\s\s+([0-9,]+)', line)
            company, x, value = matches[0]
            data[idx]['company'].append((company.strip(), value))
pprint.pprint(data)
->
{'1': {'company': [('Manhattan (company name)', '25,000'),
                   ('SK Ventures', '25,000'),
                   ('AEA investors', '10,000')],
       'firm': 'firm A'},
 '2': {'company': [('Tencent collaboration', '16,000'),
                   ('id TechVentures', '4,000')],
       'firm': 'firm B'},
 '3': {'company': [('xxx', '625')],
       'firm': 'firm C'}
}
* This works on your example, but it may not work on your real data very well. YMMV.
If I understand you correctly (although I'm not totally sure I do), this will produce the output I think you're looking for.
import re
with open('data.txt', 'r') as f:
    f_txt = f.read()  # Read the whole file into one string
    f_lines = re.split(r'\n(?=\d)', f_txt)
    matrix = []
    for line in f_lines:
        inner1 = line.split('\n')
        inner2 = [re.split(r'\s{2,}', l) for l in inner1]
        matrix.append(inner2)
print(matrix)
print('')
for row in matrix:
    print(row)
Output of the program:
[[['1', 'firm A', 'Manhattan (company name)', '25,000'], ['', 'SK Ventures', '25,000'], ['', 'AEA investors', '10,000']], [['2', 'firm B', 'Tencent collaboration', '16,000'], ['', 'id TechVentures', '4,000']], [['3', 'firm C', 'xxx', '625']]]
[['1', 'firm A', 'Manhattan (company name)', '25,000'], ['', 'SK Ventures', '25,000'], ['', 'AEA investors', '10,000']]
[['2', 'firm B', 'Tencent collaboration', '16,000'], ['', 'id TechVentures', '4,000']]
[['3', 'firm C', 'xxx', '625']]
I am basing this on the fact that you wanted the first row of your matrix to be:
[[1,Firm A,Manhattan,25,000],['',SK Ventures,25,000],['',AEA investors,10,000]]
However, to achieve this with more rows, we then get a list that is nested 3 levels deep, as the output of print(matrix) shows. This can be a little unwieldy to use, which is why TessellatingHeckler's answer uses a dictionary to store the data, which I think is a much better way to access what you need. But if a list of lists of "matrices" is what you're after, then the code I wrote above does that.
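If a flat table with four columns per row is preferred over the three-level nesting, continuation lines can be padded at the front. This is a sketch on an inline sample, and it assumes that fields are separated by at least two spaces while single spaces only occur inside a field.

```python
import re

# Inline stand-in for the file contents; fields are separated by 2+ spaces.
text = (
    "1  firm A  Manhattan (company name)  25,000\n"
    "   SK Ventures  25,000\n"
    "2  firm B  Tencent collaboration  16,000\n"
)

rows = []
for line in text.splitlines():
    fields = re.split(r'\s{2,}', line.strip())
    # Continuation lines carry only company and amount: pad the front.
    padding = [''] * (4 - len(fields))
    rows.append(padding + fields)

for row in rows:
    print(row)
```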
I have a list of dictionaries which I build from .xml file:
list_1=[{'lat': '00.6849879', 'phone': '+3002201600', 'amenity': 'restaurant', 'lon': '00.2855850', 'name': 'Telegraf'},{'lat': '00.6850230', 'addr:housenumber': '6', 'lon': '00.2844493', 'addr:city': 'XXX', 'addr:street': 'YYY.'},{'lat': '00.6860304', 'crossing': 'traffic_signals', 'lon': '00.2861978', 'highway': 'crossing'}]
My aim is to build a text file with values (not keys) in such order:
lat,lon,'addr:street','addr:housenumber','addr:city','amenity','crossing' etc...
00.6849879,00.2855850, , , ,restaurant, ,'\n'00.6850230,00.2844493,YYY,6,XXX, , ,'\n'00.6860304,00.2861978, , , , ,traffic_signals,'\n'
If a value does not exist, there should be an empty space.
I tried looping with a for loop:
for i in list_1:
    line = i['lat'], i['lon']
    print(line)
A problem occurs if I add a key which does not exist in some of the dictionaries:
for i in list_1:
    line = i['lat'], i['lon'], i['phone']
    print(line)
I also tried to loop and use the map() function, but the results do not seem correct:
for i in list_1:
    line = list(map(lambda x1, x2: x1 + ',' + x2 + '\n', i['lat'], i['lon']))
    print(line)
I also tried:
for i in list_1:
    for k, v in i.items():
        if k == 'addr:housenumber':
            print(v)
This time I think there might be too many if/else conditions to write.
It seems like the solution is somewhere close, but I can't figure out what it is or the optimal way to do it.
I would look to use the csv module, in particular DictWriter. The fieldnames dictate the order in which the dictionary information is written out. Actually writing the header is optional. Note that DictWriter raises a ValueError for dictionary keys that are not in fieldnames (such as 'phone' here) unless you pass extrasaction='ignore':
import csv
fields = ['lat','lon','addr:street','addr:housenumber','addr:city','amenity','crossing',...]
with open('<file>', 'w', newline='') as f:
    writer = csv.DictWriter(f, fields, extrasaction='ignore')
    #writer.writeheader()   # If you want a header
    writer.writerows(list_1)
If you really don't want to use the csv module, then you can simply iterate over the list of fields you want, in the order you want them:
fields = ['lat','lon','addr:street','addr:housenumber','addr:city','amenity','crossing',...]
for row in list_1:
    print(','.join(row.get(field, '') for field in fields))
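For reference, here is a runnable sketch of the DictWriter approach on two of the question's dictionaries. It writes to an io.StringIO buffer instead of a real file, and the shortened field list is only for illustration. restval fills in missing keys and extrasaction='ignore' skips keys that are not listed.

```python
import csv
import io

list_1 = [
    {'lat': '00.6849879', 'phone': '+3002201600', 'amenity': 'restaurant',
     'lon': '00.2855850', 'name': 'Telegraf'},
    {'lat': '00.6850230', 'addr:housenumber': '6', 'lon': '00.2844493',
     'addr:city': 'XXX', 'addr:street': 'YYY.'},
]
fields = ['lat', 'lon', 'addr:street', 'addr:housenumber', 'addr:city', 'amenity']

buffer = io.StringIO()  # stand-in for open('out.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=fields, restval='',
                        extrasaction='ignore', lineterminator='\n')
writer.writeheader()
writer.writerows(list_1)
output = buffer.getvalue()
print(output)
```

Missing values come out as empty fields between the commas, which most CSV consumers read back as empty strings.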
If you can't or don't want to use csv you can do something like
order = ['lat','lon','addr:street','addr:housenumber',
         'addr:city','amenity','crossing']
with open('output.txt', 'w') as f:  # the file name is a placeholder
    for entry in list_1:
        f.write(", ".join([entry.get(x, "") for x in order]) + "\n")
This builds a list with the values from the entry dictionary in the order given by the order list, defaulting to "" if a key is not present in the dictionary.
If your output is a csv file, I strongly recommend using the csv module, because it also escapes values correctly and handles other CSV-specific details that are easy to overlook.
Thanks guys
I found a solution. Maybe it is not so elegant, but it works.
I made a list of the node keys, look each of them up in every dictionary of the other list, and collect the values.
key_list=['lat','lon','addr:street','addr:housenumber','amenity','source','name','operator']
node_list=[{'lat': '00.6849879', 'phone': '+3002201600', 'amenity': 'restaurant', 'lon': '00.2855850', 'name': 'Telegraf'},{'lat': '00.6850230', 'addr:housenumber': '6', 'lon': '00.2844493', 'addr:city': 'XXX', 'addr:street': 'YYY.'},{'lat': '00.6860304', 'crossing': 'traffic_signals', 'lon': '00.2861978', 'highway': 'crossing'}]
Solution:
final_list=[]
for i in node_list:
    line=str()
    for ii in key_list:
        if ii in i:
            line=line+str(i[ii])+','
        else:
            line=line+' '+','
    final_list.append(line)
I have a list of dictionaries such as:
values = [{'Name': 'John Doe', 'Age': 26, 'ID': '1279abc'},
{'Name': 'Jane Smith', 'Age': 35, 'ID': 'bca9721'}
]
What I'd like to do is print this list of dictionaries to a tab delimited text file to look something like this:
Name Age ID
John Doe 26 1279abc
Jane Smith 35 bca9721
However, I am unable to wrap my head around simply printing the values. As of right now I'm printing the entire dictionary per row via:
for i in values:
    f.write(str(i))
    f.write("\n")
Perhaps I need to iterate through each dictionary now? I've seen people use something like:
for i, n in iterable:
    pass
But I've never understood this. Anyone able to shed some light into this?
EDIT:
It appears that I could use something like this, unless someone has a more pythonic way (perhaps someone can explain "for i, n in iterable"?):
for dic in values:
    for entry in dic:
        f.write(dic[entry])
This is simple enough to accomplish with a DictWriter. Its purpose is to write delimiter-separated data, and if we set the delimiter to a tab instead of the default comma, we can make this work just fine.
from csv import DictWriter
values = [{'Name': 'John Doe', 'Age': 26, 'ID': '1279abc'},
          {'Name': 'Jane Smith', 'Age': 35, 'ID': 'bca9721'}]
keys = values[0].keys()
with open("new-file.tsv", "w", newline="") as f:
    dict_writer = DictWriter(f, keys, delimiter="\t")
    dict_writer.writeheader()
    for value in values:
        dict_writer.writerow(value)
# f is a file object opened for writing
f.write('Name\tAge\tID\n')
for value in values:
    f.write('\t'.join([value.get('Name'), str(value.get('Age')), value.get('ID')]) + '\n')
You're probably thinking of the items() method. It returns the key and value for each entry in a dictionary, so call it on each dictionary in your list. http://www.tutorialspoint.com/python/dictionary_items.htm
for dic in values:
    for k, v in dic.items():
        pass
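To show what "for k, v in ..." does: items() yields (key, value) tuples, and the two loop variables unpack each tuple. A tiny sketch using one dictionary from the question:

```python
values = [{'Name': 'John Doe', 'Age': 26, 'ID': '1279abc'}]

person = values[0]
pairs = []
for key, value in person.items():  # each item is a (key, value) tuple
    pairs.append(f"{key}={value}")

print(pairs)
```

The same unpacking works for any iterable of two-element tuples, for example enumerate(some_list), which yields (index, element) pairs.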
# assuming your list of dictionaries is in values
import csv
with open('out.txt', 'w', newline='') as fout:
    writer = csv.DictWriter(fout, fieldnames=list(values[0].keys()), delimiter="\t")
    writer.writeheader()
    writer.writerows(values)