Why does the resulting .CSV file have interspersed blank lines? - python

Why does the resulting .CSV file have interspersed blank lines?
# https://www.tutorialspoint.com/How-to-save-a-Python-Dictionary-to-CSV-file
import csv
csv_columns = ['No','Name','Country'] # These are dictionary keys, they map to columns in the CSV file
dict_data = [
    {'No': 1, 'Name': 'Alex', 'Country': 'India'},
    {'No': 2, 'Name': 'Ben', 'Country': 'USA'},
    {'No': 3, 'Name': 'Shri Ram', 'Country': 'India'},
    {'No': 4, 'Name': 'Smith', 'Country': 'USA'},
    {'No': 5, 'Name': 'Yuva Raj', 'Country': 'India'},
]
csv_file = "NamesExample.csv"  # relative path
try:
    with open(csv_file, 'w') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=csv_columns)
        writer.writeheader()
        for data in dict_data:  # One dictionary at a time
            writer.writerow(data)
except IOError:
    print("I/O error")
This is what I get when I run the above code:
No,Name,Country

1,Alex,India

2,Ben,USA

3,Shri Ram,India

4,Smith,USA

5,Yuva Raj,India
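The usual cause, which one of the answers further down also demonstrates, is that the file is opened in text mode on Windows: the csv writer already terminates each row with \r\n, and the text-mode newline translation inserts an extra \r, which shows up as a blank line. A minimal sketch of the fix, reusing the data above, is to pass newline='' to open():
import csv

csv_columns = ['No', 'Name', 'Country']
dict_data = [
    {'No': 1, 'Name': 'Alex', 'Country': 'India'},
    # ... same rows as above ...
]
csv_file = "NamesExample.csv"
try:
    # newline='' lets the csv module control the line endings itself,
    # so no blank lines are inserted between rows on Windows
    with open(csv_file, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=csv_columns)
        writer.writeheader()
        writer.writerows(dict_data)
except IOError:
    print("I/O error")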

Related

Replace single quotes with doubles to turn contents of a file into a nested JSON and normalize it afterwards

I have 70k files all of which look similar to this:
{'id': 24, 'name': None, 'city': 'City', 'region_id': 19,
'story_id': 1, 'description': 'text', 'uik': None, 'ustatus': 'status',
'wuiki_tik_name': '', 'reaction': None, 'reaction_official': '',
'created_at': '2011-09-07T07:24:44.420Z', 'lat': 54.7, 'lng': 20.5,
'regions': {'id': 19, 'name': 'name'}, 'stories': {'id': 1, 'name': '2011-12-04'}, 'assets': [], 'taggings': [{'tags': {'id': 6, 'name': 'name',
'tag_groups': {'id': 3, 'name': 'Violation'}}},
{'tags': {'id': 8, 'name': 'name', 'tag_groups': {'id': 5, 'name': 'resource'}}},
{'tags': {'id': 1, 'name': '01. Federal', 'tag_groups': {'id': 1, 'name': 'Level'}}},
{'tags': {'id': 3, 'name': '03. Local', 'tag_groups': {'id': 1, 'name': 'stuff'}}},
{'tags': {'id': 2, 'name': '02. Regional', 'tag_groups':
{'id': 1, 'name': 'Level'}}}], 'message_id': None, '_count': {'assets': 0, 'other_messages': 0, 'similars': 0, 'taggings': 5}}
The ultimate goal is to export everything into a single CSV file. That can be done without flattening, but since the data has a lot of nested values I would like to flatten it, and this is where I started running into data-type problems. Here's the code:
import json
from pandas.io.json import json_normalize
import glob

path = glob.glob("all_messages/*.json")
for file in path:
    with open(file, "r") as filer:
        content = json.loads(json.dumps(filer.read()))
    if content != 404:
        df_main = json_normalize(content)
        df_regions = json_normalize(content, record_path=['regions'], record_prefix='regions.', meta=['id'])
        df_stories = json_normalize(content, record_path=['stories'], record_prefix='stories.', meta=['id'])
        # ... More code related to normalization
        df_out.to_csv('combined_json.csv')
This code occasionally throws:
AttributeError: 'str' object has no attribute 'values' or ValueError: DataFrame constructor not properly called!
I realise that this is caused by json.dumps() producing a JSON string rather than a parsed object. However, I have failed to turn it into anything usable.
Any possible solutions to this?
If you only need to change ' to ":
...
for file in path:
    with open(file, "r") as filer:
        content = filer.read().replace("'", '"')
...
Making copies and using grep would be easier
While it is not the solution I was initially expecting, this approach worked as well. I kept getting error messages related to the structure of the dict literals that were reluctant to become json, so I took the csv file that I wanted to normalise and worked with each column one by one:
import json
import pandas as pd

df = pd.read_csv("combined_json.csv")
df['regions'] = df['regions'].apply(lambda x: x.replace("'", '"'))
regions = pd.json_normalize(df['regions'].apply(json.loads).tolist()).rename(
    columns=lambda x: x.replace('regions.', ''))
df['regions'] = regions['name']
Or, if it had more nested levels:
df['taggings'] = df['taggings'].apply(lambda x: x.replace("'", '"'))
taggings = pd.concat([pd.json_normalize(json.loads(j)) for j in df['taggings']])
df = df.reset_index(drop=True)
taggings = taggings.reset_index(drop=True)
df[['tags_id', 'nametag', 'group_tag', 'group_tag_name']] = taggings[['tags.id', 'tags.name', 'tags.tag_groups.id', 'tags.tag_groups.name']]
The result was then written out with df.to_csv().
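A caveat on the quote-replacement approach: the source files are Python dict literals, so they can contain None and apostrophes inside strings, and a blanket replace of ' with " can produce invalid JSON. A sketch of an alternative (not used in the answers above) is to parse the files directly with ast.literal_eval and flatten from there:
import ast
import glob
import pandas as pd

records = []
for file in glob.glob("all_messages/*.json"):
    with open(file, "r") as filer:
        # literal_eval understands Python dict syntax, including None and single quotes
        records.append(ast.literal_eval(filer.read()))

# json_normalize flattens nested dicts into dotted column names such as regions.name
df_out = pd.json_normalize(records)
df_out.to_csv('combined_json.csv', index=False)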

How to create csv file without blank rows [duplicate]

This question already has answers here:
CSV file written with Python has blank lines between each row
(11 answers)
Closed last year.
I have the below code which creates a csv file:
import csv
# my data rows as dictionary objects
mydict = [{'branch': 'COE', 'cgpa': '9.0', 'name': 'Nikhil', 'year': '2'},
          {'branch': 'COE', 'cgpa': '9.1', 'name': 'Sanchit', 'year': '2'},
          {'branch': 'IT', 'cgpa': '9.3', 'name': 'Aditya', 'year': '2'},
          {'branch': 'SE', 'cgpa': '9.5', 'name': 'Sagar', 'year': '1'},
          {'branch': 'MCE', 'cgpa': '7.8', 'name': 'Prateek', 'year': '3'},
          {'branch': 'EP', 'cgpa': '9.1', 'name': 'Sahil', 'year': '2'}]
# field names
fields = ['name', 'branch', 'year', 'cgpa']
# name of csv file
filename = "university_records.csv"
# writing to csv file
with open(filename, 'w') as csvfile:
    # creating a csv dict writer object
    writer = csv.DictWriter(csvfile, fieldnames=fields)
    # writing headers (field names)
    writer.writeheader()
    # writing data rows
    writer.writerows(mydict)
Running the above code gives the Excel sheet below. It contains blank rows as well. How can I remove these blank rows? Thanks
You should create a DataFrame from your dict, and then just use
name_of_dataframe.to_csv(name_of_file, sep=your_column_sep)
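For reference, a minimal sketch of that pandas route, reusing mydict, fields and filename from the question:
import pandas as pd

df = pd.DataFrame(mydict)         # columns are taken from the dict keys
df = df[fields]                   # reorder columns to name, branch, year, cgpa
df.to_csv(filename, index=False)  # pandas writes the rows without blank lines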
Adding the newline='' in the with open ... does the trick:
import csv
my_dict = [{'branch': 'COE', 'cgpa': '9.0', 'name': 'Nikhil', 'year': '2'},
           {'branch': 'COE', 'cgpa': '9.1', 'name': 'Sanchit', 'year': '2'},
           {'branch': 'IT', 'cgpa': '9.3', 'name': 'Aditya', 'year': '2'},
           {'branch': 'SE', 'cgpa': '9.5', 'name': 'Sagar', 'year': '1'},
           {'branch': 'MCE', 'cgpa': '7.8', 'name': 'Prateek', 'year': '3'},
           {'branch': 'EP', 'cgpa': '9.1', 'name': 'Sahil', 'year': '2'}]
fields = ['name', 'branch', 'year', 'cgpa']
filename = "foo_bar.csv"
with open(filename, 'w', newline='') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=fields)
    writer.writeheader()
    writer.writerows(my_dict)

I'm looking for help on how to export JSON data from an API call to a CSV

Example of the dictionary returned from the api call. UserList contains the dictionary below.
{1: {'Email': 'JohnDoe#email.com',
'FirstName': 'John',
'Id': {'Value': 1},
'LastName': 'Doe',
'Location': '1',
'UserName': 'JohnDoe'},
2: {'Email': 'JaneDoe#email.com',
'FirstName': 'Jane',
'Id': {'Value': 2},
'LastName': 'Doe',
'Location': '2',
'UserName': 'JaneDoe'},
3: {'Email': 'FredDoe#email.com',
'FirstName': 'Fred',
'Id': {'Value': 1},
'LastName': 'Doe',
'Location': '3',
'UserName': 'FredDoe'}}
Code I'm using to try to export the data. I need to export the data with the keys (UserName, FirstName, LastName) as the headers. The dictionary is saved in the UserList variable.
with open('Test.csv', 'w') as f:
    fieldnames = ['UserName', 'FirstName', 'LastName']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for data in UserList:
        writer.writerow([UserList[data][f] for f in fieldnames])
Below is the error I'm getting...
Traceback (most recent call last):
  File "<input>", line 6, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 155, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 148, in _dict_to_list
    wrong_fields = rowdict.keys() - self.fieldnames
AttributeError: 'list' object has no attribute 'keys'
That's because csv.DictWriter.writerow() expects a dictionary, not the list you are passing it. When you iterate over UserList you get its keys, so in your first iteration you have:
data = 1
UserList[data] = {'Email': 'JohnDoe#email.com', 'FirstName': 'John', 'Id': {'Value': 1}, 'LastName': 'Doe', 'Location': '1', 'UserName': 'JohnDoe'}
Hence... what you need to do is:
with open('Test.csv', 'w') as f:
    fieldnames = ['UserName', 'FirstName', 'LastName']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for data in UserList:
        # since you only want to write a subset of the dictionary, remove the unwanted keys
        del UserList[data]['Id']
        del UserList[data]['Email']
        del UserList[data]['Location']
        writer.writerow(UserList[data])
Hope that helps...
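A lighter-weight variant (not part of the answer above, but using csv.DictWriter's documented extrasaction parameter) avoids mutating UserList by telling the writer to ignore keys that are not in fieldnames:
import csv

fieldnames = ['UserName', 'FirstName', 'LastName']
with open('Test.csv', 'w', newline='') as f:
    # extrasaction='ignore' makes DictWriter skip dict keys not listed in fieldnames
    writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    for key in UserList:
        writer.writerow(UserList[key])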

How to create a list of dictionaries using for loop?

Textfile:
VIP Room, 10, 250
Executive Room,30, 500
Pool Site, 50, 850
Banquet Hall, 200, 1000
Chamber Hall, 500, 2000
Concert Hall, 1000, 3500
My code so far to read the file and create a list:
def readVenueList():
    dic = {}
    venueList = []
    f = open("venue.txt", "r")
    for line in f:
        line = line.split(",")
        print(line)
        for i in line:
            i.split()
            dic["name"] = i[0]
            dic["num"] = i[1]
            dic["cost"] = i[2]
            venueList.append(dic)
    return(venueList)
How do I create a list of dictionaries with the following output?
venueList = [{'cost': '250', 'name': 'VIP Room', 'num': '10'},
{'cost': '500', 'name': 'Executive Room', 'num': '30'},
# and so on and so forth...
]
You can simply use the csv module's reader to handle this.
import csv

headers = ['name', 'num', 'cost']
with open('venue.txt', 'r') as f:
    reader = csv.reader(f)
    needed_list = [{headers[i]: row[i].strip() for i in range(3)} for row in reader]
It is very similar to the earlier answer by #N M
datablob = u"""VIP Room,10,250
Executive Room,30,500
Pool Site,50,850
Banquet Hall,200,1000
Chamber Hall,500,2000
Concert Hall,1000,3500
"""
from csv import reader
from io import StringIO
def readVenueList(fd):
    c = reader(fd)
    hdr = ["name", "num", "cost"]
    for i in c:
        d = {}
        for el, v in enumerate(i):
            d[hdr[el]] = v
        yield d

if __name__ == '__main__':
    # replace with file object
    # file = open("archive1.csv")
    file = StringIO(datablob)
    print(list(readVenueList(file)))
# Output
[{'name': 'VIP Room', 'num': '10', 'cost': '250'},
 {'name': 'Executive Room', 'num': '30', 'cost': '500'},
 {'name': 'Pool Site', 'num': '50', 'cost': '850'},
 {'name': 'Banquet Hall', 'num': '200', 'cost': '1000'},
 {'name': 'Chamber Hall', 'num': '500', 'cost': '2000'},
 {'name': 'Concert Hall', 'num': '1000', 'cost': '3500'}]
If you don't want to use a CSV reader (though that's probably the best idea), you could also do this using list/dictionary comprehensions
with open('venue.txt', 'r') as f:
    lines = (line.split(',') for line in f)
    venues = [
        {'name': name.strip(), 'number': int(num), 'cost': int(cost)}
        for name, num, cost in lines
    ]
Here's how to modify your code to do it properly (and follow the PEP 8 - Style Guide for Python Code recommendations more closely):
from pprint import pprint

def readVenueList():
    venueList = []
    with open("venue.txt", "r") as f:
        for line in f:
            dic = {}
            items = [item.strip() for item in line.split(",")]
            dic["name"] = items[0]
            dic["num"] = items[1]
            dic["cost"] = items[2]
            venueList.append(dic)
    return venueList

venueList = readVenueList()
pprint(venueList)
Output:
[{'cost': '250', 'name': 'VIP Room', 'num': '10'},
{'cost': '500', 'name': 'Executive Room', 'num': '30'},
{'cost': '850', 'name': 'Pool Site', 'num': '50'},
{'cost': '1000', 'name': 'Banquet Hall', 'num': '200'},
{'cost': '2000', 'name': 'Chamber Hall', 'num': '500'},
{'cost': '3500', 'name': 'Concert Hall', 'num': '1000'}]

python 2.x csv writer

When I print these strings
print query, netinfo
I get the output below, which is fine. How do I take these strings and put them into a single row of a CSV file?
8.8.8.8 [{'updated': '2012-02-24T00:00:00', 'handle': 'NET-8-0-0-0-1', 'description': 'Level 3 Communications, Inc.', 'tech_emails': 'ipaddressing#level3.com', 'abuse_emails': 'abuse#level3.com', 'postal_code': '80021', 'address': '1025 Eldorado Blvd.', 'cidr': '8.0.0.0/8', 'city': 'Broomfield', 'name': 'LVLT-ORG-8-8', 'created': '1992-12-01T00:00:00', 'country': 'US', 'state': 'CO', 'range': '8.0.0.0 - 8.255.255.255', 'misc_emails': None}, {'updated': '2014-03-14T00:00:00', 'handle': 'NET-8-8-8-0-1', 'description': 'Google Inc.', 'tech_emails': 'arin-contact#google.com', 'abuse_emails': 'arin-contact#google.com', 'postal_code': '94043', 'address': '1600 Amphitheatre Parkway', 'cidr': '8.8.8.0/24', 'city': 'Mountain View', 'name': 'LVLT-GOGL-8-8-8', 'created': '2014-03-14T00:00:00', 'country': 'US', 'state': 'CA', 'range': None, 'misc_emails': None}]
I have tried cobbling this together but it's all jacked up. I could use some help on how to use the csv module.
writer = csv.writer(open('dict.csv', 'ab'))
for key in query:
    writer.writerow(query)
You can put your variables in a tuple and write them to the csv file:
import csv

with open('ex.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ')
    spamwriter.writerow((query, netinfo))
Note: if you are on Python 3, use the following code:
import csv

with open('ex.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ')
    spamwriter.writerow((query, netinfo))
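The snippets above write the entire netinfo list into a single cell as its repr. If the goal is one CSV row per network record with proper columns, a sketch along these lines (assuming query is a string and netinfo is the list of dicts shown above) may be closer to what is wanted:
import csv

# collect every key that appears in any of the netinfo dicts, plus a column for the query
fieldnames = ['query'] + sorted({k for d in netinfo for k in d})
with open('dict.csv', 'w', newline='') as f:  # on Python 2, use open('dict.csv', 'wb') without newline=''
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for d in netinfo:
        # one row per dict; keys missing from a record are left blank by DictWriter
        writer.writerow(dict(d, query=query))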
