Python CSV Module - Can't hold text format - python

EDIT*
Solution was to wrap text in the column. This will restore the original format.
I am trying to create a CSV using the csv module provided in Python. My issue is that when the CSV is created, the contents of the file inserted into the field lose their format.
Example input can be pulled from 'whois 8.8.8.8'. I want the field to hold the formatting from that input.
Is there a way to maintain the file's original format within the cell?
#!/usr/bin/python
import sys
import csv
file1 = sys.argv[1]
file2 = sys.argv[2]
myfile1 = open(file1, "rb")
myfile2 = open(file2, "rb")
ofile = open('information.csv', "wb")
stuffwriter = csv.writer(ofile, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
stuffwriter.writerow([myfile1.read(),myfile2.read()])
myfile1.close()
myfile2.close()
ofile.close()
Example Input(All In One Cell):
#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
#
# Query terms are ambiguous. The query is assumed to be:
# "n 8.8.8.8"
#
# Use "?" to get help.
#
#
# The following results may also be obtained via:
# http://whois.arin.net/rest/nets;q=8.8.8.8?showDetails=true&showARIN=false&ext=netref2
#
Level 3 Communications, Inc. LVLT-ORG-8-8 (NET-8-0-0-0-1) 8.0.0.0 - 8.255.255.255
Google Incorporated LVLT-GOOGL-1-8-8-8 (NET-8-8-8-0-1) 8.8.8.0 - 8.8.8.255
#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
Would like the cell to hold the format above. Currently when I open in Excel it is all one line.
I am getting my data from executing:
whois 8.8.8.8 > inputData.txt
echo "8.8.8.8 - Google" > inputData2.txt
python CreateCSV inputData2.txt inputData.txt
This is what I would like to see:
http://www.2shared.com/photo/WZwDC7w2/Screen_Shot_2013-06-06_at_1231.html
This is what I'm seeing:
http://www.2shared.com/photo/9dRFGCxh/Screen_Shot_2013-06-06_at_1222.html

Convert .CSV to an .XLSX
In Excel, right-click the column with data that lost its format
Select 'Format Cells...'
Select the 'Alignment' tab
Check 'Wrap Text'
All is good!
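For what it's worth, the csv module itself does not appear to flatten the text: a quoted field keeps its embedded line breaks, and it is Excel's display that shows them on one line until Wrap Text is on. A minimal Python 3 sketch of a round trip, assuming made-up sample text and a temporary file (neither comes from the original script):

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the multi-line whois output.
whois_text = (
    "# ARIN WHOIS data and services\n"
    "# Use \"?\" to get help.\n"
    "Google Incorporated 8.8.8.0 - 8.8.8.255\n"
)
label = "8.8.8.8 - Google"

# Temporary location, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "information.csv")

# newline="" lets the csv module control line endings itself (Python 3).
with open(path, "w", newline="") as ofile:
    writer = csv.writer(ofile, delimiter=",", quotechar='"', quoting=csv.QUOTE_ALL)
    writer.writerow([label, whois_text])

# Read the file back and compare the cell with the original text.
with open(path, newline="") as ifile:
    rows = list(csv.reader(ifile))

print(rows[0][1] == whois_text)
```

If the comparison holds, the line breaks are in the file and only the spreadsheet view needs adjusting.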

Related

How do I convert .asc data of CAN to .blf using python as per vector format

I have a .asc file, I want to convert it to .blf as per vector format.
I have tried,
from can.io import BLFWriter
import can
import pandas as pd
#input paths
path = '/home/ranjeet/Downloads/CAN/BLF_READER/input/'
asc_file = '20171209_1610_15017.asc'
blf_file = '20171209_1610_15017.blf'
df = pd.read_table(path + asc_file)
print(df)
I am able to read the .asc file. How do I write it to a .blf file as per the Vector format?
Why are you reading your asc file with pandas if you're already working with the python-can module?
You will find how to interact with asc and blf files in the python-can documentation (the ASCReader and BLFWriter sections, respectively).
One thing you should pay attention to is to read/write blf files in binary mode. So in your example this should work (don't forget to stop the log, otherwise the header will be missing):
import can
with open(asc_file, 'r') as f_in:
    log_in = can.io.ASCReader(f_in)
    with open(blf_file, 'wb') as f_out:
        log_out = can.io.BLFWriter(f_out)
        for msg in log_in:
            log_out.on_message_received(msg)
        log_out.stop()

using python to extract data with duplicate names from a json array

I'll start by apologising if I use the wrong terms here, I am a rank beginner with python.
I have a json array containing 5 sets of data. The corresponding items in each set have duplicate names. I can extract them in java but not in python. The item(s) I want are called "summary_polyline". I have tried so many different ways in the last couple of weeks, so far nothing works.
This is the relevant part of my python-
#!/usr/bin/env python3.6
import os
import sys
from dotenv import load_dotenv, find_dotenv
import polyline
import matplotlib.pyplot as plt
import json
with open ('/var/www/vk7krj/running/strava_activities.json', 'rt') as myfile:
    contents = myfile.read()
#print (contents)
#print (contents["summary_polyline"[1]])
activity1 = contents("summary_polyline"[1])
If I un-comment "print content", it prints the file to the screen ok.
I ran the json through an on-line json format checker and it passed ok
How do I extract the five "summary_polyline" values and assign them to "activity1" to "activity5"?
If I understand you correctly, you need to parse the text that was read from the file as JSON.
with open ('/var/www/vk7krj/running/strava_activities.json', 'rt') as myfile:
    contents = myfile.read()
# json_contents is a List with dicts now
json_contents = json.loads(contents)
# list with activities
activities = []
for dict_item in json_contents:
    activities.append(dict_item)
# print all activities (whole file)
print(activities)
# print first activity
print(activities[0])
# print second activity
print(activities[1])
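To get at the "summary_polyline" values themselves, something like the following might work. It is only a sketch: the nesting of your file isn't shown, so the helper `find_values` (my own name, not part of the json module) walks the whole structure and collects every value stored under that key. The sample data is invented for illustration.

```python
import json


def find_values(node, key):
    """Collect every value stored under `key`, at any nesting depth."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            else:
                found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


# Hypothetical sample; the real file's layout is not shown in the question.
sample = json.loads('''
[
  {"name": "Run 1", "map": {"summary_polyline": "abc"}},
  {"name": "Run 2", "map": {"summary_polyline": "def"}}
]
''')

polylines = find_values(sample, "summary_polyline")
print(polylines)
```

A list such as `polylines` is usually easier to work with than five separate variables: `polylines[0]` plays the role of "activity1", `polylines[1]` of "activity2", and so on.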

python script to exclude specific field of csv cell data

Using Python, I'm trying to create a summary from existing CSV data, and I'm having difficulty extracting data from one of the cells.
the input csv file
I want to include only the city name and file path from the info 4 column, and I expect a summary like - AlexxxxxyyyyzzzzzNewyork\Folder1\Folder2\Test.txt
the code
csv_data_out[csv_line_out].append(conten['Name'])
csv_data_out[csv_line_out].append(conten['info 1'])
csv_data_out[csv_line_out].append(conten['info 2'])
csv_data_out[csv_line_out].append(conten['info 3'])
csv_data_out[csv_line_out].append(conten['info 4'])
csv_summary = ("".join(csv_data_out[csv_line_out]))
with open(outputfile, 'wb') as newfile:
    writer = csv.writer(newfile, delimiter = ';')
    writer.writerow(csv_columns_out[:])
    writer.writerows(csv_data_out)
    newfile.close()
Any idea how to fetch only the required details from the info 4 column?
Essentially you have a csv inside a csv. There's not enough info posted to give a fully complete answer, but here's most of it.
You can take a string and process it as a csv using io.StringIO (or io.BytesIO if a byte string).
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import csv
from io import StringIO
# Create somewhere to put the inputs in case needed later
stored_items = []
with open('data.csv', 'r') as csvfile:
    inputs = csv.reader(csvfile)
    # skip the header row
    next(inputs)
    for row in inputs:
        # Extract the Info 4 column for processing
        f = StringIO(row[4])
        string_file = csv.reader(f,quotechar='"')
        build_string = ""
        for string_row in string_file:
            build_string = f"{string_row[0]}{string_row[1]}"
        # Merge everything into a summary
        summary_string = f"{row[0]}{row[1]}{row[2]}{row[3]}{build_string}"
        # Add all the data back to storage
        stored_items.append((row[0],row[1],row[2],row[3],row[4],summary_string))
        print(summary_string)
The reason I say there's not enough information posted is this: will the location always be (a), which can be handled with a fixed text replacement, or is it conditional, e.g. it could be (a) or (b), in which case it would possibly require regex? (My preference is not to use regex unless absolutely necessary.) Also, is it always the first two terms of Info 4 that you are after, or will the terms be found in different places in the text? Without seeing more samples of the data it's impossible to answer definitively.
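To make the idea concrete without needing a data file, here is a small self-contained sketch. The row below is entirely hypothetical: the real layout of the Info 4 cell wasn't posted, so I'm assuming it is comma-separated with the city first and the file path second.

```python
import csv
from io import StringIO

# Hypothetical row; the layout of the last ("info 4") cell is a guess.
row = ["Alex", "xxxx", "yyyy", "zzzzz",
       "Newyork,\\Folder1\\Folder2\\Test.txt,other details"]

# Parse the cell as a one-line CSV of its own.
inner = next(csv.reader(StringIO(row[4])))

# Keep only the first two inner fields: city and file path.
city, file_path = inner[0], inner[1]
summary = "".join(row[:4]) + city + file_path
print(summary)
```

If the city and path turn out not to be the first two inner fields, only the two indexes need to change.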

Missing double quotes (randomly) when downloading csv file using Oauth2.0 gmail API

I am downloading csv attachments from gmail that are csv reports. I am using Python 3.6.1 and the Oauth 2.0 gmail API.
There is a date column in the csv file and I hard-code its format to '%Y-%m-%d'.
When I download the csv attachment and inspect it as a text file, most times, I get the expected date format as follows (1st 3 columns of 1st 2 lines) -
"date","advertiser","advertiser_id", ...
"2017-05-27","Swiss.com India (UK)","29805", ...
However, on occasion, the quotes from the csv file are missing - I then get it as -
date,advertiser,advertiser_id, ...
27/05/2017,Swiss.com India (UK),29805, ...
In this situation, the date pattern turns out to be '%d/%m/%Y'.
There is no discernible pattern to when a file would be downloaded with the unquoted dates. Most times, if I delete the downloaded file and re-run my script, the quoted attachment is re-downloaded.
Is there a way to setup the attachment download such that the date column is downloaded in the quoted format? Or is there a way to ensure that when I read the csv (using csv.reader) I always get the date column in a certain format?
The specific method I am using to download attachments is given here -
https://developers.google.com/gmail/api/v1/reference/users/messages/attachments/get (Python version). The exact code snippet is -
# Get the body of this part and its keys.
part_body = part['body']
part_body_keys = part_body.keys()
...
if 'data' in part_body_keys:
    a_data = part_body['data']
elif 'attachmentId' in part_body_keys:
    att_id = part_body['attachmentId']
    att = service.users().messages().attachments().get(
        userId=user_id, messageId=message['id'],
        id=att_id).execute()
    a_data = att['data']
else:
    ...
# Decode it appropriately and write it to the file.
file_data = base64.urlsafe_b64decode(a_data.encode('UTF-8'))
...
f = open(file_name, 'wb')
f.write(file_data)
f.close()
The code snippet when reading the csv file is -
infile = open(file_name, mode="r", encoding='ascii', errors='ignore')
filereader = csv.reader(infile)
date_fmt = "%Y-%m-%d"
…
for a_row in filereader:
    …
    try:
        rf_datetime = time.strptime(a_row[0], date_fmt)
        …
Any pointers would be appreciated! This script has become a key component of my business that automates our reporting process and has visibly reduced effort all around.
Regards
Nitin
It looks like the attached csv files are in a different format themselves (or maybe there is a difference between 'data' and 'attachmentId'?).
To be sure, you could download them manually and check them in a text editor.
As for the quotes: for csv it doesn't make a difference whether the fields are quoted or not. A field only needs to be surrounded with quotes when it contains a field separator (or a quote character or line break). But since you're using a csv reader this shouldn't matter.
As for the dates, it's probably easiest to check the date format once before the reading loop (in the first data row), and set date_fmt (for parsing) accordingly.
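A rough sketch of that idea, assuming the two layouts shown in the question are the only ones that occur. The names `CANDIDATE_FORMATS`, `detect_date_format` and `parse_dates` are mine, and the sample rows are shortened from the question:

```python
import csv
import time

# The two layouts seen in the question; extend if other variants turn up.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def detect_date_format(sample, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that parses `sample`."""
    for fmt in candidates:
        try:
            time.strptime(sample, fmt)
        except ValueError:
            continue
        return fmt
    raise ValueError("unrecognised date format: %r" % sample)


def parse_dates(lines):
    """Yield a struct_time for the first column of every data row."""
    reader = csv.reader(lines)
    next(reader)                      # skip the header row
    date_fmt = None
    for row in reader:
        if date_fmt is None:          # decide once, on the first data row
            date_fmt = detect_date_format(row[0])
        yield time.strptime(row[0], date_fmt)


# csv.reader strips the quotes, so both variants yield the same fields.
quoted = ['"date","advertiser"', '"2017-05-27","Swiss.com India (UK)"']
unquoted = ['date,advertiser', '27/05/2017,Swiss.com India (UK)']
for variant in (quoted, unquoted):
    print([time.strftime("%Y-%m-%d", t) for t in parse_dates(variant)])
```

Note that this cannot tell day/month/year from month/day/year when the day is 12 or less, so it only helps if the set of possible formats is known in advance.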

Saving results of regular expression to csv or xls

I have a log record like this (millions of rows):
previous_status>SERVICE</previous_status><reason>1</>device_id>SENSORS</device_id><DEVICE>ISCS</device_type><status>OK
I would like to extract all the words in capitals into individual columns in Excel using Python, to look like this:
SERVICE SENSORS DEVICE
As per the comments from #peter-wood, it isn't clear what your input is. However, assuming that your input is as you posted, then here is a minimal solution that works off the given structure. If it is not quite right, you should be able to easily change it to search on whatever is really your structure.
import csv
# You need to change this path.
lines = [row.strip() for row in open('/path/to/log.txt').readlines()]
# You need to change this path to where you want to write the file.
with open('/path/to/write/to/mydata.csv', 'w') as fh:
    # If you want a different delimiter, like tabs '\t', change it here.
    writer = csv.writer(fh, delimiter=',')
    for l in lines:
        # You can cut and paste the tokens that start and stop the pieces you are looking for here.
        service = l[l.find('previous_status>')+len('previous_status>'):l.find('</previous_status')]
        sensor = l[l.find('device_id>')+len('device_id>'):l.find('</device_id>')]
        device = l[l.find('<DEVICE>')+len('<DEVICE>'):l.find('</device_type>')]
        writer.writerow([service, sensor, device])
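Since the question asks about regular expressions, here is an alternative sketch using `re`. It assumes the values you want are always all-capitals words sitting directly between a `>` and a `<`, which holds for the sample line but may not hold for the rest of your log. The name `UPPER_VALUE` is just mine.

```python
import re

# Sample record copied from the question.
line = ("previous_status>SERVICE</previous_status><reason>1</>"
        "device_id>SENSORS</device_id><DEVICE>ISCS</device_type><status>OK")

# An all-capitals word sitting directly between '>' and '<'.
UPPER_VALUE = re.compile(r">([A-Z]+)<")

values = UPPER_VALUE.findall(line)
print(values)
```

On the sample line this picks out the same three values as the slicing approach, and the resulting list can be passed straight to `writer.writerow(values)`.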
