Python: How to read key-value pairs from a CSV file?

I have a CSV file with 3 columns, and I want to read the 1st and 3rd columns as key-value pairs. I am doing it as shown below, but it's not working.
with open(dirName + fileName) as f:
    for line in f:
        (key, value) = line.split(',')

I think you want something like:
with open(dirName + fileName) as f:
    for line in f:
        fields = line.rstrip('\n').split(',')  # drop the newline so it doesn't end up in the value
        assert len(fields) == 3
        (key, _, value) = fields
But it may be worth a look at the csv module.

Any time you're working with CSV files, use the csv module.
As @Buckeye14Guy says, you should also use pathlib for path manipulations.
And, for fast lookup, you can store the key-value pairs in a dictionary, d.
import csv, pathlib
d = {}
your_path = pathlib.PurePath(dirName).joinpath(fileName)
with open(your_path, 'r', newline='') as f:
    reader = csv.reader(f)
    for line in reader:
        d[line[0]] = line[2]  # dict entry with key = 1st col and value = 3rd col
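If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name read_key_value and its key_col/value_col parameters are made up for illustration, and it assumes every non-empty row has at least three columns.

```python
import csv


def read_key_value(path, key_col=0, value_col=2):
    """Map the value in key_col to the value in value_col for every row."""
    pairs = {}
    # newline="" lets the csv module handle line endings and quoted fields itself
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:  # csv.reader yields an empty list for a blank line
                continue
            pairs[row[key_col]] = row[value_col]
    return pairs
```

Unlike a plain split(','), this also copes with quoted fields that contain commas.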

Try this:
with open(file, 'r') as text:
    for line in text:
        (key, _, value) = line.strip().split(',')  # three columns, so unpack three names

Related

How to get DictWriter to write a key-value pair to each line of a CSV

I have some code which works insofar as it writes a dict to a CSV file. It writes the keys as a line of headers and the corresponding values in a line underneath.
What I would like is for each key-value pair from the dict to be written on a single line, with the next key-value pair on a new line.
Is this possible with DictWriter?
Code
import csv
def write_csv(fullfilepath, mydict):
    """ Write a simple dict to a csv file at given filename and path """
    with open(fullfilepath, 'w', newline='') as filey:
        w = csv.DictWriter(filey, mydict.keys())
        print(type(w))
        w.writeheader()
        w.writerow(mydict)
fullfilepath = r"C:\path\to\Desktop\csv\file\dummy.csv"
mydict = {"a":1, "b":2, "c":3}
write_csv(fullfilepath, mydict)
Try opening the file in append mode, like this:
with open(fullfilepath, 'a', newline='') as filey:
This will not put keys and values on the same row; DictWriter still writes the keys and the values on separate rows.
If you want them on the same line, you can build a comma-separated string like this:
keyrow = ",".join(mydict.keys())
valuerow = ",".join(str(v) for v in mydict.values())  # the values are ints, so convert them before joining
row = keyrow + ',' + valuerow
I found out how it can be done. This writes a CSV file with one key-value pair per line; newline='' makes sure there is no empty line between the rows.
def write_csv(fullfilepath, mydict):
    """ Write a simple dict to a csv file at given filename and path """
    with open(fullfilepath, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        for key, value in mydict.items():
            writer.writerow([key, value])
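To answer the original question directly: it also seems doable with DictWriter itself, as long as the field names describe the two columns rather than the dict's own keys. A rough sketch, where write_pairs and the column names "key" and "value" are just names picked for this example:

```python
import csv


def write_pairs(fullfilepath, mydict):
    """Write each key/value pair of mydict on its own row using DictWriter."""
    with open(fullfilepath, "w", newline="") as csv_file:
        # "key" and "value" are hypothetical column names chosen for this sketch
        writer = csv.DictWriter(csv_file, fieldnames=["key", "value"])
        for key, value in mydict.items():
            writer.writerow({"key": key, "value": value})
```

No header row is written here; call writer.writeheader() before the loop if you want one.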

Split a large text file to small ones based on location

Suppose I have a big file, file.txt, with around 300,000 records. I want to split it based on a key field, the location. See file.txt below:
Line 1: U0001;POUNDS;**CAN**;1234
Line 2: U0001;POUNDS;**USA**;1234
Line 3: U0001;POUNDS;**CAN**;1234
Line 100000: U0001;POUNDS;**CAN**;1234
The locations are limited to 10-15 different nations, and I need to write all records for a given country to that country's own file. How can I do this in Python?
Thanks for the help.
This will run with very low memory overhead as it writes each line as it reads it.
Algorithm:
open input file
read a line from input file
get country from line
if new country then open file for country
write the line to country's file
loop if more lines
close files
Code:
with open('file.txt', 'r') as infile:
    try:
        outfiles = {}
        for line in infile:
            country = line.split(';')[2].strip('*')
            if country not in outfiles:
                outfiles[country] = open(country + '.txt', 'w')
            outfiles[country].write(line)
    finally:
        for outfile in outfiles.values():
            outfile.close()
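A possible variation on the streaming approach above: contextlib.ExitStack can take over the job of the try/finally block and close every output file on exit. This is a sketch only; the function name split_by_country and the out_dir parameter are invented here, and it assumes the country is always the third ';'-separated field.

```python
import os
from contextlib import ExitStack


def split_by_country(in_path, out_dir="."):
    """Copy each line of in_path into <out_dir>/<country>.txt, one line at a time."""
    with ExitStack() as stack:
        infile = stack.enter_context(open(in_path))
        outfiles = {}
        for line in infile:
            # third field, with surrounding whitespace and the sample's ** markers removed
            country = line.split(";")[2].strip().strip("*")
            if country not in outfiles:
                path = os.path.join(out_dir, country + ".txt")
                # files registered on the stack are closed when the with block ends
                outfiles[country] = stack.enter_context(open(path, "w"))
            outfiles[country].write(line)
    return sorted(outfiles)
```

With only 10-15 countries the number of simultaneously open files stays small, so keeping them all open is not a problem.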
with open("file.txt") as f:
    content = f.readlines()
# you may also want to remove whitespace characters like `\n` at the end of each line
text = [x.strip() for x in content]
x = [i.split(";") for i in text]
x.sort(key=lambda x: x[2])
from itertools import groupby
from operator import itemgetter
y = groupby(x, itemgetter(2))
res = [(i[0], [j for j in i[1]]) for i in y]
for country in res:
    with open(country[0] + ".txt", "w") as writeFile:
        writeFile.writelines("%s\n" % ';'.join(l) for l in country[1])
This groups the records by country and writes one file per country.
Hope it helps!
Looks like what you have is a csv file. csv stands for comma-separated values, but any file that uses a different delimiter (in this case a semicolon ;) can be treated like a csv file.
We'll use Python's csv module to read the file in, and then write a file for each country:
import csv
from collections import defaultdict
d = defaultdict(list)
with open('file.txt', 'r', newline='') as f:
    r = csv.reader(f, delimiter=';')
    for line in r:
        d[line[2]].append(line)
for country in d:
    with open('{}.txt'.format(country), 'w', newline='') as outfile:
        w = csv.writer(outfile, delimiter=';')
        for line in d[country]:
            w.writerow(line)
# the formatting function for the filename used for saving
outputFileName = "{}.txt".format
# alternative:
##import time
##outputFileName = lambda loc: "{}_{}.txt".format(loc, time.asctime())
# make a dictionary indexed by location; each value is the new content of the file for that location
sortedByLocation = {}
f = open("file.txt", "r")
# iterate over each line and look at the column for the location
for l in f.readlines():
    line = l.split(';')
    # the third field (indices begin with 0) is the location abbreviation
    # lowercase the string: on case-insensitive filesystems, files whose names differ only in case overwrite each other, while Python treats the names as distinct
    location = line[2].lower().strip()
    # get the previous lines for the location and store the extended text back
    tmp = sortedByLocation.get(location, "")
    sortedByLocation[location] = tmp + l.strip() + '\n'
f.close()
# save a file for each location
for location, text in sortedByLocation.items():
    with open(outputFileName(location), "w") as f:
        f.write(text)

CSV file: can't add records from a CSV file

How can I add records from a CSV file to a dictionary, in a function whose input argument is the path to that CSV file?
Please help with this incomplete function:
def csv_file (p):
    dictionary={}
    file=csv.reader(p)
    for rows in file:
        dictionary......(rows)
    return dictionary
You need to open the file first:
def csv_file(p):
    dictionary = {}
    with open(p, "rb") as infile:  # Assuming Python 2; on Python 3 use open(p, newline="")
        file = csv.reader(infile)  # Possibly DictReader might be more suitable,
        for row in file:           # but that...
            dictionary......(row)  # depends on what you want to do.
    return dictionary
It seems you haven't opened the file at all; you need to use open for that.
Try the following code:
import csv
from pprint import pprint
INFO_LIST = []
with open('sample.csv') as f:
    reader = csv.reader(f, delimiter=',', quotechar='"')
    for i, row in enumerate(reader):
        if i == 0:
            TITLE_LIST = [var for var in row]
            continue
        INFO_LIST.append({title: info for title, info in zip(TITLE_LIST, row)})
pprint(INFO_LIST)
I use the following csv file as an example:
"REVIEW_DATE","AUTHOR","ISBN","DISCOUNTED_PRICE"
"1985/01/21","Douglas Adams",0345391802,5.95
"1990/01/12","Douglas Hofstadter",0465026567,9.95
"1998/07/15","Timothy ""The Parser"" Campbell",0968411304,18.99
"1999/12/03","Richard Friedman",0060630353,5.95
"2001/09/19","Karen Armstrong",0345384563,9.95
"2002/06/23","David Jones",0198504691,9.95
"2002/06/23","Julian Jaynes",0618057072,12.50
"2003/09/30","Scott Adams",0740721909,4.95
"2004/10/04","Benjamin Radcliff",0804818088,4.95
"2004/10/04","Randel Helms",0879755725,4.50
You can put all that logic into a function like so:
def csv_file(file_path):
    # Check that file_path is a string; if not, return None
    if not isinstance(file_path, str):
        return None
    # Create the list that will hold one dictionary per row
    _info_list = []
    with open(file_path) as f:
        # Set the delimiter and quotechar
        reader = csv.reader(f, delimiter=',', quotechar='"')
        # We use enumerate here because we know the first row describes the columns
        for i, row in enumerate(reader):
            # The first row contains the headings
            if i == 0:
                # Create a list from the first row
                title_list = [var for var in row]
                continue
            # Zip title_list and the row together, so that a dictionary comprehension is possible
            _info_list.append({title: info for title, info in zip(title_list, row)})
    return _info_list
APPENDIX
open()
zip
Dictionary Comprehension
Delimiter: the character that separates values, in this case ",".
Quotechar: the character that encloses values in a CSV file, in this case ".
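As one of the answers hints, csv.DictReader may be a better fit here, since it pairs each row with the header row for you. A minimal sketch, assuming the first line of the file holds the column names (as in the sample file above):

```python
import csv


def csv_file(p):
    """Return the rows of the CSV file at path p as a list of dicts keyed by header."""
    with open(p, newline="") as infile:
        # DictReader uses the first row as field names by default
        return [dict(row) for row in csv.DictReader(infile)]
```

All values come back as strings, so the ISBN keeps its leading zero; convert the price yourself if you need a number.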

Making a dictionary from file, first word is key in each line then other four numbers are to be a tuple value

This dictionary is supposed to use the three-letter code of a country as the key, e.g. GRE for Great Britain, and the four consecutive numbers after it as a tuple value. It should look something like this:
{GRE:(204,203,112,116)}, and so on for every country in the list. The txt file looks like this:
Country,Games,Gold,Silver,Bronze
AFG,13,0,0,2
ALG,15,5,2,8
ARG,40,18,24,28
ARM,10,1,2,9
ANZ,2,3,4,5 etc.;
This isn't actually code; I just wanted to show how it is formatted.
I need my program to skip the first line because it's a header. Here's what my code looks like thus far:
def medals(goldMedals):
    infile = open(goldMedals, 'r')
    medalDict = {}
    for line in infile:
        if infile[line] != 0:
            key = line[0:3]
            value = line[3:].split(',')
            medalDict[key] = value
    print(medalDict)
    infile.close()
    return medalDict
medals('GoldMedals.txt')
Your for loop should look like this:
next(infile)  # Skip the first line
for line in infile:
    words = line.split(',')
    medalDict[words[0]] = tuple(map(int, words[1:]))
A variation on the theme: I'd convert all the remaining columns to ints, and I'd use a namedtuple:
from collections import namedtuple
with open('file.txt') as fin:
    # The first line names the columns
    lines = iter(fin)
    columns = next(lines).strip().split(',')
    row = namedtuple('Row', columns[1:])
    results = {}
    for line in lines:
        columns = line.strip().split(',')
        results[columns[0]] = row(*(int(c) for c in columns[1:]))
# results is now a dict mapping country codes to named tuples
This has the nice features of 1) skipping the first line and 2) providing both offset and named access to the rows:
# These both work to return the 'Games' column
results['ALG'].Games
results['ALG'][0]
with open('path/to/file') as infile:
    answer = {}
    next(infile)  # skip the header line, otherwise int('Games') raises ValueError
    for line in infile:
        k, v = line.strip().split(',', 1)
        answer[k] = tuple(int(i) for i in v.split(','))
I think inspectorG4dget's answer is the most readable... but for those playing code golf:
with open('medals.txt', 'r') as infile:
    headers = infile.readline()
    medalDict = dict([(i[0], tuple(map(int, i[1:]))) for i in [list(line.strip().split(',')) for line in infile]])
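For completeness, the same thing can be sketched with the csv module, which does the splitting and lets you skip the header with next(). The function name follows the question; the rest assumes every column after the country code is an integer.

```python
import csv


def medals(path):
    """Map each country code to a tuple of its integer columns."""
    with open(path, newline="") as infile:
        reader = csv.reader(infile)
        next(reader, None)  # skip the header row; the default avoids an error on an empty file
        # "if row" drops blank lines, which csv.reader returns as empty lists
        return {row[0]: tuple(int(n) for n in row[1:]) for row in reader if row}
```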

Replace character in line inside a file

I have these different lines with values in a text file
sample1:1
sample2:1
sample3:0
sample4:15
sample5:500
and I want the number after the ":" to be updated from time to time.
I know I can split the line by ":" and get a list with 2 values.
f = open("test.txt","r")
lines = f.readlines()
lineSplit = lines[0].split(":",1)
lineSplit[1] #this is the value I want to change
I'm not quite sure how to update the lineSplit[1] value using the write functions.
You can use the fileinput module if you're trying to modify the same file. First, a look at splitting a single line:
>>> strs = "sample4:15"
Take advantage of sequence unpacking to store the results in variables after splitting.
>>> sample, value = strs.split(':')
>>> sample
'sample4'
>>> value
'15'
Code:
import fileinput
for line in fileinput.input(filename, inplace=True):
    sample, value = line.split(':')
    value = int(value)  # convert value to int for calculation purposes
    if some_condition:
        # do some calculations on sample and value
        # modify sample, value if required
        pass
    # now write the data (either modified or unchanged) back to the file;
    # with inplace=True, print() output is redirected into the file
    print("{}:{}".format(sample, value))
Strings are immutable, meaning you can't assign new values inside them by index.
But you can split the whole file into a list of lines and replace individual lines (strings) entirely. This is what you're doing with lineSplit[1] = A_NEW_INTEGER:
with open(filename, 'r') as f:
    lines = f.read().splitlines()
for i, line in enumerate(lines):
    if condition:
        lineSplit = line.split(':')
        lineSplit[1] = str(new_integer)  # join() needs strings
        lines[i] = ':'.join(lineSplit)
with open(filename, 'w') as f:
    f.write('\n'.join(lines))
Maybe something like this (assuming that the first element before : on each line is indeed a unique key):
from collections import OrderedDict
with open('fin') as fin:
    samples = OrderedDict(line.rstrip('\n').split(':', 1) for line in fin)
samples['sample3'] = 'something else'
with open('output', 'w') as fout:
    lines = (':'.join(el) + '\n' for el in samples.items())
    fout.writelines(lines)
Another option is to use the csv module (: is the column delimiter in your case).
Assuming there is a test.txt file with the following content:
sample1:1
sample2:1
sample3:0
sample4:15
sample5:500
Say you need to increment each value. Here's how you can do it:
import csv
# read the file
with open('test.txt', 'r', newline='') as f:
    reader = csv.reader(f, delimiter=":")
    lines = [line for line in reader]
# write the file
with open('test.txt', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=":")
    for line in lines:
        # edit the data here
        # e.g. increment each value
        line[1] = int(line[1]) + 1
    writer.writerows(lines)
The contents of test.txt are now:
sample1:2
sample2:2
sample3:1
sample4:16
sample5:501
But, anyway, fileinput sounds more logical to use in your case (editing the same file).
Hope that helps.
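If you only ever change one named entry, another way to go (not taken from the answers above, just a sketch) is to write the updated lines to a temporary file and swap it in with os.replace, so the original is never left half-written. The function name update_value and its parameters are made up for this example.

```python
import os
import tempfile


def update_value(path, name, new_value):
    """Rewrite path so that the line 'name:<old>' becomes 'name:<new_value>'."""
    directory = os.path.dirname(os.path.abspath(path))
    # write to a temporary file in the same directory, then swap it in
    with open(path) as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            key, sep, _old = line.rstrip("\n").partition(":")
            if sep and key == name:
                line = f"{key}:{new_value}\n"
            dst.write(line)  # lines that don't match are copied unchanged
    os.replace(dst.name, path)
```

For example, update_value("test.txt", "sample3", 7) would turn sample3:0 into sample3:7 and leave the other lines alone.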
