Python: Compare List with a List in File

I am struggling with a problem that looks simple, but I'm stuck. I have to build a function that receives a list of categories such as:
input Example1: ['point_of_interest', 'natural_feature', 'park', 'establishment']
input Example2: ['point_of_interest', 'establishment']
input Example3: ['sublocality', 'political']
The list can have a variable number of elements, I guess from 1 to 4 but no more.
With this data I want to create a file such that, if a new input is not already in the file, it is appended to the file.
Each list is an element in itself: I have to compare the full contents of the list, and if an exactly equal list is already in the file, I don't have to add it.
In my code I only tried to add the first element to the file, because I really don't know how to add the full list so that it can be compared with the next list.
def categories(category):
    number = 0
    repeat = False
    if os.path.exists("routes/svm/categories"):
        with open('routes/svm/categories', 'rb') as csvfile:
            spamreader = csv.reader(csvfile)
            for categoryFile in spamreader:
                if (cmp(categoryFile,category) == 0):
                    number += 1
                    repeat = True
        if not repeat:
            categoriesFile = open('routes/svm/categories', 'a')
            category = str(category[0])
            categoriesFile.write(category)
            categoriesFile.write('\n')
            categoriesFile.close()
    else:
        categoriesFile = open('routes/svm/categories', 'w')
        category = str(category[0])
        categoriesFile.write(category)
        categoriesFile.write('\n')
        categoriesFile.close()
EDIT: Some explanation by @KlausWarzecha: Users might enter a list with (about 4) items. If this list (= this combination of items) is not in the file already, you want to add the list (and not the items separately!) to the file?

The problem is really simple. You may take the following approach if it works for you:
1. Read all the contents of the CSV into a list
2. Add all the non-matching items from the input into this list
3. Rewrite the CSV file
You may start with this sample code:
import csv
# input_list here represents the inputs
# You may get input from some other source too
input_list = [['point_of_interest', 'natural_feature', 'park', 'establishment'], ['point_of_interest', 'establishment'], ['sublocality', 'political']]
category_list = []
with open('routes/svm/categories', 'rb') as csvfile:
    spamreader = csv.reader(csvfile)
    for categoryFile in spamreader:
        print categoryFile
        category_list.append(categoryFile)
for item in input_list:
    if (item in category_list):
        print "Found"
    else:
        category_list.append(item)
        print "Not Found"
# Write `category_list` to the CSV file
Please use this code as a starting point and not as a copy-paste solution.
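The sample above uses Python 2 syntax (`print` statements, `'rb'` mode). As a rough Python 3 sketch of the same idea, the function below appends a list as a single CSV row only when an identical row is not already in the file. The name `add_category_if_new` and its return value are assumptions made for illustration, not part of the original code.

```python
import csv
import os


def add_category_if_new(path, category):
    """Append `category` (a list of strings) as one CSV row unless an
    identical row is already in the file. Returns True if a row was added."""
    existing = []
    if os.path.exists(path):
        # newline="" is the mode the csv module documentation recommends
        with open(path, newline="") as csvfile:
            existing = list(csv.reader(csvfile))

    # csv.reader yields each row as a list of strings, so a plain list
    # comparison checks the whole combination, order included.
    if list(category) in existing:
        return False

    with open(path, "a", newline="") as csvfile:
        csv.writer(csvfile).writerow(category)
    return True
```

Because rows are compared as whole lists, `['point_of_interest', 'establishment']` and `['establishment', 'point_of_interest']` count as different entries; sorting each list before comparing and writing would be one way to treat them as the same.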

Related

getting some data in a list contains dictionaries for python programming

What is the right way to take certain information from a list of dictionaries and store it in another dictionary? My approach keeps extracting data from each smaller part of the list (sorry, I don't know the professional way to explain this, since I am a newbie).
Here is my code:
# TODO: Read database file into a variable
weu = dict()
template = []
LI = []
with open (sys.argv[1] , "r") as file:
    reader = csv.DictReader(file)
    for line in reader :
        LI.append(line)
# TODO: Read DNA sequence file into a variable
with open(sys.argv[2], "r") as (F):
    reader =csv.reader(F)
    text = next(reader)
# TODO: Find longest match of each STR in DNA sequence
sequence = text
n = len(LI[0])
sub = []
sub = list(LI[0].keys())[1:]
# There is another function that actually gets the data for the values of that dictionary, but that is not the problem, so I just put something random for the values here.
for subsequence in sub :
    weu[subsequence] = 200
element = dict()
# TODO: Check database for matching profile
for i in range (int(len(LI))):
    # this line keeps telling me my i is out of range, I do not really know why
    template[i] = list(LI[i].values())[1:]
    for subsequence in template[i]:
        element[subsequence] = list(template[i])[subsequence]
I also feel that my way of doing this is quite messy. In a YouTube video that explains this program, I saw the author simply write the following (which probably does the same thing my last few lines would do if my program worked):
for person in LI:
    for subsequence in sub :
        if int(person[subsequence]) == weu[subsequence]:
I get the idea, but is that the most efficient approach in this case? And if I want to do it my way, how do I fix my last few lines so that they work?
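On the out-of-range error: `template` starts as an empty list, and assigning to `template[i]` only works for positions that already exist, so Python raises `IndexError` on the first pass. Below is a minimal sketch of one way around it, using made-up rows in place of the real database file (the names and counts are assumptions for illustration only).

```python
# Hypothetical rows standing in for what csv.DictReader would yield
LI = [
    {"name": "Alice", "AGATC": "2", "AATG": "8"},
    {"name": "Bob", "AGATC": "4", "AATG": "1"},
]
sub = list(LI[0].keys())[1:]  # every column except "name"

template = []
for person in LI:
    # append grows the list; template[i] = ... needs the slot to exist already
    template.append([person[key] for key in sub])

# Hypothetical STR counts standing in for the computed `weu` dictionary
weu = {"AGATC": 4, "AATG": 1}

# A person matches when every STR count equals the computed value
matches = [p["name"] for p in LI if all(int(p[k]) == weu[k] for k in sub)]
```

Separately, `list(template[i])[subsequence]` would index a list with a string, which is why looking values up by key on each row, as the loop from the video does, tends to be simpler.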

Need to read csv files (when csv file is multiple input files) in Python

I have a school assignment that is asking me to write a program that first reads in the name of an input file and then reads the file using the csv.reader() method. The file contains a list of words separated by commas. The program should output the words and their frequencies (the number of times each word appears in the file) without any duplicates.
I have been able to figure out how to do this somewhat for one specific input file, but the program needs to be able to read multiple input files. This is what I have so far:
with open('input1.csv', 'r') as input1file:
    csv_reader = csv.reader(input1file, delimiter = ',')
    for row in csv_reader:
        new_row = set(row)
    for m in new_row:
        count = row.count(m)
        print(m, count)
This is what I get:
woman 1
man 2
Cat 1
Hello 1
boy 2
cat 2
dog 2
hey 2
hello 1
This (almost) works for the input1 file, except that the order changes each time I run it.
I also need it to work for two other input files.
Sample CSV:
hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy

See the code below for an example. I've commented it so you can see what it does and why.
The order differs between runs in your implementation because you use a set, which is unordered by definition.
Also note that your implementation passes over each row twice: once to turn it into a set, and once more to count. Besides this, if the file contains more than one row, your logic would fail, as the counting part is only reached after the last line of the file has been read.
import csv
def count_things(filename):
    with open(filename) as infile:
        csv_reader = csv.reader(infile, delimiter = ',')
        result = {}
        for row in csv_reader:
            # go over the row by element
            for element in row:
                # does it exist already?
                if element in result:
                    # if yes, increase count
                    result[element] += 1
                else:
                    # if no, add and set count to 1
                    result[element] = 1
    # sorting, explained in detail here:
    # https://stackoverflow.com/a/613218/9267296
    return {k: v for k, v in sorted(result.items(), key=lambda item: item[1], reverse=True)}
    # you could just return unsorted result by using:
    # return result
for key, value in count_things("input1.csv").items():
    # iterate over items() by key/value pairs
    # see this link:
    # https://www.w3schools.com/python/python_dictionaries_access.asp
    print(key, value)
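A shorter variant is possible with `collections.Counter` from the standard library, which does the "exists already?" bookkeeping itself. The sketch below is an alternative to the function above rather than a correction of it, and `count_words` is a name chosen here for illustration.

```python
import csv
from collections import Counter


def count_words(filename):
    """Count every field in a comma-separated file, across all rows."""
    counts = Counter()
    with open(filename, newline="") as infile:
        for row in csv.reader(infile):
            # update() adds one to the count of each field in the row
            counts.update(row)
    return counts
```

Since the assignment reads the file name first, something like `filename = input()` followed by `for word, count in count_words(filename).items(): print(word, count)` would handle any of the input files. On Python 3.7 and later, a `Counter` iterates in first-seen order, so the output order stays the same from run to run.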

Creating a function to concatenate strings based on len(array)

I am trying to build a string to send as a message from Python to Telegram.
I want the function to be modular.
It first imports lines from a .txt file and, based on the number of lines, creates two arrays,
array1[] and array2[]. array1 receives the values from the file as strings, and array2 receives user-entered information that complements the entry stored at the same position in array1[pos], like this:
while (k<len(list)):
    array2[k]= str(input(array1[k]+": "))
    k+=1
I want to create a single string to send in a single message, so that my whole list goes inside the same string, built from pieces like:
string1 = array1[pos]+": "+array2[pos]+"\n"
I have tried using a while loop that compares against the length, but I kept overwriting my own string again and again.

It looks like what you want is one list that comes directly from your text file. There are lots of ways to do that, but you most likely won't want to build the list by assigning to index positions one at a time. I would just append items to your list.
The accepted answer on this post has a good reference, which is basically the following:
import csv
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        # do something with the row
        pass
Which, in your case would mean something like this:
import csv
actual_text_list = []
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        actual_text_list.append(row)
user_input_list = []
for actual_text in actual_text_list:
    the_users_input = input(f'What is your response to {actual_text}? ')
    user_input_list.append(the_users_input)
This creates two lists, one with the actual text and the other with the user's input, which I think is what you're trying to do.
Alternatively, if the list in your text file will not have duplicates, you could consider using a dict, which is a key-value data store. You would make the key the actual_text from the file and the value the user_input. Another option is a list of lists.
import csv
actual_text_list = []
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        # each row is a list, which cannot be a dict key; a tuple can
        actual_text_list.append(tuple(row))
dictionary = dict()
for actual_text in actual_text_list:
    the_users_input = input(f'What is your response to {actual_text}? ')
    dictionary[actual_text] = the_users_input
Then you could use that data like this:
for actual_text, user_input in dictionary.items():
    print(f'In response to {actual_text}, you specified {user_input}.')

list_of_strings_from_txt = ["A","B","C"]
modified_list = [f"{w}: {input(f'{w}:')}" for w in list_of_strings_from_txt]
I guess? maybe?
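To get from per-item strings to the single message asked about in the question, the pieces can be joined with newlines instead of being rebuilt inside a loop. This is a minimal sketch assuming two lists of equal length; `build_message` is a hypothetical helper name and has nothing to do with any Telegram library.

```python
def build_message(labels, values):
    """Return one string with a 'label: value' line per pair."""
    return "\n".join(f"{label}: {value}" for label, value in zip(labels, values))
```

For example, `build_message(array1, array2)` would give one string that can be sent as a single message. Building each line once and joining them at the end avoids the problem of overwriting the string on every pass.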

From a file containing prime numbers to a list of integers on Python

In order to work out some asymptotic behavior related to the twin prime conjecture, I need to take a raw file (.csv or .txt) and convert its data into a Python list whose elements I can reach by index.
That is, I have a big (~10 million numbers) list of prime numbers in a .csv file; let's say this is that list:
2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83
I am trying to produce the following
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83]
in order to examine, say, the third element in the list, which is 5.
The approach I am taking is the following:
import sys
import csv
# The csv file might contain very huge fields, therefore increase the field_size_limit:
csv.field_size_limit(sys.maxsize)
with open('primes1.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ')
    output = []
    for i in reader:
        output.append(i)
Then, if printing,
for rows in output:
    print(rows)
I am getting
['2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83'].
How does one resolve this? Thank you very much.

Maybe this:
with open("primes1.csv", "r") as f:
    lst = [int(i) for i in f.read().split(",")]

You don't need to use the csv reader for that (like the other answer showed) but if you want to, you could do it like this, reading just the first row.
Your code is iterating rows and adding them to the output list, but you need to iterate columns just in the first row. The next(reader) call returns just the first row.
import csv
with open('test.csv','r') as csvFile:
    reader = csv.reader(csvFile, delimiter=',')
    output = [int(i) for i in next(reader)]
    # alternate approach
    # output = [int(i) for i in csvFile.read().strip().split(',')]
print(output)
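Both snippets assume that all the primes sit on one line. If the real file spreads them over several lines, which is an assumption since the question only shows a single row, a sketch along these lines would flatten every row into one list; `load_primes` is a name made up for illustration.

```python
import csv


def load_primes(path):
    """Read every comma-separated field in the file into one flat list of ints."""
    primes = []
    with open(path, newline="") as csvfile:
        for row in csv.reader(csvfile):
            # skip empty fields, for example those left by a trailing comma
            primes.extend(int(field) for field in row if field.strip())
    return primes
```

With the list in hand, `load_primes('primes1.csv')[2]` would be the third element. Using the comma as the delimiter also keeps each field small, so raising `csv.field_size_limit` should not be needed here.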

Replacing data in files with new inputs

First of all, my program must work with several files and 10 inputs per file; this is just a small piece of it, to be clear.
My code right now:
code = input(">> ")
print("\nPress <Enter> and parameter will be same!")
f = open("komad_namestaja.txt", "r")
allDATA = f.readlines()
f.close()
for line in allDATA:
    lst = line.split("|")
    if code == lst[0]:
        print("\nName :", lst[1])
        name = input("New Name >> ")
        if name == "":
            name = lst[1]
        f = open("komad_namestaja.txt", "r")
        allDATA = f.read()
        f.close()
        newdata = allDATA.replace(lst[1], name)
        f = open("komad_namestaja.txt", "w")
        f.write(newdata)
        f.close()
        print("\ndestination :", lst[2])
        destination = input("New destination >> ")
        if destination == "":
            destination = lst[2]
        # Writing function here
File before:
312|chessburger|Denmark
621|chesscake|USA
code input: 312
name input: Gyros
destination input: Poland
file after inputs:
312|Gyros|Poland
621|chesscake|USA
The problem is the replacing in the file: I can't write 7 lines of code every time, because I have 10 x 5 inputs. I have tried everything and can't turn this into a function.
I need to write some function for reading/writing/replacing, or for replacing all the inputs after the last one has been entered.

You don't have to read the file in every time to modify one field, write it out, reopen it to change another field, and so on. That's inefficient, and in your case, causes code explosion.
Since your files are small, you could just read everything into memory at once and work on it in memory. Your code is easy to map via a dict.
Here's a function that takes a filename and converts your file into a dictionary.
def create_mapping(filename):
    with open(filename, 'r') as infile:
        data = infile.readlines()
    mapping = {int(k): (i,d) for k,i,d in
               (x.strip().split('|') for x in data)}
    # Your mapping now looks like
    # { 312: ('chessburger', 'Denmark'),
    #   621: ('chesscake', 'USA') }
    return mapping
Then you can update the mapping from user input since it's just a dictionary.
Once you want to write the file out, you can just serialize out your dictionary by iterating over the keys and rejoining all the elements using |.
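The update and write-out steps described above could look roughly like this. Both function names are hypothetical, and the sketch assumes the mapping has the shape shown earlier (integer code mapped to a `(name, destination)` tuple) and that an empty input means "keep the old value", as in the question.

```python
def update_entry(mapping, code, name="", destination=""):
    """Replace the fields stored for `code`; an empty string keeps the old value."""
    old_name, old_destination = mapping[code]
    mapping[code] = (name or old_name, destination or old_destination)


def save_mapping(mapping, filename):
    """Write the mapping back as code|name|destination lines."""
    with open(filename, "w") as outfile:
        for code, fields in mapping.items():
            outfile.write("|".join([str(code), *fields]) + "\n")
```

With this, each file is read once, changed in memory as many times as needed, and written once at the end, so the seven-line replace block does not have to be repeated for every input.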
If you want to use lists
If you want to stick with just using lists for everything, that is possible.
I would still recommend reading your file into a list, like so:
def load_file(filename):
    with open(filename, 'r') as infile:
        data = infile.readlines()
    items = [(int(k), i, d) for k,i,d in
             (x.strip().split('|') for x in data)]
    # Your list now looks like
    # [(312, 'chessburger', 'Denmark'), (621, 'chesscake', 'USA')]
    return items
Then when you get some user input, you have to traverse the list and find the tuple with what you want inside.
For example, say the user entered code 312, you could find the tuple that contained the 312 value from the list of tuples with this:
items = load_file(filename)
# Get input for 'code' from user
code = int(input(">> "))
# Get the position in the list where the item with this code is
try:
    list_position = [item[0] for item in items].index(code)
    # Do whatever you need to (ask for more input?)
    # If you have to overwrite the element, just reassign its
    # position in the list with
    # items[list_position] = (code, blah, blah)
except ValueError:
    # list.index raises ValueError when the value is not in the list,
    # which means the code the user entered wasn't found
    # Here you do what you need to (maybe add a new item to the list),
    # but I'm just going to pass
    pass
