Splitting a list into strings - python

I have a variable that contains multiple lists. I'm trying to split the lists into strings so that I can add them to a csv file but I'm not sure how.
This is what I have been trying to do. For some reason, iterating through the different lists (i.e. participants) doesn't seem to work properly. It only uses the last list instead.
results = open("results.csv", "w")
strings = ""
for participant in contents:
    for list in participant:
        s = ""
        for x in list:
            s += x
    strings += s
results.write(f"{strings}")

with open('results.csv', 'w') as f:
    f.write('\n'.join(','.join(s) for s in contents))
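A minimal sketch of the same idea with the standard csv module, assuming contents is a list of participants and each participant is a list of strings. The sample data is made up, and io.StringIO stands in for the real file so the result can be inspected:

```python
import csv
import io

# Hypothetical sample data: one inner list per participant.
contents = [["p1", "yes", "no"], ["p2", "no", "maybe"]]

buffer = io.StringIO()  # stand-in for open("results.csv", "w", newline="")
writer = csv.writer(buffer)
writer.writerows(contents)  # one CSV row per participant

csv_text = buffer.getvalue()
print(csv_text)
```

Unlike a hand-rolled ','.join(), csv.writer also quotes any value that itself contains a comma.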

Related

Creating a function to concatenate strings based on len(array)

I am trying to build a string to send as a message from Python to Telegram.
I want the function to be modular.
It first imports the lines from a .txt file and, based on the number of lines, creates two arrays, array1[] and array2[].
array1 receives the lines of the file as strings, and array2 receives user-entered information that complements the value stored at the same position in array1. In other words:
while (k<len(list)):
    array2[k]= str(input(array1[k]+": "))
    k+=1
I want to build a single string, so that the whole list goes out in a single message, with each line built like this:
string1 = array1[pos]+": "+array2[pos]+"\n"
I have tried using a while loop that compares against the length, but I kept overwriting my own string again and again.
It looks like what you're looking for is one list that comes directly from your text file. There are lots of ways to do that, but you most likely won't want to build the list iteratively by index position. I would just append items to your list.
The accepted answer on this post has a good reference, which is basically the following:
import csv
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        pass  # do something
Which, in your case would mean something like this:
import csv
actual_text_list = []
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        actual_text_list.append(row)
user_input_list = []
for actual_text in actual_text_list:
    the_users_input = input(f'What is your response to {actual_text}? ')
    user_input_list.append(the_users_input)
This creates two lists, one with the actual text and the other with the user's input, which I think is what you're trying to do.
Alternatively, if the lines in your text file contain no duplicates, you could use a dict, which is a key-value data store. The key would be the actual_text from the file and the value the user_input. Another option is a list of lists.
import csv
actual_text_list = []
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        # each row is a list, which can't be a dict key, so keep its first field
        if row:
            actual_text_list.append(row[0])
dictionary = dict()
for actual_text in actual_text_list:
    the_users_input = input(f'What is your response to {actual_text}? ')
    dictionary[actual_text] = the_users_input
Then you could use that data like this:
for actual_text, user_input in dictionary.items():
    print(f'In response to {actual_text}, you specified {user_input}.')
list_of_strings_from_txt = ["A","B","C"]
modified_list = [f"{w}: {input(f'{w}:')}" for w in list_of_strings_from_txt]
I guess? maybe?
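To get everything into one message string, a sketch along these lines might help. It assumes array1 and array2 are already filled and have the same length; the values below are made up and replace the input() calls so the snippet runs on its own:

```python
# Hypothetical stand-ins for the lines read from the .txt file and the user's answers.
array1 = ["Name", "City", "Status"]
array2 = ["Ana", "Lisbon", "OK"]

# Pair items by position and join the pieces once,
# instead of rebuilding the string on every loop iteration.
message = "\n".join(f"{label}: {value}" for label, value in zip(array1, array2))
print(message)
```

Because the string is assembled in a single join, there is no accumulator variable to overwrite by accident.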

Writing results of NLTK FreqDist to a .csv file as a row in Python

I'm attempting to write out the results of a frequency count of specific words in a text file, based on a collection of words in a Python list (I haven't included it in the code listing as there are several hundred).
file_path = 'D:/TestHedges/Hedges_Test_11.csv'
corpus_root = test_path
wordlists = PlaintextCorpusReader(corpus_root, '.*')
print(wordlists.fileids())
CIK_List = []
freq_out = []
for filename in glob.glob(os.path.join(test_path, '*.txt')):
    CIK = filename[33:39]
    CIK = CIK.strip('_')
    # CIK = CIK.strip('_0') commented out to see if it deals with just removing _. It does not 13/9/2020
    newstext = wordlists.words()
    fdist = nltk.FreqDist([w.lower() for w in newstext])
    CIK_List.append(CIK)
with open(file_path, 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["CIK"] + word_list)
    for val in CIK_List:
        writer.writerow([val])
        for m in word_list:
            print(CIK, [fdist[m]], end='')
            writer.writerows([fdist[m]])
My problem is with writing fdist[m] as a row into a .csv file. It generates this error:
_csv.Error: iterable expected, not int
How can I re-write this to place the frequency distribution into a row in a .csv file?
Thanks in advance
You have two choices: either use writerow instead of writerows, or build a list of rows first and pass that to writer.writerows instead of fdist[m]. Each row in that list must itself be an iterable (a tuple or a list). Therefore, for writerows to work you would have to wrap the value again in a tuple:
writer.writerows([(fdist[m],)])
Here, the comma denotes a 1-value tuple.
To write all of the values in one row, instead of this code:
for m in word_list:
    print(CIK, [fdist[m]], end='')
    writer.writerows([fdist[m]])
you should use:
for m in word_list:
    print(CIK, [fdist[m]], end='')
writer.writerows(([fdist[m] for m in word_list],))
Note the list comprehension, which builds the whole row before it is written.
On a different note, judging by your code, it seems to me that you could do the same without the NLTK library by using collections.Counter from the standard library. FreqDist is a subclass of it.
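A rough sketch of the Counter route, with one row of counts written per document. The tokens, the word list, and the CIK value are all made up for illustration, and io.StringIO stands in for the real .csv file:

```python
import csv
import io
from collections import Counter

# Hypothetical inputs: a tokenized text and the words to count.
tokens = "We may possibly revise, and we may not".lower().replace(",", "").split()
word_list = ["may", "possibly", "perhaps"]

counts = Counter(tokens)  # a missing word counts as 0 instead of raising KeyError

buffer = io.StringIO()  # stand-in for open(file_path, 'w', newline='')
writer = csv.writer(buffer)
writer.writerow(["CIK"] + word_list)                          # header row
writer.writerow(["000123"] + [counts[w] for w in word_list])  # one row per document

row_text = buffer.getvalue()
print(row_text)
```

Putting the identifier and the counts into the same writerow call keeps each document on a single line of the output.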

Sort file by key

I am learning Python 3 and I'm having trouble completing this task. I'm given a file with a string on each line. I have to sort its content by the string located between the first hyphen and the second hyphen and write the sorted content into a different file. This is what I have tried so far, but nothing gets sorted:
def sort_keys(path, input, output):
    list = []
    with open(path+'\\'+input, 'r') as f:
        for line in f:
            if line.count('-') >= 1:
                list.append(line)
    sorted(list, key = lambda s: s.split("-")[1])
    with open(path + "\\"+ output, 'w') as o:
        for line in list:
            o.write(line)
sort_keys("C:\\Users\\Daniel\\Desktop", "sample.txt", "results.txt")
This is the input file: https://pastebin.com/j8r8fZP6
Question 1: What am I doing wrong with the sorting? I've used it to sort the words of a sentence by their last letter and it worked fine, but here I don't know what I am doing wrong.
Question 2: I feel that reading the content of the input file into a list, sorting the list, and then writing that content out is not very efficient. What is the "pythonic" way of doing it?
Question 3: Do you know any good exercises for learning to work with files and folders in Python 3?
Kind regards
Your sort key is fine. The problem is that sorted() returns a new list rather than altering the one provided, and your code discards that result. It's also much easier to use list comprehensions to read the file:
def sort_keys(path, infile, outfile):
    with open(path+'\\'+infile, 'r') as f:
        inputlines = [line.strip() for line in f.readlines() if "-" in line]
    outputlines = sorted(inputlines, key=lambda s: s.split("-")[1])
    with open(path + "\\" + outfile, 'w') as o:
        for line in outputlines:
            o.write(line + "\n")
sort_keys("C:\\Users\\Daniel\\Desktop", "sample.txt", "results.txt")
I also changed a few variable names, for legibility's sake.
EDIT: I understand that there are other ways of doing the sorting (such as list.sort(key=...), which sorts the list in place), however this way seems more readable to me.
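The difference between the two calls can be seen in a small sketch. The sample lines and the helper name second_field are made up for illustration:

```python
lines = ["a-zeta-1", "b-alpha-2", "c-mid-3"]  # made-up sample lines


def second_field(s):
    """Return the text between the first and second hyphen."""
    return s.split("-")[1]


result = sorted(lines, key=second_field)  # returns a new list; `lines` is untouched
unchanged = list(lines)                   # copy taken before the in-place sort

lines.sort(key=second_field)              # sorts `lines` itself and returns None
```

Calling sorted() without assigning its result, as in the question, therefore has no visible effect.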
First, your data has a couple lines without hyphens. Is that a typo? Or do you need to deal with those lines? If it is NOT a typo and those lines are supposed to be part of the data, how should they be handled?
I'm going to assume those lines are typos and ignore them for now.
Second, do you need to write out the whole line, with the lines sorted by the second group of characters between the hyphens? If that's the case...
First, read in the file:
f = open('./text.txt', 'r')
There are a couple ways to go from here, but let's clean up the file contents a little and make a list object:
l = [i.replace("\n","") for i in f]
This will create a list l with all the newline characters removed. This particular way of creating the list is called a list comprehension. You can do the exact same thing with the following code:
l = []
for i in f:
    l.append(i.replace("\n",""))
Now let's create a dictionary with the 2nd group as the key and the whole line as the value. Again, there are some lines with no hyphens, so we are going to just skip those for now with a simple try/except block:
d = {}
for i in l:
    try:
        d[i.split("-")[1]] = i
    except IndexError:
        pass
Now, here things can get slightly tricky. It depends on how you want to approach the problem. Dictionaries in Python are not sorted by key (since Python 3.7 they keep insertion order, which is not the same thing), so there is no direct way to sort the dictionary itself. ONE way (not necessarily the BEST way) is to create a sorted list of the dictionary keys:
s = sorted([k for k, v in d.items()])
Again, I used a list comprehension here, but you can rewrite that line as a loop that does the exact same thing:
s = []
for k, v in d.items():
    s.append(k)
s = sorted(s)
Now, we can write the dictionary back to a file by iterating through the dictionary using the sorted list. To see what I mean, let's print out the dictionary one value at a time using the sorted list as the keys:
for i in s:
    print(d[i])
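But instead of printing, we will now append the line to a file:
o = open('./out.txt', 'a')
for i in s:
    o.write(d[i] + "\n")
o.close()
Because the newlines were stripped when the list was built, you need the + "\n" part to put each entry on its own line. Also note that 'a' appends to whatever is already in the file, so running the script twice duplicates the output. Since the file is opened only once, before the loop, 'w' writes every line too and starts from an empty file on each run; you would end up with only the last item if you reopened the file with 'w' inside the loop.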
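One caveat with the dictionary approach: two lines that share the same key cannot both be stored, so the later one replaces the earlier. A small sketch with made-up lines shows the difference from sorting the lines directly:

```python
lines = ["x-beta-1", "y-alpha-2", "z-beta-3", "no hyphen here"]  # made-up sample

# Dictionary approach: the second "beta" line overwrites the first.
d = {}
for line in lines:
    parts = line.split("-")
    if len(parts) > 1:  # skip lines without a hyphen
        d[parts[1]] = line
via_dict = [d[k] for k in sorted(d)]

# Sorting the lines directly keeps every line that has a key.
keyed = [line for line in lines if "-" in line]
via_sort = sorted(keyed, key=lambda s: s.split("-")[1])

print(via_dict)
print(via_sort)
```

If the keys in the input file are not guaranteed to be unique, sorting the list with a key function is the safer choice.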

How to write list elements into a tab-separated file?

I have searched the web but I haven't found an answer to my problem:
I have a dictionary with lists as values, and every list has a different length. For example:
dict_with_lists[value1] = [word1, word2, word3, word4]
dict_with_lists[value2] = [word1, word2, word3]
My main problem is that I want to write the list elements into a tab-separated file, and when one list is finished the next list should start on a new line.
I found a solution like this:
with open('fname', 'w') as file:
    file.writelines('\t'.join(i) + '\n' for i in nested_list)
But it separates not only the words with tabs but also the individual characters.
If nested_list is one of your dictionary values, then you are applying '\t'.join() to the individual words. You'd want to join the whole list:
file.write('\t'.join(nested_list) + '\n')
or, if you were to loop over the values of the dictionary:
file.writelines(
    '\t'.join(nested_list) + '\n'
    for nested_list in dict_with_lists.values())
The above uses the file.writelines() method correctly; passing in an iterable of strings to write. If you were to pass in a single string, then you are only causing Python extra work as it loops over all the individual characters of that string to write those separately, after which the underlying buffer has to assemble those back into bigger strings again.
However, there is no need to re-invent the character-separated-values writing wheel here. Use the csv module, setting the delimiter to '\t':
import csv
with open('fname', 'w', newline='') as file:
    writer = csv.writer(file, delimiter='\t')
    writer.writerows(dict_with_lists.values())
The above writes all lists in the dict_with_lists dictionary to a file. The csv.writer() object doesn't mind if your lists are of differing lengths.
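A quick way to check both points, sketched with made-up data shaped like the question's dictionary. io.StringIO stands in for the real file so the output can be inspected:

```python
import csv
import io

# Hypothetical data shaped like the question's dictionary.
dict_with_lists = {
    "value1": ["word1", "word2", "word3", "word4"],
    "value2": ["word1", "word2", "word3"],
}

# Joining a single string puts a tab between its characters,
# which is the behaviour described in the question.
per_character = "\t".join("word1")

buffer = io.StringIO()  # stand-in for open('fname', 'w', newline='')
csv.writer(buffer, delimiter="\t").writerows(dict_with_lists.values())
tsv_text = buffer.getvalue()

print(repr(per_character))
print(repr(tsv_text))
```

Note that csv.writer ends each row with '\r\n' by default; pass lineterminator='\n' if plain newlines are wanted.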
You need to turn each list value in the dictionary into a string of tab-separated values that also have a '\n' newline character at the end of each one of them:
value1, value2 = 'key1', 'key2'
dict_with_lists = {}
dict_with_lists[value1] = ['word1', 'word2', 'word3', 'word4']
dict_with_lists[value2] = ['word1', 'word2', 'word3']
fname = 'dict_list_values.tsv'
with open(fname, 'w') as file:
    file.writelines(
        '\t'.join(values)+'\n' for values in dict_with_lists.values()
    )
I think you're doing a list comprehension over the list inside of the dictionary. An alternative solution would be:
with open('fname', 'w') as file:
    for nested_list in dict_with_lists.values():
        for word in nested_list:
            file.write(word + '\t')
        file.write('\n')
I'm just looping over the values of the dictionary, which are lists in this case, writing each word followed by a tab and then a newline at the end of each list (so every line ends with a trailing tab). I haven't tested it but theoretically I think it should work.
Instead of jumping straight to answering your question, I'm going to give you a hint on how to tackle your actual problem.
There is another way to store data like that (dictionaries that are not tabular): save it in JSON format.
import json
with open('fname','w') as f:
    json.dump(dict_with_lists, f)
And then the code to load it would be:
import json
with open('fname') as f:
    dict_with_lists = json.load(f)

Trouble with turning .txt file into lists

In a project I am doing I am storing the lists in a .txt file in order to save space in my actual code. I can turn each line into a separate list but I need to turn multiple lines into one list where each letter is a different element. I would appreciate any help I can get. The program I wrote is as follows:
lst = open('listvariables.txt', 'r')
data = lst.readlines()
for line in data:
    words = line.split()
    print(words)
Here is a part of the .txt file I am using:
; 1
wwwwwwwwwww
wwwwswwwwww
wwwsswwwssw
wwskssssssw
wssspkswssw
wwwskwwwssw
wwwsswggssw
wwwswwgwsww
wwsssssswww
wwssssswwww
wwwwwwwwwww
; 2
wwwwwwwww
wwwwwsgsw
wwwwskgsw
wwwsksssw
wwskpswww
wsksswwww
wggswwwww
wssswwwww
wwwwwwwww
If someone could make the program print out two lists that would be great.
You can load the whole file and turn it into a list of characters (newline characters included) like this:
with open('listvariables.txt', 'r') as f:
    your_list = list(f.read())
I'm not sure why you want to do it, though. You can iterate over a string the same way you can iterate over a list. The only advantage is that a list is a mutable object, but you probably wouldn't want to make complex changes to it anyway.
If you want each character in a string to be an element of the final list you should use
myList = list(myString)
If I understand you correctly this should work:
with open('listvariables.txt', 'r') as my_file:
    my_list = [list(line) for line in my_file.read().splitlines()]
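To end up with one list per level, as the question asks, a sketch like the following might work. It assumes that every line starting with ';' begins a new level; the text below is a shortened, made-up sample in the same layout as listvariables.txt:

```python
# A shortened, made-up sample in the same layout as listvariables.txt.
text = """; 1
www
wsw
; 2
wwww
wgpw
"""

levels = []  # one entry per "; N" section
for line in text.splitlines():
    if line.startswith(";"):
        levels.append([])              # a header line starts a new level
    elif line.strip() and levels:
        levels[-1].append(list(line))  # each character becomes one element

for level in levels:
    print(level)
```

With the real file, text would come from my_file.read() instead of the inline string.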
