As an attorney, I am a total newbie in programming. As an enthusiastic newbie, I am learning a lot (what variables are, etc.).
So I'm working a lot with dir() and looking at the results. It would be nicer if I could see the output in one or more columns, so I want to write my first program, which writes, for example, dir(sys) to an output file in columns.
So far I've got this:
import sys

textfile = open('output.txt', 'w')
syslist = dir(sys)
for x in syslist:
    print(x)
The output on the screen is exactly what I want, but when I use .write like this:
textfile = open('output.txt', 'w')
syslist = dir(sys)
for x in syslist:
    textfile.write(x)
textfile.close()
the text in the file runs together instead of being on separate lines.
Can anyone please help me with how to write the output of dir(sys) to a file in columns?
If I may ask, please show the easiest way, because I really have to look up almost every word you write in the documentation. Thanks in advance.
print adds a newline after the printed string by default; file.write doesn't. You can do:
for x in syslist: textfile.write("%s\n" % x)
...to add newlines as you're appending. Or:
for x in syslist: textfile.write("%s\t" % x)
...for tabs in between.
I hope this is clear for you "prima facie" ;)
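Putting that together with your own snippet, here is a minimal sketch of the newline version:

import sys

textfile = open('output.txt', 'w')
for x in dir(sys):
    textfile.write("%s\n" % x)  # write() needs the newline added explicitly
textfile.close()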
The other answers seem to be correct if they've guessed that you're trying to add the newlines that .write doesn't provide. But since you're new to programming, I'll point out some good practices in Python that end up making your life easier:
with open('output.txt', 'w') as textfile:
    for x in dir(sys):
        textfile.write('{f}\n'.format(f=x))
The 'with' uses 'open' as a context manager. It automatically closes the file it opens, and allows you to see at a quick glance where the file is open. Only keep things inside the context manager that need to be there. Also, using .format is often encouraged.
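If you actually want the names laid out in columns rather than one per line, str.format's field widths can pad each name to a fixed width. Here is a rough sketch; the column count of 3 and the width of 30 are just guesses, so adjust them to taste:

import sys

names = dir(sys)
with open('output.txt', 'w') as textfile:
    for i in range(0, len(names), 3):    # take the names three at a time
        row = names[i:i + 3]
        textfile.write(''.join('{:<30}'.format(n) for n in row) + '\n')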
Welcome to Python!
The following code will give you a tab-separated list in three columns, but it won't justify the output for you. It's not fully optimized so it should be easier to understand, and I've commented the portions that were added.
textfile = open('output.txt', 'w')
syslist = dir(sys)

MAX_COLUMNS = 3  # Maximum number of columns to print
colcount = 0     # Track the column number

for x in syslist:
    # First thing we do is add one to the column count when
    # starting the loop. Since we're doing some math on it below,
    # we want to make sure we don't divide by zero.
    colcount += 1
    textfile.write(x)
    # After each entry, add a tab character ("\t")
    textfile.write("\t")
    # Now, check the column count against MAX_COLUMNS. We
    # use the modulus operator (%) to get the remainder after dividing;
    # any number divisible by 3 in our example will return 0
    # via modulus.
    if colcount % MAX_COLUMNS == 0:
        # Now write out a newline ("\n") to move to the next line.
        textfile.write("\n")

textfile.close()
Hope that helps!
I'm a little confused by your question, but I imagine the answer is as simple as adding in tabs. So change textfile.write(x) to textfile.write(x + "\t"), no? You can adjust the number of tabs based on the size of the data.
I'm editing my answer.
Note that dir(sys) gives you a list of string values. These string values do not have any formatting. The print(x) call adds a newline character by default, which is why you are seeing them each on their own line. However, write does not, so when you call write, you need to add any necessary formatting yourself. If you want it to look identical to the output of print, you need to use write(x + "\n") to get the newline characters that print was automatically including.
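As a side note, if you are on Python 3 (your print(x) with parentheses suggests you might be), print itself can write to a file and keeps its automatic newline. A minimal sketch:

import sys

with open('output.txt', 'w') as textfile:
    for x in dir(sys):
        print(x, file=textfile)  # print adds the newline for you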
I am trying to compare two lists in Python and produce two arrays that contain matching rows and non-matching rows, but the program prints the data in an ugly format. How can I go about cleaning it up?
If you want to read the files without the \n characters, you might consider doing the following:
lines = [line.strip("\n") for line in list1.readlines()]
lines2 = [line.strip("\n") for line in list2.readlines()]
Note that readlines() on its own keeps the trailing "\n" on each line; the .strip("\n") is what removes it.
The "ugly format" might be because you are using print(match) (which is actually translated by Python to print ( repr(match) ), printing something that is more useful for debugging or as input back to Python - but not 'nice'.
If you want it printed 'nicely', you'd have to decide what format that would be and write the code for it. In the simplest case, you might do:
for i in match:
    print(i)
(Note that your original list contains \n characters; that's what iterating over an open text file gives you. They will get printed as well, together with the \n added by print() itself. I don't know if you want them removed or not. See the other answer for possible ways of getting rid of them.)
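A minimal sketch that strips them before printing, assuming match is the list of lines read from your file:

for i in match:
    print(i.strip("\n"))  # drop the trailing newline that came from the file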
I want to write to a file in tabular format, and the following is the code I have written so far.
file_out=open("testing_string","w")
file_out.write("{0:<12} {1:<20} {2:<30}\n".format("TUPLE","LOGFILE STATUS","FSDB STATUS"))
file_out.write("{0:12}".format("Check"))
file_out.write("{0:12}".format("_5"))
file_out.close()
The file testing_string looks like this:
TUPLE        LOGFILE STATUS       FSDB STATUS
Check        _5
The problem is that I want _5 to be right next to Check. Please note that I cannot concatenate Check with _5, because Check is printed to the file first; then, depending on some logic, I fill in LOGFILE STATUS and FSDB STATUS. If I am unable to fill in a status, I then check whether I have to append _5 or not, so I cannot simply concatenate the strings.
How can I then print _5 right next to Check?
In a perfect world, you wouldn't do what is given in the below answer. It is hacky and error-prone and really weird. In a perfect world you would figure out how to write out what you want before you actually write to disk. I assume the only reason you are even considering this is that you are maintaining some old and crusty legacy code and cannot do things "the right way".
This is not the most elegant answer, but you can use the backspace character to overwrite something previously written.
with open('test.txt', 'w') as file_out:
    file_out.write("{0:<12} {1:<20} {2:<30}\n".format("TUPLE", "LOGFILE STATUS", "FSDB STATUS"))
    file_out.write("{0:12}".format("Check"))
    backup_amount = 12 - len("Check")
    file_out.write("\b" * backup_amount)
    file_out.write("{0:12}".format("_5"))
Output:
TUPLE        LOGFILE STATUS       FSDB STATUS
Check_5
This only works in this specific case because we are completely overwriting the previously written characters with new characters; the backspace merely backs up the cursor but does not actually overwrite the previously written data. Observe:
with open('test.txt', 'w') as f:
    f.write('hello')
    f.write('\b\b')
    f.write('p')
Output:
helpo
Since we backspaced two characters but only wrote one, the original second character still exists. You would have to manually write ' ' characters to overwrite these.
Because of this caveat, you will probably have to start messing with the length of the format codes (i.e. '{0:12}' might need to become '{0:5}' or something else) when you add '_5'. It's going to become messy.
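For contrast, here is a sketch of the "decide first, write once" approach mentioned at the top: build each cell in memory and only write the row once its contents are final. The condition below is a stand-in for whatever your real logic is:

first_cell = "Check"
append_suffix = True  # stand-in for the logic that later decides whether "_5" is needed

if append_suffix:
    first_cell += "_5"

with open('test.txt', 'w') as file_out:
    file_out.write("{0:<12} {1:<20} {2:<30}\n".format("TUPLE", "LOGFILE STATUS", "FSDB STATUS"))
    file_out.write("{0:<12}".format(first_cell))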
The problem is that you are specifying 12 characters for Check. Try this:
file_out=open("testing_string","w")
file_out.write("{0:<12} {1:<20} {2:<30}\n".format("TUPLE","LOGFILE STATUS","FSDB STATUS"))
file_out.write("{0:5}".format("Check"))
file_out.write("{0:7}".format("_5"))
file_out.close()
I have a long text file that I am trying to pull certain strings out of. The lengths of these strings vary, but they are always located after certain identifiers. So, for example, say my text file looks like this:
junk text...
Name:
Age:
Robert
twenty
four.
junk text...
I always know that the "Robert" string is located after "Age:\n\n", but I am not sure how long it is, only that it will end at a "\n\n"; the same principle applies to the "twenty four." string. I have tried using:
namepos1 = string.find("Age:")
namepos2 = namepos1 + 6
This gives the starting location of the string I want, but I do not know how to save it into a variable so that it always captures the whole string up to the two newline characters. If it were a set length and not variable, I think I could use:
name = string[namepos2:length]
Any help would be greatly appreciated. I may have to go about this completely differently, but this is the first way I thought of and tried.
Thanks!
You could do this by finding "Age", then moving your cursor forward two lines. If you want the entire section of text after the "junk", and you know how long that text is, this would also work:
lookup = 'age'
lines = []
with open('C:/Users/Luke/Desktop/Summer 2016/Programs/untitled5.txt') as myFile:
    for num, line in enumerate(myFile, 1):
        if lookup in line:
            lines.append(num + 2)

ofile = open('C:/Users/Luke/Desktop/Summer 2016/Programs/untitled5.txt')
line = ofile.readlines()
interestinglines = ''
for i in range(len(lines)):
    interestinglines += (line[lines[i]] + '\n')
You may need to tinker with it a bit, but I believe this should mostly reproduce what you're looking for. The '\n' is added onto line[lines[i]] so that you can save it to a new file.
After you have found the location in the string, you can split the rest of the string by \n\n and take the first item:
s = file_str[namepos2:]
name = s.split('\n\n')[0]
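Tying that to the find() from the question, a rough end-to-end sketch (the filename is a placeholder, and this assumes the file really does contain "Age:" followed by two newlines):

file_str = open('data.txt').read()   # the whole file as one string

namepos1 = file_str.find("Age:")     # locate the identifier
namepos2 = namepos1 + 6              # skip past "Age:" and the "\n\n"
name = file_str[namepos2:].split('\n\n')[0]
print(name)                          # "Robert" for the sample text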
I am new to python and have a question:
I have checked similar questions, the Dive Into Python tutorial, the Python documentation, Google and Bing, similar Stack Overflow questions, and a dozen other tutorials.
I have a section of python code that reads a text file containing 20 tweets. I am able to extract these 20 tweets using the following code:
with open('output.txt') as fp:
    for line in iter(fp.readline, ''):
        Tweets = json.loads(line)
        data.append(Tweets.get('text'))

i = 0
while i < len(data):
    print data[i]
    i = i + 1
The above while loop iterates perfectly and prints out the 20 tweets (lines) from output.txt.
However, these 20 lines contain non-English character data like "Los ladillo a los dos, soy maaaala o maloooooooooooo", URLs like "http://t.co/57LdpK", the string "None", and photos with a URL, like "Photo: http://t.co/kxpaaaaa" (I have edited this for privacy).
I would like to purge the output of this (which is a list), and exclude the following:
The None entries
Anything beginning with the string "Photo:"
It would also be a bonus if I could exclude non-Unicode data
I have tried the following bits of code
Using data.remove("None:") but I get the error list.remove(x): x not in list.
Reading the items I do not want into a set and then doing a comparison on the output but no luck.
Researching list comprehensions, but I wonder if I am looking at the right solution here.
I am from an Oracle background, where there are functions to chop out any wanted/unwanted section of output, so I have really gone round in circles on this for the last 2 hours. Any help is greatly appreciated!
Try something like this:
def legit(string):
    if string.startswith("Photo:") or "None" in string:
        return False
    else:
        return True

whatyouwant = [x for x in data if legit(x)]
I'm not sure if this will work out of the box for your data, but you get the idea. If you're not familiar, [x for x in data if legit(x)] is called a list comprehension
First of all, only add Tweets.get('text') if there is a 'text' entry:
with open('output.txt') as fp:
    for line in iter(fp.readline, ''):
        Tweets = json.loads(line)
        if 'text' in Tweets:
            data.append(Tweets['text'])
That'll not add None entries (.get() returns None if the 'text' key is not present in the dictionary).
I'm assuming here that you want to further process the data list you are building here. If not, you can dispense with the for entry in data: loops below and stick to one loop with if statements. Tweets['text'] is the same value as entry in the for entry in data loops.
Next, you are looping over python unicode values, so use the methods provided on those objects to filter out what you don't want:
for entry in data:
    if not entry.startswith("Photo:"):
        print entry
You can use a list comprehension here; the following would print all entries too, in one go:
print '\n'.join([entry for entry in data if not entry.startswith("Photo:")])
In this case that doesn't really buy you much, as you are building one big string just to print it; you may as well just print the individual strings and avoid the string building cost.
Note that all your data is Unicode data. What you perhaps wanted is to filter out text that uses codepoints beyond ASCII. You could use a regular expression to detect codepoints beyond ASCII in your text:
import re

nonascii = re.compile(ur'[^\x00-\x7f]', re.UNICODE)  # all codepoints beyond 0x7F are non-ASCII

for entry in data:
    if entry.startswith("Photo:") or nonascii.search(entry):
        continue  # skip the rest of this iteration, continue to the next
    print entry
Short demo of the non-ASCII expression:
>>> import re
>>> nonascii = re.compile(ur'[^\x00-\x7f]', re.UNICODE)
>>> nonascii.search(u'All you see is ASCII')
>>> nonascii.search(u'All you see is ASCII plus a little more unicode, like the EM DASH codepoint: \u2014')
<_sre.SRE_Match object at 0x1086275e0>
with open('output.txt') as fp:
    for line in fp.readlines():
        Tweets = json.loads(line)
        if 'text' not in Tweets:
            continue
        txt = Tweets.get('text')
        if txt.replace('.', '').replace('?', '').replace(' ', '').isalnum():
            data.append(txt)
            print txt
Small and simple.
Basic principle: one loop; if the data matches your "OK" criteria, add it and print it.
As Martijn pointed out, 'text' might not be in all the Tweets data.
A regexp replacement for the chained .replace() calls would go something along the lines of if re.match(r'^[\w-\ ]+$', txt) is not None: (it will not work for blankspace and the like, so adjust it as needed).
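If you want a regex check that also tolerates spaces and a bit of punctuation, something along these lines might work better; the function name and the exact character set here are just placeholders, so widen the set as needed:

import re

def looks_plain(txt):
    # letters, digits, underscore, whitespace and a little punctuation only
    return re.match(r'^[\w\s.,?!-]+$', txt) is not None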
I'd suggest something like the following:
# use itertools.ifilter to remove items from a list according to a function
from itertools import ifilter
import json
import re

# write a function to filter out entries you don't want
def my_filter(value):
    if not value or value.startswith('Photo:'):
        return False
    # exclude entries containing unwanted (non-ASCII) chars
    if re.search('[^\x00-\x7F]', value):
        return False
    return True

# Reading the data can be simplified with a list comprehension
with open('output.txt') as fp:
    data = [json.loads(line).get('text') for line in fp]

# do the filtering
data = list(ifilter(my_filter, data))

# print the output
for line in data:
    print line
Regarding Unicode: assuming you're using Python 2.x, the open function won't read the data as unicode; it will be read as the str type. You might want to decode it if you know the encoding, or read the file with a given encoding using codecs.open.
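For example, a minimal sketch using codecs.open, assuming the file is UTF-8 encoded (the encoding name is an assumption):

import codecs
import json

with codecs.open('output.txt', encoding='utf-8') as fp:
    data = [json.loads(line).get('text') for line in fp]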
Try this:
with open('output.txt') as fp:
    for line in iter(fp.readline, ''):
        Tweets = json.loads(line)
        data.append(Tweets.get('text'))

i = 0
while i < len(data):
    # these conditions will skip (continue) over the iterations
    # matching your first two conditions
    if data[i] is None or data[i].startswith("Photo"):
        i = i + 1
        continue
    print data[i]
    i = i + 1
I can't seem to figure out how to take values given in a text file and import them into Python to create a list. What I'm trying to accomplish here is to create a gameboard and then put numbers on it as a sample set. I have to use Quickdraw to accomplish this; I kind of know how to get the numbers onto Quickdraw, but I cannot seem to import the numbers from the text file. Previous assignments involved getting the user to input values or using I/O redirection; this is a little different. Could anyone assist me with this?
It depends on the contents of the file you want to read and on the output you want in the list.
# assuming you have values each on a separate line
values = []
for line in open('path-to-the-file'):
    values.append(line)
    # might want to implement stripping newlines and such in here
    # by using line.strip() or .rstrip()

# or perhaps more than one value in a line, with some separator
values = []
for line in open('path-to-the-file'):
    # e.g. ':' as a separator
    separator = ':'
    line = line.split(separator)
    for value in line:
        values.append(value)

# or all in one line with separators
values = open('path-to-the-file').read().split(separator)
# might want to use .strip() on this one too, before the split method
It could be more accurate if we knew the input and output requirements.
Two steps here:
open the file
read the lines
This page might help you: http://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects
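A minimal sketch of those two steps, assuming one number per line (the filename is a placeholder):

with open('board.txt') as f:          # step 1: open the file
    lines = f.read().splitlines()     # step 2: read the lines, without trailing newlines

numbers = [int(value) for value in lines if value.strip()]  # e.g. turn each line into a number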