So I'm trying to remove weird line breaks so that a LONG datatype field can be read into a single field in Excel. The length of the field does not matter as long as we get all the info into a single field.
After exporting the dataset from TOAD into a .txt flat file, if I open the file in Notepad, the rows are generated perfectly. However, when I open the file in Excel, weird line breaks are inserted that generate bad rows. These line breaks originate from the LONG datatype's line breaks, but I can't figure out how to remove them so that I can view the good format in Excel.
I considered loading the .txt file in Python and doing a for line in file.readlines() followed by a line.replace("\n", "") on each line, but I'm not sure whether the actual character is a "\n", and whether Python would read the bad line breaks the same way Excel does.
Anyways, it's not a huge issue, but wanted to see if there was a quick or interesting fix out there. I could always do my analysis on the .txt file.
If those line breaks are CHR(10) and/or CHR(13), you could replace them with an empty string in SELECT, e.g.
select replace(replace(col, chr(10), ''), chr(13), '') as result
from some_table
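If you'd rather clean things up after the export instead, you could post-process the .txt in Python. A rough sketch, assuming a tab-delimited export where every real record has the same number of columns (export.txt, export_fixed.txt, and NUM_COLS are hypothetical; adjust them to your file):

# Rejoin physical lines until each logical record has the expected
# number of tab-separated columns (NUM_COLS is a made-up value).
NUM_COLS = 5

with open("export.txt", encoding="utf-8") as src, \
        open("export_fixed.txt", "w", encoding="utf-8") as dst:
    buffer = ""
    for line in src:
        buffer += line.rstrip("\r\n")
        if buffer.count("\t") >= NUM_COLS - 1:
            # Complete record: write it out as a single line.
            dst.write(buffer + "\n")
            buffer = ""
        else:
            # Embedded break inside the LONG field: replace it with a space.
            buffer += " "
    if buffer:
        dst.write(buffer + "\n")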
I am relatively new to python and I am making a small game that involves importing each line of the text from a .txt file, so that it can be printed to the user. To do this, I'm using linecache.getline() to get the specific line of the file that I want while not having the whole file stored as a list. However, if I use "\n" to create a new line, then linecache.getline() automatically inserts another backslash to "cancel" it.
For example, if in the text file I write
\nHello,
linecache.getline() will store it in the variable as
\\nHello
which prints as
\nHello.
Is there any way to stop this? I can post my specific code if required.
Any help with file manipulation would be appreciated since I am very new to it. Thank you for looking at my question.
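For what it's worth, a \n typed into a text file is two literal characters, a backslash and an n, not a real newline, which is why Python shows it back as \\n in the repr. A minimal sketch of one common workaround (the file name and line number are hypothetical):

import linecache

# Convert the two-character "\n" sequences from the file into real
# newlines before printing.
line = linecache.getline("story.txt", 3)
print(line.replace("\\n", "\n"))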
I'm looking for help in understanding why my name separator script isn't working. I am working through 'Automate the Boring Stuff With Python' and had the opportunity to test some things out at work today. I recognize this is probably not the most efficient solution, but I'm trying to put my learning to work.
The Goal
I have an Excel file with first and last names in a single cell. I need to separate these into two cells, one for the first name and one for the last name.
The Process
I began by saving the Excel file as a .csv to then open in a text editor.
Used regular expressions to find the full name, grouping first and last names separately. (See the code in the link provided.)
I copy the raw .csv text to the clipboard using pyperclip (I don't know how to read from files yet).
I extract the name data using the regex.
I run a for loop which creates a string with first name + ',' + last name + ',' so that Excel will put the first and last names in different cells.
I want to end each firstName,lastName, pair with a new line so that my .csv file looks like:
firstname,lastname,
firstname2,lastname2,
etc...
I'm getting stuck on the last step. My for loop gets the firstname,lastname, pairs correct, but when I paste from the clipboard, the newline characters are not inserted. Everything is pasted as one huge string. Since I'm appending a newline character each cycle, shouldn't it paste everything on separate lines? Please help me understand what I'm missing!
Here is a link to my script: https://github.com/RNGeezus/name-separator/blob/master/name_separator.py
Here is what my .csv file looks like (recreated with dummy names to protect people's privacy):
my sample
Figured it out! Turns out I needed each pair to be followed by a \r\n. I was doing carriage returns, but no newlines. Doh!
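For anyone following along, a minimal sketch of that fix (the names list here is a made-up stand-in for the regex output):

import pyperclip

# Hypothetical (first, last) pairs standing in for the regex matches.
names = [("Alice", "Smith"), ("Bob", "Jones")]

# End every firstname,lastname, pair with \r\n -- carriage return plus
# newline -- so the pasted text breaks into separate lines.
text = "".join(first + "," + last + ",\r\n" for first, last in names)
pyperclip.copy(text)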
How can I make a Python program return to the start of the output area after writing 4 lines of data? For example, the program outputs fields 1 through 4 on separate lines; after this, the program wants to add some data to the line of field 1, but the output comes out on line 5. The program is for converting data into tabular form.
If you are writing to a file, you can use the seek() function to relocate the file pointer wherever you want. For example, f.seek(0,0) will take you to the beginning of the file and then you can output the next data item there. However, keep in mind that you'll need to first move the data that you already wrote to the file, otherwise it will be over-written; that is, you need to "make space" for the new data you want to write to the beginning of the file.
For a quick intro, see https://docs.python.org/3.5/tutorial/inputoutput.html, near the bottom of the page.
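A minimal sketch of the seek-and-rewrite idea described above (table.txt is a hypothetical file name):

# Prepend new data by saving the existing contents, seeking back to the
# start, writing the new line, and then re-writing the old contents.
with open("table.txt", "r+") as f:
    existing = f.read()
    f.seek(0, 0)
    f.write("field 1: updated\n")
    f.write(existing)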
I want to convert a csv file to a db (database) file using Python. How should I do it?
You need to find a library that helps you parse the CSV file, or read the file line by line and parse it with standard Python; it could be as simple as splitting each line on commas.
Then insert the rows into an SQLite database. The Python documentation covers the sqlite3 module. You could also use SQLAlchemy or another ORM.
Another way could be to use the sqlite shell itself.
I don't think this can be done in full generality without out-of-band information or just treating everything as strings/text. That is, the information contained in the CSV file won't, in general, be sufficient to create a semantically “satisfying” solution. It might be good enough to infer what the types probably are for some cases, but it'll be far from bulletproof.
I would use Python's csv and sqlite3 modules, and try to:
convert the cells in the first CSV line into names for the SQL columns (strip “oddball” characters)
infer the types of the columns by going through the cells in the second CSV line (the first line of data), attempting to convert each one first to an int; if that fails, try a float; and if that fails too, fall back to strings
this would give you a list of names and a list of corresponding probable types from which you can roll a CREATE TABLE statement and execute it
try to INSERT the first and subsequent data lines from the CSV file
There are many things to criticize in such an approach (e.g. no keys or indexes; it fails if a column that is generally a string happens to hold a value in the first data line that is Python-convertible to an int or float), but it'll probably work passably for the majority of CSV files.
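A rough sketch of that approach, with exactly the naive type inference described above (data.csv, data.db, and the table name are made up):

import csv
import sqlite3

def guess_type(value):
    # Try int first, then float, then fall back to TEXT -- naive on purpose.
    try:
        int(value)
        return "INTEGER"
    except ValueError:
        pass
    try:
        float(value)
        return "REAL"
    except ValueError:
        return "TEXT"

conn = sqlite3.connect("data.db")
with open("data.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    first_row = next(reader)

    # Column names from the header cells, stripped of oddball characters.
    names = ["".join(c for c in cell if c.isalnum() or c == "_")
             for cell in header]
    # Column types inferred from the first data line.
    types = [guess_type(cell) for cell in first_row]

    columns = ", ".join(n + " " + t for n, t in zip(names, types))
    conn.execute("CREATE TABLE data (" + columns + ")")

    insert = "INSERT INTO data VALUES (" + ", ".join("?" for _ in names) + ")"
    conn.execute(insert, first_row)   # the line we used for inference
    conn.executemany(insert, reader)  # everything after it
conn.commit()
conn.close()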
I want to
open file
add 4 underscore characters to the beginning of a line
find blank lines
replace the newline character in the blank lines with 50 underscore characters
add new lines before and after the 50 underscore characters
I found many similar questions on Stack Overflow, but I could not combine all these operations without getting errors. See my previous question here. Is there a simple beginner's way to accomplish this so that I can start from there? (I don't mind writing to the same file; there is no need to open two files.) Thanks.
You're going to have to pick:
Use two files, but never have to store more than 1 line in memory at a time
or
Build the new file in memory as you read the original, then overwrite the original with the new
A file isn't a flexible memory structure. You can't replace the one or two characters of a newline with 50 underscores in place; it just doesn't work like that. If you are sure the new file is going to be a manageable size and you don't mind writing over the original, you can do it without having a new file.
Myself, I would always allow the user to opt for an output file. What if something goes wrong? Disk space is super cheap.
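A minimal sketch of the two-file route, assuming every non-blank line gets the four-underscore prefix (input.txt and output.txt are placeholder names):

RULE = "_" * 50

with open("input.txt") as src, open("output.txt", "w") as dst:
    for line in src:
        if line.strip() == "":
            # Blank line: replace its newline with the 50-underscore
            # rule, with new lines before and after it.
            dst.write("\n" + RULE + "\n")
        else:
            dst.write("____" + line)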
You can do everything you want by reading the file first, performing the changes on the lines, and finally writing it back. If the file doesn't fit in memory, then you should read it in batches and create a temporary file. You can't modify the file in place.
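A minimal sketch of this read-transform-write-back variant, overwriting the original file as the asker preferred (same assumptions and placeholder names as the sketch above):

RULE = "_" * 50

with open("input.txt") as f:
    lines = f.readlines()

# Apply the same transformation to each line, then write everything back.
out = [("\n" + RULE + "\n") if line.strip() == "" else ("____" + line)
       for line in lines]

with open("input.txt", "w") as f:
    f.writelines(out)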