mysql load data infile for a list of files

I am using Ubuntu 12.04. I have a folder full of .csv files that I need to import into a MySQL database on the local machine. Currently, I load the csv files into the database one by one from the mysql command line, using this syntax:
load data local infile 'file_name.csv' into table table_name fields terminated by ',' optionally enclosed by '"' lines terminated by '\r\n';
This works really well. I want to know if there is a way that I could load all these files at once. My first idea was to make a python script to handle it:
import MySQLdb as mysql
import os
import string
db=mysql.connect(host="localhost",user="XXXX",passwd="XXXX",db="test")
l = os.listdir(".")
for file_name in l:
    print file_name
    c=db.cursor()
    if (file_name.find("DIV.csv")>-1):
        c.execute("""load data local infile '%s' into table cef_div_table fields terminated by ',' optionally enclosed by '"' lines terminated by '\r\n';""" % file_name)
With this solution, I am running into the problem that load data local infile will not work with the new versions of MySQL clients, unless I start MySQL from the command line with the --local-infile option. That is really a drag...
I found a solution that seemed to work: I pass the local_infile = 1 option when establishing the connection in Python (as suggested here: MySQL LOAD DATA LOCAL INFILE Python). This way, the code appears to complete without any errors, but nothing is ever uploaded to the database.
Strangely, when I tried uploading a single file from the mysql command line just to make sure, it worked fine.
I am willing to try another solution to this problem of uploading multiple csv files into mysql all at once. Any help is greatly appreciated!
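One possible cause of the "no errors, but nothing uploaded" symptom is a missing commit: MySQLdb does not autocommit by default, so on a transactional table the loaded rows can be discarded when the script exits. This is an assumption, since the question does not say which storage engine the table uses. A minimal sketch of the loop with an explicit commit follows; the helper names (find_csv_files, build_load_statement, load_all) are made up for illustration, and the suffix test uses endswith rather than the find call in the question.

```python
import os


def find_csv_files(folder, suffix="DIV.csv"):
    """Return sorted paths of files in `folder` whose names end with `suffix`."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.endswith(suffix)
    )


def build_load_statement(path, table):
    """Build a LOAD DATA LOCAL INFILE statement for one file.

    The path and table name are interpolated into the SQL text, so they
    must come from a trusted source (here: the local directory listing).
    """
    # Escape backslashes and single quotes for the quoted file name
    safe_path = path.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE %s "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\r\\n'" % (safe_path, table)
    )


def load_all(connection, folder, table, suffix="DIV.csv"):
    """Run one LOAD DATA statement per matching file, then commit once."""
    cursor = connection.cursor()
    paths = find_csv_files(folder, suffix)
    for path in paths:
        cursor.execute(build_load_statement(path, table))
    connection.commit()  # make the loaded rows permanent
    cursor.close()
    return paths


# Hypothetical usage, assuming MySQLdb is installed and the server allows
# LOCAL INFILE:
#   import MySQLdb
#   db = MySQLdb.connect(host="localhost", user="XXXX", passwd="XXXX",
#                        db="test", local_infile=1)
#   load_all(db, ".", "cef_div_table")
```

Committing once at the end keeps the whole import in one transaction; committing inside the loop instead would keep the files that loaded successfully if a later one fails.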

Related

Exporting zipped folder only with csv content

1. I am using Oracle, and the idea is to use a Python script to export tables as a zipped folder containing a csv file which holds my data.
2. Additionally: is it possible to save this data in the csv per column? For example, I have 4 columns, and all of them are stored in 1 column in the csv.
This is my script:
import os
import cx_Oracle
import csv
import zipfile
connection = cx_Oracle.connect('dbconnection')
SQL = "SELECT * FROM AIR_CONDITIONS_YEARLY_MVIEW ORDER BY TIME"
filename = "sample.csv"
cursor = connection.cursor()
cursor.execute(SQL)
with open(filename, 'w') as output:
    writer = csv.writer(output)
    writer.writerow([i[0] for i in cursor.description])
    writer.writerows(cursor.fetchall())
air_zip = zipfile.ZipFile("sample.zip", 'w')
air_zip.write(filename, compress_type=zipfile.ZIP_DEFLATED)
cursor.close()
connection.close()
air_zip.close()
My code exports both the csv file and the zipped folder with the proper csv file inside, but I want to export only the zipped folder!
In other words, sample.zip (containing sample.csv, as expected) and a separate sample.csv are generated at the same time.

There are two problems:
1. The .csv file is not properly formatted (a row is seen as a single record (string) instead of a sequence of records):
Looking (blindly) at the code and at the documentation of the tools (csv.writer, cx_Oracle), it seems correct.
When I noticed that the file is opened with Excel, I remembered that at some point I had a similar issue. A quick search yielded [SuperUser]: How to get Excel to interpret the comma as a default delimiter in CSV files?, and this was the culprit (the .csv file looks fine in an online viewer / editor).
2. The code "exporting" both .csv and .zip files (I don't know what export means; I assume generate, meaning that after running the code, both files are present):
The way around this is to delete the .csv file after it has been archived into the .zip file. Translated into code, that means adding at the end of the current script snippet:
os.unlink(filename)
As a final observation (if one wants to be pedantic), the lines that close the cursor and the database connection could be moved just after the with block or before the air_zip creation (there's no point keeping them open while archiving).
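An alternative to deleting the .csv afterwards is to never create it: build the CSV text in memory and write it straight into the archive with ZipFile.writestr. This is only a sketch, not what the question's script does; the function name write_zipped_csv is made up, and it assumes the whole result set fits in memory.

```python
import csv
import io
import zipfile


def write_zipped_csv(zip_path, csv_name, header, rows):
    """Write header + rows as a CSV member of a new zip archive.

    The CSV text is built in memory, so no standalone .csv file is created.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # writestr() encodes a str payload as UTF-8
        archive.writestr(csv_name, buffer.getvalue())


# Hypothetical usage with the cursor from the question:
#   write_zipped_csv("sample.zip", "sample.csv",
#                    [col[0] for col in cursor.description],
#                    cursor.fetchall())
```

For very large tables the temporary-file approach plus os.unlink may still be preferable, since it does not hold the whole export in memory.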

Error Exporting .csv from PostgreSQL using Python 3

I have a simple PostgreSQL copy statement that copies a table from a network host (NETWORK-HOST) database to a .csv file on a shared network folder. It works fine in PGAdmin, but when I transfer it to a psycopg2 script, it tells me permission denied. I have double checked to make sure full control is granted to my username on the network share, but that has not made a difference. I am running Windows 10, Python 3 (32 bit), PostgreSQL 9.5.1.
Here is the script in PGAdmin that runs successfully:
copy "Schema".county_check_audit to '\\NETWORK-HOST\NetworkFolder\county_check.csv' delimiter ',' CSV HEADER;
Here is the script where I get the permission error:
import psycopg2
connection = psycopg2.connect(database="db", user="postgres", password="password", host="NETWORK-HOST")
cursor = connection.cursor()
cursor.execute("copy \"Schema\".area_check_audit to '\\\\NETWORK-HOST\\NetworkFolder\\area_check.csv' delimiter ',' CSV HEADER;")
connection.commit()
This is the error:
psycopg2.ProgrammingError: could not open file "\\NETWORK-HOST\NetworkFolder\area_check.csv" for writing: Permission denied
Any insights are greatly appreciated.

According to the error message, write access to the file is missing. Note that COPY ... TO a file name is executed by the PostgreSQL server process, so it is the account the server runs under, not your own user name, that needs write permission on the share.
To change a file's security access on Windows, check: Permission denied when trying to import a CSV file from PGAdmin
I'd suggest testing your code first by writing to a file on the same host; once you are sure your code is fine, you can debug the access rights to the file on the other host.
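If fixing the server-side permissions is not practical, another route is to have the server send the data to the client and write the file from Python. psycopg2 cursors provide copy_expert for this, combined with COPY ... TO STDOUT. The sketch below is an assumption about how that could look here; the function name export_table_to_csv is made up, and the table name is interpolated as-is, so it must be trusted.

```python
def export_table_to_csv(connection, table, csv_path):
    """Stream `table` as CSV (with a header row) into a file on the client.

    `connection` is expected to be a psycopg2 connection. `table` is put
    into the SQL text unchanged, so it must be a trusted identifier.
    """
    sql = "COPY %s TO STDOUT WITH (FORMAT csv, HEADER, DELIMITER ',')" % table
    cursor = connection.cursor()
    try:
        # The file is opened by this script, so your own user's
        # permissions on the share apply, not the server's.
        with open(csv_path, "w", newline="") as output:
            cursor.copy_expert(sql, output)
    finally:
        cursor.close()


# Hypothetical usage with the connection from the question:
#   export_table_to_csv(connection, '"Schema".area_check_audit',
#                       r'\\NETWORK-HOST\NetworkFolder\area_check.csv')
```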

Optparse to find a string

I have a MySQL database and I am trying to print all the test results of a specific student. I am trying to create a command-line tool where I enter the username and it then shows his/her test results.
I already visited this page, but I couldn't find my answer there:
optparse and strings
#after connecting to mysql
cursor.execute("select * from database")
def main():
    parser = optparse.OptionParser()
    parser.add_option("-n", "--name", type="string", help = "student name")
    (options, args) = parser.parse_args()
    studentinfo = []
    f = open("Index", "r")
    #Index is inside database, it is a folder holds all kinds of files

Well, the first thing you should do is not use optparse, as it's deprecated - use argparse instead. The help I linked you to is quite useful and informative, guiding you through creating a parser and setting the different options. After reading through it you should have no problem accessing the variables passed from the command line.
However, there are other errors in your script as well that will prevent it from running. First, you can't open a directory with the open() command - you need to use os.listdir() for that, then read the resulting list of files. It is also very much advisable to use a context manager when open()ing files:
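import os

index_dir = "/path/to/Index"
for filename in os.listdir(index_dir):
    # os.listdir() returns bare names, so join them with the directory
    with open(os.path.join(index_dir, filename), "r") as f:
        for line in f:
            pass  # do stuff with each line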
This way you don't need to worry about closing the file handle later on, and it's just a generally cleaner way of doing things.
You don't provide enough information in your question as to how to get the student's scores, so I'm afraid I can't help you there. You'll (I assume) have to connect the data that's coming out of your database query with the files (and their contents) in the Index directory. I suspect that if the student scores are kept in the DB, then you'll need to retrieve them from the DB using SQL, instead of trying to read raw files in the filesystem. You can easily get the name of the student of interest from the command line, but then you'll have to pass it into a SQL query (as a query parameter rather than by string formatting) that selects the rows corresponding to that student's test scores, and then process the results with Python to print out a pretty summary.
Good luck!
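To make the last step concrete, here is a rough sketch of reading the name with argparse and passing it to a parameterized query. Everything about the data is assumed: the results table, its columns, and the sample rows are invented, and an in-memory sqlite3 database stands in for MySQL so that the example runs on its own.

```python
import argparse
import sqlite3


def parse_args(argv=None):
    """Parse the command line; `argv=None` means use sys.argv."""
    parser = argparse.ArgumentParser(description="Print a student's test results")
    parser.add_argument("-n", "--name", required=True, help="student name")
    return parser.parse_args(argv)


def make_demo_db():
    """Create an in-memory stand-in database with invented rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE results (student TEXT, test_name TEXT, score INTEGER)"
    )
    connection.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [("alice", "math", 90), ("alice", "physics", 85), ("bob", "math", 70)],
    )
    return connection


def fetch_results(connection, student_name):
    """Return (test_name, score) rows for one student."""
    cursor = connection.cursor()
    # The driver substitutes the value safely. sqlite3 uses "?" as the
    # placeholder, while MySQLdb uses "%s".
    cursor.execute(
        "SELECT test_name, score FROM results WHERE student = ? ORDER BY test_name",
        (student_name,),
    )
    rows = cursor.fetchall()
    cursor.close()
    return rows


def main(argv=None):
    args = parse_args(argv)
    connection = make_demo_db()
    for test_name, score in fetch_results(connection, args.name):
        print("%s: %s" % (test_name, score))
    connection.close()


# Example call, equivalent to running the script with "--name alice":
#   main(["--name", "alice"])
```

With a real MySQL connection, only the connection setup and the placeholder style in fetch_results would need to change.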

How to save many CSV files from URL

I have many CSV files that I need to get from a URL. I found this reference: How to read a CSV file from a URL with Python?
It does almost the thing I want, but I don't want to go through Python to read the CSV and then have to save it. I just want to directly save the CSV file from the URL to my hard drive.
I have no problem with for loops and cycling through my URLs. It is simply a matter of saving the CSV file.

If all you want to do is save a csv, then I wouldn't suggest using python at all. In fact this is more of a unix question. Making the assumption here that you're working on some kind of *nix system I would suggest just using wget. For instance:
wget http://someurl/path/to/file.csv
You can run this command directly from python like so:
import subprocess

# map each URL to the local file name it should be saved under
save_locations = {'http://someurl/path/to/file.csv': 'test.csv'}
for url, filename in save_locations.items():
    # pass the arguments as a list so no string splitting is needed;
    # check_call raises an error if wget exits with a failure code
    subprocess.check_call(["wget", "-O", filename, url])
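If wget is not available, the same thing can be done with the standard library alone, without parsing the CSV at all: the response is copied byte for byte into a file. A minimal sketch for Python 3 follows; the function name save_url is made up.

```python
import shutil
from urllib.request import urlopen


def save_url(url, destination):
    """Stream the response body for `url` straight into the file `destination`."""
    with urlopen(url) as response, open(destination, "wb") as output:
        # copyfileobj copies in chunks, so large files are not held in memory
        shutil.copyfileobj(response, output)
    return destination


# Hypothetical usage, with the placeholder URL from above:
#   save_url("http://someurl/path/to/file.csv", "test.csv")
```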

load multiple txt files into mysql

I have more than 40 txt files that need to be loaded into a table in MySQL. Each file contains 3 columns of data, and each column holds one specific type of data. The format of every txt file is exactly the same, but the file names vary. First I tried LOAD DATA LOCAL INFILE 'path/*.txt' INTO TABLE xxx
because I thought that using *.txt might let MySQL load all the txt files in the folder. But it turned out that it does not.
So how can I get MySQL or Python to do this? Or do I need to merge them into one file manually first, and then use the LOAD DATA LOCAL INFILE command?
Many thanks!

If you want to avoid merging your text files, you can easily "scan" the folder and run the SQL import query for each file:
import os
for dirpath, dirsInDirpath, filesInDirPath in os.walk("yourFolderContainingTxtFiles"):
    for myfile in filesInDirPath:
        # the file name has to be quoted inside the SQL statement
        sqlQuery = "LOAD DATA INFILE '%s' INTO TABLE xxxx (col1,col2,...);" % os.path.join(dirpath, myfile)
        # execute the query here using your mysql connector.
        # I used string formatting to build the query, but you should use the safe placeholders provided by the mysql api instead of %s, to protect against SQL injections

Another way is to merge your data into one file. That's fairly easy using Python:
NB_FILES = 40  # set this to the number of input files (file1.txt, file2.txt, ...)
# "w" starts from an empty out.txt, so rerunning does not duplicate rows
with open("out.txt", "w") as fout:
    # range() excludes its upper bound, hence NB_FILES + 1
    for num in range(1, NB_FILES + 1):
        with open("file" + str(num) + ".txt") as f:
            for line in f:
                fout.write(line)
Then run the command you already know (... INFILE ...) to load the single file into MySQL. This works fine as long as the column separator is strictly the same in every file. Tabs are best in my opinion ;)
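Since the question says the file names vary, a numbered loop may not fit. A variation that picks the files up with glob could look like the sketch below. The function name merge_text_files is made up; it also adds a newline after any file that lacks a final one, so that rows from consecutive files do not run together.

```python
import glob
import os


def merge_text_files(pattern, output_path):
    """Concatenate all files matching `pattern` into `output_path`.

    Returns the list of merged input paths in the order they were written.
    """
    output_abs = os.path.abspath(output_path)
    # Skip the output file itself in case it matches the pattern
    inputs = [
        path for path in sorted(glob.glob(pattern))
        if os.path.abspath(path) != output_abs
    ]
    with open(output_path, "w") as fout:
        for path in inputs:
            last_line = ""
            with open(path) as fin:
                for line in fin:
                    fout.write(line)
                    last_line = line
            # keep rows from consecutive files from running together
            if last_line and not last_line.endswith("\n"):
                fout.write("\n")
    return inputs


# Hypothetical usage:
#   merge_text_files("path/*.txt", "merged_output.dat")
```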
