Why is the dbf table still open when using "with tbl.open()"? - python

I use Ethan Furman's dbf Python module (v. 0.99.3). If I use this code:
import os
import dbf

tbl = dbf.Table(os.path.join(db_pfad, tabelle + ".dbf"))
with tbl.open(mode=dbf.READ_ONLY) as tbl:
    for rec in tbl:
        ...
tbl.close()
... everything runs fine.
But as I understand the with statement, the last line tbl.close() should be redundant and superfluous: leaving the with block should close the table, shouldn't it?
However, if I omit that line, the table is left open!
Is this a bug in the dbf module, or did I misunderstand something about the with statement in Python?

When the with block is entered, the table is checked to see whether it was already open; if so, it is left open on exit -- and you are manually opening it with the .open() call.
What you want to do is:
tbl = ...
with tbl:
    # do stuff
That will open the table in read/write mode, and close it when done. If you need it to be opened read-only, then there's no point in using with:
tbl = ...
tbl.open(dbf.READ_ONLY)
for rec in tbl:
    ...
tbl.close()
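If you do open the table manually in read-only mode, here is a minimal sketch (not from the answer above, file name is hypothetical) of keeping that explicit close exception-safe with try/finally, since the with block will not close a table it did not open itself:

import dbf

tbl = dbf.Table('example.dbf')   # hypothetical file name
tbl.open(mode=dbf.READ_ONLY)
try:
    for rec in tbl:
        ...                      # process the record
finally:
    tbl.close()                  # always runs, even if the loop raises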

Related

Read_sql returning results even though SQL table not present

I think I've lost my mind. I have created a Python script to read a temp table in SQL Server (via SSMS). While testing, we found that Python is able to query and read the table even when it is no longer there/queryable in SSMS. I believe the DataFrame is being cached somewhere, but let me break the problem down into steps:
Starting point: the temp table is present in SSMS. MAIN_DF = pd.read_sql('SELECT statement', conn) stores the result in a DataFrame, which is then saved to an Excel file (using ExcelWriter).
We delete the temp table in SQL, then run the Python script again. To be sure, we run the same SELECT statement from the Python script directly in SSMS, and it returns 'Invalid object name', which is correct because the table has been dropped. BUT when I run the Python script again, it is still able to query the table and gets the same results as before! It should throw the same error as SSMS because the table isn't there. Why isn't Python starting from scratch when I run it? It seems to be holding information over from the initial run. How do I ensure I start from scratch every time I run it?
I have tried many things, including starting the script with blank DataFrames (MAIN_DF = pd.DataFrame()) so nothing should be carried over, and deleting the DataFrames at the end (del MAIN_DF).
I don't understand what is happening.
try:
    conn = pyodbc.connect(r'Driver={SQL Server};Server=GenericServername;Database=testdb;Trusted_Connection=yes;')
    print('Connected to SQL: ' + str(datetime.now()))
    MAIN_DF = pd.read_sql('SELECT statement', conn)
    print('Queried Main DF: ' + str(datetime.now()))
It was because I didn't close the connection with conn.close(), so the result was effectively cached in memory and SQL Server never finished/closed the session.
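A minimal sketch of that fix (the server, database, and query strings are the placeholders from the question): close the connection when the query is done, for example in a finally block, so no server-side state survives between runs.

import pyodbc
import pandas as pd
from datetime import datetime

conn = pyodbc.connect(r'Driver={SQL Server};Server=GenericServername;Database=testdb;Trusted_Connection=yes;')
try:
    print('Connected to SQL: ' + str(datetime.now()))
    MAIN_DF = pd.read_sql('SELECT statement', conn)   # placeholder query
    print('Queried Main DF: ' + str(datetime.now()))
finally:
    conn.close()   # release the session so nothing lingers into the next run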

Python loop to create MySQL INSERT INTO statements from data in a CSV

I need to write Python code that converts all the entries of a CSV file into MySQL INSERT INTO statements in a loop. I have CSV files with about 6 million entries.
The code below can probably read a row, but it has some syntax errors I can't really pinpoint, as I don't have a background in coding.
file = open('physician_data.csv','r')

for row in file:
    header_string = row
    header_list = list(header_string.split(','))
    number_of_columns = len(header_list)

    insert_into_query= INSERT INTO physician_data (%s)

    for i in range(number_of_columns):
        if i != number_of_columns-1:
            insert_into_query+="%s," %(header_list[i])
        else:
            # no comma after last column
            insert_into_query += "%s)" %(header_list[i])

    print(insert_into_query)

file.close
Can someone tell me how to make this work?
Please include error messages when you describe a problem (https://stackoverflow.com/help/mcve).
You may find the documentation for the CSV library quite helpful.
Use quotes where appropriate, e.g. insert_into_query = "INSERT..."
Call the close function like this: file.close()
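Putting those points together, here is a hedged sketch of one way to do it (the table and file names come from the question; the quoting/escaping here is deliberately simple and not injection-safe): read the header once with the csv module, then emit one INSERT statement per data row.

import csv

with open('physician_data.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    header_list = next(reader)                 # first row holds the column names
    columns = ', '.join(header_list)
    for row in reader:
        values = ', '.join("'%s'" % value.replace("'", "''") for value in row)
        print("INSERT INTO physician_data (%s) VALUES (%s);" % (columns, values))

For 6 million rows, feeding the file to LOAD DATA INFILE or using executemany with parameterized queries would likely be much faster than millions of individual INSERT statements.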

Bulk insert into Vertica from Python using Uber's vertica-python package

Question 1 of 2
I'm trying to import data from CSV file to Vertica using Python, using Uber's vertica-python package. The problem is that whitespace-only data elements are being loaded into Vertica as NULLs; I want only empty data elements to be loaded in as NULLs, and non-empty whitespace data elements to be loaded in as whitespace instead.
For example, the following two rows of a CSV file are both loaded into the database as ('1','abc',NULL,NULL), whereas I want the second one to be loaded as ('1','abc',' ',NULL).
1,abc,,^M
1,abc, ,^M
Here is the code:
# import vertica-python package by Uber
# source: https://github.com/uber/vertica-python
import csv
import vertica_python

# write CSV file
filename = 'temp.csv'
data = <list of lists, e.g. [[1,'abc',None,'def'],[2,'b','c','d']]>

with open(filename, 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f, escapechar='\\', doublequote=False)
    writer.writerows(data)

# define query
q = "copy <table_name> (<column_names>) from stdin "\
    "delimiter ',' "\
    "enclosed by '\"' "\
    "record terminator E'\\r' "

# copy data
conn = vertica_python.connect(host=<host>,
                              port=<port>,
                              user=<user>,
                              password=<password>,
                              database=<database>,
                              charset='utf8')
cur = conn.cursor()

with open(filename, 'rb') as f:
    cur.copy(q, f)

conn.close()
Question 2 of 2
Are there any other issues (e.g. character encoding) I have to watch out for when using this method of loading data into Vertica? Are there any other mistakes in the code? I'm not 100% convinced it will work on all platforms (currently running on Linux; there may be record-terminator issues on other platforms, for example). Any recommendations to make this code more robust would be greatly appreciated.
In addition, are there alternative methods of bulk inserting data into Vertica from Python, such as loading objects directly from Python instead of having to write them to CSV files first, without sacrificing speed? The data volume is large and the insert job as is takes a couple of hours to run.
Thank you in advance for any help you can provide!
The COPY statement you have should behave the way you want with regard to the spaces; I tested it using a very similar COPY.
Edit: I originally missed what you were really asking about the COPY. I'll leave the part below in, because it might still be useful for some people:
To fix the whitespace, you can change your copy statement:
copy <table_name> (FIELD1, FIELD2, MYFIELD3 FILLER VARCHAR(50), FIELD4, FIELD3 AS NVL(MYFIELD3,'')) from stdin
By using FILLER, the parsed value is held in something like a variable, which you can then assign to your actual table column with AS later in the COPY.
As for gotchas... I do what you are doing on Solaris often. The one thing I noticed is that you are setting the record terminator; I'm not sure that's really something you need to do, depending on the environment. I've never had to set it when switching between Linux, Windows and Solaris.
Also, one hint: this will return a result set that tells you how many rows were loaded. Do a fetchone() and print it out and you'll see it.
The only other thing I can recommend is to use rejected-data tables in case any rows are rejected.
You mentioned that this is a large job. You may need to increase your read timeout by adding 'read_timeout': 7200 (or more) to your connection options. I'm not sure whether None disables the read timeout or not.
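A hedged sketch of that connection tweak (all connection values are placeholders, as in the question's code):

import vertica_python

conn_info = {'host': <host>,
             'port': <port>,
             'user': <user>,
             'password': <password>,
             'database': <database>,
             'read_timeout': 7200}   # seconds; raise further for very long loads
conn = vertica_python.connect(**conn_info)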
As for a faster way... if the file is accessible directly on the Vertica node itself, you could reference the file directly in the COPY instead of copying from stdin and let the daemon load it directly. That is much faster and opens up a number of optimizations: you could then use apportioned load, and if you have multiple files to load you can reference them all together in a list of files.
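A hedged sketch of that direct-load variant (the path, table, and connection values are placeholders; the reject-table clause follows the hint above): the server reads the file itself rather than receiving it over the client connection.

import vertica_python

conn = vertica_python.connect(host=<host>, port=<port>, user=<user>,
                              password=<password>, database=<database>)
cur = conn.cursor()
cur.execute(
    "COPY <table_name> (<column_names>) "
    "FROM '/path/on/vertica/node/temp.csv' "
    "DELIMITER ',' ENCLOSED BY '\"' "
    "REJECTED DATA AS TABLE <table_name>_rejects"
)
print(cur.fetchone())   # the COPY reports how many rows were loaded
conn.close()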
It's kind of a long topic, though. If you have any specific questions let me know.

Raspberry Pi Python MySQL code stops saving data after a while

I'm new to this site, although I've used several tips from posts here.
I have code that reads the GPS position and other data from another device. After this, I do some calculations and store the data in a database and in a text file. My problem is that after a while it stops storing the data, even though the code keeps running. Any idea why?
My code is a little extensive, but basically it creates two serial objects so I can read two different devices.
Then, in an infinite loop, I do the following:
open database
manipulate and analyze data
store data in txt file
store data in mysql
close connection
All of this is wrapped in a try/except so I can save any exception to another text file.
The code I use to store to the text file is:
with open("datos.txt", "a") as mydata:
mydata.write("(" + str(hora)+","+str(latitud)+","+str(longitud)+",#"+str(color)+","+str(latitudReal)+","+str(longitudReal)+","+str(libre)+","+espectro+")" + ", " + cc1 + "\n")
mydata.flush()
mydata.close()
The code to store in the mysql database is:
try:
    sql2 = "INSERT INTO pichincha101(hora, latitud, longitud, color, latitudReal, longitudReal, porcentaje, espectro) VALUES ('"+str(hora)+"','"+str(latitud)+"','"+str(longitud)+"','#"+str(color)+"','"+str(latitudReal)+"','"+str(longitudReal)+"','"+str(libre)+"','"+espectro+"')"
    cur.execute(sql2)
    db.commit()
except Exception as e:
    print "error INSERT mysql"
    print e
I'm also deleting some arrays and clearing the RAM before repeating the loop, with this code:
del res1
del res2
del ampp
del amp1
os.system("echo 3 > /proc/sys/vm/drop_caches")
The problem is the same even if I don't include the last code section above.
Thanks in advance for your help.
Perhaps you have a timeout in your MySQL settings (SQL_TIMEOUT) or in your server settings? For example, when I work with Apache, I have to adjust the max_execution_time variable. Cheers!
The solution was emptying the buffers of my serial devices. I hope it helps someone else.
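A hedged sketch of that fix (device paths and baud rates are made up; pyserial is assumed): discard whatever is still queued in each serial input buffer on every pass through the loop, so stale bytes cannot pile up and stall the reads.

import serial

gps = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)
sensor = serial.Serial('/dev/ttyUSB1', 9600, timeout=1)

while True:
    gps_line = gps.readline()
    sensor_line = sensor.readline()
    # ... parse, store to MySQL and to the text file ...
    gps.reset_input_buffer()      # empty anything still waiting on the GPS port
    sensor.reset_input_buffer()   # same for the second device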

pysqlite, save database, and open it later

I'm a newbie to SQL and sqlite.
I'm trying to save a database, then copy the file.db to another folder and open it there. So far I have created the database and copied the file.db to another folder, but when I try to access the database, the output says it is empty.
So far I have
from pysqlite2 import dbapi2 as sqlite
conn = sqlite.connect('db1Thu_04_Aug_2011_14_20_15.db')
c = conn.cursor()
print c.fetchall()
and the output is
[]
You need something like
c.execute("SELECT * FROM mytable")
for row in c:
    # process row
I will echo Mat and point out that this is not valid syntax. More than that, you do not include any SELECT (or other SQL command) in your example. If you actually have no SELECT statement in your code and you run fetchall on a newly created cursor, you can expect to get an empty list, which seems to be what you have.
Finally, do make sure that you are opening the file from the right directory. If you tell sqlite to open a nonexistent file, it will happily create a new, empty one for you.
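A hedged sketch (the table name, values, and folder are made up for illustration) of the whole round trip: build a database, commit and close it, copy the .db file to another folder, then reopen the copy by its path and actually execute a SELECT before fetching:

import os
import shutil
import sqlite3   # the sqlite3 module bundled with Python; pysqlite2's dbapi2 works the same way

conn = sqlite3.connect('db1.db')
conn.execute("CREATE TABLE IF NOT EXISTS mytable (id INTEGER, name TEXT)")
conn.execute("INSERT INTO mytable VALUES (1, 'alpha')")
conn.commit()
conn.close()                                # flush everything to disk first

if not os.path.isdir('backup'):
    os.mkdir('backup')
shutil.copy('db1.db', os.path.join('backup', 'db1.db'))

conn2 = sqlite3.connect(os.path.join('backup', 'db1.db'))   # open the copy by its path
c = conn2.cursor()
c.execute("SELECT * FROM mytable")          # a new cursor returns nothing until you execute a query
print(c.fetchall())                         # [(1, 'alpha')]
conn2.close()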
