I have a list/array of strings:
l = ['jack','jill','bob']
Now I need to create a table in sqlite3 for Python into which I can insert this array into a column called "Names". I do not want multiple rows with one name per row. I want a single row which contains the array exactly as shown above, and I want to be able to retrieve it in exactly the same format. How can I insert an array as an element in a DB? What am I supposed to declare as the data type of the array while creating the table itself? Like:
c.execute("CREATE TABLE names(id text, names ??)")
How do I insert values too? Like:
c.execute("INSERT INTO names VALUES(?,?)",(id,l))
EDIT: I am being so foolish. I just realized that I can have multiple entries for the id and use a query to extract all relevant names. Thanks anyway!
You can store an array in a single string field if you somehow generate a string representation of it, e.g. using the pickle module. Then, when you read the row, you can unpickle it. Pickle converts many different complex objects (but not all) into a string from which the object can be restored. But that is most likely not what you want to do: you won't be able to do anything with the data in the table except select the rows and then unpickle the array. You won't be able to search.
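For completeness, the pickle round trip looks like this (a sketch; the caveats above still apply, and the BLOB column type is my assumption):

import pickle
import sqlite3

conn = sqlite3.connect(':memory:')
# a BLOB column can hold the pickled bytes
conn.execute('CREATE TABLE names (id TEXT, names BLOB)')

l = ['jack', 'jill', 'bob']
conn.execute('INSERT INTO names VALUES (?, ?)', ('a', pickle.dumps(l)))

# the blob is opaque to SQL: you can only select it and unpickle it
blob = conn.execute("SELECT names FROM names WHERE id='a'").fetchone()[0]
print(pickle.loads(blob))  # ['jack', 'jill', 'bob']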
If you want to store anything of varying length (or fixed length, but many instances of similar things), you would not want to put it in a column or multiple columns. Think vertically, not horizontally: don't think about columns, think about rows. For storing a vector with any number of components, a table is a good tool.
It is a little difficult to explain from the little detail you give, but you should think about creating a second table and putting all the names there, one row per name, for every row of your first table. You'd need some key in your first table that you can use in your second table too:
c.execute("CREATE TABLE first_table(int id, varchar(255) text, additional fields)")
c.execute("CREATE TABLE names_table(int id, int num, varchar(255) name)")
With this you can still store whatever information you have, except the names, in first_table, and store the array of names in names_table; just use the same id as in first_table and use num to store the index positions inside the array. You can then later get the array back by doing something like
SELECT name FROM names_table
WHERE id=?
ORDER BY num
to read the array of names for any of your rows in first_table.
That's a pretty normal way to store arrays in a DB.
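For illustration, here is a minimal round trip under that schema (the id value and the extra text field are made up):

import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE first_table(id INT, text VARCHAR(255))")
c.execute("CREATE TABLE names_table(id INT, num INT, name VARCHAR(255))")

names = ['jack', 'jill', 'bob']
row_id = 1
c.execute("INSERT INTO first_table VALUES (?, ?)", (row_id, 'some data'))
# one row per name; num records the position in the original list
c.executemany("INSERT INTO names_table VALUES (?, ?, ?)",
              [(row_id, num, name) for num, name in enumerate(names)])

# rebuild the list in its original order
c.execute("SELECT name FROM names_table WHERE id=? ORDER BY num", (row_id,))
print([row[0] for row in c.fetchall()])  # ['jack', 'jill', 'bob']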
This is not the way to go. You should consider creating another table for the names, with a foreign key back to the main table.
You could pickle/marshal/json your array and store it as a binary/varchar/JSON field in your database.
Something like:
import json
names = ['jack','jill','bill']
snames = json.dumps(names)
c.execute("INSERT INTO nametable " + snames + ";")
I need a little help. My problem is that I don't understand how to combine a WHERE condition in SQLite with the insertion of multiple elements from a dictionary.
My goal is to compare the Location column from the dictionary with the Country column of the existing table.
But I can't find a solution for how to approach this and implement a WHERE condition.
My code:
def add_countries_to_table(self, countryList):
    self.cursor.execute('''
        INSERT OR IGNORE INTO country (Country)
        VALUES (:Location)''', countryList)
    self.db.saveChanges()
Thanks for any help.
Simple. You don't. An insert is an insert, there is no WHERE clause in it.
You can define your Country to be UNIQUE and use an 'UPSERT' (UPDATE or INSERT) to ignore already existing values in the dictionary.
So INSERT INTO country (Country) VALUES(:Location) ON CONFLICT(Country) DO NOTHING would be something close to the command you're looking for.
Alternatively you can try using a Set instead of a Dictionary to prevent duplicate values beforehand and let your program/script deal with duplicates instead of the DB.
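With sqlite3, that could look like the following sketch (it requires SQLite 3.24+ for the ON CONFLICT clause, and the sample data is made up):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE country (Country TEXT UNIQUE)')

rows = [{'Location': 'Germany'}, {'Location': 'France'}, {'Location': 'Germany'}]
# the UNIQUE constraint plus DO NOTHING silently skips duplicates
conn.executemany(
    'INSERT INTO country (Country) VALUES (:Location) '
    'ON CONFLICT(Country) DO NOTHING',
    rows)
print(conn.execute('SELECT Country FROM country').fetchall())
# [('Germany',), ('France',)]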
You need to use a placeholder. I presume you tried that with (:Location). There are several placeholder styles; depending on the driver it is normally ? or %s (sqlite3 uses ? or named :name placeholders).
I presume you want to work with the values of the dict.
If you want to insert multiple rows, you need to build a sequence of tuples and pass it to executemany.
def add_countries_to_table(self, countryList):
    # assumes countryList maps 'Location' to a list of country names
    ctry_tl = []
    for row in countryList['Location']:
        ctry_tuple = (row,)
        ctry_tl.append(ctry_tuple)
    # SQLite has no "UPSERT INTO" statement; INSERT OR IGNORE skips
    # rows that would violate a UNIQUE constraint on Country
    self.cursor.executemany('''
        INSERT OR IGNORE INTO country (Country)
        VALUES (?)''', ctry_tl)
    self.db.commit()
I am using the Python library Camelot to parse multiple PDFs and pull out all tables within those PDF files. The first line of code yields back all of the tables that were scraped from the PDF in list format. I am looking for one table in particular that has a unique string in it. Thankfully, this string is unique to this table, so I can, theoretically, use it to isolate the table that I want to grab.
These PDFs are more or less created in the same format, however there is enough variance that I can't just have a static call on the table that I want. For example, sometimes the table I want will be the first table scraped, and sometimes it will be the third. Therefore, I need to write some code to be able to select the table dynamically.
The workflow I have in my mind logically goes like this:
Create an empty list before the for loop to append the tables to. Call a for loop and iterate over each table in the list outputted by the Camelot code. If the table does not have the string I am looking for, delete all data in that table and then append the empty data frame to the empty list. If it does have the string I am looking for, append it to the empty list without deleting anything.
Is there a better way to go about this? I'm sure there probably is.
I have put what I have so far into my code below. I'm struggling to put together a conditional statement to drop all of the rows of the dataframe if the string is not present. I have found plenty of examples of dropping columns and rows when a string is present, but nothing for the entire data frame.
import camelot
import pandas as pd

# this creates a list of all the tables that Camelot scrapes from the pdf
tables = camelot.read_pdf('pdffile', flavor='stream', pages='1-end')

# empty list to append the tables to
elist = []
for t in tables:
    dftemp = t.df
    # my attempt at dropping all the values if the unique value isn't found. THIS DOESN'T WORK
    dftemp[dftemp.values != "Unique Value", dftemp.iloc[0:0]]
    # append to the list
    elist.append(dftemp)

# combine all the dataframes in the list into one dataframe
dfcombined = pd.concat(elist)
You can use the 'in' operator on the NumPy array returned by dftemp.values:
for t in tables:
    dftemp = t.df
    # keep the table only if it contains the unique string
    if "Unique Value" in dftemp.values:
        # append to the list
        elist.append(dftemp)
You can do it in a single line:
dfcombined = pd.concat([t.df if "Unique Value" in t.df.values else pd.DataFrame() for t in tables])
I'm building up a table in Python's sqlite3 module which will consist of at least 18 columns. These columns are named by times (for example "08-00"), all stored in a list called 'time_range'. I want to avoid writing out all 18 column names by hand in the SQL statement, since all of this already exists inside the mentioned list and it would make the code quite ugly. However, this:
marks = '?,'*18
self.c.execute('''CREATE TABLE finishedJobs (%s)''' % marks, tuple(time_range))
did not work. As it seems, Python/SQLite does not accept parameters at this place. Is there any smart workaround for my purpose, or do I really have to name every single column in a CREATE TABLE statement by hand?
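Indeed, parameters can only bind values, never identifiers such as column names, so a common workaround is to build the column list with plain string formatting. A sketch (the TEXT column type and the example times are my assumptions):

import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

time_range = ['08-00', '08-30', '09-00']  # in reality, all 18 entries
# identifiers cannot be parameterized, so quote and join them ourselves;
# safe here because time_range is defined by the program, not by a user
columns = ', '.join('"%s" TEXT' % t for t in time_range)
c.execute('CREATE TABLE finishedJobs (%s)' % columns)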
How can I get just the column headers of a table in KDB? Is there a special query for this? I am asking because when I pull the data from the table into Python, the column headers are lost. Thanks!
If you're using one of the two regular Python APIs for getting data from KDB, I think both implement a Flip class from which you can get the column names (usually there is an x which is an array of strings).
Otherwise, cols tableName gives you a list of symbols (which would be deserialized as an array of strings in Python).
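For example, with the qPython library (an assumption on my part; the host, port, and table name trade below are all hypothetical), you could run cols from Python like this:

from qpython import qconnection

q = qconnection.QConnection(host='localhost', port=5000)
q.open()
try:
    # `cols` returns the column names as a list of symbols,
    # which arrive on the Python side as byte strings
    column_names = q.sendSync('cols trade')
    print([c.decode() for c in column_names])
finally:
    q.close()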
When using the sqlite3 module in Python, all elements of cursor.description except the column names are set to None, so this tuple cannot be used to find the column types for a query result (unlike with other DB-API compliant modules). Is the only way to get the types of the columns to use pragma table_info(table_name).fetchall() to get a description of the table, store it in memory, and then match the column names from cursor.description against that overall table description?
No, it's not the only way. Alternatively, you can fetch one row, iterate over it, and inspect the individual column Python objects and types. Unless a value is None (in which case the SQL field is NULL), this should give you a fairly precise indication of what the database column type was.
sqlite3 only uses sqlite3_column_decltype and sqlite3_column_type in one place each, and neither is accessible to the Python application, so there is no "direct" way of the kind you may have been looking for.
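A minimal sketch of that "inspect one row" approach (the table and its contents are made up):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (id INTEGER, name TEXT, score REAL)')
conn.execute("INSERT INTO t VALUES (1, 'jack', 9.5)")

cur = conn.execute('SELECT * FROM t')
row = cur.fetchone()
col_names = [d[0] for d in cur.description]
# map each column name to the Python type of its value in this row;
# a None value (SQL NULL) tells you nothing about the column type
col_types = dict(zip(col_names, (type(v) for v in row)))
print(col_types)  # {'id': <class 'int'>, 'name': <class 'str'>, 'score': <class 'float'>}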
I haven't tried this in Python, but you could try something like
SELECT *
FROM sqlite_master
WHERE type = 'table';
which contains the DDL CREATE statement used to create the table. By parsing the DDL you can get the column type info, such as it is. Remember that SQLite is rather vague and unrestrictive when it comes to column data types.
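In Python that could look like the sketch below (the naive comma split is for illustration only and will break on more complex column definitions, e.g. ones containing constraints):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE names (id TEXT, names TEXT)')

ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'names'"
).fetchone()[0]
print(ddl)  # CREATE TABLE names (id TEXT, names TEXT)

# naive parse: take the text between the outer parentheses and split on commas
inner = ddl[ddl.index('(') + 1 : ddl.rindex(')')]
for col_def in inner.split(','):
    name, _, decl_type = col_def.strip().partition(' ')
    print(name, '->', decl_type or '(no declared type)')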