I'm trying to insert data from a dictionary into a database using named parameters. I have this working with a simple SQL statement e.g.
SQL = "INSERT INTO status (location, arrival, departure) VALUES (:location, :arrival,:departure)"
dict = {'location': 'somewhere', 'arrival': '1000', 'departure': '1001'}
c.execute(SQL,dict)
This inserts 'somewhere' into the location column, '1000' into arrival, and '1001' into departure.
The data I will actually have will always contain location but may contain only arrival or only departure, not necessarily both (in which case either nothing or NULL should go into the missing column). In that case I get sqlite3.ProgrammingError: You did not supply a value for binding 2.
I can fix this by using defaultdict:
c.execute(SQL,defaultdict(str,dict))
To make things slightly more complicated, I will actually have a list of dictionaries containing multiple locations with either an arrival or departure.
({'location': 'place1', 'departure': '1000'},
{'location': 'place2', 'arrival': '1010'},
{'location': 'place2', 'departure': '1001'})
and I want to run this with c.executemany; however, I can then no longer wrap each row in a defaultdict.
I could loop through each dictionary in the list and run many c.execute statements, but executemany seems a tidier way to do it.
I've simplified this example for convenience, the actual data has many more entries in the dictionary, and I build it from a JSON text file.
Anyone have any suggestions for how I could do this?
Use None to insert a NULL:
dict = {'location': 'somewhere', 'arrival': '1000', 'departure': None}
You can use a default dictionary and a generator to use this with executemany():
defaults = {'location': '', 'arrival': None, 'departure': None}
c.executemany(SQL, ({k: d.get(k, defaults[k]) for k in defaults} for d in your_list_of_dictionaries))
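For reference, a minimal runnable sketch of that approach using the sample rows from the question (the in-memory database and CREATE TABLE are only for the demo):
import sqlite3

SQL = "INSERT INTO status (location, arrival, departure) VALUES (:location, :arrival, :departure)"
defaults = {'location': '', 'arrival': None, 'departure': None}
rows = [{'location': 'place1', 'departure': '1000'},
        {'location': 'place2', 'arrival': '1010'},
        {'location': 'place2', 'departure': '1001'}]

db = sqlite3.connect(':memory:')
c = db.cursor()
c.execute("CREATE TABLE status (location TEXT, arrival TEXT, departure TEXT)")
# Fill in the missing keys per row so every named parameter gets bound.
c.executemany(SQL, ({k: d.get(k, defaults[k]) for k in defaults} for d in rows))
db.commit()
print(c.execute("SELECT * FROM status").fetchall())
# [('place1', None, '1000'), ('place2', '1010', None), ('place2', None, '1001')]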
There is a simpler solution to this problem that should be feasible in most cases; just pass to executemany a list of defaultdict instead of a list of dict.
In other words, if you build your rows as defaultdict objects from the start, you can pass them directly to executemany, instead of building them as plain dictionaries and patching things up later before calling executemany.
The following working example (Python 3.4.3) shows the point:
import sqlite3
from collections import defaultdict
# initialization
db = sqlite3.connect(':memory:')
c = db.cursor()
c.execute("CREATE TABLE status(location TEXT, arrival TEXT, departure TEXT)")
SQL = "INSERT INTO status VALUES (:location, :arrival, :departure)"
# build each row as a defaultdict
f = lambda:None # use str if you prefer
row1 = defaultdict(f,{'location':'place1', 'departure':'1000'})
row2 = defaultdict(f,{'location':'place2', 'arrival':'1010'})
rows = (row1, row2)
# insert rows, executemany can be safely used without additional code
c.executemany(SQL, rows)
db.commit()
# print result
c.execute("SELECT * FROM status")
print(list(zip(*c.description))[0])
for r in c.fetchall():
    print(r)
db.close()
If you run it, it prints:
('location', 'arrival', 'departure')
('place1', None, '1000') # None in Python maps to NULL in sqlite3
('place2', '1010', None)
Related
How to save (and also restore, and add elements to) a set of strings in a Sqlite3 database?
This does not work because sets are not JSON-serializable:
import sqlite3, json
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE t(id TEXT, myset TEXT);')
s = {'a', 'b', 'c'}
db.execute("INSERT INTO t VALUES (?, ?);", ('1', json.dumps(s)))
# Error: Object of type set is not JSON serializable
so we can use a list, or a dict with dummy values:
s = list(s)
# or s = {'a':0, 'b':0, 'c': 0}
db.execute("INSERT INTO t VALUES (?, ?);", ('1', json.dumps(s)))
# RETRIEVE A SET FROM DB
r = db.execute("SELECT myset FROM t WHERE id = '1'").fetchone()
if r is not None:
    s = set(json.loads(r[0]))
print(s)
Then adding a string element to a set already stored in the DB is not very elegant (a rough sketch of the whole cycle follows this list):
one has to SELECT,
retrieve as string,
parse the JSON with json.loads,
convert from list to set,
add an element to the set,
convert from set back to list (or, as an alternative to the last three steps: check whether the element is already present in the list, and add it only if it is not),
JSONify it with json.dumps,
database UPDATE
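A rough sketch of that cycle, assuming the db connection and table t from the snippet above (add_to_stored_set is just an illustrative name):
import json

def add_to_stored_set(db, row_id, element):
    # SELECT -> json.loads -> set -> add -> list -> json.dumps -> UPDATE
    r = db.execute("SELECT myset FROM t WHERE id = ?", (row_id,)).fetchone()
    s = set(json.loads(r[0])) if r is not None else set()
    s.add(element)
    db.execute("UPDATE t SET myset = ? WHERE id = ?", (json.dumps(list(s)), row_id))

add_to_stored_set(db, '1', 'd')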
Is there a more pythonic way to work with sets in a Sqlite database?
You can register adapter and converter functions with sqlite that will automatically perform the desired conversions.
import json
import sqlite3
def adapt_set(value):
return json.dumps(list(value))
def convert_set(value):
return set(json.loads(value))
sqlite3.register_adapter(set, adapt_set)
sqlite3.register_converter('set_type', convert_set)
Once these functions have been registered, pass detect_types to the connection factory to tell sqlite how to use them.
Passing sqlite3.PARSE_DECLTYPES will make the connection use the declared column type to look up the adapter/converter.
db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
# Declare the myset column as type "set_type".
db.execute('CREATE TABLE t(id TEXT, myset set_type);')
db.execute("INSERT INTO t VALUES (?, ?);", ('1', {1, 2, 3}))
r = db.execute("""SELECT myset FROM t WHERE id = '1'""").fetchone()
print(r[0]) # <- r[0] is a set.
Passing sqlite3.PARSE_COLNAMES will cause the column name in the cursor description to be searched for a type name enclosed in square brackets.
db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_COLNAMES)
# The type is not declared in the created table statement.
db.execute('CREATE TABLE t(id TEXT, myset TEXT);')
db.execute("INSERT INTO t VALUES (?, ?);", ('1', {1, 2, 3}))
# Include the type in the column label.
r = db.execute("""SELECT myset AS "myset [set_type]" FROM t WHERE id = '1'""").fetchone()
print(r[0])  # <- r[0] is a set
I would register a "set adapter" that converts a set into a byte string, simply by taking the string representation of the set and encoding it to bytes for storage, and a "set converter" for a user-defined column type named "set_type" (pick any other name you wish) that decodes the byte string back into a Unicode string and then applies eval to it to get the set back:
import sqlite3
from decimal import Decimal
db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES)
def set_adapter(the_set):
    return str(the_set).encode('utf-8')

def set_converter(s):
    return eval(s.decode('utf-8'))
sqlite3.register_adapter(set, set_adapter)
sqlite3.register_converter('set_type', set_converter)
# Define the columns with type set_type:
db.execute('CREATE TABLE t(id TEXT, myset set_type);')
s = {'a', 'b', 'c', (1, 2, 3), True, False, None, b'abcde', Decimal('12.345')}
# Our adapter will store s as a byte string:
db.execute("INSERT INTO t VALUES (?, ?);", ('1', s))
cursor = db.cursor()
# Our converter will convert column type set_type from bytes to set:
cursor.execute('select myset from t')
row = cursor.fetchone()
s = row[0]
print(type(s), s)
db.close()
Prints:
<class 'set'> {False, True, 'b', None, (1, 2, 3), Decimal('12.345'), b'abcde', 'a', 'c'}
As you can see, this also handles more datatypes than the JSON approach can after converting a set to a list. JSON could potentially handle as many data types, but only if you write JSON converters (which usually means mapping those types onto one of the supported types, such as strings).
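For example, writing such a JSON converter can be as simple as passing a default hook to json.dumps; a rough sketch (the '__decimal__'/'__bytes__' tag names are my own invention, and reading the data back would need a matching object_hook in json.loads):
import json
from decimal import Decimal

def encode_extra(obj):
    # Fallback for types the json module does not handle natively.
    if isinstance(obj, Decimal):
        return {'__decimal__': str(obj)}
    if isinstance(obj, (bytes, bytearray)):
        return {'__bytes__': obj.hex()}
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f'Cannot serialize {type(obj)!r}')

print(json.dumps({'a', Decimal('12.345'), b'abcde'}, default=encode_extra))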
Defining the adapter and converter to use pickle as in the answer offered by Jonathan Feenstra will result in a speed improvement but possibly use more storage. But I would use the technique outlined above, i.e. using adapter and converter functions with a special user-defined column type:
import pickle
def set_adapter(the_set):
    return pickle.dumps(the_set, pickle.HIGHEST_PROTOCOL)

def set_converter(s):
    return pickle.loads(s)
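These are registered and used in the same way as the str/eval pair above; a minimal sketch reusing the two functions just defined (the in-memory database and table are only for the demo):
import sqlite3

sqlite3.register_adapter(set, set_adapter)
sqlite3.register_converter('set_type', set_converter)

db = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
db.execute('CREATE TABLE t(id TEXT, myset set_type);')
db.execute("INSERT INTO t VALUES (?, ?);", ('1', {'a', 'b', 'c'}))
print(db.execute("SELECT myset FROM t").fetchone()[0])  # back to a set
db.close()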
A simpler way to store the set in the SQLite database is to use the BLOB datatype for the myset column and serialise it to bytes using pickle.dumps:
import sqlite3
import pickle
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(id TEXT, myset BLOB)")
s = {"a", "b", "c"}
db.execute(
    "INSERT INTO t VALUES (?, ?)",
    ("1", sqlite3.Binary(pickle.dumps(s, pickle.HIGHEST_PROTOCOL))),
)
r = db.execute("SELECT myset FROM t WHERE id = '1'").fetchone()
if r is not None:
    s = pickle.loads(r[0])
print(s)
To add new elements to a set in the database, the serialisation and deserialisation steps are still required, but no more conversion to/from a list or checking for elements that are already present in the set.
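For example, a sketch against the table above (add_element is just an illustrative helper name):
import pickle
import sqlite3

def add_element(db, row_id, element):
    # Deserialise, modify, serialise back: no list round-trip or membership check needed.
    r = db.execute("SELECT myset FROM t WHERE id = ?", (row_id,)).fetchone()
    s = pickle.loads(r[0]) if r is not None else set()
    s.add(element)
    db.execute(
        "UPDATE t SET myset = ? WHERE id = ?",
        (sqlite3.Binary(pickle.dumps(s, pickle.HIGHEST_PROTOCOL)), row_id),
    )

add_element(db, "1", "d")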
Alternatively, you could ensure the uniqueness of the ID-element combination at database-level, for example using a composite primary key:
import sqlite3
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(id TEXT, element TEXT, PRIMARY KEY(id, element))")
s = {"a", "b", "c"}
for element in s:
    db.execute("INSERT INTO t VALUES(?, ?)", ("1", element))
r = db.execute("SELECT element FROM t WHERE id = '1'").fetchall()
s = set(row[0] for row in r)
print(s)
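Adding an element then becomes a single statement; the primary key silently rejects duplicates (a sketch against the table above; INSERT OR IGNORE is SQLite-specific):
# Add "d" to the set with id '1'; re-inserting an existing element is a no-op.
db.execute("INSERT OR IGNORE INTO t VALUES (?, ?)", ("1", "d"))
db.commit()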
You can use aiosqlitedict
Here is what it can do
Easy conversion between sqlite table and Python dictionary and vice-versa.
Get values of a certain column in a Python list.
Order your list ascending or descending.
Insert any number of columns to your dict.
Getting Started
We start by connecting to our database and specifying the reference column:
from aiosqlitedict.database import Connect
countriesDB = Connect("database.db", "user_id")
Make a dictionary
The dictionary should be inside an async function.
async def some_func():
    countries_data = await countriesDB.to_dict("my_table_name", 123, "col1_name", "col2_name", ...)
You can insert any number of columns, or you can get all by specifying
the column name as '*'
countries_data = await countriesDB.to_dict("my_table_name", 123, "*")
Say you have now made some changes to your dictionary and want to write it back to the SQL table.
Convert dict to sqlite table
async def some_func():
    ...
    await countriesDB.to_sql("my_table_name", 123, countries_data)
But what if you want a list of values for a specific column?
Select method
You can get a list of all the values of a certain column:
country_names = await countriesDB.select("my_table_name", "col1_name")
To limit your selection, use the limit parameter:
country_names = await countriesDB.select("my_table_name", "col1_name", limit=10)
You can also order the list using the ascending parameter and/or the order_by parameter, specifying the column to order by:
country_names = await countriesDB.select("my_table_name", "col1_name", order_by="col2_name", ascending=False)
So I am using psycopg2 on Python3.5 to insert some data into a postgresql database. What I would like to do is have two columns that are strings and have the last column just be a dict object. I don't need to search the dict, just be able to pull it out of the database and use it.
so for instance:
uuid = "testName"
otherString = ""
dict = {'id':'122','name':'test','number':'444-444-4444'}
# add code here to store two strings and dict to postgresql
cur.execute('''SELECT dict FROM table where uuid = %s''', 'testName')
newDict = cur.fetchone()
print(newDict['number'])
Is this possible, and if so how would I go about doing this?
If your PostgreSQL version is sufficiently new (9.4+), your psycopg2 version is >= 2.5.4, all the keys are strings, and the values can be represented as JSON, it is best to store this in a JSONB column. Then, should the need arise, the column is searchable too. Just create the table as
CREATE TABLE thetable (
uuid TEXT,
dict JSONB
);
(... and naturally add indexes, primary keys etc as needed...)
When sending the dictionary to PostgreSQL you just need to wrap it with the Json adapter; when receiving from PostgreSQL the JSONB value would be automatically converted into a dictionary, thus inserting would become
from psycopg2.extras import Json, DictCursor
cur = conn.cursor(cursor_factory=DictCursor)
cur.execute('INSERT into thetable (uuid, dict) values (%s, %s)',
            ['testName', Json({'id': '122', 'name': 'test', 'number': '444-444-4444'})])
and selecting would be as simple as
cur.execute('SELECT dict FROM thetable where uuid = %s', ['testName'])
row = cur.fetchone()
print(row['dict']) # it's now a dictionary object with all the keys restored
print(row['dict']['number']) # the value of the number key
With JSONB, PostgreSQL can store the values more efficiently than just dumping the dictionary as text. Additionally, it becomes possible to query the data, for example to select just some of the fields from the JSONB column:
>>> cur.execute("SELECT dict->>'id', dict->>'number' FROM thetable")
>>> cur.fetchone()
['122', '444-444-4444']
or you could use them in queries if needed:
>>> cur.execute("SELECT uuid FROM thetable WHERE dict->>'number' = %s",
...             ['444-444-4444'])
>>> cur.fetchall()
[['testName']]
You can serialize the data using JSON before storing it:
import json
data = json.dumps({'id':'122','name':'test','number':'444-444-4444'})
Then, when retrieving the data, you deserialize it:
cur.execute('SELECT dict from ....')
res = cur.fetchone()
dict = json.loads(res[0])  # or res['dict'] if you use a DictCursor
print(dict['number'])
I'm writing some code using psycopg2 to connect to a PostgreSQL database.
I have a lot of different data types that I want to write to different tables in my PostgreSQL database. I am trying to write a function that can write to each of the tables based on a single variable passed to the function, and I want to write more than one row at a time to optimize my query. Luckily PostgreSQL allows me to do that (PostgreSQL INSERT):
INSERT INTO films (code, title, did, date_prod, kind) VALUES
('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy');
I have run into a problem that I was hoping someone could help me with.
I need to create a string:
string1 = (value11, value21, value31), (value12, value22, value32)
The string1 variable will be created by using a dictionary with values. So far I have been able to create a tuple that is close to the structure I want. I have a list of dictionaries. The list is called rows:
string1 = tuple([tuple([value for value in row.values()]) for row in rows])
To test it I have created the following small rows variable:
rows = [{'id': 1, 'test1': 'something', 'test2': 123},
{'id': 2, 'test1': 'somethingelse', 'test2': 321}]
When rows is passed through the above piece of code string1 becomes as follows:
((1, 'something', 123), (2, 'somethingelse', 321))
As seen with string1, I just need to remove the outermost parentheses and turn the result into a string to get the format I need. So far I don't know how to do this. So my question to you is: "How do I format string1 to have my required format?"
execute_values makes it much easier. Pass the dict sequence in instead of a values sequence:
import psycopg2, psycopg2.extras
rows = [
{'id': 1, 'test1': 'something', 'test2': 123},
{'id': 2, 'test1': 'somethingelse', 'test2': 321}
]
conn = psycopg2.connect(database='cpn')
cursor = conn.cursor()
insert_query = 'insert into t (id, test1, test2) values %s'
psycopg2.extras.execute_values(
    cursor, insert_query, rows,
    template='(%(id)s, %(test1)s, %(test2)s)',
    page_size=100
)
And the values are inserted:
table t;
id | test1 | test2
----+---------------+-------
1 | something | 123
2 | somethingelse | 321
To get the number of affected rows, use a CTE:
insert_query = '''
with i as (
insert into t (id, test1, test2) values %s
returning *
)
select count(*) from i
'''
psycopg2.extras.execute_values(
    cursor, insert_query, rows,
    template='(%(id)s, %(test1)s, %(test2)s)',
    page_size=100
)
row_count = cursor.fetchone()[0]
With a little modification you can achieve this.
Change your piece of code as follows:
','.join([tuple([value for value in row.values()]).__repr__() for row in rows])
The current output is a tuple of tuples:
((1, 'something', 123), (2, 'somethingelse', 321))
After the change, the output will be a string in the format you want:
"(1, 'something', 123),(2, 'somethingelse', 321)"
The solution that you described is not ideal because it can potentially harm your database: it does not take care of escaping strings, etc., so SQL injection is possible.
Fortunately, psycopg (and psycopg2) has cursor's methods execute and mogrify that will properly do all this work for you:
import contextlib
with contextlib.closing(db_connection.cursor()) as cursor:
    values = [cursor.mogrify('(%(id)s, %(test1)s, %(test2)s)', row) for row in rows]
    query = 'INSERT INTO films (id, test1, test2) VALUES {0};'.format(', '.join(values))
For python 3:
import contextlib
with contextlib.closing(db_connection.cursor()) as cursor:
    values = [cursor.mogrify('(%(id)s, %(test1)s, %(test2)s)', row) for row in rows]
    query_bytes = b'INSERT INTO films (id, test1, test2) VALUES ' + b', '.join(values) + b';'
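Either way, the assembled statement still has to be executed and committed; a minimal end-to-end sketch assuming an open db_connection and the rows list from the question:
import contextlib

with contextlib.closing(db_connection.cursor()) as cursor:
    values = [cursor.mogrify('(%(id)s, %(test1)s, %(test2)s)', row) for row in rows]
    cursor.execute(b'INSERT INTO films (id, test1, test2) VALUES ' + b', '.join(values) + b';')
db_connection.commit()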
peewee allows bulk inserts via insert_many() and insert_from(). However, insert_many() accepts a list of data to insert but cannot use data computed from other parts of the database, while insert_from() can use data computed from other parts of the database but does not allow any data to be sent from Python.
Example:
Assuming a model structure like so:
class BaseModel(Model):
    class Meta:
        database = db

class Person(BaseModel):
    name = CharField(max_length=100, unique=True)

class StatusUpdate(BaseModel):
    person = ForeignKeyField(Person, related_name='statuses')
    status = TextField()
    timestamp = DateTimeField(constraints=[SQL('DEFAULT CURRENT_TIMESTAMP')], index=True)
And some initial data:
Person.insert_many(rows=[{'name': 'Frank'}, {'name': 'Joe'}, {'name': 'Arnold'}]).execute()
print ('Person.select().count():',Person.select().count())
Output:
Person.select().count(): 3
Say we want to add a bunch of new status updates, like the ones in this list:
new_status_updates = [('Frank', 'wat'),
                      ('Frank', 'nooo'),
                      ('Joe', 'noooo'),
                      ('Arnold', 'nooooo')]
We might try to use insert_many() like so:
StatusUpdate.insert_many(rows=[{'person': 'Frank', 'status': 'wat'},
                               {'person': 'Frank', 'status': 'nooo'},
                               {'person': 'Joe', 'status': 'noooo'},
                               {'person': 'Arnold', 'status': 'nooooo'}]).execute()
But this would fail: the person field expects a Person model or a Person.id, and we would have to make an extra query to retrieve those from the names.
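The extra lookup being avoided would look roughly like this (a sketch: resolve names to ids with one extra query, then rewrite the rows):
# One extra round-trip to resolve names to ids before insert_many().
name_to_id = {p.name: p.id for p in Person.select(Person.id, Person.name)}
StatusUpdate.insert_many(rows=[
    {'person': name_to_id[name], 'status': status}
    for name, status in new_status_updates
]).execute()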
We might be able to avoid this with insert_from(), which allows us to make subqueries, but insert_from() has no way of processing our lists or dictionaries. What to do?
One idea is to use the SQL VALUES clause as part of a SELECT statement.
If you are familiar with SQL, you may have seen the VALUES clause before; it is commonly used as part of an INSERT statement like so:
INSERT INTO statusupdate (person_id,status)
VALUES (1, 'my status'), (1, 'another status'), (2, 'his status');
This tells the database to insert three rows - AKA tuples - into the table statusupdate.
Another way of inserting things though is to do something like:
INSERT INTO statusupdate (person_id,status)
SELECT ..., ... FROM <elsewhere or subquery>;
This is equivalent to the insert_from() functionality that peewee provides.
But there is another less common thing you can do: you can use the VALUES clause in any select to provide literal values. Example:
SELECT *
FROM (VALUES (1,2,3), (4,5,6)) as my_literal_values;
This will return a result-set of two rows/tuples, each with 3 values.
So, if you can convert the "bulk" insert into a SELECT/FROM/VALUES statement, you can do whatever transformations you need (namely, converting Person.name values to the corresponding Person.id values) and then combine it with the peewee insert_from() functionality.
So let us see how this would look.
First let us begin constructing the VALUES clause itself. We want properly escaped values, so we will use question marks instead of the values for now, and put the actual values in later.
#this is gonna look like '(?,?), (?,?), (?,?)'
# or '(%s,%s), (%s,%s), (%s,%s)' depending on the database type
values_question_marks = ','.join(['(%s, %s)' % (db.interpolation,db.interpolation)]*len(new_status_updates))
The next step is to construct the values clause. Here is our first attempt:
--the %s here will be replaced by the question marks of the clause
--in postgres, you must have a name for every item in `FROM`
SELECT * FROM (VALUES %s) someanonymousname
OK, so now we have a result-set that looks like:
name | status
-----|-------
... | ...
Except! There are no column names. This will cause us a bit of heartache in a minute, so we have to figure out a way to give the result-set proper column names.
The postgres way would be to just alter the AS clause:
SELECT * FROM (VALUES %s) someanonymousname(name,status)
sqlite3 does not support that (grr).
So we are reduced to a kludge. Luckily stackoverflow provides: Is it possible to select sql server data using column ordinal position, and we can construct something like this:
SELECT NULL as name, NULL as status WHERE 1=0
UNION ALL
SELECT * FROM (VALUES %s) someanonymousname
This works by first creating an empty result-set with the proper column-names, and then concatenating the result-set from the VALUES clause to it. This will produce a result-set that has the proper column-names, will work in sqlite3, and in postgres.
Now to bring this back to peewee:
values_query = """
(
--a trick to make an empty query result with two named columns, to more portably name the resulting
--VALUES clause columns (grr sqlite)
SELECT NULL as name, NULL as status WHERE 1=0
UNION ALL
SELECT * FROM (VALUES %s) someanonymousname
)
"""
values_query %= (values_question_marks,)
#unroll the parameters into one large list
#this is gonna look like ['Frank', 'wat', 'Frank', 'nooo', 'Joe', 'noooo' ...]
values_query_params = [value for values in new_status_updates for value in values]
#turn it into peewee SQL
values_query = SQL(values_query,*values_query_params)
data_query = (Person
              .select(Person.id, SQL('values_list.status').alias('status'))
              .from_(Person, values_query.alias('values_list'))
              .where(SQL('values_list.name') == Person.name))
insert_query = StatusUpdate.insert_from([StatusUpdate.person, StatusUpdate.status], data_query)
print (insert_query)
insert_query.execute()
print ('StatusUpdate.select().count():',StatusUpdate.select().count())
Output:
StatusUpdate.select().count(): 4
I am trying to set up a website in Django which allows the user to send queries to a database containing information about their representatives in the European Parliament. I have the data in a comma-separated .txt file with the following format:
Parliament, Name, Country, Party_Group, National_Party, Position
7, Marta Andreasen, United Kingdom, Europe of freedom and democracy Group, United Kingdom Independence Party, Member
etc....
I want to populate a SQLite3 database with this data, but so far all the tutorials I have found only show how to do this by hand. Since I have 736 observations in the file I don't really want to do this.
I suspect this is a simple matter, but I would be very grateful if someone could show me how to do this.
Thomas
So assuming your models.py looks something like this:
class Representative(models.Model):
    parliament = models.CharField(max_length=128)
    name = models.CharField(max_length=128)
    country = models.CharField(max_length=128)
    party_group = models.CharField(max_length=128)
    national_party = models.CharField(max_length=128)
    position = models.CharField(max_length=128)
You can then run python manage.py shell and execute the following:
import csv
from your_app.models import Representative
# If you're using different field names, change this list accordingly.
# The order must also match the column order in the CSV file.
fields = ['parliament', 'name', 'country', 'party_group', 'national_party', 'position']
for row in csv.reader(open('your_file.csv')):
    Representative.objects.create(**dict(zip(fields, row)))
And you're done.
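If you would rather issue a single bulk insert instead of one query per row, Django's bulk_create can be used the same way (a sketch, with the same assumed field list and file name):
import csv
from your_app.models import Representative

fields = ['parliament', 'name', 'country', 'party_group', 'national_party', 'position']
with open('your_file.csv') as f:
    # Build all model instances first, then insert them in one batch.
    Representative.objects.bulk_create(
        [Representative(**dict(zip(fields, row))) for row in csv.reader(f)]
    )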
Addendum (edit)
Per Thomas's request, here's an explanation of what **dict(zip(fields,row)) does:
So initially, fields contains a list of field names that we defined, and row contains a list of values that represents the current row in the CSV file.
fields = ['parliament', 'name', 'country', ...]
row = ['7', 'Marta Andreasen', 'United Kingdom', ...]
What zip() does is combine two lists into one list of pairs of items from both lists (like a zipper); i.e. zip(['a','b','c'], ['A','B','C']) will return [('a','A'), ('b','B'), ('c','C')]. So in our case:
>>> zip(fields, row)
[('parliament', '7'), ('name', 'Marta Andreasen'), ('country', 'United Kingdom'), ...]
The dict() function simply converts the list of pairs into a dictionary.
>>> dict(zip(fields, row))
{'parliament': '7', 'name': 'Marta Andreasen', 'country': 'United Kingdom', ...}
The ** is a way of converting a dictionary into a keyword argument list for a function. So function(**{'key': 'value'}) is the equivalent of function(key='value'). So in our example, calling create(**dict(zip(fields, row))) is the equivalent of:
create(parliament='7', name='Marta Andreasen', country='United Kingdom', ...)
Hope this clears things up.
As SiggyF says and only slightly differently than Joschua:
Create a text file with your schema, e.g.:
CREATE TABLE politicians (
Parliament text,
Name text,
Country text,
Party_Group text,
National_Party text,
Position text
);
Create table:
>>> import csv, sqlite3
>>> conn = sqlite3.connect('my.db')
>>> c = conn.cursor()
>>> with open('myschema.sql') as f: # read in schema file
...     schema = f.read()
...
>>> c.execute(schema) # create table per schema
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit() # commit table creation
Use csv module to read file with data to be inserted:
>>> csv_reader = csv.reader(open('myfile.txt'), skipinitialspace=True)
>>> csv_reader.next() # skip the first line in the file
['Parliament', 'Name', 'Country', ...
# put all data in a tuple
# edit: decoding from utf-8 file to unicode
>>> to_db = tuple([i.decode('utf-8') for i in line] for line in csv_reader)
>>> to_db # this will be inserted into table
[(u'7', u'Marta Andreasen', u'United Kingdom', ...
Insert data:
>>> c.executemany("INSERT INTO politicians VALUES (?,?,?,?,?,?);", to_db)
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit()
Verify that all went as expected:
>>> c.execute('SELECT * FROM politicians').fetchall()
[(u'7', u'Marta Andreasen', u'United Kingdom', ...
Edit:
And since you've decoded (to unicode) on input, you need to be sure to encode on output.
For example:
with open('encoded_output.txt', 'w') as f:
    for row in c.execute('SELECT * FROM politicians').fetchall():
        for col in row:
            f.write(col.encode('utf-8'))
            f.write('\n')
You could read the data using the csv module. Then you can create an insert sql statement and use the method executemany:
cursor.executemany(sql, rows)
Or use add_all if you use SQLAlchemy.
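A minimal sketch of the csv-plus-executemany approach, assuming the politicians table and myfile.txt from the previous answer:
import csv
import sqlite3

conn = sqlite3.connect('my.db')
cursor = conn.cursor()

with open('myfile.txt', newline='') as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header line
    cursor.executemany("INSERT INTO politicians VALUES (?, ?, ?, ?, ?, ?)", reader)

conn.commit()
conn.close()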
You asked what the create(**dict(zip(fields, row))) line did.
I don't know how to reply directly to your comment, so I'll try to answer it here.
zip takes multiple lists as args and returns a list of their corresponding elements as tuples.
zip(list1, list2) => [(list1[0], list2[0]), (list1[1], list2[1]), .... ]
dict takes a list of 2-element tuples and returns a dictionary mapping each tuple's first element (key) to its second element (value).
create is a function that takes keyword arguments. You can use **some_dictionary to pass that dictionary into a function as keyword arguments.
create(**{'name':'john', 'age':5}) => create(name='john', age=5)
Something like the following should work: (not tested)
# Open database (will be created if not exists)
conn = sqlite3.connect('/path/to/your_file.db')
c = conn.cursor()
# Create table
c.execute('''create table representatives
(parliament text, name text, country text, party_group text, national_party text, position text)''')
f = open("thefile.txt")
for i in f.readlines():
    # Insert a row of data; strip the trailing newline and pass the split fields as the parameter sequence
    c.execute("""insert into representatives
                 values (?,?,?,?,?,?)""", i.strip().split(", "))
# Save (commit) the changes
conn.commit()
# We can also close the cursor if we are done with it
c.close()
If you want to do it with a simple method using sqlite3, you can do it using these 3 steps:
$ sqlite3 db.sqlite3
sqlite> .separator ","
sqlite> .import myfile.txt table_name
However do keep the following points in mind:
The .txt file should be in the same directory as your db.sqlite3,
otherwise use an absolute path "/path/myfile.txt" when importing
Your table schema (number of columns) should match the number of comma-separated values in each row of the txt file
You can use the .tables command to verify your table name
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
auth_group table_name
auth_group_permissions django_admin_log
auth_permission django_content_type
auth_user django_migrations
auth_user_groups django_session
auth_user_user_permissions