type of the data from mysql - python

Here is my code in Python:
queryuniq = "SELECT COUNT(distinct src_ip), COUNT(distinct video_id)FROM video"
cur.execute(queryuniq)
uniq = []
uniq = cur.fetchall()
print uniq
ip = str(uniq[0])
video = str(uniq[1])
fd2.write("There are %d ip addresses and %d video in total" %(int(ip), int(video)))
This is the value of "uniq" variable I got:
((2052L, 163581L),)
And this error message:
fd2.write("There are %d ip addresses in total" %(int(ip)))
ValueError: invalid literal for int() with base 10: '((2052L,),)'
video = str(uniq[1])
IndexError: tuple index out of range
I just simply want to count the distinct items in a column in the database, and print the INT value in a file.
Can anyone explain why the SELECT command returns a weird data format like ((2052L, 163581L),)? I don't understand why there is an "L" after the number.
How can I solve this problem? Many thanks!

uniq is a tuple of tuples (each entry at the outer level represents a database row, within which there is a tuple of column values).
Your query always returns one row. Therefore the outer tuple always contains one element, and you could fix your code by replacing:
uniq = cur.fetchall()
with
uniq = cur.fetchall()[0]
Also, the conversions from int to string and then back to int are unnecessary.
To summarize, the following is a tidied up version of your code:
queryuniq = "SELECT COUNT(distinct src_ip), COUNT(distinct video_id)FROM video"
cur.execute(queryuniq)
uniq = cur.fetchall()[0]
ip, video = uniq
fd2.write("There are %d ip addresses and %d video in total" %(ip, video))

There are several things wrong with your code.
Firstly, cur.fetchall() - as the name implies - fetches all the results from the query. Since Python does not know that your query only returns a single row, it still returns a tuple of all rows. So uniq[0] does not refer to the first field in the row, it refers to the first row in the result.
Since you know you only want one row, you could use cur.fetchone().
Secondly, why are you converting the results to strings then converting them back to ints? That seems pointless. They are in the correct format already - L just means they are 'long ints'.
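As a rough sketch of the difference between fetchall() and fetchone(), the snippet below uses Python 3 and the standard-library sqlite3 module as a stand-in for MySQLdb. That substitution is an assumption made so the example runs without a MySQL server, and the table contents are invented. Both modules follow the DB-API, so the row shapes are analogous, although sqlite3 returns a list of tuples rather than a tuple of tuples, and Python 3 integers print without the L suffix.

```python
import sqlite3

# In-memory database with invented sample data (not from the original post).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE video (src_ip TEXT, video_id INTEGER)")
cur.executemany(
    "INSERT INTO video (src_ip, video_id) VALUES (?, ?)",
    [("10.0.0.1", 1), ("10.0.0.1", 2), ("10.0.0.2", 2), ("10.0.0.3", 3)],
)

query = "SELECT COUNT(DISTINCT src_ip), COUNT(DISTINCT video_id) FROM video"

cur.execute(query)
all_rows = cur.fetchall()      # every row: a sequence holding one inner tuple

cur.execute(query)
ip, video = cur.fetchone()     # only the single row, unpacked into two ints

message = "There are %d ip addresses and %d video in total" % (ip, video)
print(all_rows)
print(message)
```

Indexing all_rows[0] gives the same row that fetchone() returns, which is why uniq[0] in the question was a whole row rather than the first count.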

Related

pyhon: repeatedly mysql query on_message in websocket not getting latest results [duplicate]

I'm wondering why my MySQL COUNT(*) query always results in ->num_rows to be equal 1.
$result = $db->query("SELECT COUNT( * ) FROM u11_users");
print $result->num_rows; // prints 1
Whereas fetching "real data" from the database works fine.
$result = $db->query("SELECT * FROM u11_users");
print $result->num_rows; // prints the correct number of elements in the table
What could be the reason for this?
Because Count(*) returns just one line with the number of rows.
Example:
Using Count(*) the result's something like the following.
array('COUNT(*)' => 20);
echo $result['COUNT(*)']; // 20
It should return one row*. To get the count you need to:
$result = $db->query("SELECT COUNT(*) AS C FROM u11_users");
$row = $result->fetch_assoc();
print $row["C"];
* since you are using an aggregate function and not using GROUP BY
that's why COUNT exists, it always returns one row with number of selected rows
http://dev.mysql.com/doc/refman/5.1/en/counting-rows.html
Count() is an aggregate function which means it returns just one row that contains the actual answer. You'd see the same type of thing if you used a function like max(id); if the maximum value in a column was 142, then you wouldn't expect to see 142 records but rather a single record with the value 142. Likewise, if the number of rows is 400 and you ask for the count(*), you will not get 400 rows but rather a single row with the answer: 400.
So, to get the count, you'd run your first query, and just access the value in the first (and only) row.
By the way, you should go with this count(*) approach rather than querying for all the data and taking $result->num_rows; because querying for all rows will take far longer since you're pulling back a bunch of data you do not need.
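The one-row behaviour of aggregates is not specific to PHP or MySQL. The sketch below reproduces it in Python 3 with the standard-library sqlite3 module; the choice of sqlite3 and the three sample users are assumptions for illustration only.

```python
import sqlite3

# In-memory table with invented rows; only the table name comes from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE u11_users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO u11_users (name) VALUES (?)",
    [("ann",), ("bob",), ("cid",)],
)

# Aggregates without GROUP BY produce exactly one result row.
count_rows = conn.execute("SELECT COUNT(*) AS C FROM u11_users").fetchall()
max_rows = conn.execute("SELECT MAX(id) FROM u11_users").fetchall()
# A plain SELECT produces one result row per table row.
data_rows = conn.execute("SELECT * FROM u11_users").fetchall()

user_count = count_rows[0][0]  # the value inside the single row
print(len(count_rows), user_count, len(max_rows), len(data_rows))
```

The number of result rows for the COUNT(*) query is 1, while the value stored in that row is the actual count.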

populating sqlite3 db with links in python

I'm trying to populate a database with a single column with a list of strings (links). I scraped the list and I must modify every single link before sending it to the database. This is the code:
for event in events:
    link_url = "https://www.website.com"+event+"#all"
    c.execute("INSERT INTO table (links) VALUES(?)", link_url)
I can get it working if I modify the variables and send a tuple, like this:
for event in events:
    link_url = "https://www.website.com"+event+"#all"
    link = (link_url,)
    c.execute("INSERT INTO seriea (links) VALUES(?)", link)
but I don't want to use this solution since I want to get a list of strings back out later:
c = connection.execute('select links from table')
list_of_urls = c.fetchall()
But this gives me a list of tuples.
This is the error I have: ProgrammingError: Incorrect number of bindings supplied. The current statement uses 1, and there are 80 supplied.
I think that's because the string characters are counted (actually more but I noticed that the number before "supplied" changes with the link fed)
I don't want to use this solution since I want to get a list of strings back out later:
c = connection.execute('select links from table')
list_of_urls = c.fetchall()
But this gives me a list of tuples.
The list of tuples you get back from a select has nothing to do with the way you insert data. Remember, tables have two dimensions:
id   links    something   else
1    "foo"    "bar"       "baz"
2    "quux"   "herp"      "derp"
When you do a select you get a list that corresponds to the rows here. But each row has multiple fields: id, links, something, and else. Each tuple in the list contains the values for each of the fields in the table.
If you just want the URLs as a list of strings you can use a list comprehension or similar:
c = connection.execute('select links from table')
list_of_rows = c.fetchall()
list_of_strings = [row[0] for row in list_of_rows]
#                      ^ index of first element in
#                  ^^^ the tuple of values for each row
Note that you do have to provide a tuple or other sequence when you insert the data:
For the qmark style, parameters must be a sequence. For the named style, it can be either a sequence or dict instance. The length of the sequence must match the number of placeholders, or a ProgrammingError is raised. If a dict is given, it must contain keys for all named parameters.
You might be thinking of the tuple part of it the wrong way. You don't need to pass in a tuple of URLs, you need to pass in a tuple of parameters. You're not saying "the links column should contain this tuple" but rather "this tuple contains enough values to fill in the placeholders in this query".
I'd rewrite that like so:
for event in events:
    link_url = "https://www.website.com"+event+"#all"
    c.execute("INSERT INTO seriea (links) VALUES(?)", (link_url,))
This is so you can have multiple parameters, e.g.
c.execute(
    "INSERT INTO seriea (links, some, other) VALUES(?, ?, ?)",
    (link_url, foo, bar),
)
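The quoted documentation also mentions the named style and, by extension, passing many parameter sets at once. A minimal sketch of both, assuming Python 3, an in-memory database, and an invented events list:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE seriea (links TEXT)")

events = ["/match/1", "/match/2", "/match/3"]  # invented sample data

# Named style: the dict keys match the :name placeholders in the statement.
c.execute(
    "INSERT INTO seriea (links) VALUES (:url)",
    {"url": "https://www.website.com" + events[0] + "#all"},
)

# executemany: one parameter tuple per row to insert.
params = [("https://www.website.com" + e + "#all",) for e in events[1:]]
c.executemany("INSERT INTO seriea (links) VALUES (?)", params)

# Reading back still yields one tuple per row; row[0] is the links column.
stored = [row[0] for row in c.execute("SELECT links FROM seriea ORDER BY rowid")]
print(stored)
```

In every case the parameters describe how to fill the placeholders of one statement, not the shape of the data stored in the column.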
The current statement uses 1, and there are 80 supplied.
I think that's because the string characters are counted
Yes, that's most likely what's happening. c.execute() expects to receive a sequence, and strings are a sequence of characters.
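The character counting can be reproduced directly. The sketch below assumes Python 3, an in-memory database, and an invented URL; it passes a bare string first and then the same string wrapped in a one-element tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE seriea (links TEXT)")

link_url = "https://www.website.com/match/1#all"  # invented sample URL

try:
    # A bare string is treated as a sequence of single-character parameters.
    c.execute("INSERT INTO seriea (links) VALUES(?)", link_url)
    error = None
except sqlite3.ProgrammingError as exc:
    error = str(exc)

# Wrapping the string in a one-element tuple supplies exactly one parameter.
c.execute("INSERT INTO seriea (links) VALUES(?)", (link_url,))
row_count = c.execute("SELECT COUNT(*) FROM seriea").fetchone()[0]
print(len(link_url), error, row_count)
```

The number reported in the error should track the length of the string, which matches the observation that it changes with the link being fed in.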

Python MySQLdb TypeError("not all arguments converted during string formatting")

I know this is a popular topic but I searched the various answers and didn't see a clear answer to my issue. I have a function that I want to use to insert records into my NDBC database that is giving me the error I mentioned in the title. The function is below:
def insertStdMet(station,cursor,data):
    # This function takes in a station id, database cursor and an array of data. At present
    # it assumes the data is a pandas dataframe with the datetime value as the index
    # It may eventually be modified to be more flexible. With the parameters
    # passed in, it goes row by row and builds an INSERT INTO SQL statement
    # that assumes each row in the data array represents a new record to be
    # added.
    fields=list(data.columns) # if our table has been constructed properly, these column names should map to the fields in the data table
    # Building the SQL string
    strSQL1='REPLACE INTO std_met (station_id,date_time,'
    strSQL2='VALUES ('
    for f in fields:
        strSQL1+=f+','
        strSQL2+='%s,'
    # trimming the last comma
    strSQL1=strSQL1[:-1]
    strSQL2=strSQL2[:-1]
    strSQL1+=") " + strSQL2 + ")"
    # Okay, now we have our SQL string. Now we need to build the list of tuples
    # that will be passed along with it to the .executemany() function.
    tuplist=[]
    for i in range(len(data)):
        r=data.iloc[i][:]
        datatup=(station,r.name)
        for f in r:
            datatup+=(f,)
        tuplist.append(datatup)
    cursor.executemany(strSQL1,tuplist)
When we get to the cursor.executemany() call, strSQL1 looks like this:
REPLACE INTO std_met (station_id,date_time,WDIR,WSPD,GST,WVHT,DPD,APD,MWD,PRES,ATMP,WTMP,DEWP,VIS) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
I'm using %s placeholders throughout and I am passing a list of tuples (~2315 tuples). Every value being passed is either a string, datetime, or number. I still have not found the issue. Any insights anyone cares to pass along would be sincerely appreciated.
Thanks!
You haven't given your SQL query a value for either station_id or date_time, so when it goes to unpack your arguments, there are two missing.
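The error text itself comes from Python's % string formatting, which, as far as I know, is what MySQLdb uses to fill in the placeholders after escaping the values. A plain-Python sketch with no database involved (the template and values are invented) shows the same message when there are more values than placeholders:

```python
template = "VALUES (%s,%s)"                  # two placeholders
values = ("station", "2014-01-01", 270)      # three values

try:
    template % values
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

In the question there are 14 values per tuple but only 12 placeholders, so the last two values have nowhere to go.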
I suspect you want the final statement to be something like:
REPLACE INTO std_met
(station_id,date_time,WDIR,WSPD,GST,WVHT,DPD,APD,MWD,
PRES,ATMP,WTMP,DEWP,VIS) VALUES (%s, %s, %s,%s,%s,%s,
%s,%s,%s,%s,%s,%s,%s,%s)
Note the extra two %s. It looks like your tuple already contains values for station_id and date_time, so you could try this change:
strSQL1='REPLACE INTO std_met (station_id,date_time,'
strSQL2='VALUES (%s, %s, '
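One way to keep the column list and the placeholders in step is to derive both from the same list. The helper below is a hypothetical sketch (build_replace_sql is not part of MySQLdb or of the original code), and it assumes the table and column names come from a trusted source, since identifiers cannot be passed as query parameters.

```python
def build_replace_sql(table, columns):
    """Return a REPLACE statement with one %s placeholder per column."""
    column_list = ",".join(columns)
    placeholders = ",".join(["%s"] * len(columns))
    return "REPLACE INTO %s (%s) VALUES (%s)" % (table, column_list, placeholders)


# Shortened, illustrative column list; the real one would be
# ["station_id", "date_time"] + list(data.columns).
columns = ["station_id", "date_time", "WDIR", "WSPD", "GST"]
sql = build_replace_sql("std_met", columns)
print(sql)
```

Because the placeholder count is computed from len(columns), adding or removing a column cannot leave the two halves of the statement out of sync.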

SQLite output from query into Python script

I have this Python script:
s = stdscr.getstr(0,0, 20) #input length last number
c = db.execute("""SELECT "debit" FROM "members" WHERE "barcode" = '%s' LIMIT 1""" % (s,))
for row in c:
    print row
    if row == '(0,)':
        #display cross
        print 'Tick'
    else:
        #display tick
        print 'Cross'
Where it is asking for a barcode input, and matching the debit field in the database.
The "print row" command returns "(0,)" but when I try to match it, I always get "Cross" as the output, which is not the intended result. Is there a semantic I'm obviously not observing?
Many thanks!
The variable row is a tuple, and '(0,)' is its string representation. You are comparing a variable with its string representation, which cannot work.
You need to compare it to the tuple value
if row == (0,):
Simply remove the quote marks.
Alternatively, you can write
if row[0] == 0:
which will avoid the creation of a tuple just for the comparison. As noted by @CL., row will never be an empty tuple, so extracting row[0] is safe.
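The three comparisons can be checked side by side. This is a sketch in Python 3 with an in-memory database and an invented members row; it also swaps the % string formatting for a ? placeholder, which is a change from the original query and lets the driver handle quoting of the barcode.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE members (barcode TEXT, debit INTEGER)")
db.execute("INSERT INTO members VALUES ('12345', 0)")  # invented sample row

barcode = "12345"
# The ? placeholder is filled from the parameter tuple by the driver.
row = db.execute(
    'SELECT "debit" FROM "members" WHERE "barcode" = ? LIMIT 1', (barcode,)
).fetchone()

is_string = row == "(0,)"   # tuple compared with a str
is_tuple = row == (0,)      # tuple compared with a tuple
is_zero = row[0] == 0       # first column compared with an int
print(is_string, is_tuple, is_zero)
```

Note that fetchone() returns None when no barcode matches, so code that indexes row[0] outside a loop should check for that case first.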

python MySQLdb: SELECT DISTINCT - why returning long

Here's a shortened version of my script:
import MySQLdb
src_db = MySQLdb.connect(**some_connection)
src_cursor = src_db.cursor()
v = src_cursor.execute('SELECT node_id FROM stats WHERE time_unit >= 1388534400')
v ends up being of type long, which I cannot understand. I expected a generator that would return 1-element tuples (I ask for only one column). Instead it returns a long value, which is the number of rows returned from the db. Why?
When I try to iterate through it:
node_ids = {int(x[0]) for x in v}
I get the following error:
TypeError: 'long' object is not iterable
You need to read the python database API specification, PEP-249.
Basically, after you've executed a query with a cursor object, you then query the cursor object for the results.
cursor.execute(my_sql)
for record in cursor.fetchall():
    # do stuff
src_cursor.execute(SELECT_QUERY) will return the number of rows matching your query.
To iterate through the result of the query:
for row in src_cursor.fetchall():
    # do stuff with row
To get one row at a time:
row = src_cursor.fetchone()
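A portable version of the pattern ignores whatever execute() returns, since PEP 249 leaves that return value undefined, and reads the rows from the cursor afterwards. The sketch below uses Python 3 and sqlite3 as a stand-in for MySQLdb (an assumption so it runs without a server; the sample rows are invented). With sqlite3, execute() happens to return the cursor itself rather than a row count, which is one more reason not to rely on it.

```python
import sqlite3

src_db = sqlite3.connect(":memory:")
src_cursor = src_db.cursor()
src_cursor.execute("CREATE TABLE stats (node_id INTEGER, time_unit INTEGER)")
src_cursor.executemany(
    "INSERT INTO stats (node_id, time_unit) VALUES (?, ?)",
    [(1, 1388534400), (2, 1388534500), (2, 1388534600), (3, 1300000000)],
)

# Run the query, then fetch the rows from the cursor, not from execute().
src_cursor.execute(
    "SELECT node_id FROM stats WHERE time_unit >= ?", (1388534400,)
)
node_ids = {int(row[0]) for row in src_cursor.fetchall()}
print(node_ids)
```

The set comprehension from the question works unchanged once it iterates over fetchall() instead of the value returned by execute().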
