Cannot select a row based on a certain string - Python

I have been trying to select rows of a dataset based on a string in one of its columns, 'Gemeente', the Dutch word for municipality.
I have used the following code to select it.
select * from incomecbs where regioaanduiding = 'Gemeente'
In this case, regioaanduiding means 'region'.
Unfortunately I get no results when doing this.
Does anyone know what I am doing wrong?

There may be trailing spaces in your data. Try using LIKE:
select * from incomecbs where regioaanduiding like 'Gemeente%'
Keep in mind that this may perform poorly on large tables compared to strict equality.
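If trailing spaces really are the issue, another option is to normalize the value instead of pattern matching. A minimal sketch, assuming SQLite through Python's sqlite3 module; the file name income.db is a hypothetical placeholder:

import sqlite3

conn = sqlite3.connect('income.db')  # hypothetical database file
cur = conn.cursor()
# TRIM() strips leading/trailing spaces so strict equality works again;
# the ? placeholder binds the value safely.
cur.execute("SELECT * FROM incomecbs WHERE TRIM(regioaanduiding) = ?",
            ('Gemeente',))
for row in cur.fetchall():
    print(row)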

Extract pandas dataframe column names from query string

I have a dataset with a lot of fields, so I don't want to load all of it into a pd.DataFrame, but just the basic ones.
Sometimes I would like to do some filtering upon loading, applying the filter via the query or eval methods. That means I need a query string of the form "PROBABILITY > 10 and DISTANCE <= 50", and the columns it references need to be loaded in the dataframe.
Is it possible to extract the column names from the query string in order to load them from the dataset?
I know some magic using regex is possible, but I'm sure that it would break sooner or later, as the conditions get complicated.
So, I'm asking if there is a native pandas way to extract the column names from the query string.
I think you can use the usecols parameter when you load your dataframe. I use it when I load a CSV; I don't know whether that is possible when you use SQL or another format.
columns_to_use = ['Column1', 'Column3']
pd.read_csv(..., usecols=columns_to_use)
Thank you
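For actually pulling the column names out of the query string, I don't know of a public pandas API for it, but since simple query strings are valid Python expressions you can lean on the standard ast module. A minimal sketch, assuming the query uses plain identifiers (no backtick-quoted names or @-variables, which pandas also allows):

import ast

def columns_in_query(query):
    # Collect every bare identifier in the expression; for simple queries
    # these are exactly the column names.
    tree = ast.parse(query, mode='eval')
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

print(columns_in_query("PROBABILITY > 10 and DISTANCE <= 50"))
# {'PROBABILITY', 'DISTANCE'}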

How can I print only rows where a specific column's value is greater than x? Sheets API with Python

So what I want to do is print only the rows where, for example, the price (or any other column identified by its header cell) is greater than or equal to, let's say, 50.
I haven't been able to find the answer elsewhere and couldn't do it myself with the API documentation.
I'm using Google Sheets API v4, and my goal, based on a sheet that contains information on mobile subscriptions, is to let users select what they want for price, GB, etc.
Here is what my sheet looks like:
Also, here is an unofficial documentation guide which I found great even though it didn't contain the answer I need; maybe someone here will succeed with it.
I tried running the following code but it didn't work:
val_list = col5
d = wks.findall(>50) if cell.value >50 :
print (val_list)
I hope you will be able to help me. I'm new to Python.
I think you had the right idea, but it looks like findall is for strings or regex, not an arbitrary boolean condition. Also, some of the syntax is a bit off, but that's to be expected when you are just starting out.
Here is how I would approach this with just what I could find in your attached document. I doubt this is the fastest or cleanest way to do this, but I think it's at least conceptually clear:
# List of all values in the 4th (price) column
prices = wks.col_values(4)
# Remove nonnumeric characters from the prices
prices = [p.replace('*', '') for p in prices[1:]]
# Get indices of rows with price >= 50
# (i + 2 accounts for 1-based indexing and the removed header row)
indices = [i + 2 for i, p in enumerate(prices) if float(p) >= 50]
# Print these rows
for i in indices:
    row = wks.row_values(i)
    print(row)
Going forward with this project, you may want to put these row values into a dataframe rather than just printing them so you can do further analysis on this subset of the data.
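A sketch of that last suggestion, assuming wks is a gspread worksheet as above, pandas is installed, and a hypothetical header name 'Price':

import pandas as pd

# get_all_records() returns one dict per row, keyed by the header row
df = pd.DataFrame(wks.get_all_records())
# 'Price' is a hypothetical column name; adjust to your sheet's actual header
df['Price'] = df['Price'].astype(str).str.replace('*', '', regex=False)
matches = df[df['Price'].astype(float) >= 50]
print(matches)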

Python/SQLite: smooth way to set column names in CREATE TABLE

I'm building up a table in Python's sqlite3 module which will consist of at least 18 columns. These columns are named by times (for example "08-00"), all stored in a list called 'time_range'. I want to avoid writing out all 18 column names by hand in the SQL statement, since they already exist inside the mentioned list and it would make the code quite ugly. However, this:
marks = '?,'*18
self.c.execute('''CREATE TABLE finishedJobs (%s)''' % marks, tuple(time_range))
did not work. It seems Python/SQLite does not accept parameters in this position. Is there any smart workaround, or do I really have to name every single column in a CREATE TABLE statement by hand?
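For the record, the usual workaround is string building rather than parameters: placeholders can only stand in for values, never for identifiers such as table or column names. A minimal sketch, assuming the names in time_range come from your own trusted code (never from user input), with the times and the TEXT type chosen only for illustration:

import sqlite3

time_range = ['%02d-00' % h for h in range(6, 24)]  # example: "06-00" .. "23-00"

# Quote each name in double quotes so identifiers like "08-00" are legal,
# then splice them into the statement text.
columns = ', '.join('"%s" TEXT' % name for name in time_range)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE finishedJobs (%s)' % columns)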

appending non-unique rows to another database using python

Hey all,
I have two databases. One has 145,000 rows and approx. 12 columns; the other has around 40,000 rows and 5 columns. I am trying to compare them based on two column values. For example, if in CSV#1 column one says 100-199 and column two says Main St (meaning that this row is contained within the 100 block of Main Street), how would I go about comparing that with a similar two columns in CSV#2? I need to compare every row in CSV#1 to each single row in CSV#2. If there is a match, I need to append the 5 columns of each matching row to the end of the row of CSV#2. Thus CSV#2's number of columns will grow significantly and it will have repeat entries; it doesn't matter how the columns are ordered. Any advice on how to compare two columns with another two columns in a separate database, and then iterate across all rows? I've been using Python and the csv module so far for the rest of the work, but this part of the problem has me stumped.
Thanks in advance
-John
1. A csv file is NOT a database. A csv file is just rows of text-chunks; a proper database (like PostgreSQL or MySQL or SQL Server or SQLite or many others) gives you proper data types, table joins, indexes, row iteration, proper handling of multiple matches, and many other things which you really don't want to rewrite from scratch.
2. How is it supposed to know that Address("100-199") == Address("Main Street")? You will have to come up with some sort of knowledge base which transforms each bit of text into a canonical address or address range which you can then compare; see Where is a good Address Parser, but be aware that it deals with singular addresses (not address ranges).
Edit:
Thanks to Sven; if you were using a real database, you could do something like
SELECT
    User.firstname, User.lastname, User.account, Order.placed, Order.fulfilled
FROM
    User
INNER JOIN Order ON
    User.streetnumber = Order.streetnumber
    AND User.streetname = Order.streetname
This works if streetnumber and streetname are exact matches; otherwise you still need to consider point #2 above.
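In the same spirit, you don't even need a database server: a sketch that loads both CSVs into an in-memory SQLite database and joins there. The file names and column layouts below are hypothetical placeholders, assuming exact matches on the two key columns:

import csv
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE csv1 (block TEXT, street TEXT, extra TEXT)')
conn.execute('CREATE TABLE csv2 (block TEXT, street TEXT, a TEXT, b TEXT, c TEXT)')

with open('csv1.csv', newline='') as f:
    conn.executemany('INSERT INTO csv1 VALUES (?, ?, ?)', csv.reader(f))
with open('csv2.csv', newline='') as f:
    conn.executemany('INSERT INTO csv2 VALUES (?, ?, ?, ?, ?)', csv.reader(f))

# Exact-match join on the two key columns; a csv2 row that matches several
# csv1 rows simply appears once per match, which is the behaviour asked for.
for row in conn.execute('''
        SELECT csv2.*, csv1.*
        FROM csv2
        JOIN csv1 ON csv1.block = csv2.block AND csv1.street = csv2.street'''):
    print(row)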

sqlite3 and cursor.description

When using the sqlite3 module in Python, all elements of cursor.description except the column names are set to None, so this tuple cannot be used to find the column types for a query result (unlike with other DB-API compliant modules). Is the only way to get the types of the columns to use pragma table_info(table_name).fetchall() to get a description of the table, store it in memory, and then match the column names from cursor.description to that overall table description?
No, it's not the only way. Alternatively, you can also fetch one row, iterate over it, and inspect the individual column Python objects and types. Unless the value is None (in which case the SQL field is NULL), this should give you a fairly precise indication of what the database column type was.
sqlite3 only uses sqlite3_column_decltype and sqlite3_column_type in one place each, and neither is accessible to the Python application - so there is no "direct" way that you may have been looking for.
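A minimal sketch of that fetch-and-inspect approach; the table and data here are made up purely for illustration:

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (id INTEGER, name TEXT, score REAL)')
conn.execute("INSERT INTO t VALUES (1, 'a', 2.5)")

cur = conn.execute('SELECT * FROM t')
row = cur.fetchone()
names = [d[0] for d in cur.description]  # only index 0 is populated
for name, value in zip(names, row):
    # None means SQL NULL; otherwise the Python type mirrors the storage class
    print(name, type(value).__name__)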
I haven't tried this in Python, but you could try something like
SELECT *
FROM sqlite_master
WHERE type = 'table';
which contains the DDL CREATE statement used to create each table. By parsing the DDL you can get the column type info, such as it is. Remember that SQLite is rather vague and unrestrictive when it comes to column data types.
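In Python that would look something like this (a sketch; 'example.db' is a hypothetical file name):

import sqlite3

conn = sqlite3.connect('example.db')  # hypothetical database file
for name, sql in conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"):
    print(name, '->', sql)  # sql is the original CREATE TABLE text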
