Error while adding columns to SQLite3 database table using python loop - python

I am trying to create a table in an SQLite database from a list of items in Python, but creating the table raises an operational error.
I am using a Python for loop to use the list items as column headers. The loop works fine up to the list item 'QT', then breaks with the error below.
col_names = [
"ABPs",
"ABPd",
"ABPm",
"ARTs",
"ARTd",
"ARTm",
"AoS",
"AoD",
"AoM",
"UAPs",
"UAPd",
"UAPm",
"P1s",
"P1d",
"P1m",
"NBPs",
"NBPd",
"NBPm",
"P_1d",
"P_1m",
"PVC",
"QT",
"QT-HR",
"QTc",
"ST-I",
"ST-II",
"ST-III",
"ST-aVR",
"ST-aVF",
"ST-aVL",
"ST-MCL",
"ST-V1",
"ST-V2",
"dQTc",
"ST-V3",
"ST-V4",
"ST-V5",
"ST-V6",
]
cur.execute("""CREATE TABLE CICU2 (datetime DATETIME)""")
for colname in col_names:
    cur.execute("""ALTER TABLE CICU2 ADD COLUMN """ + colname + """ TEXT""")
Error : near "-": syntax error.

A hyphen (-) is not permitted in unquoted column names; with a name such as QT-HR you are asking SQLite to subtract one column name from another. You must use proper quoting if you want to use any non-standard characters in a column name.
The standard method of quoting a column name is to use double quotes; any existing double quotes in a column name can be replaced with doubled double quotes:
cur.execute('ALTER TABLE CICU2 ADD COLUMN "{}" TEXT'.format(colname.replace('"', '""')))
I used a format string to insert the column name into the query, which is easier to read for humans. The {} part is replaced by the value of the first argument to the str.format() method.
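The quoting approach can be tried end to end with a small sketch like the one below. It assumes a throwaway in-memory database and a shortened column list; the helper name quote_identifier and the column 'odd"name' are made up for illustration and are not part of the original question.

```python
import sqlite3


def quote_identifier(name):
    """Wrap a column name in double quotes, doubling any embedded quotes."""
    return '"{}"'.format(name.replace('"', '""'))


# Shortened, illustrative list; 'odd"name' is a made-up name with a quote in it.
col_names = ["QT", "QT-HR", "ST-aVR", 'odd"name']

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()
cur.execute("CREATE TABLE CICU2 (datetime DATETIME)")
for colname in col_names:
    cur.execute(
        "ALTER TABLE CICU2 ADD COLUMN {} TEXT".format(quote_identifier(colname))
    )

# PRAGMA table_info returns one row per column; index 1 is the column name.
created = [row[1] for row in cur.execute("PRAGMA table_info(CICU2)")]
print(created)
conn.close()
```

Quoted names also have to be quoted in later queries, for example SELECT "QT-HR" FROM CICU2.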

Related

How to fix "psycopg2.errors.InvalidDatetimeFormat: invalid input syntax for type date:"-"

I have an Excel file whose columns hold various types of data, including dates.
I exported several lists from the xlsx file with pandas and zipped them into a single list named result.
Now I want to insert result into PostgreSQL, but the insert fails when a date cell is empty,
even though in my models.py I have set the DateField to null=True, blank=True. I get this error:
psycopg2.errors.InvalidDatetimeFormat: invalid input syntax for type date: "-"
I'm new to coding.
cursor = connection.cursor()
for z in result:
    cursor.execute("""INSERT INTO
    forecast_forcast(document_number,document_name,project,discipline_code,first_Plan_issue_Date,second_Plan_issue_Date,final_Plan_issue_Date,class_num,rev,latest_status,comment_status,current_complete,responsible,weight) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)""",z)
connection.commit()
Use str(datetime.datetime.now().date()) as the parameter for a null or blank value.
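A minimal sketch of cleaning each row before it reaches cursor.execute is shown below. It assumes that empty cells arrive as the strings "-" or "" (as the error message suggests); the helper clean_row and the sample row are hypothetical. Passing default=None is a different technique from the suggestion above: psycopg2 sends None as SQL NULL, which fits a column declared with null=True. Passing today's date as the default reproduces the suggestion above instead.

```python
import datetime

# Assumption: these are the strings that stand for an empty cell in `result`.
PLACEHOLDERS = {"", "-"}


def clean_row(row, default=None):
    """Return a copy of `row` with empty-cell placeholders replaced by `default`."""
    return tuple(
        default if isinstance(value, str) and value.strip() in PLACEHOLDERS else value
        for value in row
    )


# Made-up sample row: two of the cells are empty.
row = ("DOC-001", "Layout", "-", "2021-03-01", "")

as_null = clean_row(row)  # placeholders become None (SQL NULL)
today = str(datetime.datetime.now().date())
as_today = clean_row(row, default=today)  # placeholders become today's date
print(as_null)
print(as_today)
```

In the loop from the question this would be used as cursor.execute(sql, clean_row(z)). Note that the sketch replaces placeholders in every column, not only in the date columns.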

Creating a table in MariaDB using a list of column names in Python

I am trying to create a table in mariadb using python. I have all the column names stored in a list as shown below.
collist = ['RR', 'ABPm', 'ABPs', 'ABPd', 'HR', 'SPO']
This is just a sample list; the actual list has 200 items. I am trying to create a table using the above collist elements as columns, and the datatype for the columns is VARCHAR.
This is the code I am using to create a table
for p in collist:
    cur.execute('CREATE TABLE IF NOT EXISTS table1 ({} VARCHAR(45))'.format(p))
The above code executes, but only the first element of the list is added as a column in the table and I cannot see the remaining elements. I'd really appreciate any help with this.
You can build the string in three parts and then .join() them together. The middle portion is the column definitions, built by joining the items of the original list. This doesn't seem particularly healthy, both in the number of columns and in the fact that everything is VARCHAR(45), but that's your decision:
collist = ['RR', 'ABPm', 'ABPs', 'ABPd', 'HR', 'SPO']
query = ''.join(["CREATE TABLE IF NOT EXISTS table1 (",
                 ' VARCHAR(45), '.join(collist),
                 ' VARCHAR(45))'])
Because we used join, you need to specify the last column type separately (the third item in the list) to correctly close the query.
NOTE: If the input data comes from user input then this would be susceptible to SQL injection since you are just formatting unknown strings in, to be executed. I am assuming the list of column names is internal to your program.
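As a variation on the same idea, the statement can be built with one column definition per list item, which avoids handling the last column separately. The sketch below only builds and prints the string; the helper name build_create_table is made up, and the backtick quoting is an addition that assumes MariaDB's default identifier quoting. Quoting reduces the injection risk mentioned in the note but is not a substitute for validating column names that come from users.

```python
def build_create_table(table, columns, col_type="VARCHAR(45)"):
    """Return a CREATE TABLE statement with one column per name in `columns`."""

    def quote(name):
        # MariaDB quotes identifiers with backticks; embedded backticks are doubled.
        return "`{}`".format(name.replace("`", "``"))

    column_defs = ", ".join("{} {}".format(quote(c), col_type) for c in columns)
    return "CREATE TABLE IF NOT EXISTS {} ({})".format(quote(table), column_defs)


collist = ['RR', 'ABPm', 'ABPs', 'ABPd', 'HR', 'SPO']
query = build_create_table("table1", collist)
print(query)
```

The resulting string would then be passed to cur.execute(query) once, outside of any loop.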

Referencing row values in pyodbc when column name contains dashes (hyphens)

I am new to Python and pyodbc.
I am trying to print the first row of a table in a Progress OpenEdge database (on Windows 7). Here is the code block that is not running:
cursor.execute("select my-nr, my-dt-my from mytable")
row = cursor.fetchone()
print(row.my-nr, row.my-dt-my)
This gives the errors:
undefined name 'nr'
undefined name 'dt'
undefined name 'my'
I guess it has something to do with the minus - symbols behind the dot . in print(row.my-nr, row.my-dt-my)
It was easy to print out the table names and column names from the database earlier but for some reason printing out rows is harder.
Any ideas how to get the rows printed?
pyodbc allows us to reference values in a pyodbc.Row object using the form row.column_name provided that the column names are legal Python identifiers. So, for example, we can do something like
row = crsr.fetchone()
print(row.city)
to print the value of the "city" column. Unfortunately, my-nr is not a legal Python identifier so if we try to print the value of the "my-nr" column using ...
row = crsr.fetchone()
print(row.my-nr) # error
... Python parses that as "row.my minus nr" where row.my would be interpreted as a column in the Row object and nr would be interpreted as a Python variable.
To work around the issue we can grab a list of the column names, merge those names with the row values into a dictionary, and then refer to the values in the dictionary:
crsr.execute(sql)
col_names = [x[0] for x in crsr.description]
row = crsr.fetchone()
row_as_dict = dict(zip(col_names, row))
print(row_as_dict['my-nr']) # no error
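The dictionary technique relies only on cursor.description, which every DB-API driver provides, so it can be tried without an OpenEdge database. The sketch below uses the standard-library sqlite3 module as a stand-in for pyodbc; the table contents are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for pyodbc here; both expose cursor.description.
conn = sqlite3.connect(":memory:")
crsr = conn.cursor()
crsr.execute('CREATE TABLE mytable ("my-nr" INTEGER, "my-dt-my" TEXT)')
crsr.execute("INSERT INTO mytable VALUES (?, ?)", (7, "2020-01-31"))  # made-up row

crsr.execute('SELECT "my-nr", "my-dt-my" FROM mytable')
col_names = [x[0] for x in crsr.description]  # first item of each entry is the name
row = crsr.fetchone()
row_as_dict = dict(zip(col_names, row))
print(row_as_dict["my-nr"], row_as_dict["my-dt-my"])
conn.close()
```

Because the keys are ordinary strings, hyphens in them cause no parsing problems.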
The simplest solution I can think of is this. First, columns containing hyphens need to be quoted in OpenEdge (see here). Second, you can alias the columns so they can be referenced as valid Python attributes. You'll need to do something like this:
cursor.execute('select "my-nr" as mynr, "my-dt-my" as mydtmy from mytable')
row = cursor.fetchone()
print(row.mynr, row.mydtmy)
Good luck!
I believe you need to change the column names in the database so that they don't contain any '-' characters.
Python names cannot contain characters that Python reserves for other purposes, so you have to avoid hyphens (-), exclamation marks (!), colons (:) and so on.
According to this answer, the underscore (_) is the only punctuation character allowed in variable names.

Why does SQLite3 not yield an error

I am quite new to SQL and am trying to fix a bug in the output of an SQL query. This question is not about the bug, though, but about why SQLite3 does not raise an error when it should.
I have a query string that looks like:
QueryString = ("SELECT e.event_id, "
"count(e.event_id), "
"e.state, "
"MIN(e.boot_time) AS boot_time, "
"e.time_occurred, "
"COALESCE(e.info, 0) AS info "
"FROM events AS e "
"JOIN leg ON leg.id = e.leg_id "
"GROUP BY e.event_id "
"ORDER BY leg.num_leg DESC, "
"e.event_id ASC;\n"
)
This yields an output with no errors.
What I don't understand is why there is no error when I GROUP BY e.event_id, even though e.state and e.time_occurred are neither wrapped in aggregate functions nor part of the GROUP BY clause.
e.state is a string column. e.time_occurred is an integer column.
I am using the QueryString in Python.
In a misguided attempt to be compatible with MySQL, this is allowed. (The non-aggregated column values come from some random row in the group.)
Since SQLite 3.7.11, using min() or max() guarantees that the values in the non-aggregated columns come from the row that has the minimum/maximum value in the group.
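The min() behaviour can be observed with a small sketch like the following. It assumes an in-memory database and a cut-down events table with made-up rows; only the column names are borrowed from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE events (event_id INTEGER, state TEXT, boot_time INTEGER)")
cur.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "late", 30), (1, "early", 10), (1, "middle", 20), (2, "only", 5)],
)

# `state` is a bare column: it is neither aggregated nor in GROUP BY.
# With a single MIN() in the query, SQLite takes it from the row holding the minimum.
rows = cur.execute(
    "SELECT event_id, state, MIN(boot_time) FROM events "
    "GROUP BY event_id ORDER BY event_id"
).fetchall()
print(rows)
conn.close()
```

Without the MIN() (or MAX()) aggregate, the value reported for state could come from any row in the group.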
SQLite and MySQL allow bare columns in an aggregation query. This is explained in the documentation:
In the query above, the "a" column is part of the GROUP BY clause and
so each row of the output contains one of the distinct values for "a".
The "c" column is contained within the sum() aggregate function and so
that output column is the sum of all "c" values in rows that have the
same value for "a". But what is the result of the bare column "b"? The
answer is that the "b" result will be the value for "b" in one of the
input rows that form the aggregate. The problem is that you usually do
not know which input row is used to compute "b", and so in many cases
the value for "b" is undefined.
Your particular query is:
SELECT e.event_id, count(e.event_id), e.state, MIN(e.boot_time) AS boot_time,
e.time_occurred, COALESCE(e.info, 0) AS info
FROM events AS e JOIN
leg
ON leg.id = e.leg_id
GROUP BY e.event_id
ORDER BY leg.num_leg DESC, e.event_id ASC;
If e.event_id is the primary key in events, then this syntax is even supported by the ANSI standard, because event_id is sufficient to uniquely define the other columns in a row in events.
If e.event_id is a PRIMARY or UNIQUE key of the table then e.time_occurred is called "functionally dependent" and would not even throw an error in other SQL compliant DBMSs.
However, SQLite has not implemented functional dependency. In the case of SQLite (and MySQL) no error is thrown even for columns that are not functionally dependent on the GROUP BY columns.
SQLite (and MySQL) simply pick an arbitrary row from the group to fill the (in SQLite lingo) "bare column".

How to select and order multiple columns in a Pyspark Dataframe after a join

I want to select multiple columns from an existing dataframe (which is created after joins) and order the fields to match my target table structure. How can this be done? The approach I have used is below. I am able to select the required columns, but not in the right sequence.
Required (Target Table structure) :
hist_columns = ("acct_nbr","account_sk_id", "zip_code","primary_state", "eff_start_date" ,"eff_end_date","eff_flag")
account_sk_df = hist_process_df.join(broadcast(df_sk_lkp) ,'acct_nbr','inner' )
account_sk_df_ld = account_sk_df.select([c for c in account_sk_df.columns if c in hist_columns])
>>> account_sk_df
DataFrame[acct_nbr: string, primary_state: string, zip_code: string, eff_start_date: string, eff_end_date: string, eff_flag: string, hash_sk_id: string, account_sk_id: int]
>>> account_sk_df_ld
DataFrame[acct_nbr: string, primary_state: string, zip_code: string, eff_start_date: string, eff_end_date: string, eff_flag: string, account_sk_id: int]
The account_sk_id column needs to be in 2nd place. What's the best way to do this?
Try selecting the columns by passing the list directly, rather than iterating over the existing columns, and the ordering should be OK:
account_sk_df_ld = account_sk_df.select(*hist_columns)
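The reason the original comprehension keeps the wrong order can be seen without Spark. The sketch below uses plain Python lists as a stand-in for account_sk_df.columns; the variable names filtered and targeted are made up for illustration.

```python
hist_columns = ("acct_nbr", "account_sk_id", "zip_code", "primary_state",
                "eff_start_date", "eff_end_date", "eff_flag")

# Stand-in for account_sk_df.columns, in the order shown in the question.
df_columns = ["acct_nbr", "primary_state", "zip_code", "eff_start_date",
              "eff_end_date", "eff_flag", "hash_sk_id", "account_sk_id"]

# Iterating over the DataFrame's columns keeps the DataFrame's order.
filtered = [c for c in df_columns if c in hist_columns]

# Iterating over the target tuple keeps the target order.
targeted = [c for c in hist_columns if c in df_columns]

print(filtered)
print(targeted)
```

If some target columns might be missing from the DataFrame, passing a list such as targeted to select would keep the target order while skipping the missing names.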
