How to get column names from a txt file into a pandas DataFrame?

I have a txt file (shown below) containing a table definition. I want to extract just the column names, without the data types etc., into a pandas DataFrame.
create table ad.TrackData
(
    track_id int unsigned auto_increment primary key,
    ads_id int null,
    ads_name varchar(45) null,
    play_time timestamp null,
    package_id int null,
    package_name varchar(45) null,
    company_id int null,
    company_name varchar(45) null,
    click_time timestamp null,
    demographic varchar(300) null,
    status tinyint(1) default 0 null
);
I have no idea how to do this; it would be very much appreciated if anyone could show me a way to do it.

First, save that text to file.txt, then run the Python script below:
with open('file.txt', 'r') as f:
    txt = f.read()

columnNames = []
for line in txt.split('\n'):
    # Column definitions are the indented lines; the first
    # whitespace-separated token on each one is the column name.
    if line.startswith(' '):
        columnNames.append(line.split()[0])
print(columnNames)
The output should be:
['track_id', 'ads_id', 'ads_name', 'play_time', 'package_id', 'package_name', 'company_id', 'company_name', 'click_time', 'demographic', 'status']
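If you then need those names as the columns of a pandas DataFrame, which is what the question asks for, a minimal sketch:
import pandas as pd

# An empty DataFrame whose columns are the extracted names;
# data rows can be added to it later.
df = pd.DataFrame(columns=columnNames)
print(df.columns.tolist())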

Related

Insert a dictionary into an SQLite3 database

Here I am trying to put a dictionary into an SQLite database:
data1 = {
    'x': x,
    'y': y,
    'width': width,
    'height': height,
    'proba': proba,
    'class': label,
    'id_label': id_label
}
sqliteConnection = sqlite3.connect('SQL_bdd.db')
cursor = sqliteConnection.cursor()
sqliteConnection.execute('''CREATE TABLE dic (
    x INT NOT NULL,
    y INT NOT NULL,
    width INT NOT NULL,
    height INT NOT NULL,
    proba BOOLEAN NOT NULL,
    class VARCHAR (255) NOT NULL,
    id_label INT NOT NULL
);''')
cursor.execute('INSERT INTO dic VALUES (?,?,?,?,?,?,?)', [dict['x'], dict['y'], dict['width'], dict['height'], dict['proba'], dict['class'], dict['id_label']]);
cursor.execute("SELECT * FROM dic")
The following error occurs and I don't know how to fix it:
cursor.execute('INSERT INTO dic VALUES (?,?,?,?,?,?,?)', [dict['x'], dict['y'], dict['width'], dict['height'], dict['proba'], dict['class'], dict['id_label']]);
TypeError: 'type' object is not subscriptable
Use data1['x'], data1['y'], etc. instead of dict['x'], dict['y'], etc. to index the data in your data1 dictionary. In your code, dict refers to Python's built-in type rather than your dictionary, which is why subscripting it raises TypeError: 'type' object is not subscriptable.
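For example, a minimal sketch of the corrected insert, reusing sqliteConnection, cursor, and data1 from the question:
# Index the data1 dictionary instead of the built-in dict type.
cursor.execute('INSERT INTO dic VALUES (?,?,?,?,?,?,?)',
               [data1['x'], data1['y'], data1['width'], data1['height'],
                data1['proba'], data1['class'], data1['id_label']])

# Alternatively, sqlite3's named placeholders accept the dictionary directly:
cursor.execute(
    'INSERT INTO dic VALUES (:x, :y, :width, :height, :proba, :class, :id_label)',
    data1)
sqliteConnection.commit()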

Insert statement does not work due to a datatype mismatch

I am scraping data from a site using Scrapy and Python and storing the data in a CSV file. Then I fetch the values from the CSV file and try to store them in a MySQL database table. The insert statement neither triggers an error nor inserts any data into the database. I checked the data types of the fields whose values come from the CSV: they are all strings. Because everything stored in the CSV is a string, storing the values in the db creates problems for all data types except string/varchar. What should I do now? Apart from varchar, my database table has columns of int(6) and timestamp data types.
import csv
import re
import pymysql
import sys
connection = pymysql.connect (host = "localhost", user = "root", passwd = ".....", db = "city_details")
cursor = connection.cursor ()
def insert_articles2(rows):
    rowcount = 0
    for row in rows:
        if rowcount != 0:
            sql = "INSERT IGNORE INTO articles2 (country, event_name, md5, date_added, profile_image, banner, sDate, eDate, address_line1, address_line2, pincode, state, city, locality, full_address, latitude, longitude, start_time, end_time, description, website, fb_page, fb_event_page, event_hashtag, source_name, source_url, email_id_organizer, ticket_url) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %d, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
            cursor.execute = (sql, (row[0], row[1], row[2], row[3], row[4], row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], row[13], row[14], row[15], row[16], row[17], row[18], row[19], row[20], row[21], row[22], row[23], row[24], row[25], row[26], row[27]))
        rowcount += 1
rows = csv.reader(open("items.csv", "r"))
insert_articles2(rows)
connection.commit()
Table structure for table articles2
CREATE TABLE IF NOT EXISTS `articles2` (
`id` int(6) NOT NULL AUTO_INCREMENT,
`country` varchar(45) NOT NULL,
`event_name` varchar(200) NOT NULL,
`md5` varchar(35) NOT NULL,
`date_added` timestamp NULL DEFAULT NULL,
`profile_image` varchar(350) NOT NULL,
`banner` varchar(350) NOT NULL,
`sDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`eDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`address_line1` mediumtext,
`address_line2` mediumtext,
`pincode` int(7) NOT NULL,
`state` varchar(30) NOT NULL,
`city` text NOT NULL,
`locality` varchar(50) NOT NULL,
`full_address` varchar(350) NOT NULL,
`latitude` varchar(15) NOT NULL,
`longitude` varchar(15) NOT NULL,
`start_time` time NOT NULL,
`end_time` time NOT NULL,
`description` longtext CHARACTER SET utf16 NOT NULL,
`website` varchar(50) DEFAULT NULL,
`fb_page` varchar(200) DEFAULT NULL,
`fb_event_page` varchar(200) DEFAULT NULL,
`event_hashtag` varchar(30) DEFAULT NULL,
`source_name` varchar(30) NOT NULL,
`source_url` varchar(350) NOT NULL,
`email_id_organizer` varchar(100) NOT NULL,
`ticket_url` mediumtext NOT NULL,
PRIMARY KEY (`id`),
KEY `full_address` (`full_address`),
KEY `full_address_2` (`full_address`),
KEY `id` (`id`),
KEY `event_name` (`event_name`),
KEY `sDate` (`sDate`),
KEY `eDate` (`eDate`),
KEY `id_2` (`id`),
KEY `country` (`country`),
KEY `event_name_2` (`event_name`),
KEY `sDate_2` (`sDate`),
KEY `eDate_2` (`eDate`),
KEY `state` (`state`),
KEY `locality` (`locality`),
KEY `start_time` (`start_time`),
KEY `start_time_2` (`start_time`),
KEY `end_time` (`end_time`),
KEY `id_3` (`id`),
KEY `id_4` (`id`),
KEY `event_name_3` (`event_name`),
KEY `md5` (`md5`),
KEY `sDate_3` (`sDate`),
KEY `eDate_3` (`eDate`),
KEY `latitude` (`latitude`),
KEY `longitude` (`longitude`),
KEY `start_time_3` (`start_time`),
KEY `end_time_2` (`end_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=4182 ;
Regardless of this particular SQL error, which is very likely caused by a data mismatch, I strongly suggest avoiding the intermediate CSV export and instead adding the scrapy-mysql-pipeline: it exports your scraped items directly into a MySQL table, and from there you can easily move the data to other tables or process it further.
If you can't use the pipeline, or you want something more customizable, have a look at this answer here on Stack Overflow; you'll find useful information on how to write your own custom MySQL pipeline.
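For illustration, a minimal sketch of what such a custom pipeline could look like, reusing the connection details from the question and shortening the column list to two fields for brevity:
import pymysql

class MySQLStorePipeline:
    def open_spider(self, spider):
        self.connection = pymysql.connect(
            host="localhost", user="root", passwd=".....", db="city_details")
        self.cursor = self.connection.cursor()

    def process_item(self, item, spider):
        # pymysql uses %s as the placeholder for every parameter,
        # whatever the SQL type of the column is (never %d).
        self.cursor.execute(
            "INSERT IGNORE INTO articles2 (country, event_name) VALUES (%s, %s)",
            (item["country"], item["event_name"]))
        return item

    def close_spider(self, spider):
        self.connection.commit()
        self.connection.close()
The pipeline is then enabled through the ITEM_PIPELINES setting in the project's settings.py.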

How to get latest bitcoin price in MySQL using Python?

I'm trying to get the latest bitcoin price and save it in my database. I keep getting the error NameError: name 'price_usd' is not defined when I execute my Python script:
getdata.py
import requests
import urllib
import json
import pymysql
con = pymysql.connect(host = 'localhost',user = 'dbuser',passwd = 'dbpass',db = 'bitcoinprice')
cursor = con.cursor()
url = 'example.com'
response = urllib.urlopen(url).read()
print(response)
json_obj = str(response)
cursor.execute("INSERT INTO bitcoinprice (list_price_usd) VALUES (%s)", (price_usd))
con.commit()
con.close()
print (json_obj)
Returned JSON from API
[
    {
        "id": "bitcoin",
        "name": "Bitcoin",
        "symbol": "BTC",
        "rank": "1",
        "price_usd": "11117.3",
        "price_btc": "1.0",
        "24h_volume_usd": "9729550000.0",
        "market_cap_usd": "187080534738",
        "available_supply": "16827875.0",
        "total_supply": "16827875.0",
        "max_supply": "21000000.0",
        "percent_change_1h": "0.09",
        "percent_change_24h": "-0.9",
        "percent_change_7d": "-4.32",
        "last_updated": "1516991668"
    }
]
Schema
CREATE TABLE `bitcoinprice` (
`list_id` varchar(7) CHARACTER SET utf8 DEFAULT NULL,
`list_name` varchar(7) CHARACTER SET utf8 DEFAULT NULL,
`list_symbol` varchar(3) CHARACTER SET utf8 DEFAULT NULL,
`list_rank` int(11) DEFAULT NULL,
`list_price_usd` decimal(7,6) DEFAULT NULL,
`list_price_btc` decimal(9,8) DEFAULT NULL,
`list_24h_volume_usd` decimal(10,1) DEFAULT NULL,
`list_market_cap_usd` decimal(12,1) DEFAULT NULL,
`list_available_supply` decimal(12,1) DEFAULT NULL,
`list_total_supply` bigint(20) DEFAULT NULL,
`list_max_supply` int(11) DEFAULT NULL,
`list_percent_change_1h` decimal(2,1) DEFAULT NULL,
`list_percent_change_24h` decimal(3,2) DEFAULT NULL,
`list_percent_change_7d` decimal(3,1) DEFAULT NULL,
`list_last_updated` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Assuming your "Returned JSON from API" above is correct:
Replace
cursor.execute("INSERT INTO bitcoinprice (list_price_usd) VALUES (%s)", (price_usd))
With
cursor.execute("INSERT INTO bitcoinprice (list_price_usd) VALUES (%s)",
(json.loads(json_obj)[0]['price_usd']))
For whatever reason, you imported the json module (the solution to your problem) without actually using it.
json.loads converts a JSON string into a Python object, which in your case is a list containing one value: a dict with the data you want. [0] gets the dictionary out of the list, and ['price_usd'] gets the value you were expecting to find in a variable named price_usd.
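Since the script also imports requests without using it, the whole fetch-and-insert flow could be written more directly; a sketch, keeping the question's placeholder URL and credentials:
import requests
import pymysql

con = pymysql.connect(host='localhost', user='dbuser',
                      passwd='dbpass', db='bitcoinprice')
cursor = con.cursor()

# requests parses the JSON body for us; the API returns a list
# containing a single dict.
data = requests.get('https://example.com').json()
price_usd = data[0]['price_usd']

cursor.execute("INSERT INTO bitcoinprice (list_price_usd) VALUES (%s)",
               (price_usd,))
con.commit()
con.close()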

Can you use a where constraint on an inner joined table to limit results?

SOLVED
sql = "SELECT Title, Year, TmdbId, MovieFileId, Quality " \
"FROM Movies " \
"INNER JOIN MovieFiles on MovieFiles.Id = Movies.MovieFileId " \
"WHERE Quality = 7"
sql = "SELECT Title, Year, TmdbId, MovieFileId, MovieFiles.Quality " \
"FROM Movies " \
"INNER JOIN MovieFiles on MovieFiles.Id = Movies.MovieFileId " \
"WHERE MovieFiles.Quality = 7"
Neither of these works for me; I get no rows returned. Quality is part of the MovieFiles table, and yes, rows exist that meet these criteria.
sql = "SELECT Title, Year, TmdbId, MovieFileId, Quality " \
"FROM Movies " \
"INNER JOIN MovieFiles on MovieFiles.Id = Movies.MovieFileId "
This works just fine and returns the correct quality. But if I add the WHERE clause, it does not work and just returns nothing. Am I doing something wrong? (Using Python.)
Edit: Schema request
CREATE TABLE "Movies" ("Id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT, "ImdbId" TEXT, "Title" TEXT NOT NULL, "TitleSlug" TEXT NOT NULL, "SortTitle" TEXT, "CleanTitle" TEXT NOT NULL, "Status" INTEGER NOT NULL, "Overview" TEXT, "Images" TEXT NOT NULL, "Path" TEXT NOT NULL, "Monitored" INTEGER NOT NULL, "ProfileId" INTEGER NOT NULL, "LastInfoSync" DATETIME, "LastDiskSync" DATETIME, "Runtime" INTEGER NOT NULL, "InCinemas" DATETIME, "Year" INTEGER, "Added" DATETIME, "Actors" TEXT, "Ratings" TEXT, "Genres" TEXT, "Tags" TEXT, "Certification" TEXT, "AddOptions" TEXT, "MovieFileId" INTEGER NOT NULL, "TmdbId" INTEGER NOT NULL, "Website" TEXT, "PhysicalRelease" DATETIME, "YouTubeTrailerId" TEXT, "Studio" TEXT, "MinimumAvailability" INTEGER NOT NULL, "HasPreDBEntry" INTEGER NOT NULL, "PathState" INTEGER NOT NULL, "PhysicalReleaseNote" TEXT, "SecondaryYear" INTEGER, "SecondaryYearSourceId" INTEGER)
CREATE TABLE "MovieFiles" ("Id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT, "MovieId" INTEGER NOT NULL, "Path" TEXT, "Quality" TEXT NOT NULL, "Size" INTEGER NOT NULL, "DateAdded" DATETIME NOT NULL, "SceneName" TEXT, "MediaInfo" TEXT, "ReleaseGroup" TEXT, "RelativePath" TEXT, "Edition" TEXT)

Select into subdict from multiple tables

I have a database structure like this:
CREATE TABLE person (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
age INTEGER NOT NULL,
hometown_id INTEGER REFERENCES town(id)
);
CREATE TABLE town (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
population INTEGER NOT NULL
);
And I want to get the following result when selecting:
{
    "name": "<person.name>",
    "age": "<person.age>",
    "hometown": {
        "name": "<town.name>",
        "population": "<town.population>"
    }
}
I'm already using psycopg2.extras.DictCursor, so I think I need to play with SQL's SELECT ... AS.
Here's an example of what I tried with no result; I've tried many similar queries with minor adjustments, all of them raising different errors:
SELECT
person(name, age),
town(name, population) as town,
FROM person
JOIN town ON town.id = person.hometown_id
Any way to do this, or should I just select all columns individually and build the dict inside of Python?
Postgres version info:
psql (9.4.6, server 9.5.2)
WARNING: psql major version 9.4, server major version 9.5.
Some psql features might not work.
Something like this?
t=# with t as (
select to_json(town),* from town
)
select json_build_object('name',p.name,'age',age,'hometown',to_json) "NameItAsYou Wish"
from person p
join t on t.id=p.hometown_id
;
NameItAsYou Wish
--------------------------------------------------------------------------------
{"name" : "a", "age" : 23, "hometown" : {"id":1,"name":"tn","population":100}}
(1 row)
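For completeness, a sketch of running an equivalent query from Python with psycopg2's DictCursor (connection parameters are placeholders, and the CTE's json column is aliased for readability):
import psycopg2
import psycopg2.extras

conn = psycopg2.connect(dbname='mydb', user='me')  # hypothetical credentials
cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)

cur.execute("""
    WITH t AS (
        SELECT id, to_json(town) AS hometown FROM town
    )
    SELECT json_build_object(
               'name', p.name,
               'age', p.age,
               'hometown', t.hometown
           ) AS person_json
    FROM person p
    JOIN t ON t.id = p.hometown_id
""")

for row in cur:
    # psycopg2 decodes json columns into Python objects automatically,
    # so each value is already a nested dict.
    print(row['person_json'])

conn.close()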
