I have a CSV file that is imported from a URL and inserted into a database. However, the names are stored with quotes around them, and I'd like to remove those. The original format of the CSV file is:
"Apple Inc.",113.08,113.07
"Alphabet Inc.",777.61,777.30
"Microsoft Corporation",57.730,57.720
the code I currently have is as follows.
def csv_new(conn, cursor, filename):
    with open(filename, 'rt') as csv_file:
        csv_data = csv.reader(csv_file)
        for row in csv_data:
            if(not row[0][0].isdigit()):
                continue
            split = [int(x) for x in row[0].split('/')]
            row[0] = datetime.datetime(split[2], split[0],
                                       split[1]).date().isoformat()
            print(row);
            cursor.execute('INSERT INTO `trade_data`.`import_data`'
                           '(date, name, price) VALUES(%s, "%s", %s)',
                           row)
    conn.commit()
final database looks like this
Name | Price1| Price 2|
'Apple Inc.' 113.08 113.07
'Alphabet Inc.' 777.61 777.30
'Microsoft Corporation' 57.730 57.720
and I would like it to look like
Name | Price1| Price 2|
Apple Inc. 113.08 113.07
Alphabet Inc. 777.61 777.30
Microsoft Corporation 57.730 57.720
I tried using for row in csv.reader(new_data.splitlines(), delimiter=', skipinitialspace=True): but it threw errors
csv.reader removes the quotes properly. You may be viewing a quoted string representation of the text instead of the actual text.
>>> new_data = '''"Apple Inc.",113.08,113.07
... "Alphabet Inc.",777.61,777.30
... "Microsoft Corporation",57.730,57.720'''
>>>
>>> import csv
>>>
>>> for row in csv.reader(new_data.splitlines()):
...     print(','.join(row))
...
Apple Inc.,113.08,113.07
Alphabet Inc.,777.61,777.30
Microsoft Corporation,57.730,57.720
>>>
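The point about a "quoted string representation" can be checked directly. The following is a minimal sketch, assuming a single sample line from the question; the variable names are only illustrative. Printing the whole row shows the list's repr, which adds quotes around each string, while the string itself contains none.

```python
import csv

# One sample line in the same format as the question's CSV.
line = '"Apple Inc.",113.08,113.07'

# csv.reader accepts any iterable of lines; take the first parsed row.
row = next(csv.reader([line]))
name = row[0]

# Printing the list shows its repr, which wraps each string in quotes.
print(row)

# Printing the string itself shows the actual text, with no quotes.
print(name)

# The parsed value does not start with a quote character.
has_quotes = name.startswith('"') or name.startswith("'")
print(has_quotes)
```

If `print(row)` shows quotes but `print(row[0])` does not, the quotes are coming from how the value is displayed or inserted, not from the CSV parsing.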
Figured it out. As tdelaney mentioned, the quotes were not actually in the string returned by csv.reader. They came from the quoted placeholder in my query:
cursor.execute('INSERT INTO `trade_data`.`import_data`'
               '(date, name, price) VALUES(%s, "%s", %s)',
               row)
The driver quotes string parameters itself, so the placeholder must not be quoted. Changing "%s" to %s fixed the problem and removed the extra quotes.
I have a text file (applications.txt) containing n number of lines of data (with 2 delimiters) in the below format.
1) mytvScreen|mytvScreen|Mi TV,Mí TV, My Tv, TV
2) watchNextScreen|watchNextScreen|Seguir viendo,Mi TV Seguir viendo
3) recordingsScreen|recordingsScreen|Grabaciones,Mis Grabaciones,Mi TV
Note: 1,2,3 are just line numbers for reference. Original file doesn't contain the number.
I am trying to write a function that would read each line and convert it into a dictionary using the value before the first delimiter and the values after the second delimiter, like the example shown below.
eg:
The below line one should be converted into dictionary as expected below.
1) mytvScreen|mytvScreen|Mi TV,Mí TV, My Tv, TV
Expected Format:
mytvScreen : Mi TV, Mí TV, My Tv, TV
Also, given any one of the comma-separated values, it should return the value before the colon.
Eg:
When the value Mi TV is given, it should return mytvScreen. The same goes for the other comma-separated values on that line: each of them should return mytvScreen.
I was able to read the file and print the values as expected.
But I am not sure how to convert each line into a dictionary.
with open('applications.txt') as f:
    for line in f:
        details=line.split("|",2)
        print (details[0] + ' : '+ details[2])
Your Help is highly appreciated.
You should add the items to a dictionary. One approach is given below; it also strips the trailing newline from each line so that it does not end up in the stored values.
s_dict = dict()
with open('applications.txt') as f:
    for line in f:
        details = line.rstrip('\n').split("|", 2)
        print (details[0] + ' : ' + details[2])
        s_dict[details[0]] = details[2]
print (s_dict)
I build a file-object to mimic applications.txt content:
>>> import io
>>> f = io.StringIO("""mytvScreen|mytvScreen|Mi TV,Mí TV, My Tv, TV
... watchNextScreen|watchNextScreen|Seguir viendo,Mi TV Seguir viendo
... recordingsScreen|recordingsScreen|Grabaciones,Mis Grabaciones,Mi TV""")
Your file is obviously a csv file, thus you should use the csv module to parse it:
>>> import csv
>>> reader = csv.reader(f, delimiter='|')
The last part is just a dict-comprehension:
>>> {row[0]: row[2] for row in reader}
{'mytvScreen': 'Mi TV,Mí TV, My Tv, TV', 'watchNextScreen': 'Seguir viendo,Mi TV Seguir viendo', 'recordingsScreen': 'Grabaciones,Mis Grabaciones,Mi TV'}
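The question also asks for the reverse lookup (give a label such as Mi TV, get the screen name back). A possible sketch, assuming the same three sample lines and a hypothetical dictionary name `screen_by_label`: split the third field on commas and map each label back to its screen. Since "Mi TV" appears on more than one line, this sketch keeps the first screen seen for a label; that choice is an assumption, not something the question specifies.

```python
import csv
import io

# Stand-in for applications.txt, using the sample lines from the question.
sample = io.StringIO(
    "mytvScreen|mytvScreen|Mi TV,Mí TV, My Tv, TV\n"
    "watchNextScreen|watchNextScreen|Seguir viendo,Mi TV Seguir viendo\n"
    "recordingsScreen|recordingsScreen|Grabaciones,Mis Grabaciones,Mi TV\n"
)

screen_by_label = {}
for row in csv.reader(sample, delimiter="|"):
    screen, labels = row[0], row[2]
    for label in labels.split(","):
        # strip() removes the space after a comma, as in ", My Tv".
        # setdefault keeps the first screen seen for a repeated label.
        screen_by_label.setdefault(label.strip(), screen)

print(screen_by_label["Mi TV"])
print(screen_by_label["Grabaciones"])
```

With a real file, `open('applications.txt')` would replace the `io.StringIO` object.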
This is my code
import pymysql
import csv
conn=pymysql.connect("localhost", "root","****","finance")
cursor=conn.cursor()
print "done"
csv_data = csv.reader(file('naver_loan_output.csv'))
for row in csv_data:
    cursor.execute('INSERT INTO 'daily_new' (date,cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond )' 'VALUES("%s", "%s", "%s", "%s", "%s", "%s")',row)
conn.commit()
cursor.close()
print "Done"
And this is the error:
File "D:\dropbox\Dropbox\lab\7_0218\insert_daily_new.py", line 13
cursor.execute('INSERT INTO 'daily_new' (date,cust_bal, cust_credit,
fund_stock, fund_hyb, fund_bond )' 'VALUES("%s", "%s", "%s", "%s",
"%s", "%s")',row)
^ SyntaxError: invalid syntax [Finished in 0.104s]
I tried a lot, but I'm not sure about the proper SQL insert query syntax. How do I get columns from csv? There are 6 columns in my csv file.
With this updated example, code works:
import pymysql
import csv
csv_data= csv.reader(file('naver_loan_output.csv'))
conn=pymysql.connect("localhost","finance")
cursor=conn.cursor()
print "Done"
for row in csv_data:
    #cursor.execute('INSERT INTO \'daily_new\' (date, cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond ) VALUES({}, {}, {}, {}, {}, {})'.format(row[0], row[1], row[2], row[3], row[4], row[5]))
    sql="INSERT INTO daily_n (date,cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond ) VALUES('2017-01-01','2','2','2','2','2')"
    cursor.execute(sql)
conn.commit()
cursor.close()
So, I think the for row or %s is the problem.
Mainly, your quotes are the issue.
You need to escape the single quotes if the larger SQL string is wrapped in single quotes, or simply wrap the larger string in double quotes as in your updated example. Note that the SyntaxError is a Python compile error, not a MySQL runtime exception.
For parameterized queries, do not quote the placeholder, %s.
MySQL (like practically all RDBMSs) does not use single quotes to enclose table and column name references, as you do with 'daily_new'. Use backticks for names in MySQL. A few databases and the ANSI standard allow double quotes for object identifiers (not string literals).
Consider the following adjustment, which opens the CSV file in a with block so that it is closed automatically, instead of leaving the handle from file() open. And as shown with parameterization, you can prepare the SQL statement once and then just bind the values iteratively in the loop.
strSQL = "INSERT INTO `daily_new` (`date`, cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond )" + \
         " VALUES(%s, %s, %s, %s, %s, %s)"
with open('naver_loan_output.csv', 'r') as f:
    csv_data = csv.reader(f)
    for row in csv_data:
        cursor.execute(strSQL, row)
conn.commit()
cursor.close()
conn.close()
print "Done"
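To try the unquoted-placeholder pattern without a MySQL server, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in. This is an assumption made purely so the example runs anywhere: sqlite3 uses ? as its placeholder where pymysql uses %s, and the two CSV rows and TEXT column types are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical two-row stand-in for naver_loan_output.csv.
csv_text = "2017-01-01,2,2,2,2,2\n2017-01-02,3,3,3,3,3\n"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_new (date TEXT, cust_bal TEXT, cust_credit TEXT,"
    " fund_stock TEXT, fund_hyb TEXT, fund_bond TEXT)"
)

# The placeholders are not quoted; the driver handles quoting of values.
# sqlite3 spells the placeholder ?, pymysql spells it %s.
sql = "INSERT INTO daily_new VALUES (?, ?, ?, ?, ?, ?)"

rows = list(csv.reader(io.StringIO(csv_text)))
conn.executemany(sql, rows)  # binds each CSV row to the same statement
conn.commit()

stored = conn.execute(
    "SELECT date, cust_bal FROM daily_new ORDER BY date"
).fetchall()
print(stored)
conn.close()
```

The stored values come back without any extra quote characters, which is the behavior to expect from pymysql as well once the placeholders are left unquoted.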
I think that you can try something like this, building the VALUES list from the row inside your loop:
query = "INSERT INTO daily_new (date, cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond) VALUES ('" + "', '".join(row) + "')"
cursor.execute(query)
Just see this code fragment from CSV File Reading and Writing doc:
>>> import csv
>>> with open('eggs.csv', 'rb') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
...     for row in spamreader:
...         print ', '.join(row)
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam
I hope this is useful, or at least points you in the right direction.
There are three problems tripping you up:
String escape characters
The first problem in your code is that the single quote before daily_new ends the string, so Python tries to read daily_new as code rather than as part of the string.
To solve this, put the escape character "\" directly before each single quote you want inside the string, like this:
cursor.execute('INSERT INTO \'daily_new\' (date,cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond ) VALUES("%s", "%s", "%s", "%s", "%s", "%s")',row)
Column access
The csv module returns each row as a list. To access an element in a list (a row, in your case), use bracket notation. For example, row[0] accesses the first element (column) in a row, and row[5] accesses the 6th column.
String substitution
The third problem you are facing is how to pass the values into the string substitution correctly. While there are many ways to do this, an easy one is the format() method. For example: if I wanted to build a string that says "Count is 1", I could run "Count is {}".format(1).
In your case you want to add 6 values to the string, so you add a pair of {} wherever you want a value substituted into a string and add another parameter to the format() function.
Putting it all together
So, to correct your loop code you would want something like this:
csv_data = csv.reader(file('naver_loan_output.csv'))
for row in csv_data:
    cursor.execute("INSERT INTO daily_new (date, cust_bal, cust_credit, fund_stock, fund_hyb, fund_bond ) VALUES ('{}', '{}', '{}', '{}', '{}', '{}')".format(row[0], row[1], row[2], row[3], row[4], row[5]))
Basically my problem is this: I have a CSV (Excel) file with info on South Park characters, and I have an HTML template. What I have to do is take the data by rows (stored in lists) for each character and, using the given HTML template, use that data to create 5 separate HTML pages named after the characters' last names.
Here is an image of the CSV file: i.imgur.com/rcIPW.png
This is what I have so far:
askfile = raw_input("What is the filename?")
southpark = []
filename = open(askfile, 'rU')
for row in filename:
    print row[0:105]
filename.close()
The above prints all the info to the IDLE shell in five rows, but I have to find a way to separate each row AND column and store it in a list (which I don't know how to do). It's pretty rudimentary code, I know. I'm trying to figure out a way to store the rows and columns first; then I will have to use a function (def) to assign the data to the HTML template and create an HTML file from that data and template. I'm still a noob. I tried searching the net, but I just don't understand the stuff.
I am not allowed to use any downloadable modules but I can use things built in Python like import csv or whatnot, but really its supposed to be written with a couple functions, list, strings, and loops..
Once I figure out how to separate the rows and columns and store them then I can work on implementing into HTML template and creating the file.
I'm not trying to have my HW done for me it's just that I pretty much suck at programming so any help is appreciated!
BTW I am using Python 2.7.2.
UPDATE:
Okay, thanks a lot! That helped me understand what each row was printing and what info is being read by the program. Now since I have to use functions in this program somehow this is what I was thinking.
Each row (0-6) prints out separate values, but just the print row function prints out one character and all his corresponding values which is what I need. What I want is to print out data like "print row" would but I have to store each of those 5 characters in a separate list.
Basically "print row" prints out all 5 characters with each of their corresponding attributes, how can I split each of them into 5 variables and store them as a list?
When I do print row[0] it only prints out the names, and print row[1] only prints the DOB. I was thinking of creating a function (def) that takes only what "print row" gives and splits it into 5 variables in a loop, and then another function that takes those variables/lists of data and combines them with the HTML template. At the end I have to figure out how to create HTML files in Python.
Sorry if I sound confusing; I'm just trying to make sense of it all. This is my code right now. It gives an error that there are too many values to unpack, but I am just fiddling around, trying different things to see if they work. Based on what I want to do above, I will probably have to delete most of this code and find a way to rewrite it with list methods like .append or string methods like .strip, which I am not very familiar with.
import csv
original = file('southpark.csv', 'rU')
reader = csv.reader(original)
# List of Data
name, dob, descript, phrase, personality, character, apparel = []
count = 0
def southparkinfo():
    for row in reader:
        count += 1
        if count == 0:
            row[0] = name
            print row[0] # Name (ex. Stan Marsh)
            print "----------------"
        elif count == 1:
            row[1] = dob
            print row[1] # DOB
            print "----------------"
        elif count == 2:
            row[2] = descript
            print row[2] # Descriptive saying (ex. Respect My Authoritah!)
            print "----------------"
        elif count == 3:
            row[3] = phrase
            print row[3] # Catch Phrase (ex. Mooom!)
            print "----------------"
        elif count == 4:
            row[4] = personality
            print row[4] # Personality (ex. Jewish)
            print "----------------"
        elif count == 5:
            row[5] = character
            print row[5] # Characteristic (ex. Politically incorrect)
            print "----------------"
        elif count == 6:
            row[6] = apparel
            print row[6] # Apparel (ex. red gloves)
    return
reader.close()
First and foremost, have a look at the CSV docs.
Once you understand the basics take a look at this code. This should get you started on the right path:
import csv
original = file('southpark.csv', 'rU')
reader = csv.reader(original)
for row in reader:
    #will print each row by itself (all columns from names up to what they wear)
    print row
    print "-----------------"
    #will print first column (character names only)
    print row[0]
You want to import csv module so you can work with the CSV filetype. Open the file in universal newline mode and read it with csv.reader. Then you can use a for loop to begin iterating through the rows depending on what you want. The first print row will print a single line of all a single character's data (ie: everything from their name up to their clothing type) like so:
['Stan Marsh', 'DOB: October 19th', 'Dude!', 'Aww #$%^!', 'Star Quarterback', 'Wendy', 'red gloves']
-----------------
['Kyle Broflovski', 'DOB: May 26th', 'Kick the baby!', 'You ***!', 'Jewish', 'Canadian', 'Ushanka']
-----------------
['Eric Theodore Cartman', 'DOB: July 1', 'Respect My Authroitah!', 'Mooom!', 'Big-boned', 'Politically incorrect', 'Knit-cap!']
-----------------
['Kenny McCormick', 'DOB: March 22', 'DOD: Every other week', 'Mmff Mmff', 'MMMFFF!!!', 'Mysterion!', 'Orange Parka']
-----------------
['Leopold Butters Stotch', 'DOB:Younger than the others!', 'The 4th friend', 'Professor chaos', 'stutter', 'innocent', 'nerdy']
-----------------
Finally, the second statement, print row[0], gives you the character names only. You can change the index to grab the other data as necessary. Remember that Python list indices start at 0, so in your case you can only go up to 6, because column A is index 0, B is 1, C is 2, and so on. To see these outputs more clearly, it's probably best to comment out one of the print statements so you get a clearer picture of what you are grabbing.
-----------------
Stan Marsh
-----------------
Kyle Broflovski
-----------------
Eric Theodore Cartman
-----------------
Kenny McCormick
-----------------
Leopold Butters Stotch
Note I threw in that print "-----------------" so you would be able to see the different outputs.
Hope this helps get you off to a start.
Edit To answer your second question: The easiest way (although probably not the best way) to grab all of a single character's info would be to do something like this:
import csv
original = file('southpark.csv', 'rU')
reader = csv.reader(original)
stan = reader.next()
kyle = reader.next()
eric = reader.next()
kenny = reader.next()
butters = reader.next()
print eric
which outputs:
['Eric Theodore Cartman', 'DOB: July 1', 'Respect My Authroitah!', 'Mooom!', 'Big-boned', 'Politically incorrect', 'Knit-cap!']
Take note that if your CSV is modified so that the order of the characters changes (e.g. Butters is moved to the top), these variables will hold another character's info.
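One way to avoid depending on row order is to keep every row in a list and also index the rows by character name. This is only a sketch, written in Python 3 syntax (so `print` is a function), with two rows copied from the output above standing in for southpark.csv; the names `characters` and `by_name` are illustrative.

```python
import csv
import io

# Two rows copied from the output above stand in for southpark.csv.
sample = io.StringIO(
    "Stan Marsh,DOB: October 19th,Dude!,Aww #$%^!,Star Quarterback,Wendy,red gloves\n"
    "Kyle Broflovski,DOB: May 26th,Kick the baby!,You ***!,Jewish,Canadian,Ushanka\n"
)

# Every row becomes a list of column values; collect them all in one list.
characters = [row for row in csv.reader(sample)]

# Index the same rows by the name in column 0, so order no longer matters.
by_name = {row[0]: row for row in characters}

print(len(characters))
print(by_name["Kyle Broflovski"][6])
```

Each entry in `by_name` is still the full list for that character, so the same bracket indexing (0 for the name up to 6 for the apparel) applies.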
Hi I want the following output from my query:
OK|Aborted_clients=119063 Aborted_connects=67591 Binlog_cache_disk_use=0
But I don't know how to generate it. This is my script:
#!/usr/bin/env python
import MySQLdb
conn = MySQLdb.connect (host = "...", user="...", passwd="...")
cursor = conn.cursor ()
cursor.execute ("SHOW GLOBAL STATUS")
rs = cursor.fetchall ()
#print rs
print "OK|"
for row in rs:
    print "%s=%s" % (row[0], row[1])
cursor.close()
this is what I get now:
OK|
Aborted_clients=119063
Aborted_connects=67591
Binlog_cache_disk_use=0
You can collect the rows and print them together in one string:
output = []
for row in rs:
    output.append('%s=%s' % (row[0], row[1]))
print 'OK|' + ' '.join(output)
Build the string using join:
print('OK|'+' '.join(['{0}={1}'.format(*row) for row in rs]))
' '.join(iterable) creates a string out of the strings in iterable, joined together with a space ' ' in between the strings.
To fix the code you posted with minimal changes, you could add a comma at the end of the print statements:
print "OK|",
for row in rs:
    print "%s=%s" % (row[0], row[1]),
This suppresses the automatic addition of a newline after each print statement.
It does, however, add a space (which is not what you said you wanted):
OK| Aborted_clients=0 ...
Join each pair with '=' and then each result with ' ', appended to 'OK|':
'OK|' + (' '.join(['='.join(r) for r in rs]))
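The join-based answers can be checked without a database. A small sketch, assuming `rs` holds (name, value) pairs of strings like the three rows shown in the question; the sample tuple here is a stand-in for what `cursor.fetchall()` returns.

```python
# Stand-in for cursor.fetchall(): (variable name, value) pairs as strings.
rs = (
    ("Aborted_clients", "119063"),
    ("Aborted_connects", "67591"),
    ("Binlog_cache_disk_use", "0"),
)

# Format each pair as name=value, join the pairs with single spaces,
# and put the OK| prefix directly in front with no space after it.
line = "OK|" + " ".join("{0}={1}".format(name, value) for name, value in rs)
print(line)
```

Because the prefix is concatenated rather than printed separately, there is no stray space or newline after `OK|`.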
Hello everyone i currently have this:
import feedparser
d = feedparser.parse('http://store.steampowered.com/feeds/news.xml')
for i in range(10):
    print d.entries[i].title
    print d.entries[i].date
How would I go about making it so that the title and date are on the same line? Also, it doesn't need to print; I just have that in there for testing. I would like to dump this output into a MySQL db with the title and date. Any help is greatly appreciated!
If you want to print on the same line, just add a comma:
print d.entries[i].title, # <- comma here
print d.entries[i].date
To insert to MySQL, you'd do something like this:
to_db = []
for i in range(10):
    to_db.append((d.entries[i].title, d.entries[i].date))
import MySQLdb
conn = MySQLdb.connect(host="localhost",user="me",passwd="pw",db="mydb")
c = conn.cursor()
c.executemany("INSERT INTO mytable (title, date) VALUES (%s, %s)", to_db)
conn.commit()
Regarding your actual question: if you want to join two strings with a comma you can use something like this:
print d.entries[i].title + ', ' + str(d.entries[i].date)
Note that I have converted the date to a string using str.
You can also use string formatting instead:
print '%s, %s' % (d.entries[i].title, str(d.entries[i].date))
Or in Python 2.6 or newer use str.format.
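For reference, the str.format variant might look like the sketch below. The title and date are made-up stand-ins for `d.entries[i].title` and `d.entries[i].date`; a `datetime.date` is used here only to show that format() converts non-string values for you.

```python
import datetime

# Hypothetical stand-ins for one feed entry's title and date.
title = "Example news item"
date = datetime.date(2012, 1, 31)

# format() converts each argument to text, so no explicit str() is needed.
line = "{0}, {1}".format(title, date)
print(line)
```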
But if you want to store this in a database it might be better to use two separate columns instead of combining both values into a single string. You might want to consider adjusting your schema to allow this.