Python: Search in an online CSV file

I am currently stuck with a Python project where I want to pull information out of an online CSV file and let the user search it via an input function. At the moment, I am able to get the information from the online CSV file via a link, but I cannot make the connection so that it searches for the exact word in that CSV file.
I have tried multiple tutorials, but most of them don't solve my issue. So, with a lot of pain, I am writing this message here, hoping someone can help me out.
The code I have so far is:
import csv
import urllib.request

metar_search = input('Enter the ICAO station\n')
url = 'https://www.aviationweather.gov/adds/dataserver_current/current/metars.cache.csv'
response = urllib.request.urlopen(url)
lines = [l.decode('utf-8') for l in response.readlines()]
cr = csv.reader(lines)
for row in cr:
    if metar_search == row[0]:
        print(row)
In the CSV file, the first field of each row is what I am looking for: the METAR information of an airport. So, I want the user to type an ICAO code (for example KJFK), and then I want to print the line of text with the weather information for that station (example: KJFK 051851Z 15010KT 10SM FEW017 FEW035 FEW250 27/19 A3006 RMK AO2 SLP177 T02670194).
When I currently type KJFK, it does not return any information.
The current code is probably a bit messy because I have tried several things; I also tried to turn it into a function, but without luck. What am I doing wrong?
I hope someone is able to help me out with this question.
Thank you so much in advance.

Try
...
for row in cr:
    if row[0].startswith(metar_search):
        print(row)
or
...
lines = [l.decode('utf-8') for l in response.readlines()[5:]]
cr = csv.reader(lines)
for row in cr:
    if metar_search == row[1]:
        print(row)
Hint: Take a closer look at the data.
If you know that there's only one result then you could stop searching after you found the row:
...
        print(row)
        break
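Putting both hints together — skipping the metadata lines and matching on the second column — here is a minimal self-contained sketch. The sample rows below are made up to mimic the cache file's layout (a few metadata lines, then a header, then data with raw_text first and station_id second); for real use, pass in the decoded lines from urllib instead:

```python
import csv

def find_metar(lines, station):
    """Search decoded CSV lines for the row whose station_id matches."""
    # The cache file starts with metadata lines before the real header
    # row, so skip them first (5 lines, per the answer above).
    cr = csv.reader(lines[5:])
    for row in cr:
        if len(row) > 1 and row[1] == station:
            return row  # each station appears once, so stop here
    return None

# Hypothetical sample mimicking the cache layout.
sample = [
    "No errors",
    "No warnings",
    "3 ms",
    "data source=metars",
    "2 results",
    "raw_text,station_id,observation_time",
    "KJFK 051851Z 15010KT 10SM FEW017 27/19 A3006,KJFK,2023-09-05T18:51:00Z",
    "KLGA 051851Z 16008KT 10SM SCT020 26/18 A3004,KLGA,2023-09-05T18:51:00Z",
]
print(find_metar(sample, "KJFK")[0])
```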

Related

How to use csv with python as an expert?

I'm just getting started with Python programming.
Here is an example of my CSV file:
Name,tag,description
Cool,"cool,fun",cool ...
Cell,"Cell,phone",Cell ...
Rang,"first,third",rang ...
Printing with the csv module gives me a list for each row, e.g.:
['cool', 'cool,fun', 'cool ...']
['cell', 'cell,phone', 'cell ...']
What I want to do is print just cool, or cell,phone.
I'm also new to programming, but I think I know what you're asking.
How to use CSV module in python
The answer for your question
What you asked, printing just cool or cell,phone, is easy to implement; you can try the code below in a terminal:
import csv

with open('your_file_path', 'r', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    rows = list(reader)

print(rows[1][0])
print(rows[2][1])
My thoughts
Actually, you should consider the following two points when understanding this problem:
Content: the content you want to print in your terminal. You first need to decide whether what you want is a specific row, a specific column, or a specific cell.
Format: the list you see, and those quotation marks, reflect the types of the data in the file, and they must be carefully distinguished.
In addition, it would be better for you to read some articles or materials about the csv module, such as the following:
https://docs.python.org/3/library/csv.html#reader-objects
https://www.geeksforgeeks.org/reading-rows-from-a-csv-file-in-python
https://www.tutorialspoint.com/working-with-csv-files-in-python-programming
I am also unskilled in many places, please forgive me if there are mistakes or omissions.
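To make the row/column/cell distinction concrete, here is a self-contained sketch using an in-memory copy of data shaped like the question's file (the exact cell values are assumed):

```python
import csv
import io

# Data shaped like the question's file: quoting keeps the embedded
# comma in the tag field inside a single cell.
data = ('Name,tag,description\n'
        'Cool,"cool,fun",cool ...\n'
        'Cell,"Cell,phone",Cell ...\n'
        'Rang,"first,third",rang ...\n')
rows = list(csv.reader(io.StringIO(data)))

print(rows[1][0])                    # one cell: Cool
print(rows[2][1])                    # one cell: Cell,phone
print([row[0] for row in rows[1:]])  # a whole column: ['Cool', 'Cell', 'Rang']
```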

KeyError: 'primaryName' with csv.DictReader

first time poster - long time lurker.
I'm a little rough around the edges when it comes to Python and I'm running into an issue that I imagine has an easy fix.
I have a CSV file that I'm looking to read and perform a semi-advanced lookup.
Inspecting it in VS Code, the CSV in question is not really comma-delimited, except for the last "column".
Example (direct format screenshot from the file):
screenshot
The code that seems to have issues is:
import csv
import sys
from util import Node, StackFrontier, QueueFrontier

names = {}
people = {}
titles = {}

def load_data(directory):
    with open(f"{directory}/file.csv", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            people[row["id"]] = {
                "primaryName": row["primaryName"],
                "birthYear": row["birthYear"],
                "titles": set()
            }
            if row["primaryName"].lower() not in names:
                names[row["primaryName"].lower()] = {row["nconst"]}
            else:
                names[row["primaryName"].lower()].add(row["nconst"])
The error I receive is:
File "C:\GitHub\Project\data-test.py", line 24, in load_data
"primaryName": row["primaryName"],
~~~^^^^^^^^^^^^^^^
KeyError: 'primaryName'
I've tried this with other CSV files where they are comma delimited, (screenshot example below):
screenshot
And that works perfectly fine. I noticed the above CSV file has the names in quotes (""), which I imagine could be part of the solution.
Ultimately, if I can get it to work with the code above, that would be great. Otherwise, is there an easy way to automatically reformat the CSV file to put quotation marks around the names and separate the values with commas, like the CSV that works above?
Thanks in advance for any help.
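For what it's worth, csv.DictReader accepts a delimiter argument, so if the file is actually tab-separated (an assumption based on the screenshot description; the field names below are taken from the question's row lookups and the sample values are invented), no reformatting is needed:

```python
import csv
import io

# Hypothetical tab-separated sample; field names assumed from the
# question's row["primaryName"] / row["nconst"] lookups.
tsv = ("nconst\tprimaryName\tbirthYear\n"
       "nm0000001\tFred Astaire\t1899\n"
       "nm0000002\tLauren Bacall\t1924\n")
reader = csv.DictReader(io.StringIO(tsv), delimiter="\t")
names = [row["primaryName"] for row in reader]
print(names)
```

With the right delimiter, the "primaryName" key exists and the KeyError goes away.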

Is there a way to quickly get rid of a lot of excess data with regex searches?

I'm trying to pull a few pieces of data for entry into a server. I've gotten the data from a web API, and it includes a lot of information that, to me, is garbage. I need to get rid of a ton of it, but I'm having trouble with where to start. The data I need runs up until "abilities", and then starts again at "name":"Contherious". And here's that link. Most of my data processing so far has been attempts at regex searches, and the only pattern I can think of is that the names I need, unlike the names I don't, have a space after them and lead directly to an "id". I'm just unclear how to grab each and every one of these names, and any help would be appreciated.
I've tried
DMG_DONE_FILE = "rawDmgDoneData.txt"
out = []
with open(DMG_DONE_FILE, 'r') as f:
    line = f.readline()
    while line:
        regex_id = search('^+"name":"\s"+(\w+)+"id":', line)
        if regex_id:
            out.append(regex_id.group(1))
        line = f.readline()
and I get errors because I generally don't know what I'm doing with regex searches
import sys
import json

# use urllib to fetch from the api
# example here for testing is reading from a local file
f = open('file.json', 'r')
data = f.read()
f.close()

entries = json.loads(data)
Now you have a data structure that you can easily address
e.g. entries['entries'][0]['name']
alternatively, using jq (https://stedolan.github.io/jq/):
cat file.json | jq '.entries[]| {name:.name,id:.id,type:.type,itemLevel:.itemLevel,icon:.icon,total:.total,activeTime:.activeTime,activeTimeReduced:.activeTimeReduced}'
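Once the JSON is parsed, the same field selection as the jq filter can be done in Python with a list comprehension. The sample document below is made up; the field names come from the jq line above:

```python
import json

# Made-up document with the same shape as entries['entries'][0]['name'].
data = json.loads(
    '{"entries": ['
    '{"name": "Contherious", "id": 7, "type": "Hunter", "total": 9500},'
    '{"name": "Other", "id": 8, "type": "Mage", "total": 8100}]}'
)
wanted = [{"name": e["name"], "id": e["id"], "total": e["total"]}
          for e in data["entries"]]
print(wanted)
```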

Python loop to create MYSQL insert into statements from data in the CSV

I need to write Python code that converts all the entries of a CSV into MySQL INSERT INTO statements through a loop. I have CSV files with about 6 million entries.
The code below can probably read a row. It has some syntax errors, though; I can't really pinpoint them, as I don't have a background in coding.
file = open('physician_data.csv','r')
for row in file:
    header_string = row
    header_list = list(header_string.split(','))
    number_of_columns = len(header_list)
    insert_into_query= INSERT INTO physician_data (%s)
    for i in range(number_of_columns):
        if i != number_of_columns-1:
            insert_into_query+="%s," %(header_list[i])
        else:
            # no comma after last column
            insert_into_query += "%s)" %(header_list[i])
    print(insert_into_query)
file.close
Can someone tell me how to make this work?
Please include error messages when you describe a problem (https://stackoverflow.com/help/mcve).
You may find the documentation for the CSV library quite helpful.
Use quotes where appropriate, e.g. insert_into_query = "INSERT..."
Call the close function like this: file.close()
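Putting those fixes together, here is a hedged sketch of building one parameterized statement from the header row. The column names and the in-memory sample are invented; with ~6 million rows, you would then hand the data rows to cursor.executemany, or skip Python entirely with MySQL's LOAD DATA INFILE:

```python
import csv
import io

# Invented sample standing in for physician_data.csv.
sample = io.StringIO("npi,name,state\n100,Smith,NY\n101,Jones,CA\n")
reader = csv.reader(sample)

header = next(reader)  # first row holds the column names
columns = ", ".join(header)
placeholders = ", ".join(["%s"] * len(header))
insert_into_query = "INSERT INTO physician_data ({}) VALUES ({})".format(
    columns, placeholders)
print(insert_into_query)

rows = list(reader)  # pass to cursor.executemany(insert_into_query, rows)
```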

Django get_or_create returning models.DoesNotExist while importing a CSV

I have spent quite some time trying to figure this out. I am simply trying to import a CSV file using Python's csv module and Django's get_or_create().
This is my simple code (built upon this code):
import csv
from .models import Person

def import_data():
    with open('/path/to/csv/people_list.csv') as f:
        reader = csv.reader(f)
        for row in reader:
            _, created = Person.objects.get_or_create(
                name=row[0],
                p_id=row[1],
                current_status=row[2],
            )
I get the following error when I run import_data() in the shell:
peoplelisting.models.DoesNotExist: Person matching query does not exist.
Yes, this particular Person does not exist, but isn't that the whole point of using get_or_create()? If it doesn't exist, create it?
After a lot of playing around, finally figured the issue was the following:
My CSV also contained a header row which I was not ignoring. I had thought I'd proceed piecemeal and ignore the header only after I got the CSV import working, but the header itself was creating the problem (thanks to this post, which (indirectly) helped a lot). The values in the header did not match the schema (max_length etc.), and that's what Person matching query does not exist was referring to. Ignoring the header made it work. I just wish the error message were more descriptive. I hope this saves someone else the hours I spent debugging a simple thing. Here's the correct code:
import csv
from .models import Person

def import_data():
    with open('/path/to/csv/people_list.csv') as f:
        reader = csv.reader(f)
        for row in reader:
            if row[0] != 'Person_name':  # where Person_name is the first column's name
                _, created = Person.objects.get_or_create(
                    name=row[0],
                    p_id=row[1],
                    current_status=row[2],
                )
Instead of having to check row[0] every time, you could just skip the first row:
next(reader, None) # skip the headers
source:
Skip the headers when editing a csv file using Python
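Here is a standalone sketch of the header-skipping variant. Person and get_or_create are Django pieces from the question, so this version just collects dicts from an invented in-memory sample:

```python
import csv
import io

# In-memory stand-in for people_list.csv, header included.
f = io.StringIO("Person_name,p_id,current_status\n"
                "Ada,1,active\n"
                "Grace,2,retired\n")
reader = csv.reader(f)
next(reader, None)  # skip the header row instead of testing row[0] each time

people = [{"name": row[0], "p_id": row[1], "current_status": row[2]}
          for row in reader]
print(people)
```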
