Loading a CSV file into Spyder (Python 3.6)

I am trying to load a CSV file of data into Spyder and I just can't figure it out. My code below raises a ValueError stating "could not convert string to float:"
My code:
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('magnet lab.csv',delimiter=',',skiprows=2)
kimberlite = np.array(data[:,0])
forcekimberlite = np.array(data[:,1])
plt.scatter(kimberlite,forcekimberlite,s=5,c='red',marker='o')
plt.xlim(0,5)
plt.xlabel('distance from center of magnet to kimberlite')
plt.ylabel('Force')
plt.title('Kimberlite results')
plt.show()

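Try this for loading the CSV file. Unlike np.loadtxt, np.genfromtxt fills fields it cannot convert to a number with nan instead of raising an error: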
data = np.genfromtxt('magnet lab.csv',delimiter=',')
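If the error comes from header lines or a stray non-numeric cell (an assumption, since the file itself is not shown), a minimal sketch along these lines may help. The helper name load_numeric_csv is made up for illustration; skip_header is the genfromtxt counterpart of the skiprows argument used in the question.

```python
import numpy as np

def load_numeric_csv(source, header_rows=0):
    """Load a comma-separated numeric table and drop rows that contain nan.

    `source` can be a file name or a list of text lines.
    `header_rows` is the number of leading lines to skip.
    """
    # Fields that cannot be parsed as numbers become nan instead of raising.
    data = np.genfromtxt(source, delimiter=",", skip_header=header_rows)
    data = np.atleast_2d(data)
    # Keep only the rows in which every column was parsed successfully.
    complete = ~np.isnan(data).any(axis=1)
    return data[complete]
```

For the file in the question this would be called as load_numeric_csv('magnet lab.csv', header_rows=2), after which data[:,0] and data[:,1] give the two columns to plot.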

Related

Loading a parquet file from a GitHub repository

I tried to read a parquet (.parq) file I have stored in a GitHub project, using the following script:
import pandas as pd
import numpy as np
import ipywidgets as widgets
import datetime
from ipywidgets import interactive
from IPython.display import display, Javascript
import warnings
warnings.filterwarnings('ignore')
parquet_file = r'https://github.com/smaanan/sev.en_commodities/blob/main/random_deals.parq'
df = pd.read_parquet(parquet_file, engine='auto')
and it gave me this error:
ArrowInvalid: Could not open Parquet input source '': Parquet
magic bytes not found in footer. Either the file is corrupted or this
is not a parquet file.
Does anyone know what this error message means and how I can load the file in my GitHub repository? Thank you in advance.
You should use the URL under the domain raw.githubusercontent.com.
As for your example:
parquet_file = 'https://raw.githubusercontent.com/smaanan/sev.en_commodities/main/random_deals.parq'
df = pd.read_parquet(parquet_file, engine='auto')
You can read parquet files directly from a web URL like this. However, when reading a data file from a GitHub repository you need to use the raw file URL, because the regular blob URL returns an HTML page rather than the file itself:
url = 'https://github.com/smaanan/sev.en_commodities/blob/main/random_deals.parq?raw=true'
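The rewrite from a blob URL to a raw.githubusercontent.com URL is mechanical, so it can be sketched as a small helper. The function name github_blob_to_raw is hypothetical, and the sketch assumes the usual owner/repo/blob/branch/path layout with a branch name that contains no slash.

```python
from urllib.parse import urlsplit

def github_blob_to_raw(url):
    """Turn a github.com 'blob' URL into its raw.githubusercontent.com form.

    URLs that do not look like github.com/<owner>/<repo>/blob/<branch>/<path>
    are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url
    # Drop the "blob" segment and keep owner, repo, branch and file path.
    owner, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])
```

The result can then be passed straight to pd.read_parquet, as in the answers above.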

Extracting data from table and saving it. Layout parser package

I am new to Python and I am using Melissa Dell's layoutparser package to extract data from an image of a table.
And my code, for now, is the following one:
pip install layoutparser[ocr]
import layoutparser as lp
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import numpy as np
import cv2
from google.cloud.vision_v1 import types
import json
import re
from google.cloud import vision
pip show google-cloud-vision
ocr_agent = lp.GCVAgent.with_credential('mycredebtials.json',
                                        languages = ['es'])
img = plt.imread(r'D:\pdfDispacher.do_Página_2.jpg', cv2.IMREAD_COLOR)
print(img)
plt.imshow(img)
res = ocr_agent.detect(img, return_response=True)
texts = ocr_agent.gather_text_annotations(res)
layout = ocr_agent.gather_full_text_annotation(res, agg_level=lp.GCVFeatureType.WORD)
lp.draw_box(img, layout)
lp.draw_text(img, layout, font_size=12, with_box_on_text=True,
             text_box_width=1)
What I need is to get all the columns and rows out of the OCR result and save them in CSV format, but I have not been able to get this done.
I would really appreciate it if anyone could help me with the next lines.
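One possible direction, sketched without any layoutparser calls: assume the text and the top-left x, y pixel coordinates of each detected word have already been pulled out of the layout into plain (text, x, y) tuples (that extraction step is not shown here). Words whose y values are close together are treated as one table row, and each row is sorted left to right. The function names and the row_tolerance value are made up for illustration, and each word becomes its own cell, so multi-word cells would still need merging.

```python
import csv

def words_to_rows(words, row_tolerance=10):
    """Group (text, x, y) tuples into rows of text, ordered top to bottom.

    Two words belong to the same row when their y coordinates differ by at
    most `row_tolerance` pixels from the first word of that row.
    """
    rows = []
    for text, x, y in sorted(words, key=lambda w: w[2]):
        if rows and abs(y - rows[-1]["y"]) <= row_tolerance:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    # Within each row, order the cells left to right by x coordinate.
    return [[text for _, text in sorted(row["cells"])] for row in rows]

def save_rows(rows, path):
    """Write the grouped rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
```

The tolerance has to be tuned to the line height of the scanned table; too small a value splits one row in two, too large a value merges neighbouring rows.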

Viewing images from pickle file in python

I am a newbie in machine learning. I have a dataset of images in .p (pickle) format.
How can I view the images inside the file? I searched the internet but didn't find any appropriate answers.
Please help me solve this issue.
Code I used:
import pandas as pd
import pickle
objects = []
with open("full_CNN_train.p", "rb") as openfile:
    while True:
        try:
            objects.append(pickle.load(openfile))
        except EOFError:
            break
print(objects)
My output when I tried pickle.load()
It's probably just a standard image matrix, so try matplotlib.pyplot.imshow():
from matplotlib import pyplot as plt
img = ... # pickle load or whatever
plt.imshow(img)
plt.show()
Just load the pickle file and use the imshow function to visualize it.
import pickle
import matplotlib.pyplot as plt
# Open the file in binary mode and close it again after loading
with open('pickled_image.pickle', 'rb') as pkl:
    im = pickle.load(pkl)
plt.imshow(im)
plt.show()
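A training file such as the one in the question often holds many images rather than one, in which case imshow needs a single image picked out first. The sketch below assumes the pickle contains a list or array of equally sized images; that is a guess, so it is worth printing the shape before plotting. The names load_image_stack and show_image are made up for this example.

```python
import pickle
import numpy as np

def load_image_stack(path):
    """Load a pickle and return its content as an array of images.

    Assumes the file holds one image or a list/array of same-sized images.
    Only unpickle files from a source you trust.
    """
    with open(path, "rb") as f:
        obj = pickle.load(f)
    stack = np.asarray(obj)
    if stack.ndim == 2:
        # A single grayscale image: add a leading axis so indexing works.
        stack = stack[np.newaxis, ...]
    return stack

def show_image(stack, index=0):
    """Display one image from the stack with matplotlib."""
    import matplotlib.pyplot as plt
    # squeeze drops a trailing channel axis of length 1, which imshow rejects.
    plt.imshow(np.squeeze(stack[index]))
    plt.show()
```

Printing load_image_stack("full_CNN_train.p").shape first shows how many images there are and their size. Note that a 3-D result is ambiguous: it could be a stack of grayscale images or one colour image, so check the shape before relying on the index.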

Can't save data from yfinance into a CSV file

I found a library that allows me to get data from Yahoo Finance very efficiently. It's a wonderful library.
The problem is, I can't save the data into a CSV file.
I've tried converting the data to a pandas DataFrame, but I think I'm doing it incorrectly and I'm getting a bunch of NaNs.
I tried using NumPy to save directly into a CSV file and that's not working either.
import yfinance as yf
import csv
import numpy as np
urls = [
    'voo',
    'msft'
]
for url in urls:
    tickerTag = yf.Ticker(url)
    print(tickerTag.actions)
    np.savetxt('DivGrabberTest.csv', tickerTag.actions, delimiter = '|')
I can print the data on console and it's fine. Please help me save it into a csv. Thank you!
If you want to store the ticker results for each url in a separate CSV file, you can do:
for url in urls:
    tickerTag = yf.Ticker(url)
    tickerTag.actions.to_csv("tickertag{}.csv".format(url))
If you want them all in the same CSV file, you can do:
import pandas as pd
# Collect the actions DataFrame of every ticker, then stack them
tickerlist = [yf.Ticker(url).actions for url in urls]
pd.concat(tickerlist).to_csv("tickersconcat.csv")
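One thing to watch with a plain concat is that the combined file no longer says which rows belong to which ticker. A minimal sketch of one way around that, assuming each actions result is an ordinary pandas DataFrame indexed by date: pass a dict to pd.concat so the ticker symbol becomes an extra index level. The helper name combine_actions is hypothetical.

```python
import pandas as pd

def combine_actions(actions_by_ticker):
    """Stack per-ticker DataFrames and label every row with its ticker.

    `actions_by_ticker` maps a ticker symbol to its DataFrame.
    """
    # The dict keys become the outer index level, named "ticker".
    combined = pd.concat(actions_by_ticker, names=["ticker"])
    # Turn the index levels into regular columns for a flat CSV layout.
    return combined.reset_index()
```

With yfinance this could be used as combine_actions({url: yf.Ticker(url).actions for url in urls}).to_csv("tickersconcat.csv", index=False).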

How to load a json file in jupyter notebook using pandas?

I am trying to load a json file in my jupyter notebook
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import json
%matplotlib inline
with open("pud.json") as datafile:
    data = json.load(datafile)
dataframe = pd.DataFrame(data)
I am getting the following error
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Please help
If you want to load a json file use pandas.read_json.
pandas.read_json("pud.json")
This will load the json as a dataframe.
The function usage is as shown below
pandas.read_json(path_or_buf=None, orient=None, typ='frame', dtype=True, convert_axes=True, convert_dates=True, keep_default_dates=True, numpy=False, precise_float=False, date_unit=None, encoding=None, lines=False, chunksize=None, compression='infer')
You can get more information about the parameters here
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html
Another way, using json:
import pandas as pd
import json
with open('File_location.json') as f:
    data = json.load(f)
df = pd.DataFrame(data)
with open('pud.json', 'r') as file:
    variable_name = json.load(file)
The JSON file will be loaded as a Python dictionary (when its top level is a JSON object).
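The code you have written is fine. The problem is that the .json file you are loading does not contain valid JSON: "Expecting value: line 1 column 1 (char 0)" means the parser failed at the very first character, which typically happens when the file is empty or holds something other than JSON. Please check the file.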
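To check the file quickly, a small sketch like the following reads it as text first and reports what it actually starts with when parsing fails. The helper name load_json_or_explain and the UTF-8 encoding are assumptions made for this example.

```python
import json

def load_json_or_explain(path):
    """Parse a JSON file, raising a descriptive ValueError if it is not JSON."""
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    if not text.strip():
        raise ValueError(f"{path} is empty, so there is no JSON to parse")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Show the first characters so the real content type is visible.
        preview = text[:80]
        raise ValueError(
            f"{path} is not valid JSON ({err}); it starts with: {preview!r}"
        ) from err
```

If the preview turns out to be an HTML tag or an error message, the file was probably saved from a web page or a failed download rather than from the actual data source.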
