I fetched some tweets with the tweepy API and saved them in a txt file. Now I want to load them into a pandas DataFrame, for example the content of each tweet and maybe the date.
Any ideas how I can do it?
Btw, I'm really new to Python.
Thanks in advance
Depending on the format of your txt file, your approach may vary, but overall:
You want to open the file in Python, read it (probably line by line), and parse it into a pandas DataFrame.
For example, to read your document one line at a time:
    with open("testfile.txt", "r") as file:
        for line in file:
            # do something (like parsing) with the line
            pass
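As a rough sketch of the parsing step: the question does not show the file layout, so this assumes (purely for illustration) that each line holds a date, a tab, and then the tweet text. The function name `parse_tweet_lines` and the column names are made up for this example.

```python
def parse_tweet_lines(lines, sep="\t"):
    """Turn lines of the assumed form 'date<TAB>text' into a list of dicts."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split only at the first separator so tabs inside the tweet survive.
        date, _, text = line.partition(sep)
        records.append({"date": date, "text": text})
    return records


# Small in-memory demo; with a real file you would pass the open file object.
sample = [
    "2020-01-01 10:00:00\tHello world\n",
    "\n",
    "2020-01-02 09:30:00\tSecond tweet\n",
]
print(parse_tweet_lines(sample))
```

If pandas is installed, the resulting list of dicts can be handed straight to `pandas.DataFrame(records)`, which gives one column per key. If your file is laid out differently, only the splitting line needs to change.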
I have an html form where I am getting the user to select their bank and upload a CSV file of their transactions to handle financial data:
I can store the file in a variable named 'file' but can't find a way to open it with traditional methods:
e.g. this doesn't work
I know the file is valid in the Python code because I can open it with pandas, but pandas messes up the column headings because there is some preamble data in the file.
Here is the file:
I am trying to do this so I can search for a row number by string. I need to know which row 'Date' is on so I can pass that value as the skiprows argument to pandas and get a correct DataFrame. This is what I came up with so far:
But obviously I cannot open the file in the first place. Ideally my output would be 7. I can't just use a static value of 7 for skiprows, as the amount of preamble data before the table changes from file to file.
This may not be an optimal answer, but maybe it will work for you:
file_content = file.stream.read().decode("UTF8")
lines = file_content.split('\n')
Then you can look for the line starting with Date to figure out your skiprows value.
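A minimal sketch of that lookup, assuming the header line really does start with the text Date. The helper name `find_header_row` and the sample preamble are invented for this example; the real bank export will look different.

```python
def find_header_row(lines, marker="Date"):
    """Return the 0-based index of the first line that starts with marker."""
    for index, line in enumerate(lines):
        if line.strip().startswith(marker):
            return index
    raise ValueError(f"no line starts with {marker!r}")


# Hypothetical upload: seven preamble lines, then the real table.
file_content = "\n".join([
    "Bank export",
    "Account: 0000",
    "Currency: USD",
    "Branch: Example",
    "Opening balance: 100.00",
    "Closing balance: 80.00",
    "Generated: 2020-02-01",
    "Date,Description,Amount",
    "2020-01-01,Coffee,-20.00",
])

# splitlines() also copes with Windows-style line endings.
skip = find_header_row(file_content.splitlines())
print(skip)
```

With the decoded text already in memory, you should then be able to build the frame with `pd.read_csv(io.StringIO(file_content), skiprows=skip)`, so the Date line becomes the header row.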
I have multiple text files that contain data, like file1, file2, file3. These are just examples. I am wondering how to populate specific data from them into an Excel sheet, like this excel sheet
I am new to Python and to converting text to Excel with it, which is why I am finding it hard to approach.
Basically, what you need to do is parse the files and write a new file in the CSV format, which Excel can open.
file1 -> PythonScript.py -> excel.csv
File Parser Python Tutorial
The .csv file looks like this. You have a header and the data separated by commas.
excel.csv:
Name,Data
hibiscus_3,54k
hibiscus_7,67k
Rose_3,87MB
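Here is a small sketch of the parse-and-write step. The layout of the input text files is not shown in the question, so this assumes, only for illustration, that each useful line is a name and a value separated by whitespace. `parse_records` and `write_csv` are hypothetical helper names.

```python
import csv
import io


def parse_records(lines):
    """Keep lines of the assumed form 'name value'; ignore everything else."""
    records = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            records.append((parts[0], parts[1]))
    return records


def write_csv(records, out):
    """Write a header row plus one row per record to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Name", "Data"])
    writer.writerows(records)


# In-memory demo; the input lines are made up to match the excel.csv example.
demo_lines = ["hibiscus_3 54k\n", "hibiscus_7 67k\n", "Rose_3 87MB\n"]
buffer = io.StringIO()
write_csv(parse_records(demo_lines), buffer)
print(buffer.getvalue())
```

To produce a real file, pass `open("excel.csv", "w", newline="")` instead of the `StringIO` buffer, and loop over your text files to collect the records first.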
Hope I could help you.
Please help me extract important data from a .csv file using Python. I got the .csv file from 'citrine'.
I want to extract the element names and atomic percentages in the form "Al2.5B0.02C0.025Co14.7Cr16.0Mo3.0Ni57.48Ti5.0W1.25Zr0.03"
ORIGINAL
[{""element"":""Al"",""idealAtomicPercent"":{""value"":""5.4""}},{""element"":""B"",""idealAtomicPercent"":{""value"":""0.02""}},{""element"":""C"",""idealAtomicPercent"":{""value"":""0.13""}},{""element"":""Co"",""idealAtomicPercent"":{""value"":""7.5""}},{""element"":""Cr"",""idealAtomicPercent"":{""value"":""6.1""}},{""element"":""Mo"",""idealAtomicPercent"":{""value"":""2.0""}},{""element"":""Nb"",""idealAtomicPercent"":{""value"":""0.5""}},{""element"":""Ni"",""idealAtomicPercent"":{""value"":""61.0""}},{""element"":""Re"",""idealAtomicPercent"":{""value"":""0.5""}},{""element"":""Ta"",""idealAtomicPercent"":{""value"":""9.0""}},{""element"":""Ti"",""idealAtomicPercent"":{""value"":""1.0""}},{""element"":""W"",""idealAtomicPercent"":{""value"":""5.8""}},{""element"":""Zr"",""idealAtomicPercent"":{""value"":""0.13""}}]
Original CSV
Expected output
Without knowing the file structure, it is hard to tell.
Try to load the file using:
    import csv
    # newline="" is what the csv module documentation recommends for open()
    with open(file_path, newline="") as file:
        reader = csv.DictReader(...)
You will have to figure out the arguments for the function, which depend on the file.
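Going by the ORIGINAL cell shown in the question, the composition looks like a JSON list whose quotes were doubled by CSV escaping. A possible sketch, under that assumption: let the csv module undo the escaping, then parse the cell with `json`. The column names `name` and `composition` and the row label `alloy_1` are made up here, since the real header is not shown; the sample cell is a shortened copy of the one in the question.

```python
import csv
import io
import json

# Hypothetical two-column CSV. Inside a quoted CSV field, " is written as "".
SAMPLE_CSV = (
    'name,composition\n'
    'alloy_1,"[{""element"":""Al"",""idealAtomicPercent"":{""value"":""5.4""}},'
    '{""element"":""B"",""idealAtomicPercent"":{""value"":""0.02""}},'
    '{""element"":""C"",""idealAtomicPercent"":{""value"":""0.13""}}]"\n'
)


def composition_to_formula(cell):
    """Turn the JSON list stored in one CSV cell into a string like 'Al5.4B0.02'."""
    entries = json.loads(cell)
    return "".join(
        entry["element"] + entry["idealAtomicPercent"]["value"]
        for entry in entries
    )


def formulas_from_csv(text, column="composition"):
    """Read CSV text and convert the given column of every row."""
    reader = csv.DictReader(io.StringIO(text))
    return [composition_to_formula(row[column]) for row in reader]


print(formulas_from_csv(SAMPLE_CSV))
```

For the real file you would open it as in the answer above and pass the actual column name. Rows with a missing or differently shaped cell would need extra handling that this sketch leaves out.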
I'm trying to search a csv file with 150K+ rows using keywords stored in another csv file with several dozen rows. What's the best way to go about this? I've tried a few things but nothing has gotten me very far.
Current Code:
    import csv
    import pandas as pd
    data = pd.read_csv('mycsv.csv')
    for line in data:
        if 'Apple' in line:
            print(line)
This isn't what I want, it's just what I currently have. The for loop is my attempt at just getting output using one of the keywords from the smaller csv file. So far I'm either getting errors or there is no output.
Edit: I forgot to mention that the large csv file I'm trying to search from is from a web link, so I don't think with open is going to work
Supposing that your keywords are stored in a file named keys.csv and in each row of that file, there's only one keyword, like this:
Orange
Apple
...
then try this:
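    with open('mycsv.csv') as mycsv, open('keys.csv') as keys:
        # strip() removes the trailing newline; blank lines are skipped
        keywords = [key.strip() for key in keys if key.strip()]
        for line in mycsv:
            if any(keyword in line for keyword in keywords):
                print(line, end='')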
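Since the edit to the question says the large file comes from a web link, here is a sketch of the same idea that works on any iterable of text lines, plus a thin wrapper that streams from a URL with the standard library. The function names are invented for this example, and the URL is whatever link actually hosts your file.

```python
import io
import urllib.request


def load_keywords(lines):
    """Strip whitespace and newlines, and drop blank lines."""
    return [line.strip() for line in lines if line.strip()]


def matching_lines(lines, keywords):
    """Yield every line that contains at least one keyword as a substring."""
    for line in lines:
        if any(keyword in line for keyword in keywords):
            yield line.rstrip("\r\n")


def matching_lines_from_url(url, keywords, encoding="utf-8"):
    """Stream a remote text file line by line and filter it (encoding is assumed)."""
    with urllib.request.urlopen(url) as response:
        text = io.TextIOWrapper(response, encoding=encoding)
        yield from matching_lines(text, keywords)


# In-memory demo with made-up rows, so no network access is needed.
keys = load_keywords(["Orange\n", "Apple\n", "\n"])
rows = ["1,Apple pie,3.50\n", "2,Banana,0.25\n", "3,Orange juice,2.00\n"]
for hit in matching_lines(rows, keys):
    print(hit)
```

This is a plain substring match, so a keyword like Apple would also match Pineapple. With a few dozen keywords and 150K rows that is a few million cheap checks, which should be fine without any further optimisation.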
I used PyPDF2 to retrieve tabular data and put it into a text file. It comes out with each word or number on a new line. I want the "states" to be rows, the "years" to be column headers, and the numbers that follow to be put into the rows 10 at a time.
The PDF of the file has a good illustration of what I am trying to do. Does anyone have some good ideas as to how to reformat the text file to do so?
Provided here is the link to the pdf which I am taking data from.
file:///V:/Final%20Project/Data/DuckStampSales.pdf
My text file looks like this:
textfile