How to implement the server URL in a Python program

I'm new to Python. I extracted some links from Twitter and want to use the Memento Aggregator (MemGator) to build a histogram. I've already used the Docker online playground to start MemGator. How do I write a program that feeds these links from a txt file to MemGator?
My current code is:
import requests

host = 'https://www.katacoda.com/courses/docker/playground'

# Open the link file for reading ('r'), not writing ('w').
with open('Extracted links.txt', 'r') as f:
    for line in f:
        url = line.strip()
        response = requests.get(host + "/timemap/json/" + url)
I used tweepy to extract some links. Here are some samples.
https://www.mytownneo.com/sports/20200215/no-comeback-this-time-barberton-crushes-tallmadge-boys-basketball
https://twitter.com/i/web/status/1228709479589404672
https://www.sctimes.com/story/sports/2020/02/14/albany-beats-no-1-sauk-centre-final-seconds/4768857002/
http://www.fiba.basketball/fiba-once-again-top-in-international-sports-federations-social-media-ranking-report-for-2019
https://twitter.com/i/web/status/1228709487600640000
How do I write a Python program that sends these links to MemGator running in the Docker container and processes them?
I use this playground to run MemGator (https://www.katacoda.com/courses/docker/playground). I tried to get MemGator's TimeMap to process these links, but I got stuck on passing the URLs to MemGator. MemGator is the tool that processes the links; here is the GitHub link: github.com/oduwsdl/memgator

You can use the wget package to download the file and then read it:
import wget
import pandas as pd

url = "your URL"
filename = wget.download(url)

# Read the downloaded CSV/text file
pdf = pd.read_csv(filename)
print("Shape of dataset:", pdf.shape)
print(pdf.head(5))
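If what you actually need is to query MemGator itself, here is a minimal sketch. It assumes MemGator is running and reachable on its default port 1208 (for example via docker run -p 1208:1208 oduwsdl/memgator) and that 'Extracted links.txt' holds one URL per line; the host address and the JSON layout are assumptions, so adjust them to whatever your playground exposes.
import requests

# Assumption: MemGator is reachable here; replace with the address your
# Docker playground exposes.
memgator_host = 'http://localhost:1208'

with open('Extracted links.txt', 'r') as f:
    for line in f:
        url = line.strip()
        if not url:
            continue
        # Ask MemGator for the TimeMap of this URL in JSON form.
        response = requests.get(memgator_host + '/timemap/json/' + url)
        if response.status_code == 200:
            timemap = response.json()
            mementos = timemap.get('mementos', {}).get('list', [])
            print(url, '->', len(mementos), 'mementos')
        else:
            print(url, '-> no TimeMap (HTTP', response.status_code, ')')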

Related

Download GitLab file with gitlab-python

I am trying to download a file or folder from my GitLab repository, but the only way I have seen to do it is using cURL on the command line. Is there any way to download files from the repository with just the python-gitlab API? I have read through the API docs and have not found anything, but other posts said it was possible; they just gave no solution.
You can do it like this:
import requests

response = requests.get('https://<your_path>/file.txt')
data = response.text
and then save the contents (data) as a file.
Otherwise, use the API:
f = project.files.get(file_path='<folder>/file.txt', ref='<branch or commit>')
and then decode it using:
import base64

content = base64.b64decode(f.content)
and then save content as a file.
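For completeness, a minimal end-to-end sketch of the python-gitlab route, assuming a private token for authentication; the server address, token, project path, file path and branch below are all placeholders.
import base64
import gitlab

# Placeholder connection details; substitute your own server and token.
gl = gitlab.Gitlab('https://gitlab.com', private_token='<your_token>')
project = gl.projects.get('<namespace>/<project>')

# Fetch the file object and decode its base64-encoded content.
f = project.files.get(file_path='<folder>/file.txt', ref='<branch or commit>')
content = base64.b64decode(f.content)

with open('file.txt', 'wb') as out:
    out.write(content)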

How to download Flickr images from photo-page URLs (which do not contain .jpg, .png, etc.) using Python

I want to download images from Flickr in Python using the following type of links:
https://www.flickr.com/photos/66176388@N00/2172469872/
https://www.flickr.com/photos/clairity/798067744/
This data is obtained from the XML file given at https://snap.stanford.edu/data/web-flickr.html.
Is there any Python script or way to download the images automatically?
Thanks.
I tried to find an answer from other sources and compiled it as follows:
import re
from urllib import request

def download(url, save_name):
    # Fetch the photo page and pull the large-size JPEG URL out of its HTML.
    html = request.urlopen(url).read()
    html = html.decode('utf-8')
    img_url = re.findall(r'https:[^" \\:]*_b\.jpg', html)[0]
    print(img_url)
    # Download the image itself and write it to disk.
    with open(save_name, "wb") as fp:
        fp.write(request.urlopen(img_url).read())

download('https://www.flickr.com/photos/clairity/798067744/sizes/l/', 'image.jpg')
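To apply this to several photo-page URLs at once, a small sketch that reuses the download() helper above; the /sizes/l/ suffix and the numbered output filenames are assumptions for illustration, not part of the original answer.
# Hypothetical list of photo-page URLs like those in the question.
photo_pages = [
    'https://www.flickr.com/photos/66176388@N00/2172469872/',
    'https://www.flickr.com/photos/clairity/798067744/',
]

for i, page in enumerate(photo_pages):
    # Point at the large-size page, as the answer above does.
    download(page.rstrip('/') + '/sizes/l/', 'image_{}.jpg'.format(i))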

Taking a list of MP4 URLs and downloading the files in Python

How would I go about navigating to a URL that's stored in a list and downloading the file? I'd preferably like to store each MP4 file under its clip title. I've used requests to retrieve the URLs.
Thanks
list_clips = ['https://clips.twitch.tv/SpeedySneakyHeronKappaClaus', 'https://clips.twitch.tv/SplendidGiantPuffinThunBeast', 'https://clips.twitch.tv/ArtsyAuspiciousHamburgerThisIsSparta', 'https://clips.twitch.tv/BoringNiceHerbsSaltBae']
You can use Python's requests module to download the files. Please refer to the code below:
import requests, os

for clips in list_clips:
    # Use the last path segment of the clip URL as the file name.
    clip_title = os.path.basename(clips)
    r = requests.get(clips)
    with open(clip_title + '.mp4', 'wb') as f:
        f.write(r.content)
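For larger files it can be worth streaming the response instead of holding it all in memory at once; here is a sketch of the same loop using requests' stream mode, under the same assumption that each URL returns the raw MP4 data.
import requests, os

for clips in list_clips:
    clip_title = os.path.basename(clips)
    # stream=True fetches the body in chunks instead of all at once.
    with requests.get(clips, stream=True) as r:
        r.raise_for_status()
        with open(clip_title + '.mp4', 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)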

Download Excel file using Python

I have a web link which downloads an Excel file directly. It opens a page saying "your file is downloading" and starts downloading the file.
Is there any way I can automate this using the requests module?
I am able to do it with Selenium, but I want it to run in the background, so I was wondering if I could use the requests module instead.
I have used requests.get, but it simply gives the text "your file is downloading"; somehow I am not able to get the file.
This Python 3 code downloads any file from the web into memory:
import requests
from io import BytesIO

url = 'your.link/path'

def get_file_data(url):
    # Stream the response into an in-memory buffer.
    response = requests.get(url)
    f = BytesIO()
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)
    f.seek(0)
    return f

data = get_file_data(url)
You can use the following code to read the Excel file:
import pandas as pd
xlsx = pd.read_excel(data, skiprows=0)
print(xlsx)
It sounds like you don't actually have a direct URL to the file, and instead need to engage with some javascript. Perhaps there is an underlying network call that you can find by inspecting the page traffic in your browser that shows a direct URL for downloading the file. With that you can actually just read the excel file URL directly with pandas:
import pandas as pd
url = "https://example.com/some_file.xlsx"
df = pd.read_excel(url)
print(df)
This is nice and tidy, but if you really want to use requests (or avoid pandas) you can download the raw file content as shown in this answer and then use the pyexcel_xlsx package's get_xlsx function to read it without any pandas involvement.
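If you do want to stay with requests, here is a minimal sketch of that route; it hands the downloaded bytes to pandas rather than pyexcel_xlsx, purely to keep the example self-contained, and the URL is a placeholder.
import requests
import pandas as pd
from io import BytesIO

# Placeholder URL found via the browser's network inspector.
url = "https://example.com/some_file.xlsx"

response = requests.get(url)
response.raise_for_status()

# Wrap the raw bytes in a file-like object and let pandas parse them.
df = pd.read_excel(BytesIO(response.content))
print(df.head())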

Download the content that torrent files point to in Python 3

Perhaps I'm misunderstanding how .torrent files work, but is there a way in Python to download the actual content a .torrent file references (what a torrent client such as uTorrent would download), from the shell/command line?
The following works for simply downloading a .torrent file, and sure, I could open a torrent client to download the content as well, but I'd rather streamline the process on the command line. I can't seem to find much online about doing this...
import gzip
import requests
import torrentutils
from io import BytesIO

torrent = torrentutils.parse_magnet(magnet)
infohash = torrent['infoHash']
session = requests.Session()
session.headers.update({'User-Agent': 'Mozilla/5.0'})
url = "http://torcache.net/torrent/" + infohash + ".torrent"
answer = session.get(url)
torrent_data = answer.content
buffer = BytesIO(torrent_data)
gz = gzip.GzipFile(fileobj=buffer)
output = open(torrent['name'], "wb")
output.write(torrent_data)
output.close()
As far as I know, I can't use libtorrent for Python 3 on a 64-bit Windows OS.
If magnet: links work in your web browser, then a simple way to start a new torrent download from your Python script is to open the URL with your web browser:
import webbrowser
webbrowser.open(magnet_link)
Or from the command-line:
$ python -m webbrowser "magnet:?xt=urn:btih:ebab37b86830e1ed624c1fdbb2c59a1800135610&dn=StackOverflow201508.7z"
The download is performed by your actual torrent client such as uTorrent.
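If you have several magnet links, the same idea extends to a simple loop; the list below is a hypothetical example.
import webbrowser

# Hypothetical magnet links; each one is handed off to the default torrent client.
magnet_links = [
    'magnet:?xt=urn:btih:ebab37b86830e1ed624c1fdbb2c59a1800135610&dn=StackOverflow201508.7z',
]

for link in magnet_links:
    webbrowser.open(link)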
BitTornado works on Windows and has a command-line interface. Take a look at btdownloadheadless.py, but note that it is written in Python 2.
http://www.bittornado.com/download.html
