Download YFCC-100M dataset - python

I need to download the Flickr YFCC-100M dataset. I have an Amazon AWS account but could not figure out a way to download the dataset.
There is a blog post, but it is not clear to me how to download the dataset from it.
With the Flickr API, I can download images, but that will not be YFCC100M.
Here is one suggestion, but awscli could not be installed on my system:
>> sudo apt install awscli
>> ..........
>> Error: Unable to correct problems, you have held broken packages.
Is there any easy way to get this dataset downloaded?

This assumes that you already have pip and either Python 2.6.5+ or Python 3.3+ installed on your system. If you want to install awscli, you'll need to run
pip install awscli --upgrade --user
You can read more about installing the AWS Command Line Interface (CLI) here.
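Once installed, a quick sanity check should confirm the CLI works (with --user installs, the binary typically lands in ~/.local/bin, so you may need to add that directory to your PATH first):
aws --version
aws configure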
In addition, I think this link would let you gain access to the dataset that you are looking for.

You need to register on the Yahoo Webscope website and add this dataset to the "Cart".
After submitting your request for the dataset, you should receive an email with instructions. I am reproducing a part of this email, after scrubbing out some of the details and privileged information.
1. Download and install s3cmd from http://s3tools.org/download (or using an appropriate package manager for your platform).
2. Run 's3cmd --configure' and enter your access key and secret (available via XXXXXXXX <-- the actual link will be in their email). Here you can also specify additional options, such as enabling encryption during transfer, and enabling a proxy.
3. Run 's3cmd ls s3://yahoo-webscope/XXXXXXX/' to view the S3 objects for I3 - Yahoo Flickr Creative Commons 100M (14G) (Hosted on AWS).
4. Run 's3cmd get --recursive s3://yahoo-webscope/XXXXXXX/' to download a local copy of I3 - Yahoo Flickr Creative Commons 100M (14G) (Hosted on AWS).
It should be easy for you to follow these steps and get the dataset. I agree, the steps are not very transparent on their website!
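If you would rather script the download in Python than use s3cmd, a minimal boto3 sketch along these lines should also work; the bucket and prefix are the same placeholders from the email, and credentials are assumed to be configured already (e.g. via 'aws configure'):
import boto3

s3 = boto3.client("s3")  # picks up the credentials you configured

BUCKET = "yahoo-webscope"  # placeholder; use the value from their email
PREFIX = "XXXXXXX/"        # placeholder; use the value from their email

# Page through the dataset's objects and download each one
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        filename = obj["Key"].split("/")[-1]
        print("downloading", obj["Key"])
        s3.download_file(BUCKET, obj["Key"], filename)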

Related

pip install azure-functions in azure pipeline fails with pip authenticate task

I am building a CI/CD azure pipeline to build and publish an azure function from a DevOps repo to Azure. The function in question uses a custom SDK stored as a Python package Artifact in an organisation scoped feed.
If I use a pip authenticate task to be able to access the SDK, the task passes, but the pipeline then crashes when installing requirements.txt. Strangely, before we even get to the SDK, there is an error installing the azure-functions package. If I remove the SDK requirement and the pip authenticate task, however, this error does not occur. So something about the authenticate task means the agent cannot access azure-functions.
Additionally, if I swap the order of 'azure-functions' and 'CustomSDK' in requirements.txt, the agent is still unable to install the SDK artifact, so something must be wrong with the authentication task:
steps:
- task: PipAuthenticate@1
  displayName: 'Pip Authenticate'
  inputs:
    artifactFeeds: <organisation-scoped-feed>
    pythonDownloadServiceConnections: <service-connection-to-SDK-URL>
Why can I not download these packages?
This was due to confusion around the extra index url. In order to access both PyPI and the artifact feed, the following settings need to be set:
- task: PipAuthenticate@1
  displayName: 'Pip Authenticate'
  inputs:
    pythonDownloadServiceConnections: <service-connection-to-SDK-Feed>
    onlyAddExtraIndex: true
This way pip will consult PyPI first, and then the artifact feed.
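For context, a minimal sketch of how the authenticate task can sit directly in front of the install step (the service connection name is a placeholder, as above):
steps:
- task: PipAuthenticate@1
  displayName: 'Pip Authenticate'
  inputs:
    pythonDownloadServiceConnections: <service-connection-to-SDK-Feed>
    onlyAddExtraIndex: true
- script: pip install -r requirements.txt
  displayName: 'Install dependencies'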
Try running the function while the __init__.py file is active on the screen.
If you're just trying out the Quickstart, you shouldn't need to change anything in the function.json file. When you start debugging, make sure you're looking at the __init__.py file.
When you run the trigger, make sure you're on the __init__.py file. Otherwise, VS Code will try to run the current active window's file.

How to delete python package from pypi-type repository using command line?

I need to delete a python package from a private python package index (a.k.a. repository) using the command line. This repo is on artifactory, but I am not able to use the artifactory portal UI to do this.
The package was originally uploaded using twine. However, it looks like there is no delete functionality in twine.
Twine was able to succeed at the upload despite being agnostic of it being an artifactory repo... so I assume there is some kind of standardized pypi-type api...?
(This question is similar but different to How to remove a package from Pypi , because it is asking specifically about a private repo and asking specifically about a CLI solution)
Since you mention the package was uploaded to Artifactory's repository using twine, I assume the package currently exists in a local PyPI repository on the Artifactory instance.
Since you are looking for an option to delete this artifact from the repository via the command line, please check if this REST API call to delete the artifact is an option for you.
I am sharing a sample command here for your reference.
curl -u <USERNAME>:<PASSWORD> -X DELETE "http://ARTIFACTORY_HOST:ARTIFACTORY_PORT/artifactory/admin-pypi-remote-cache/bd/"
Here, admin-pypi-remote is my repository name (a remote repository's cached content is addressed with the -cache suffix, hence admin-pypi-remote-cache in the URL) and bd is the target folder/package of the deletion task.
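If you would rather do this from Python than curl, here is a small sketch with requests, using the same placeholders as the curl command above:
import requests

ARTIFACTORY = "http://ARTIFACTORY_HOST:ARTIFACTORY_PORT/artifactory"
REPO_PATH = "admin-pypi-remote-cache/bd/"  # repository cache + target folder

# Same REST call as the curl example: an HTTP DELETE on the artifact path
resp = requests.delete(f"{ARTIFACTORY}/{REPO_PATH}",
                       auth=("USERNAME", "PASSWORD"))
resp.raise_for_status()  # raises if the server rejected the deletion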
If you are using a local pypi server (e.g. pypiserver), you can delete an uploaded artifact with the following command:
curl -u <USERNAME>:<PASSWORD> --form ":action=remove_pkg" --form "name=<PACKAGE_NAME>" --form "version=<VERSION>" <PYPI_SERVER_URL>

How to specify authentication for pip with extra-index-url in pip.ini (Windows) or pip.conf (Mac/Linux) on Azure Pipelines/Artifacts

Azure Artifacts allows posting a module to a feed that can then be installed using pip by setting extra-index-url in pip.ini (Windows) or pip.conf (Mac/Linux).
However, when using pip install, the system asks for a user/password.
Is it possible to set this up inside pip.conf, or even better, to use .ssh signatures?
I was facing the same issue; here is a workaround that worked for me. It bypasses the interactive process Lance Li-MSFT mentioned ("It will ask your credentials and keep it in local cache, and it won't ask for user and password again if everything is ok").
In the pip.ini / pip.conf file, add:
[global]
extra-index-url=https://<Personal Access Token>@pkgs.dev.azure.com/<Organization Name>/_packaging/<Feed Name>/pypi/simple/
This will be useful if you are in an environment where you can't do the first interactive login (Example Use-Case: Setting up an Azure Databricks from Azure Machine Learning Workspace and installing required packages).
Is it possible to set this up inside pip.conf, or even better, to use .ssh signatures?
What you met is expected behavior if it's the first time you try to connect to the Azure DevOps feed.
It will ask for your credentials and keep them in a local cache; it won't ask for the user and password again if everything is OK.
We should note:
1. The Python Credential Provider is the artifacts-keyring package. It's used to keep the credentials, instead of other options like pip.conf or .ssh.
2. What it asks for is a PAT (Personal Access Token). For me, entering the PAT in both the User and Password inputs works.
3. If you still need to enter the password every time you connect to the feed, there must be something wrong with your Python Credential Provider (artifacts-keyring) package. Make sure you install this package successfully before running the pip install command.
4. There are two options (it seems you're using option 2) to connect to the feed; they both need the artifacts-keyring package to save the credentials. For me, in a Windows environment, it's easy to install that package. But if you're in a Linux environment, you should check step 4 under the Get Tools button carefully.
Here's the link to the prerequisites mentioned above.
Hope all above helps :)
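For reference, the credential-provider route boils down to a pair of commands like the following (the organization and feed names are placeholders; artifacts-keyring is the actual package name):
pip install keyring artifacts-keyring
pip install <package-name> --index-url https://pkgs.dev.azure.com/<Organization Name>/_packaging/<Feed Name>/pypi/simple/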

How do I install this PIP package from Github?

I am basically trying to access Crunchbase data through their REST API using Python. There is a package available on GitHub that gives me the following documentation. How do I get this "package"?
The CrunchBase API provides a RESTful interface to the data found on CrunchBase. The response is in JSON format.
Register
Follow the steps below to start using the CrunchBase API:
Sign Up
Login & get API key
Browse the documentation.
Setup
pip install git+git://github.com/anglinb/python-crunchbase
Up & Running
Import Crunchbase, then initialize the Crunchbase object with your API key.
You are missing the https. Instead of
git+git://github.com/anglinb/python-crunchbase
use
pip install git+https://github.com/anglinb/python-crunchbase.git
Update: make sure you have git installed on your system.
Add this in your requirements.txt file.
git+https://github.com/user_name/project_name.git
Ideally, requirements.txt (or reqs.txt) will exist in your project's root folder. This file is where all the Python libraries' names are stored, along with precise version numbers.
Here is a great deal of information, with easy examples, related to this topic:
https://pip.readthedocs.io/en/1.1/requirements.html
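As an illustration, a minimal requirements.txt mixing an ordinary pinned PyPI dependency with the Git source might look like this (the requests pin is just an example, not something this package requires):
requests==2.28.0
git+https://github.com/anglinb/python-crunchbase.git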

How can I deploy an OVA file on vSphere Client with Python

I want to automate deploying an OVA image on vSphere with Python.
I looked at some packages, viz. PySphere and psphere, but didn't find a direct method to do so.
Is there any library I'm missing, or is there any other way to deploy OVA/OVF files/templates on vSphere with Python?
Please help!
I have the same situation here and found that there is a vSphere Automation API made in Python, here (GitHub clone here).
All you need to do is extract the SDK and download deploy_ovf_template.py for usage, here or from the GitHub clone here. This template works with OVF, but since you want to work with OVA you'll need to do the extra work of extracting the OVA first (you'll get the OVF and vmdk files).
For other scenarios, check the PDF documentation here.
Be aware that this is supported on vSphere 6.5 and above.
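An OVA is just a tar archive, so the extraction step mentioned above can be done in a few lines of Python (the filename here is an example):
import tarfile

# Unpack the OVA; the result is the .ovf descriptor plus the .vmdk disk files
with tarfile.open("my_appliance.ova") as ova:
    ova.extractall("extracted/")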
As far as I know, there is no appropriate API for deploying an OVF template via a Python package. You can use ovftool; VMware OVF Tool is a command-line utility that allows you to import and export OVF packages to and from many VMware products.
Download ovftool from the VMware site: https://my.vmware.com/web/vmware/details?productId=352&downloadGroup=OVFTOOL350
To install ovftool:
sudo /bin/sh VMware-ovftool-3.5.0-1274719-lin.x86_64.bundle
To deploy an OVA image as a template, the syntax is:
ovftool -dm=thick -ds=3par1 -n=abhi_vm /root/lab/extract/overcloud-esx-ovsvapp.ova vi://root:pwd@10.1.2**.**/datacenter/host/cluster
Use os.system(ovftool_syntax) to run that command from your Python script.
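A slightly more robust alternative to os.system() is subprocess.run(), sketched below with the same placeholder datastore, VM name, and vi:// target as the command above:
import subprocess

# Each argument is passed separately, so there are no shell-quoting issues
subprocess.run([
    "ovftool", "-dm=thick", "-ds=3par1", "-n=abhi_vm",
    "/root/lab/extract/overcloud-esx-ovsvapp.ova",
    "vi://root:pwd@10.1.2**.**/datacenter/host/cluster",
], check=True)  # raises CalledProcessError if ovftool fails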
