Access Python package index JSON APIs with requests

PyPI, the Python package index, provides a JSON API for information about its packages. This is essentially a machine-readable source of the same kind of data you can access while browsing the website. For example, as a human, I can head to the NumPy project page in my browser, click around, and see which versions there are, what files are available, and things like release dates and which Python versions are supported.

But if I want to write a program to access this data, I can use the JSON API instead of having to scrape and parse the HTML on those pages.

As an aside: On the old PyPI website, when it was hosted at pypi.python.org, the NumPy project page was at pypi.python.org/pypi/numpy, and accessing the JSON was a simple matter of adding /json on the end, hence https://pypi.org/pypi/numpy/json. Now the PyPI website is hosted at pypi.org, and NumPy’s project page is at pypi.org/project/numpy. The new site doesn’t render the JSON as part of its project pages, but the API still works as it did before. So now, rather than adding /json to the project page URL, you have to remember the URL where the JSON lives.
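Since the JSON URL follows a fixed pattern, a small helper saves having to remember it. This is only a sketch: the name pypi_json_url is made up for illustration, and the only thing taken from the text above is the https://pypi.org/pypi/NAME/json pattern.

```python
PYPI_JSON_PATTERN = "https://pypi.org/pypi/{name}/json"


def pypi_json_url(name):
    """Return the PyPI JSON API URL for a project name (hypothetical helper)."""
    return PYPI_JSON_PATTERN.format(name=name)


print(pypi_json_url("numpy"))  # https://pypi.org/pypi/numpy/json
```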

You can open up the JSON for NumPy in your browser by heading to its URL. Firefox renders it nicely.

You can open info, releases, and urls to inspect the contents within. Or you can load it into a Python shell. Here are a few lines to get started:

import requests
url = "https://pypi.org/pypi/numpy/json"
r = requests.get(url)
data = r.json()

Once you have the data (calling .json() returns it as a dictionary), you can inspect it, for example by listing its keys.

Open releases, and inspect the keys inside it.

This shows that releases is a dictionary with version numbers as keys. Pick one (say, the latest one) and inspect that.

Each release is a list, and this one contains 24 items. But what is each item? Since it’s a list, you can index the first one and take a look.

This item is a dictionary containing details about a particular file. So each of the 24 items in the list relates to a file associated with this particular version number, i.e., the 24 files listed at https://pypi.org/project/numpy/1.20.1/#files.
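The inspection steps above can be sketched without touching the network. The sample dictionary below is made up and far smaller than the real response; it only mirrors the shape described here (top-level info, releases, and urls keys, with releases mapping version strings to lists of per-file dictionaries). The describe_release helper is likewise hypothetical.

```python
# Made-up sample in the same shape as the PyPI response described above.
sample = {
    "info": {"name": "example"},
    "releases": {
        "1.0.0": [
            {"filename": "example-1.0.0.tar.gz", "packagetype": "sdist"},
        ],
        "1.1.0": [
            {"filename": "example-1.1.0.tar.gz", "packagetype": "sdist"},
            {"filename": "example-1.1.0-py3-none-any.whl",
             "packagetype": "bdist_wheel"},
        ],
    },
    "urls": [],
}


def describe_release(data, version):
    """Return the number of files in a release and the keys of its first file."""
    files = data["releases"][version]      # each release is a list of files
    first_keys = sorted(files[0]) if files else []
    return len(files), first_keys


print(sorted(sample))                      # top-level keys
print(list(sample["releases"]))            # version numbers
print(describe_release(sample, "1.1.0"))   # file count and per-file keys
```

With the real data, you would pass the dictionary returned by r.json() instead of sample.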

You could write a script that looks for something within the available data. For example, the following loop looks for versions with sdist (source distribution) files that specify a requires_python attribute and prints them:

for version, files in data['releases'].items():
    for f in files:
        if f.get('packagetype') == 'sdist' and f.get('requires_python'):
            print(version, f['requires_python'])
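If you want to reuse the result rather than just print it, the same loop can collect its matches into a dictionary. This is a sketch under the same assumptions about the response shape; sdist_requires_python is a made-up name and the sample data is invented so the example runs offline.

```python
def sdist_requires_python(data):
    """Map each version to the requires_python value of its sdist, if any."""
    found = {}
    for version, files in data["releases"].items():
        for f in files:
            if f.get("packagetype") == "sdist" and f.get("requires_python"):
                found[version] = f["requires_python"]
    return found


# Made-up sample in the shape of the PyPI response.
sample = {
    "releases": {
        "1.0.0": [{"packagetype": "sdist", "requires_python": None}],
        "1.1.0": [
            {"packagetype": "sdist", "requires_python": ">=3.7"},
            {"packagetype": "bdist_wheel", "requires_python": ">=3.7"},
        ],
    }
}

print(sdist_requires_python(sample))
```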

piwheels

Last year I implemented a similar API on the piwheels website. piwheels.org is a Python package index that provides wheels (precompiled binary packages) for the Raspberry Pi architecture. It’s essentially a mirror of the package set on PyPI, but with Arm wheels instead of files uploaded to PyPI by package maintainers.

Since piwheels mimics the URL structure of PyPI, you can change the pypi.org part of a project page’s URL to piwheels.org. It’ll show you a similar kind of project page with details about which versions we have built and which files are available. Since I liked how the old site allowed you to add /json to the end of the URL, I made ours work that way. So NumPy’s project page on PyPI is pypi.org/project/numpy; on piwheels, it’s piwheels.org/project/numpy, and the JSON is at piwheels.org/project/numpy/json.
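Because the two sites share a URL structure, converting a PyPI project page URL into the piwheels JSON URL is a small string transformation. The sketch below uses only the standard library; piwheels_json_url is a hypothetical helper, and it assumes the www.piwheels.org host used in the scripts in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def piwheels_json_url(pypi_project_url):
    """Turn a PyPI project page URL into the matching piwheels JSON URL."""
    parts = urlsplit(pypi_project_url)
    # Drop any trailing slash, then append /json as described above.
    path = parts.path.rstrip("/") + "/json"
    return urlunsplit((parts.scheme, "www.piwheels.org", path, "", ""))


print(piwheels_json_url("https://pypi.org/project/numpy/"))
```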

There’s no need to duplicate the contents of PyPI’s API, so we provide information about what’s available on piwheels and include a list of all known releases, some basic information, and a list of files we have.

Similar to the previous PyPI example, you could create a script to analyze the API contents, for example, to show the number of files piwheels has for each version of NumPy:

import requests

url = "https://www.piwheels.org/project/numpy/json"
package = requests.get(url).json()

# Each release has a 'files' dictionary, which is empty if nothing was built
for version, info in package['releases'].items():
    if info['files']:
        print('{}: {} files'.format(version, len(info['files'])))
    else:
        print('{}: No files'.format(version))

Also, each file contains some metadata.

One handy thing is the apt_dependencies field, which lists the Apt packages needed to use the library. In the case of this NumPy file, as well as installing NumPy with pip, you’ll also need to install libatlas3-base and libgfortran using Debian’s Apt package manager.
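Different files in the same release can list different dependencies, so it can be useful to merge them. The following is a minimal sketch, assuming each release has a files dictionary whose values carry an apt_dependencies list, as in the scripts in this article. The helper name, the filenames, and the dependency lists in the sample are all made up.

```python
def apt_dependencies(release):
    """Return the sorted set of Apt packages needed by any file in a release."""
    deps = set()
    for file_info in release["files"].values():
        deps.update(file_info.get("apt_dependencies", []))
    return sorted(deps)


# Made-up sample release in the assumed piwheels shape.
release = {
    "files": {
        "example-1.0-cp37-cp37m-linux_armv7l.whl": {
            "apt_dependencies": ["libatlas3-base", "libgfortran5"],
        },
        "example-1.0-cp35-cp35m-linux_armv7l.whl": {
            "apt_dependencies": ["libatlas3-base", "libgfortran3"],
        },
    }
}

print("sudo apt install " + " ".join(apt_dependencies(release)))
```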

Here is an example script that shows the Apt dependencies for a package:

import requests

def get_install(package, abi):
    url = 'https://piwheels.org/project/{}/json'.format(package)
    r = requests.get(url)
    data = r.json()
    # Walk the releases in reverse-sorted order and stop at the first
    # release that has a file matching the requested ABI tag
    for version, release in sorted(data['releases'].items(), reverse=True):
        for filename, file in release['files'].items():
            if abi in filename:
                deps = ' '.join(file['apt_dependencies'])
                print("sudo apt install {}".format(deps))
                print("sudo pip3 install {}=={}".format(package, version))
                return

get_install('opencv-python', 'cp37m')
get_install('opencv-python', 'cp35m')
get_install('opencv-python-headless', 'cp37m')
get_install('opencv-python-headless', 'cp35m')
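One caveat with the script above: sorted() compares version strings character by character, so "1.9.0" sorts ahead of "1.20.1" in reverse order. A minimal sketch of a numeric sort key follows. version_key is a hypothetical helper that only looks at the leading dotted numeric part and ignores suffixes such as pre-release tags, so a dedicated version parser is the safer choice for real use.

```python
import re


def version_key(version):
    """Sort key: the leading dotted numeric part of a version as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()  # versions with no numeric prefix sort lowest
    return tuple(int(part) for part in match.group().split("."))


versions = ["1.9.0", "1.20.1", "1.19.5"]
print(sorted(versions, reverse=True))                   # plain string order
print(sorted(versions, key=version_key, reverse=True))  # numeric order
```

To use it in get_install, you could pass key=lambda item: version_key(item[0]) to the sorted() call.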

We also provide a general API endpoint for the list of packages, which includes download stats for each package:

import requests

url = "https://www.piwheels.org/packages.json"
packages = requests.get(url).json()
# Assumption: each entry starts with
# [name, downloads in the last month, downloads in total]
packages = {
    pkg: (d_month, d_all)
    for pkg, d_month, d_all, *_ in packages
}

package = 'numpy'
d_month, d_all = packages[package]

print(package, "has had", d_month, "downloads in the last month")
print(package, "has had", d_all, "downloads in total")
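The same list can be ranked as well as looked up. The sketch below picks the packages with the most downloads in the last month. It assumes each entry starts with the package name followed by the monthly and total download counts; top_packages is a hypothetical helper, and the sample entries are invented so the example runs offline.

```python
def top_packages(entries, n=3):
    """Return the n (name, monthly downloads) pairs with the most downloads."""
    stats = [(name, d_month) for name, d_month, *_ in entries]
    return sorted(stats, key=lambda item: item[1], reverse=True)[:n]


# Made-up sample; each entry is assumed to start with
# [name, downloads in the last month, downloads in total].
entries = [
    ["alpha", 120, 5000],
    ["beta", 900, 9100],
    ["gamma", 40, 70000],
]

print(top_packages(entries, n=2))
```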

Since pip search is currently disabled due to its XMLRPC interface being overloaded, people have been looking for alternatives. You can use the piwheels JSON API to search for package names instead, since the set of packages is the same:

#!/usr/bin/python3
import sys

import requests

PIWHEELS_URL = 'https://www.piwheels.org/packages.json'

r = requests.get(PIWHEELS_URL)
# The first element of each entry is the package name
packages = {p[0] for p in r.json()}

def search(term):
    for pkg in packages:
        if term in pkg:
            yield pkg

if __name__ == '__main__':
    if len(sys.argv) == 2:
        results = search(sys.argv[1].lower())
        for res in results:
            print(res)
    else:
        print("Usage: pip_search TERM")

For more information, see the piwheels JSON API documentation.


This article originally appeared on Ben Nuttall’s Tooling Tuesday blog and is reused with permission.
