
Use Python to parse configuration files

Sometimes, a program needs so many parameters that passing them all as command-line arguments or environment variables is neither pleasant nor feasible. In those cases, you will want to use a configuration file.

There are several popular formats for configuration files. Among them are the venerable (though occasionally under-defined) INI format; JSON, which is popular but sometimes hard to write by hand; YAML, which is extensive yet occasionally surprising in its details; and the newest addition, TOML, which many people have not heard of yet.

Your first task is to choose a format and then to document that choice. With this easy part out of the way, it is time to parse the configuration.

It is often a good idea to have a class that corresponds to the "abstract" data in the configuration. Because the code in this article does nothing with the configuration beyond loading it, such a class is the simplest way to show the parsing logic.

Imagine the configuration for a file processor: it includes an input directory, an output directory, and which files to pick up.

The abstract definition for the configuration class might look something like:

from __future__ import annotations
from typing import List

import attr

# Frozen classes are immutable once they have been created.
@attr.frozen
class Configuration:
    @attr.frozen
    class Files:
        input_dir: str
        output_dir: str
    files: Files
    @attr.frozen
    class Parameters:
        patterns: List[str]
    parameters: Parameters

To make the format-specific code simpler, you will also write a function to build this class from dictionaries. Note that this assumes the configuration file uses dashes in its keys, not underscores, while the Python attributes use underscores. This kind of discrepancy is not uncommon.

def configuration_from_dict(details):
    # Keys in the file use dashes; the attributes use underscores.
    files = Configuration.Files(
        input_dir=details["files"]["input-dir"],
        output_dir=details["files"]["output-dir"],
    )
    parameters = Configuration.Parameters(
        patterns=details["parameters"]["patterns"]
    )
    return Configuration(
        files=files,
        parameters=parameters,
    )
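
The dash-to-underscore mapping can be tried out without installing anything. The following is a minimal sketch, not the article's exact code: it swaps attrs for the standard library's dataclasses and flattens the nested classes into top-level ones, both of which are assumptions made only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List


# Stand-ins for the attrs classes, written with stdlib dataclasses
# so this sketch runs without third-party packages.
@dataclass(frozen=True)
class Files:
    input_dir: str
    output_dir: str


@dataclass(frozen=True)
class Parameters:
    patterns: List[str]


@dataclass(frozen=True)
class Configuration:
    files: Files
    parameters: Parameters


def configuration_from_dict(details):
    # Dashed keys from the file map onto underscored Python attributes.
    files = Files(
        input_dir=details["files"]["input-dir"],
        output_dir=details["files"]["output-dir"],
    )
    parameters = Parameters(patterns=details["parameters"]["patterns"])
    return Configuration(files=files, parameters=parameters)


example = {
    "files": {"input-dir": "inputs", "output-dir": "outputs"},
    "parameters": {"patterns": ["*.txt", "*.md"]},
}
config = configuration_from_dict(example)

# A key spelled with an underscore is not found: the lookup is strict.
try:
    configuration_from_dict({"files": {"input_dir": "inputs"}})
    missing_key = None
except KeyError as error:
    missing_key = error.args[0]
```

Because the lookups are plain dictionary indexing, a misspelled or missing key surfaces as a KeyError naming the expected key, which is often enough of an error message for a small tool.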

JSON

JSON (JavaScript Object Notation) is a JavaScript-like format.

Here is an example configuration in JSON format:

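json_config = """
{
  "files": {
    "input-dir": "inputs",
    "output-dir": "outputs"
  },
  "parameters": {
    "patterns": [
      "*.txt",
      "*.md"
    ]
  }
}
"""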

The parsing logic uses the json module to parse the JSON into Python's built-in data structures (dictionaries, lists, strings) and then creates the class from the dictionary:

import json

def configuration_from_json(data):
    # json.loads parses a string (json.load would read from a file object).
    parsed = json.loads(data)
    return configuration_from_dict(parsed)
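
To see what the json module hands back, here is a small sketch that parses a JSON document and inspects the result. The JSON text is an assumed rendering of the same file-processor settings used for the other formats in this article. The second half illustrates one reason JSON can be awkward to write by hand: a trailing comma is rejected.

```python
import json

# Assumed JSON rendering of the file-processor configuration.
json_config = """
{
  "files": {
    "input-dir": "inputs",
    "output-dir": "outputs"
  },
  "parameters": {
    "patterns": ["*.txt", "*.md"]
  }
}
"""

parsed = json.loads(json_config)

# Objects become dicts, arrays become lists, strings stay strings.
files_type = type(parsed["files"])
patterns = parsed["parameters"]["patterns"]

# A trailing comma, easy to leave behind when editing by hand, is an error.
try:
    json.loads('{"patterns": ["*.txt", "*.md",]}')
    trailing_comma_rejected = False
except json.JSONDecodeError:
    trailing_comma_rejected = True
```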

INI

The INI format, originally popular on Windows, became a de facto configuration standard.

Here is the same configuration as an INI:

ini_config = """
[files]
input-dir = inputs
output-dir = outputs

[parameters]
patterns = ['*.txt', '*.md']
"""

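Python can parse it using the built-in configparser module. Keep in mind that configparser returns every value as a string, so patterns arrives as the text ['*.txt', '*.md'] rather than as a list. The parser behaves as a dict-like object, so it can be passed directly to configuration_from_dict: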

import configparser

def configuration_from_ini(data):
    parser = configparser.ConfigParser()
    # read_string parses text that is already in memory.
    parser.read_string(data)
    return configuration_from_dict(parser)
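
Since configparser hands back strings, the patterns option needs one more step before it is a real list. One possible approach, sketched here as an assumption rather than as part of the original design, is to evaluate the value as a Python literal with ast.literal_eval, which only accepts literals and does not run arbitrary code.

```python
import ast
import configparser

ini_config = """
[files]
input-dir = inputs
output-dir = outputs

[parameters]
patterns = ['*.txt', '*.md']
"""

parser = configparser.ConfigParser()
parser.read_string(ini_config)

# Every option value comes back as a string, including the list-looking one.
raw_patterns = parser["parameters"]["patterns"]

# literal_eval turns the text of a Python literal into the actual object.
patterns = ast.literal_eval(raw_patterns)
```

This works only because the value happens to be written in Python list syntax. A comma-separated value split with str.split would be a reasonable alternative convention.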

YAML

YAML (a recursive acronym for "YAML Ain't Markup Language", originally "Yet Another Markup Language") is an extension of JSON that is designed to be easier to write by hand. It accomplishes this, in part, by having a long specification.

Here is the same configuration in YAML:

yaml_config = """
files:
  input-dir: inputs
  output-dir: outputs
parameters:
  patterns:
  - '*.txt'
  - '*.md'
"""

For Python to parse this, you need to install a third-party module. The most popular is PyYAML (pip install pyyaml). The YAML parser also returns built-in Python data types that can be passed to configuration_from_dict. yaml.safe_load accepts either a string or an open stream; the function below wraps the string in a stream, which mirrors reading from an open file.

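import io
import yaml

def configuration_from_yaml(data):
    # Wrap the text in a file-like object before parsing.
    fp = io.StringIO(data)
    # safe_load builds only plain Python types, never arbitrary objects.
    parsed = yaml.safe_load(fp)
    return configuration_from_dict(parsed)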

TOML

TOML (Tom's Obvious, Minimal Language) is designed to be a lightweight alternative to YAML. The specification is shorter, and it is already popular in some places (for example, Rust's package manager, Cargo, uses it for package configuration).

Here is the same configuration as a TOML:

toml_config = """
[files]
input-dir = "inputs"
output-dir = "outputs"

[parameters]
patterns = [ "*.txt", "*.md",]
"""

To parse TOML, you need to install a third-party package. The most popular one is called, simply, toml. Like the YAML and JSON parsers, it returns basic Python data types.

import toml

def configuration_from_toml(data):
    # toml.loads parses a string into nested dictionaries and lists.
    parsed = toml.loads(data)
    return configuration_from_dict(parsed)

Summary

Choosing a configuration format is a subtle tradeoff. However, once you make the decision, Python can parse most of the popular formats using a handful of lines of code.
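
If a tool ends up supporting more than one format, the per-format parsers can sit behind a single loader that picks one by file suffix. This is only a sketch limited to the two standard-library formats; the names load_raw_config and PARSERS are hypothetical, and the resulting dictionary would still go through a function such as configuration_from_dict.

```python
import configparser
import json
import tempfile
from pathlib import Path


def parse_json(text):
    return json.loads(text)


def parse_ini(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Copy into plain dictionaries so every parser returns the same shape.
    return {name: dict(parser[name]) for name in parser.sections()}


# Hypothetical mapping from file suffix to parser; extend as needed.
PARSERS = {".json": parse_json, ".ini": parse_ini}


def load_raw_config(path):
    path = Path(path)
    try:
        parse = PARSERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported configuration format: {path.suffix}")
    return parse(path.read_text(encoding="utf-8"))


# Usage: write a small INI file to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as directory:
    ini_path = Path(directory) / "processor.ini"
    ini_path.write_text(
        "[files]\ninput-dir = inputs\noutput-dir = outputs\n",
        encoding="utf-8",
    )
    raw = load_raw_config(ini_path)
```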
