The Validation Config

You can add a new validation/classification pipeline to Workflow Web by providing a YAML-based configuration file. Based on the configuration, Workflow Web will automatically generate the pipeline for you and display the results/stats for it.

Example YAML Configuration File

Here is an example validation config YAML file:

name: 'Example Validation Config'
version: 0
results:
  query:
    pipeline: 'example-pipeline'
    user: 'afrokk'
    tag:
      - 'catalog-2'
      - 'realtime'
    start: '2020-01-01'
    stop: '2024-01-02'
    status:
      - 'success'
    plots:
      - '.png'
    products:
      - '.npz'
    site:
      - 'chime'
    display:
      - field: 'example.field'
        display-name: 'Example Field'
      - field: 'example.field2'
        display-name: 'Example Field 2'
classification:
  name: 'example-pipeline-feedback'
  for: work
  requirements:
    count: 3
  kind:
    labels:
      # single choice
      - name: 'Example Label'
        type: 'single'
        prompt: 'Does this look good?'
        choices: ['yes', 'no', 'maybe']
        default: 'maybe'
      # multiple choice
      - name: 'Example Issues'
        type: 'multiple'
        prompt: 'What issues did you notice?'
        choices:
          - 'Unrealistic Values'
          - 'Missing Values'
          - 'Bad Data'
workers:
  organization: 'ExampleOrg'
  team: 'ExampleTeam'
  user-quota: 100

YAML Specification

This section describes the YAML specification for the validation pipeline configuration file. It explains the structure of the YAML file and the meaning of each field.

name


The name of the validation pipeline as it appears in the Workflow Web UI (sidebar, breadcrumbs, etc.).

version


The version of the validation pipeline. This is used to track changes to the pipeline configuration. Usually, this is set to 0.

results


This section describes the results that the validation pipeline will display. The results section has the following field:

query

This is the query that will be used to fetch the results. The query section has the following fields:

  • pipeline: The name of the pipeline whose results the query fetches. For example, header-localization.
  • user: GitHub username of the user who ran the pipeline.

Optional Fields:

  • tag: A list of tags that a result must have. Results without these tags are excluded.
  • start: The start date of the query window. Results created before this date are excluded.
  • stop: The stop date of the query window. Results created after this date are excluded.
  • status: Filters the results by their status, such as success or failed.
  • plots: The file extensions of the plots that the results must have, such as .png or .jpg.
  • products: The file extensions of the products that the results must have, such as .npz or .h5.
  • site: The site to fetch results from, such as chime or canfar.
  • display: The fields to display for each result. Each entry in the display list has the following fields:
    • field: The field in the results that you want to display. For example, fit_statistics.snr.
    • display-name: The name shown for the field in the UI. For example, SNR.
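
To make the filter semantics above concrete, here is a minimal Python sketch. It is not Workflow Web's implementation: it assumes the query section has already been parsed into a dict, and the result record shape (tags, created, status) and both function names are hypothetical. It also assumes that every listed tag must be present and that start and stop are inclusive ISO dates.

```python
from datetime import date


def matches_query(result: dict, query: dict) -> bool:
    """Return True if a (hypothetical) result record passes the optional filters."""
    # Assumption: every tag listed in the query must be present on the result.
    required_tags = set(query.get("tag", []))
    if not required_tags.issubset(result.get("tags", [])):
        return False

    # Results created before start or after stop are excluded,
    # so both boundary dates themselves are kept.
    created = date.fromisoformat(result["created"])
    if "start" in query and created < date.fromisoformat(query["start"]):
        return False
    if "stop" in query and created > date.fromisoformat(query["stop"]):
        return False

    # A missing status filter accepts every status.
    statuses = query.get("status")
    if statuses and result.get("status") not in statuses:
        return False
    return True


def lookup_field(result: dict, dotted: str):
    """Resolve a dotted display field such as 'fit_statistics.snr'."""
    value = result
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # the field is absent from this result
        value = value[key]
    return value


query = {"tag": ["catalog-2", "realtime"], "start": "2020-01-01",
         "stop": "2024-01-02", "status": ["success"]}
result = {"tags": ["catalog-2", "realtime", "extra"], "created": "2021-06-15",
          "status": "success", "fit_statistics": {"snr": 12.5}}

print(matches_query(result, query))               # True
print(lookup_field(result, "fit_statistics.snr"))  # 12.5
```

The dotted lookup mirrors how a display entry such as fit_statistics.snr would point at a nested value in a result.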

classification


This section describes the classification feedback options that the validation pipeline will display. The classification section has the following fields:

name

The name of the classification feedback as it appears in the Workflow Web UI.

for

The type of item that the classification feedback applies to, such as work or results.

requirements

This field is used to set the requirements for each result. The requirements section has the following fields:

  • count: The minimum number of classifications required for each result for it to be considered "classified". For example, setting it to 3 would mean you require three people to validate. Once this number is reached, the result will be marked as "classified" and added to the classification stats.
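
The count rule can be sketched in a few lines of Python. This is an illustration only: the function name and the per-result tally are hypothetical, and how Workflow Web stores classifications is not described here.

```python
def is_classified(num_classifications: int, required_count: int) -> bool:
    """A result counts as classified once it reaches the required count."""
    return num_classifications >= required_count


# Hypothetical tally of classifications received per result ID.
received = {"result-a": 3, "result-b": 1, "result-c": 5}
required_count = 3  # the value of requirements.count in the config

classified = sorted(rid for rid, n in received.items()
                    if is_classified(n, required_count))
print(classified)  # ['result-a', 'result-c']
```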

kind

This field is used to define the type of classification feedback. The kind section has the following fields:

  • labels: Labels are the questions that the user will answer. The labels section has the following fields:
    • name: The name of the label. For example, dispersion.
    • type: The type of the label, either single or multiple. A single label lets the user choose exactly one option, while a multiple label lets the user choose several.
    • prompt: The prompt that will be displayed to the user. For example, Does this Dispersion look right?.
    • choices: A list of choices that the user can choose from. For example, ['yes', 'no', 'maybe'].
    • default: The default choice for single-choice labels. It is selected if the user does not pick a choice. Multiple-choice labels have no default.
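
As a rough illustration of the label rules above, the following Python sketch checks one label definition and resolves the default for an unanswered single-choice label. It assumes the label has been parsed into a dict. The function names are hypothetical, and treating name, type, prompt, and choices as required is an assumption rather than documented behaviour.

```python
def check_label(label: dict) -> list:
    """Return a list of problems found in one label definition."""
    problems = []
    # Assumption: these four fields are required for every label.
    for key in ("name", "type", "prompt", "choices"):
        if key not in label:
            problems.append(f"missing field: {key}")
    if "type" in label and label["type"] not in ("single", "multiple"):
        problems.append("type must be 'single' or 'multiple'")
    default = label.get("default")
    if default is not None:
        if label.get("type") == "multiple":
            problems.append("multiple choice labels take no default")
        elif default not in label.get("choices", []):
            problems.append("default must be one of the choices")
    return problems


def resolve_answer(label: dict, selected):
    """Fall back to the default when a single choice label is left unanswered."""
    if selected is None and label.get("type") == "single":
        return label.get("default")
    return selected


single = {"name": "Example Label", "type": "single",
          "prompt": "Does this look good?",
          "choices": ["yes", "no", "maybe"], "default": "maybe"}

print(check_label(single))             # []
print(resolve_answer(single, None))    # maybe
print(resolve_answer(single, "yes"))   # yes
```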

workers


This section describes the workers (users) that will be able to classify the results. The workers section has the following fields:

organization

The organization that the workers belong to. For example, CHIMEFRB.

team

The GitHub team that the workers belong to, within the organization.

user-quota

The number of workers that will be able to classify the results. For example, 100.
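
Before submitting a config, it can help to sanity-check the workers section. The sketch below is a hypothetical helper, not part of Workflow Web. It assumes the section has been parsed into a dict, that organization and team are non-empty strings, and that user-quota is a positive integer.

```python
def check_workers(workers: dict) -> list:
    """Return a list of problems found in the workers section."""
    problems = []
    for key in ("organization", "team"):
        value = workers.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} must be a non-empty string")
    quota = workers.get("user-quota")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quota, bool) or not isinstance(quota, int) or quota <= 0:
        problems.append("user-quota must be a positive integer")
    return problems


workers = {"organization": "ExampleOrg", "team": "ExampleTeam", "user-quota": 100}
print(check_workers(workers))  # []
```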

Now that you have created the validation pipeline configuration file, you can add it to Workflow Web by following the steps in the next section.