Baseband Analysis on the CHIME/FRB on-site cluster
On-site Cluster
We have a 10-node cluster consisting of nodes frb-analysis[2-5] and cfen[4-9], with a combined computing power of 200 cores and 2 TB of RAM. We operate the cluster as a set of serverless-style services using Docker Swarm.
Baseband Pipeline
The baseband pipeline services are spawned using a Docker Compose file located at docker/baseband_cluster_docker_compose.yml.
It creates the following services:
- baseband_pipeline_manager: contains the control logic for executing the pipeline steps in a systematic order.
- baseband_initialize: sets up everything needed for the pipeline to run.
- baseband_beamform: the most expensive part of the pipeline. Performs beamforming on each frequency depending on the stage of the pipeline.
- baseband_merge: merges the beamformed files output by all the beamforming replicas.
- baseband_analysis: performs analysis on the merged file (MCMC localization, waterfalling, etc.).
- baseband_finalize: cleans up the metadata, creates a final diagnostic plot, and uploads the results to the database.
Each service communicates with the buckets server, which holds a JSON-encoded workload for each stage of the pipeline. Each service runs persistently, repeating the following steps:
- Withdraw work from its bucket.
- If work is found:
  - Spawn the pipeline inside the container using subprocess.Popen(cluster_cli.py) and write its stdout and stderr to a log file on the archiver for that event.
- Sleep for a while and repeat the cycle.
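The cycle above can be sketched roughly as follows. This is a minimal illustration, not the actual service code: `withdraw` is a hypothetical callable standing in for the bucket client, `command` stands in for the cluster_cli.py invocation, and the per-event log file name is an assumption.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, List, Optional


def run_once(
    withdraw: Callable[[], Optional[dict]],
    command: List[str],
    log_dir: Path,
) -> Optional[int]:
    """Run one cycle: withdraw work and, if any is found, run the pipeline.

    Returns the exit code of the spawned process, or None if no work was found.
    """
    work = withdraw()  # hypothetical stand-in for the bucket client
    if work is None:
        return None

    event_number = str(work["event_number"])
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"{event_number}.log"  # assumed naming scheme

    # Append both stdout and stderr of the child process to the event log.
    with open(log_file, "a") as log:
        process = subprocess.Popen(
            command + [event_number], stdout=log, stderr=subprocess.STDOUT
        )
        return process.wait()


def serve(withdraw, command, log_dir, poll_seconds=30.0, max_cycles=None):
    """Repeat run_once, sleeping between cycles; max_cycles=None loops forever."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_once(withdraw, command, log_dir)
        time.sleep(poll_seconds)
        cycles += 1
```

Waiting on the child process before the next withdrawal means each container handles one work item at a time, which is consistent with the one-job-per-replica layout described here, though the real services may differ.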
Service | cores, RAM, containers | bucket name | work payload |
---|---|---|---|
baseband_pipeline_manager | 1, 0.5G, 1 | baseband-pipeline | {"event_number": 12345, "priority[OPTIONAL]": "low/medium/high", "run_stage[OPTIONAL]": "all/refinement/localization/singlebeam"} |
baseband_initialize | 2, 1G, 1 | baseband-initialize | {"event_number": 12345} |
baseband_beamform | 2, [10-16]G, 32 | baseband-beamform | {"event_number": 12345, "job_id": 1, "baseband_pipeline": "singlebeam/localization/refinement", "parameters[OPTIONAL]": {"ra": 123,...}} |
baseband_merge | 2, 16G, 1 | baseband-merge | {"event_number": 12345, "baseband_pipeline": "singlebeam/localization/refinement"} |
baseband_analysis | 2, 16G, 1 | baseband-analysis | {"event_number": 12345, "baseband_pipeline": "singlebeam/localization/refinement"} |
baseband_finalize | 2, 1G, 1 | baseband-finalize | {"event_number": 12345} |
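As an illustration of the baseband-pipeline payload in the table, the sketch below builds and checks a work dictionary before it is deposited. The helper name `make_pipeline_work` is hypothetical and not part of any CHIME/FRB package; the allowed values are taken from the table.

```python
import json

# Allowed values, as listed in the work payload column of the table.
PRIORITIES = {"low", "medium", "high"}
RUN_STAGES = {"all", "refinement", "localization", "singlebeam"}


def make_pipeline_work(event_number, priority=None, run_stage=None):
    """Build a payload for the baseband-pipeline bucket (hypothetical helper)."""
    if isinstance(event_number, bool) or not isinstance(event_number, int):
        raise ValueError("event_number must be an integer")
    work = {"event_number": event_number}
    if priority is not None:
        if priority not in PRIORITIES:
            raise ValueError(f"priority must be one of {sorted(PRIORITIES)}")
        work["priority"] = priority
    if run_stage is not None:
        if run_stage not in RUN_STAGES:
            raise ValueError(f"run_stage must be one of {sorted(RUN_STAGES)}")
        work["run_stage"] = run_stage
    return work


# The payload is plain JSON, so it serializes directly.
print(json.dumps(make_pipeline_work(12345, priority="high", run_stage="localization")))
```

Validating on the client side catches typos such as "hihg" before the work reaches the bucket.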
How do you start the baseband pipeline cluster?
Go to https://frb.chimenet.ca/ops/operations and click on Baseband Pipeline.
How do you monitor the baseband pipeline while it is running?
Go to https://frb.chimenet.ca/frb-web/baseband-pipeline
How do you monitor the baseband pipeline cluster?
You need to be frbadmin for detailed inspection.
[chitrang@frb-vsop ~]$ docker service ls | grep baseband
ypepapebrtv4 baseband_analysis replicated 1/1 chimefrb/baseband-localization:cluster
ggxy9laxbx0p baseband_beamform replicated 32/32 chimefrb/baseband-localization:cluster
cpfgedzo7eed baseband_finalize replicated 1/1 chimefrb/baseband-localization:cluster
ddyjwizkaayq baseband_initialize replicated 1/1 chimefrb/baseband-localization:cluster
pnwbypxqfg0f baseband_merge replicated 1/1 chimefrb/baseband-localization:cluster
3a4qyveqjn1b baseband_pipeline_manager replicated 1/1 chimefrb/baseband-localization:cluster
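If you want to check the replica counts programmatically, a listing like the one above can be parsed with a few lines of Python. This is only a sketch: the function name `degraded_services` is made up for illustration, and it assumes the default column order of docker service ls (ID, NAME, MODE, REPLICAS, IMAGE).

```python
import re

# Matches a REPLICAS column entry such as "32/32".
REPLICAS = re.compile(r"^(\d+)/(\d+)$")


def degraded_services(listing: str, prefix: str = "baseband") -> dict:
    """Return {service_name: (running, desired)} for under-replicated services."""
    degraded = {}
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and services outside the pipeline.
        if len(fields) < 4 or not fields[1].startswith(prefix):
            continue
        match = REPLICAS.match(fields[3])
        if match is None:
            continue
        running, desired = int(match.group(1)), int(match.group(2))
        if running < desired:
            degraded[fields[1]] = (running, desired)
    return degraded
```

The listing text could be captured with, for example, subprocess.run(["docker", "service", "ls"], capture_output=True, text=True).stdout on a node where you have the required permissions.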
How do you run the pipeline on an event?
To run the pipeline on an event, deposit its event number into the baseband-pipeline bucket. This can be done from anywhere you have your FRB Master access and refresh tokens. You can also assign a priority to each event (the default is low). Events with high priority are analyzed first, once the event currently being processed has finished.
In [2]: import chime_frb_api
In [3]: buck = chime_frb_api.bucket.Bucket(base_url="https://frb.chimenet.ca/maestro/buckets")
In [4]: work = {"event_number": 195718061}  # Optionally you can add "priority": "low/medium/high", "run_stage": "all/localization/refinement/singlebeam"
In [5]: buck.deposit('baseband-pipeline', work=work, priority="low")
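The priority behaviour described above can be pictured with a small local model. This is not how the buckets server is implemented and it does not use chime_frb_api; `PriorityBucket` is a hypothetical class that only illustrates the ordering: higher priority first, and first-in, first-out within the same priority (the tie-breaking rule is an assumption).

```python
import heapq
import itertools

# Lower rank is withdrawn first.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


class PriorityBucket:
    """Toy in-memory bucket that orders work by priority, then arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # arrival order, used as tie-breaker

    def deposit(self, work, priority="low"):
        entry = (PRIORITY_RANK[priority], next(self._counter), work)
        heapq.heappush(self._heap, entry)

    def withdraw(self):
        """Return the next work item, or None if the bucket is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


bucket = PriorityBucket()
bucket.deposit({"event_number": 1}, priority="low")
bucket.deposit({"event_number": 2}, priority="high")
bucket.deposit({"event_number": 3}, priority="medium")
print([bucket.withdraw()["event_number"] for _ in range(3)])  # prints [2, 3, 1]
```

Because a withdrawal only happens when a service asks for new work, an event that is already being processed is never interrupted by a later high-priority deposit.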
How do you get pipeline logs for an event?
The logs for the pipeline are in /data/chime/baseband/processed/YYYY/MM/DD/astro_<EVENT_NO>/logs/.
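For scripting, the log directory can be assembled from the date and the event number. The sketch below assumes that YYYY/MM/DD refers to the date of the event and that the date components are zero-padded; the helper name `event_log_dir` is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath

# Root of the processed baseband data, as given in the path above.
PROCESSED_ROOT = PurePosixPath("/data/chime/baseband/processed")


def event_log_dir(event_date: date, event_number: int) -> PurePosixPath:
    """Return the log directory for an event (assumes zero-padded date parts)."""
    return (
        PROCESSED_ROOT
        / f"{event_date:%Y/%m/%d}"
        / f"astro_{event_number}"
        / "logs"
    )


print(event_log_dir(date(2021, 3, 5), 12345))
# prints /data/chime/baseband/processed/2021/03/05/astro_12345/logs
```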