Validation of Fitburst results¶
This guide will walk through the procedures needed to set up a validation workflow for fitburst results using the Workflow System.
1. Get a list of events to validate¶
All of the fitburst results are stored under /data/user-data/efonseca/runs/catalog_two/current_results/best_models/.
- Plots are named summary.{event_id}.png
- Products are named covariance_matrices_{event_id}.npz
- Results are named results_fitburst_{event_id}.json
To get the list of event IDs, run the following command in your home directory on frb-analysis:
ls /data/user-data/efonseca/runs/catalog_two/current_results/best_models/ | grep -o 'summary\.[^.]*' | awk -F'.' '{print $2}' | sort -u > ./fitburst_event_ids.txt
This will create a file fitburst_event_ids.txt in your home directory. You can then copy this file to your local machine using scp or rsync, or simply open it in vim/nano and copy the contents to your clipboard to paste into a local file.
2. Extract event results from json files¶
Now that we have the list of event IDs, we can read their corresponding JSON files and collect the contents in a text file. To do this, first create a bash script called fitburst_results.sh in your home directory on frb-analysis:
#!/bin/bash

# Input file containing a list of events
events_file="fitburst_event_ids.txt"

# Directory path where JSON files are located
json_files_directory="/data/user-data/efonseca/runs/catalog_two/current_results/best_models/"

# Output file to store event names and corresponding JSON content
output_file="event_results.txt"

# Remove existing output file if it exists
rm -f "$output_file"

# Process each event from fitburst_event_ids.txt
while IFS= read -r event || [[ -n "$event" ]]; do
    # Construct the file path
    file_path="${json_files_directory}results_fitburst_${event}.json"

    # Check if the file exists
    if [ -e "$file_path" ]; then
        # Read the JSON content
        json_content=$(cat "$file_path")

        # Write event name and JSON content to the output file
        echo "$event" >> "$output_file"
        echo "$json_content" >> "$output_file"
        echo "" >> "$output_file"  # Add an empty line for better readability

        echo "Processed event: $event"
    else
        echo "File not found for event: $event"
    fi
done < "$events_file"
Then, run the bash script:
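bash fitburst_results.sh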
This will create a file event_results.txt in your home directory. You can then copy this file to your local machine using scp or rsync.
How to copy files from frb-analysis to your local machine¶
There is a 2-step process to access frb-analysis from your local machine:
1. ssh into frb-vsop.chime
2. ssh into frb-analysis
Similarly, when you copy files from frb-analysis to your local machine, you need to do it in 2 steps:
1. Copy files from frb-analysis to frb-vsop.chime. To do that, you can use the following command on frb-vsop:
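scp <username>@frb-analysis:<source_file_path_on_analysis> <destination_file_path_on_vsop>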
2. Copy files from frb-vsop.chime to your local machine. Run the following command on your local machine:
scp <username>@frb-vsop.chime:<source_file_path_on_vsop> <destination_file_path_on_local_machine>
3. Format results into a .py file¶
Now that we have the results in a text file, we can format them into a .py file; the text editor in VSCode is very helpful here. The event_results.txt file should look like this:
event_id_1
{
    "model_parameters": {},
    "fit_statistics": {},
    "fit_logistics": {}
}

event_id_2
{
    "model_parameters": {},
    "fit_statistics": {},
    "fit_logistics": {}
}
First, change the file extension from .txt to .py (the import in the next step assumes the file is named event_results.py). Then, using the regex find-and-replace function in VSCode, format it into a .py file that looks like this (a sketch of suitable patterns follows the example below):
event_results = {
    "event_id_1": {
        "model_parameters": {},
        "fit_statistics": {},
        "fit_logistics": {}
    },
    "event_id_2": {
        "model_parameters": {},
        "fit_statistics": {},
        "fit_logistics": {}
    }
}
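As a rough sketch (assuming event IDs contain no whitespace and the file matches the layout above), two passes with VSCode's "Use Regular Expression" option enabled get most of the way there:

Find:    ^(\S+)\n\{
Replace: "$1": {

Find:    ^\}\n\n
Replace: },\n

Then wrap the whole file in event_results = { ... } by hand and remove the trailing comma after the last entry.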
4. Manually create Work objects and insert into Results Database backend¶
Now that we have the results in a .py file, we can manually create Work objects and insert them into the Results database backend. To do this, create a Python script called add_fitburst_results.py on your local machine, in the same folder as event_results.py:
from time import time

from bson import ObjectId
from chime_frb_api.modules.results import Results
from chime_frb_api.workflow import Work

# event_results.py is the file produced in step 3
from event_results import event_results

results = Results()
works_to_deposit = []
datetime = "20231218"
username = "siqiliu"
pipeline_name = "test-fitburst-classification"
event_workflow_id_mapping = {}

for event, work_results in event_results.items():
    # Pre-generate the workflow id so the plot path can reference it
    work_id = str(ObjectId())
    work = Work(pipeline=pipeline_name, user=username, site="chime")
    work.config = {"archive": {"results": True}}
    work.parameters = {"event_number": event}
    work.event = [event]
    work.status = "success"
    work.creation = time()
    work.start = time()
    work.stop = time()
    work.plots = [f"/data/chime/baseband/processed/workflow/{datetime}/{pipeline_name}/{work_id}/summary.{event}.png"]
    work.products = [f"covariance_matrices_{event}.npz"]
    work.results = work_results
    work.id = work_id
    works_to_deposit.append(work.payload)
    event_workflow_id_mapping[event] = work_id

results.deposit(works_to_deposit)
print(event_workflow_id_mapping)
Then, run the Python script:
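python add_fitburst_results.py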
You should now see the results in the Workflow Web at https://frb.chimenet.ca/workflow/results. The terminal will also print a dictionary mapping event IDs to workflow IDs. Copy this dictionary into a new Python script called copy_plots.py:
import os
import subprocess

# Source directory where the files are located
source_directory = "/data/user-data/efonseca/runs/catalog_two/current_results/best_models/"

# Destination root directory where you want to copy the files
destination_root_directory = "/data/chime/baseband/processed/workflow/20231218/test-fitburst-classification/"

event_id_mapping = {}  # Copy the dictionary printed by add_fitburst_results.py here

# Ensure the destination root directory exists
subprocess.call(['sudo', 'mkdir', '-p', destination_root_directory])

# Iterate over each event in the mapping
for event, work_id in event_id_mapping.items():
    source_file = f"summary.{event}.png"
    source_path = os.path.join(source_directory, source_file)
    destination_directory = os.path.join(destination_root_directory, work_id)
    destination_path = os.path.join(destination_directory, source_file)

    # Skip events whose plot is missing from the source directory
    if not os.path.isfile(source_path):
        print(f"File not found: {source_path}")
        continue

    # Ensure the per-event destination directory exists
    subprocess.call(['sudo', 'mkdir', '-p', destination_directory])

    # Copy the file to the destination directory
    return_code = subprocess.call(['sudo', 'cp', source_path, destination_path])
    if return_code == 0:
        print(f"Copied {source_file} to {destination_path}")
    else:
        print(f"Error copying {source_file} (exit code {return_code})")
The event_id_mapping should look like this:
event_id_mapping = {'99907418': '657feab5458ede32814116e5', '99862604': '657feab5458ede32814116e4'}
5. Copy the plots to the Workflow system compatible directory¶
We now need to copy the copy_plots.py script to frb-analysis and run it there. To do this, you can either copy the contents of the script, ssh into frb-analysis, and paste them into a new file called copy_plots.py, or transfer it using scp.
First, copy the script to frb-vsop.chime:
scp /local/path/to/copy_plots.py <username>@frb-vsop.chime:/home/<username>/copy_plots.py
Then, ssh into frb-vsop.chime and copy the script to frb-analysis (using the same hostname you ssh to):
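scp /home/<username>/copy_plots.py <username>@frb-analysis:/home/<username>/copy_plots.py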
Finally, ssh into frb-analysis and run the script:
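python copy_plots.py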
You should then see the plots in the Workflow Web at https://frb.chimenet.ca/workflow/results.