CHIME/FRB Guidelines when using Python¶
Here we assume that you have a working knowledge of Python and can produce basic scripts. If this is not the case, please refer to website.
Documentation¶
In Code¶
For in-code documentation we follow the standards developed by the NumPy community. Some of these standards are shared below, but for an in-depth read, check out their official guide.
DocStrings¶
A documentation string (docstring) is a string that describes a module, function, class, or method definition. A docstring should include the following components:
- A short summary
- An optional extended summary
- Parameters
- Attributes - required by class __init__
- Methods - maps to exposed functions for class
- Returns or Yields for generators
- Raises - if required
- Notes (optional)
- References (optional)
- Examples (optional)
Example
class Foo(object):
    """Short description of the class

    Optional longer description of the class

    Attributes
    ----------
    x : float
        The X coordinate
    y : float
        The Y coordinate

    Methods
    -------
    bar(var1: list, var2: int, var3: str ='hi', *args, **kwargs)
        Some small description of bar
    """

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def bar(self, var1: list, var2: int, var3: str = "hi", *args, **kwargs):
        r"""Short summary of the code

        Several sentences providing an extended description. Refer to
        variables using back-ticks, e.g. `var`.

        Parameters
        ----------
        var1 : array_like
            Array_like means all those objects -- lists, nested lists, etc. --
            that can be converted to an array. We can also refer to
            variables like `var1`.
        var2 : int
            The type above can either refer to an actual Python type
            (e.g. ``int``), or describe the type of the variable in more
            detail, e.g. ``(N,) ndarray`` or ``array_like``.
        var3 : {'hi', 'ho'}, optional
            Choices in brackets, default first when optional.
        *args : iterable
            Other arguments.
        **kwargs : dict
            Keyword arguments.

        Returns
        -------
        describe : type
            Explanation of return value named `describe`.
        out : type
            Explanation of `out`.

        Raises
        ------
        BadException
            Because you shouldn't have done that.

        Notes
        -----
        Notes about the implementation algorithm (if needed).
        This can have multiple paragraphs.
        You may include some math:

        .. math:: X(e^{j\omega } ) = x(n)e^{ - j\omega n}

        And even use a Greek symbol like :math:`\omega` inline.

        References
        ----------
        Cite the relevant literature, e.g. [1]_. You may also cite these
        references in the notes section above.

        .. [1] O. McNoleg, "The integration of GIS, remote sensing,
           expert systems and adaptive co-kriging for environmental habitat
           modelling of the Highland Haggis using object-oriented, fuzzy-logic
           and neural-network techniques," Computers & Geosciences, vol. 22,
           pp. 585-588, 1996.

        Examples
        --------
        These are written in doctest format, and should illustrate how to
        use the function.

        >>> a = [1, 2, 3]
        >>> print([x + 3 for x in a])
        [4, 5, 6]
        """
        pass
Comments¶
Always make sure that comments are up-to-date with changes in the code, as comments that contradict the code are worse than no comments at all!
Comments should be written in complete sentences with the first word capitalised. Block comments generally consist of one or more paragraphs built from complete sentences ending in a full stop. You should use two spaces after a sentence-ending full stop in multi-sentence comments, except for after the final sentence.
Block comments¶
These apply to some or all of the code that follows them, and are indented to the same level as the code they are for. Each line starts with a hash followed by a single space (# ). Paragraphs inside block comments are separated by a line containing a single hash.
Correct
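A minimal sketch of a block comment that follows these rules. The function and its name are hypothetical and exist only to give the comment something to describe:

```python
def mean_value(values):
    # Guard against an empty list before dividing.  Returning zero keeps
    # the caller simple, at the cost of hiding the empty case.
    #
    # A second paragraph is separated from the first by a line that
    # contains a single hash.
    if not values:
        return 0.0
    return sum(values) / len(values)
```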
Inline comments¶
These should be used sparingly. Inline comments are placed on the same line as the code they describe. They should be separated from the code by at least two spaces and start with a hash and a single space (# ).
Inline comments are distracting and so should not be used if they state the obvious.
But sometimes, this is useful:
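A short sketch of the difference, assuming the intended examples resemble those in PEP 8. The variable names are hypothetical. The first comment only restates the code, while the second explains why the adjustment is made:

```python
width = 10
border = 1

width = width + 1  # Increment width.
width = width + border  # Compensate for the border.
```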
Style Conventions¶
Import Statements¶
Use import statements for packages and modules only, not for individual classes or functions. Note that there is an explicit exemption for imports from the typing module.
Correct
import x  # For importing packages and modules.
from x import y  # Where x is the package prefix and y is the module name with no prefix.
from x import y as z  # If two modules named y are to be imported or if y is inconveniently long.
import y as z  # Only when z is a standard abbreviation, e.g. import numpy as np.
Imports should usually be on separate lines:
However, it is okay to do this:
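A sketch of both cases, assuming the intended examples mirror PEP 8. The modules are from the standard library and are chosen only for illustration. Separate modules get separate lines, while several names from one module may share a line (here the typing module, which is exempt from the modules-only rule above):

```python
# Separate modules are imported on separate lines.
import os
import sys

# Several names from a single module may share one line.
from typing import Dict, List


def path_lengths(paths: List[str]) -> Dict[str, int]:
    """Map each path to the length of its final component."""
    return {path: len(os.path.basename(path)) for path in paths}
```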
Imports are always put at the top of the file, just after module comments and docstrings, but before constants. Imports should be grouped in the following order:
1. Standard library imports.
2. Related third party imports.
3. Local application/library specific imports.
A blank line should be placed between groups.
Naming Convention¶
When naming anything, the names that you give should be meaningful and descriptive. Avoid using throwaway names such as "temp". Names should also follow the naming convention below:
package_name
module_name
ClassName
method_name
ExceptionName
function_name
CONSTANT_NAME
var_name
function_parameter_name
local_var_name
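The convention can be sketched in a single hypothetical module. Every name below is invented for illustration, and the trailing comments point back to the entry in the list that each name follows:

```python
DEFAULT_SNR_THRESHOLD = 8.0  # CONSTANT_NAME


class InvalidEventError(Exception):  # ExceptionName
    """Raised when an event fails validation."""


class EventFilter:  # ClassName
    """Keep events at or above a signal-to-noise threshold."""

    def __init__(self, snr_threshold=DEFAULT_SNR_THRESHOLD):
        self.snr_threshold = snr_threshold

    def keep_event(self, event_snr):  # method_name, function_parameter_name
        if event_snr < 0:
            raise InvalidEventError("SNR must be non-negative")
        return event_snr >= self.snr_threshold


def count_kept_events(snr_values, snr_threshold):  # function_name
    event_filter = EventFilter(snr_threshold)  # local_var_name
    return sum(1 for snr in snr_values if event_filter.keep_event(snr))
```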
Type Annotations¶
Get to know PEP 484. You are not required to annotate all the functions in a module, but:
- Use judgement to get to a good balance between safety and clarity on the one hand, and flexibility on the other.
- Annotate code that is prone to type-related errors (previous bugs or complexity).
- Annotate code that is hard to understand.
- Annotate code as it becomes stable from a types perspective. In many cases, you can annotate all the functions in mature code without losing too much flexibility.
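As a small illustration of these points, the hypothetical function below may return nothing at all, which is the kind of behaviour that tends to cause type-related errors and so benefits from an annotation. The function name is an assumption made for this sketch:

```python
from typing import List, Optional


def brightest_event(snr_values: List[float]) -> Optional[float]:
    """Return the largest value, or None if the list is empty."""
    if not snr_values:
        return None
    return max(snr_values)
```

The Optional return type tells callers (and type checkers) that they have to handle the empty case explicitly.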
Indentation¶
Indentations should use four spaces per indentation level.
Correct
# Aligned with opening delimiter:
foo = long_function_name(var_one, var_two,
                         var_three, var_four)
# Add 4 spaces (extra indentation level) to distinguish args
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)
# Hanging indents should add a level
foo = long_function_name(
    var_one, var_two,
    var_three, var_four)
Wrong
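A sketch of indentation to avoid, assuming the intended examples follow PEP 8. The code is valid Python but hard to read. The argument values are placeholders added so that the snippet runs:

```python
var_one, var_two, var_three, var_four = 1, 2, 3, 4


# Further indentation is required here, as the arguments are not
# distinguishable from the function body.
def long_function_name(
    var_one, var_two, var_three,
    var_four):
    return var_one + var_two + var_three + var_four


# Arguments on the first line are forbidden when not using vertical
# alignment.
foo = long_function_name(var_one, var_two,
    var_three, var_four)
```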
The closing brace/bracket/parenthesis on multi-line constructs may be lined up with the first non-whitespace character of the last line:
or it may be lined up under the first character of the line that starts the multi-line construct:
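Both options are sketched here with placeholder values, in the order in which they are described:

```python
# Closing bracket lined up with the first non-whitespace character
# of the last line.
my_list = [
    1, 2, 3,
    4, 5, 6,
    ]

# Closing parenthesis lined up under the first character of the line
# that starts the multi-line construct.
largest = max(
    1, 2, 3,
    4, 5, 6,
)
```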
Testing¶
Writing test code alongside your code, and running the tests as you go, is best practice. It can also be a useful tool to help define your code's purpose more precisely.
General guidelines:
- Each test unit should focus on a tiny bit of functionality and prove it correct.
- Each test should run independently, regardless of the order that they are called in. The implication of this rule is that each test must be loaded with a fresh dataset and may have to do some cleanup afterwards.
- Try hard to write tests that run fast, on the order of a few milliseconds.
- Always run the full test suite both before and after a coding session. This will give you confidence that you did not break anything.
- Use hooks that run tests before pushing to a repository.
- Long test function names are fine, as they are never called explicitly. test_square_of_number2() and test_square_negative_number2() are fine and descriptive, which is important as test names are displayed when a test fails.
- As with all of your code, write descriptive and clear comments.
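A few of these guidelines can be sketched together. The square() function and the make_dataset() helper are hypothetical. The test names are the ones mentioned above, and each test builds the data it needs so that it does not depend on any other test:

```python
def square(number):
    return number * number


def make_dataset():
    # Build a fresh list on every call so that no test can see changes
    # made by another test.
    return [1, 2, 3]


def test_square_of_number2():
    assert square(2) == 4


def test_square_negative_number2():
    assert square(-2) == 4


def test_square_of_extended_dataset():
    dataset = make_dataset()
    dataset.append(4)
    assert [square(number) for number in dataset] == [1, 4, 9, 16]


def test_dataset_is_fresh_for_each_test():
    # Passes regardless of whether the test above has already run.
    assert make_dataset() == [1, 2, 3]
```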
Testing with pytest¶
Test files are located in the tests directory and have either the test_ prefix or the _test suffix. Within the test file, all test functions should start with test_.
The simplest type of test is an assertion, where we assert that something is true. To see this in action, create a new test file called test_sample.py, paste in the following code and save the file.
Simple test
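A minimal sketch of such a file, reconstructed from the description that follows, so the exact layout is an assumption. It is arranged so that the assertion sits on line 6, matching the pytest output further down:

```python
def func(x):
    return x + 1


def test_answer():
    assert func(3) == 4
```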
Here we have a simple function, func(), which takes an integer and adds one to it. The test is the function below it, test_answer(), which passes the value three into func() and expects an output of four. When we start the test by running pytest in the same directory, we get the following output:
============================= test session starts ==============================
platform darwin -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/name/dir
collected 1 item
test_sample.py . [100%]
============================== 1 passed in 0.03s ===============================
You can see that the function works as intended and the test passes. If we now change line 6 to falsely assert that the result should equal 5, the test fails and we receive the following output:
============================= test session starts ==============================
platform darwin -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/name/dir
collected 1 item
test_sample.py F [100%]
=================================== FAILURES ===================================
_________________________________ test_answer __________________________________
def test_answer():
> assert func(3) == 5
E assert 4 == 5
E + where 4 = func(3)
test_sample.py:6: AssertionError
=========================== short test summary info ============================
FAILED test_sample.py::test_answer - assert 4 == 5
============================== 1 failed in 0.10s ===============================
pytest tells us exactly where the error occurred, and the summary shows that the test failed because we tried to assert that four equals five.
Multiple tests in a file¶
Using pytest, it is also possible to place multiple tests in a single file. You may also want to group them in a class, as done below.
Grouping tests in a Class
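A minimal sketch of grouped tests. The class and test names are hypothetical. With its default settings, pytest collects classes whose names start with Test, along with the methods inside them that start with test_:

```python
class TestStringMethods:
    def test_upper(self):
        assert "frb".upper() == "FRB"

    def test_contains(self):
        assert "h" in "this"
```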
Danger
Be aware though that each test has a unique instance of the class. Try to foresee what will happen in the example test below, then run the test.
Test fixtures¶
Test functions can request fixtures they require by declaring them as arguments. When pytest runs a test, it looks at the parameters in that test function, and searches for fixtures that have the same names as those parameters. Once pytest finds them, it runs those fixtures, captures what they returned (if anything), and passes those objects into the test function as arguments.
Essentially, the pytest fixture system gives the ability to define a generic setup step that can be reused over and over, just like a normal function would be used. Two different tests can request the same fixture and have pytest give each test their own result from that fixture.
This is extremely useful for making sure tests aren’t affected by each other. We can use this system to make sure each test gets its own fresh batch of data and is starting from a clean state so it can provide consistent, repeatable results.
Fixture example
import pytest


class Fruit:
    def __init__(self, name):
        self.name = name
        self.cubed = False

    def cube(self):
        self.cubed = True


class FruitSalad:
    def __init__(self, *fruit_bowl):
        self.fruit = fruit_bowl
        self._cube_fruit()

    def _cube_fruit(self):
        for fruit in self.fruit:
            fruit.cube()


# Arrange
@pytest.fixture
def fruit_bowl():
    return [Fruit("apple"), Fruit("banana")]


def test_fruit_salad(fruit_bowl):
    # Act
    fruit_salad = FruitSalad(*fruit_bowl)

    # Assert
    assert all(fruit.cubed for fruit in fruit_salad.fruit)
Parameterising of test arguments¶
A built-in decorator, pytest.mark.parametrize, enables the parameterisation of arguments for a test function. This allows a compact way of testing various argument values in a single test.
Parameterisation example
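A minimal sketch, assuming pytest is installed. The func() function and the argument names are hypothetical. pytest runs test_func once for each tuple in the list:

```python
import pytest


def func(x):
    return x + 1


@pytest.mark.parametrize("value, expected", [(3, 4), (0, 1), (-1, 0)])
def test_func(value, expected):
    assert func(value) == expected
```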
Note that it is also possible to combine multiple parameterised arguments by stacking parametrize decorators in order to work through all combinations of their values. The example below will set the values of (x,y) in the following order: (0,2), (1,2), (0,3), and (1,3).
Stacking parameterised decorators
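A sketch of stacked decorators, following the pattern shown in the pytest documentation and assuming pytest is installed. The assertion is a placeholder that holds for every combination:

```python
import pytest


@pytest.mark.parametrize("x", [0, 1])
@pytest.mark.parametrize("y", [2, 3])
def test_foo(x, y):
    assert x < y
```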
Code Test Coverage¶
This is an important metric to help you understand how well tested your software is. There are multiple criteria that can be used to determine if your code was tested. A few examples include: function coverage, branch coverage, and line coverage. It is important that when working on a repository, you try not to decrease the test coverage with your pull request.
Installing the pytest-cov coverage tool
To measure the coverage of our code we can use pytest-cov, which can be installed by running pip install pytest-cov.
Below is an example script and corresponding tests:
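The script is sketched here as a hypothetical calculator.py. Only the Calculator name and its four methods are taken from the tests. The bodies, including the zero check in divide(), are assumptions:

```python
class Calculator:
    """Basic arithmetic on two numbers."""

    @staticmethod
    def add(a, b):
        return a + b

    @staticmethod
    def subtract(a, b):
        return a - b

    @staticmethod
    def multiply(a, b):
        return a * b

    @staticmethod
    def divide(a, b):
        # None of the tests pass a zero divisor, so this branch is the
        # kind of line a coverage report flags as missed.
        if b == 0:
            raise ZeroDivisionError("Cannot divide by zero.")
        return a / b
```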
from ..calculator import Calculator


def test_add():
    assert Calculator.add(1, 2) == 3.0
    assert Calculator.add(1.0, 2.0) == 3.0
    assert Calculator.add(0, 2.0) == 2.0
    assert Calculator.add(2.0, 0) == 2.0
    assert Calculator.add(-4, 2.0) == -2.0


def test_subtract():
    assert Calculator.subtract(1, 2) == -1.0
    assert Calculator.subtract(2, 1) == 1.0
    assert Calculator.subtract(1.0, 2.0) == -1.0
    assert Calculator.subtract(0, 2.0) == -2.0
    assert Calculator.subtract(2.0, 0.0) == 2.0
    assert Calculator.subtract(-4, 2.0) == -6.0


def test_multiply():
    assert Calculator.multiply(1, 2) == 2.0
    assert Calculator.multiply(1.0, 2.0) == 2.0
    assert Calculator.multiply(0, 2.0) == 0.0
    assert Calculator.multiply(2.0, 0.0) == 0.0
    assert Calculator.multiply(-4, 2.0) == -8.0


def test_divide():
    assert Calculator.divide(1, 2) == 0.5
    assert Calculator.divide(1.0, 2.0) == 0.5
    assert Calculator.divide(0, 2.0) == 0
    assert Calculator.divide(-4, 2.0) == -2.0
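Copy and paste the above script and tests into files with the corresponding names (calculator.py and test_calculator.py). To run the tests and see their coverage, run pytest with the --cov option provided by pytest-cov, for example pytest --cov=. from the directory containing the files.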
Result
============================= test session starts ==============================
platform darwin -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/name/dir
plugins: cov-3.0.0
collected 2 items
test_calculator.py .... [100%]
---------- coverage: platform darwin, python 3.9.12-final-0 ----------
Name Stmts Miss Cover
--------------------------------------------------
calculator.py 11 1 91%
test_calculator.py 25 0 100%
--------------------------------------------------
TOTAL 36 1 97%
============================== 2 passed in 0.11s ===============================
Codecov
An additional tool for visualising code coverage and integrating it into your repositories is Codecov. To familiarise yourself with it, I recommend using their tutorial, which can be found here.
Further reading for testing¶
For further information on using pytest, please refer to the documentation.
Practice what you have learnt here¶
There is an interactive repository designed to help you digest and practice the skills that you have learnt here. Navigate to the repository and follow the instructions: https://github.com/tjzegmott/python-style_guide