Design¶
Prefer a usable API to simple implementation¶
Providing a user-friendly API to many people is more important than saving some of our own time during implementation.
Prefer named functions to overloaded operators¶
The names can better convey the intention of the programmer and enable better auto-completion.
Prefer methods to global functions¶
This aids discoverability and again enables better auto-completion. It also supports polymorphism.
Prefer separate functions to functions with a flag parameter¶
Some flag parameters drastically alter the semantics of a function. This can lead to confusion, and, if the parameter is optional, to errors if the default value is kept unknowingly. In such cases having two separate functions is preferable.
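As a minimal sketch of this guideline, assume a hypothetical `Table` class that stores rows as dictionaries. Neither the class nor the method names `keep_rows` and `remove_rows` come from the library; they only illustrate how two functions can replace one function with a semantics-changing flag.

```python
from __future__ import annotations

from collections.abc import Callable

Row = dict[str, object]


class Table:
    """Hypothetical table type, used only for illustration."""

    def __init__(self, rows: list[Row]) -> None:
        self.rows = [dict(row) for row in rows]

    # Avoided design: filter_rows(predicate, invert=False), where the flag
    # silently flips the meaning of the call.

    def keep_rows(self, predicate: Callable[[Row], bool]) -> Table:
        """Return a new table with only the rows that match the predicate."""
        return Table([row for row in self.rows if predicate(row)])

    def remove_rows(self, predicate: Callable[[Row], bool]) -> Table:
        """Return a new table without the rows that match the predicate."""
        return Table([row for row in self.rows if not predicate(row)])
```

With two names, the call site states the intent directly, and no default value can be kept by accident.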
Return copies of objects¶
Modifying objects in-place can lead to surprising behaviour and hard-to-find bugs. Methods shall never change the object they're called on or any of their parameters.
DO (library code):
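The original example is not reproduced here. As a stand-in sketch, assume a hypothetical `Column` class with a `rename` method (both names are illustrative only):

```python
from __future__ import annotations


class Column:
    """Hypothetical column type, used only for illustration."""

    def __init__(self, name: str, values: list[float]) -> None:
        self.name = name
        self.values = list(values)  # defensive copy of the caller's list

    def rename(self, new_name: str) -> Column:
        # Build and return a new object; `self` stays untouched.
        return Column(new_name, self.values)
```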
The corresponding docstring should explicitly state that a method returns a copy:
DO (library code):
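The original example is not reproduced here. The sketch below reuses a hypothetical `Column` class and assumes a numpydoc-like docstring layout; the exact docstring convention is an assumption, the point is the explicit sentence about the original object.

```python
from __future__ import annotations


class Column:
    """Hypothetical column type, used only for illustration."""

    def __init__(self, name: str, values: list[float]) -> None:
        self.name = name
        self.values = list(values)

    def rename(self, new_name: str) -> Column:
        """
        Return a new column with the given name.

        The original column is not modified.

        Parameters
        ----------
        new_name:
            The name of the returned column.

        Returns
        -------
        renamed_column:
            A copy of this column with the new name.
        """
        return Column(new_name, self.values)
```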
Avoid uncommon abbreviations¶
Write full words rather than abbreviations. The increased verbosity is offset by better readability, more useful auto-completion, and a reduced need to consult the documentation when writing code. Common abbreviations like CSV or HTML and well-known mathematical terms like min or max are fine, since they rarely require explanation.
Place more important parameters first¶
Parameters that are more important to the user should be placed first. This also applies to keyword-only parameters, since they still have a fixed order in the documentation. In particular, parameters of model constructors should have the following order:
- Model hyperparameters (e.g., the number of trees in a random forest)
- Algorithm hyperparameters (e.g., the learning rate of a gradient boosting algorithm)
- Regularization hyperparameters (e.g., the maximum depth of a decision tree)
- Other parameters (e.g., the random seed)
DO (library code):
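The original example is not reproduced here. As a sketch, assume a hypothetical `GradientBoosting` model whose parameter names are invented for illustration; the comments map each parameter to the categories listed above.

```python
from __future__ import annotations


class GradientBoosting:
    """Hypothetical model, used only to illustrate parameter order."""

    def __init__(
        self,
        *,
        tree_count: int = 100,  # model hyperparameter
        learning_rate: float = 0.1,  # algorithm hyperparameter
        maximum_depth: int = 3,  # regularization hyperparameter
        random_seed: int | None = None,  # other parameter
    ) -> None:
        self.tree_count = tree_count
        self.learning_rate = learning_rate
        self.maximum_depth = maximum_depth
        self.random_seed = random_seed
```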
DON'T (library code):
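The original example is not reproduced here. The sketch below shows the same hypothetical `GradientBoosting` constructor with an order that hides the important parameters behind less important ones:

```python
from __future__ import annotations


class GradientBoosting:
    """Hypothetical model, used only to illustrate a poor parameter order."""

    def __init__(
        self,
        *,
        random_seed: int | None = None,  # other parameter listed first
        maximum_depth: int = 3,  # regularization hyperparameter
        learning_rate: float = 0.1,  # algorithm hyperparameter
        tree_count: int = 100,  # model hyperparameter listed last
    ) -> None:
        self.random_seed = random_seed
        self.maximum_depth = maximum_depth
        self.learning_rate = learning_rate
        self.tree_count = tree_count
```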
Consider marking optional parameters as keyword-only¶
Keyword-only parameters are parameters that can only be passed by name. Marking a parameter as keyword-only prevents users from accidentally passing a value to the wrong parameter, which can happen easily if several parameters have the same type. It also allows us to change the order of parameters without breaking client code. Because of this, strongly consider marking optional parameters as keyword-only. In particular, optional hyperparameters of models should be keyword-only.
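A minimal sketch of the syntax, assuming a hypothetical helper `split_row_count` with two optional parameters of the same type. Everything after the bare `*` in the signature can only be passed by name.

```python
from __future__ import annotations


def split_row_count(
    row_count: int,
    *,
    training_ratio: float = 0.8,
    validation_ratio: float = 0.1,
) -> tuple[int, int, int]:
    """Hypothetical helper: split a row count into training, validation, and test sizes."""
    training = int(row_count * training_ratio)
    validation = int(row_count * validation_ratio)
    return training, validation, row_count - training - validation
```

A call such as `split_row_count(100, 0.5, 0.25)` raises a `TypeError`, so the two ratios cannot be swapped silently; the caller has to write `split_row_count(100, training_ratio=0.5, validation_ratio=0.25)`.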
Specify types of parameters and results¶
Use type hints to describe the types of parameters and results of functions. This enables static type checking of client code.
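For illustration, a small hypothetical function with type hints on its parameter and result. The function itself is an assumption and not part of the library.

```python
def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)
```

A static type checker can then flag client code such as `mean("1, 2, 3")` before it runs.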
Use narrow data types¶
Use data types that can accurately model the legal values of a declaration. This improves static detection of wrong client code.
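One way to apply this, sketched under the assumption of a hypothetical `impute` function: an `Enum` lists exactly the legal strategies, whereas a plain `str` parameter would also accept misspelled or unsupported values.

```python
from __future__ import annotations

import statistics
from enum import Enum


class ImputationStrategy(Enum):
    """Hypothetical set of legal strategies, used only for illustration."""

    MEAN = "mean"
    MEDIAN = "median"


def impute(values: list[float | None], strategy: ImputationStrategy) -> list[float]:
    """Return a new list in which missing values are replaced."""
    known = [value for value in values if value is not None]
    if strategy is ImputationStrategy.MEAN:
        fill = statistics.mean(known)
    else:
        fill = statistics.median(known)
    return [fill if value is None else value for value in values]
```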
Check preconditions of functions and fail early¶
Not all preconditions of functions can be described with type hints; the remaining ones must be checked at runtime. This should be done as early as possible, usually right at the top of the function body. If a precondition fails, execution of the function should halt and either return a sensible value (if possible) or raise an exception with a descriptive message.
DO (library code):
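The original example is not reproduced here. As a sketch, assume a hypothetical function `take_first_rows` whose `count` parameter must not be negative, a constraint that the type hint `int` alone cannot express:

```python
from __future__ import annotations


def take_first_rows(rows: list[dict[str, object]], count: int) -> list[dict[str, object]]:
    """Return a new list with the first `count` rows."""
    # Check the precondition before doing any work.
    if count < 0:
        raise ValueError(f"count must be non-negative, but was {count}.")

    return rows[:count]
```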
Raise either Python exceptions or custom exceptions¶
The user should not have to deal with exceptions that are defined in the wrapped libraries. So, any exception that may be raised when a third-party function is called should be caught, and a core Python exception or a custom exception should be raised instead. The exception to this rule is a call to a callable created by the user: in this case, we just pass along any exceptions raised by this callable.
DO (library code):
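The original example is not reproduced here. In the sketch below, `ThirdPartyParseError` and `third_party_parse` are stand-ins for a wrapped library, and `InvalidCountError` and `parse_count` are hypothetical names for our own API.

```python
class ThirdPartyParseError(Exception):
    """Stand-in for an exception defined by a wrapped library."""


def third_party_parse(text: str) -> int:
    """Stand-in for a function of a wrapped library."""
    if not text.isdecimal():
        raise ThirdPartyParseError(text)
    return int(text)


class InvalidCountError(ValueError):
    """Custom exception that is part of our own API."""


def parse_count(text: str) -> int:
    try:
        return third_party_parse(text)
    except ThirdPartyParseError as error:
        # Translate the library-specific exception into one of our own.
        raise InvalidCountError(f"Expected a non-negative integer, but got {text!r}.") from error
```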
DON'T (library code):
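The original example is not reproduced here. The sketch below uses the same stand-in names (`ThirdPartyParseError`, `third_party_parse`) and shows the exception of the wrapped library escaping to the user:

```python
class ThirdPartyParseError(Exception):
    """Stand-in for an exception defined by a wrapped library."""


def third_party_parse(text: str) -> int:
    """Stand-in for a function of a wrapped library."""
    if not text.isdecimal():
        raise ThirdPartyParseError(text)
    return int(text)


def parse_count(text: str) -> int:
    # No translation: callers must know about ThirdPartyParseError.
    return third_party_parse(text)
```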
Group API elements by task¶
Packages should correspond to a specific task like classification or imputation. This eases discovery and makes it easy to switch between different solutions for the same task.
Group values that are used together into an object¶
Passing values that are commonly used together around separately is tedious, verbose, and error-prone. Group them into an object instead.
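A minimal sketch, assuming a hypothetical `ImageSize` class: instead of passing width, height, and channel count as three separate integers, they travel together as one immutable object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSize:
    """Hypothetical value object, used only for illustration."""

    width: int
    height: int
    channel_count: int


def pixel_count(size: ImageSize) -> int:
    """Return the number of pixels of an image with the given size."""
    return size.width * size.height
```

Compared with a signature like `pixel_count(width, height, channel_count)`, callers can no longer mix up the order of three values of the same type.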