Fitting Algorithms
==================

Summary of Available Algorithms
-------------------------------

+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| Algorithm                   | Class            | Parallelization | Applications                                                              |
+=============================+==================+=================+===========================================================================+
| `Differential Evolution`_   | Population-based | Synchronous or  | General-purpose parameter fitting                                         |
|                             |                  | Asynchronous    |                                                                           |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Scatter Search`_           | Population-based | Synchronous     | General-purpose parameter fitting, especially difficult problems with high|
|                             |                  |                 | dimensions or many local minima                                           |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Particle Swarm`_           | Population-based | Asynchronous    | Fitting models with high variability in runtime                           |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Metropolis-Hastings MCMC`_ | Metropolis       | Independent     | Finding probability distributions of parameters (deprecated)              |
|                             | sampling         | Markov Chains   |                                                                           |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Simulated Annealing`_      | Metropolis       | Independent     | Problem-specific applications (deprecated)                                |
|                             | sampling         | Markov Chains   |                                                                           |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Parallel Tempering`_       | Metropolis       | Synchronized    | Finding probability distributions in challenging probability landscapes   |
|                             | sampling         | Markov Chains   | (deprecated)                                                              |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Simplex`_                  | Local search     | Synchronous     | Local optimization, or refinement of a result from another algorithm.     |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+
| `Adaptive MCMC`_            | Metropolis       | Independent     | Finding probability distributions in challenging probability landscapes   |
|                             | sampling         | Markov Chains   |                                                                           |
+-----------------------------+------------------+-----------------+---------------------------------------------------------------------------+

General implementation features for all algorithms
--------------------------------------------------

All algorithms in PyBNF keep track of a list of parameter sets (a "population"), and over the course of the fitting run, submit new parameter sets to be run on the simulator. Algorithms periodically output the file ``sorted_params.txt``, which contains the best parameter sets found so far and their corresponding objective function values.

Initialization
^^^^^^^^^^^^^^

The initial population of parameter sets is generated based on the key specified for each free parameter: ``uniform_var``, ``loguniform_var``, ``normal_var`` or ``lognormal_var``. The value of the parameter in each new random parameter set is drawn from the specified probability distribution.

The ``latin_hypercube`` option for initialization is enabled by default. This option only affects the initialization of ``uniform_var``\ s and ``loguniform_var``\ s. When enabled, instead of drawing an independent random value for each starting parameter set, the starting parameter sets are generated with Latin hypercube sampling, which ensures a roughly even distribution of the parameter sets throughout the search space.
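As a concrete illustration, the following is a minimal sketch of Latin hypercube sampling over a set of ``uniform_var`` ranges. The function and its interface are hypothetical, written for this example only; they are not part of PyBNF.

.. code-block:: python

    import numpy as np

    def latin_hypercube(n_sets, bounds, seed=None):
        """Draw n_sets parameter sets by Latin hypercube sampling.

        bounds: one (low, high) pair per uniform_var. For a
        loguniform_var, pass log10 bounds and raise 10 to the result.
        """
        rng = np.random.default_rng(seed)
        samples = np.empty((n_sets, len(bounds)))
        for j, (low, high) in enumerate(bounds):
            # One random point in each of n_sets equal-width strata,
            # shuffled so strata are paired randomly across parameters
            strata = (rng.permutation(n_sets) + rng.random(n_sets)) / n_sets
            samples[:, j] = low + strata * (high - low)
        return samples

    # e.g. 10 starting parameter sets for two uniform_vars
    population = latin_hypercube(10, [(10.0, 30.0), (0.0, 1.0)])

Because each parameter range is divided into ``n_sets`` strata and each stratum is sampled exactly once, no region of the search space is left entirely unsampled, unlike fully independent random draws.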
.. _objective:

Objective functions
^^^^^^^^^^^^^^^^^^^

All algorithms use an objective function to evaluate the quality of fit for each parameter set. The objective function is set with the ``objfunc`` key. The following options are available. Note that :math:`y_i` are the experimental data points, :math:`a_i` are the simulated data points, and the summation is over all experimental data points.

* Chi squared (``objfunc = chi_sq``): :math:`f(y, a) = \sum_i \frac{(y_i - a_i)^2}{2 \sigma_i^2}`, where :math:`\sigma_i` is the standard deviation of point :math:`y_i`, which must be specified in the :ref:`exp file `.
* Sum of squares (``objfunc = sos``): :math:`f(y, a) = \sum_i (y_i - a_i)^2`
* Sum of differences (``objfunc = sod``): :math:`f(y, a) = \sum_i |y_i - a_i|`
* Normalized sum of squares (``objfunc = norm_sos``): :math:`f(y, a) = \sum_i \frac{(y_i - a_i)^2}{y_i^2}`
* Average-normalized sum of squares (``objfunc = ave_norm_sos``): :math:`f(y, a) = \sum_i \frac{(y_i - a_i)^2}{\bar{y}^2}`, where :math:`\bar{y}` is the average of the entire data column :math:`y`.

If you include any :ref:`constraints ` in your fit, the constraints add extra terms to the objective function.
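To make these formulas concrete, here is a minimal NumPy sketch of the five objective functions. These are illustrative re-implementations of the formulas above, not PyBNF's internal code.

.. code-block:: python

    import numpy as np

    def chi_sq(y, a, sigma):
        # sigma_i is the standard deviation of point y_i (from the exp file)
        return np.sum((y - a) ** 2 / (2.0 * sigma ** 2))

    def sos(y, a):
        return np.sum((y - a) ** 2)

    def sod(y, a):
        return np.sum(np.abs(y - a))

    def norm_sos(y, a):
        # each residual is normalized by its own data point
        return np.sum((y - a) ** 2 / y ** 2)

    def ave_norm_sos(y, a):
        # every residual is normalized by the mean of the data column
        return np.sum((y - a) ** 2) / np.mean(y) ** 2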
Changing parameter values
^^^^^^^^^^^^^^^^^^^^^^^^^

All algorithms change parameter values as the fitting proceeds. The way these changes are calculated depends on the type of parameter. ``loguniform_var``\ s and ``lognormal_var``\ s are moved in logarithmic space (base 10) throughout the entire fitting run.

``uniform_var``\ s and ``loguniform_var``\ s avoid moving outside the defined initialization range. If a move is attempted that would take the parameter outside the bounds, the parameter value is reflected over the boundary, back within bounds. This feature can be disabled by appending ``U`` to the end of the variable definition (e.g., ``uniform_var = x__FREE 10 30 U``).

.. _alg-de:

Differential Evolution
----------------------

Algorithm
^^^^^^^^^

A population of individuals (points in parameter space) is iteratively evaluated with an objective function. Parent individuals from the current iteration are selected to form new individuals in the next iteration. The new individual's parameters are derived by combining parameters from the parents. New individuals are accepted into the population if they have an objective value lower than that of a member of the current population.

Parallelization
^^^^^^^^^^^^^^^

Three versions of differential evolution are available. All run in parallel, but they differ in their level of synchronicity.

Asynchronous differential evolution (``fit_type = ade``) never allows processors to sit idle: one new simulation is started every time a simulation completes. This version is the best choice when a large number of processors are available.

Synchronous differential evolution (``fit_type = de``) consists of discrete iterations. In each iteration, n simulations are run in parallel, and all must complete before moving on to the next iteration.

Island-based differential evolution [Penas2015]_ is a partially asynchronous algorithm. To use this version, set ``fit_type = de`` and set a value greater than 1 for the ``islands`` key. In this version, the current population consists of m islands, and each island is able to move on to the next iteration even if other islands are still in progress. If m is set to the number of available processors, then processors will never sit idle. Note, however, that this version might still underperform compared to the synchronous algorithm run on the same number of processors.

Implementation details
^^^^^^^^^^^^^^^^^^^^^^

We maintain a list of ``population_size`` current parameter sets, and in each iteration, ``population_size`` new parameter sets are proposed. The method used to propose a new parameter set is specified by the config key ``de_strategy``. The default setting, ``rand1``, works best for most problems and runs as follows: we choose 3 random parameter sets p1, p2, and p3 from the current population. For each free parameter P, the new parameter set is assigned the value p1[P] + ``mutation_factor`` * (p2[P] - p3[P]) with probability ``mutation_rate``, or p1[P] with probability 1 - ``mutation_rate``. The new parameter set replaces the parameter set with the same index in the current population if it has a lower objective value.

With a ``de_strategy`` of ``best1`` or ``best2``, we force the above p1 to be the parameter set with the lowest objective value. With a ``de_strategy`` of ``all1`` or ``all2``, we force p1 to be the parameter set at the same index we are proposing to replace. The ``best`` strategies result in fast convergence to what is likely only a local optimum. The ``all`` strategies converge more slowly and prevent the entire population from converging to the same value; however, there is still a risk of each member of the population becoming stuck in its own local minimum.

For the ``de_strategy``\ s ending in ``2``, we instead choose a total of 5 parameter sets, p1 through p5, and set the new parameter value to p1[P] + ``mutation_factor`` * (p2[P] - p3[P] + p4[P] - p5[P]).
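For concreteness, here is a minimal sketch of the default ``rand1`` proposal described above, assuming the population is stored as a NumPy array; the function name and interface are illustrative, not PyBNF's actual internals.

.. code-block:: python

    import numpy as np

    def propose_rand1(population, mutation_factor, mutation_rate, rng):
        """Propose one new parameter set with the rand1 strategy.

        population: 2-D array with one row per current parameter set
        (log-moved parameters assumed to already be in log10 units).
        """
        # Choose 3 distinct random members p1, p2, p3
        i1, i2, i3 = rng.choice(len(population), size=3, replace=False)
        p1, p2, p3 = population[i1], population[i2], population[i3]
        mutant = p1 + mutation_factor * (p2 - p3)
        # Each parameter takes the mutated value with probability
        # mutation_rate, and keeps p1's value otherwise
        crossover = rng.random(population.shape[1]) < mutation_rate
        return np.where(crossover, mutant, p1)

    # usage: propose one candidate from a toy population of 20 sets
    rng = np.random.default_rng(0)
    pop = rng.random((20, 4))
    candidate = propose_rand1(pop, mutation_factor=0.5,
                              mutation_rate=0.5, rng=rng)

The proposal would then replace the same-index member of the population only if it achieves a lower objective value, as described above.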
Asynchronous version
""""""""""""""""""""

The asynchronous version of the algorithm is identical to the synchronous algorithm, except that whenever a simulation completes, a new parameter set is immediately proposed based on the current population. Therefore, the random parameter sets p1, p2, and p3 might come from different iteration numbers.

.. _alg-island:

Island-based version
""""""""""""""""""""

In the island-based version of the algorithm [Penas2015]_, the population is divided into islands (the number is set by the ``islands`` key), which each follow the above update procedure independently. Every ``migrate_every`` iterations, a migration step occurs in which ``num_to_migrate`` individuals from each island are transferred randomly to other islands (according to a random permutation of the islands, keeping the number of individuals on each island constant). The migration step does not require synchronization of the islands; it is performed when the last island reaches the appropriate iteration number, regardless of whether other islands are already further along.

Applications
^^^^^^^^^^^^

In our experience, differential evolution tends to be a good general-purpose algorithm. The asynchronous version has similar advantages to `Particle Swarm`_.

.. _alg-ss:

Scatter Search
--------------

Algorithm
^^^^^^^^^

Scatter Search [Glover2000]_ functions similarly to differential evolution, but maintains a smaller current population than the number of available processors. In each iteration, every possible pair of individuals is combined to propose a new individual.

Parallelization
^^^^^^^^^^^^^^^

In a scatter search run with population size n, each iteration requires n\*(n-1) independent simulations, which can all be run in parallel. Scatter search requires synchronization at the end of each iteration, waiting for all simulations to complete before moving to the next iteration.

Implementation details
^^^^^^^^^^^^^^^^^^^^^^

The PyBNF implementation follows the outline presented in the introduction of [Penas2017]_ and uses the recombination method described in [Egea2009]_. We maintain a reference set of ``population_size`` individuals, which is recommended to be a small number (~9-18). Each newly proposed parameter set is based on a "parent" parameter set and a "helper" parameter set, both from the current reference set. In each iteration, we consider all possible parent-helper combinations, for a total of n\*(n-1) new parameter sets.

The new parameter set depends on the ranks of the parent and helper (call them :math:`p_i` and :math:`h_i`) when the reference set is sorted from best to worst. We then apply a series of formulas to choose each new parameter value. Let :math:`\alpha` = -1 if :math:`h_i > p_i`, or 1 if :math:`p_i > h_i`.
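As a concrete illustration of the parent-helper scheme described above, the following sketch enumerates the n\*(n-1) proposals generated in a single iteration; the names are hypothetical and used only for this example.

.. code-block:: python

    from itertools import permutations

    # A reference set of population_size = 9 individuals yields
    # 9 * 8 = 72 ordered (parent, helper) pairs, i.e. 72 independent
    # simulations per iteration, all of which can run in parallel.
    ref_set = ["ps%d" % i for i in range(9)]
    proposals = list(permutations(ref_set, 2))
    assert len(proposals) == len(ref_set) * (len(ref_set) - 1)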