API Reference¶
Functions that generate parameters and data.
choix.generate_params |
Generate random model parameters. |
choix.generate_pairwise |
Generate pairwise comparisons from a Bradley–Terry model. |
choix.generate_rankings |
Generate rankings according to a Plackett–Luce model. |
choix.compare |
Generate a comparison outcome that follows Luce’s axiom. |
Various utilities such as distance functions, etc.
choix.probabilities |
Compute the comparison outcome probabilities given a subset of items. |
choix.footrule_dist |
Compute Spearman’s footrule distance between two models. |
choix.kendalltau_dist |
Compute the Kendall tau distance between two models. |
Functions that process pairwise comparisons.
choix.lsr_pairwise |
Compute the LSR estimate of model parameters. |
choix.ilsr_pairwise |
Compute the ML estimate of model parameters using I-LSR. |
choix.lsr_pairwise_dense |
Compute the LSR estimate of model parameters given dense data. |
choix.ilsr_pairwise_dense |
Compute the ML estimate of model parameters given dense data. |
choix.rank_centrality |
Compute the Rank Centrality estimate of model parameters. |
choix.opt_pairwise |
Compute the ML estimate of model parameters using scipy.optimize . |
choix.ep_pairwise |
Compute a distribution of model parameters using the EP algorithm. |
choix.mm_pairwise |
Compute the ML estimate of model parameters using the MM algorithm. |
choix.log_likelihood_pairwise |
Compute the log-likelihood of model parameters. |
Functions that process rankings.
choix.lsr_rankings |
Compute the LSR estimate of model parameters. |
choix.ilsr_rankings |
Compute the ML estimate of model parameters using I-LSR. |
choix.opt_rankings |
Compute the ML estimate of model parameters using scipy.optimize . |
choix.mm_rankings |
Compute the ML estimate of model parameters using the MM algorithm. |
choix.log_likelihood_rankings |
Compute the log-likelihood of model parameters. |
Functions that process top-1 lists.
choix.lsr_top1 |
Compute the LSR estimate of model parameters. |
choix.ilsr_top1 |
Compute the ML estimate of model parameters using I-LSR. |
choix.opt_top1 |
Compute the ML estimate of model parameters using scipy.optimize . |
choix.mm_top1 |
Compute the ML estimate of model parameters using the MM algorithm. |
choix.log_likelihood_top1 |
Compute the log-likelihood of model parameters. |
Functions that process choices in a network.
choix.choicerank |
Compute the MAP estimate of a network choice model’s parameters. |
choix.log_likelihood_network |
Compute the log-likelihood of model parameters. |
Generators¶
-
choix.
generate_params
(n_items, interval=5.0, ordered=False)[source]¶ Generate random model parameters.
This function samples a parameter independently and uniformly for each item.
interval
defines the width of the uniform distribution.Parameters: - n_items (int) – Number of distinct items.
- interval (float) – Sampling interval.
- ordered (bool, optional) – If true, the parameters are ordered from lowest to highest.
Returns: params – Model parameters.
Return type: numpy.ndarray
-
choix.
generate_pairwise
(params, n_comparisons=10)[source]¶ Generate pairwise comparisons from a Bradley–Terry model.
This function samples comparisons pairs independently and uniformly at random over the
len(params)
choose 2 possibilities, and samples the corresponding comparison outcomes from a Bradley–Terry model parametrized byparams
.Parameters: - params (array_like) – Model parameters.
- n_comparisons (int) – Number of comparisons to be returned.
Returns: data – Pairwise-comparison samples (see Pairwise comparisons).
Return type: list of (int, int)
-
choix.
generate_rankings
(params, n_rankings, size=3)[source]¶ Generate rankings according to a Plackett–Luce model.
This function samples subsets of items (of size
size
) independently and uniformly at random, and samples the correspoding partial ranking from a Plackett–Luce model parametrized byparams
.Parameters: - params (array_like) – Model parameters.
- n_rankings (int) – Number of rankings to generate.
- size (int, optional) – Number of items to include in each ranking.
Returns: data – A list of (partial) rankings generated according to a Plackett–Luce model with the specified model parameters.
Return type: list of numpy.ndarray
-
choix.
compare
(items, params, rank=False)[source]¶ Generate a comparison outcome that follows Luce’s axiom.
This function samples an outcome for the comparison of a subset of items, from a model parametrized by
params
. Ifrank
is True, it returns a ranking over the items, otherwise it returns a single item.Parameters: - items (list) – Subset of items to compare.
- params (array_like) – Model parameters.
- rank (bool, optional) – If true, returns a ranking over the items instead of a single item.
Returns: outcome – The chosen item, or a ranking over
items
.Return type: int or list of int
Utilities¶
-
choix.
probabilities
(items, params)[source]¶ Compute the comparison outcome probabilities given a subset of items.
This function computes, for each item in
items
, the probability that it would win (i.e., be chosen) in a comparison involving the items, given model parameters.Parameters: - items (list) – Subset of items to compare.
- params (array_like) – Model parameters.
Returns: probs – A probability distribution over
items
.Return type: numpy.ndarray
-
choix.
footrule_dist
(params1, params2=None)[source]¶ Compute Spearman’s footrule distance between two models.
This function computes Spearman’s footrule distance between the rankings induced by two parameter vectors. Let \(\sigma_i\) be the rank of item
i
in the model described byparams1
, and \(\tau_i\) be its rank in the model described byparams2
. Spearman’s footrule distance is defined by\[\sum_{i=1}^N | \sigma_i - \tau_i |\]By convention, items with the lowest parameters are ranked first (i.e., sorted using the natural order).
If the argument
params2
isNone
, the second model is assumed to rank the items by their index: item0
has rank 1, item1
has rank 2, etc.Parameters: - params1 (array_like) – Parameters of the first model.
- params2 (array_like, optional) – Parameters of the second model.
Returns: dist – Spearman’s footrule distance.
Return type: float
-
choix.
kendalltau_dist
(params1, params2=None)[source]¶ Compute the Kendall tau distance between two models.
This function computes the Kendall tau distance between the rankings induced by two parameter vectors. Let \(\sigma_i\) be the rank of item
i
in the model described byparams1
, and \(\tau_i\) be its rank in the model described byparams2
. The Kendall tau distance is defined as the number of pairwise disagreements between the two rankings, i.e.,\[\sum_{i=1}^N \sum_{j=1}^N \mathbf{1} \{ \sigma_i > \sigma_j \wedge \tau_i < \tau_j \}\]By convention, items with the lowest parameters are ranked first (i.e., sorted using the natural order).
If the argument
params2
isNone
, the second model is assumed to rank the items by their index: item0
has rank 1, item1
has rank 2, etc.If some values are equal within a parameter vector, all items are given a distinct rank, corresponding to the order in which the values occur.
Parameters: - params1 (array_like) – Parameters of the first model.
- params2 (array_like, optional) – Parameters of the second model.
Returns: dist – Kendall tau distance.
Return type: float
Processing pairwise comparisons¶
-
choix.
lsr_pairwise
(n_items, data, alpha=0.0, initial_params=None)[source]¶ Compute the LSR estimate of model parameters.
This function implements the Luce Spectral Ranking inference algorithm [MG15] for pairwise-comparison data (see Pairwise comparisons).
The argument
initial_params
can be used to iteratively refine an existing parameter estimate (see the implementation ofilsr_pairwise()
for an idea on how this works). If it is set to None (the default), the all-ones vector is used.The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Pairwise-comparison data.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to build the transition rates of the LSR Markov chain.
Returns: params – An estimate of model parameters.
Return type: numpy.ndarray
-
choix.
ilsr_pairwise
(n_items, data, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶ Compute the ML estimate of model parameters using I-LSR.
This function computes the maximum-likelihood (ML) estimate of model parameters given pairwise-comparison data (see Pairwise comparisons), using the iterative Luce Spectral Ranking algorithm [MG15].
The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Pairwise-comparison data.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
-
choix.
lsr_pairwise_dense
(comp_mat, alpha=0.0, initial_params=None)[source]¶ Compute the LSR estimate of model parameters given dense data.
This function implements the Luce Spectral Ranking inference algorithm [MG15] for dense pairwise-comparison data.
The data is described by a pairwise-comparison matrix
comp_mat
such thatcomp_mat[i,j]
contains the number of times that itemi
wins against itemj
.In comparison to
lsr_pairwise()
, this function is particularly efficient for dense pairwise-comparison datasets (i.e., containing many comparisons for a large fraction of item pairs).The argument
initial_params
can be used to iteratively refine an existing parameter estimate (see the implementation ofilsr_pairwise()
for an idea on how this works). If it is set to None (the default), the all-ones vector is used.The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - comp_mat (np.array) – 2D square matrix describing the pairwise-comparison outcomes.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to build the transition rates of the LSR Markov chain.
Returns: params – An estimate of model parameters.
Return type: np.array
-
choix.
ilsr_pairwise_dense
(comp_mat, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶ Compute the ML estimate of model parameters given dense data.
This function computes the maximum-likelihood (ML) estimate of model parameters given dense pairwise-comparison data.
The data is described by a pairwise-comparison matrix
comp_mat
such thatcomp_mat[i,j]
contains the number of times that itemi
wins against itemj
.In comparison to
ilsr_pairwise()
, this function is particularly efficient for dense pairwise-comparison datasets (i.e., containing many comparisons for a large fraction of item pairs).The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - comp_mat (np.array) – 2D square matrix describing the pairwise-comparison outcomes.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
-
choix.
rank_centrality
(n_items, data, alpha=0.0)[source]¶ Compute the Rank Centrality estimate of model parameters.
This function implements Negahban et al.’s Rank Centrality algorithm [NOS12]. The algorithm is similar to
ilsr_pairwise()
, but considers the ratio of wins for each pair (instead of the total count).The transition rates of the Rank Centrality Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Pairwise-comparison data.
- alpha (float, optional) – Regularization parameter.
Returns: params – An estimate of model parameters.
Return type: numpy.ndarray
-
choix.
opt_pairwise
(n_items, data, alpha=1e-06, method='Newton-CG', initial_params=None, max_iter=None, tol=1e-05)[source]¶ Compute the ML estimate of model parameters using
scipy.optimize
.This function computes the maximum-likelihood estimate of model parameters given pairwise-comparison data (see Pairwise comparisons), using optimizers provided by the
scipy.optimize
module.If
alpha > 0
, the function returns the maximum a-posteriori (MAP) estimate under an isotropic Gaussian prior with variance1 / alpha
. See Notes on Regularization for details.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Pairwise-comparison data.
- alpha (float, optional) – Regularization strength.
- method (str, optional) – Optimization method. Either “BFGS” or “Newton-CG”.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Tolerance for termination (method-specific).
Returns: params – The (penalized) ML estimate of model parameters.
Return type: numpy.ndarray
Raises: ValueError
– If the method is not “BFGS” or “Newton-CG”.
-
choix.
ep_pairwise
(n_items, data, alpha, model='logit', max_iter=100, initial_state=None)[source]¶ Compute a distribution of model parameters using the EP algorithm.
This function computes an approximate Bayesian posterior probability distribution over model parameters, given pairwise-comparison data (see Pairwise comparisons). It uses the expectation propagation algorithm, as presented, e.g., in [CG05].
The prior distribution is assumed to be isotropic Gaussian with variance
1 / alpha
. The posterior is approximated by a a general multivariate Gaussian distribution, described by a mean vector and a covariance matrix.Two different observation models are available.
logit
(default) assumes that pairwise-comparison outcomes follow from a Bradley-Terry model.probit
assumes that the outcomes follow from Thurstone’s model.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Pairwise-comparison data.
- alpha (float) – Inverse variance of the (isotropic) prior.
- model (str, optional) – Observation model. Either “logit” or “probit”.
- max_iter (int, optional) – Maximum number of iterations allowed.
- initial_state (tuple of array_like, optional) – Natural parameters used to initialize the EP algorithm.
Returns: - mean (numpy.ndarray) – The mean vector of the approximate Gaussian posterior.
- cov (numpy.ndarray) – The covariance matrix of the approximate Gaussian posterior.
Raises: ValueError
– If the observation model is not “logit” or “probit”.
-
choix.
mm_pairwise
(n_items, data, initial_params=None, alpha=0.0, max_iter=10000, tol=1e-08)[source]¶ Compute the ML estimate of model parameters using the MM algorithm.
This function computes the maximum-likelihood (ML) estimate of model parameters given pairwise-comparison data (see Pairwise comparisons), using the minorization-maximization (MM) algorithm [Hun04], [CD12].
If
alpha > 0
, the function returns the maximum a-posteriori (MAP) estimate under a (peaked) Dirichlet prior. See Notes on Regularization for details.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Pairwise-comparison data.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- alpha (float, optional) – Regularization parameter.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
Processing rankings¶
-
choix.
lsr_rankings
(n_items, data, alpha=0.0, initial_params=None)[source]¶ Compute the LSR estimate of model parameters.
This function implements the Luce Spectral Ranking inference algorithm [MG15] for ranking data (see Rankings).
The argument
initial_params
can be used to iteratively refine an existing parameter estimate (see the implementation ofilsr_rankings()
for an idea on how this works). If it is set to None (the default), the all-ones vector is used.The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Ranking data.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to build the transition rates of the LSR Markov chain.
Returns: params – An estimate of model parameters.
Return type: numpy.ndarray
-
choix.
ilsr_rankings
(n_items, data, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶ Compute the ML estimate of model parameters using I-LSR.
This function computes the maximum-likelihood (ML) estimate of model parameters given ranking data (see Rankings), using the iterative Luce Spectral Ranking algorithm [MG15].
The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Ranking data.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
-
choix.
opt_rankings
(n_items, data, alpha=1e-06, method='Newton-CG', initial_params=None, max_iter=None, tol=1e-05)[source]¶ Compute the ML estimate of model parameters using
scipy.optimize
.This function computes the maximum-likelihood estimate of model parameters given ranking data (see Rankings), using optimizers provided by the
scipy.optimize
module.If
alpha > 0
, the function returns the maximum a-posteriori (MAP) estimate under an isotropic Gaussian prior with variance1 / alpha
. See Notes on Regularization for details.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Ranking data.
- alpha (float, optional) – Regularization strength.
- method (str, optional) – Optimization method. Either “BFGS” or “Newton-CG”.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Tolerance for termination (method-specific).
Returns: params – The (penalized) ML estimate of model parameters.
Return type: numpy.ndarray
Raises: ValueError
– If the method is not “BFGS” or “Newton-CG”.
-
choix.
mm_rankings
(n_items, data, initial_params=None, alpha=0.0, max_iter=10000, tol=1e-08)[source]¶ Compute the ML estimate of model parameters using the MM algorithm.
This function computes the maximum-likelihood (ML) estimate of model parameters given ranking data (see Rankings), using the minorization-maximization (MM) algorithm [Hun04], [CD12].
If
alpha > 0
, the function returns the maximum a-posteriori (MAP) estimate under a (peaked) Dirichlet prior. See Notes on Regularization for details.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Ranking data.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- alpha (float, optional) – Regularization parameter.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
Processing top-1 lists¶
-
choix.
lsr_top1
(n_items, data, alpha=0.0, initial_params=None)[source]¶ Compute the LSR estimate of model parameters.
This function implements the Luce Spectral Ranking inference algorithm [MG15] for top-1 data (see Top-1 lists).
The argument
initial_params
can be used to iteratively refine an existing parameter estimate (see the implementation ofilsr_top1()
for an idea on how this works). If it is set to None (the default), the all-ones vector is used.The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Top-1 data.
- alpha (float) – Regularization parameter.
- initial_params (array_like) – Parameters used to build the transition rates of the LSR Markov chain.
Returns: params – An estimate of model parameters.
Return type: numpy.ndarray
-
choix.
ilsr_top1
(n_items, data, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶ Compute the ML estimate of model parameters using I-LSR.
This function computes the maximum-likelihood (ML) estimate of model parameters given top-1 data (see Top-1 lists), using the iterative Luce Spectral Ranking algorithm [MG15].
The transition rates of the LSR Markov chain are initialized with
alpha
. Whenalpha > 0
, this corresponds to a form of regularization (see Notes on Regularization for details).Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Top-1 data.
- alpha (float, optional) – Regularization parameter.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
-
choix.
opt_top1
(n_items, data, alpha=1e-06, method='Newton-CG', initial_params=None, max_iter=None, tol=1e-05)[source]¶ Compute the ML estimate of model parameters using
scipy.optimize
.This function computes the maximum-likelihood estimate of model parameters given top-1 data (see Top-1 lists), using optimizers provided by the
scipy.optimize
module.If
alpha > 0
, the function returns the maximum a-posteriori (MAP) estimate under an isotropic Gaussian prior with variance1 / alpha
. See Notes on Regularization for details.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Top-1 data.
- alpha (float, optional) – Regularization strength.
- method (str, optional) – Optimization method. Either “BFGS” or “Newton-CG”.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Tolerance for termination (method-specific).
Returns: params – The (penalized) ML estimate of model parameters.
Return type: numpy.ndarray
Raises: ValueError
– If the method is not “BFGS” or “Newton-CG”.
-
choix.
mm_top1
(n_items, data, initial_params=None, alpha=0.0, max_iter=10000, tol=1e-08)[source]¶ Compute the ML estimate of model parameters using the MM algorithm.
This function computes the maximum-likelihood (ML) estimate of model parameters given top-1 data (see Top-1 lists), using the minorization-maximization (MM) algorithm [Hun04], [CD12].
If
alpha > 0
, the function returns the maximum a-posteriori (MAP) estimate under a (peaked) Dirichlet prior. See Notes on Regularization for details.Parameters: - n_items (int) – Number of distinct items.
- data (list of lists) – Top-1 data.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- alpha (float, optional) – Regularization parameter.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The ML estimate of model parameters.
Return type: numpy.ndarray
Processing choices in a network¶
-
choix.
choicerank
(digraph, traffic_in, traffic_out, weight=None, initial_params=None, alpha=1.0, max_iter=10000, tol=1e-08)[source]¶ Compute the MAP estimate of a network choice model’s parameters.
This function computes the maximum-a-posteriori (MAP) estimate of model parameters given a network structure and node-level traffic data (see Choices in a network), using the ChoiceRank algorithm [MG17], [KTVV15].
The nodes are assumed to be labeled using consecutive integers starting from 0.
Parameters: - digraph (networkx.DiGraph) – Directed graph representing the network.
- traffic_in (array_like) – Number of arrivals at each node.
- traffic_out (array_like) – Number of departures at each node.
- weight (str, optional) – The edge attribute that holds the numerical value used for the edge weight. If None (default) then all edge weights are 1.
- initial_params (array_like, optional) – Parameters used to initialize the iterative procedure.
- alpha (float, optional) – Regularization parameter.
- max_iter (int, optional) – Maximum number of iterations allowed.
- tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns: params – The MAP estimate of model parameters.
Return type: numpy.ndarray
Raises: ImportError
– If the NetworkX library cannot be imported.