API Reference¶

Functions that generate parameters and data.

`choix.generate_params`	Generate random model parameters.
`choix.generate_pairwise`	Generate pairwise comparisons from a Bradley–Terry model.
`choix.generate_rankings`	Generate rankings according to a Plackett–Luce model.
`choix.compare`	Generate a comparison outcome that follows Luce’s axiom.

Various utilities such as distance functions, etc.

`choix.probabilities`	Compute the comparison outcome probabilities given a subset of items.
`choix.footrule_dist`	Compute Spearman’s footrule distance between two models.
`choix.kendalltau_dist`	Compute the Kendall tau distance between two models.

Functions that process pairwise comparisons.

`choix.lsr_pairwise`	Compute the LSR estimate of model parameters.
`choix.ilsr_pairwise`	Compute the ML estimate of model parameters using I-LSR.
`choix.lsr_pairwise_dense`	Compute the LSR estimate of model parameters given dense data.
`choix.ilsr_pairwise_dense`	Compute the ML estimate of model parameters given dense data.
`choix.rank_centrality`	Compute the Rank Centrality estimate of model parameters.
`choix.opt_pairwise`	Compute the ML estimate of model parameters using `scipy.optimize`.
`choix.ep_pairwise`	Compute a distribution of model parameters using the EP algorithm.
`choix.mm_pairwise`	Compute the ML estimate of model parameters using the MM algorithm.
`choix.log_likelihood_pairwise`	Compute the log-likelihood of model parameters.

Functions that process rankings.

`choix.lsr_rankings`	Compute the LSR estimate of model parameters.
`choix.ilsr_rankings`	Compute the ML estimate of model parameters using I-LSR.
`choix.opt_rankings`	Compute the ML estimate of model parameters using `scipy.optimize`.
`choix.mm_rankings`	Compute the ML estimate of model parameters using the MM algorithm.
`choix.log_likelihood_rankings`	Compute the log-likelihood of model parameters.

Functions that process top-1 lists.

`choix.lsr_top1`	Compute the LSR estimate of model parameters.
`choix.ilsr_top1`	Compute the ML estimate of model parameters using I-LSR.
`choix.opt_top1`	Compute the ML estimate of model parameters using `scipy.optimize`.
`choix.mm_top1`	Compute the ML estimate of model parameters using the MM algorithm.
`choix.log_likelihood_top1`	Compute the log-likelihood of model parameters.

Functions that process choices in a network.

`choix.choicerank`	Compute the MAP estimate of a network choice model’s parameters.
`choix.log_likelihood_network`	Compute the log-likelihood of model parameters.

Generators¶

choix.generate_params(n_items, interval=5.0, ordered=False)[source]¶

Generate random model parameters.

This function samples a parameter independently and uniformly for each item. interval defines the width of the uniform distribution.

Parameters:	n_items (int) – Number of distinct items. interval (float) – Sampling interval. ordered (bool, optional) – If true, the parameters are ordered from lowest to highest.
Returns:	params – Model parameters.
Return type:	numpy.ndarray

choix.generate_pairwise(params, n_comparisons=10)[source]¶

Generate pairwise comparisons from a Bradley–Terry model.

This function samples comparisons pairs independently and uniformly at random over the len(params) choose 2 possibilities, and samples the corresponding comparison outcomes from a Bradley–Terry model parametrized by params.

Parameters:	params (array_like) – Model parameters. n_comparisons (int) – Number of comparisons to be returned.
Returns:	data – Pairwise-comparison samples (see Pairwise comparisons).
Return type:	list of (int, int)

choix.generate_rankings(params, n_rankings, size=3)[source]¶

Generate rankings according to a Plackett–Luce model.

This function samples subsets of items (of size size) independently and uniformly at random, and samples the correspoding partial ranking from a Plackett–Luce model parametrized by params.

Parameters:	params (array_like) – Model parameters. n_rankings (int) – Number of rankings to generate. size (int, optional) – Number of items to include in each ranking.
Returns:	data – A list of (partial) rankings generated according to a Plackett–Luce model with the specified model parameters.
Return type:	list of numpy.ndarray

choix.compare(items, params, rank=False)[source]¶

Generate a comparison outcome that follows Luce’s axiom.

This function samples an outcome for the comparison of a subset of items, from a model parametrized by params. If rank is True, it returns a ranking over the items, otherwise it returns a single item.

Parameters:	items (list) – Subset of items to compare. params (array_like) – Model parameters. rank (bool, optional) – If true, returns a ranking over the items instead of a single item.
Returns:	outcome – The chosen item, or a ranking over `items`.
Return type:	int or list of int

Utilities¶

choix.probabilities(items, params)[source]¶

Compute the comparison outcome probabilities given a subset of items.

This function computes, for each item in items, the probability that it would win (i.e., be chosen) in a comparison involving the items, given model parameters.

Parameters:	items (list) – Subset of items to compare. params (array_like) – Model parameters.
Returns:	probs – A probability distribution over `items`.
Return type:	numpy.ndarray

choix.footrule_dist(params1, params2=None)[source]¶

Compute Spearman’s footrule distance between two models.

This function computes Spearman’s footrule distance between the rankings induced by two parameter vectors. Let \(\sigma_i\) be the rank of item i in the model described by params1, and \(\tau_i\) be its rank in the model described by params2. Spearman’s footrule distance is defined by

\[\sum_{i=1}^N | \sigma_i - \tau_i |\]

By convention, items with the lowest parameters are ranked first (i.e., sorted using the natural order).

If the argument params2 is None, the second model is assumed to rank the items by their index: item 0 has rank 1, item 1 has rank 2, etc.

Parameters:	params1 (array_like) – Parameters of the first model. params2 (array_like, optional) – Parameters of the second model.
Returns:	dist – Spearman’s footrule distance.
Return type:	float

choix.kendalltau_dist(params1, params2=None)[source]¶

Compute the Kendall tau distance between two models.

This function computes the Kendall tau distance between the rankings induced by two parameter vectors. Let \(\sigma_i\) be the rank of item i in the model described by params1, and \(\tau_i\) be its rank in the model described by params2. The Kendall tau distance is defined as the number of pairwise disagreements between the two rankings, i.e.,

\[\sum_{i=1}^N \sum_{j=1}^N \mathbf{1} \{ \sigma_i > \sigma_j \wedge \tau_i < \tau_j \}\]

By convention, items with the lowest parameters are ranked first (i.e., sorted using the natural order).

If the argument params2 is None, the second model is assumed to rank the items by their index: item 0 has rank 1, item 1 has rank 2, etc.

If some values are equal within a parameter vector, all items are given a distinct rank, corresponding to the order in which the values occur.

Parameters:	params1 (array_like) – Parameters of the first model. params2 (array_like, optional) – Parameters of the second model.
Returns:	dist – Kendall tau distance.
Return type:	float

Processing pairwise comparisons¶

choix.lsr_pairwise(n_items, data, alpha=0.0, initial_params=None)[source]¶

Compute the LSR estimate of model parameters.

This function implements the Luce Spectral Ranking inference algorithm [MG15] for pairwise-comparison data (see Pairwise comparisons).

The argument initial_params can be used to iteratively refine an existing parameter estimate (see the implementation of ilsr_pairwise() for an idea on how this works). If it is set to None (the default), the all-ones vector is used.

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Pairwise-comparison data. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to build the transition rates of the LSR Markov chain.
Returns:	params – An estimate of model parameters.
Return type:	numpy.ndarray

choix.ilsr_pairwise(n_items, data, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶

Compute the ML estimate of model parameters using I-LSR.

This function computes the maximum-likelihood (ML) estimate of model parameters given pairwise-comparison data (see Pairwise comparisons), using the iterative Luce Spectral Ranking algorithm [MG15].

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Pairwise-comparison data. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.lsr_pairwise_dense(comp_mat, alpha=0.0, initial_params=None)[source]¶

Compute the LSR estimate of model parameters given dense data.

This function implements the Luce Spectral Ranking inference algorithm [MG15] for dense pairwise-comparison data.

The data is described by a pairwise-comparison matrix comp_mat such that comp_mat[i,j] contains the number of times that item i wins against item j.

In comparison to lsr_pairwise(), this function is particularly efficient for dense pairwise-comparison datasets (i.e., containing many comparisons for a large fraction of item pairs).

The argument initial_params can be used to iteratively refine an existing parameter estimate (see the implementation of ilsr_pairwise() for an idea on how this works). If it is set to None (the default), the all-ones vector is used.

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	comp_mat (np.array) – 2D square matrix describing the pairwise-comparison outcomes. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to build the transition rates of the LSR Markov chain.
Returns:	params – An estimate of model parameters.
Return type:	np.array

choix.ilsr_pairwise_dense(comp_mat, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶

Compute the ML estimate of model parameters given dense data.

This function computes the maximum-likelihood (ML) estimate of model parameters given dense pairwise-comparison data.

The data is described by a pairwise-comparison matrix comp_mat such that comp_mat[i,j] contains the number of times that item i wins against item j.

In comparison to ilsr_pairwise(), this function is particularly efficient for dense pairwise-comparison datasets (i.e., containing many comparisons for a large fraction of item pairs).

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	comp_mat (np.array) – 2D square matrix describing the pairwise-comparison outcomes. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.rank_centrality(n_items, data, alpha=0.0)[source]¶

Compute the Rank Centrality estimate of model parameters.

This function implements Negahban et al.’s Rank Centrality algorithm [NOS12]. The algorithm is similar to ilsr_pairwise(), but considers the ratio of wins for each pair (instead of the total count).

The transition rates of the Rank Centrality Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Pairwise-comparison data. alpha (float, optional) – Regularization parameter.
Returns:	params – An estimate of model parameters.
Return type:	numpy.ndarray

choix.opt_pairwise(n_items, data, alpha=1e-06, method='Newton-CG', initial_params=None, max_iter=None, tol=1e-05)[source]¶

Compute the ML estimate of model parameters using scipy.optimize.

This function computes the maximum-likelihood estimate of model parameters given pairwise-comparison data (see Pairwise comparisons), using optimizers provided by the scipy.optimize module.

If alpha > 0, the function returns the maximum a-posteriori (MAP) estimate under an isotropic Gaussian prior with variance 1 / alpha. See Notes on Regularization for details.

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Pairwise-comparison data. alpha (float, optional) – Regularization strength. method (str, optional) – Optimization method. Either “BFGS” or “Newton-CG”. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Tolerance for termination (method-specific).
Returns:	params – The (penalized) ML estimate of model parameters.
Return type:	numpy.ndarray
Raises:	`ValueError` – If the method is not “BFGS” or “Newton-CG”.

choix.ep_pairwise(n_items, data, alpha, model='logit', max_iter=100, initial_state=None)[source]¶

Compute a distribution of model parameters using the EP algorithm.

This function computes an approximate Bayesian posterior probability distribution over model parameters, given pairwise-comparison data (see Pairwise comparisons). It uses the expectation propagation algorithm, as presented, e.g., in [CG05].

The prior distribution is assumed to be isotropic Gaussian with variance 1 / alpha. The posterior is approximated by a a general multivariate Gaussian distribution, described by a mean vector and a covariance matrix.

Two different observation models are available. logit (default) assumes that pairwise-comparison outcomes follow from a Bradley-Terry model. probit assumes that the outcomes follow from Thurstone’s model.

Parameters:

n_items (int) – Number of distinct items.
data (list of lists) – Pairwise-comparison data.
alpha (float) – Inverse variance of the (isotropic) prior.
model (str, optional) – Observation model. Either “logit” or “probit”.
max_iter (int, optional) – Maximum number of iterations allowed.
initial_state (tuple of array_like, optional) – Natural parameters used to initialize the EP algorithm.

Returns:

mean (numpy.ndarray) – The mean vector of the approximate Gaussian posterior.
cov (numpy.ndarray) – The covariance matrix of the approximate Gaussian posterior.

Raises:

ValueError – If the observation model is not “logit” or “probit”.

choix.mm_pairwise(n_items, data, initial_params=None, alpha=0.0, max_iter=10000, tol=1e-08)[source]¶

Compute the ML estimate of model parameters using the MM algorithm.

This function computes the maximum-likelihood (ML) estimate of model parameters given pairwise-comparison data (see Pairwise comparisons), using the minorization-maximization (MM) algorithm [Hun04], [CD12].

If alpha > 0, the function returns the maximum a-posteriori (MAP) estimate under a (peaked) Dirichlet prior. See Notes on Regularization for details.

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Pairwise-comparison data. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. alpha (float, optional) – Regularization parameter. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.log_likelihood_pairwise(data, params)[source]¶: Compute the log-likelihood of model parameters.

Processing rankings¶

choix.lsr_rankings(n_items, data, alpha=0.0, initial_params=None)[source]¶

Compute the LSR estimate of model parameters.

This function implements the Luce Spectral Ranking inference algorithm [MG15] for ranking data (see Rankings).

The argument initial_params can be used to iteratively refine an existing parameter estimate (see the implementation of ilsr_rankings() for an idea on how this works). If it is set to None (the default), the all-ones vector is used.

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Ranking data. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to build the transition rates of the LSR Markov chain.
Returns:	params – An estimate of model parameters.
Return type:	numpy.ndarray

choix.ilsr_rankings(n_items, data, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶

Compute the ML estimate of model parameters using I-LSR.

This function computes the maximum-likelihood (ML) estimate of model parameters given ranking data (see Rankings), using the iterative Luce Spectral Ranking algorithm [MG15].

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Ranking data. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.opt_rankings(n_items, data, alpha=1e-06, method='Newton-CG', initial_params=None, max_iter=None, tol=1e-05)[source]¶

Compute the ML estimate of model parameters using scipy.optimize.

This function computes the maximum-likelihood estimate of model parameters given ranking data (see Rankings), using optimizers provided by the scipy.optimize module.

If alpha > 0, the function returns the maximum a-posteriori (MAP) estimate under an isotropic Gaussian prior with variance 1 / alpha. See Notes on Regularization for details.

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Ranking data. alpha (float, optional) – Regularization strength. method (str, optional) – Optimization method. Either “BFGS” or “Newton-CG”. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Tolerance for termination (method-specific).
Returns:	params – The (penalized) ML estimate of model parameters.
Return type:	numpy.ndarray
Raises:	`ValueError` – If the method is not “BFGS” or “Newton-CG”.

choix.mm_rankings(n_items, data, initial_params=None, alpha=0.0, max_iter=10000, tol=1e-08)[source]¶

Compute the ML estimate of model parameters using the MM algorithm.

This function computes the maximum-likelihood (ML) estimate of model parameters given ranking data (see Rankings), using the minorization-maximization (MM) algorithm [Hun04], [CD12].

If alpha > 0, the function returns the maximum a-posteriori (MAP) estimate under a (peaked) Dirichlet prior. See Notes on Regularization for details.

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Ranking data. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. alpha (float, optional) – Regularization parameter. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.log_likelihood_rankings(data, params)[source]¶: Compute the log-likelihood of model parameters.

Processing top-1 lists¶

choix.lsr_top1(n_items, data, alpha=0.0, initial_params=None)[source]¶

Compute the LSR estimate of model parameters.

This function implements the Luce Spectral Ranking inference algorithm [MG15] for top-1 data (see Top-1 lists).

The argument initial_params can be used to iteratively refine an existing parameter estimate (see the implementation of ilsr_top1() for an idea on how this works). If it is set to None (the default), the all-ones vector is used.

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Top-1 data. alpha (float) – Regularization parameter. initial_params (array_like) – Parameters used to build the transition rates of the LSR Markov chain.
Returns:	params – An estimate of model parameters.
Return type:	numpy.ndarray

choix.ilsr_top1(n_items, data, alpha=0.0, initial_params=None, max_iter=100, tol=1e-08)[source]¶

Compute the ML estimate of model parameters using I-LSR.

This function computes the maximum-likelihood (ML) estimate of model parameters given top-1 data (see Top-1 lists), using the iterative Luce Spectral Ranking algorithm [MG15].

The transition rates of the LSR Markov chain are initialized with alpha. When alpha > 0, this corresponds to a form of regularization (see Notes on Regularization for details).

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Top-1 data. alpha (float, optional) – Regularization parameter. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.opt_top1(n_items, data, alpha=1e-06, method='Newton-CG', initial_params=None, max_iter=None, tol=1e-05)[source]¶

Compute the ML estimate of model parameters using scipy.optimize.

This function computes the maximum-likelihood estimate of model parameters given top-1 data (see Top-1 lists), using optimizers provided by the scipy.optimize module.

If alpha > 0, the function returns the maximum a-posteriori (MAP) estimate under an isotropic Gaussian prior with variance 1 / alpha. See Notes on Regularization for details.

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Top-1 data. alpha (float, optional) – Regularization strength. method (str, optional) – Optimization method. Either “BFGS” or “Newton-CG”. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Tolerance for termination (method-specific).
Returns:	params – The (penalized) ML estimate of model parameters.
Return type:	numpy.ndarray
Raises:	`ValueError` – If the method is not “BFGS” or “Newton-CG”.

choix.mm_top1(n_items, data, initial_params=None, alpha=0.0, max_iter=10000, tol=1e-08)[source]¶

Compute the ML estimate of model parameters using the MM algorithm.

This function computes the maximum-likelihood (ML) estimate of model parameters given top-1 data (see Top-1 lists), using the minorization-maximization (MM) algorithm [Hun04], [CD12].

If alpha > 0, the function returns the maximum a-posteriori (MAP) estimate under a (peaked) Dirichlet prior. See Notes on Regularization for details.

Parameters:	n_items (int) – Number of distinct items. data (list of lists) – Top-1 data. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. alpha (float, optional) – Regularization parameter. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The ML estimate of model parameters.
Return type:	numpy.ndarray

choix.log_likelihood_top1(data, params)[source]¶: Compute the log-likelihood of model parameters.

Processing choices in a network¶

choix.choicerank(digraph, traffic_in, traffic_out, weight=None, initial_params=None, alpha=1.0, max_iter=10000, tol=1e-08)[source]¶

Compute the MAP estimate of a network choice model’s parameters.

This function computes the maximum-a-posteriori (MAP) estimate of model parameters given a network structure and node-level traffic data (see Choices in a network), using the ChoiceRank algorithm [MG17], [KTVV15].

The nodes are assumed to be labeled using consecutive integers starting from 0.

Parameters:	digraph (networkx.DiGraph) – Directed graph representing the network. traffic_in (array_like) – Number of arrivals at each node. traffic_out (array_like) – Number of departures at each node. weight (str, optional) – The edge attribute that holds the numerical value used for the edge weight. If None (default) then all edge weights are 1. initial_params (array_like, optional) – Parameters used to initialize the iterative procedure. alpha (float, optional) – Regularization parameter. max_iter (int, optional) – Maximum number of iterations allowed. tol (float, optional) – Maximum L1-norm of the difference between successive iterates to declare convergence.
Returns:	params – The MAP estimate of model parameters.
Return type:	numpy.ndarray
Raises:	`ImportError` – If the NetworkX library cannot be imported.

choix.log_likelihood_network(digraph, traffic_in, traffic_out, params, weight=None)[source]¶

Compute the log-likelihood of model parameters.

If weight is not None, the log-likelihood is correct only up to a constant (independent of the parameters).