junipertcy/RegRank

regrank implements a suite of regularized models to infer the hierarchical structure in a directed network.

Docs · Discussions · Examples

This is the software repository behind the paper:

  • Tzu-Chi Yen and Stephen Becker, Regularized methods for efficient ranking in networks, in preparation.

RegRank depends on graph-tool. We recommend using conda to manage packages.

conda create --name regrank-dev -c conda-forge graph-tool
conda activate regrank-dev
pip install regrank

Example

# Import the library
import regrank as rr

# Load a data set
g = rr.datasets.us_air_traffic()

# Create a model
model = rr.SpringRank(method="annotated")

# Fit the model. Here we analyze the `state_abr` node metadata;
# inspect `g.list_properties()` for other metadata to analyze.
result = model.fit(g, alpha=1, lambd=0.5, goi="state_abr")

# Now, result["primal"] should have the rankings. We can compute a summary.
summary = model.compute_summary(g, "state_abr", primal_s=result["primal"])

We can plot the rankings via rr.plot_hist(summary). Note that most of the node categories are regularized to share the same mean ranking.

[Figure: a histogram of four ranking groups, where most of the metadata share the same mean ranking.]

We can also print a summary table via rr.print_summary_table(summary).

  +-------+-------+--------+-----------------------------------------+--------+---------+
  | Group | #Tags | #Nodes | Members                                 |   Mean |     Std |
  +-------+-------+--------+-----------------------------------------+--------+---------+
  | 1     |     5 |    825 | CA, WA, OR, TT, AK                      |  0.047 | 1.1e-02 |
  | 2     |     4 |    206 | TX, MT, PA, ID                          | -0.006 | 4.2e-03 |
  | 3     |    43 |   1243 | MI, IN, TN, NC, VA, IL, CO, WV, MA, WI, | -0.035 | 4.3e-03 |
  |       |       |        | SC, KY, MO, MD, AZ, PR, LA, UT, MN, GA, |        |         |
  |       |       |        | MS, HI, DE, NM, ME, NJ, NE, VT, CT, SD, |        |         |
  |       |       |        | IA, NV, ND, AL, OK, AR, NH, RI, OH, FL, |        |         |
  |       |       |        | KS, NY, WY                              |        |         |
  | 4     |     1 |      4 | VI                                      | -0.072 | 0.0e+00 |
  +-------+-------+--------+-----------------------------------------+--------+---------+

The result suggests that the states in Group 1, such as CA, WA, and AK, rank higher on average (mean ranking 0.047) than the states in the other groups.
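To make the columns of the summary table concrete, here is a minimal sketch of how such a grouping could be derived from per-node rankings and one tag per node. It is not regrank's implementation: the function name `summarize_by_tag`, the tolerance `tol`, and the choice to compute the mean and standard deviation over per-tag means are all assumptions made for illustration, and the input data is a toy example.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def summarize_by_tag(rankings, tags, tol=1e-3):
    """Group tags whose mean rankings agree to within `tol` (hypothetical helper)."""
    # Collect the rankings of all nodes that carry each tag.
    by_tag = defaultdict(list)
    for score, tag in zip(rankings, tags):
        by_tag[tag].append(score)

    # Sort tags by their mean ranking, highest first.
    tag_means = sorted(
        ((fmean(scores), tag) for tag, scores in by_tag.items()), reverse=True
    )

    # Start a new group whenever a tag's mean is farther than `tol`
    # from the first mean of the current group.
    groups = []
    for mean, tag in tag_means:
        if groups and abs(groups[-1]["means"][0] - mean) <= tol:
            groups[-1]["tags"].append(tag)
            groups[-1]["means"].append(mean)
        else:
            groups.append({"tags": [tag], "means": [mean]})

    # Assumption: Mean and Std are taken over the per-tag means in each group.
    return [
        {
            "group": i,
            "tags": grp["tags"],
            "n_nodes": sum(len(by_tag[t]) for t in grp["tags"]),
            "mean": fmean(grp["means"]),
            "std": pstdev(grp["means"]),
        }
        for i, grp in enumerate(groups, start=1)
    ]


# Toy data, made up for illustration (not taken from the US air traffic data set).
toy_rankings = [0.05, 0.05, 0.05, -0.03, -0.03, -0.03, -0.03]
toy_tags = ["CA", "CA", "WA", "TX", "NY", "NY", "FL"]
toy_summary = summarize_by_tag(toy_rankings, toy_tags)

for row in toy_summary:
    print(row["group"], row["tags"], row["n_nodes"], row["mean"], row["std"])
```

In this reading, #Tags counts the metadata values in a group and #Nodes counts the nodes carrying any of those values, which is how a group with a single tag can show a standard deviation of zero.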

Data sets

We have a companion repo, regrank-data, which stores the data sets used in the paper. These data sets can be loaded via the regrank.datasets submodule, which returns each one as a graph-tool graph object. See the docs for more details.

Development

The library uses pytest to ensure correctness. The test suite depends on mosek and gurobi.
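Since the test suite depends on external solvers, it can help to check that their Python packages are importable before running the tests. The sketch below uses only the standard library. The helper `missing_modules` is hypothetical, and the import names `mosek` and `gurobipy` are assumed to be the ones the test suite needs.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names: `mosek` for MOSEK and `gurobipy` for Gurobi.
SOLVER_MODULES = ["mosek", "gurobipy"]

missing = missing_modules(SOLVER_MODULES)
if missing:
    print("Missing solver packages:", ", ".join(missing))
else:
    print("All solver packages found; the test suite can be run with pytest.")
```

Note that both solvers are commercial products, so being importable does not guarantee that a valid license is available at test time.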

License

regrank is open-source and licensed under the GNU Lesser General Public License v3.0.