Merge pull request #36 from github/repo-lists
Allow repository list input
zkoppert committed Oct 19, 2023
2 parents 91ec311 + 5342d64 commit 01d5463
Showing 5 changed files with 40 additions and 23 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -2,7 +2,7 @@

[![Python package](https://github.com/github/contributors/actions/workflows/python-ci.yml/badge.svg)](https://github.com/github/contributors/actions/workflows/python-ci.yml) [![Docker Image CI](https://github.com/github/contributors/actions/workflows/docker-ci.yml/badge.svg)](https://github.com/github/contributors/actions/workflows/docker-ci.yml) [![CodeQL](https://github.com/github/contributors/actions/workflows/github-code-scanning/codeql/badge.svg)](https://github.com/github/contributors/actions/workflows/github-code-scanning/codeql)

-This is a GitHub Action that given an organization or repository, produces information about the [contributors](https://chaoss.community/kb/metric-contributors/) over the specified time period (if specified).
+This is a GitHub Action that given an organization or specified repositories, produces information about the [contributors](https://chaoss.community/kb/metric-contributors/) over the specified time period.

Similar actions to help you recognize contributors by putting them into a `README` or `CONTRIBUTORS.md` include:

@@ -21,7 +21,7 @@ If you need support using this project or have questions about it, please [open

## What is a contributor?

-Contributors have made commits to the specified repository/organization on a default branch. The endpoint used may return information that is a few hours old because the GitHub REST API caches contributor data to improve performance.
+Contributors have made commits to the specified repositories/organization on a default branch. Contributions can also be issue, pull request and discussion interactions. The endpoint used may return information that is a few hours old because the GitHub REST API caches contributor data to improve performance.

GitHub identifies contributors by author email address. Contribution counts are grouped by GitHub user, which includes all associated email addresses. To improve performance, only the first 500 author email addresses in the repository link to GitHub users. The rest will appear as anonymous contributors without associated GitHub user information.

@@ -32,7 +32,7 @@ Find out more in the [GitHub API documentation](https://docs.github.com/en/rest/
1. Create a repository to host this GitHub Action or select an existing repository.
1. Select a best fit workflow file from the [examples below](#example-workflows).
1. Copy that example into your repository (from step 1) and into the proper directory for GitHub Actions: `.github/workflows/` directory with the file extension `.yml` (ie. `.github/workflows/contributors.yml`)
-1. Edit the values (`ORGANIZATION`, `REPOSITORY`, `START_DATE`, `END_DATE`) from the sample workflow with your information. If no start and end date are supplied, the action will consider the entire repo history and be unable to determine if contributors are new or returning. If running on a whole organization then no repository is needed. If running the action on just one repository, then no organization is needed.
+1. Edit the values (`ORGANIZATION`, `REPOSITORY`, `START_DATE`, `END_DATE`) from the sample workflow with your information. If no start and end date are supplied, the action will consider the entire repo history and be unable to determine if contributors are new or returning. If running on a whole organization then no repository is needed. If running the action on just one repository or a list of repositories, then no organization is needed.
1. Also edit the value for `GH_ENTERPRISE_URL` if you are using a GitHub Server and not using github.com. For github.com users, don't put anything in here.
1. If you are running this action on an organization or repository other than the one where the workflow file is going to be, then update the value of `GH_TOKEN`. Do this by creating a [GitHub API token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-personal-access-token-classic) with permissions to read the repository/organization and write issues. Then take the value of the API token you just created, and [create a repository secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets) where the name of the secret is `GH_TOKEN` and the value of the secret the API token. Then finally update the workflow file to use that repository secret by changing `GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}` to `GH_TOKEN: ${{ secrets.GH_TOKEN }}`. The name of the secret can really be anything. It just needs to match between when you create the secret name and when you refer to it in the workflow file.
1. If you want the resulting issue with the output to appear in a different repository other than the one the workflow file runs in, update the line `token: ${{ secrets.GITHUB_TOKEN }}` with your own GitHub API token stored as a repository secret. This process is the same as described in the step above. More info on creating secrets can be found [here](https://docs.github.com/en/actions/security-guides/encrypted-secrets).
@@ -48,7 +48,7 @@ Below are the allowed configuration options:
| `GH_TOKEN` | True | "" | The GitHub Token used to scan the repository or organization. Must have read access to all repository you are interested in scanning. |
| `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. |
| `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` |
-| `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` |
+| `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` |
| `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01` If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date. **Performance Note:** Using start and end dates will reduce speed of the action by approximately 63X. ie without dates if the action takes 1.7 seconds, it will take 1 minute and 47 seconds.|
| `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date. |
| `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" |
22 changes: 12 additions & 10 deletions contributors.py
@@ -2,6 +2,7 @@
 """This file contains the main() and other functions needed to get contributor information from the organization or repository"""

 import sys
+from typing import List
 import env
 import auth
 import contributor_stats
@@ -14,7 +15,7 @@ def main():
     # Get environment variables
     (
         organization,
-        repository,
+        repository_list,
         token,
         ghe,
         start_date,
@@ -28,7 +29,7 @@ def main():
     # Get the contributors
     contributors = get_all_contributors(
         organization,
-        repository,
+        repository_list,
         start_date,
         end_date,
         github_connection,
@@ -40,7 +41,7 @@ def main():
         # so we can see if contributors after start_date are new or returning
         returning_contributors = get_all_contributors(
             organization,
-            repository,
+            repository_list,
             start_date="2008-02-29",  # GitHub was founded on 2008-02-29
             end_date=start_date,
             github_connection=github_connection,
@@ -61,15 +62,15 @@ def main():
         start_date,
         end_date,
         organization,
-        repository,
+        repository_list,
         sponsor_info,
     )
     # write_to_json(contributors)


 def get_all_contributors(
     organization: str,
-    repository: str,
+    repository_list: List[str],
     start_date: str,
     end_date: str,
     github_connection: object,
@@ -79,7 +80,7 @@ def get_all_contributors(
     Args:
         organization (str): The organization for which the contributors are being listed.
-        repository (str): The repository for which the contributors are being listed.
+        repository_list (List[str]): The repository list for which the contributors are being listed.
         start_date (str): The start date of the date range for the contributor list.
         end_date (str): The end date of the date range for the contributor list.
         github_connection (object): The authenticated GitHub connection object from PyGithub
@@ -91,15 +92,16 @@ def get_all_contributors(
     if organization:
         repos = github_connection.organization(organization).repositories()
     else:
-        owner, repo = repository.split("/")
-        repository_obj = github_connection.repository(owner, repo)
+        repos = []
+        for repo in repository_list:
+            owner, repo_name = repo.split("/")
+            repository_obj = github_connection.repository(owner, repo_name)
+            repos.append(repository_obj)

     all_contributors = []
     if repos:
         for repo in repos:
             all_contributors.append(get_contributors(repo, start_date, end_date))
-    else:
-        all_contributors.append(get_contributors(repository_obj, start_date, end_date))

     # Check for duplicates and merge when usernames are equal
     all_contributors = contributor_stats.merge_contributors(all_contributors)
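
The repository-resolution branch in this hunk can be exercised on its own. The sketch below is a minimal stand-in, not the project's code: `resolve_repositories` is a hypothetical helper name, the connection is a `unittest.mock.MagicMock` rather than a real client, and the explicit `owner/repo` validation is an added assumption. The commit itself unpacks `repo.split("/")` directly, which raises a less descriptive `ValueError` when an entry does not contain exactly one slash.

```python
from typing import Any, List
from unittest.mock import MagicMock


def resolve_repositories(
    github_connection: Any, organization: str, repository_list: List[str]
) -> List[Any]:
    """Return repository objects for an organization or an explicit list.

    Hypothetical helper mirroring the branch in get_all_contributors:
    the organization takes precedence whenever it is set.
    """
    if organization:
        return list(github_connection.organization(organization).repositories())

    repos = []
    for full_name in repository_list:
        owner, sep, repo_name = full_name.partition("/")
        # Assumption (not in the commit): reject malformed entries up front
        if not (owner and sep and repo_name) or "/" in repo_name:
            raise ValueError(f"Expected 'owner/repo', got {full_name!r}")
        repos.append(github_connection.repository(owner, repo_name))
    return repos


# Usage with a mocked connection, so no network access is needed
connection = MagicMock()
repos = resolve_repositories(
    connection, "", ["github/contributors", "super-linter/super-linter"]
)
```

Note that, as in the diff, a non-empty `organization` means the repository list is ignored entirely.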
25 changes: 20 additions & 5 deletions env.py
@@ -14,7 +14,7 @@ def get_env_vars() -> tuple[str, str, str, str, str, str, str]:
     Returns:
         str: the organization to get contributor information for
-        str: the repository to get contributor information for
+        List[str]: A list of the repositories to get contributor information for
         str: the GitHub token to use for authentication
         str: the GitHub Enterprise URL to use for authentication
         str: the start date to get contributor information from
@@ -27,11 +27,11 @@ def get_env_vars() -> tuple[str, str, str, str, str, str, str]:
     load_dotenv(dotenv_path)

     organization = os.getenv("ORGANIZATION")
-    repository = os.getenv("REPOSITORY")
+    repositories_str = os.getenv("REPOSITORY")
     # Either organization or repository must be set
-    if not organization and not repository:
+    if not organization and not repositories_str:
         raise ValueError(
-            "ORGANIZATION and repository environment variables were both not set. Please enter a valid value for one of them."
+            "ORGANIZATION and REPOSITORY environment variables were both not set. Please enter a valid value for one of them."
         )

     token = os.getenv("GH_TOKEN")
@@ -59,4 +59,19 @@ def get_env_vars() -> tuple[str, str, str, str, str, str, str]:
"SPONSOR_INFO environment variable not a boolean. ie. True or False or blank"
)

-    return organization, repository, token, ghe, start_date, end_date, sponsor_info
+    # Separate repositories_str into a list based on the comma separator
+    repositories_list = []
+    if repositories_str:
+        repositories_list = [
+            repository.strip() for repository in repositories_str.split(",")
+        ]
+
+    return (
+        organization,
+        repositories_list,
+        token,
+        ghe,
+        start_date,
+        end_date,
+        sponsor_info,
+    )
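
The comma-splitting step added to `get_env_vars` can be tried in isolation. This is a minimal sketch under stated assumptions: `parse_repository_list` is a hypothetical name, and dropping empty entries is an addition that the commit's list comprehension does not perform.

```python
from typing import List, Optional


def parse_repository_list(raw: Optional[str]) -> List[str]:
    """Split a comma-separated REPOSITORY value into trimmed names.

    Hypothetical helper; unlike the list comprehension in the commit,
    it also drops empty entries (an added assumption).
    """
    if not raw:
        return []
    return [name.strip() for name in raw.split(",") if name.strip()]


print(parse_repository_list("github/contributors, super-linter/super-linter,"))
```

With the commit's comprehension, the same input (note the trailing comma) would also yield a trailing empty string, which would later fail when it is split on `/`.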
2 changes: 1 addition & 1 deletion test_contributors.py
@@ -95,7 +95,7 @@ def test_get_all_contributors_with_repository(self, mock_get_contributors):
         ]

         result = get_all_contributors(
-            "", "owner/repo", "2022-01-01", "2022-12-31", mock_github_connection
+            "", ["owner/repo"], "2022-01-01", "2022-12-31", mock_github_connection
         )

         self.assertEqual(
6 changes: 3 additions & 3 deletions test_env.py
@@ -17,7 +17,7 @@ def test_get_env_vars(self, mock_getenv):
         """
         mock_getenv.side_effect = [
             "org",
-            "repo",
+            "repo,repo2",
             "token",
             "",
             "2022-01-01",
@@ -27,7 +27,7 @@ def test_get_env_vars(self, mock_getenv):

         (
             organization,
-            repository,
+            repository_list,
             token,
             ghe,
             start_date,
@@ -36,7 +36,7 @@ def test_get_env_vars(self, mock_getenv):
         ) = env.get_env_vars()

         self.assertEqual(organization, "org")
-        self.assertEqual(repository, "repo")
+        self.assertEqual(repository_list, ["repo", "repo2"])
         self.assertEqual(token, "token")
         self.assertEqual(ghe, "")
         self.assertEqual(start_date, "2022-01-01")
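
The test above appears to feed `os.getenv` through an ordered `side_effect` list, so it depends on the order in which `get_env_vars` reads its variables. One possible alternative, sketched here under assumptions, is to patch `os.environ` by name with `unittest.mock.patch.dict`. `read_repository_list` is a hypothetical stand-in for the `REPOSITORY` handling in `env.get_env_vars`, not the project's function.

```python
import os
import unittest
from typing import List
from unittest.mock import patch


def read_repository_list() -> List[str]:
    """Hypothetical stand-in for the REPOSITORY handling in env.get_env_vars."""
    repositories_str = os.getenv("REPOSITORY")
    if not repositories_str:
        return []
    return [repository.strip() for repository in repositories_str.split(",")]


class TestRepositoryList(unittest.TestCase):
    """Checks keyed on variable names rather than on call order."""

    @patch.dict(os.environ, {"REPOSITORY": "repo, repo2"})
    def test_comma_separated_value(self):
        self.assertEqual(read_repository_list(), ["repo", "repo2"])

    @patch.dict(os.environ, {}, clear=True)
    def test_unset_value(self):
        self.assertEqual(read_repository_list(), [])
```

Saved to a file, the sketch can be run with `python -m unittest`. The trade-off is that `patch.dict` only works when the code under test reads the real environment, which holds for `os.getenv`.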