Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change repository layout #22

Merged
merged 10 commits into from
Apr 22, 2021
Merged

Change repository layout #22

merged 10 commits into from
Apr 22, 2021

Conversation

CasperWA
Copy link
Contributor

Closes #21.

Change and update the repository according to the suggested layout in #21.
Remove/Cull unnecessary files and implement minor updates.

Divide into `dic2owl` and `ontology`.
Remove `.gitlab-ci.yml` completely as it's not used.
Update GitHub Actions CI (`ci_emmocheck.yml`) to point at new paths and
to use Python 3.8.
Changing some naming from `domain-crystallography` -> `cif-ontology`.
Copy link
Collaborator

@jesper-friis jesper-friis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except from the comment, I think it looks very good

[crystallography.owl](https://raw.githubusercontent.com/emmo-repo/domain-crystallography/master/crystallography.ttl)
in Protege, the correct version of emmo-inferred will be downloaded
and imported.
When opening [cif.ttl](https://raw.githubusercontent.com/emmo-repo/CIF-ontology/main/ontology/cif.ttl) in Protégé, the correct version of emmo-inferred will be downloaded and imported.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to store the generated ontology in the github repo? I think it would be better to create a folder structure on GitHub Pages with the ontology files for each release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for the README I simply updated the text to match the current files, I didn't update any meaning.
The README needs a proper overhaul all the way through to catch things like this.

Feel free to checkout the branch, update the README as you see fit, and commit it here.

The PR is still a draft, I'm thus not finished tinkering with the layout, but we seem to past the first hurdle (#23). From there I'll wrap the script in some CLI and add a setup.py file, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to store the generated ontology in the github repo? I think it would be better to create a folder structure on GitHub Pages with the ontology files for each release.

Note that GitHub Pages takes all its files from a branch in this repository.
So all the files we serve on GitHub pages will be present in the repository, but only in a specific branch. I don't know if there's a way to let git know locally to not automatically fetch a specific branch from a remote, but if not, then there's no getting around the fact that a lot of data will have to be downloaded every time a git clone is performed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two good points:

  • You are right, the README file needs a proper overhaul anyway. Lets fix that in a new PR to not delay this one.
  • I guess that you are right that we probably don't want to use the separate repo for GH Pages: https://github.com/emmo-repo/emmo-repo.github.io since that is used for top and mid-level EMMO. In that case we have to use a branch. I am not so concerned about the released versions since they would be rather limited, but rather about the continuous changes to the generated ontology for each push. Is it possible to push to GH Pages without tracking history or does it works exactly like an ordinary branch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works more or less like an ordinary branch, but I think we'll do it in a non-working way. Meaning, we can generate all the files needed for the branch, delete everything else and only have that. This will then be force-pushed, and as such the branch will follow the development of the main branch whenever an ontology file is updated.

Is it correctly understood that your concern is towards the changes that arise when we update an ontology file? I think the size of the repository will not change much from these changes - that's the whole point of git, to only store the diff, not the whole file - only once. The issue could arise if we keep making new large files with new names etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correctly understood. My concern is that the changes in the generated files may be quite large even for a very small change in the code that generates the file.

Copy link
Contributor Author

@CasperWA CasperWA Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Due to the weird way different Protégé versions stores files?
Could we maybe have a pre-commit hook that always re-reads and writes any changed ontology files in a specific way? In this way it should be fine no matter the Protégé version, since the change will happen upon git commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a very nice feature - not only for the CIF ontology but for all our ontology repositories.

Run pre-commit with black that converts all strings to be
double-quotations.

Update path to ontology folder.
Implement parameters for generate_cif.main() in order to use it from
the CLI.
@CasperWA CasperWA marked this pull request as ready for review April 21, 2021 16:04
@CasperWA
Copy link
Contributor Author

CasperWA commented Apr 21, 2021

@jesper-friis since your last review, I have now added all the boiler plate stuff to make the Python module a proper package and also allowed myself to create a CLI using argparse that calls the generate_cif.main() function.

While this could have been implemented as the CLI being dic2owl and it simply ran the generate_cif file, I think having the ability to parse inputs is nice.

To check out the CLI type dic2owl --help (note, you need to pip install the package before you can use the CLI, as it will bind through the setuptools installation.

The equivalent of just running python generate_cif.py is now dic2owl cif_core.dic.

Note that the git diff can be a bit confusing if you look at all the commits. This is due to the implementation of pre-commit that runs Black and flake8 to stylize and lint the Python code upon performing git commit. One just needs to install it first by running pre-commit install after installing the python package adding the dev extra. I.e.:

$ pip install -e ./dic2owl[dev]
$ pre-commit install

Done.

To get a better git diff for the review, check out individual commits to see the actual content changes.

@jesper-friis jesper-friis merged commit 9832141 into main Apr 22, 2021
@jesper-friis jesper-friis deleted the close_21_change-repo-layout branch April 22, 2021 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Repository hierarchy/layout
2 participants