How to contribute to Voc4Cat?¶
The contribution process is almost the same as in source code repositories. Below we describe the steps in detail so that beginners without git or GitHub experience can follow. The essentials of the contribution process are:
Request a range of IDs (for new concepts): Create an issue
Download the current vocabulary Excel file: voc4cat.xlsx
Edit the Excel file to add/modify concepts
Submit your Excel file in a pull request placing your changed
voc4cat.xlsxin theinbox-excel-vocabs/folder (keep the filename)
Vocabulary contribution workflow & Continuous Integration (CI) Pipeline.¶
Step-by-step contribution guide¶
The main steps that a community member needs to follow to contribute to Voc4Cat are illustrated in the diagrams below.
Steps 1-4 are performed only once and are intended to set up the GitHub environment for the contributor.
flowchart LR
A1[Step 1<br>Create GitHub<br>account] --> A2[Step 2<br>Access<br>repository] --> A3[Step 3<br>Request<br>ID range] --> A4[Step 4<br>Fork repository<br>set up remotes]
The actual vocabulary contribution process involves steps 5-7 which are repeated with each new submission.
flowchart LR
B5[Step 5<br>Get latest Excel template] --> B6[Step 6<br>Edit concepts in Excel] --> B7[Step 7<br>Open PR and iterate]
What happens during the contribution steps is visualized in the next figure.
Initial steps (one-time)¶
Step 1 – Create a GitHub account¶
This step happens in the browser only. No commands. Visit github.com and create a personal account.
Caution
It is essential to use a personal GitHub account. Contributing from organization accounts does not work due to GitHub limitations (status Nov-2025, see discussion).
Step 2 – Access the Voc4Cat repository¶
While signed in, open the repository of Voc4Cat https://github.com/nfdi4cat/voc4cat. Familiarize yourself with the README and existing issues.
Contributing or participating in discussions requires you to be signed in with your personal GitHub account.
Step 3 – Request an ID range¶
You will need unique IDs for each new concept that you want to add to Voc4Cat.
Voc4cat gives each user their personal ID range(s) to pick IDs from. This way contributions can happen asynchronously without needing additional coordination between users.
While on the Voc4Cat repository page https://github.com/nfdi4cat/voc4cat,
click on Issues (top left corner), then
click on New issue (green button on the top right corner)
finally click on the “Request a range of IDs” option.
This opens a dialog window:
Enter the number of IDs that you need for new concepts. The number does not have to be exact. Request enough if you plan several contributions in the future (typical numbers are 10…100).
Add your Open Researcher and Contributor ID (ORCID identifier). This is optional but recommended.
Add the Research Organization Registry (ROR) identifier of the organization you work for.
Add any additional information that you deem necessary such as what you plan to work on.
Step 4 – Fork the repository¶
Copying (”forking”) Voc4Cat’s repository to your workspace allows modifications independent of the original repository. Forking is an essential step in GitHub´s collaborative development model.
To fork Voc4Cat, while being logged in,
Navigate to the nfdi4cat/voc4cat repository.
Click on the Fork button (top right corner).
After creation, your fork will be available as new repository in your workspace at https://github.com/<your_username>/voc4cat.
To prepare for contributing from your local computer
Clone your fork:
git clone https://github.com/<your_username>/voc4cat.git
Add “upstream” remote:
cd voc4cat
git remote add upstream https://github.com/nfdi4cat/voc4cat.git
No action required.
Steps for every contribution¶
Step 5 – Get the latest versions¶
First, download the current version of Voc4Cat as xlsx (Excel) file. Go to the voc4cat-homepage, and download the file by clicking on the middle Vocabulary card (or use this direct download link).
Second you have to update your voc4cat-fork. While not essential it is strongly recommended that the vocabulary stored in the repository as RDF/turtle files matches the concepts stored in the downloaded xlsx file.
From the root directory of your cloned repo:
Update local main:
git fetch upstream --tags
git switch main
git pull upstream main`
Create a feature branch for the contribution (reference the issue solved by the PR):
git switch -c issue###_<short_topic>`
Place the freshly download voc4cat xlsx-file in
inbox-excel-vocabs/voc4cat.xlsx. Do not change the filename.
Open your fork of voc4cat https://github.com/your_username/voc4cat in the browser.
Sync your fork with upstream nfdi4cat/voc4cat. Press the green “Sync fork” button which makes your fork/clone an exact copy of upstream
mainagain (nfdi4cat/voc4cat). If this fails, see Troubleshooting.
Be careful when you “Sync fork” in GitHub UI.
It should look like this screenshot. Your main branch should only be behind the forked repository (not ahead of it).

If your main-branch is ahead of main in the forked repo, do not press “Sync fork” but go to Troubleshooting.

Step 6 – Add / edit concepts in Excel¶
Introduction to the xlsx vocabulary file (click to unfold).
The downloaded Excel file consists of seven sheets:
Introduction: General information regarding Voc4Cat.
Help: A guide on how to properly fill the necessary information in Voc4Cat Excel sheets.
Concept Scheme: Collects the top-level information about the vocabulary.
Concepts: A concept according to SKOS is a unit of thought, idea, meaning, or category of an object or event which underlines a knowledge organization system. This sheet collects concept descriptions, (optionally) their translations to other languages, simple broader / narrower relations between the concepts and provenance information. This is the sheet where most edits are made.
Additional Concept Features: This sheet allows to add more relations between concepts. These extra relations are adapted from the SKOS specification, and they include:
Related Matches: Mapping with this cell asserts a related or associated relationship to the concepts listed. It is important to note that this relation is not transitive (if concept A has a close match with concept B, and concept B has a close match with concept C, it doesn’t necessarily follow that concept A has a close match with concept C).
Close Matches: Mapping with this cell means the concepts are sufficiently similar that they can be used interchangeably. Close matches are also not transitive.
Exact Matches: This is a subset of a close match. Concepts are to be added if they are similar enough to be used interchangeably but have an even higher degree of closeness that includes transitivity, e.g., if concept A is an exact match for concept B, and B is an exact match for C, then A is also an exact match for C.
Broader Matches: Broader match allows the user to assert that a concept is broader in meaning to another concept. This is the inverse of a narrower relation.
Narrower Matches: Narrower match allows the user to assert that a concept is narrower in meaning to another concept. This is the inverse of a broader relation.
Collections: Collections are an easy way to group together concepts for various purposes. If collection rows are added to the sheet, all cells must be filled out.
Preferred Label: A simple one-line title for the Collection.
Definition: The defining description of this Collection that may be longer and include line-breaks.
Member IRIs: A comma-separated list of the Concept IRIs of all Concepts belonging to this collection.
Provenance: A note on the source of this Collection.
Prefix Sheet: This sheet is for defining a mapping between short prefixes and namespaces which are the basis for using “compact URI” also called “CURIE”. For Voc4Cat we have registered “voc4cat” as prefix in the Bio Registry (bioregistry.io) and the compact URI form would be “voc4cat:xxxxxxx” (e.g., voc4cat:0007001 for the concept “heterogeneous catalysis” with a full URI: https://w3id.org/nfdi4cat/voc4cat_0007001). For more on compact URIs, see https://www.w3.org/TR/2010/NOTE-curie-20101216/.
The Concepts sheet is where most contributions by users will be made. Detailed descriptions on how to properly fill these columns can be found in paragraph 6.6. There are nine columns used in the “Concepts” sheet:
Concept IRI: Must be a valid URI. This is based on the Vocabulary URI (Uniform Resource Identifier) and for new contributions must align with the requested ID range.
Preferred Label: A simple one-line label for the concept.
Pref. Label Language Code: Two or three letter language code according to ISO 639-2 or 639-3 for the Preferred Label. If no language code is given, “en” is assumed as default (for English).
Definition: The defining description of the Concept.
Def. Label Language Code: Two or three letter language code according to ISO 639-2 or 639-3 for the Definition. If no language code is given, “en” is assumed as default (for English). Translations of a concept into different languages use the same IRI but they occupy different rows in the template. As an example, as shown in the following table for the “heterogenous catalysis” two translations of the concept name and the definition are available (in English -en- and in German -de-) but they both use the same concept URI (voc4cat:0007001).
| Concept Compact IRI | Preferred Label | Pref. Label Language Code | Definition | Def. Language Code |
| voc4cat:0007001 | heterogeneous catalysis | en | A process during which a chemical reaction is accelerated by the presence of a catalyst in a different phase than the reactant. The reaction generally proceeds at the interface. | en |
| voc4cat:0007001 | heterogene Katalyse | de | Ein Prozess, bei dem eine chemische Reaktion durch das Vorhandensein eines Katalysators in einer anderen Phase als der Reaktant beschleunigt wird. Die Reaktion findet im Allgemeinen an der Grenzfläche statt. | de |
Alternate Labels: Any other names (labels) for this Concept. Separated by commas. If the user wants to use a comma as part of the Alternate label, escape it with “\ like in: “one\two”.
Children IRIs: A list of IRIs of children of this Concept, separated by commas. This creates a hierarchical relationship between the terms. In SKOS terminology, the Concept is broader than its Concept-Child and in turn the Concept-Child is narrower than the Concept. Note, broader/narrower are not transitive.
Provenance: A note on the source of this concept. This should be an identifier for the person and a provenance note. As an identifier, an ORCID ID (with or without the https://orcid.org/ part) or a GitHub name should be used. Multiple entries must be separated by comma.
Source Vocab URI: If this Concept is imported from another vocabulary, this should be the URI of the concept in the other vocabulary. Before including content from other sources, make sure that such re-use is permitted by their license.
Edit the xlsx vocabulary file locally in your spreadsheet program of choice. Follow the Guidelines page. These guidelines for suggesting, adding, and editing content to Voc4Cat guarantee consistency and coherence in the selection and structuring of concepts.
Vocabulary xlsx file rules
Provide clear definition text (concise, domain-relevant).
Keep changes focussed. Solve one issue or add/edit a set of strongly related concepts.
Not more than 20 new concepts per PR. This keeps the review process manageable for you and the curators. Less is better!
Do not change the Excel template structure (sheet names, header rows, column order).
Step 7 – Create & iterate on the Pull Request¶
After you finished the additions and edits in the Voc4Cat xlsx file, the update file can be submitted. This happens in a “Pull request”.
Make sure that you are in your fork .
Add and commit the xlsx file to your local feature branch.
If you changed other files as part of your contribution, git add them as well (before commit).
git add inbox-excel-vocabs/voc4cat.xlsx
git commit -m "A meaningful commit message"
Push the branch to GitHub (first time):
git push -set-upstream origin <your-feature-branch-name>
and for later updates:
git push
Open the GitHub page of your fork
https://github.com/<your_username>/voc4catand - if you already created a branch for your contribution - switch to the branch.Open the “inbox-excel-vocabs” folder
Click click on the “Add file” then “Upload files” to open a file a file submission page.
Upload the voc4cat.xlsx file (the file name must be exactly this).
Commit the uploaded file making sure that you commit to your feature-branch (not
main). If you have to create a new branch, you may change the default name to a more reasonable name describing your contribution in the formissue###_<short_title>.
Create the pull request
The easiest way is go to the original nfdi4cat/voc4cat repository. A notification will show up (1st screenshot) and if you click “Compare & pull request” the defaults will be correct (2nd screenshot).
Vocabulary contribution workflow & Continuous Integration (CI) Pipeline¶
Check that the source is your feature branach and the target is nfdi4cat/voc4cat¶
Here are some points to check before you finally submit the PR:
PR title and description summarize motivation and scope.
Vocabulary file is
inbox-excel-vocabs/voc4cat.xlsx(name and path unchanged).IDs of added concepts are within my allocated range.
Each concept has prefLabel, definition, and a broader chain to a top concept.
Contribution is focused/small enough for review (split if needed).
No
.ttlfiles edited.
Tip
Open your PR as “Draft” if you want early feedback. Mention any specific questions. Curators are informed automatically about every new PR, also drafts.
What happens next?
When the PR is submitted, an automated Continuous Integration (CI) pipeline is triggered. The pipeline first checks the submitted Excel file for various errors.
If errors are detected, github reports the failed job with a red cross, see example

To find out more about the reasons for the failure, click on the red cross ❌. This brings you to a page with a run log that typically gives enough information to understand the reason for the failure. In addition to the run log, we also create a so called job artifact which can be accessed by going to the job “Summary” from the run log page. The artifact is a zip-file with several files that help to diagnose the source of the failure. It contains:
log-file of the run
xlsx vocabulary file enriched with error information in the failing row (for some errors).
HTML vocabulary documentation (useful to check the broader/narrower hierarchy)
If the run succeeded you see a similar job message but with a green checkmark:

The second line with the GitHub icon is a commit created by the pipeline in which
the xlsx vocabulary file gets removed and
the RDF/turtle (ttl-files) with the SKOS representation are added.
If you work on a local checkout on you computer, you need to pull this commit made on GitHub to your local clone with git pull.
Later updates on your PR
For updates: Switch to the previously created feature branch, fix the xlsx file (or other files) Excel, commit and push it again.
After merge¶
CI builds and publishes the updated development version.
Go to the homepage which is now updated with your contribution.
Download new Excel if planning further contributions.
Note that this development version is not published to Skosmos at ZB Med or the TIB Technology service. Only releases are published to these services.
Contributing to the homepage¶
We use Sphinx as documentation builder with the furo theme and MySt to support extended markdown.
To get an overview about the styling features supported visit the furo theme documentation or the MyST authoring documentation.
To help you checking your changes before making a pull request, we provide instructions how to build the documentation locally.
Troubleshooting¶
Fork out-of-sync If you ever committed to the main branch in your repository directly, your history is no longer the same as in the upstream voc4cat. This is a problem because it creates a merge conflict. The fix is to reset your main match the upstream one exctly. You can do this via the following git command:
git fetch upstream git reset --hard upstream/main git push --force
Questions or issues?¶
Vocabulary discussions: https://github.com/nfdi4cat/voc4cat/issues
Tooling improvements (voc4cat-tool): https://github.com/nfdi4cat/voc4cat-tool/issues
Template feedback (voc4cat-template): https://github.com/nfdi4cat/voc4cat-template/issues