Setting up a Vocabulary Project¶
This section is about how to create a new vocabulary project using the voc4cat-template repository.
The voc4cat-template provides a complete GitHub repository structure with CI/CD workflows, issue templates, and documentation generation.
Creating your repository¶
Create an empty repository on GitHub first, then:
git init my-vocabulary
cd my-vocabulary
git pull https://github.com/nfdi4cat/voc4cat-template
git remote add origin https://github.com/my-gh-name/my-vocabulary.git
git push -u origin main
This preserves the template’s commit history, making future syncs easier. By syncing you can update your repositories with features or bug fixes made in the template.
What’s included¶
my-vocabulary/
├── idranges.toml # Vocabulary configuration
├── justfile # Local development commands
├── README.md # Customize for your project
├── README_TEMPLATE.md # Template documentation
├── inbox-excel-vocabs/ # Drop xlsx files here for PRs
├── vocabularies/ # Converted turtle files
├── templates/ # xlsx template
└── .github/
├── workflows/ # CI/CD pipelines
│ ├── ci-pr.yml # Pull request validation
│ ├── merge.yml # Main branch build
│ └── publish.yml # Release publication
└── ISSUE_TEMPLATE/ # Issue forms
Initial configuration¶
Edit idranges.toml with your vocabulary metadata:
config_version = "v1.0"
[vocabs.myvocab]
id_length = 6
permanent_iri_part = "https://example.org/myvocab_"
# Required metadata
vocabulary_iri = "https://example.org/myvocab"
prefix = "myvoc"
title = "My Vocabulary"
description = "A vocabulary for..."
created_date = "2025-01-01"
creator = "Your Name https://orcid.org/0000-0000-0000-0000"
repository = "https://github.com/yourorg/my-vocabulary"
# Namespace prefixes
[vocabs.myvocab.prefix_map]
myvoc = "https://example.org/myvocab_"
# Your initial ID range
[[vocabs.myvocab.id_range]]
first_id = 1
last_id = 100
gh_name = "your-username"
orcid = "0000-0000-0000-0000"
See Schemas for the complete documentation of the configuration fields.
When your are done, add this configuration to the repository:
# Commit & push
git add idranges.toml
git commit -m "Initial vocabulary config"
git push
Generate your first vocabulary¶
First, add the configuration for your vocabularies to the repository:
# Generate xlsx template from configuration
voc4cat template --config idranges.toml --outdir vocabularies/
# Edit the xlsx file in Excel or LibreOffice
# Add your initial concepts
Configure GitHub settings¶
Branch protection¶
It is recommended to prevent accidental changes to the main branch. On Github this can be achieved via branch preptection rules.
Settings → Branches → Add rule for main:
Require pull request reviews before merging
Require status checks to pass
Do not allow bypassing the above settings
GitHub Pages¶
GitHub pages are used to serve the documentation and vocabulary.
Settings → Pages:
Source: Deploy from a branch
Branch:
gh-pages/rootSave
The gh-pages branch will be created automatically on the first successful merge of a pull request with vocabulary data.
Repository settings¶
Recommended settings:
Enable Issues (for ID range requests and discussions)
Disable Wiki (optional, storing documentation in the code-repository is typically preferred)
Disable “Allow merge commits” (optional, the remaining options “squash” and “rebase” give a cleaner git history)
Customize your project¶
Update README.md¶
Replace the template README with your own:
Describe your vocabulary’s scope and purpose
Add contribution guidelines
Include links to documentation on gh-pages
Review/change the license
Add badges (CI status, DOI if available)
Adjust issue templates¶
The templates in .github/ISSUE_TEMPLATE/ can be customized:
request-ids.yaml- ID range request formimprovement.yaml- Vocabulary improvement suggestionsbug.yaml- Bug reports
Customize the xlsx template¶
While most sheets in the xlsx vocdabulary are created automatically, you can still provide a base-template to which the auto-created sheets will be added. This could for example be used to provide a help sheet.
Drop a custom template file template_myvocab.xlsx to the folder templates/.
The sheets that you put into this template may not have the same name as any of the auto-generated sheets
(see list in migration guide).
Keeping in sync with the template¶
It is suggested to merge the changes from the template repository before every new release of your vocabulary. This ensures that the centrally maintained features and best practices trickle into your project.
To review the changes made in a new template version 1.x.y and compare to when you last pulled it use:
# View changes
git fetch https://github.com/nfdi4cat/voc4cat-template tag v1.x.y
git diff ...FETCH_HEAD
If you decide to take over the changes, pull them into your repository and push them to GitHub.
# Apply changes
git pull https://github.com/nfdi4cat/voc4cat-template tag v1.x.y
# -- resolve any conflicts --
git push
Conflicts may occur if you and the template-maintainers have changed the same files since the last template-update.
Handling merge conflicts¶
If conflicts occur during sync:
Resolve conflicts in your editor
Be careful to keep your customizations (README.md, idranges.toml)
Accept template updates for workflows and justfile
Commit the merge
IRI design¶
Plan your vocabulary’s IRI structure early. The default pattern:
https://example.org/myvocab_0000001
└─────────┬────┘
permanent_iri_part + concept ID
For persistent IRIs, consider using w3id.org redirect service:
https://w3id.org/yourorg/myvocab_0000001
This allows content negotiation (HTML for browsers, RDF for machines).
See also IRI design (in voc4cat-template).
Next steps¶
Deeper Customization
Study Voc4Cat, the vocabulary for Catalysis, and look at its code for inspiration how to further customize your vocabulary. For, example you may also want to include a custom homepage built with Sphinx/MyST.
Also take a look at its w3id.org redirect configuration if you struggle to create your own.
Contributing to a Vocabulary - Learn the contribution workflow
Running Workflows Locally - Set up local development
Maintaining a Vocabulary - Maintainer guide for managing contributors and releases