DVC (Data Version Control)

We use DVC to version-control large data files (models, input datasets) without storing them in Git. DVC tracks file hashes in .dvc files committed to Git, while the actual data lives on remote storage.

Authentication

Create a file .dvc/config.local (already gitignored) with credentials for the remotes:

['remote "goodcloud"']
    user = nhi_api
    password = <RIBASIM_NL_CLOUD_PASS>
['remote "modeldata"']
    user = nhi_api
    password = <RIBASIM_NL_CLOUD_PASS>
['remote "minio"']
    region = eu-west-1
    endpointurl = https://s3.deltares.nl
    access_key_id = <MINIO_ACCESS_KEY>
    secret_access_key = <MINIO_SECRET_KEY>

The password is the same as RIBASIM_NL_CLOUD_PASS in your .env file. The MinIO keys correspond to MINIO_ACCESS_KEY and MINIO_SECRET_KEY in .env.
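Since all three values already live in .env, the file can be generated instead of typed by hand. The following is a minimal sketch, assuming .env consists of plain KEY=value lines that a POSIX shell can source (no spaces or unquoted special characters in the values). The function name write_dvc_config_local is hypothetical and not part of the repository.

```shell
# Hypothetical helper (not part of the repository): write .dvc/config.local
# from the variables in .env. Assumes .env holds plain KEY=value lines.
write_dvc_config_local() {
    env_file="${1:-./.env}"
    out_file="${2:-.dvc/config.local}"

    # Work in a subshell so the secrets do not leak into the calling shell.
    (
        . "$env_file" || exit 1

        # Stop with an error message if a required variable is missing.
        : "${RIBASIM_NL_CLOUD_PASS:?missing in $env_file}"
        : "${MINIO_ACCESS_KEY:?missing in $env_file}"
        : "${MINIO_SECRET_KEY:?missing in $env_file}"

        mkdir -p "$(dirname "$out_file")"
        cat > "$out_file" <<EOF
['remote "goodcloud"']
    user = nhi_api
    password = $RIBASIM_NL_CLOUD_PASS
['remote "modeldata"']
    user = nhi_api
    password = $RIBASIM_NL_CLOUD_PASS
['remote "minio"']
    region = eu-west-1
    endpointurl = https://s3.deltares.nl
    access_key_id = $MINIO_ACCESS_KEY
    secret_access_key = $MINIO_SECRET_KEY
EOF
        # The file contains credentials, so restrict it to the current user.
        chmod 600 "$out_file"
    )
}
```

Calling write_dvc_config_local from the repository root, after the function has been defined in the current shell, reads ./.env and writes .dvc/config.local. Both paths can be overridden with the first and second argument.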

Note

Since pixi run install runs dvc pull, this configuration is needed before that step.

Pulling data

After authentication is set up, pull all tracked data:

pixi run dvc pull

If you have local changes that you want to overwrite, add --force.
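A pull without credentials fails only after DVC has tried to contact the remote, so it can help to check for the credentials file first. The following is a sketch of such a guard; pull_tracked_data is a hypothetical wrapper, not something the repository provides.

```shell
# Hypothetical guard: refuse to pull when the credentials file is missing.
pull_tracked_data() {
    if [ ! -f .dvc/config.local ]; then
        echo "Missing .dvc/config.local; set up authentication first." >&2
        return 1
    fi
    # "$@" forwards optional arguments such as --force or a .dvc target.
    pixi run dvc pull "$@"
}
```

For example, pull_tracked_data --force overwrites local changes, and pull_tracked_data data/Basisgegevens/Top10NL/top10nl_Compleet.gpkg.dvc pulls only that one tracked file.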

Remotes

Three remotes are configured in .dvc/config:

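| Remote          | URL                               | Purpose                           |
|-----------------|-----------------------------------|-----------------------------------|
| minio (default) | s3://ribasim-nl/dvc               | Primary DVC cache on MinIO        |
| goodcloud       | The Good Cloud /dvc               | Legacy DVC storage                |
| modeldata       | The Good Cloud /Ribasim modeldata | Source data not under DVC control |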

Pipeline

The DVC pipeline is defined in dvc.yaml. Stages run in dependency order, with each stage starting only after the stages and data files it depends on:

flowchart LR
    node1["bathymetry"]
    node2["bergend@aa_en_maas"]
    node3["bergend@brabantse_delta"]
    node4["bergend@de_dommel"]
    node5["bergend@drents_overijsselse_delta"]
    node6["bergend@hunze_en_aas"]
    node7["bergend@limburg"]
    node8["bergend@noorderzijlvest"]
    node9["bergend@rijn_en_ijssel"]
    node10["bergend@stichtse_rijnlanden"]
    node11["bergend@vallei_en_veluwe"]
    node12["bergend@vechtstromen"]
    node13["data/Basisgegevens/Baseline/baseline-nl_land-j23_6-v1/baseline.gdb.dvc"]
    node14["data/Basisgegevens/BuitenlandseAanvoer/aangeleverd/BuitenlandseAanvoer_V5.xlsx.dvc"]
    node15["data/Basisgegevens/LHM/4.3/input/LHM_data.tif.dvc"]
    node16["data/Basisgegevens/RWZI/aangeleverd/locaties/RWZI_coordinates.geojson.dvc"]
    node17["data/Basisgegevens/Top10NL/top10nl_Compleet.gpkg.dvc"]
    node18["data/Basisgegevens/VanDerGaast_QH/spafvoer1.tif.dvc"]
    node19["data/Basisgegevens/profielen.dvc"]
    node20["dynamic@aa_en_maas"]
    node21["dynamic@brabantse_delta"]
    node22["dynamic@de_dommel"]
    node23["dynamic@drents_overijsselse_delta"]
    node24["dynamic@hunze_en_aas"]
    node25["dynamic@limburg"]
    node26["dynamic@noorderzijlvest"]
    node27["dynamic@rijn_en_ijssel"]
    node28["dynamic@stichtse_rijnlanden"]
    node29["dynamic@vallei_en_veluwe"]
    node30["dynamic@vechtstromen"]
    node31["feedback@amstel_gooi_en_vecht"]
    node32["feedback@delfland"]
    node33["feedback@hollands_noorderkwartier"]
    node34["feedback@hollandse_delta"]
    node35["feedback@rijnland"]
    node36["feedback@rivierenland"]
    node37["feedback@scheldestromen"]
    node38["feedback@schieland_en_de_krimpenerwaard"]
    node39["feedback@wetterskip_fryslan"]
    node40["feedback@zuiderzeeland"]
    node41["forcing@amstel_gooi_en_vecht"]
    node42["forcing@delfland"]
    node43["forcing@hollands_noorderkwartier"]
    node44["forcing@hollandse_delta"]
    node45["forcing@rijnland"]
    node46["forcing@rivierenland"]
    node47["forcing@scheldestromen"]
    node48["forcing@schieland_en_de_krimpenerwaard"]
    node49["forcing@wetterskip_fryslan"]
    node50["forcing@zuiderzeeland"]
    node51["hws_demand"]
    node52["hws_transient"]
    node53["koppelen"]
    node54["parameterized@aa_en_maas"]
    node55["parameterized@brabantse_delta"]
    node56["parameterized@de_dommel"]
    node57["parameterized@drents_overijsselse_delta"]
    node58["parameterized@hunze_en_aas"]
    node59["parameterized@limburg"]
    node60["parameterized@noorderzijlvest"]
    node61["parameterized@rijn_en_ijssel"]
    node62["parameterized@stichtse_rijnlanden"]
    node63["parameterized@vallei_en_veluwe"]
    node64["parameterized@vechtstromen"]
    node65["profiles@amstel_gooi_en_vecht"]
    node66["profiles@delfland"]
    node67["profiles@hollands_noorderkwartier"]
    node68["profiles@hollandse_delta"]
    node69["profiles@rijnland"]
    node70["profiles@rivierenland"]
    node71["profiles@scheldestromen"]
    node72["profiles@schieland_en_de_krimpenerwaard"]
    node73["profiles@wetterskip_fryslan"]
    node74["profiles@zuiderzeeland"]
    node75["rwzi"]
    node76["samenvoegen"]
    node1-->node51
    node2-->node20
    node3-->node21
    node4-->node22
    node5-->node23
    node6-->node24
    node7-->node25
    node8-->node26
    node9-->node27
    node10-->node28
    node11-->node29
    node12-->node30
    node13-->node51
    node14-->node20
    node14-->node21
    node14-->node22
    node14-->node23
    node14-->node24
    node14-->node25
    node14-->node26
    node14-->node27
    node14-->node28
    node14-->node29
    node14-->node30
    node15-->node2
    node15-->node3
    node15-->node4
    node15-->node5
    node15-->node6
    node15-->node7
    node15-->node8
    node15-->node9
    node15-->node10
    node15-->node11
    node15-->node12
    node16-->node75
    node17-->node54
    node17-->node55
    node17-->node56
    node17-->node57
    node17-->node58
    node17-->node59
    node17-->node60
    node17-->node61
    node17-->node62
    node17-->node63
    node17-->node64
    node18-->node2
    node18-->node3
    node18-->node4
    node18-->node5
    node18-->node6
    node18-->node7
    node18-->node8
    node18-->node9
    node18-->node10
    node18-->node11
    node18-->node12
    node19-->node65
    node19-->node66
    node19-->node67
    node19-->node68
    node19-->node69
    node19-->node70
    node19-->node71
    node19-->node72
    node19-->node73
    node19-->node74
    node20-->node76
    node21-->node76
    node22-->node76
    node23-->node76
    node24-->node76
    node25-->node76
    node26-->node76
    node27-->node76
    node28-->node76
    node29-->node76
    node30-->node76
    node31-->node65
    node32-->node66
    node33-->node67
    node34-->node68
    node35-->node69
    node36-->node70
    node37-->node71
    node38-->node72
    node39-->node73
    node40-->node74
    node41-->node76
    node42-->node76
    node43-->node76
    node44-->node76
    node45-->node76
    node46-->node76
    node47-->node76
    node48-->node76
    node49-->node76
    node50-->node76
    node51-->node52
    node51-->node53
    node52-->node76
    node54-->node2
    node55-->node3
    node56-->node4
    node57-->node5
    node58-->node6
    node59-->node7
    node60-->node8
    node61-->node9
    node62-->node10
    node63-->node11
    node64-->node12
    node65-->node41
    node66-->node42
    node67-->node43
    node68-->node44
    node69-->node45
    node70-->node46
    node71-->node47
    node72-->node48
    node73-->node49
    node74-->node50
    node75-->node20
    node75-->node21
    node75-->node22
    node75-->node23
    node75-->node24
    node75-->node25
    node75-->node26
    node75-->node27
    node75-->node28
    node75-->node29
    node75-->node30
    node75-->node41
    node75-->node42
    node75-->node43
    node75-->node44
    node75-->node45
    node75-->node46
    node75-->node47
    node75-->node48
    node75-->node49
    node75-->node50
    node75-->node51
    node76-->node53
Figure 1: DVC pipeline DAG

Reproduce the full pipeline:

pixi run dvc repro

Reproduce a single stage:

pixi run dvc repro dynamic@aa_en_maas
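Reproducing a single stage also brings its upstream stages up to date if they changed. In Figure 1, dynamic@aa_en_maas depends on bergend@aa_en_maas and rwzi, and bergend@aa_en_maas in turn depends on parameterized@aa_en_maas. To see what a repro would touch before running it, the dvc dag and dvc repro --dry commands can be combined. The wrapper below is a sketch; preview_stage is a hypothetical name.

```shell
# Hypothetical helper: inspect a stage before reproducing it.
preview_stage() {
    stage="$1"
    if [ -z "$stage" ]; then
        echo "usage: preview_stage <stage>" >&2
        return 1
    fi
    # Print the part of the DAG that leads up to the stage.
    pixi run dvc dag "$stage" || return 1
    # List the commands repro would run, without executing them.
    pixi run dvc repro --dry "$stage"
}
```

For example, preview_stage dynamic@aa_en_maas prints the upstream graph and the pending commands for that stage. To rerun only the named stage and skip the dependency check, dvc repro also accepts --single-item (-s).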

Importing data from remote storage

Use dvc import-url to download a file or directory from a remote URL and place it under DVC control. The remote://modeldata alias refers to the modeldata remote, i.e. The Good Cloud storage that holds the source data. The -f flag overwrites an existing local copy:

pixi run dvc import-url -f remote://modeldata/Zuiderzeeland/modellen/Zuiderzeeland_parameterized_2025_9_0 data/Zuiderzeeland/modellen/

Pushing changes

After producing new outputs (running pipeline stages or adding new data), push to the remote:

pixi run dvc push

Do this before pushing the updated dvc.lock file to Git, so that the data hashes it references are available to everyone.
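The ordering can be made hard to get wrong by chaining the steps so that Git is only touched after the data upload succeeded. The following is a minimal sketch, assuming only dvc.lock changed; publish_outputs is a hypothetical helper, and newly added .dvc files would need to be staged as well.

```shell
# Hypothetical helper: upload data first, then share the lock file.
publish_outputs() {
    message="$1"
    if [ -z "$message" ]; then
        echo "usage: publish_outputs <commit message>" >&2
        return 1
    fi
    # 1. Upload the data so the hashes in dvc.lock resolve for everyone.
    pixi run dvc push || return 1
    # 2. Only then commit and push the updated lock file.
    git add dvc.lock || return 1
    git commit -m "$message" || return 1
    git push
}
```

If dvc push fails, for instance because of missing credentials, the function stops before any Git command runs.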