Wildlife-Detection/MODEL_CARD.md
2026-04-01 15:35:17 +02:00

# Model Card: EfficientNet V2-S — Wild Forest Animals
This document describes the image classifier used by the Wildlife Monitoring Dashboard (0HM340 Human-AI Interaction, TU/e). It follows the spirit of [Model Cards for Model Reporting](https://arxiv.org/abs/1810.03993).
---
## Model summary
| Field | Value |
|--------|--------|
| **Model family** | EfficientNet V2-S (`torchvision.models.efficientnet_v2_s`) |
| **Pre-training** | ImageNet-1K (`EfficientNet_V2_S_Weights.IMAGENET1K_V1`) |
| **Task** | Multi-class image classification (single label per image) |
| **Output** | 7 logits → softmax probabilities over fixed classes |
| **Weights file** | `efficientnet_v2_wild_forest_animals.pt` (state dict only; not shipped in the repo) |
| **Input** | RGB images resized to **224×224**, ImageNet normalization |
| **Framework** | PyTorch |
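
The ImageNet normalization listed above subtracts fixed per-channel means and divides by fixed per-channel standard deviations. A minimal pure-Python sketch of that arithmetic (the sample pixel value is illustrative):

```python
# ImageNet per-channel statistics, as used by torchvision's default transforms
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Normalize one RGB pixel already scaled to [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, MEAN, STD))

# A mid-grey pixel: each channel shifts by the mean and rescales by the std
print(normalize_pixel((0.5, 0.5, 0.5)))
```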
---
## Intended use
- **Primary:** Educational / research prototype for human-AI interaction and explainability (dashboard with ScoreCAM, LIME, nearest-neighbour views).
- **Deployment context:** Simulated camera-trap workflow in a demo UI; **not** validated for real wildlife management, safety-critical decisions, or law enforcement.
**Out-of-scope uses:** Do not rely on this model for operational conservation decisions, species surveys with legal implications, or any setting where errors could cause harm without independent verification.
---
## Output classes
Fixed label set (order matches the classifier head):
| Index | Class |
|------|--------|
| 0 | bear |
| 1 | deer |
| 2 | fox |
| 3 | hare |
| 4 | moose |
| 5 | person |
| 6 | wolf |
The dashboard narrative may reference a specific national park; **the model was not trained on data specific to that park** — see *Training data*.
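
The seven logits become class probabilities via softmax, and the predicted label is the argmax over this fixed order. A pure-Python sketch (the example logits are made up):

```python
import math

CLASSES = ["bear", "deer", "fox", "hare", "moose", "person", "wolf"]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one image; index order matches the classifier head
logits = [0.2, 1.1, 3.4, -0.5, 0.0, 0.3, 2.2]
probs = softmax(logits)
top = CLASSES[probs.index(max(probs))]
print(top)  # "fox": index 2 has the largest logit, hence the largest probability
```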
---
## Training data
- **Source:** Roboflow project `wild-forest-animals-and-person`, workspace `forestanimals`, **version 1**, export format **multiclass**.
- **Local layout:** `wild-forest-animals-and-person-1/` with `train/`, `valid/`, `test/` splits and `_classes.csv` per split (one-hot columns per class).
- **Label handling:** Rows with multiple positive labels in the CSV use the **first** positive class only (single-label training).
- **Domain:** Mixed camera-trap / wild-animal imagery bundled by the dataset authors; the distribution across species, geography, lighting, and quality follows that dataset — **not** guaranteed to match any real park's fauna or camera setup.
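
The "first positive class" rule above can be sketched as a small parser over one-hot rows of a `_classes.csv` file (the two-row excerpt below is hypothetical, not taken from the dataset):

```python
import csv
import io

CLASSES = ["bear", "deer", "fox", "hare", "moose", "person", "wolf"]

# Hypothetical excerpt of a Roboflow multiclass _classes.csv export
csv_text = """filename,bear,deer,fox,hare,moose,person,wolf
img_001.jpg,0,1,0,0,0,0,0
img_002.jpg,0,0,1,0,0,0,1
"""

def first_positive_label(row):
    """Return the first class whose one-hot column is 1 (single-label training)."""
    for name in CLASSES:
        if row[name].strip() == "1":
            return name
    return None  # no positive column

labels = [first_positive_label(r) for r in csv.DictReader(io.StringIO(csv_text))]
print(labels)  # img_002 has both fox and wolf set; only "fox" (the first) is kept
```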
---
## Training procedure
Implemented in `train.py` (see repository for exact defaults).
| Setting | Default |
|---------|---------|
| Optimizer | Adam |
| Loss | Cross-entropy |
| Mixed precision | Enabled on CUDA (`autocast` + `GradScaler`) |
| Train augmentations | Random horizontal flip (p=0.5), then ToTensor, Resize 224, ImageNet normalize |
| Evaluation augmentations | ToTensor, Resize 224, ImageNet normalize |
| DataLoader shuffle (train) | Yes, with fixed generator seed |
| Reproducibility | `SEED = 42`; CUDNN deterministic mode enabled in training script |
**Default hyperparameters (CLI overridable):** epochs `3`, batch size `32`, learning rate `1e-3`. Example overrides: `--epochs 5`, `--lr 0.0005`, `--batch-size 16`.
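
The CLI overrides above follow the standard argparse pattern; a hedged sketch in which the defaults match this card but the exact flag handling in `train.py` may differ:

```python
import argparse

def build_parser():
    """Sketch of the training CLI; flag names mirror the examples in this card."""
    parser = argparse.ArgumentParser(
        description="Train EfficientNet V2-S on wild forest animals"
    )
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--lr", type=float, default=1e-3)
    return parser

# Overriding two flags leaves the remaining default (batch size) untouched
args = build_parser().parse_args(["--epochs", "5", "--lr", "0.0005"])
print(args.epochs, args.batch_size, args.lr)  # 5 32 0.0005
```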
**Reported metrics:** The training script prints validation loss/accuracy per epoch and **test** loss/accuracy after the last epoch. Exact numbers depend on run, hardware, and hyperparameters; **record your own metrics** when you train. Weights in the repo are not pinned to a single certified benchmark run.
---
## Evaluation
- **Split:** Held-out `test` folder from the Roboflow export.
- **Metric:** Top-1 accuracy and cross-entropy loss on the test loader (see console output from `train.py`).
- **Limitations:** No per-class confusion matrix or calibration analysis in the default pipeline; no external geographic or temporal holdout.
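
Top-1 accuracy as reported by `train.py` reduces to comparing argmax predictions with ground-truth labels; a minimal pure-Python sketch (the batch below is invented):

```python
def top1_accuracy(pred_indices, true_indices):
    """Fraction of samples whose highest-scoring class matches the label."""
    correct = sum(p == t for p, t in zip(pred_indices, true_indices))
    return correct / len(true_indices)

# Hypothetical batch: predicted vs. ground-truth class indices (0=bear .. 6=wolf)
preds = [2, 2, 5, 0, 6]
truth = [2, 1, 5, 0, 6]
print(top1_accuracy(preds, truth))  # 4 of 5 correct -> 0.8
```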
---
## Ethical and fairness considerations
- **“Person” class:** Predictions can affect privacy perceptions in camera-trap settings; treat as a coarse label, not identity or intent.
- **Wildlife labels:** Errors could misrepresent which species are present; the UI supports **manual verification** — use it when stakes matter.
- **Deployment:** Automated alerts (e.g. wolf/bear warnings in the demo) are **illustrative**; they should not replace expert assessment or park regulations.
---
## Caveats and limitations
1. **Domain shift:** Performance will drop on images that differ strongly from the training distribution (new sensors, night IR, heavy occlusion, rare poses).
2. **Single label:** Images with multiple species only contribute one label during training; the model is not trained for multi-label detection.
3. **Geographic / ecological claims:** Class names refer to species types; **the model does not prove** an animal's presence in any specific jurisdiction or ecosystem.
4. **Weights:** If you did not train the checkpoint yourself, treat reported behaviour as **unknown** until you evaluate on your data.
5. **API keys / data download:** Training and dashboard can auto-download data via Roboflow; use your own keys and comply with Roboflow terms in production-like setups.
---
## How to reproduce
```bash
uv sync
uv run python train.py
# Optional: uv run python train.py --epochs 5 --lr 0.0005 --batch-size 16
```
This produces `efficientnet_v2_wild_forest_animals.pt` compatible with `dashboard.py` and `main.py`.
---
## Citation / contact
- **Course / context:** 0HM340 Human-AI Interaction, Eindhoven University of Technology.
- **Base architecture:** Tan & Le, *EfficientNetV2*, 2021 (via torchvision).
- For questions about this card or the codebase, refer to the project `README.md` and `DEVELOPER_GUIDE.md`.
---
*Last updated to match repository layout and training script defaults; update this file if classes, dataset version, or training recipe change.*