# Developer Guide
Technical reference for developers working on or extending the Wildlife Monitoring Dashboard.
## Architecture Overview
The application is a single-file Flask server (`dashboard.py`) with HTML/CSS/JS templates embedded as Python string constants. There is no separate frontend build step.
```
Browser <──HTTP──> Flask (dashboard.py)
                     ├── Model inference (PyTorch, EfficientNet V2-S)
                     ├── XAI pipeline (ScoreCAM, LIME, nearest neighbours)
                     ├── In-memory detection state
                     └── File-based XAI cache (_xai_cache/)
```
## Key Components
### Model & Data Loading (module level)
On startup, the following are loaded once:
- **Model**: a torchvision EfficientNet V2-S (the `EfficientNet_V2_S_Weights` architecture) with a custom 7-class classification head, with weights loaded from `efficientnet_v2_wild_forest_animals.pt`.
- **ScoreCAM**: Targets the last convolutional layer (`model.features[-1]`).
- **LIME explainer**: `LimeImageExplainer` instance.
- **Dataset auto-download**: If `wild-forest-animals-and-person-1/` is missing, the Roboflow SDK downloads it automatically on startup.
- **Datasets**: `test_ds` and `train_ds` using `WildForestAnimalsDataset`, a custom `Dataset` class reading from the Roboflow-exported directory structure with `_classes.csv` label files.
- **Training features**: Pre-extracted feature vectors for all training images (used for nearest neighbour lookup). Stored as a normalised matrix `train_feats` for fast cosine similarity via matrix multiplication.
### Detection Simulation (`/api/simulate`)
Each call:
1. Picks a random image from `test_ds` and a random camera.
2. Runs model inference to get predicted class and confidence.
3. Creates a detection dict with a unique 8-char hex ID.
4. Saves the original image to `_xai_cache/<det_id>/original.png`.
5. Kicks off `compute_xai` in a **background thread** (serialised by `_xai_lock` to prevent GPU OOM).
6. Returns the detection as JSON.
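The six steps can be condensed into a sketch like the one below. The helper names (`run_inference`, `compute_xai`) and the exact detection-dict fields are assumptions inferred from this guide, not the real implementation:

```python
import random
import threading
import time
import uuid


def simulate(test_ds, cameras, run_inference, compute_xai):
    """One /api/simulate round: pick, infer, record, kick off XAI."""
    idx = random.randrange(len(test_ds))             # step 1: random image
    cam = random.choice(list(cameras))               #         and random camera
    pred, conf = run_inference(idx)                  # step 2: model inference
    det = {
        "id": uuid.uuid4().hex[:8],                  # step 3: 8-char hex ID
        "idx": idx, "cam": cam,
        "pred": pred, "conf": conf,
        "time": time.strftime("%H:%M:%S"),
        "verified": None, "manual": False,
    }
    # step 4: saving original.png to _xai_cache/<id>/ is omitted here
    # step 5: background XAI; the real compute_xai serialises on _xai_lock
    threading.Thread(target=compute_xai, args=(det,), daemon=True).start()
    return det                                       # step 6: serialised to JSON
```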
### XAI Pipeline (`compute_xai`)
Produces all explanation artefacts for a detection and writes them to `_xai_cache/<det_id>/`:
| File | Method |
|---|---|
| `original.png` | Raw input image (224x224) |
| `chart.png` | Matplotlib probability bar chart (kept for potential offline use) |
| `scorecam.png` | ScoreCAM heatmap overlay |
| `lime1.png` | LIME explanation for top-1 predicted class |
| `lime2.png` | LIME explanation for top-2 predicted class |
| `contrastive.png` | Contrastive LIME (top-1 vs top-2) |
| `nb1.png`, `nb2.png`, `nb3.png` | Three nearest training neighbours |
| `meta.json` | Metadata: predictions, probabilities, LIME class labels, neighbour info |
The function is **idempotent** — it checks for `meta.json` existence before computing. A `threading.Lock` (`_xai_lock`) ensures only one computation runs at a time to avoid GPU memory exhaustion.
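The idempotency-plus-lock pattern looks roughly like this (artefact generation is stubbed out; only the control flow mirrors the description above):

```python
import json
import threading
from pathlib import Path

_xai_lock = threading.Lock()  # serialises XAI work to avoid GPU OOM


def compute_xai(det_id, cache_dir="_xai_cache"):
    out = Path(cache_dir) / det_id
    if (out / "meta.json").exists():         # idempotent: already computed
        return out
    with _xai_lock:                          # one computation at a time
        if (out / "meta.json").exists():     # re-check after acquiring the lock
            return out
        out.mkdir(parents=True, exist_ok=True)
        # ... generate scorecam.png, lime1.png, lime2.png, etc. here ...
        meta = {"id": det_id}                # real meta holds predictions, labels, ...
        (out / "meta.json").write_text(json.dumps(meta))
    return out
```

Writing `meta.json` last means a half-finished cache directory is never mistaken for a completed one.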
### Background Pre-computation
When a detection is simulated, XAI computation starts immediately in a background thread. The `/api/xai/<det_id>` endpoint waits on a `threading.Event` for that detection. This means:
- If the user clicks a detection after a few seconds, XAI is likely already cached.
- If they click immediately, the request blocks until computation finishes.
- If no background thread was started for the detection, the endpoint falls back to synchronous computation.
### API Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| `/` | GET | Home page (map + sidebar) |
| `/det/<det_id>` | GET | Detection detail page |
| `/cam/<cam_id>` | GET | Camera feed page |
| `/api/simulate` | POST | Simulate a new detection |
| `/api/xai/<det_id>` | GET | Get XAI results (triggers computation if needed) |
| `/api/detections` | GET | List all detections |
| `/api/verify/<det_id>` | POST | Verify or correct a detection |
| `/map.webp` | GET | Serve the map image |
| `/xai/<det_id>/<filename>` | GET | Serve cached XAI images |
### HTML Templates
Three template constants are defined as raw triple-quoted Python strings (`r"""..."""`):
- `HOME_HTML` — fullscreen map, camera markers, sidebar with chart/filters/list, auto-simulation JS loop.
- `DETAIL_HTML` — verification bar, XAI carousel with left/right navigation, HTML/CSS probability chart.
- `CAM_HTML` — responsive grid of detection cards for a single camera.
Templates use Jinja2 syntax (via Flask's `render_template_string`) and receive context variables like `cameras`, `class_names`, `det`, etc.
### In-Memory State
All detection state is held in the `detections` Python list. This means:
- State is lost on server restart.
- There is no database.
- This is intentional for a demonstration/prototype.
## Extending the Application
### Adding a new XAI method
1. Add the computation logic inside `compute_xai()`, saving the output as a PNG in the `out` directory.
2. Add any relevant metadata to the `meta` dict.
3. Add a new slide entry in the `slides` array in `DETAIL_HTML`'s JavaScript.
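For step 1, a new artefact writer might look like the following, using a hypothetical "occlusion" method; the helper name and heatmap normalisation are illustrative, and in the real code this logic would live inside `compute_xai()`:

```python
from pathlib import Path

import numpy as np
from PIL import Image


def save_occlusion(out_dir, heatmap: np.ndarray) -> None:
    """Normalise a heatmap to 0-255 greyscale and save it as a PNG artefact."""
    scaled = (heatmap - heatmap.min()) / (np.ptp(heatmap) + 1e-8)
    Image.fromarray((255 * scaled).astype(np.uint8)).save(
        Path(out_dir) / "occlusion.png"
    )
```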
### Adding a new species
1. Add the class name to `CLASS_NAMES`.
2. Add an emoji to `SPECIES_ICON`.
3. Retrain the model with the new class and update `efficientnet_v2_wild_forest_animals.pt`.
### Adding or moving cameras
Edit the `CAMERAS` dict. The `px` and `py` values are percentages relative to the map image dimensions (left/top).
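For illustration, entries might look like this; the camera IDs, the `name` key, and the positions are made up, while the px/py percentage convention follows this guide:

```python
# px/py are percentages of the map image's width/height, measured from
# the left/top edge (0-100). Keys and names below are hypothetical.
CAMERAS = {
    "cam1": {"name": "North Ridge", "px": 22.5, "py": 14.0},
    "cam2": {"name": "River Bend",  "px": 61.0, "py": 72.5},
}
```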
### Persisting detections
To add persistence, replace the `detections` list with a database (e.g. SQLite). Key fields per detection: `id`, `idx`, `cam`, `pred`, `conf`, `time`, `verified`, `manual`. The `_xai_cache` directory already provides file-based persistence for XAI artefacts.
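A minimal SQLite schema covering those key fields could look like this; the column types are assumptions, since the guide only lists the field names:

```python
import sqlite3


def init_db(path="detections.db"):
    con = sqlite3.connect(path)
    con.execute("""
        CREATE TABLE IF NOT EXISTS detections (
            id       TEXT PRIMARY KEY,  -- 8-char hex detection ID
            idx      INTEGER,           -- index into test_ds
            cam      TEXT,              -- camera ID
            pred     TEXT,              -- predicted class
            conf     REAL,              -- confidence
            time     TEXT,              -- detection timestamp
            verified INTEGER,           -- NULL = unreviewed
            manual   TEXT               -- manually corrected label, if any
        )
    """)
    con.commit()
    return con
```

Image artefacts need no migration: `_xai_cache/<det_id>/` keys off the same `id` column.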
## Dependencies
Core runtime dependencies (see `pyproject.toml`):
| Package | Purpose |
|---|---|
| `flask` | Web framework |
| `torch`, `torchvision` | Model inference and image transforms |
| `grad-cam` | ScoreCAM implementation |
| `lime` | LIME explainer |
| `scipy` | Required by LIME internals |
| `matplotlib` | Probability chart generation |
| `Pillow` (via torchvision) | Image I/O |
| `scikit-image` (via lime) | `mark_boundaries` for LIME visualisation |
## Running in Development
```bash
uv run python dashboard.py
```
Flask runs in debug mode with auto-reload on port 5000. The `_xai_cache/` directory can be deleted at any time to force recomputation of all explanations.