# GeoLocator v1.0: Baseline Training Configuration
This model serves as the initial baseline for the GeoLocator project, featuring a simple single-head classifier architecture. Its performance was significantly hindered by poor training-data quality and a flawed initial clustering strategy, leading to large prediction errors. This article documents the architecture, training configuration, and critical lessons learned.
## Poor Baseline Performance
With a median error of approximately 1,500 km, this model could barely identify the correct country. The primary issues stemmed from data quality and a flawed clustering strategy that allowed clusters to span international borders.
### Performance Metrics
| Metric | Value |
|---|---|
| Median Error | ~1,500 km |
| Evaluation Metric | Haversine Distance |
| Model Selection | Validation Loss |
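For context, the haversine (great-circle) distance is the standard error metric for geolocation. A minimal NumPy sketch of the metric follows; the project's actual evaluation code is not shown here, so this is illustrative only:

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    R = 6371.0  # mean Earth radius in km
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * R * np.arcsin(np.sqrt(a))

# Median error over a validation set (arrays of predicted / true coordinates):
# median_err = np.median(haversine_km(pred_lat, pred_lon, true_lat, true_lon))
```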
## Architecture & Loss Function
The v1.0 model used a straightforward single-head classification approach, outputting logits over 10,000 geographic clusters.
| Component | Implementation | Notes |
|---|---|---|
| Backbone | ConvNeXt Tiny | `convnext_tiny` |
| Architecture | GeoModel | Single-head classifier |
| Loss Function | CrossEntropyLoss | Standard classification |
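A minimal sketch of this architecture is shown below. It assumes the backbone comes from `timm` (an assumption; the article does not name the library) and simply attaches a linear cluster head to the pooled features:

```python
import timm
import torch.nn as nn

NUM_CLUSTERS = 10_000  # geographic cluster classes (from the article)

class GeoModel(nn.Module):
    """Single-head classifier: ConvNeXt Tiny backbone + linear cluster head."""

    def __init__(self, num_clusters: int = NUM_CLUSTERS):
        super().__init__()
        # num_classes=0 strips timm's classifier and returns pooled features
        self.backbone = timm.create_model("convnext_tiny", pretrained=True, num_classes=0)
        self.head = nn.Linear(self.backbone.num_features, num_clusters)

    def forward(self, x):
        return self.head(self.backbone(x))  # logits over clusters

criterion = nn.CrossEntropyLoss()
```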
## Training Hyperparameters
| Parameter | Value |
|---|---|
| Global Batch Size | 1024 |
| Max Epochs | 200 |
| Learning Rate | 3e-4 (Base) / 1e-6 (Min) |
| Optimizer | AdamW (weight_decay=1e-4) |
| Precision | AMP (autocast) |
| Gradient Clipping | max_grad_norm=1.0 |
| Early Stopping | Patience = 20 epochs |
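A PyTorch training step consistent with the table above might look like the following. This is a sketch, not the project's actual loop: it reuses the `GeoModel` and `criterion` from the previous section, assumes a CUDA device, and assumes the base/min learning-rate pair implies cosine annealing (the schedule type is not stated):

```python
import torch

model = GeoModel().cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-4)
# Cosine annealing from 3e-4 down to 1e-6 over 200 epochs (schedule is an assumption)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200, eta_min=1e-6)
scaler = torch.cuda.amp.GradScaler()

def train_step(images, labels):
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda"):   # AMP forward pass
        loss = criterion(model(images), labels)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)                 # unscale before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```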
## Data Handling & Preprocessing
Training data was sourced from the Flickr Dataset, which contained many images deemed unsuitable for geolocation tasks. Data was loaded via a streaming approach from sharded MessagePack (.msg) files.
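Streaming from sharded MessagePack files can be done with `msgpack.Unpacker`, which yields records without loading a whole shard into memory. A minimal sketch follows; the shard layout and record fields are assumptions, not the project's actual loader:

```python
from pathlib import Path

import msgpack

def stream_samples(shard_dir: str):
    """Yield records one at a time from sharded MessagePack (.msg) files."""
    for shard in sorted(Path(shard_dir).glob("*.msg")):
        with open(shard, "rb") as f:
            # Unpacker streams records incrementally from the file handle
            for record in msgpack.Unpacker(f, raw=False):
                yield record  # e.g. {"image": bytes, "lat": ..., "lon": ...} (hypothetical schema)
```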
### Clustering Strategy
> **Clustering Issue: Border Confusion.** Clusters were allowed to span international borders, which significantly confused the model during training. This flaw in the clustering strategy was identified and corrected in subsequent versions.
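For illustration, clustering raw coordinates with no country constraint (presumably how the v1.0 clusters were built; `MiniBatchKMeans` is an assumption) shows how border-spanning clusters arise, since nothing stops a centroid near a border from absorbing points on both sides:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def build_clusters(coords: np.ndarray, k: int = 10_000) -> MiniBatchKMeans:
    """Naive clustering over raw (lat, lon) pairs -- ignores political borders."""
    # coords: (N, 2) array of [lat, lon] for all training images (hypothetical input)
    km = MiniBatchKMeans(n_clusters=k, batch_size=4096, random_state=0)
    km.fit(coords)
    return km

# The cluster id becomes each image's classification label:
# labels = km.predict(coords)
```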
### Image Preprocessing (Albumentations)
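The exact v1.0 transform stack is not recorded here. A representative Albumentations pipeline for a 224-px ConvNeXt input might look like the following; every transform choice below is an assumption:

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

# Illustrative only -- the actual v1.0 augmentations are not documented.
train_transform = A.Compose([
    A.Resize(256, 256),
    A.RandomCrop(224, 224),  # 224 px matches a typical ConvNeXt input
    A.ColorJitter(p=0.3),    # mild photometric noise
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2(),            # HWC uint8 -> CHW float tensor
])
```

Horizontal flips are deliberately omitted from this sketch, since mirrored signage and driving-side cues can mislead a geolocation model.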
## Lessons Learned
The v1.0 baseline established critical insights that informed subsequent development. The poor performance, while disappointing, revealed fundamental issues with both data quality and clustering methodology. These lessons directly informed the architectural decisions in v1.4, including the move to border-respecting clustering and multi-task learning objectives.