“Recognizing a hotel from an image of a hotel room is important for human trafficking investigations. Images directly link victims to places and can help verify where victims have been trafficked.” (Stylianou et al., 2019).

As part of the Eighth Workshop on Fine-Grained Visual Categorization, a Kaggle competition was launched to support investigations by advancing models that identify hotels from images.

This post contains some parts of my contribution to the Hotel-ID to Combat Human Trafficking 2021 - FGVC8 Kaggle competition.

The challenge

The competition contained more than 97,000 images of hotel rooms from 7,700 (!) different hotels around the world. The objective was to identify the hotel for each of the 13,000 images in the hidden test set. The competition metric was Mean Average Precision over the top five picks (MAP@5). My solution placed 14th out of 92 teams with a MAP@5 of 0.6164 on the private leaderboard.
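To make the metric concrete, here is a minimal sketch of MAP@5, assuming each image has exactly one correct hotel, so the score for an image is 1/rank of the correct hotel within the top five picks and 0 if it is missing. The function names and the toy hotel IDs are made up for illustration and are not taken from the competition code.

```python
from typing import Hashable, Sequence


def average_precision_at_k(truth: Hashable, ranked: Sequence[Hashable], k: int = 5) -> float:
    """Return 1/rank of the first correct pick within the top k, else 0.0.

    Assumes exactly one correct label per image.
    """
    for rank, guess in enumerate(ranked[:k], start=1):
        if guess == truth:
            return 1.0 / rank
    return 0.0


def map_at_k(truths: Sequence[Hashable],
             predictions: Sequence[Sequence[Hashable]],
             k: int = 5) -> float:
    """Mean of the per-image scores over all images."""
    if len(truths) != len(predictions):
        raise ValueError("truths and predictions must have the same length")
    if not truths:
        return 0.0
    scores = [average_precision_at_k(t, p, k) for t, p in zip(truths, predictions)]
    return sum(scores) / len(scores)


# Toy example with made-up hotel IDs: correct at rank 1, at rank 2, and missed.
truths = [101, 202, 303]
predictions = [
    [101, 5, 6, 7, 8],
    [9, 202, 1, 2, 3],
    [1, 2, 3, 4, 5],
]
score = map_at_k(truths, predictions)
print(score)  # (1 + 0.5 + 0) / 3 = 0.5
```

The ordering of the five picks matters: the same correct hotel is worth 1.0 in first position but only 0.2 in fifth.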

My solution combined six CNN models with various configurations. More technical details, and the reasons why I ended up with rather simple models, are described in a Kaggle discussion topic.

Here I post the training and inference code for one of the six models, as well as the ensemble inference code.

Training

The final training was done on the entire training dataset. I didn’t use a cross-validation strategy, in order to save training time. To keep variance low nonetheless, I relied on the usual regularization strategies, such as dropout and augmentation, and in particular on test-time augmentation during inference.

To refine the model, a validation set can be created by setting the debug flag as described in the notebook.
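The idea behind test-time augmentation can be sketched in a few lines: run the model on several augmented views of the same image and average the predicted probabilities. This is only an illustration under simplifying assumptions. The image is a plain nested list, the "model" is a toy function, and the names predict_with_tta, identity and horizontal_flip are hypothetical and do not come from the notebook.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]   # toy grayscale image: rows of pixel values
Probs = List[float]         # one probability per class


def identity(image: Image) -> Image:
    """Leave the image unchanged."""
    return image


def horizontal_flip(image: Image) -> Image:
    """Mirror every row of the image."""
    return [row[::-1] for row in image]


def predict_with_tta(model: Callable[[Image], Probs],
                     image: Image,
                     augmentations: Sequence[Callable[[Image], Image]]) -> Probs:
    """Average the model's class probabilities over all augmented views."""
    if not augmentations:
        raise ValueError("at least one augmentation is required")
    all_probs = [model(augment(image)) for augment in augmentations]
    n_views = len(all_probs)
    return [sum(class_probs) / n_views for class_probs in zip(*all_probs)]


def toy_model(image: Image) -> Probs:
    """Stand-in for a CNN: a two-class 'prediction' from the top-left pixel."""
    left = image[0][0]
    return [left, 1.0 - left]


image = [[0.25, 0.5]]
tta_probs = predict_with_tta(toy_model, image, [identity, horizontal_flip])
print(tta_probs)
```

Because the toy model reacts differently to the flipped view, the averaged prediction sits between the two single-view predictions, which is the variance-reducing effect that test-time augmentation is used for.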

Inference

Inference ensemble
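As a rough illustration of what combining several models can look like, the sketch below averages the class probabilities of multiple models and returns the top five hotel IDs. Plain (optionally weighted) averaging is an assumption made for this example and is not necessarily how the six models were combined. The function names and hotel IDs are hypothetical.

```python
from typing import List, Optional, Sequence


def ensemble_average(model_probs: Sequence[Sequence[float]],
                     weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of per-model class probabilities for one image."""
    if not model_probs:
        raise ValueError("at least one model prediction is required")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, class_probs)) / total
        for class_probs in zip(*model_probs)  # iterate class by class
    ]


def top_k_ids(probs: Sequence[float], ids: Sequence[str], k: int = 5) -> List[str]:
    """Return the k IDs with the highest probability, best first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [ids[i] for i in order[:k]]


# Toy example: three models, three made-up hotels.
hotel_ids = ["hotel_a", "hotel_b", "hotel_c"]
model_probs = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.5, 0.25, 0.25],
]
avg_probs = ensemble_average(model_probs)
picks = top_k_ids(avg_probs, hotel_ids, k=2)
print(picks)
```

Two of the three toy models favour hotel_a, so it ends up first in the averaged ranking even though one model disagrees.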

References

Stylianou, A., Xuan, H., Shende, M., Brandt, J., Souvenir, R., and Pless, R. (2019). Hotels-50K: A Global Hotel Recognition Dataset. The AAAI Conference on Artificial Intelligence (AAAI).