We present SeeKer, a method for detecting anomalous human behaviour by predicting the conditional density of skeleton keypoints. Given the preceding keypoints, SeeKer predicts a multivariate normal distribution over possible locations for the subsequent keypoint. Applying this approach across the entire skeleton sequence results in an autoregressive factorization of the joint sequence density at the keypoint level. SeeKer training corresponds to maximizing sequence likelihood. During inference, SeeKer identifies skeletons as anomalous if their constituent keypoints deviate from the predicted distributions.
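The keypoint-level scoring described above can be illustrated with a minimal sketch. It is not the SeeKer implementation: `toy_predictor` is a hypothetical stand-in for the learned model, the covariance is assumed diagonal for simplicity, keypoints are assumed to be 2D, and summing per-keypoint negative log-likelihoods into a skeleton score is one plausible aggregation rather than the one confirmed by the paper.

```python
import math


def gaussian_nll_2d(point, mean, var):
    """Negative log-likelihood of a 2D point under a normal distribution
    with diagonal covariance (an assumption made to keep the sketch short)."""
    nll = 0.0
    for x, m, v in zip(point, mean, var):
        nll += 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return nll


def toy_predictor(preceding):
    """Hypothetical stand-in for the learned model: predicts the next keypoint
    to be at the last seen keypoint, with unit variance per coordinate."""
    if not preceding:
        return (0.0, 0.0), (1.0, 1.0)
    return preceding[-1], (1.0, 1.0)


def keypoint_scores(keypoints, predictor=toy_predictor):
    """Per-keypoint anomaly scores, following the autoregressive factorization
    -log p(x_1, ..., x_n) = sum_i -log p(x_i | x_1, ..., x_{i-1})."""
    scores = []
    for i, kp in enumerate(keypoints):
        mean, var = predictor(keypoints[:i])
        scores.append(gaussian_nll_2d(kp, mean, var))
    return scores


def sequence_score(keypoints, predictor=toy_predictor):
    """Sequence-level score: the sum of per-keypoint negative log-likelihoods
    (the aggregation is an assumption of this sketch)."""
    return sum(keypoint_scores(keypoints, predictor))
```

Under this sketch, a keypoint that lands far from its predicted mean receives a high score, so a sequence such as `[(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]` scores higher than `[(0.0, 0.0), (0.1, 0.0), (0.2, 0.1)]`. Training a real predictor would minimize the same quantity over normal data, which corresponds to maximizing sequence likelihood.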
We evaluate SeeKer on three datasets for skeleton-based video anomaly detection, where it shows outstanding performance.
We show keypoint-level anomaly scores for examples of different actions and scenes from the UBnormal dataset.
The abnormal actions are running, having a seizure, lying down, shuffling, walking drunk, people and car accident, jumping, and jaywalking.
@article{delic2025sequential,
title={Sequential keypoint density estimator: an overlooked baseline of skeleton-based video anomaly detection},
author={Deli{\'c}, Anja and Gr{\v{c}}i{\'c}, Matej and {\v{S}}egvi{\'c}, Sini{\v{s}}a},
journal={arXiv preprint arXiv:2506.18368},
year={2025}
}