Google Researchers Introduced LSM-2 with Adaptive and Inherited Masking (AIM): Enabling Direct Learning from Incomplete Wearable Data

Introduction

Wearable devices are transforming health monitoring by enabling continuous collection of physiological and behavioral signals such as heart rate, activity, temperature, and skin conductance. However, the real-world data that these devices generate is highly prone to missingness due to sensor failures, device removal, charging, motion artifacts, battery-saving modes, and other interruptions. This presents a significant challenge for self-supervised learning (SSL) and foundation models, which typically expect complete, regular data streams. Past solutions often relied on data imputation or discarding incomplete instances, which risks introducing bias or wasting valuable information.

A team of researchers from Google DeepMind introduced the LSM-2 (Large Sensor Model 2) framework, accompanied by the new Adaptive and Inherited Masking (AIM) strategy. It addresses these issues directly, learning robust representations from incomplete wearable sensor data without explicit imputation. Below, we examine the technical innovations, empirical results, and key insights from this advancement.

The Challenge: Wearable Data Missingness

Data Fragmentation

Missingness Modes

Device off (charging or not worn)
Selective sensor deactivation (power-saving or operation-specific)
Motion artifacts or environmental noise
Out-of-range or physiologically impossible readings filtered out during preprocessing
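To make the missingness modes concrete, here is a minimal sketch of how a per-reading missingness mask might be derived during preprocessing. The function name, the channel names, and the plausibility ranges are illustrative assumptions, not values taken from the paper.

```python
import math

# Hypothetical plausibility ranges per sensor channel (illustrative values only).
VALID_RANGES = {
    "heart_rate_bpm": (25.0, 250.0),
    "skin_temp_c": (20.0, 45.0),
}

def missingness_mask(readings, valid_range):
    """Return True wherever a reading is absent (None/NaN) or out of range."""
    lo, hi = valid_range
    mask = []
    for value in readings:
        absent = value is None or math.isnan(value)
        # Short-circuit: the range check only runs on real numbers.
        mask.append(absent or not (lo <= value <= hi))
    return mask

# One short heart-rate trace: a dropped sample, a NaN, and an impossible value.
heart_rate = [72.0, None, float("nan"), 400.0, 65.0]
hr_mask = missingness_mask(heart_rate, VALID_RANGES["heart_rate_bpm"])
missing_fraction = sum(hr_mask) / len(hr_mask)
```

A mask of this kind records where data is already missing, so a model can be told which positions to ignore instead of receiving imputed values.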

Impact on Modeling

Adaptive and Inherited Masking (AIM): Technical Approach

Key Concepts

AIM integrates two masking types for robust learning:

Inherited Mask

Artificial Mask

These masks are unioned and handled by a transformer-based encoder-decoder structure, enabling the model to:

Masking Strategies for Pretraining

Random Imputation

Temporal Slices

Sensor Slices

AIM combines the efficiency of dropout masking (removal from computation) and the flexibility of attention masking (support for dynamically varying missingness), allowing the model to scale to long input sequences (day-long, >3,000 tokens).
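The hybrid of dropout masking and attention masking can be sketched as follows. This is one plausible reading of the description above, not the authors' implementation: the function name `aim_split`, the `drop_budget` parameter, and the rule of dropping the first masked tokens are all assumptions made for illustration.

```python
def aim_split(inherited, artificial, drop_budget):
    """Union two boolean masks, then split masked tokens into two groups.

    Up to `drop_budget` masked tokens are removed from the encoder input
    (dropout masking). Any remaining masked tokens stay in the sequence
    but are flagged so attention can ignore them (attention masking).
    """
    if len(inherited) != len(artificial):
        raise ValueError("masks must have the same length")
    union = [a or b for a, b in zip(inherited, artificial)]
    masked_idx = [i for i, m in enumerate(union) if m]
    dropped = set(masked_idx[:drop_budget])
    kept_idx = [i for i in range(len(union)) if i not in dropped]
    # For tokens kept in the sequence, True means "hide from attention".
    attn_hidden = [union[i] for i in kept_idx]
    return union, kept_idx, attn_hidden

# Eight tokens: two already missing (inherited), three masked artificially.
inherited = [False, True, False, False, True, False, False, False]
artificial = [True, False, False, True, False, False, True, False]
union, kept_idx, attn_hidden = aim_split(inherited, artificial, drop_budget=3)
```

In this sketch, a fixed drop budget keeps the encoder input the same length for every sample as long as each sample has at least that many masked tokens. The attention flags absorb whatever extra missingness a particular sample happens to carry.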

Dataset and Pretraining Details

Scale

Sensors

Demographic Diversity

Downstream Labeled Data

Metabolic Study (hypertension, anxiety prediction; n=1,250 labeled users)
Activity Recognition (20 activity classes, 104,086 events)

Evaluation and Results

Downstream Tasks

AIM-based LSM-2 was assessed on:

Classification

Regression

Generative

Quantitative Results

Task	Metric	Best LSM-1	LSM-2 w/ AIM	Improvement
Hypertension	F1	0.640	0.651	+1.7%
Activity Recognition	F1	0.470	0.474	+0.8%
BMI (regression)	Corr	0.667	0.673	+1.0%
Random Imputation (80%)	MSE (↓)	0.30	0.20	33% lower error
2-signal Recovery	MSE (↓)	0.73	0.17	77% lower error
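The Improvement column appears to report relative change against the best LSM-1 result. The short check below recomputes three of the rows under that assumption; the helper name is hypothetical, and figures derived from the rounded table entries may differ slightly in the last digit from those computed on unrounded scores.

```python
def relative_change_pct(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0

# Higher is better for F1, so a positive change is an improvement.
hypertension_gain = relative_change_pct(0.640, 0.651)

# Lower is better for MSE, so negate the change to get the error reduction.
imputation_error_cut = -relative_change_pct(0.30, 0.20)
recovery_error_cut = -relative_change_pct(0.73, 0.17)
```

Under this reading, the generative rows show much larger relative gains than the classification and regression rows.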

Robustness to Targeted Missingness

Clinical Coherence

Scaling

Technical Insights

Direct Handling of Real-World Missingness

Hybrid Masking Mechanism

Generalizable Embeddings

Generative and Discriminative Power

Conclusion

LSM-2 with Adaptive and Inherited Masking represents a major step forward for deploying AI-driven health insights from real-world wearable sensor data. By directly embracing ubiquitous, structured missingness and unifying generative and discriminative capabilities in one efficient, robust foundation model, this approach lays groundwork for wearable and health AI in realistic, imperfect data environments.

Check out the Paper and Technical details. All credit for this research goes to the researchers of this project.


The post Google Researchers Introduced LSM-2 with Adaptive and Inherited Masking (AIM): Enabling Direct Learning from Incomplete Wearable Data appeared first on MarkTechPost.