More with LESS
Local Scene Representations for Tactile Imaging

¹Technion - Israel Institute of Technology   ²Rambam Health Care Campus

Data-Driven, 3D, Hand-Held, Generalizable Tactile Imaging

Overview

LESS overview: data collection, local encoder, zero-shot generalization, real-time hand-held demo

Palpation, the use of touch in medical examination, remains highly relevant today. In breast cancer, over 40% of US cases are first detected by palpation, motivating the development of tactile imaging: reconstructing the internal structure of soft objects from touch measurements.

We propose LESS (Local Encoder for Spatial Sensing), a tactile imaging approach that exploits the local nature of touch. LESS achieves zero-shot compositional generalization to unseen geometries, supports spatial uncertainty estimation, and enables the first real-time, hand-held tactile imaging device with 3D reconstruction.

Data Collection

We manufacture a soft, modular phantom consisting of a breast-shaped silicone shell and a gel insert containing a harder inclusion that varies in shape and location. The insert can be rotated inside the shell to produce different phantom configurations.

Video at 50x speed

Using a motorized lift and a Franka Panda manipulator equipped with a gel-based tactile sensor, we collect over 800 hours of tactile interaction data across the phantom library.

To obtain ground-truth internal geometry, each phantom is scanned with an MRI machine, producing volumetric images that clearly resolve the phantom's internal components.

Labeled scan components: gel insert, inclusion, pillar

LESS: Local Encoder for Spatial Sensing

The key insight behind LESS is that touch is inherently local: the tactile response at a given position depends primarily on the nearby sub-surface structure, not on distant regions.

We spread particles over a grid of locations and use pose-based force prediction to learn, for each location, a representation of the sequence of measurements in the receptive field around it.

LESS representation architecture
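The receptive-field idea above can be sketched in a few lines. This is only an illustration, not the LESS implementation: the 2D contact positions, scalar forces, the circular receptive field, and the names `make_grid` and `local_sequences` are all assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def make_grid(width: float, height: float, spacing: float) -> List[Point]:
    """Regular grid of query locations covering [0, width] x [0, height]."""
    nx = int(math.floor(width / spacing)) + 1
    ny = int(math.floor(height / spacing)) + 1
    return [(i * spacing, j * spacing) for j in range(ny) for i in range(nx)]


def local_sequences(
    grid: List[Point],
    contacts: List[Point],
    forces: List[float],
    radius: float,
) -> Dict[int, List[Tuple[int, float]]]:
    """Map each grid index to the time-ordered (step, force) pairs that
    fall inside its receptive field (hypothetical: a disc of given radius)."""
    sequences: Dict[int, List[Tuple[int, float]]] = {k: [] for k in range(len(grid))}
    for step, (position, force) in enumerate(zip(contacts, forces)):
        for k, center in enumerate(grid):
            if math.dist(position, center) <= radius:
                sequences[k].append((step, force))
    return sequences


# Toy data: three touches over a 3 x 2 grid of locations.
grid = make_grid(2.0, 1.0, 1.0)
contacts = [(0.1, 0.0), (1.0, 0.9), (1.9, 0.1)]
forces = [0.5, 1.2, 0.8]
seqs = local_sequences(grid, contacts, forces, radius=0.5)
```

In this toy setup each touch lands in exactly one receptive field, and locations that were never touched keep an empty sequence. A learned encoder would then consume each per-location sequence independently.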

We then decode each encoding into a patch of the final image, yielding an end-to-end local image prediction.

LESS patch reconstruction
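One plausible way to assemble per-location patches into a full image is to average them wherever they overlap. The sketch below assumes that scheme; the source does not state how patches are merged, and `stitch_patches` and its argument layout are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]


def stitch_patches(
    patches: List[Tuple[int, int, Image]], height: int, width: int
) -> Image:
    """Average overlapping patches into one image.

    Each entry is (row, col, patch), where (row, col) is the top-left
    corner of the patch in the output image. Uncovered pixels stay 0.0.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for row, col, patch in patches:
        for dr, patch_row in enumerate(patch):
            for dc, value in enumerate(patch_row):
                r, c = row + dr, col + dc
                if 0 <= r < height and 0 <= c < width:
                    total[r][c] += value
                    count[r][c] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]


# Two 2x2 patches that overlap in the middle column.
patches = [
    (0, 0, [[1.0, 1.0], [1.0, 1.0]]),
    (0, 1, [[0.0, 0.0], [0.0, 0.0]]),
]
image = stitch_patches(patches, height=2, width=3)
```

Because every patch is placed independently, the same routine works on a canvas of any size, which is one way a local predictor could be applied to an object larger than those seen in training.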

Results

LESS accurately reconstructs the internal structure of a phantom from the tactile sequence.

Prediction (50x speed), ground truth, and confidence

Because its predictions are localized, LESS produces a useful local uncertainty estimate.
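The source does not specify how the uncertainty is computed. As one common choice, a per-pixel uncertainty map can be derived from predicted inclusion probabilities using binary entropy; the sketch below assumes that choice, and `binary_entropy` and `uncertainty_map` are hypothetical names.

```python
import math
from typing import List


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) prediction; 0 when fully certain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def uncertainty_map(probs: List[List[float]]) -> List[List[float]]:
    """Per-pixel uncertainty from per-pixel inclusion probabilities."""
    return [[binary_entropy(p) for p in row] for row in probs]


# Assumed toy predictions: certain background, certain inclusion,
# and two less certain pixels.
probs = [[0.0, 0.5], [1.0, 0.9]]
umap = uncertainty_map(probs)
```

Pixels with probability near 0.5 receive the highest uncertainty, which is the kind of signal an operator could use to decide where to touch next.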

The locality of LESS also yields emergent compositional generalization.

The model was trained only on single inclusions, yet it generalizes to multiple and connected inclusions at test time.

Predictions (50x speed) and ground truth for four test configurations: triple, double, connected double, and connected triple inclusions

Our model generalizes zero-shot to phantoms 4x larger, even though it was trained on a single fixed size.

Hand-Held

A hand-held form factor is essential for clinical usability, as it allows the physician to maneuver and maintain patient contact during examination.

To mitigate the distribution shift when moving to human motions, we collected data from human-operated primitives.

Human-operated primitives (two videos, 2x speed)

These primitives are crucial for effective transfer to human motions.

In place of the robot kinematics, we estimate the sensor pose with a fiducial-based system.
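Fiducial-based pose estimation typically amounts to chaining rigid transforms. The sketch below assumes a camera that observes a marker rigidly mounted on the sensor; the camera, the mounting offset, and all names and numbers are hypothetical and not taken from the source.

```python
from typing import List

Matrix = List[List[float]]  # 4x4 homogeneous transform, row-major


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product a @ b for 4x4 homogeneous transforms."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def translation(x: float, y: float, z: float) -> Matrix:
    """Pure translation as a homogeneous transform."""
    return [
        [1.0, 0.0, 0.0, x],
        [0.0, 1.0, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def sensor_pose_in_world(
    world_T_camera: Matrix, camera_T_marker: Matrix, marker_T_sensor: Matrix
) -> Matrix:
    """Chain camera calibration, marker detection, and mounting offset."""
    return compose(compose(world_T_camera, camera_T_marker), marker_T_sensor)


# Assumed setup: camera rotated 90 degrees about z and raised 1 m;
# marker detected 0.2 m along camera x and 0.5 m along camera z;
# sensor tip 5 cm below the marker.
world_T_camera = [
    [0.0, -1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
camera_T_marker = translation(0.2, 0.0, 0.5)
marker_T_sensor = translation(0.0, 0.0, -0.05)

world_T_sensor = sensor_pose_in_world(world_T_camera, camera_T_marker, marker_T_sensor)
position = [row[3] for row in world_T_sensor[:3]]
```

Only `camera_T_marker` changes from frame to frame in this setup; the other two transforms would come from a one-time calibration.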

We show a proof of concept of real-time, hand-held tactile imaging. Here, an operator can observe the model's prediction and uncertainty to guide their movements.

Finally, we produce a prediction of the internal structure.

Prediction and ground truth

BibTeX

@inproceedings{rimon2026less,
  title     = {More with {LESS}: Local Scene Representations for Tactile Imaging},
  author    = {Rimon, Zohar and Shafer, Elisei and Tepper, Tal and Kozin, Daniel
               and Malka, Alon and Holland, Roy and Tamar, Aviv},
  booktitle = {Robotics: Science and Systems},
  year      = {2026}
}