Step 1 — Add Tactile Modality

Extending the Policy Config

The tactile observation can be added as either the full pressure map (spatial, 64 values per frame) or the aggregated scalar (total_force_n). Start with the scalar — it is easier to normalize and sufficient for most manipulation tasks. Add the pressure map if your task requires spatial contact information (e.g., distinguishing edge vs. center contact).

```yaml
# lerobot/configs/policy/act_paxini.yaml
policy:
  name: act
dataset:
  repo_id: local/paxini-grasp-place
observation_features:
  - observation.state                     # joint positions (7,)
  - observation.images.wrist              # wrist camera (H, W, 3)
  - observation.tactile.total_force_n     # scalar force (1,)
  - observation.tactile.contact_area_mm2  # scalar area (1,)
  # Optional: full pressure map
  # - observation.tactile.pressure_map    # (8, 8) → flatten to (64,)
action_features:
  - action                                # target joint positions (7,)
```
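To make the scalar-vs-spatial choice concrete, here is a minimal NumPy sketch of the two feature shapes. The `tactile_features` helper is hypothetical (not part of the Paxini SDK or LeRobot); it only illustrates how an (8, 8) pressure map maps to either a (1,) aggregate or a (64,) flattened vector.

```python
import numpy as np

def tactile_features(pressure_map: np.ndarray, use_spatial: bool = False) -> np.ndarray:
    """Hypothetical helper: build a tactile feature vector from an (8, 8) map.

    Scalar path: aggregate the map to a single total-force value.
    Spatial path: flatten the full map to 64 values per frame.
    """
    if use_spatial:
        return pressure_map.reshape(-1)        # (64,) spatial contact layout
    return np.array([pressure_map.sum()])      # (1,) total force aggregate

frame = np.random.default_rng(0).uniform(0.0, 0.3, size=(8, 8))
print(tactile_features(frame).shape)                    # (1,)
print(tactile_features(frame, use_spatial=True).shape)  # (64,)
```

The spatial path preserves where contact happens (edge vs. center), at the cost of 64 extra input dimensions that all need normalization statistics.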
Step 2 — Compute Normalization Statistics

Normalization Statistics from Your Dataset

Always compute normalization statistics from your specific dataset — do not use hardcoded values. The Paxini SDK provides a utility for this:

```python
# Compute stats from your recorded dataset
from paxini.data import compute_dataset_stats

stats = compute_dataset_stats("./tactile_dataset/")
print(stats.to_yaml())

# Expected output (example values):
# observation.tactile.total_force_n:
#   mean: 2.84
#   std: 1.93
#   min: 0.0
#   max: 14.2
# observation.tactile.contact_area_mm2:
#   mean: 22.4
#   std: 18.6

# Save to config
stats.save("./act_paxini_stats.yaml")
```
Why normalization matters for tactile

Force values (0–20 N) are on a completely different scale from joint positions (typically ±π radians) and pixel values (0–255). Without normalization, the model will learn to ignore the tactile signal because its magnitude is swamped by the visual loss term during training.
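The scale mismatch and the z-score fix can be sketched in a few lines of NumPy (illustrative values, not real dataset statistics):

```python
import numpy as np

# Raw modalities on very different scales (illustrative values)
force_n = np.array([0.5, 2.8, 14.2])     # tactile force, 0-20 N
joints_rad = np.array([-3.1, 0.0, 3.1])  # joint positions, roughly +/- pi

def zscore(x: np.ndarray, mean: float, std: float) -> np.ndarray:
    """Standard z-score normalization: (x - mean) / std."""
    return (x - mean) / std

# After z-scoring with per-channel stats, every modality is roughly
# zero-mean and unit-scale, so no channel dominates the loss by
# raw magnitude alone.
force_norm = zscore(force_n, force_n.mean(), force_n.std())
joints_norm = zscore(joints_rad, joints_rad.mean(), joints_rad.std())
print(force_norm.mean())   # ~0.0
print(joints_norm.mean())  # ~0.0
```

This is the same transform the saved statistics file drives at training time: each channel is shifted and scaled by its own dataset mean and standard deviation.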
Step 3 — Training

Training Command

Launch ACT training with the tactile config. Training time: ~45 minutes on an 8GB GPU, ~3 hours on CPU.

```bash
# Install LeRobot if not already installed
pip install lerobot

# Train ACT with tactile modality
python lerobot/scripts/train.py \
  --config-name act_paxini \
  --dataset-path ./tactile_dataset/ \
  --stats-path ./act_paxini_stats.yaml \
  --output-dir ./checkpoints/act_tactile/ \
  --num-epochs 300 \
  --batch-size 8 \
  --seed 42

# Monitor training loss (in a separate terminal)
tensorboard --logdir ./checkpoints/act_tactile/logs/
```

Training is complete when the validation loss plateaus. For 50 episodes at 100 Hz, expect convergence around epoch 200–250. The tactile_obs_loss metric in TensorBoard should decrease steadily alongside the main action loss.
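If you want to detect the plateau programmatically rather than eyeballing TensorBoard, a simple patience-based check works. This is a standalone sketch, not a LeRobot feature:

```python
def has_plateaued(val_losses: list[float], patience: int = 10,
                  min_delta: float = 1e-3) -> bool:
    """Return True when the best validation loss has not improved by at
    least min_delta over the last `patience` epochs."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta

# Synthetic curve: steady decrease for 40 epochs, then flat
losses = [1.0 / (1 + e) for e in range(40)] + [0.025] * 15
print(has_plateaued(losses, patience=10))       # True (flat tail)
print(has_plateaued(losses[:20], patience=10))  # False (still improving)
```

The same check applied to the tactile_obs_loss curve tells you whether the tactile branch specifically has stopped learning.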

Step 4 — Evaluate vs. Vision-Only Baseline

Evaluating Improvement Over Vision-Only

Train a second policy using only vision + proprioception (no tactile channels) on the same dataset. This is your baseline. Run evaluation over 20 rollouts for each policy on your target task:

```bash
# Train the vision-only baseline (same config, minus tactile keys)
python lerobot/scripts/train.py \
  --config-name act_vision_only \
  --dataset-path ./tactile_dataset/ \
  --output-dir ./checkpoints/act_baseline/

# Evaluate both policies — 20 rollouts each
python lerobot/scripts/eval.py \
  --checkpoint ./checkpoints/act_tactile/best.ckpt \
  --num-episodes 20 \
  --eval-name "tactile"

python lerobot/scripts/eval.py \
  --checkpoint ./checkpoints/act_baseline/best.ckpt \
  --num-episodes 20 \
  --eval-name "baseline"
```

Look for improvement in grasp stability metrics — specifically task success rate on slip-prone objects and average hold duration during the placement phase. Tactile-aware policies typically show the largest improvement on deformable and variably-weighted objects.
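One way to tabulate those two metrics is to log each rollout as a (success, hold_duration_s) pair and aggregate. The record format here is hypothetical; adapt it to whatever your eval script actually emits:

```python
from statistics import mean

def summarize(rollouts: list[tuple[bool, float]]) -> dict:
    """Aggregate per-rollout (success, hold_duration_s) records."""
    successes = [r[0] for r in rollouts]
    holds = [r[1] for r in rollouts if r[0]]  # hold time on successful grasps only
    return {
        "success_rate": mean(successes),
        "avg_hold_s": mean(holds) if holds else 0.0,
    }

# Illustrative results, not real measurements
tactile = [(True, 4.2), (True, 3.8), (False, 0.9), (True, 4.5)]
baseline = [(True, 2.1), (False, 0.4), (False, 0.7), (True, 2.6)]
print(summarize(tactile))   # success_rate: 0.75
print(summarize(baseline))  # success_rate: 0.5
```

Restricting avg_hold_s to successful rollouts keeps early drops from dragging the average down and conflating the two metrics.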

What to do if tactile does not improve performance

First, verify your dataset's contact events align with video (Unit 4 quality check). Second, confirm normalization statistics were computed correctly — incorrect normalization is the most common training failure mode for new modalities. Third, try adding the pressure_map as an observation instead of just the scalar — it provides richer spatial context.
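For the second check, a quick sanity test is to re-apply the saved statistics to the raw data and confirm the result is approximately zero-mean, unit-std. This sketch uses synthetic data and the example statistics from Step 2; the helper is not part of the Paxini SDK:

```python
import numpy as np

def stats_are_consistent(values, mean: float, std: float, tol: float = 0.1) -> bool:
    """Z-score the data with the saved stats; if the stats match the
    dataset, the result should be ~N(0, 1). A large residual mean or
    std means the stats are stale or from a different dataset."""
    z = (np.asarray(values) - mean) / std
    return abs(z.mean()) < tol and abs(z.std() - 1.0) < tol

rng = np.random.default_rng(7)
force = rng.normal(2.84, 1.93, size=5000)  # synthetic data matching Step 2 stats

print(stats_are_consistent(force, mean=2.84, std=1.93))  # True: stats match
print(stats_are_consistent(force, mean=0.0, std=1.0))    # False: stale stats
```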

Path Complete

You have gone from first sensor reading to a trained tactile-aware manipulation policy. You now have a complete pipeline — hardware, data, and policy — that you can apply to any contact-rich task.

Unit 5 Complete When...

Training completes without errors and produces a checkpoint. The tactile validation loss decreases and plateaus during training. You have run 20 evaluation rollouts for both the tactile policy and the vision-only baseline and recorded the success rates for comparison.