About

How This Sign Translation App Works

This project provides experimental, real‑time sign letter detection and learning tools powered by browser‑side machine learning—no external API calls required for inference.

Model Creation & Classification Flow

This high-level flow shows how raw captured hand landmarks become a trained model and how that model runs in the browser to classify live input.

Diagram: Offline training produces an exported JSON model consumed by the client‑side inference loop.
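
For illustration, a minimal sketch of how the browser side might load that exported JSON model with TensorFlow.js; the model path used here is an assumption, not the app's actual asset location:

```ts
import * as tf from '@tensorflow/tfjs';

let model: tf.LayersModel;

async function loadClassifier(): Promise<void> {
  // tf.loadLayersModel fetches the topology JSON plus its weight shards.
  // The path below is hypothetical; substitute wherever the export lives.
  model = await tf.loadLayersModel('/models/alphabet/model.json');
}
```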

Overview

The application combines a hand landmark detector (MediaPipe Tasks) with a custom TensorFlow.js classification model trained on locally collected samples. It focuses on handshape-based alphabet recognition (fingerspelling) and phrase exploration using mapped reference media.

Experimental: Models are prototype quality and not a full or authoritative representation of any standardized sign language.

Processing Pipeline

1. Video Capture

Accesses your camera using getUserMedia(). Frames never leave your device.
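
A minimal sketch of this step; the element wiring is illustrative:

```ts
// Request the front camera and pipe the stream into a <video> element.
// The stream stays in the page; no frames are transmitted anywhere.
async function startCamera(video: HTMLVideoElement): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: 'user' },
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
}
```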

2. Hand Landmarks

MediaPipe produces 3D landmark coordinates (normalized) for each detected hand.
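
A minimal sketch using @mediapipe/tasks-vision; the WASM and model asset URLs are assumptions about where those files are hosted:

```ts
import { FilesetResolver, HandLandmarker } from '@mediapipe/tasks-vision';

async function createLandmarker(): Promise<HandLandmarker> {
  const vision = await FilesetResolver.forVisionTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm' // assumed CDN location
  );
  return HandLandmarker.createFromOptions(vision, {
    baseOptions: { modelAssetPath: '/models/hand_landmarker.task' }, // hypothetical path
    runningMode: 'VIDEO',
    numHands: 2,
  });
}

// Per frame, detectForVideo returns 21 normalized (x, y, z) landmarks
// for each detected hand:
// const result = landmarker.detectForVideo(videoElement, performance.now());
```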

3. Feature Normalization

Coordinates are re-rooted at the wrist, scale-normalized, and flattened into a feature vector, as sketched below.
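
One plausible version of this normalization, assuming wrist re-rooting plus max-magnitude scaling (the exact scaling rule is an assumption):

```ts
interface Landmark { x: number; y: number; z: number; }

function toFeatureVector(landmarks: Landmark[]): number[] {
  // Re-root every point at the wrist (landmark index 0).
  const wrist = landmarks[0];
  const centered = landmarks.map((p) => [p.x - wrist.x, p.y - wrist.y, p.z - wrist.z]);
  // Scale so the largest absolute component becomes 1, for size invariance.
  const flat = centered.flat();
  const maxAbs = Math.max(...flat.map(Math.abs), 1e-6);
  return flat.map((v) => v / maxAbs); // 21 landmarks × 3 = 63 values per hand
}
```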

4. Model Inference

Custom TF.js model outputs per-letter probabilities each frame.
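
A sketch of the per-frame prediction, reusing the model loaded earlier; the letter list and argmax readout are assumptions:

```ts
import * as tf from '@tensorflow/tfjs';

declare const model: tf.LayersModel; // loaded as in the earlier sketch
const LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split(''); // assumed class order

function classify(features: number[]): { letter: string; prob: number } {
  const input = tf.tensor2d([features]); // shape [1, featureLength]
  const output = model.predict(input) as tf.Tensor;
  const probs = output.dataSync(); // per-letter probabilities
  input.dispose();
  output.dispose();
  // Argmax: pick the most probable letter.
  let best = 0;
  for (let i = 1; i < probs.length; i++) if (probs[i] > probs[best]) best = i;
  return { letter: LETTERS[best], prob: probs[best] };
}
```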

5. Temporal Hold

A detected letter must remain stable above the confidence threshold for a hold duration before it is accepted; a sketch follows.
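
The threshold and duration values here are illustrative, not the app's actual settings:

```ts
const PROB_THRESHOLD = 0.85; // assumed confidence threshold
const HOLD_MS = 700;         // assumed hold duration

let heldLetter: string | null = null;
let heldSince = 0;

// Call once per frame; returns a letter only when it has stayed
// confidently stable for the full hold duration.
function updateHold(letter: string, prob: number, now: number): string | null {
  if (prob < PROB_THRESHOLD || letter !== heldLetter) {
    heldLetter = prob >= PROB_THRESHOLD ? letter : null;
    heldSince = now;
    return null;
  }
  if (now - heldSince >= HOLD_MS) {
    heldSince = now; // reset so a repeated letter can fire again
    return letter;
  }
  return null;
}
```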

6. Acceptance & Output

Accepted letters are appended to the transcript, and optional speech synthesis is triggered.
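
A sketch of that output step using the standard Web Speech API; the transcript variable is illustrative:

```ts
let transcript = '';

function acceptLetter(letter: string, speak: boolean): void {
  transcript += letter;
  if (speak && 'speechSynthesis' in window) {
    // Speak the accepted letter aloud via the browser's built-in voices.
    window.speechSynthesis.speak(new SpeechSynthesisUtterance(letter));
  }
}
```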

Model Details

Alphabet Classifier

Trained on locally captured landmark snapshots. Each sample is a 63- to 126-dimensional vector, depending on preprocessing (MediaPipe yields 21 landmarks × 3 coordinates, i.e. 63 values per hand). The model architecture (e.g., dense layers with dropout) is tuned for fast inference in the browser.
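
A sketch of a classifier of the kind described; the layer sizes and dropout rate are illustrative assumptions, not the exact trained architecture:

```ts
import * as tf from '@tensorflow/tfjs';

function buildAlphabetModel(featureLength: number, numClasses: number): tf.LayersModel {
  const m = tf.sequential();
  m.add(tf.layers.dense({ inputShape: [featureLength], units: 64, activation: 'relu' }));
  m.add(tf.layers.dropout({ rate: 0.3 })); // regularization for a small dataset
  m.add(tf.layers.dense({ units: 32, activation: 'relu' }));
  m.add(tf.layers.dense({ units: numClasses, activation: 'softmax' })); // per-letter probabilities
  m.compile({ optimizer: 'adam', loss: 'categoricalCrossentropy', metrics: ['accuracy'] });
  return m;
}
```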

Filipino Experimental Variant

The Filipino model card references a parallel dataset exploring localized sign variants. Sample coverage is currently limited; feedback helps identify misclassifications and guide dataset balancing.

Inference Performance

Privacy & Data

The app does not upload your camera frames or detected landmarks. All inference happens locally inside the browser. No persistent personal data storage is performed from public pages.

Limitations

Planned Enhancements

Give Feedback

Spot a misclassified letter or want to contribute samples? Open an issue or send feedback via a future integrated form. Your input helps expand coverage and improve fairness.