From Raw Video to Labeled Signs

Raw footage in, production-ready sign language data out.

Sign language has never had a clean dataset, not in the way computer vision has ImageNet or NLP has Common Crawl. What exists today are fragmented corpora, small and inconsistent, built by hearing teams who label what they see rather than what is actually being said.

Raw video of someone signing is not data. It is footage, and the gap between the two is exactly where every Sign Language AI model breaks down. CLERC exists to close that gap.

The problem with raw

A video of a Deaf signer contains a dense, three-dimensional linguistic signal where handshape, movement, location, facial grammar, and spatial reference all happen simultaneously at conversational speed.

Most existing datasets flatten all of this into a single English word per clip and call it done. That is not structure; it is a rough approximation at best. Models trained on these datasets perform well in controlled demos and collapse in production because they cannot handle signer variation, register shifts, or open vocabulary. The architecture is rarely the problem. The training data almost always is.

What CLERC does differently

We built a pipeline that transforms raw signing footage into structured, linguistically grounded data in three stages, with no shortcuts along the way.

Stage 01 / Source. Every video is recorded with native Deaf signers at their natural signing pace, without scripts adapted from English sentence structure and without hearing actors. The source material is authentic because the people producing it are native signers of the language.

Stage 02 / Structure. We extract the motion signal frame by frame across body, hands, and face, capturing and timestamping every micro-movement. This is not pose estimation for its own sake but the raw spatial grammar of the language, preserved at full resolution.
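To make the stage concrete, here is a minimal sketch of what a timestamped frame-level record in such a pipeline could look like. Every field name and landmark count below is an illustrative assumption, not CLERC's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Keypoint:
    """A single tracked landmark in normalized image coordinates."""
    x: float
    y: float
    confidence: float

@dataclass
class Frame:
    """One timestamped slice of the motion signal across body, hands, and face."""
    timestamp_ms: int
    body: list        # pose landmarks: shoulders, elbows, wrists, ...
    left_hand: list   # 21 landmarks per hand is a common convention
    right_hand: list
    face: list        # facial landmarks carry grammatical markers

def frame_times_ms(duration_s: float, fps: float = 30.0) -> list:
    """Millisecond timestamps for every frame in a clip of the given
    duration; at 30 fps a frame lands roughly every 33 ms."""
    n = int(duration_s * fps)
    return [round(i * 1000 / fps) for i in range(n)]
```

Timestamping at the frame level is what later makes millisecond segment boundaries meaningful: a boundary can only be as precise as the clock attached to the frames underneath it.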

Stage 03 / Meaning. Each sign is mapped to its gloss, its linguistic label, in context. The question is not just "what handshape is this" but "what does this sign mean in this sentence, signed by this person, at this moment." Temporal boundaries are marked at the millisecond, and the data carries meaning rather than just motion.
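One way to represent such a gloss-in-context mapping, sketched here under assumed field names (not CLERC's actual schema), is a segment record plus a boundary check:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GlossSegment:
    """One sign mapped to its gloss, with millisecond boundaries."""
    gloss: str       # the linguistic label, conventionally uppercase
    start_ms: int    # temporal boundary marked at the millisecond
    end_ms: int
    signer_id: str   # the same gloss can vary by signer and context

def validate(segments: list) -> None:
    """Boundaries must be well-formed and non-overlapping in time."""
    for seg in segments:
        if seg.end_ms <= seg.start_ms:
            raise ValueError(f"{seg.gloss}: end must come after start")
    ordered = sorted(segments, key=lambda s: s.start_ms)
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt.start_ms < prev.end_ms:
            raise ValueError(f"{prev.gloss} and {nxt.gloss} overlap")
```

Checks like this are cheap, but they are exactly what a single-label-per-clip dataset cannot even express: there are no boundaries to validate.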

The output is production-ready sign language data where every sign has been captured, segmented, and mapped to context.

What this looks like in practice

Take a simple sentence: "Are you tired?"

In most datasets, this would be a single video file tagged with an English translation, with no internal structure, no way to isolate individual signs, and no temporal information whatsoever.

In the CLERC pipeline, the same sentence becomes two distinct segments: YOU from 0.10 to 0.30s and TIRED starting at 0.60s. Each gloss carries its own temporal boundary, and each segment can be replayed, analyzed, and trained on in isolation. That level of granularity is what separates footage from data, and it does not exist in most sign language datasets today.
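The example above can be sketched as a structured annotation record. Field names are illustrative assumptions, not CLERC's schema, and TIRED's end boundary is left open here because the example does not state it:

```python
# Hypothetical annotation record for "Are you tired?"
annotation = {
    "translation": "Are you tired?",
    "segments": [
        {"gloss": "YOU",   "start_ms": 100, "end_ms": 300},
        {"gloss": "TIRED", "start_ms": 600, "end_ms": None},  # end left open
    ],
}

def frames_for_segment(segment, frame_timestamps_ms):
    """Isolate the frames belonging to one sign, so it can be replayed,
    analyzed, or trained on independently of the rest of the sentence."""
    start = segment["start_ms"]
    end = segment["end_ms"] if segment["end_ms"] is not None else float("inf")
    return [t for t in frame_timestamps_ms if start <= t <= end]

# One second of 30 fps video: a frame roughly every 33 ms.
timestamps = [round(i * 1000 / 30) for i in range(30)]
you_frames = frames_for_segment(annotation["segments"][0], timestamps)
```

With this structure, slicing out YOU is a filter over timestamps rather than a manual scrub through the video, which is the practical difference between footage and data.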

Why this matters

If you are training a sign language recognition model, your performance ceiling is determined by the quality of your data rather than the sophistication of your architecture. A transformer trained on poorly segmented, inconsistently glossed video will plateau early and fail on every edge case. Every researcher in the field already knows this; very few have the infrastructure to actually fix it.

If you are building a product that relies on sign language understanding, whether that is a recognition system, an avatar, a search engine, or something entirely new, the same constraint applies. Your model is only as good as the data underneath it, and right now that data barely exists in a structured form.

CLERC is building that infrastructure: structured glosses from native signers, millisecond-level segmentation, and a foundation that AI and ML teams can actually build on.

Built by the Deaf community, for everyone

This is not a dataset built about Deaf people but a dataset built by Deaf people.

The founder is Deaf, and the signers are native, which is not a value statement but a data quality requirement. A hearing team making judgment calls on gloss boundaries, regional variants, and prosodic features introduces systematic error that compounds silently through every downstream model. We eliminated that error at the source by making sure the people who build the data are the same people who live the language.

See it in action

The video below shows the pipeline in action, from raw footage to structured, labeled sign language data. This is what the foundation looks like when it is built correctly.

If you are working on Sign Language AI and want to dig into the technical details, we are happy to walk you through it. Reach out at florian@clerc.io.