How to Build a Locale‑Aware Word Counter Using Intl.Segmenter

4 March 2026 by Suraj Barman

Intl.Segmenter provides a standard API for breaking strings into locale‑specific units such as graphemes, words, or sentences, allowing developers to write language‑agnostic text processing code.

Creating a Segmenter Instance

Instantiate the segmenter with a locale identifier and an options object that defines the desired granularity. This object determines how the input text will be partitioned.

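Use new Intl.Segmenter(locale, { granularity: "word" }) for word‑level splitting.
Pass "ja-JP" for Japanese, "hi" for Hindi, etc.
The constructor throws a RangeError only if the locale tag is malformed or the granularity value is invalid; a well‑formed but unsupported locale falls back to the runtime's default locale without an error.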
Store the instance for reuse to avoid repeated construction overhead.
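The points above can be sketched as follows. This is a minimal illustration, not a definitive implementation; the names englishWords and tryCreateSegmenter are hypothetical and chosen only for this example.

```javascript
// Create the word segmenter once and reuse it for every call.
const englishWords = new Intl.Segmenter("en", { granularity: "word" });

// resolvedOptions() reports the locale the runtime actually selected,
// which can differ from the requested one if a fallback occurred.
const { locale, granularity } = englishWords.resolvedOptions();
console.log(locale, granularity);

// Hypothetical helper: returns null for a malformed locale tag.
// Note that an unsupported but well-formed tag does NOT throw.
function tryCreateSegmenter(tag) {
  try {
    return new Intl.Segmenter(tag, { granularity: "word" });
  } catch (err) {
    if (err instanceof RangeError) return null;
    throw err; // anything else is unexpected
  }
}

console.log(tryCreateSegmenter("ja-JP") !== null); // well-formed tag
console.log(tryCreateSegmenter("not_a_locale"));   // malformed tag (underscores)
```

Comparing resolvedOptions().locale with the requested tag is one way to notice a fallback after the fact.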

Granularity Options

The granularity field can be set to grapheme, word, or sentence. Choose the level that matches your use case.

grapheme: splits at user‑perceived characters, useful for character counters.
word: splits at word boundaries; each segment carries an isWordLike flag that is false for punctuation and whitespace.
sentence: isolates complete sentences, handling language‑specific end marks.
Default granularity is grapheme when no option is supplied.
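To make the difference between the three levels concrete, here is a small sketch that runs the same helper with each granularity. The helper name splitBy and the sample strings are assumptions made for illustration.

```javascript
// Hypothetical helper: split text with the given granularity and
// return the plain string of each segment.
function splitBy(text, granularity, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity });
  return Array.from(segmenter.segment(text), (item) => item.segment);
}

const sample = "Hi there. How are you?";

// "cafe\u0301" is "café" written with a combining accent:
// 5 UTF-16 code units, but 4 user-perceived characters.
const graphemes = splitBy("cafe\u0301", "grapheme");
const words = splitBy(sample, "word");         // includes spaces and punctuation
const sentences = splitBy(sample, "sentence"); // each sentence keeps its trailing space

console.log(graphemes.length); // 4
console.log(words);
console.log(sentences.length); // 2

// With no options, the granularity resolves to "grapheme".
const defaultGranularity = new Intl.Segmenter("en").resolvedOptions().granularity;
console.log(defaultGranularity); // "grapheme"
```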

Filtering Word‑Like Segments

After segmentation, filter out non‑word tokens by checking the isWordLike boolean. This yields a clean array of lexical items.

Convert the iterable Segments object to an array with Array.from(segmenter.segment(text)).
Apply .filter(item => item.isWordLike) to drop punctuation and whitespace.
Map the remaining objects to item.segment to extract plain strings.
The resulting array can be counted with .length for an accurate word total.
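Put together, the filtering steps might look like the sketch below. The function name extractWords and the English sample sentence are assumptions for the sake of the example.

```javascript
// Hypothetical helper: return only the word-like segments of a text.
function extractWords(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  return Array.from(segmenter.segment(text))
    .filter((item) => item.isWordLike) // drop punctuation and whitespace
    .map((item) => item.segment);      // keep only the plain string
}

const words = extractWords("Hello, world! How are you?");
console.log(words);        // ["Hello", "world", "How", "are", "you"]
console.log(words.length); // 5
```

When only the total is needed, counting inside a for...of loop over segmenter.segment(text) avoids building the intermediate arrays.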

Checking Locale Support

Before deploying, verify that the target browsers support the required locales using Intl.Segmenter.supportedLocalesOf. This lets you detect cases where the segmenter would otherwise fall back silently to the runtime's default locale.

Provide an array of locale strings to the method.
The return value is an array containing only those requested locales that the runtime supports without falling back to the default locale.
If the array is empty, consider a polyfill or an alternative approach.
Combine this check with feature detection for broader compatibility.
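One possible way to combine the locale check with feature detection is sketched here. The function name checkSegmenterLocales and the shape of its return value are hypothetical; only Intl.Segmenter.supportedLocalesOf itself is part of the standard API.

```javascript
// Hypothetical helper: report which requested locales the runtime can
// segment without falling back to its default locale.
function checkSegmenterLocales(requested) {
  // Feature detection: older runtimes have no Intl.Segmenter at all.
  if (typeof Intl === "undefined" || typeof Intl.Segmenter !== "function") {
    return { available: false, supported: [], unsupported: [...requested] };
  }

  const supported = [];
  const unsupported = [];
  for (const tag of requested) {
    // Checking one tag at a time keeps the caller's original spelling,
    // since supportedLocalesOf returns canonicalized tags.
    // A malformed tag would make supportedLocalesOf throw a RangeError.
    const ok = Intl.Segmenter.supportedLocalesOf([tag]).length > 0;
    (ok ? supported : unsupported).push(tag);
  }
  return { available: true, supported, unsupported };
}

const report = checkSegmenterLocales(["en", "ja-JP"]);
console.log(report);
```

If report.available is false or report.unsupported is non-empty, that is the point at which a polyfill or an alternative approach could be loaded.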

Practical Example: Interactive Word Counter

This example attaches a mouseup listener to a paragraph, runs a Japanese word segmenter on the selected text, and displays the count in a <pre> element. The steps are:

Retrieve the selection with window.getSelection().
Pass selection.toString() into an Intl.Segmenter("ja-JP", { granularity: "word" }) instance.
Filter with .filter(item => item.isWordLike) and map to item.segment.
Update the UI by setting preElement.textContent = words.length.
For large‑scale apps, see our guide on real‑time orchestration for patterns that also apply to text pipelines.
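The interactive counter described above could be wired up roughly as follows. This is a sketch under assumptions: the element ids sample-text and word-count, and the function names selectedWords and attachWordCounter, are hypothetical and not taken from any particular page.

```javascript
// One shared Japanese word segmenter, created once and reused.
const japaneseWords = new Intl.Segmenter("ja-JP", { granularity: "word" });

// Return the word-like segments of a string as plain strings.
function selectedWords(text) {
  return Array.from(japaneseWords.segment(text))
    .filter((item) => item.isWordLike)
    .map((item) => item.segment);
}

// Update the output element whenever the user finishes a selection
// inside the given paragraph.
function attachWordCounter(paragraph, preElement) {
  paragraph.addEventListener("mouseup", () => {
    const selection = window.getSelection();
    const text = selection ? selection.toString() : "";
    const words = selectedWords(text);
    preElement.textContent = String(words.length);
  });
}

// Only touch the DOM when running in a browser.
// The ids below are assumptions for this example.
if (typeof document !== "undefined") {
  const paragraph = document.querySelector("#sample-text");
  const preElement = document.querySelector("#word-count");
  if (paragraph && preElement) attachWordCounter(paragraph, preElement);
}
```

Keeping the segmentation in selectedWords, separate from the event handler, makes the counting logic usable outside the browser as well.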

By following these steps, developers can replace fragile regular‑expression hacks with a reliable, locale‑aware solution that works across modern browsers.