About Polyglot Watchdog

What this tool does

Polyglot Watchdog helps teams find localization problems by comparing English reference captures with captures from other languages.

It collects page evidence, matches comparable UI elements, and surfaces issues with context so reviewers can quickly understand what needs to be fixed.

How to use it (6 tabs)

  1. ADD URL

    Add the page links you want checked. This creates the review list for your run.

  2. FIRST RUN

    Start the first run to confirm the selected pages can be checked successfully from start to finish.

  3. CHOOSE NEEDED

    Keep only the pages and results your team needs right now. This narrows the work to priority items.

  4. CHECK LANGUAGES

    Review the selected content across languages to find text that is missing, unclear, or inconsistent.

  5. SEE ERRORS

    Open failed checks to see what stopped the run and what must be fixed before you run again.

  6. ABOUT

    Use this page for a quick reminder of each tab and what outcome to expect at every step.

What the reference pages mean

Reference pages are the English baseline used for comparison. They represent the expected meaning and structure for each capture context.

When a target-language page is compared, differences are evaluated against this baseline to identify potential translation, consistency, or rendering issues.

What errors and issue states mean

An error usually means the system could not complete a required step, for example because artifacts are missing or the evidence is incomplete.

An issue state shows where a detected issue sits in the review workflow, for example newly detected, triaged, or resolved.

Who this is for

This tool is intended for localization QA analysts, translators, language leads, and product teams who need clear evidence to review multilingual quality.

It is also useful for engineering teams that want deterministic, repeatable localization checks in release workflows.

LLM Wire Format Dictionary

To reduce token usage, the app uses a compact JSON wire format on the LLM request/response path only. Internally and in the UI, values are decoded back into readable labels.

Request format

{"l":"<target_language>","i":[[id,en,tg,k,c,m,p]]}

  • l: target language
  • i: list of items, each an array with the fields below in this order
  • id: numeric item id
  • en: English source text
  • tg: target text
  • k: kind code
  • c: context code
  • m: masked flag (0/1)
  • p: low pairing confidence flag (0/1)
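
As an illustration of the request format above, here is a minimal Python sketch that packs readable items into the compact form. The function name `encode_request` and the input dictionary keys (`kind`, `context`, `masked`, `low_pairing_confidence`) are assumptions made for this example, not the app's actual internals; the code tables are copied from the dictionary in this document.

```python
import json

# Code tables copied from the kind and context dictionaries in this document.
KIND_CODES = {"a": 0, "p": 1, "h1": 2, "img": 3, "short_text": 4}
CONTEXT_CODES = {
    "nav": 0, "footer": 1, "title": 2, "body": 3,
    "img_alt": 4, "brand": 5, "lang_switcher": 6, "cta": 7,
}


def encode_request(target_language, items):
    """Pack readable items into the compact request string.

    `items` is a list of dicts; the key names are hypothetical.
    """
    rows = [
        [
            item["id"],
            item["en"],
            item["tg"],
            KIND_CODES[item["kind"]],
            CONTEXT_CODES[item["context"]],
            1 if item.get("masked") else 0,
            1 if item.get("low_pairing_confidence") else 0,
        ]
        for item in items
    ]
    # Compact separators drop the spaces json.dumps adds by default.
    return json.dumps(
        {"l": target_language, "i": rows},
        separators=(",", ":"),
        ensure_ascii=False,
    )


payload = encode_request(
    "de",
    [{"id": 1, "en": "Sign in", "tg": "Anmelden", "kind": "a", "context": "nav"}],
)
print(payload)  # {"l":"de","i":[[1,"Sign in","Anmelden",0,0,0,0]]}
```

Positional arrays avoid repeating field names for every item, which is where most of the token saving would come from.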

Response format

{"r":[[id,s,g,m,n]]}

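  • r: list of results, each an array with the fields below in this order
  • id: numeric item id (same field as in the request)
  • s: spelling risk score (0..100)
  • g: grammar risk score (0..100)
  • m: meaning mismatch risk score (0..100); note that in the request, m is the masked flag
  • n: primary note code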
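
A response in this shape could be unpacked as in the sketch below. The names `ItemResult` and `decode_response`, and the choice to reject malformed rows with `ValueError`, are assumptions for illustration; the document does not say how the app handles invalid responses.

```python
import json
from typing import NamedTuple


class ItemResult(NamedTuple):
    """One decoded response row (field names are hypothetical)."""
    id: int
    spelling: int
    grammar: int
    meaning: int
    note: int


def decode_response(raw):
    """Parse the compact response string into a list of ItemResult."""
    data = json.loads(raw)
    results = []
    for row in data["r"]:
        if len(row) != 5:
            raise ValueError(f"expected 5 fields, got {len(row)}: {row!r}")
        item_id, s, g, m, n = row
        # Risk scores are documented as 0..100; treat anything else as invalid.
        for score in (s, g, m):
            if not 0 <= score <= 100:
                raise ValueError(f"score out of range in row {row!r}")
        results.append(ItemResult(item_id, s, g, m, n))
    return results


results = decode_response('{"r":[[1,5,0,80,3]]}')
print(results[0].meaning)  # 80
```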

Kind codes (k)

0=a, 1=p, 2=h1, 3=img, 4=short_text

Context codes (c)

0=nav, 1=footer, 2=title, 3=body, 4=img_alt, 5=brand, 6=lang_switcher, 7=cta

Note codes (n)

0=ok, 1=spell, 2=grammar, 3=meaning, 4=untranslated, 5=partial, 6=brand_ok, 7=same_ok, 8=masked_ok, 9=adult_ok, 10=uncertain
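
Because every code above is a zero-based integer, decoding back to readable labels can be a plain list lookup. The sketch below assumes a helper named `label_for` and an `unknown(...)` fallback for codes outside the table; both are illustrative choices, not documented behavior.

```python
# Label tables in code order, copied from the dictionary in this document.
KIND_LABELS = ["a", "p", "h1", "img", "short_text"]
CONTEXT_LABELS = [
    "nav", "footer", "title", "body",
    "img_alt", "brand", "lang_switcher", "cta",
]
NOTE_LABELS = [
    "ok", "spell", "grammar", "meaning", "untranslated", "partial",
    "brand_ok", "same_ok", "masked_ok", "adult_ok", "uncertain",
]


def label_for(table, code):
    """Return the readable label for a code, or a marker for unknown codes."""
    if isinstance(code, int) and 0 <= code < len(table):
        return table[code]
    # Assumed fallback: keep the raw code visible instead of raising.
    return f"unknown({code})"


print(label_for(NOTE_LABELS, 4))     # untranslated
print(label_for(CONTEXT_LABELS, 7))  # cta
print(label_for(KIND_LABELS, 9))     # unknown(9)
```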