This article is the developer companion to the user-facing string scanner documentation. It traces the scanner end-to-end through the wpresidence-translate plugin, names every function involved, and points out the option keys and cache signatures you need to know when debugging. Product context lives on the multi-language real estate website page.
Files Involved
| File | Role |
|---|---|
| includes/admin/string-scanner.php | Orchestrator. Owns wpr_translate_admin_handle_strings_actions(), wpr_translate_admin_scan_strings(), and the incremental mtime guard. |
| includes/admin/string-targets.php | Builds scan targets. wpr_translate_admin_get_scan_targets(), wpr_translate_admin_get_theme_domain(), wpr_translate_admin_get_plugin_domain(), wpr_translate_admin_locate_languages_directory(). |
| includes/admin/string-parser.php | Parses gettext files. wpr_translate_admin_locate_translation_files(), wpr_translate_admin_extract_strings_from_translation_file(), wpr_translate_admin_determine_language_code_from_file(), wpr_translate_admin_collect_strings_from_path(). |
| includes/admin/string-storage.php | Persistence. wpr_translate_admin_build_language_maps(), wpr_translate_admin_persist_detected_strings(), wpr_translate_admin_insert_string_rows_batch(). |
| includes/admin/string-database.php | Table existence + schema upgrade helper (processed flag). |
Entry Point
The scan is triggered from the String Translation admin screen via a POST form protected by the wpr_translate_scan_strings / wpr_translate_scan_nonce nonce. wpr_translate_admin_handle_strings_actions() verifies the nonce and the manage_options capability, then delegates to:
$result = wpr_translate_admin_scan_strings(); // returns array( 'new' => <int> ) or WP_Error
Target Construction
wpr_translate_admin_get_scan_targets() builds an ordered list of targets, each an associative array with context, path, languages_path, and domain:
- Child theme – context = 'theme:' . sanitize_title( $child_slug ).
- Parent theme – included when $theme->parent() is a WP_Theme and its realpath differs from the child path.
- Active plugins – merged with active_sitewide_plugins on multisite; single-file plugins (dirname === '.') are skipped. Context is 'plugin:' . sanitize_title( $slug ).
Every target must have a readable languages/ subdirectory resolved through wpr_translate_admin_locate_languages_directory(); targets without one are silently dropped.
The text-domain resolution helpers (wpr_translate_admin_get_theme_domain(), wpr_translate_admin_get_plugin_domain()) prefer the Text Domain header and fall back to the stylesheet/template name or the plugin directory slug.
Incremental Scan State
Every target’s languages/ directory is walked with RecursiveDirectoryIterator inside wpr_translate_admin_get_languages_directory_mtime(). The latest file mtime becomes the signature. It is stored in the wpr_translate_scan_state option keyed by context:
array(
    'theme:wpresidence' => array(
        'mtime'     => 1713300000,
        'languages' => md5( wp_json_encode( $languages ) ),
    ),
    'plugin:woocommerce' => ...
)
A target is skipped when both the directory mtime and the language signature match the previous run. This makes repeated scans cheap. Resetting via wpr_translate_admin_reset_strings() deletes this option so the next scan runs fully.
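The skip decision described above can be sketched in plain PHP. This is a minimal illustration, not the plugin's code: the function names example_languages_dir_mtime() and example_target_is_unchanged() are hypothetical, json_encode() stands in for wp_json_encode() so the snippet runs outside WordPress, and catching UnexpectedValueException is an assumption about which exception the real walk handles.

```php
<?php
/**
 * Hypothetical helper: newest file mtime found under $dir.
 * Returns the partial result (possibly 0) when a directory cannot be read.
 */
function example_languages_dir_mtime( string $dir ): int {
    $latest = 0;
    try {
        $files = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator( $dir, FilesystemIterator::SKIP_DOTS )
        );
        foreach ( $files as $file ) {
            if ( $file->isFile() ) {
                $latest = max( $latest, (int) $file->getMTime() );
            }
        }
    } catch ( UnexpectedValueException $e ) {
        // Unreadable directory: keep whatever mtime was observed so far.
    }
    return $latest;
}

/**
 * Hypothetical helper: true when both the mtime and the language
 * signature match the entry stored for $context in the scan state.
 */
function example_target_is_unchanged( array $state, string $context, int $mtime, array $languages ): bool {
    // Stand-in for md5( wp_json_encode( $languages ) ).
    $signature = md5( json_encode( $languages ) );
    $previous  = $state[ $context ] ?? null;

    return is_array( $previous )
        && ( $previous['mtime'] ?? null ) === $mtime
        && ( $previous['languages'] ?? null ) === $signature;
}
```

Comparing both values means that adding or removing a language invalidates the cache even when no translation file has changed on disk.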
Parsing
wpr_translate_admin_collect_strings_from_path() drives parsing:
- Lists candidate .mo/.po files via wpr_translate_admin_locate_translation_files().
- For each file calls wpr_translate_admin_extract_strings_from_translation_file() which returns array( 'locale' => …, 'strings' => array( array( 'original' => …, 'translation' => … ), … ) ).
- Determines the target language code with wpr_translate_admin_determine_language_code_from_file() using the locale and code maps from wpr_translate_admin_build_language_maps().
- Hashes entries into $strings[ md5( $context . '|' . $value ) ] with name = 'str_' . md5( $value ).
Entries from multiple files (for example fr_FR.po, de_DE.po) are merged under the same hash – one bucket per source string, with a translations sub-array keyed by language code.
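To make the bucket merge concrete, here is a small sketch of how entries from two files could collapse into one bucket. The helper name example_merge_file_strings() and the context and value fields of the bucket are assumptions for illustration; only the hash key, the name format, and the translations sub-array come from the description above.

```php
<?php
/**
 * Hypothetical helper: merge the parsed entries of one translation file
 * into the shared $strings map, one bucket per source string.
 */
function example_merge_file_strings( array $strings, string $context, string $language_code, array $entries ): array {
    foreach ( $entries as $entry ) {
        $value = $entry['original'];
        $key   = md5( $context . '|' . $value );

        if ( ! isset( $strings[ $key ] ) ) {
            $strings[ $key ] = array(
                'context'      => $context,               // assumed field
                'name'         => 'str_' . md5( $value ),
                'value'        => $value,                 // assumed field
                'translations' => array(),
            );
        }

        // Keyed by language code, so a second file adds to the same bucket.
        $strings[ $key ]['translations'][ $language_code ] = $entry['translation'];
    }
    return $strings;
}

$strings = array();
$strings = example_merge_file_strings(
    $strings,
    'theme:wpresidence',
    'fr',
    array( array( 'original' => 'Bedrooms', 'translation' => 'Chambres' ) )
);
$strings = example_merge_file_strings(
    $strings,
    'theme:wpresidence',
    'de',
    array( array( 'original' => 'Bedrooms', 'translation' => 'Schlafzimmer' ) )
);
```

After both calls $strings holds a single bucket whose translations array has the keys fr and de.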
Persistence
wpr_translate_admin_persist_detected_strings() performs:
- Schema check via wpr_translate_admin_strings_table_exists() and lazy column upgrade via wpr_translate_admin_ensure_processed_column().
- Pre-loads existing rows in chunks of 25 contexts via wpr_translate_admin_get_existing_strings_map() – keyed context|name|language_code → string_id.
- Wraps everything in a transaction (START TRANSACTION → COMMIT).
- For each string, iterates configured language codes:
  - Default language row always stores status = 1 and translation = $value.
  - Non-default rows store status = 1 only when the translation is non-empty after wp_strip_all_tags().
  - processed is set to 0 when the row needs export, 1 otherwise.
- Existing rows get an UPDATE; new rows are queued and inserted in batches of 50 via a single multi-row INSERT statement (wpr_translate_admin_insert_string_rows_batch()).
The return value is the number of newly inserted default-language rows, which the UI reports as “new strings registered”.
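The batching step can be illustrated with a sketch that turns queued rows into multi-row INSERT statements of at most 50 rows each. This is not the body of wpr_translate_admin_insert_string_rows_batch(): the helper name, the table name, and the column list are assumptions, and the placeholders follow the $wpdb->prepare() style so each statement could be prepared with its argument list.

```php
<?php
/**
 * Hypothetical helper: build one multi-row INSERT per batch of queued rows.
 * Returns a list of array( 'sql' => string, 'args' => array ).
 */
function example_build_batch_inserts( string $table, array $rows, int $batch_size = 50 ): array {
    // Assumed column list; check the real schema before reusing this.
    $columns    = array( 'context', 'name', 'language_code', 'value', 'translation', 'status', 'processed' );
    $statements = array();

    foreach ( array_chunk( $rows, max( 1, $batch_size ) ) as $chunk ) {
        $placeholders = array();
        $args         = array();

        foreach ( $chunk as $row ) {
            $placeholders[] = '(%s, %s, %s, %s, %s, %d, %d)';
            foreach ( $columns as $column ) {
                $args[] = $row[ $column ];
            }
        }

        $statements[] = array(
            'sql'  => sprintf(
                'INSERT INTO %s (%s) VALUES %s',
                $table,
                implode( ', ', $columns ),
                implode( ', ', $placeholders )
            ),
            'args' => $args,
        );
    }
    return $statements;
}
```

Inside WordPress, each entry would typically be run as $wpdb->query( $wpdb->prepare( $statement['sql'], $statement['args'] ) ) within the surrounding transaction.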
Options Touched
| Option | Purpose |
|---|---|
| wpr_translate_languages | Source of active language codes/locales. Required for scanning. |
| wpr_translate_scan_state | Per-context mtime + languages-signature cache. |
| wpr_translate_theme_admin_strings_domain | Stored domain from JSON-declared theme admin strings. |
| wpr_translate_theme_admin_strings_hash | Hash of the JSON file to skip reimport when unchanged. |
Failure Modes
- No languages configured → WP_Error( 'wpr_translate_missing_languages' ).
- No default language → WP_Error( 'wpr_translate_no_default_language' ).
- Missing helper wpr_translate_admin_get_default_language_code → WP_Error( 'wpr_translate_missing_helper' ).
- Unreadable directory during mtime walk → the exception thrown by RecursiveDirectoryIterator is caught; the walk returns the partial mtime observed so far.
Non-Latin Safety
Language codes are normalised via strtolower() and locales via str_replace( '-', '_', $locale ); values and translations are stored verbatim. Do not pass value or translation through sanitize_title() in extensions: it is a slug sanitiser that rewrites non-Latin text, and Cyrillic, Arabic, and CJK characters must survive end-to-end.
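As a quick illustration of that split between normalised identifiers and untouched content, the sketch below uses two hypothetical wrapper functions around the calls named above. Only the identifiers are rewritten; the sample value is hashed for its name but never altered.

```php
<?php
// Hypothetical wrappers around the normalisation calls described above.
function example_normalise_language_code( string $code ): string {
    return strtolower( $code );
}

function example_normalise_locale( string $locale ): string {
    return str_replace( '-', '_', $locale );
}

$code   = example_normalise_language_code( 'ZH' );  // 'zh'
$locale = example_normalise_locale( 'pt-BR' );      // 'pt_BR'

// Content is never rewritten: the name is an ASCII hash of the value,
// while the value itself is kept exactly as read from the file.
$value  = 'Спальни';
$name   = 'str_' . md5( $value );
$stored = $value;
```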
Extension Ideas
- Add a custom scan target by writing a helper that mirrors wpr_translate_admin_get_scan_targets() and calling wpr_translate_admin_collect_strings_from_path() / wpr_translate_admin_persist_detected_strings() directly from your own admin action.
- Force a full rescan programmatically with delete_option( 'wpr_translate_scan_state' ) followed by wpr_translate_admin_scan_strings().
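The forced rescan could be wrapped roughly as follows. This sketch assumes it runs inside WordPress with the plugin's admin files loaded; example_force_full_rescan() is a hypothetical name, and logging the error code through error_log() is just one way to surface a failure.

```php
<?php
/**
 * Hypothetical wrapper: clear the incremental scan cache and rescan.
 * Returns the number of newly registered strings, or null on failure.
 */
function example_force_full_rescan(): ?int {
    if ( ! function_exists( 'wpr_translate_admin_scan_strings' ) ) {
        return null; // Scanner not loaded in this request.
    }

    // Dropping the cached state makes every target look changed.
    delete_option( 'wpr_translate_scan_state' );

    $result = wpr_translate_admin_scan_strings();
    if ( is_wp_error( $result ) ) {
        error_log( 'String scan failed: ' . $result->get_error_code() );
        return null;
    }
    return (int) $result['new'];
}
```

Because the scanner returns either array( 'new' => <int> ) or a WP_Error, the caller should check is_wp_error() before reading the count.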
Further Reading
- Translating Theme & Plugin Strings – the admin UI that consumes the scanner’s output.
- Gettext Pipeline & MO Files – how exported files reach gettext().
See also the multi-language real estate website guide.
