Optimizing searchStrings function in ImHex
Overview
Project: ImHex (Open-source hex editor)
Function: searchStrings
File Path: ImHex/plugins/builtin/source/content/views/view_find.cpp
ImHex is a popular open-source hex editor used by developers to inspect and analyze binary data. This case study looks at how Functio — our AI-powered performance tool — identified and fixed bottlenecks in ImHex’s searchStrings
function, a core routine for finding strings in large hex data sets.
Introduction
Functio is an AI tool that helps software engineers pinpoint, analyze, and resolve performance bottlenecks in code.
In this case, Functio was running on ImHex, focusing on the searchStrings
function in view_find.cpp
. This case study walks through:
1. What searchStrings
does
2. How Functio set up and ran benchmarks
3. What we found in the performance analysis
4. The code changes and optimizations we applied
5. Performance results before and after
Functionality Description
The searchStrings
function scans a chosen memory region byte-by-byte, looking for valid character sequences based on user-defined settings (SearchSettings::Strings
).
It handles multiple encodings — ASCII, UTF-8, UTF-16LE, UTF-16BE — and even hybrid types like ASCII_UTF16BE
by running multiple searches and combining the results.
Internally, it uses a prv::ProviderReader
to pull bytes from the searchRegion
and checks each one for validity (letters, numbers, punctuation, whitespace, underscores, and optionally line feeds). UTF-16 runs also validate that every other byte is zero, while UTF-8 checks ensure correct multibyte sequences.
When an invalid byte is found (or the end of the region is reached), the function checks if the current run meets the minimum length and null-termination requirements. If it does, it records the occurrence with its address, length, encoding type, and endianness, then resumes scanning.
The main loop of the function:
How Functio Approaches Optimization
Functio can work in two main ways:
1. Targeted function optimization – User provides the source file(s) containing the function to optimize. Functio extracts all dependencies, generates a standalone benchmark setup, and, if needed, creates mocks for missing functions. This setup includes a main
function to drive the tests, using either your test inputs or ones Functio generates automatically.
2. Workflow-based optimization – User points Functio at a command, query, or workflow in a large project. Functio runs sampling profilers to find the slowest functions, then builds a standalone benchmark for them (just like in approach #1).
This Case
We used the targeted function approach
In this case study the input was the view_find.cpp file with the function to be optimized being searchStrings
functio --file ImHex/plugins/builtin/source/content/views/view_find.cpp --function searchString
Test Inputs
Functio generated automatically the following test inputs to check the functionality and measure runtime of the original, and improved programs.

Analysis
When profiling searchStrings
, Functio found two big issues:
1) Character classification overhead
Functions like std::islower
, std::isupper
, and std::isspace
consumed a huge share of CPU time — 23.2%, 22.8%, and 8.7% respectively. Combined, these trivial checks took more time than the rest of the loop logic.
Best it can be visualized on a Flame graph
2) The main loop had a data dependency on the byte
variable that blocked out-of-order execution. The CPU could not to the fullest utilize OOO exectuion, lowering IPC and overall throughput.
Optimizations Applied
Pre-computed Lookup Table
As the program goes byte by byte through the input stream, for each byte we check if it is lowercase/uppercase/number/space etc. In the current ImHex implementation it is done by checking each byte using 7 functions: std::islower(byte), std::isupper(byte), std::isdigit(byte), etc.
- Before:
- Every character validity check called multiple functions (islower
, isupper
, isdigit
, etc.) on each byte.
- After:
- Functio replaced those calls with a single lookup into a precomputed table that encodes all the validity rules. This removes repeated function-call overhead and branches.
- Impact:
- Cycles spent per validity lookup measured through perf_event
- Original: ~131 cycles per single lookup (mean)
- Optimized: ~24 cycles per single lookup combined
Bulk Memory Loading
- Before:
- The reader processed one byte at a time inside the loop. This added per-byte overhead and made OOO execution harder.
- After:
- We read data in bulk into a buffer, allowing the compiler to auto-vectorize and reducing memory access overhead.
- This optimization opens an opportunity to break cross-iteration dependency chain, and reduces loop overhead
Dependency Chain Splitting
- Before:
- Sequential data dependency over variable byte
created performance bottlenecks due to limited CPU parallelism. Due to the significant size of the loop CPU cannot deploy the next iteration until the very end of the previous one, under-utilizing the CPU.
- After:
- Dependency chain split into four independent sub-chains. Functio found that 4 subchains yields optimal performance on our system.
- Enhanced CPU utilization through improved out-of-order execution capabilities, markedly boosting performance.
- Impact:
- Core-bound instruction count dropped from 43% to 26% in the Top-Down analysis.
Correctness
Functio automatically verified that the optimized version produced identical results to the original on all generated test inputs.
Results
Results of the original vs optimized benchmark setup were measured using perf stat.
Original
Optimized
Original Runtime: 0.263 s
Improved Runtime: 0.079 s
That’s over a 3× speedup.
Conclusion
With a few focused changes — precomputed lookups, bulk memory reads, and breaking the dependency chain — Function more than tripled the speed of searchStrings
.
The code is a bit more complex now, but the performance gain is well worth it for heavy hex-data searches, making ImHex faster and more responsive for its users.
Need performance optimization?
Contact us to discuss how we can help optimize your performance-critical software.
Express Interest