Problem: fuzzy.c can be further improved
Solution: Fix memory leak and refactor it (glepnir).
Optimize performance and memory allocation:
- Fix memory leak in fuzzy_match_in_list.
- using single memory allocation in match_positions
- Improve has_match performance and add null pointer checks
closes: #18012
Signed-off-by: glepnir <glephunter@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>
Problem: fuzzy.c has a few issues
Solution: Use Vims memory management, update style
(glepnir)
Problem:
- Missing cleanup of lmatchpos lists causing memory leaks
- Missing error handling for list operations
- Use of malloc() instead of Vim's alloc() functions
- Inconsistent C-style comments
- Missing null pointer checks for memory allocation
- Incorrect use of vim_free() for list objects
Solution:
- Add proper cleanup of lmatchpos in done section using list_free()
- Set lmatchpos to NULL after successful transfer to avoid confusion
- Add error handling for list_append_tv() failures
- Replace malloc() with alloc() and add null pointer checks
- Convert C-style comments to C++ style for consistency
- Fix vim_free() calls to use list_free() for list objects
closes: #17984
Signed-off-by: glepnir <glephunter@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>
Problem: fuzzy-matching can be improved
Solution: Implement a better fuzzy matching algorithm
(Girish Palya)
Replace fuzzy matching algorithm with improved fzy-based implementation
The
[current](https://www.forrestthewoods.com/blog/reverse_engineering_sublime_texts_fuzzy_match/)
fuzzy matching algorithm has several accuracy issues:
* It struggles with CamelCase
* It fails to prioritize matches at the beginning of strings, often
ranking middle matches higher.
After evaluating alternatives (see my comments
[here](https://github.com/vim/vim/issues/17531#issuecomment-3112046897)
and
[here](https://github.com/vim/vim/issues/17531#issuecomment-3121593900)),
I chose to adopt the [fzy](https://github.com/jhawthorn/fzy) algorithm,
which:
* Resolves the aforementioned issues.
* Performs better.
Implementation details
This version is based on the original fzy
[algorithm](https://github.com/jhawthorn/fzy/blob/master/src/match.c),
with one key enhancement: **multibyte character support**.
* The original implementation supports only ASCII.
* This patch replaces ascii lookup tables with function calls, making it
compatible with multibyte character sets.
* Core logic (`match_row()` and `match_positions()`) remains faithful to
the original, but now operates on codepoints rather than single-byte
characters.
Performance
Tested against a dataset of **90,000 Linux kernel filenames**. Results
(in milliseconds) show a **\~2x performance improvement** over the
current fuzzy matching algorithm.
```
Search String Current Algo FZY Algo
-------------------------------------------------
init 131.759 66.916
main 83.688 40.861
sig 98.348 39.699
index 109.222 30.738
ab 72.222 44.357
cd 83.036 54.739
a 58.94 62.242
b 43.612 43.442
c 64.39 67.442
k 40.585 36.371
z 34.708 22.781
w 38.033 30.109
cpa 82.596 38.116
arz 84.251 23.964
zzzz 35.823 22.75
dimag 110.686 29.646
xa 43.188 29.199
nha 73.953 31.001
nedax 94.775 29.568
dbue 79.846 25.902
fp 46.826 31.641
tr 90.951 55.883
kw 38.875 23.194
rp 101.575 55.775
kkkkkkkkkkkkkkkkkkkkkkkkkkkkk 48.519 30.921
```
```vim
vim9script
var haystack = readfile('/Users/gp/linux.files')
var needles = ['init', 'main', 'sig', 'index', 'ab', 'cd', 'a', 'b',
'c', 'k',
'z', 'w', 'cpa', 'arz', 'zzzz', 'dimag', 'xa', 'nha', 'nedax',
'dbue',
'fp', 'tr', 'kw', 'rp', 'kkkkkkkkkkkkkkkkkkkkkkkkkkkkk']
for needle in needles
var start = reltime()
var tmp = matchfuzzy(haystack, needle)
echom $'{needle}' (start->reltime()->reltimefloat() * 1000)
endfor
```
Additional changes
* Removed the "camelcase" option from both matchfuzzy() and
matchfuzzypos(), as it's now obsolete with the improved algorithm.
related: neovim/neovim#34101
fixes#17531closes: #17900
Signed-off-by: Girish Palya <girishji@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>