Ticket #4587 (closed defect: fixed)

Opened 3 months ago

Last modified 11 days ago

regexp search issues in mcviewer

Reported by: dextarr
Owned by: andrew_b
Priority: major
Milestone: 4.8.33
Component: mcview
Version: master
Keywords: mcviewer regexp
Cc:
Blocked By:
Blocking:
Branch state: merged
Votes for changeset: committed-master

Description

Searching with regexp patterns sometimes produces spurious matches.

How to reproduce:

Create a test file:

printf "0\n1\n2\n3\n4\n5\n101\n11\n12\n13\n14\n15\n" >input.txt

Open the file in mcviewer (F3).

Searching the file with F7 in regexp mode for "^1" (then pressing 'n' for each next match) finds, as expected:

the first character in 1 101 12 13 14 15

but also _both_ 1s in 11


Reverse-searching for the same pattern (? or F7 with 'Backwards' set, or just pressing 'N') in regexp mode finds, as expected:

the first character in 15 14 13 12 1

but also _both_ 1s in 11 and 101


The same tests work as expected in mcedit.
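
For reference, the expected result can be checked outside mc. The following is a minimal sketch using Python's `re` module in multiline mode; it is not mc's search code, and the helper name `expected_matches` is made up for illustration. It reports one match per line that starts with 1, always at column 0.

```python
import re

# Same content as produced by the printf command in the report.
TEXT = "0\n1\n2\n3\n4\n5\n101\n11\n12\n13\n14\n15\n"


def expected_matches(pattern, text):
    """Return (line_number, column) per match; lines are 1-based, columns 0-based."""
    results = []
    for m in re.finditer(pattern, text, re.MULTILINE):
        line_no = text.count("\n", 0, m.start()) + 1
        line_start = text.rfind("\n", 0, m.start()) + 1
        results.append((line_no, m.start() - line_start))
    return results


matches = expected_matches(r"^1", TEXT)
for line_no, col in matches:
    print(f"line {line_no}, column {col}")
```

Every reported column is 0, so the second 1 in 11 and the last 1 in 101 should never be hit by this pattern.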

Thank you for looking into the issue!

Change History

comment:1 Changed 5 weeks ago by zaytsev

Another example:

https://lists.midnight-commander.org/pipermail/mc/2024-November/005786.html

Let's say we have the following file:

---
    A   A

    B   A

A
---

Now, let's open it in the internal viewer and start a regular expression
search for the following pattern: ^\s*A. 

At first, it will find the very first occurrence, 'beginning of the
line, 4 spaces, A'. So far so good. But if we then press 'S-F7' or 'n'
to find the next match, the cursor will move to the next space on the
first match. I.e., it looks like at first it found '4 spaces and A',
then '3 spaces and A', and so on. And then it will move to the second
A on the very same first row. Then skips the third A (rightly so). And
then jumps to the last A on the fifth row (again, rightly so).

The problem is

  - there's really no such match as ^\s\s\sA. Or any other matches on
    the first row (as if it just ignores ^ further on if a match has
    been found)
  - And even if we drop ^, it still would be rather strange to find,
    at first, the greediest match, and then continue searching inside
    that greediest match for the less greedy ones.
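
The symptom described above can be modelled with a small sketch. This is only an assumption about how such behaviour could arise, not mc's actual code: `resumed_search` is a hypothetical model in which, after a hit, the search resumes one character further and the rest of the line is matched as if it were a fresh line, so `^` matches at the resume offset. `anchored_search` shows the expected behaviour for comparison.

```python
import re

# The five-row sample file from the mailing list message.
LINES = ["    A   A", "", "    B   A", "", "A"]


def anchored_search(pattern, lines):
    """Expected behaviour: ^ matches only at the real start of a line."""
    rx = re.compile(pattern)
    return [(row, 0) for row, line in enumerate(lines, 1) if rx.search(line)]


def resumed_search(pattern, lines):
    """Hypothetical model of the symptom: after each hit, resume one character
    further and treat the remainder of the line as a new line."""
    rx = re.compile(pattern)
    hits = []
    for row, line in enumerate(lines, 1):
        offset = 0
        while offset <= len(line):
            if rx.search(line[offset:]) is None:
                break  # no hit at this offset: move on to the next row
            hits.append((row, offset))
            offset += 1
    return hits


print(anchored_search(r"^\s*A", LINES))
print(resumed_search(r"^\s*A", LINES))
print(resumed_search(r"^1", ["11"]))
```

Under this model the first row yields a hit at every column up to the second A, the third row is skipped, and the fifth row is found once, while the anchored version returns only rows 1 and 5. The same model applied to a file containing just 11 hits both characters.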

comment:2 Changed 5 weeks ago by andrew_b

The simplest file for reproducing the issue is the following:

11

The ^1 regexp matches both 1s.

comment:3 Changed 12 days ago by andrew_b

  • Owner set to andrew_b
  • Status changed from new to accepted
  • Branch state changed from no branch to on review
  • Milestone changed from Future Releases to 4.8.33

Branch: 4587_mcview_search_bol
Initial changeset: ebd67fab2f3765e0713873ea23ad3bcb7bb1e015

comment:4 Changed 12 days ago by zaytsev

  • Cc grolleman@… added

Reporters, could you please test?

comment:5 Changed 11 days ago by andrew_b

https://lists.midnight-commander.org/pipermail/mc/2024-December/005789.html

Further to the mentioned ticket updates, I checked the branch
4587_mcview_search_bol against both of the mentioned test cases, and it
seems the bug has indeed been fixed. Thanks!

comment:6 Changed 11 days ago by zaytsev

  • Cc grolleman@… removed
  • Votes for changeset set to zaytsev
  • Branch state changed from on review to approved

I've pushed some grammar fixups.

comment:7 Changed 11 days ago by andrew_b

  • Status changed from accepted to testing
  • Votes for changeset changed from zaytsev to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged

Merged to master: [511d4d853bd2561837f8b97c83235d5edd2dceb6].

git log --oneline 7950ae8b2..511d4d853

comment:8 Changed 11 days ago by andrew_b

  • Status changed from testing to closed