Bash [[ foo =~ regex ]] behavior changed with last update, gentoo patches?
Arch devs,

I have a script that parses IP and CIDR notation using the following to capture IPs:

    [[ $1 =~ ^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})$ ]]

    # validate 5 elements in BASH_REMATCH array
    if [ "${#BASH_REMATCH[@]}" -eq 5 ]; then
    ...

(yes, I can improve it -- but that's not the point here)

Then to capture CIDR:

    [[ $1 =~ ^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})/(.*)$ ]]

    # validate 6 elements in BASH_REMATCH array
    if [ "${#BASH_REMATCH[@]}" -eq 6 ]; then
    ...

This script was used to automatically update ipset lists and save /etc/ipset.conf. It has worked for years. Today after update to bash-5.2.037-3-x86_64 I get:

    $ ipsa 38.0.0.0/8
    /home/david/scr/adm/ipset_add.sh: line 6: [: 0/8: integer expression expected
    /home/david/scr/adm/ipset_add.sh: line 6: [: 0/8: integer expression expected
    valid IP: 38.0.0.0/8
    ipset v7.23: Hash is full, cannot add more elements

The problem being for the first time ever, the regex in:

    [[ $1 =~ ^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})$ ]]

parsed the IP as

    38 0 0 0/8

(with the "/8") instead of

    38 0 0 0

filling the ipset blocklist completely with sequential 38.x.x.x IPs.

Technically this regex matching is correct as the final character list [^.] doesn't preclude inclusion of '/', but this is a definite change from all prior bash versions.

Is this the intended result of the gentoo patches change to the package?

I've since fixed the regex with [^./] as the final list. Are there any other known changes to regex parsing with the changes to the bash PKGBUILD and patches?

--
David C. Rankin, J.D.,P.E.
On Thu, 8 May 2025 at 15:31, David C Rankin <drankinatty@gmail.com> wrote:
> this is a definite change from all prior bash versions
I checked the following command across several bash versions:

    [[ 38.0.0.0/8 =~ ^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})$ ]]; echo $? ${BASH_REMATCH[@]}

    archlinux bash 5.2.037-5:         0 38.0.0.0/8 38 0 0 0/8
    archlinux bash 5.2.037-2:         0 38.0.0.0/8 38 0 0 0/8
    fedora bash-5.2.37-1.fc42.x86_64: 0 38.0.0.0/8 38 0 0 0/8
    debian bash 5.2.15-2+b7:          0 38.0.0.0/8 38 0 0 0/8

I don't know what happened, but it doesn't look like regex matching changed behavior.
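For anyone who wants to repeat this check on another machine, a minimal sketch follows. It uses the same pattern as the thread; the variable name `re` and the function name `check` are made up for illustration. Besides the exit status and captures, it prints the running bash version and the element count of BASH_REMATCH, which is the value the original script tests against 5.

```bash
#!/usr/bin/env bash
# Same pattern as in the thread, kept in a variable so it reaches =~ unquoted.
re='^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})$'

# Hypothetical helper: report version, match status, capture count and captures.
check() {
    local status=0
    [[ $1 =~ $re ]] || status=$?
    printf 'bash %s  input=%s  status=%d  count=%d  groups=%s\n' \
        "$BASH_VERSION" "$1" "$status" "${#BASH_REMATCH[@]}" "${BASH_REMATCH[*]}"
}

check 38.0.0.0/8
```

On a match the count is 5 (the whole match plus four groups), so the element-count test in the script passes for 38.0.0.0/8 even though the last group carries the "/8".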
Hi David,
> [[ $1 =~ ^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})$ ]]
> ...
> [[ $1 =~ ^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})/(.*)$ ]]
> ...
> $ ipsa 38.0.0.0/8
> ...
> parsed the IP as 38 0 0 0/8 (with the "/8") instead of 38 0 0 0
I doubt your understanding is correct. I think it more likely the ‘0/8’ this time was short enough to match /([^.]{1,3})$/ in the first regexp and either:

- it has never been short enough in the past, e.g. ‘0/24’, or
- they've slipped through but not triggered an error for you to notice, e.g. ipset's hash table didn't fill up.

--
Cheers, Ralph.
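The length argument can be checked directly. A minimal sketch, assuming the first pattern from the thread is used unchanged; the function name `matches_ip_re` and every sample input other than 38.0.0.0/8 are made up for illustration:

```bash
#!/usr/bin/env bash
# The first (plain IP) pattern from the thread; the last list [^.] also accepts '/'.
ip_re='^([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})[.]([^.]{1,3})$'

# Hypothetical helper: succeeds when the argument matches the plain-IP pattern.
matches_ip_re() {
    [[ $1 =~ $ip_re ]]
}

for arg in 38.0.0.0 38.0.0.0/8 10.0.0.0/24 192.168.100.0/24; do
    if matches_ip_re "$arg"; then
        # BASH_REMATCH is global, so it is still set after the function returns.
        printf '%-18s matches, last group: %s\n' "$arg" "${BASH_REMATCH[4]}"
    else
        printf '%-18s no match\n' "$arg"
    fi
done
```

Here 38.0.0.0/8 matches with a last group of 0/8, because "0/8" is three characters with no dot. The /24 inputs do not match: "0/24" is four characters, one more than {1,3} allows, so those arguments fall through to the CIDR pattern as intended.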
On 5/9/25 5:02 AM, Ralph Corderoy wrote:
> I doubt your understanding is correct. I think it more likely the ‘0/8’ this time was short enough to match /([^.]{1,3})$/ in the first regexp and either:
>
> - it has never been short enough in the past, e.g. ‘0/24’, or
> - they've slipped through but not triggered an error for you to notice, e.g. ipset's hash table didn't fill up.
Ah hah!

Thank you Ralph. I think you put your finger on it, and helped remind us again that correlation does not equal causation.

Your first bullet fits the bill. I believe this was the first time a /8 block was added to the hash:net list. It makes sense that the '0/8' fit both the final list and the repetition count of the regex.

Mystery solved. Thanks also to heftig for his help with this.

The firewall tools have been very busy this past month, and things don't show any signs of letting up...

--
David C. Rankin, J.D.,P.E.
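The fix mentioned earlier in the thread was to use [^./] as the final list. A stricter variant is sketched below, purely as an illustration and not the code from ipset_add.sh: it swaps [0-9] in for the negated lists, tries the plain-IP pattern and then the CIDR pattern, and range-checks the captures. The function name `classify_addr` and the limits 255 and 32 are assumptions for IPv4.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not taken from the original script.
# Prints "ip", "cidr", or "invalid" for its single argument.
classify_addr() {
    local octet='([0-9]{1,3})'
    local ip_re="^${octet}[.]${octet}[.]${octet}[.]${octet}\$"
    local cidr_re="^${octet}[.]${octet}[.]${octet}[.]${octet}/([0-9]{1,2})\$"
    local kind i

    if [[ $1 =~ $ip_re ]]; then
        kind=ip
    elif [[ $1 =~ $cidr_re ]]; then
        kind=cidr
        # Force base 10 so a prefix such as "08" is not read as octal.
        (( 10#${BASH_REMATCH[5]} <= 32 )) || { echo invalid; return 1; }
    else
        echo invalid; return 1
    fi

    # Each octet must fit in 0..255; digits-only capture rules out "0/8".
    for i in 1 2 3 4; do
        (( 10#${BASH_REMATCH[i]} <= 255 )) || { echo invalid; return 1; }
    done
    echo "$kind"
}

classify_addr 38.0.0.0/8
```

Because the octet groups accept digits only, an argument such as 38.0.0.0/8 can no longer satisfy the plain-IP pattern, whatever the prefix length.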
participants (3): David C Rankin, Jan Alexander Steffens (heftig), Ralph Corderoy