Hi,

I would like to find a regex match in stdout.

stdout

 /dev/loop0: [2081]:64 (/a/path/to/afile.dat)

I would like to match

/dev\/loop\d/

and return /dev/loop0

but \d does not seem to work with awk … ?

How can I achieve this? (awk is not mandatory)

  • TwilightKiddy@programming.dev
    2 days ago

    Assuming you made a bit of a typo with your regexp, any of these should work as you want:

    grep -oE '/dev/loop[0-9]+'
    awk 'match($0, /\/dev\/loop[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
    sed -r 's%.*(/dev/loop[0-9]+).*%\1%'
    

    The AWK one is a bit cursed, as you can see. This kind of text manipulation is not exactly its strong suit.
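
    A quick way to check the three commands side by side, as a minimal sketch. It assumes the sample line from the question is stored in a variable named `line` (a name made up here for illustration) and that the installed grep supports `-o`. `sed -E` is used in place of `sed -r` because both GNU and BSD sed accept it.

    ```shell
    # Sample line from the question; `line` is a hypothetical variable name.
    line='/dev/loop0: [2081]:64 (/a/path/to/afile.dat)'

    # grep: -o prints only the matched text, -E enables extended regex.
    with_grep=$(printf '%s\n' "$line" | grep -oE '/dev/loop[0-9]+')

    # awk: match() sets RSTART and RLENGTH, substr() cuts the match out.
    with_awk=$(printf '%s\n' "$line" | awk 'match($0, /\/dev\/loop[0-9]+/) { print substr($0, RSTART, RLENGTH) }')

    # sed: replace the whole line with the captured group.
    with_sed=$(printf '%s\n' "$line" | sed -E 's%.*(/dev/loop[0-9]+).*%\1%')

    printf '%s\n' "$with_grep" "$with_awk" "$with_sed"
    ```

    One difference worth knowing: on a line that does not match, grep and awk print nothing, while this sed command prints the line unchanged.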

    • 4am@lemm.ee
      2 days ago

      Why are you making a literal out of the + operator? This will not work.

      grep -o ‘/dev/loop[0-9]+’
      
      • pelya@lemmy.world
        2 days ago

        Because it does work: you need grep -E for ‘+’ to work without escaping. Also, your quotes are wrong, ‘ should be ’ .
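
        To see the difference concretely, here is a small sketch run against the sample line from the question (the variable names are made up for illustration). In POSIX basic regular expressions, which grep uses by default, an unescaped `+` is an ordinary character, so the first pattern looks for a literal plus sign.

        ```shell
        line='/dev/loop0: [2081]:64 (/a/path/to/afile.dat)'

        # Basic regex (default): "+" is literal, so nothing matches and grep exits non-zero.
        bre_plus=$(printf '%s\n' "$line" | grep -o '/dev/loop[0-9]+' || true)

        # Extended regex (-E): "+" means "one or more".
        ere_plus=$(printf '%s\n' "$line" | grep -oE '/dev/loop[0-9]+')

        # Basic regex with a POSIX interval expression instead of "+".
        bre_interval=$(printf '%s\n' "$line" | grep -o '/dev/loop[0-9]\{1,\}')

        printf 'BRE +: [%s]\nERE +: [%s]\nBRE interval: [%s]\n' "$bre_plus" "$ere_plus" "$bre_interval"
        ```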

        • ulterno@programming.dev
          2 days ago

          Also, your quotes are wrong, ‘ should be ’ .

          Alright, I am getting confused. What quotes are those?

          I got this from the stuff I copied from your comment:

           ./UTF8txt2hex ’‘
          UTF-8: e2 80 99 e2 80 98
          UTF-16: 2019 2018 
          UCS 4: 00002019 00002018 
          

          And these are the single quote and backtick I used (of course I had to escape them, because they are the actual ones):

           ./UTF8txt2hex \`
          UTF-8: 60
          UTF-16: 60 
          UCS 4: 00000060 
           ./UTF8txt2hex \'
          UTF-8: 27
          UTF-16: 27 
          UCS 4: 00000027 
          

          And from what I see, your original comment had the correct ones, so was this all to elicit this response out of me?

  • unlawfulbooger@lemmy.blahaj.zone
    2 days ago

    You could try [0-9] instead?

    awk '/\/dev\/loop[0-9]/ {print}'
    

    If you have a larger sample of input and desired output, people can help you better.
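
    Note that `{print}` prints the whole matching line rather than just the device name. A minimal sketch of one way to narrow it down, assuming the device is always the text before the first colon (an assumption based on the single sample line in the question; the variable names are made up):

    ```shell
    line='/dev/loop0: [2081]:64 (/a/path/to/afile.dat)'

    # Prints the entire line, because {print} with no arguments prints $0.
    whole=$(printf '%s\n' "$line" | awk '/\/dev\/loop[0-9]/ {print}')

    # -F: splits fields on ":", so $1 is the part before the first colon.
    device=$(printf '%s\n' "$line" | awk -F: '/\/dev\/loop[0-9]/ {print $1}')

    printf '%s\n%s\n' "$whole" "$device"
    ```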

  • vpklotar@lemmy.world
    2 days ago

    I know this isn’t grep or awk, but if you simply want the first part I would probably use cut as follows:

    cut -d : -f 1

    Simply put, cut the line into multiple parts with the colon as the delimiter and choose the first part.
  • You’ve got lots of answers, so I’ll just say that shorthand character classes like \s, \w, and \d - all those backslash ones - are not widely supported, especially in the POSIX-only tools. Many tools have an extended or Perl mode that makes them available, but some don’t. You can’t rely on them being everywhere. That’s why you’re getting suggestions to use explicit, long-form character classes like [0-9].
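
    As a rough illustration of the portable spelling, the sketch below sticks to POSIX basic regular expressions and a bracket expression; `line` and `extract_device` are names invented for this example. A Perl-mode variant is left as a comment only, since `-P` is a GNU grep option that is available only when grep was built with PCRE support.

    ```shell
    line='/dev/loop0: [2081]:64 (/a/path/to/afile.dat)'

    # Hypothetical helper: print the /dev/loopN device found on each line of stdin.
    # [[:digit:]] is the POSIX bracket-expression class for digits; -n plus the
    # trailing "p" makes sed print only lines where the substitution succeeded.
    extract_device() {
        sed -n 's%.*\(/dev/loop[[:digit:]][[:digit:]]*\).*%\1%p'
    }

    printf '%s\n' "$line" | extract_device

    # GNU grep only, and only if built with PCRE support:
    #   grep -oP '/dev/loop\d+'
    ```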