Substring Match in shell



Hi,

 

  I'm trying to create a script which includes substring match. Consider the following test cases.

 

  List of strings - 12345678, 657895

 

Now I need to check whether the substrings 1234, 12, 657, 65 and 6578 occur in the strings above.

 

My problem is that the substrings - 1234,12 will both be found in 12345678. But I need to go for 1234 if it exists, rather than 12. Similarly, if 6578 exists in 657895, then it should be selected and not 65 or 657.

 

I know I can create sub-cases inside an 'if' statement, but I'm looking for an easier workaround.

 

Thanx :)


Use a regex and just put the longest match first: (Longest|Middle|Shortest)


Something like this will return the longer match. If you flip the order, you'll see that it matches "te" instead of "test".

echo "test" | perl -ne 'print $1 if /(test|te)/'

Note: I'm using perl because the alternation operator (|) wasn't working for me with plain grep.

 

EDIT:
 

Here is a slightly better example with your input that will return all matches line by line instead of just the first match:

echo "12345678, 657895, 1234" | perl -nE 'say for /(6578|65|1234|12)/g'

Thanx a ton snapchat. The first example works, but the other example you gave did not. I'm testing this on AIX 6.1, where the default shell is ksh. With the second example it says -E is an unrecognised switch.


Here's a slightly longer form of the second example that doesn't use -E or the say feature (both need Perl 5.10 or newer, so I assume you have an older perl):

echo "12345678, 657895, 1234" | perl -ne 'while(/(6578|65|1234|12)/g){print "$&\n"}'

P.S.: my nick is snaphat ;-)

 

EDIT:  here is a slightly better revision:

echo "12345678, 657895, 1234" | perl -ne 'print "$&\n" while /(6578|65|1234|12)/g'
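If perl isn't available on the box at all, the same "prefer the longest substring" idea can be sketched in plain shell with a case pattern. This is only a rough sketch: longest_match is a made-up helper name, it checks one string at a time, and it picks the longest candidate found anywhere in the string rather than scanning left to right the way the regex does.

```shell
# longest_match STRING CANDIDATE...
# Hypothetical helper: prints the longest CANDIDATE that occurs in STRING.
# Prints nothing and returns non-zero when no candidate matches.
longest_match() {
    str=$1
    shift
    best=
    for cand in "$@"; do
        case $str in
            *"$cand"*)
                # Keep this candidate only if it is longer than the best so far.
                if [ "${#cand}" -gt "${#best}" ]; then
                    best=$cand
                fi
                ;;
        esac
    done
    [ -n "$best" ] && printf '%s\n' "$best"
}

longest_match 12345678 1234 12 657 65 6578   # prints 1234
longest_match 657895   1234 12 657 65 6578   # prints 6578
```

Because the function compares lengths itself, the order of the candidates doesn't matter here, unlike in the regex alternation.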

Can you please also help me with this? I have a file where every character position on each line has a specific meaning, and I need to extract a string from a given position, e.g. 10 characters starting at the 5th position. Let's assume one complete line of the file is the one below, where the blank spaces also count as characters. How can I do this?

 

abcdefgh      12345678      porp


^ Dunno if the spacing is correct in the following, but you are looking for something like this. Note that substr counts offsets from 0, so the 5th character is at offset 4:

echo "abcdefgh      12345678      porp" | perl -ne 'print substr($_,4,10), "\n"'
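For what it's worth, the same fixed-position extraction can also be done without perl using cut, which counts characters from 1. A minimal sketch, assuming the line is already in a variable (line and field are just illustrative names):

```shell
line='abcdefgh      12345678      porp'

# cut counts characters from 1, so 5-14 is ten characters starting at the 5th.
field=$(printf '%s\n' "$line" | cut -c5-14)

# Brackets make the trailing blanks visible in the output.
printf '[%s]\n' "$field"
```

To process a whole file, cut -c5-14 can be run on the file directly and will print that slice of every line.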
