Testing AlphaNumeric Arguments In Bash

Spending the evening working on my shell scripting, I thought I would jump into "Wicked Cool Shell Scripts" by Dave Taylor. In his script validalnum.sh, he has a test case that checks whether the user entered only alphabetic or numeric characters. His result is elegant and clean. I've changed up the script a bit for clarity:

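#!/bin/bash

echo -n "Enter alphanumeric input: "
read -r input    # -r keeps backslashes literal

# Strip every character that is not a letter or a digit.
compressed="$(echo "$input" | sed -e 's/[^[:alnum:]]//g')"

# If anything was stripped, the input contained an invalid character.
if [ "$compressed" != "$input" ] ; then
    echo "Input not valid."
else
    echo "Input valid."
fi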

In this example, the user is asked to enter any combination of letters and numbers, regardless of case. If the input contains anything else, such as punctuation or a space, the test fails and the user is told so. Otherwise, the test passes, and everyone is happy.

I want to call your attention to the cornerstone of this script, however:

compressed="$(echo "$input" | sed -e 's/[^[:alnum:]]//g')"

The variable $compressed holds only the alphanumeric characters of the input. We get it by piping the user's input to the stream editor sed, which searches for any character in the string that is not a number or a letter and removes it altogether. Thus, if sed removed any characters, $compressed no longer matches what the user entered, and our test fails. If no characters were removed, the input contains nothing but letters and numbers, and our test passes.
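
To make the strip-and-compare idea easy to experiment with, here is a minimal sketch that wraps it in a function and runs it against a few sample strings. The name is_alnum and the sample strings are my own inventions for illustration, not something from Dave's book, and the sketch only considers a single line of input.

```shell
#!/bin/bash

# is_alnum is a hypothetical helper name, not part of the original script.
# It succeeds (exit status 0) when its first argument, a single line of
# text, contains only letters and digits.
is_alnum() {
    # Delete every character that is not a letter or a digit.
    stripped="$(printf '%s' "$1" | sed -e 's/[^[:alnum:]]//g')"
    # If nothing was deleted, the two strings are still identical.
    [ "$stripped" = "$1" ]
}

# Try the check on a few made-up inputs.
for candidate in "abc123" "hello world" "foo_bar" "R2D2!"; do
    if is_alnum "$candidate"; then
        echo "'$candidate' is valid"
    else
        echo "'$candidate' is not valid"
    fi
done
```

Note that an empty string passes this test, just as it does in the script above, because there is nothing for sed to remove.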

I thought this was most clever, and just had to share, hoping others benefit from this simple example. I also hope that Dave is not mad at me for taking an example, changing it up a bit, and presenting it on this blog. Thanks Dave.

{ 10 } Comments

  1. David Tomaschik using Firefox 2.0.0.11 on Ubuntu | January 3, 2008 at 10:24 pm | Permalink

    I think this might be simpler:

    if echo $input | grep -q '[^0-9A-Za-z]'
    then echo "Input not valid."
    else
    echo "Input valid."
    fi

  2. Tormod Volden using Firefox 2.0.0.11 on Ubuntu | January 4, 2008 at 5:51 am | Permalink

    Or better:
    if echo $input | egrep -q "\W"

  3. Dave Taylor using Safari 523.10.6 on Mac OS | January 4, 2008 at 6:44 am | Permalink

    How can I be upset when you refer to my script as "elegant and clean"? :-) Seriously, I love to see people further hack and push the bits around. Trust me, I'm not the world's best shell script writer, I just have persistence. :-)

  4. Aaron using Firefox 2.0.0.11 on Ubuntu 64 bits | January 4, 2008 at 7:36 am | Permalink

    @David Tomaschik and Tormod Volden- The idea isn't simplicity, but elegance and flexibility. Dave Taylor's solution brings more to the table, and lets the script writer take advantage of the $compressed variable if needed.

    @Dave Taylor- Welcome! I was worried that the code may be copyrighted or otherwise unavailable for redistribution. So, as you may have noticed, I changed the script considerably "just in case". :) Anyway, it's a great book, and I'm looking forward to learning more from it.

  5. Gareth Williams using Firefox 3.0b5 on GNU/Linux | May 12, 2008 at 6:11 am | Permalink

    Very useful discussion - taught me a few things as a beginner:
    1) don't cut and paste the examples from the comments into a script as they won't work - the quote marks need to be typed as proper single or double quotes!
    2) the "\W" example works differently from the other two as it will allow underscore in the input text.
    :-)

  6. Jan Rome using Firefox 3.0.8 on Ubuntu | May 21, 2009 at 4:12 am | Permalink

    I also agree that Aaron's version is better because it is more flexible:

    validate_input() {
        INPUT="$(echo "$1" | sed -e 's/_//g')"
        VALID="$(echo "$INPUT" | sed -e 's/[^[:alnum:]]//g')"
        if [ "$INPUT" != "$VALID" ]; then
            echo "string contains invalid characters"
        fi
    }

    Based on Aaron's code, we can make a slight modification to the value of the INPUT variable. The [^[:alnum:]] part accepts only alphanumeric input, but by first stripping underscores with the sed -e 's/_//g' step, we can make exceptions to it.

    The final result is that the function allows only alphanumeric input plus the characters stripped by the sed command that sets INPUT.

    Broken down:
    1) invoke the function: validate_input blah_blah
    2) use sed to remove the underscore from blah_blah and set the result as the variable INPUT
    3) use sed to restrict the VALID variable to alphanumeric characters
    4) compare the two; if they differ, report that there are invalid characters present.

    One can thus restrict the valid input to only alphanumeric characters and other user specified characters... (in our case, the underscore)
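
The four steps above can be sketched as a function that reports its result through the exit status rather than by printing a message. This is a reworking of the snippet in this comment, not a copy of it, and the sample arguments are made up for illustration.

```shell
#!/bin/bash

# A reworked sketch of validate_input: it returns an exit status
# (0 = acceptable) instead of printing a message.
validate_input() {
    # Step 2: drop the characters we want to tolerate (only the underscore here).
    input="$(printf '%s' "$1" | sed -e 's/_//g')"
    # Step 3: keep only the letters and digits of what is left.
    valid="$(printf '%s' "$input" | sed -e 's/[^[:alnum:]]//g')"
    # Step 4: any difference means some other character was present.
    [ "$input" = "$valid" ]
}

# Step 1: invoke the function with a couple of sample arguments.
validate_input "blah_blah" && echo "blah_blah accepted"
validate_input "blah-blah" || echo "blah-blah rejected"
```

To tolerate a hyphen as well, the first sed expression could become 's/[_-]//g'; a hyphen placed last inside the brackets is taken literally.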

  7. Krayon using Unknown on GNU/Linux | September 9, 2009 at 11:52 pm | Permalink

    If you're just interested in validating the input, you can just do something like:
    input="$(echo "$input" | sed '/[^[:alnum:]]/d')"
    if [ "$input" == "" ]; then
        echo "Invalid input"
    fi

    The advantage of this is that it'll set it to "" if there's even 1 incorrect character. In theory entering anything other than the expected could imply the user doesn't know what they are entering :P

  8. Anonymous using Google Chrome 22.0.1229.79 on Windows 7 | October 5, 2012 at 3:27 am | Permalink

    Good one, but I have one doubt, the "read" command I see is auto trimming the leading spaces and trailing spaces, I do not find a way to turn this off.

    I am using this regex to validate alphanumeric input. This works for all cases, but fails to match if a space is at the beginning or at the end.

    if [[ $username =~ [^0-9A-Za-z]+ ]]

    Is there a way I can fix this script, without awk or sed, to match user inputs with leading and trailing spaces also?

    Thanks for the blog, was helpful.

  9. Sandeep using Google Chrome 22.0.1229.79 on Windows 7 | October 5, 2012 at 3:28 am | Permalink

    Good one, but I have one doubt, the “read” command I see is auto trimming the leading spaces and trailing spaces, I do not find a way to turn this off.

    I am using this regex to validate alphanumeric input. This works for all cases, but fails to match if a space is at the beginning or at the end.

    if [[ $username =~ [^0-9A-Za-z]+ ]]

    Is there a way I can fix this script, without awk or sed, to match user inputs with leading and trailing spaces also?

    Thanks for the blog, was helpful.

  10. Sandeep using Google Chrome 22.0.1229.79 on Windows 7 | October 5, 2012 at 4:19 am | Permalink

    Found a fix here, thanks for your blog !

    http://fahdshariff.blogspot.in/2008/06/read-file-without-trimming-leading.html
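
For anyone landing here with the same question: the usual way to keep leading and trailing spaces is to clear IFS for the read call and pass -r. Below is a minimal sketch of the difference, assuming the input arrives on standard input. The variable names and the sample string are only for illustration.

```shell
#!/bin/bash

# Plain read strips leading and trailing whitespace because it splits the
# line on the characters in IFS.
printf '  padded input  \n' | {
    read trimmed
    echo "read:         [$trimmed]"
}

# Setting IFS to empty for this one command disables that splitting,
# and -r stops backslashes from being treated as escape characters.
printf '  padded input  \n' | {
    IFS= read -r untrimmed
    echo "IFS= read -r: [$untrimmed]"
}
```

With the spaces preserved, a check such as [[ $username =~ [^0-9A-Za-z] ]] will see them and reject the input.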
