jimx0r: (me)
*NIX weenies (like me!) like their command lines.  That's just the way it is.  You can pry my command line from my cold, dead hands!  I'm not saying the CLI is the ultimate UI; I'm just saying that for most things, I can get the task accomplished with my command line faster than you can even launch your GUI.  And that's not to mention how much time I save by *not* pointing and clicking (and therefore not moving my hands from the keyboard to the mouse and back)!!

I've seen several different libraries for parsing/handling command-line arguments.  Having written more C/C++ code than I like to admit, I've mostly looked at C-based libraries; the friendly folks over at #lisp turned me on to optparse from Python land (and, since #lisp turned me on to a Python lib, I had to take a look!).  The libraries include getopt (and getopt_long), argp, suboptions, and argparse (from Python land; it supersedes the older "optparse" module).  Let's take a quick look at these libraries:


In the C/C++ world, most people use something like getopt.  Here's an example from the GNU Getopt Long documentation of defining command-line options:
           static struct option long_options[] =
             {
               /* These options set a flag. */
               {"verbose", no_argument,       &verbose_flag, 1},
               {"brief",   no_argument,       &verbose_flag, 0},
               /* These options don't set a flag.
                  We distinguish them by their indices. */
               {"add",     no_argument,       0, 'a'},
               {"append",  no_argument,       0, 'b'},
               {"delete",  required_argument, 0, 'd'},
               {"create",  required_argument, 0, 'c'},
               {"file",    required_argument, 0, 'f'},
               {0, 0, 0, 0}
             };
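Here they define four flags that require no argument (verbose, brief, add, and append) as well as three flags that require an argument (delete, create, and file).  The last field of each entry for add, append, delete, create, and file is the value getopt_long returns when it sees that flag: 'a', 'b', 'd', 'c', and 'f' respectively.  Those match the short options, which are declared separately in the option string passed to getopt_long ("abc:d:f:" in the GNU example).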

If you want to, say, specify a bunch of files (as in 'ls'), the usual approach is to have these be the arguments that are left over once the options have been processed.  In the source code to GNU ls (git.savannah.gnu.org/cgit/coreutils.git/tree/src/ls.c) that is exactly what they do.  getopt leaves the names of the files unprocessed, and the code then uses what is left of ARGV as the list of files to list.
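To make the "leftover arguments" idea concrete, here is a minimal sketch using Python's standard getopt module, which follows the same conventions as the C function.  It is only an illustration: the helper name split_command_line and the sample file names are made up, and the option letters are simply borrowed from the GNU example above.

```python
import getopt

def split_command_line(argv):
    """Parse getopt-style options; return (options, leftover arguments)."""
    # "a" and "b" take no argument; "d:", "c:" and "f:" each take one.
    # Long names ending in "=" require an argument as well.
    return getopt.getopt(
        argv,
        "abd:c:f:",
        ["verbose", "brief", "add", "append", "delete=", "create=", "file="],
    )

# Hypothetical command line: two options followed by two file names.
opts, files = split_command_line(
    ["-a", "--delete", "old.txt", "one.txt", "two.txt"])
print(opts)   # [('-a', ''), ('--delete', 'old.txt')]
print(files)  # ['one.txt', 'two.txt']
```

Parsing stops at the first token that is not an option, so everything from there on comes back untouched as the list of files.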

To be fair, I'm no expert with getopt: I haven't used it in *years*, and even then I only used it for rudimentary things!  Lucky for me, I'm not trying to give you the full documentation for these libraries, just a little "taste"...

Then there's argp from those kindly folks at GNU: It appears to be somewhat more flexible than getopt and also is kind enough to define --help and --version switches for you.

Then we can tack on suboptions to parse arguments to our options.  The example given in the suboptions documentation is: "One of the most prominent programs is certainly mount(8). The -o option take one argument which itself is a comma separated list of options."
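As a rough illustration of what suboption parsing boils down to, here is a hand-rolled Python sketch.  It is not the C suboptions interface; the function name parse_suboptions and the sample option string are assumptions chosen just for the example.

```python
def parse_suboptions(value):
    """Split a mount-style "-o" value such as "ro,uid=1000" into a dict."""
    result = {}
    for item in value.split(","):
        if not item:
            continue  # tolerate stray or doubled commas
        name, sep, arg = item.partition("=")
        # A bare name (no "=") is recorded as a boolean switch.
        result[name] = arg if sep else True
    return result

print(parse_suboptions("ro,noexec,uid=1000"))
# {'ro': True, 'noexec': True, 'uid': '1000'}
```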

Then there's Python's argparse: Except for being Python, it's definitely better than getopt.  Argparse allows you to define options with multiple arguments (I'm not 100% certain you *can't* do this with getopt).  Also, argparse comes with functionality to print help text (i.e. in response to being passed the "-h" or "--help" flag) so that you don't have to write your own help-text printing function.

Here's a quick example cribbed from the Python documentation at http://docs.python.org/dev/library/argparse.html:
import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                   help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                   const=sum, default=max,
                   help='sum the integers (default: find the max)')

# parse sys.argv, then apply either sum or max to the integers
args = parser.parse_args()
print(args.accumulate(args.integers))

The add_argument method takes several parameters:
  • name or flags - Either a name or a list of option strings, e.g. foo or -f, --foo.
  • action - The basic type of action to be taken when this argument is encountered at the command line.
  • nargs - The number of command-line arguments that should be consumed.
  • const - A constant value required by some action and nargs selections.
  • default - The value produced if the argument is absent from the command line.
  • type - The type to which the command-line argument should be converted.
  • choices - A container of the allowable values for the argument.
  • required - Whether or not the command-line option may be omitted (optionals only).
  • help - A brief description of what the argument does.
  • metavar - A name for the argument in usage messages.
  • dest - The name of the attribute to be added to the object returned by parse_args()
This doesn't seem too terrible.  (Except, of course, that Python is the Devil!)
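A few of the parameters listed above are easier to see in action.  The following is a small sketch, not something taken from the argparse documentation: the program name "demo" and the flags --mode and --pair are invented purely for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# choices restricts the accepted values; default applies when the flag is absent.
parser.add_argument("-m", "--mode", choices=["fast", "safe"], default="safe",
                    help="how to run")
# nargs=2 consumes two tokens, type converts each one, dest renames the attribute.
parser.add_argument("--pair", nargs=2, type=int, metavar=("X", "Y"),
                    dest="point", required=True, help="two integers")

# Passing an explicit list instead of reading sys.argv keeps the example repeatable.
args = parser.parse_args(["--pair", "3", "4", "-m", "fast"])
print(args.mode, args.point)  # fast [3, 4]
```

Leaving out --pair, or passing a mode that is not in choices, makes parse_args print a usage message and exit.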

In comes CL-CLI... In short, the goal of CL-CLI is to be a more "Lispy" way to define and process command-line options.  In Lisp, it is possible to have functions which take required arguments, optional arguments, keyword arguments, and &rest arguments (any number of additional arguments).  I think that my command-line options should be able to be similar to the functions used to process them: Why not take optional arguments?  Why not take &rest arguments?  ... or both...

One thing I'm quite certain I've never seen a command-line processing utility do before CL-CLI is take keyword arguments!  To be fair, I'm not sure how useful a feature it will be... I just decided to include it for completeness' sake... It was fun to write at least....

So, what does a CLI need to support?
I think that it should be able to support both short and long-style options.  (*Currently* the code assumes that options will begin with either "-" or "--".  Some day I may think about non-*NIXy options!  Hopefully soon I'll at least make it configurable!)

Co-required flags and conflicting flags:
It should also be able to take care of options that, when specified, require other options to be specified on the command line as well.  e.g. the --foo flag could require that the --bar flag also be specified, and reject the command line if --foo is present but --bar is not.  Similarly, it should be able to handle conflicting flags and reject the command line if two flags that were specified conflict: It may not make sense to specify both the --verbose and the --quiet flags!
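The checks described above can be sketched in a few lines of Python.  This is only an illustration of the idea, not how CL-CLI implements it; the function name check_flags and the two dictionaries are hypothetical.

```python
def check_flags(seen, requires, conflicts):
    """Return a list of error messages for the set of flags in `seen`."""
    errors = []
    reported = set()  # conflicting pairs already reported
    for flag in sorted(seen):
        for needed in requires.get(flag, ()):
            if needed not in seen:
                errors.append(f"{flag} requires {needed}")
        for other in conflicts.get(flag, ()):
            pair = frozenset((flag, other))
            # Report a conflict once, even if it is declared in both directions.
            if other in seen and pair not in reported:
                reported.add(pair)
                errors.append(f"{flag} conflicts with {other}")
    return errors

REQUIRES = {"--foo": ["--bar"]}
CONFLICTS = {"--verbose": ["--quiet"]}
print(check_flags({"--foo"}, REQUIRES, CONFLICTS))
# ['--foo requires --bar']
print(check_flags({"--verbose", "--quiet"}, REQUIRES, CONFLICTS))
# ['--verbose conflicts with --quiet']
```

An empty list means the command line passes both checks.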

Flexible arguments to flags:
One thing that makes Lisp so nice is that you have a lot of flexibility when you're specifying the arguments when defining a function.  Functions in CL can have positional arguments, optional arguments, keyword arguments, and &rest arguments.  I think my command-lines should have similar syntax!

When defining a flag in CL-CLI, one thing that is required is a lambda-list that specifies the arguments to the flag.
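For readers who don't speak Lisp, here is a loose Python analogue of binding a flag's arguments to a lambda list.  It leans on the standard inspect module; the helper bind_flag_arguments and the handler create_node (with its parameter names) are made up for the sketch and are not part of CL-CLI.

```python
import inspect

def bind_flag_arguments(handler, tokens):
    """Bind command-line tokens to handler's parameters, or raise TypeError."""
    bound = inspect.signature(handler).bind(*tokens)
    bound.apply_defaults()  # fill in optional parameters that were not supplied
    return dict(bound.arguments)

# Three required parameters, one optional, then "any number more",
# roughly like (name mac type &optional (state "boot") &rest extra).
def create_node(name, mac, node_type, state="boot", *extra):
    pass

print(bind_flag_arguments(create_node, ["n1", "00:11:22:33:44:55", "compute"]))
# {'name': 'n1', 'mac': '00:11:22:33:44:55', 'node_type': 'compute',
#  'state': 'boot', 'extra': ()}
```

Supplying too few tokens raises a TypeError that names the missing parameter, which is about what you'd want to turn into a usage error.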

Responses to flags:
The libraries presented above mostly have pre-defined responses to having seen a flag: either set the value of a boolean or store the arguments to the flag in some variable.  I for one am not a fan of telling people what they want to do!  Lucky for me, Lispers are generally very much OK with writing a handful of sexprs as the BODY part of the macro...

Data Structure:

An earlier crack that I made at this sort of thing used CLOS objects for *everything*.  I've since decided that CLOS is not needed in *every* programming situation!  So, CL-CLI represents the definitions of command line options as lists.  This makes them easy to create and monkey with programmatically.  Also, since command-line option processing is (hopefully) not the heaviest of the lifting your program will have to do, I'm not that worried about saving every last FLOP I can!

Built-in "Help" options:
I absolutely agree with the folks at the argp and argparse projects: It can be a royal pain to maintain a reasonable "help" option!  However, the format of help text is pretty standardized.  If we only ask the author of the code to supply a description of each flag (in addition to all of the other parts that make up an option definition), we can generate this information for the programmer, instead of asking them to remember to update the help function every time they add a new flag or remove an unneeded one!
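Generating help text from the definitions is mostly string formatting.  Here is a minimal Python sketch of the idea; the layout, the function name format_help, and the (names, description) tuples are assumptions for illustration and do not reflect CL-CLI's actual output.

```python
def format_help(program, options):
    """Build help text from a list of (names, description) pairs."""
    rows = []
    for names, description in options:
        # One-character names get "-", longer names get "--".
        flags = ", ".join(("-" if len(n) == 1 else "--") + n for n in names)
        rows.append((flags, description))
    width = max((len(flags) for flags, _ in rows), default=0)
    lines = [f"Usage: {program} [options]"]
    # Pad the flag column so that the descriptions line up.
    lines += [f"  {flags.ljust(width)}  {text}" for flags, text in rows]
    return "\n".join(lines)

OPTIONS = [
    (("v", "version"), "Display the version of SCAT"),
    (("h", "help", "?"), "Display this help text"),
]
print(format_help("scat", OPTIONS))
```

Adding or removing an entry in OPTIONS changes the help output automatically, which is the whole point.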

Currently Open Issues:
One thing I would like to support is specifying the types of arguments: Is this a NUMBER or a STRING or ... ??? 

Here are the option definitions for the main SCAT binary.  Really it doesn't do much: it just sets up the "help" and "version" flags (neither of which takes any arguments).

(defvar *opts*
  (list ; assumption: *opts* holds a list of option definitions
   ;; here we define the version flag
   (defopt ("v" "version") () ()
       "Display the version of SCAT"
     (format t "SCAT version ~A~%" +scat-version+))
   ;; here we define the help flag
   (defopt ("h" "help" "?") () (:toplevel-option t)
       "Display this help text"
     (help *opts*))))

Here are the option definitions for the createnode SCAT binary.  Here we set up the version and help flags again, as well as three required positional arguments and one optional one.

(defvar *opts*
  (list ; assumption: *opts* holds a list of option definitions
   ;; three required positional arguments and one optional one
   (defopt (:positional) (name mac type &optional (state "boot"))
       () ""
     (setf *name* name
           *mac* mac
           *type* type
           *state* state))
   (defopt ("v" "version") () (:toplevel-option t)
       "Display the version of Scat"
     (format t "Scat version ~A~%" +scat-version+))
   (defopt ("h" "help" "?") () (:toplevel-option t) "Display this help text"
     (help *opts*))))

In short, each option definition has either a name or :POSITIONAL (i.e. arguments that are *not* attached to a flag), a lambda list that defines the names of the arguments to the flag, a list of requirements/conflicts, a description of the option (used when displaying the help message), and, finally, one or more forms to evaluate when the flag is seen.
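To show how a definition with those five parts might drive processing, here is a simplified Python sketch.  It is an analogy only: a fixed argument count stands in for the lambda list, the requirements/conflicts slot is carried along but ignored, and the names dispatch, DEFINITIONS, and the "output" flag are all hypothetical.

```python
def dispatch(definitions, argv):
    """Run the handler of each flag in argv; return tokens that matched no flag."""
    table = {}
    for names, arity, _constraints, _description, handler in definitions:
        for name in names:
            table[("-" if len(name) == 1 else "--") + name] = (arity, handler)
    leftovers, i = [], 0
    while i < len(argv):
        arity, handler = table.get(argv[i], (0, None))
        if handler is None:
            leftovers.append(argv[i])  # not a known flag: keep as positional
        else:
            handler(*argv[i + 1:i + 1 + arity])  # the flag's arguments follow it
        i += 1 + arity
    return leftovers

log = []
DEFINITIONS = [
    (("v", "version"), 0, (), "Display the version",
     lambda: log.append("version")),
    (("o", "output"), 1, (), "Write to a file",
     lambda path: log.append(("output", path))),
]
rest = dispatch(DEFINITIONS, ["-v", "node1", "--output", "out.txt"])
print(log)   # ['version', ('output', 'out.txt')]
print(rest)  # ['node1']
```

The handler is an arbitrary callable, which is the Python stand-in for "one or more forms to evaluate when the flag is seen".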

Frankly, I wish that I'd gotten this a little more polished before publishing, but I want to get my thoughts out there so people can shoot at them!
jimx0r: (Default)
So, work has decided to help a user re-implement his code, originally written in BASIC, in a language that might be A) a bit faster and B) easier to compile/run under Linux. A co-worker mentioned this to me and I said "hmm, bet we could do it in Lisp!"....

So, I got myself a copy of the code. And, frankly, it is ugly as sin! I did a quick-and-dirty re-implementation in CL, a quick bit of profiling and optimization (mostly just declaring the types of my arrays to be floats) and my code takes about 1.5 seconds to run the same problem that his code takes 1.5 *hours* to run (yes, that's a 3 orders of magnitude improvement!).

One of the things that I did in my code, rather than a naive "translation" of the original code, was to replace statements that looked like the following with what I considered to be more reasonable alternatives:


Where "dist$" is previously set to a string like this:

I replaced these in my code with a macro that expands to
(- (aref position j) (aref position i)) where i, j, and position all come from the lexical environment.

So, I told the co-worker who initially turned me on to the problem about my speed-up. To say the least, he was a bit skeptical! We got to wondering if writing the code as inefficiently in CL would produce similarly slow code... In fact, I was *asked* if I could implement the equivalent of the above code in CL.

My first take was something like this:
(defmacro dist (array i j)
   `(eval '(- (aref ,array ,j) (aref ,array ,i))))

Unfortunately, that eval is done in the NULL Lexical environment, so I, J, and position are not bound! :P So, I stopped into #lisp and got a few suggestions. What I finally came up with is this:
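(defmacro dist (array i j)
  ;; read the generated formula, wrap it in a LAMBDA, EVAL it, then FUNCALL it
  `(funcall
    (eval (append '(lambda (array))
                  (list (read-from-string
                         (format nil
                                 "(abs (- (aref array ~A) (aref array ~A)))"
                                 ,i ,j)))))
    ,array))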

In short, I generate a string with the values of I and J substituted in, read the "formula" from that string, and pack it into a function. I decided to "hard-code" i and j into the strings so that they would be unique each time (so that Lisp won't be able to optimize away the entire eval statement!). I then funcall this function with the value of the array from the environment.
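The same trick can be sketched in Python, purely as an analogy (this is neither the BASIC original nor the Lisp that was actually timed).  The names dist_direct and dist_via_eval are made up; the point is only that the slow version rebuilds and re-evaluates source text on every single call.

```python
def dist_direct(position, i, j):
    """The sensible version: plain indexing and arithmetic."""
    return abs(position[j] - position[i])

def dist_via_eval(position, i, j):
    """The perverse version: build source text per call, evaluate it, call it."""
    # The literal indices make every string different, so nothing can be reused.
    source = f"lambda position: abs(position[{i}] - position[{j}])"
    formula = eval(source)    # parse and compile the text into a function
    return formula(position)  # then call it with the array from the environment

points = [0.0, 1.5, 4.0, 9.5]
print(dist_direct(points, 1, 3))    # 8.0
print(dist_via_eval(points, 1, 3))  # 8.0
```

Both give the same answer; the second just pays for parsing and compiling every time it is asked.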

Now that I've made my *awesome* modification to the code, it runs in about 2 hours! YES! I beat the horribly written code written in a horrible language at being *really slow* with my own carefully-crafted to be insanely slow CL code! Lisp wins again!

I did a little more optimization of the non-slow code, and I've now got it running in about 0.6 seconds!  This has resulted in the code becoming nearly unreadable as it's littered with declarations and THE statements (to specify the type of the result of an expression).

I think that this excursion into complete perversity will help us to make the original BASIC code at least 10 if not 100 times faster.  OR, if I can convince the author of the BASIC code to switch to the dark side.... he can easily have a 1000 times speedup!

