
Peter's Script-Archive - Motivation - The Unix Philosophy, with some Reflections on Acquiring New Skills


Philosophy, "Ancient Artes" and a Modern Tool Chest

A small detour. In the Middle Ages, the Arts, Math and Philosophy were considered pretty similar. Even a millennium later, this common root and the combination of concepts and ideas from these fields make up a very valuable way of thinking about systems: for an elegant, efficient problem solution that won't degenerate immediately into a maintenance nightmare, you still need to combine Arts&Crafts (skills - sic!), Philosophy, Logic&Mathematics, and Engineering. Just consider the title of a classic of the field of Informatics: D. E. Knuth's "The Art of Computer Programming". Or just one keyword: city architecture.

I hope you thought of the Gang of Four and design patterns just now, right :) ?

Back to Unix: The Unix tool chest is a quite nice demonstration of combining mostly-orthogonal viewpoints and taking that combination as far as possible. And given the rejuvenation of Unix after a decade of corporate mistreatment (the Unix wars) thanks to a medicine called Linux, there's still no end in sight for development, while the foundations and basics have been proven solid by the test of time. With some interesting offspring like the Internet and the Web.

Combining Orthogonal Concepts

At its core, Unix is just a suitable combination (read: at just the right level of abstraction) of a set of basic (read: excellently selected and combined) orthogonal concepts. The concepts as well as many of the ways to combine them have been extended (e.g. NFS), but the core remains unchanged: a thirty-year-old shell script has pretty solid chances of working unchanged. With a small difference: given e.g. network access or process substitution, even that ancient unchanged script suddenly gains new features and uses.

Let's just consider two of these concepts: 1. Everything is a file, 2. Pipes. Now you can already easily extend the usage possibilities of basic commands. Maybe even add mknod and sockets, if you're tired of being restricted to a perceived linear-monocausal design. These two concepts already offer a pretty interesting toolkit, don't they? Way better than requiring a programmer to imagine the set of all possible usage scenarios ahead of time...
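A minimal illustration of those two concepts (the data here is made up for the demo): every stage of a pipeline consumes and produces the same kind of byte stream, whether it comes from a file, a device, or another process.

```shell
# "Everything is a file" + pipes: count word frequencies.
# Each stage neither knows nor cares where its input comes from.
printf 'foo\nbar\nfoo\n' |
  sort |        # order lines so duplicates become adjacent
  uniq -c |     # collapse duplicates, prefix each with its count
  sort -rn      # most frequent first
# -> 2 foo
#    1 bar
```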

Now, add a third, slightly more recent concept to the mix and check the set of usage scenarios again. Just add a pinch of regular expressions and stir; suddenly basic tools allow solving complex tasks that were previously impossible.
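A tiny sketch of what the added concept buys (input invented for the demo): a regular expression turns a dumb byte filter into a structure-aware one, without changing the pipeline shape at all.

```shell
# Regexes in the pipe: keep only lines that look like key=value
# settings with a numeric value.
printf 'name=joe\nport=8080\ndebug\nretries=3\n' |
  grep -E '^[a-z_]+=[0-9]+$'
# -> port=8080
#    retries=3
```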

Too abstract, you say. You want a program to run?

With those three concepts we get e.g. the grep command and some of its common usage scenarios. Now add an editor like vi or sed: we can modify files and data passing through a pipe - e.g. the matches found by grep. While we can just use sed -i.bak to mass-edit files (sed combines the concepts of regexes, pipes and line editing), adding a "backend" that takes grep output allows better file locking and, say, improved logging.
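The sed -i.bak mass-edit mentioned above, spelled out (filenames invented for the demo; -i.bak with an attached suffix works with both GNU and BSD sed):

```shell
# sed as a mass editor: one substitution applied across many files,
# each original kept as a .bak backup.
cd "$(mktemp -d)"
printf 'call old_name(1)\n' > one.c
printf 'x = old_name(2)\n'  > two.c
sed -i.bak 's/old_name/new_name/g' *.c
grep new_name *.c
# -> one.c:call new_name(1)
#    two.c:x = new_name(2)
```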

Now add the concept of change-sets between versions of a file. That's the diff and patch commands. Now let's implement that backend in Perl, taking grep output edited with vi, have the backend locate the grepped matches in the files specified by grep, apply some Perl expression to them, and lock, backup and diff the files on change. This is the Grep.xchange command in this very archive: a simple idea building upon those four concepts, bridging grep output to diff/patch and thus implementing a mass file changer with unlimited undo/redo. And that kind of logging just happens to turn Grep.xchange into a generic patch generating tool. It can be invoked with a simple pipe like grep ... | vim.pipe | Grep.xchange, while performing undo/redo for any number of involved files with just a patch -u < LOG.
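Not Grep.xchange itself - just the diff/patch mechanics it builds on, with invented filenames: record a unified diff of each change, and undo/redo at any later time becomes a plain patch invocation.

```shell
# Record a change-set while editing, then revert on demand.
cd "$(mktemp -d)"
printf 'alpha\nbeta\n' > data.txt
cp data.txt data.txt.orig
sed -i.bak 's/beta/BETA/' data.txt            # the actual edit
diff -u data.txt.orig data.txt > LOG || true  # the recorded change-set
# (diff exits 1 when files differ - that's expected here)
# undo at any later time: patch -R data.txt < LOG
```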

tagls is another nice example for combining concepts to gain expressiveness, implemented in a way that allows tagls itself to become a building block/concept to build upon. The command's also fun, as it extends the use of regular expressions by adding both synonyms and boolean expressions into the mix (as does Grep.pm as a more ordinary kind of grep).
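This is not tagls's actual syntax - just the boolean-search idea underneath it, sketched with stock tools and invented filenames: AND two regexes by chaining grep -l, OR them with alternation.

```shell
cd "$(mktemp -d)"
printf 'perl unix\n' > a.txt
printf 'perl only\n' > b.txt
grep -l perl *.txt | xargs grep -l unix   # AND: only a.txt matches both
# -> a.txt
grep -lE 'unix|only' *.txt                # OR: both files match
# -> a.txt
#    b.txt
```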

Or waitcond, which implements a small mini-language of boolean expressions over various tests including greps to test/describe/wait-for what might be called "dynamic system processing states".
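A hand-rolled sketch of the wait-for-condition idea (not waitcond's real mini-language; the function name and state file are invented): poll until a grep test succeeds, with a retry limit.

```shell
wait_for() {  # usage: wait_for <pattern> <file> [max_tries]
  tries="${3:-10}"
  while [ "$tries" -gt 0 ]; do
    grep -q "$1" "$2" 2>/dev/null && return 0   # condition holds
    tries=$((tries - 1))
    sleep 1
  done
  return 1                                      # gave up
}

state_file="$(mktemp)"
echo 'state=ready' > "$state_file"
wait_for 'state=ready' "$state_file" && echo condition met
rm -f "$state_file"
# -> condition met
```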

Of course, combining, customizing and extending the basic concepts and building blocks is part of the charm of Unix, and thus you'll find a part of my tool chest in this archive.

Fast Forward to the Present

That obsolete stuff just deals with ancient pre-Unicode ASCII text, you object. Aren't pipes and command lines unusable, given XML, you ask?

Even with XML, suitably modified grep, diff and sort commands are quite helpful in pipes. Just consider XPath. Just structuring text doesn't affect the usability of Unix core concepts at all - simply replace one command or another with a more suitable domain-specific version, and the time-honed command patterns still allow avoiding tedious interactive changes. And if you want to start or end your workflow in a graphical application, Unix offers those as well.
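A quick-and-dirty sketch of the point (input invented): line-oriented tools can pull data out of XML in a pipe, and for anything serious you swap in an XPath-aware stage (e.g. xmllint --xpath) while the pipeline shape stays the same.

```shell
printf '<book><title>K&amp;R</title><year>1978</year></book>\n' |
  grep -o '<title>[^<]*</title>' |   # crude element extraction
  sed 's/<[^>]*>//g'                 # strip the tags
# -> K&amp;R  (entities stay encoded; a real XML tool would decode them)
```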

When working with the proper abstraction, "things" don't really change that much at all...

Further Reading

The first one offers more on the concepts of Unix, the second one offers the real meat of the implementation details, albeit a bit kernel-centric. The slightly heretical "Unix Toolbox" from Kernighan and Pike offers a complementary programmer-as-toolsmith perspective to the first book. If you haven't read it yet, also check Stephenson's essay "In the Beginning... Was the Command Line".

Seeing it from the Learning and Improving a Skill Point of View

Learning a topic is easy when using the right level of abstraction for it. Freely switch the levels of your abstractions and just try out combinations of base concepts as shown above. Consider the abstraction levels in the ISO/OSI network layers, the Chomsky hierarchy from Type-3 grammars up to the Turing machine, or the interpretation hierarchy. Play with levels, and see code and data interchange: a control structure is code on one level and data on the next, when moving from the CPU to higher-level languages, or in abstractions like a program implementing a finite-state machine for some process. The concept of the interpretation hierarchy is e.g. applied in this talk to consider computing in general, the levels from the CPU up to higher and more abstract languages, and finally future hardware trends.

As shown, learning at the right level of abstraction for each aspect is easy and fast. Adding experimentation, it is fun: just write 10 lines of Perl to test an assumption and ask Perl or Unix to provide their answer to your question. This, incidentally, also challenges and improves your skill, as each such question and answer is just a bit outside of your current area of competency. So there's absolutely no need to restrict yourself to passively reading books, to local online information like man intro or perldoc -f open, or to internet sources like perlmonks.org (the various search functions are your best friends for this).

With (very valuable) sources like howtoforge.com, you need to be a bit more careful or you will not really learn at all: read between the lines and remember to regularly ask yourself: why did the howto author choose a certain approach? What alternatives are available, and what are their restrictions?

Learning is an ongoing process, quite suitable to sharing in a community; with applying or teaching one's acquired skill being the best way to retain the skill and improve on it. Consider LOPSA or e.g. Perlmonks in case of Perl.

So do teach yourself (Unix concepts and) programming in 10 years, but don't err and subscribe to the invalid Teach Yourself Nothing Sustainable in 7 Days approach with its subtitle of Throw Away Your Acquired Bit of Skill on Day 8: the first few days are merely the beginning and hopefully enough to handle some topics at a usable level of skill. Practice, use and improvement start from day 8 onward. Sounds somewhat like Zen. And it is.

Paraphrasing Alan Perlis: a language or system that doesn't affect your way of thinking isn't worth knowing. There's also a good discussion on perlmonks about skills, how to approach learning, and how to avoid some of the pitfalls.

To improve, practice correctly (that is: experiment just outside your level of competency), seek both people to teach, and people who are able to teach you.

Teaching is not giving the full solution (which amazes or drowns the student), nor is it an annoyed RTFM flame (which threatens him). It is answering ReadTheFineManual THERE, including which manuals to check, and providing some keywords for searching and self-study. Maybe with as little as 10 words :), plus a keyword, a terse example or a link.

Must Be In The Air - A Foot Note On Parallel Evolution

Some topics just 'are in the air'. So to take a peek at a completely independent implementation of boolean expressions over regular expressions and perl scraps, check e.g. David Coppit's mailgrep and the -E option, as well as the Changelog for its history (around 4.80/y2k): "-E allows for complex pattern matches involving logical operators".

For my own part, late in 1997, I got annoyed at freewais and wanted to switch to something maintainable like ice, but didn't like its limits. Thus I provided a patch to add boolean support to ice2-for.pl (ICE, by Christian Neuss, is an old-school light-weight perl full text indexer and cgi searchform; licence: free educational, greenpeace-donation otherwise).

With some squinting, the origins of the 'implicit and' can be seen in '$need_op'. fwait's boolean reimplementation should probably count as the interim screen-space-is-way-too-expensive attempt, albeit with a certain elegance, doing expression compilation with 'token' regex-replacement in-situ. The current incarnation in tagls, waitcond and Grep.pm is the most adaptable and readable.



jakobi(at)acm.org, 2009-07 - 2012-03