Language Classification Schemes

Computer languages can be classified in many different ways. Unfortunately the literature seldom agrees on what criteria to use and what procedures to follow in contrasting them.

Trivial schemes

The most trivial classification scheme is simply to sort on whether or not a given language is primarily subject oriented, declarative, modular, structured, list processing, concurrent, realtime, expert, transformational, generative, extensible, recursive, and ... well, you get the idea. That is to say, classify by whichever aspects of the language one chooses. Such a sorting scheme obviously fails when it comes to determining the extent to which a given language has any or all of these properties (subjectivity in abundance). But more importantly, classification of computer languages in this way has no clear benefit.

Early schemes

Other more rigorous classifications have been attempted.
For example, early researchers developed detailed lists of language capability [1]. Items on these lists ranged from differences in control flow, through parameter scope, recursion type, and so on. Languages were rated on the extent to which they possessed a given capability. A bit of factor analysis was used to tease out as much of the objective aspects of such a rating as possible, and to look for some peak harmonics. Unfortunately, as with most factor analysis, meeting the rigid statistical requirements for rating schemes was difficult... as was identifying any emergent common factor.
Others, notably [2,3], chose a different route. They contrasted the design philosophies which they believed to be inherent in different approaches to computer language. They sought a methodology for extracting so-called ’core’ approaches in language design, then using this to tease out what they termed the “design mind set”. Again however, subjectivity and disagreement about what, if any, common factors were occurring stymied further work.
In yet another approach, some [4,5,6] contrasted computer languages by means of the success of implementation regarding documented standards. From this they developed a notion they called ’expressive power’ which related compiler properties to ’good’ (efficient, workable, etc.) patterns of thought. Unfortunately compiler properties are not the same thing as language properties. Nor was there sufficient objectivity in the metric to merit accurate or reliable classification.
There have also been attempts at constructing classification schemes based upon complete independence of a language’s definition from the processes necessary to implement it on a given structure such as the compiler, the OS, and so on [7]. But once cross-platform compilers became common, and the problems of scalability diminished, this classification scheme no longer made any sense.
Others such as [8, 9, 11] have argued that modularity, reliability, execution cost, resource usage and similar obvious schemes be used as the basis for classification. I need hardly mention the problems with such an approach, other than to say bloat is not a language problem. Similarly, execution and resource costs are compiler concerns, not language concerns. Or should be.
And finally in this look at early schemes, more and more began to argue that the only practical approach was to eschew theoretical classification involving grammar, syntactical difference, and the readily available morphological description altogether [10,6,14,18,12,21]. Instead it was argued that one should use case studies of the usage of different languages in identical tasks to tease out classification schemes. While this sounds sensible, IMHO there are no identical tasks beyond the trivial which can be measured. Even a casual familiarity with operational design or even (rolls eyes) system analysis should rule out any such attempts at classification before they are begun.

Mathematical classification

Turing machines are nice and all that, linguistics is a lot of fun, and information theory is not a closed logic. That is to say, IMHO languages cannot be fully described and classified in mathematical terms despite the wishful thinking of proponents of this approach. The minimalist belief that all language classification can be done mathematically does not (usually) take into account infinite non-patterned extensibility (viz. with certain n-valued logics or some genetic algorithms), or going the other way, the fact that not all languages require atoms (basic structures) to produce meaningful patterns (chaotic complexity in language is not well studied, and even less well understood, although there is some interesting work on this at the Santa Fe Institute [39]). Nevertheless many keep trying. So for example, some have looked to physics to find metrics for software ’force’ in terms of the energy available and work required to advance a certain distance in development [33].
Others have extended this to look at ’force’ regarding the properties involved in various representations of algorithms [34]. For certain of these metrics researchers [35,33,36] found a high degree of statistical invariance, and a corresponding degree of prediction concerning time spent in using a given language to perform a given task. They argued that their metrics could be used to classify languages.
Clearly there are criticisms which can be levelled at this approach. For example the requirement for ’purity’ of code [37, 38], the need for multiple judgement calls throughout the process [39], and so on. But most of all, the fact that the metrics in their various forms are all poor predictors of software behaviour [38,39,40].
It is also possible to classify languages by logical means, such as the extent of closure (in the mathematical sense) for given language structures, derivability (in an axiomatic sense), value (in the sense of fuzzy or set theoretic value; or even in the sense of the 1-, 2-, or n-valued logics they may perform), and so on. But since few languages are logically consistent and well-formed, this approach lacks real-world applicability.

Heuristic classification

Early on, a number of researchers (notably [41]) proposed two major approaches to heuristic language classification: 1) software engineering, and 2) feature richness.
The first of these includes such well known issues as information hiding, data typing, scoping, exception handling, and so on down the ever-narrowing tunnel to nowhere of software engineering. Interesting in and of itself, but less a science or engineering field than a means of extracting dollars from unknowing and gullible CFOs, their minds dulled by the pretty albeit fraudulent representations on a CASE screen.
The second - feature richness - is unfortunately difficult to apply. The old Neanderthal saying of “One cave’s features are another cave’s detriments” applies. ’Nuff said. Less flippantly, IMHO a good language is not a feature-rich language. Rather it is a language which is fully, easily, and formally (i.e. rule-based) extensible. Comparing the features and extent of features in a language is far less important than its ability to be added to with ease and consistency.

Linguistic classification

There is therefore a rough split in the literature between linguistically related classification schemes and heuristic schemes.
The former is in general believed to be best applied to ordered structural languages with finite operators. The latter to list-processing, unstructured, freely extensible, and more abstract designs.
In the 1950s Chomsky described a hierarchy of levels of complexity applicable to languages, and demonstrated that each of his classes was associated with a class of grammars and a class of machines [22,23,24,25,26]. The argument maintained that language can be reduced to a set of generative grammars whose production rules guide how a string of symbols can be rewritten to generate another string [22,24].
But whilst this approach is very useful in, say, creating verification compilers [15,16], it falls short on many of the practical aspects of language design [27,34,7]. In particular, threaded extensible languages which are not purely regular (Type 3 grammars) cannot be entirely expressed by rules which satisfy stringent formal specification (they cannot be fully described, for example, by BNF strictures).
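As a small illustration of what "production rules guiding how a string of symbols could be written to generate another string" means in practice, here is a minimal sketch. The toy grammar (S -> aSb | ab), the uppercase/lowercase convention, and the names `RULES` and `generate` are all assumptions made for this example; none of them come from the cited literature.

```python
from collections import deque

# Hypothetical toy grammar: uppercase letters are nonterminals,
# lowercase letters are terminals. S -> aSb | ab
RULES = {"S": ["aSb", "ab"]}

def generate(start="S", max_len=6):
    """Expand the leftmost nonterminal breadth-first.

    Returns every fully terminal string of length <= max_len,
    in the order the search finishes them.
    """
    results = []
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, or None if the form is finished.
        idx = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if idx is None:
            results.append(form)
            continue
        for rhs in RULES[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            if len(new_form) <= max_len:  # the length bound keeps the search finite
                queue.append(new_form)
    return results

print(generate())
```

Each rewriting step replaces one symbol by a longer string, so the length bound is enough to stop the search in this particular grammar; a grammar with empty or shrinking productions would need a different cut-off.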

Threaded extensibles

To give a single exception instance: In such languages there is no requirement that the left hand side of a production be a single nonterminal symbol, or that the right hand side be a single terminal or a terminal followed by a single nonterminal. It is too easy to violate the requirements of regularity.
Additionally it is often not possible to classify such languages as context-free (Type 2 grammars), because a given nonterminal need not always yield the same productions regardless of what symbols surround it [25]. Further, some such languages can match references to a variable with the corresponding declaration. That is to say, context-free grammars can balance pairs of symbols (say nested parentheses) but cannot match arbitrarily long strings of symbols arbitrarily far apart. Hence in some threaded extensible languages where variables must be explicitly declared, the language cannot be context-free (unless it is formalised as a context-sensitive grammar, which of course would limit reference matching to declaration, as suggested previously).
The utility of forcing universal context-sensitivity is questionable. The hyper-rules which would be employed to allow consistent substitutions of protonotions (context-free grammar productions) with a single metanotion (i.e. abstraction of select protonotions into a single metanotion) are certainly possible [26,28], but only IMHO at a general loss of simplicity and, dare I say, elegance. Moreover the lack of commonly used descriptive techniques for Type 1 grammars (e.g. Backus-Naur Form cannot be used) really makes the forcing of universal context-sensitivity for threaded extensible languages somewhat moot.
Instead I feel that such languages can be well classed as context-free IFF special extensions which allow recursive definition of nonterminals, trees, and declarations are allowed (which has the added plus of enabling BNF description in extension [28]). The overwhelming advantage of adding such extensions I feel, is that one could (via the recursion extension) define the nested syntactical structures which are so very useful in the production of threaded tree structures (i.e. the graph-theoretic navigable trees naturally formed in many threaded languages).
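The shape restrictions on productions can be checked mechanically. The sketch below uses the standard textbook definition of a right-regular grammar (a single nonterminal on the left; a terminal, or a terminal followed by one nonterminal, on the right) and of a context-free grammar (a single nonterminal on the left, anything on the right). The function names and the uppercase-means-nonterminal convention are assumptions for this example, and empty productions are ignored for simplicity.

```python
def is_nonterminal(sym):
    # Assumption for this sketch: uppercase = nonterminal, anything else = terminal.
    return sym.isupper()

def is_right_regular(rules):
    """rules is a list of (lhs, rhs) strings, one character per symbol."""
    for lhs, rhs in rules:
        if not (len(lhs) == 1 and is_nonterminal(lhs)):
            return False
        if len(rhs) == 1 and not is_nonterminal(rhs):
            continue  # A -> a
        if len(rhs) == 2 and not is_nonterminal(rhs[0]) and is_nonterminal(rhs[1]):
            continue  # A -> aB
        return False
    return True

def is_context_free(rules):
    return all(len(lhs) == 1 and is_nonterminal(lhs) for lhs, _ in rules)

def chomsky_type(rules):
    """Return 3, 2, or None (None: needs Type 1 / Type 0 checks not done here)."""
    if is_right_regular(rules):
        return 3
    if is_context_free(rules):
        return 2
    return None

print(chomsky_type([("S", "aS"), ("S", "b")]))    # regular shape
print(chomsky_type([("S", "aSb"), ("S", "ab")]))  # context-free shape
print(chomsky_type([("aS", "ab")]))               # context on the left-hand side
```

A single production with a two-symbol left hand side, or a right hand side of the form aSb, is enough to push a grammar out of the regular class, which is how easily the requirement is violated.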
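The difference between balancing nested pairs and matching a use to its declaration can be seen in how much memory each check needs. This is only a sketch: the token format `("decl", name)` / `("use", name)` and both function names are invented for the example and do not describe any particular language.

```python
def balanced(text):
    """Nested parentheses need only a depth counter (a one-symbol stack)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a close with no matching open
                return False
    return depth == 0

def undeclared_uses(tokens):
    """Return names used before any declaration, in order of appearance.

    Matching a use against a declaration arbitrarily far back needs a
    record of every name seen so far, which is kept here in a set
    rather than in the grammar.
    """
    declared = set()
    missing = []
    for kind, name in tokens:
        if kind == "decl":
            declared.add(name)
        elif name not in declared:
            missing.append(name)
    return missing

print(balanced("(()())"))
print(undeclared_uses([("decl", "x"), ("use", "x"), ("use", "y")]))
```

The second check is the kind of work usually handed to a table kept alongside the parser instead of being expressed in the productions themselves.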
Now, in practical terms most context-sensitive information in threaded tree-producing languages may be collected into symbol tables or simply in multiple stacks. The former are commonly used by semantic analysers. Suspected atoms (smallest language symbol) in the input stream are analysed by parsing against a symbol table of identifiers (which of course are highly variable where every aspect of the thread is extensible). Using a symbol table therefore is very handy in avoiding the complications of a fully context-sensitive language. Using any number of stacks just makes the practicalities of the process simpler [29]. When the need arises, any given stack may itself in whole or in part be added to the symbol table (that is, linked as trees themselves, resulting in referential transparency to any part of a thread. Nice). This has the added benefit of making the resultant thread topologically simple. IMHO this is simpler (and faster) than the list processor route taken by some languages wherein names of atoms in the list are saved as part of that list, or of a threaded interpreter such as Forth or Mumps where names are saved with the code for a given thread.
The point of course, is that the linked trees of threading can be used as self-classifiers. That is to say, extensions are always in a particular tree branch, as are their progeny. In theory any given language can be realised (implemented) in a robust threaded language [30,31,32]. Hence we need only compare the trees formed by the implementation of different languages - superimpose them if you like visuals - to see the naturally occurring similarities and differences. Can the implementations be done objectively (i.e. automated)? Yes, with some creative use of the universal function. But with the same caveat which applies to all automatic programming - only for certain classes. Nonetheless, this heuristic scheme presents an interesting line of research to pursue.
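One common way to arrange a symbol table alongside working stacks is a stack of scopes searched from the innermost outwards. The sketch below is an illustration under that assumption only; the class name, its methods, and the `absorb` helper (which folds a working stack into the table) are hypothetical and are not taken from any of the cited implementations.

```python
class SymbolTable:
    """A stack of scopes; the last dict is the innermost scope."""

    def __init__(self):
        self.scopes = [{}]

    def push_scope(self):
        self.scopes.append({})

    def pop_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot pop the outermost scope")
        return self.scopes.pop()

    def define(self, name, info):
        self.scopes[-1][name] = info

    def lookup(self, name):
        # Search innermost scope first so later definitions shadow earlier ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

    def absorb(self, stack):
        """Fold a working stack of (name, info) pairs into the current scope."""
        for name, info in stack:
            self.define(name, info)


table = SymbolTable()
table.define("dup", "builtin")
table.push_scope()
table.absorb([("x", "local"), ("y", "local")])
print(table.lookup("x"), table.lookup("dup"), table.lookup("z"))
```

A suspected atom that `lookup` cannot find is either a literal or an error, and that decision is made against the table rather than by a context-sensitive production.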
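To make the idea of superimposing implementation trees concrete, here is a rough sketch. It assumes each implementation has already been reduced to a nested dictionary of named branches, and it scores overlap as shared paths divided by all paths (a Jaccard ratio). That measure, the tree contents, and the names `paths` and `overlap` are choices made for this illustration, not a method proposed in the references.

```python
def paths(tree, prefix=()):
    """Yield every root-to-node path in a nested-dict tree."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from paths(children, path)

def overlap(tree_a, tree_b):
    """Return (shared paths, shared / total) for two trees."""
    a, b = set(paths(tree_a)), set(paths(tree_b))
    shared = a & b
    union = a | b
    score = len(shared) / len(union) if union else 1.0
    return shared, score

# Hypothetical trees for two languages realised in the same threaded host.
lang_a = {"core": {"stack": {}, "branch": {}}, "io": {}}
lang_b = {"core": {"stack": {}, "loop": {}}, "io": {}}

shared, score = overlap(lang_a, lang_b)
print(sorted(shared), score)
```

Branches present in only one tree are where the two languages differ, and since extensions and their progeny stay in one branch, the differences stay localised as well.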
Overall however, language classification for machines has IMHO failed for the same reason it has failed for human and animal language. For despite the protestations of linguists, psycholinguists, and AI apologists, there are too many harmonics in the system. No finite state machine, deterministic or not, can be universal - many areas of complexity theory for example, simply are beyond the scope of such machines even should they have infinite states, capacity, and zero transition time.
Or said another way, not everything is mathematically reducible, and not everything can be factored. Just my opinion of course.
[42 References]

Copyright © 2012 by peter at peter.ca. All rights reserved.