Search Space

                 © Wayne Paul Amsbury     31 May 2002

 

A dig is an avatar for the dynamics of language that operate within a search space that is far larger and more chaotic than it might appear to be.

The language search space includes categories, rules, vocabulary, tokens, and patterns such as those that govern agreements. The rules are not strict and there are alternatives to the patterns.

The English rule of subject before action is violated by the passive voice. The patterns of agreements and feature vectors are both multifaceted. Children and those new to a particular language learn the rules and the patterns piecemeal. Vocabulary itself appears to be sorted in a number of ways: by size, spelling, concept, part of speech, and so on.

Probability. Some languages are mixed in many aspects. In particular, a language such as English has borrowed many things from many other languages. The model feature that takes this into account is that of weighted links, in which, for instance, noun → verb might be many times more likely than verb → noun, but both are possible. (In that context, "time" is most likely to be a noun in "time flies.") Poetry and other subtle uses of language deliberately take advantage of this statistical aspect.

This provides another constraint for digs, one that will be honored in the breach because it is not needed for the purpose of this essay:

         Weighting. Structural choices in digs may be taken to be weighted rather than strict, as needed.
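
The idea of weighted links can be made concrete with a small sketch. The weights below are invented for illustration (the essay gives no numbers, only "many times more likely"), and the names LINK_WEIGHTS, link_probability, and choose_link are hypothetical. The point is only that both link directions remain possible while one is strongly favored.

```python
import random

# Hypothetical weights, chosen only for illustration: noun -> verb is
# assumed to be eight times as likely as verb -> noun.
LINK_WEIGHTS = {
    ("noun", "verb"): 8.0,
    ("verb", "noun"): 1.0,
}

def link_probability(link, weights=LINK_WEIGHTS):
    """Normalize one link's weight into a probability."""
    return weights[link] / sum(weights.values())

def choose_link(rng, weights=LINK_WEIGHTS):
    """Pick one link at random, in proportion to its weight."""
    links = list(weights)
    return rng.choices(links, weights=[weights[l] for l in links], k=1)[0]

rng = random.Random(0)  # seeded so that repeated runs agree
sample = [choose_link(rng) for _ in range(1000)]
print(link_probability(("noun", "verb")))
print(sample.count(("noun", "verb")), sample.count(("verb", "noun")))
```

A strict rule would correspond to a weight of zero on the disfavored link; the weighted form leaves room for the rarer reading.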

Optimization. Why is writing with pristine grammar so difficult? Because it requires the morphing of a constructive search into a search for an optimum, a much more difficult process. Optimizing adds a layer of abstraction, so that children and foreign-language students come to it late. Both of these groups sometimes seem to opt for schemas that are globally probable but not precisely applicable in context, or for those derived from another language. Their probability estimates are shifted with learning.

Optimization as used here differs from "Optimality Theory" as used by linguists [Baker page 77], where it corresponds to probabilities, but not to the polishing of grammar. The basic dichotomy addressed by Optimality Theory is the distinction between Formal and Functional views of linguistics. In the metasearch model, the functional act of searching is enhanced by the formal constraints applied to the process of mig selection.

It may seem that the rules of grammar are arbitrarily imposed upon the Great Unwashed by those of us who are Serious Persons, but this is not so. Grammatical rules are derived from observations of common usage. Usage changes with cultural shifts; cultural borrowing from other languages includes the borrowing of language constructs.

Grammatical choices may be viewed as being derived from culture and history, or from formal frameworks. There is no contradiction of either view in a metasearch model, as history may very well play a role in the probabilities of selection; the model simply accepts that they are there. However, if competing schema selections are closely probable, the search process will surely dither and become slow and ambiguous, so a cultural shift of the probabilities would tend to a new and definite choice between them. The apparent scarcity of some potential mixes of digs in languages follows from the same logic: some of these mixes promote inefficient searches.

The fairly rapid historical switch from one construct to another in some languages is in agreement with the motivation to avoid dithering. To paraphrase a patent examining statement, it would have been obvious to one of ordinary skill in the art of the language at the time to choose one rule or the other because it is more efficient to apply only one rule during searching.

Word complexity. Prefixes and suffixes are common in many languages, and some languages pack a great deal of information into single words. As noted in Baker, the Navaho yishcha is essentially yi- → -sh- → cha, where the (infix) sound token -sh- denotes the subject I. A string of six prefixes appears in one Navaho word example in Baker. A printed dictionary may be awkward to use in order to parse such a word, but the search for an appropriate Navaho mig appears to be little different than a search done in a language in which the same information is carried by individual words.

Dictionaries. Vocabulary searches deal largely with some analog of a dictionary. However, this in itself can involve a metasearch on multiple indexes, as working a crossword puzzle proves.

When I work a crossword puzzle, I find that I seldom miss or need to check the size of a candidate word, even if the size is a dozen letters and more. It is clear to me that my internal word dictionary is indexed by size as well as within many other categories. I am also searching for meaning, including perhaps tense, person, part of speech and other things. I evidently search a number of these categories concurrently. The speed with which people in puzzle competitions do this is stunning, and it requires very efficient indexing, given the number of things to be searched. This inference is worth focus:

Multi-indexing. Mental dictionaries are multiply indexed and the indexes are used concurrently in a metasearch.

The vocabulary of most people is at least a few tens of thousands of words, and for some it may be hundreds of thousands. Clearly the mind uses neural nets in the literal sense at some level, but the computerized neural nets in use, at least, are each trained to recognize one pattern. It is conceivable that every language pattern, from vocabulary to lexical rules by way of tokens, is simply recognized as itself by a unique neural net, but it is hardly feasible.

It is every conceivable language pattern that is at issue here, and most of those we encounter are constructed by someone else. The existence of the observed organization of natural languages belies the use of a unique neural net for every word, and it would be a very inefficient design for language dynamics.

A viable model of how vocabulary is organized in the mind does not seem to be available, but one can create a model that at least sets forth some of the prominent features of vocabulary.

Printed dictionaries are organized in lexicographical order, by first letter, then by second letter within the group with the same first letter, and so on. Words are grouped alphabetically into finer and finer divisions, in a way that a computer scientist would recognize as a tree index. It takes only two steps in the index to find the word "by," and the possibilities are divided by (roughly) 26 at each step. Five steps will uniquely find any five-letter word, of which there are (potentially) 26^5 = 11,881,376. We suppose that words are grouped in this way in the mind.
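
As a minimal sketch of such a tree index, one level per letter, the following builds the index as nested dictionaries and counts the steps taken to reach a word. The names build_trie and lookup, the end marker "$", and the tiny word list are assumptions made for illustration; nothing here is claimed about how the mind stores words.

```python
def build_trie(words):
    """Build a letter-by-letter tree index as nested dictionaries."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node["$"] = word  # "$" marks the end of a complete word
    return root

def lookup(trie, word):
    """Return (word or None, number of index steps actually taken)."""
    node = trie
    steps = 0
    for letter in word:
        if letter not in node:
            return None, steps
        node = node[letter]
        steps += 1
    return node.get("$"), steps

trie = build_trie(["by", "be", "gar", "gruff"])
print(lookup(trie, "by"))     # two steps: b, then y
print(lookup(trie, "gruff"))  # five steps for a five-letter word
print(26 ** 5)                # potential five-letter strings
```

Each step divides the remaining possibilities by up to 26, which is why the number of steps grows with the length of the word rather than with the size of the vocabulary.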

Words in English are of various lengths, fading out at length twenty or so, but in some languages words are much longer. The index above does not stop at length five. One way to visualize this is to think of words as being also indexed by size. Size then can be treated as an orthogonal axis, forming a plot of words with all of the words beginning with the letter g grouped together on the alphabetic axis, and each of them represented by a point at the height determined by its length. The word "grandiloquent" is at height 13 and comes after "gar" at height 3 but before "gruff" at height 5.

Suppose that we are looking for an English word of length 5 that begins with a g and means "large." There are many g-words of arbitrary length, but at the intersection with level five there are far fewer. Thus if we look for g-words on one axis and five-letter words on the other, the intersection corresponds to a much smaller search space than the one we began with. It would be efficient to search in these two orthogonal directions concurrently in order to find this smaller space quickly.

The third orthogonal axis that is clearly involved in this simple example is that of meaning, in this case "large." We do know that the human mind does concept searching in some manner. If this is treated as another axis, then the convergence is fast, because there are not many five-letter g-words with the proper meaning (but "gross" is one of them). Searching on all three axes concurrently, if it can be done, reduces the search space to something quite small.

The handy model for a "meaning" axis is provided by a thesaurus, where a cluster of words with similar or related meanings forms a group indexed by one of them. Clearly, such clusters overlap, and there are other links to other clusters, such as antonyms. This is a natural place for the neural net to meet the search path, or perhaps a natural function of neural nets within the search processes.

The supposed dynamics of the search using this structure is one step to g-words, one step to length five, and one cluster intersection with that point in two-dimensional space to explore along a third axis devoted to concepts.
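
The three-axis search can be sketched as an intersection of three indexes. The word list and the one-cluster thesaurus below are toy data invented for illustration, and build_indexes and metasearch are hypothetical names. The sketch intersects the indexes one after another; it does not model concurrency, only the shrinking of the search space.

```python
# Toy vocabulary and a one-cluster thesaurus, both invented for illustration.
WORDS = ["gar", "gruff", "grandiloquent", "gross", "great",
         "green", "giant", "gnome", "huge", "large"]
THESAURUS = {"large": {"gross", "great", "giant", "huge", "large"}}

def build_indexes(words, thesaurus):
    """Index the same words by first letter, by length, and by concept."""
    by_letter, by_length = {}, {}
    for word in words:
        by_letter.setdefault(word[0], set()).add(word)
        by_length.setdefault(len(word), set()).add(word)
    known = set(words)
    by_concept = {c: cluster & known for c, cluster in thesaurus.items()}
    return by_letter, by_length, by_concept

def metasearch(indexes, letter, length, concept):
    """Intersect the three axes; each index narrows the candidate set."""
    by_letter, by_length, by_concept = indexes
    return (by_letter.get(letter, set())
            & by_length.get(length, set())
            & by_concept.get(concept, set()))

INDEXES = build_indexes(WORDS, THESAURUS)
g_words = INDEXES[0]["g"]          # one step: all g-words
g_five = g_words & INDEXES[1][5]   # second step: those of length five
found = metasearch(INDEXES, "g", 5, "large")
print(len(g_words), len(g_five), sorted(found))
```

On this toy list the candidate set shrinks at each step, and the final set contains "gross" along with the few other five-letter g-words placed in the "large" cluster.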

Combinatorics. The combinatorics of node order in digs is quite fierce. For two nodes in the token stream there are two potential linear orders, and for three there are six. For five there are 120 and for eight there are 40,320, since each such node order corresponds to a permutation of the nodes. (One should keep in mind that most people seem to read in chunks of at least eight words when they scan.) Without this constraint, any combination of the arrows in any permutation can be reversed. For N nodes there are N-1 arrows, but with reversal the same pattern shows up in both directions, so there are 2^(N-2) times as many possibilities without the constraint. For three nodes there are 2 * 6 = 12 such possibilities, and for eight there are 64 * 40,320 = 2,580,480.
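
The arithmetic above can be checked with a few lines. This sketch reads the multiplier as 2 raised to the power N-2, which is the reading that reproduces the figures given for three and eight nodes; the function names are hypothetical.

```python
from math import factorial

def node_orders(n):
    """Number of linear orders (permutations) of n nodes."""
    return factorial(n)

def arrow_multiplier(n):
    """N-1 arrows, each reversible, give 2**(n-1) patterns; a pattern and
    its mirror image coincide, which halves the count to 2**(n-2)."""
    return 2 ** (n - 2)

def unconstrained_patterns(n):
    """Node orders multiplied by the arrow-reversal factor (n >= 2)."""
    return arrow_multiplier(n) * node_orders(n)

for n in (2, 3, 5, 8):
    print(n, node_orders(n), unconstrained_patterns(n))
```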

The space of node orders is small compared to the space of potential patterns of language fragments in any natural language. It is almost certainly true that basic patterns and vocabulary are recognized at some level as patterns, quite probably by neural nets. The number of combinations of these things, however, gets quickly out of hand. The combinations available in a given language are reduced by parameters to a few that are characteristic of the language.
