| Evolution of Human Languages
  An international project on the linguistic prehistory of humanity

  As of 2017, about 6,000 languages are spoken on Earth,
  some of them by millions and some by only a few dozen
  people. A primary goal of EHL researchers is to provide a detailed
  classification of these languages, organizing them into a genealogical tree
  similar to the accepted classification of biological species. Since all
  representatives of the species Homo sapiens presumably share a common origin,
  it would be natural to suppose - although this has yet to be demonstrated -
  that all human languages also go back to some common source. Most existing
  classifications, however, do not go beyond some 300-400 language families
  that are relatively easy to discern. This restriction has natural reasons:
  languages must have been spoken and constantly evolving for at least 40,000
  years (and quite probably more), while any two languages separated from a
  common source inevitably lose almost all superficially common features after
  some 6,000-7,000 years.

  Nevertheless,
  despite widespread skepticism and reluctance to tackle the problem, there are
  a number of scholars who believe that these obstacles are not insurmountable.
  Research conducted over the past several decades appears to
  indicate that larger genetic groupings are not only possible, but indeed
  quite plausible. It can be shown that most of the world's language families
  can be classified into roughly a dozen large groupings, or macro families.
  Two sorts of evidence can be used for this purpose:

  1) Even a
  superficial analysis of the vocabulary of a large number of linguistic
  families reveals numerous lexical similarities extending far beyond the
  borders of the smaller genetic units. They are frequently restricted to
  individual macro families (such as Eurasiatic, Afro-Asiatic, etc.), but a
  significant number of such matches have already been found between the macro
  families themselves, pointing to the probability of common origin.

  2) Classical
  historical linguistics has developed a very powerful tool - the comparative
  method - that allows the reconstruction of unattested language stages,
  so-called proto-languages. It turns out that whereas modern languages may
  vary significantly, proto-languages in a number of cases tend to be much more
  similar to one another. This is the case, e.g., with Indo-European, Uralic and
  Altaic: modern English, Finnish, and Turkish may have almost nothing in
  common, but their respective ancestors - Proto-Indo-European, Proto-Uralic
  and Proto-Altaic - appear to have many more common traits and common
  vocabulary. This means that it is possible to extend the time
  perspective and reconstruct even earlier stages of human language, and much
  of this research has already been conducted.

  Etymological
  databases for several macro families are also being compiled, and several of
  them - Australian, Eurasiatic (Nostratic) and Afro-Asiatic - are already near
  completion. Once an etymological database becomes available, it can be used
  to significantly simplify the task of searching for lexical cognates and
  building up higher level databases. Etymological databases can also be used
  (and are being used) for a statistical evaluation of taxonomic correlations.
  The number of etymological matches between languages is a good measure of the
  distance between them, and it can also be used to estimate the time
  depth of any linguistic family. In fact, so-called lexicostatistics is the
  only available tool for absolute linguistic dating, and developing its theoretical
  rationale and practical application is one of the central tasks of the EHL
  project.

  While the
  project concentrates on building up a hierarchical system of etymological
  databases, reflecting the hierarchical taxonomy of the linguistic
  genealogical tree, it is also concerned with collecting and putting online
  primary language wordlists as well as existing etymological sources. The
  ideal etymological database system should be able to provide an etymology for
  any word in any modern or ancient language, tracing its origin as far as
  possible. The participants of the project have provided source wordlists for
  poorly explored language families such as Indo-Pacific and Australian, where
  most of the comparative work is yet to be done. They have also scanned,
  run through text recognition, and converted to database format some of the major existing
  etymological dictionaries, such as Pokorny's Indo-European etymological
  dictionary.

  The
  ultimate goal of the system of databases described above is to arrive at a
  stage when an absolute majority of the world's languages can be reduced to a
  minimum number of macro families, which in turn can be traced
  back to a Proto-Sapiens stage, should the databases provide sufficient
  evidence to support the hypothesis of monogenesis. With the database system
  completed, and the basics of the Proto-Sapiens structure established,
  we can hope to gain a vital tool for understanding
  the origin of language itself. |
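
  The idea that the number of etymological matches measures the distance
  between languages, and can be turned into a date, can be illustrated with a
  minimal sketch. It assumes the classical Swadesh formula
  t = ln(c) / (2 * ln(r)), where c is the share of cognates on a basic
  wordlist and r is the share of that list retained per millennium. The value
  r = 0.86 (often quoted for a 100-item list), the function names, and the
  sample figures are assumptions made for illustration; they are not taken
  from the EHL project, whose own dating procedure may differ.

```python
import math


def shared_cognate_ratio(cognate_judgments):
    """Return the fraction of compared meanings judged to be cognate.

    `cognate_judgments` is a sequence of booleans, one per wordlist meaning
    (a hypothetical input format chosen for this sketch).
    """
    if not cognate_judgments:
        raise ValueError("the wordlist comparison is empty")
    matches = sum(1 for is_cognate in cognate_judgments if is_cognate)
    return matches / len(cognate_judgments)


def divergence_time_millennia(shared_ratio, retention_rate=0.86):
    """Classical Swadesh estimate of separation time, in millennia.

    t = ln(c) / (2 * ln(r)); the factor 2 reflects that both languages
    change independently after the split. retention_rate=0.86 is an
    assumed constant, not a value given in the text above.
    """
    if not 0.0 < shared_ratio <= 1.0:
        raise ValueError("shared_ratio must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_ratio) / (2.0 * math.log(retention_rate))


def expected_shared_ratio(millennia, retention_rate=0.86):
    """Inverse of the formula above: c = r ** (2 * t)."""
    return retention_rate ** (2.0 * millennia)


if __name__ == "__main__":
    # Hypothetical comparison: 74 of 100 basic meanings judged cognate.
    judgments = [True] * 74 + [False] * 26
    c = shared_cognate_ratio(judgments)
    print(f"shared ratio: {c:.2f}")
    print(f"estimated separation: {divergence_time_millennia(c):.2f} millennia")
    print(f"expected ratio after 6.5 millennia: {expected_shared_ratio(6.5):.2f}")
```

  Under these assumptions, two languages separated for 6,500 years would be
  expected to share only about 14 percent of a basic wordlist, which is in
  line with the observation above that superficially common features largely
  disappear after some 6,000-7,000 years.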