\documentclass{article} %report?
%\usepackage{natbib} apa style etc.
\usepackage{amsmath}
\usepackage{fullpage}
\usepackage{multicol}
\usepackage{titlesec}
\usepackage{verbatim}
\usepackage[pdftex]{graphicx}
\usepackage[english]{babel}
\usepackage[colorlinks=true, linkcolor=black, urlcolor=blue, pdfborder={0 0 0}]{hyperref}
%\usepackage{synttree}
\usepackage[utf8x]{inputenc}
%
\title{Esperanto: a counterexample to the very idea \\
of formal natural language semantics? 
}%
%\emph{ . . . being a minor investigation on rigor and \\
%[... bananas bananas bananas ...]}}
%
\author{Andreas van Cranenburgh\footnote{0440949, \texttt{acranenb@science.uva.nl}}}
%
%
\newcommand{\TODO}[1]{\textbf{[[TO DO: #1]]}}
%
\titleformat{\section}
{\titlerule
\vspace{.8ex}%
\normalfont\scshape}
{\thesection.}{.5em}{}

\titleformat{\subsection}
{\normalfont\scshape}
{\thesubsection.}{.5em}{}

\titleformat{\subsubsection}
{\normalfont\scshape}
{\thesubsubsection.}{.5em}{}
%
\makeatletter
\renewcommand\section{\@startsection {section}{1}{\z@}%
                                   {.1ex \@plus -1ex \@minus -.2ex}%
                                   {1ex \@plus.1ex}%
                                   {\centering\large\scshape}}
\makeatother
\newcommand{\R}{\mathbb{R}}

\begin{document}

\maketitle
\begin{center}
Structures for Semantics, Master of Logic, \\
Institute for Logic, Language \& Computation, \\
University of Amsterdam
\end{center}

%\begin{tabular}{ll}
%\begin{quote}
\begin{verbatim}
                      It's like learning a new language
                      Helps me catch up on my mime
                      If you don't bring up those lonely parts
                      This could be a good time
                      You come here to me.
                      We'll collect those lonely parts and set them down
                      You come here to me . . .\end{verbatim}
(``Leif Erikson,'' performed by Interpol, from \emph{Turn on the Bright Lights}, 2002)
%\end{quote}
%&
%\begin{quote}

%\emph{
%                      would you kindly, read this word for word \\
%                      so loud and clear, \\
%                      I can't remember it all, \\
%                      it needs to be clear, I tell you, \\
%                      if the feeling drops out of your voice, \\
%                      would you kindly pick it up
%}\footnote{T.B.D., performed by Live, from Throwing copper (1994)}
%\end{quote}
%\end{tabular}
%\vspace{4em}

\begin{abstract}
Esperanto and formal semantics are incompatible. This is the fault of formal
semantics, not Esperanto.
\end{abstract}

\begin{multicols}{2}

%\newpage

%\tableofcontents

%\newpage

\section*{Thesis}

Is it possible to account for a fragment of Esperanto in a rigorous, formal semantics?

%endaevor a similarly formal enterprise for a
%constructed language such as Esperanto?

Formal Semantics is Anglocentric, albeit for sociological reasons. A limitless supply
of usually convergent semantic judgments is available as input to the effort of
formalizing fragments of the semantics of English. %cite

On the other hand, there is the situation of dead languages such as Latin and
Ancient Greek, where no such data is forthcoming. In such cases hermeneutics %cite
and educated guesswork are all we can rely on -- objectively, surface forms are the only
hard evidence, which must be studied together with circumstantial evidence,
viz.\ archaeology.  When we as humans struggle to understand such languages from
a first-person perspective -- e.g., knowledge of colloquial Latin is based on
a single remnant work, the Satyricon -- why bother attempting a third-person,
formal account of them? Such an account will most likely be unfalsifiable, and
useless besides, since the point of historical texts and languages is understanding
texts in their context, whereas formal semantics aims at producing a
context-free, literalist, denotational (as opposed to connotational) mapping of
arbitrary word forms to `meanings.' Scare quotes are called for because
meaning in formal semantics has a very technical and specific sense, which is
arguably of negligible interest to real understanding (this provocative claim
should be the topic of another essay or dissertation).

Esperanto has a sufficient number of speakers\footnote{Estimated
at 100,000 to 1 million. %cite
} -- a critical mass which is a necessary condition for a language to thrive
and to enable a general linguistic treatment.  Its speakers, however, are
mostly \textsc{l2} speakers, which makes intuitions less reliable. Since there
is no geographically concentrated community of \textsc{l1} speakers who
develop the language continuously and intensively, no extensive, fine-grained
intuitions on marked constructions -- be they syntactic, semantic or
pragmatic -- have developed. This is a problem for formal
linguistics generally, and for formal semantics specifically, which necessarily attempts
to give a clear and precise account of a fragment of natural language -- in
other words, a rational reconstruction.

%My thesis is that a 
{\bf Semantics of Esperanto requires a battery of evidence} which may
very well not reduce to a neat formalization such as implicitly envisioned for
English by formal semanticists. %cite

Specifically, this is because of the curious situation and status of Esperanto.
Generally, it is because current formal semantics suffers from a tunnel vision which
ignores the stochastic nature of natural language semantics. Knowledge of
semantics seems to be continuous with world knowledge, and I would be surprised
if philosophers (or men of science,\footnote{I employ this term because `scientist' is
an opaque derivation; a ``person who deals in knowledge'' is much too general, and
transparent derivations are a thing of beauty.  Needless to say, \emph{men of
science} should be read as \emph{men and women of science}; classical
collocations may be a sacrifice to political correctness, but {\em quid ad me!}} for that
matter; I subscribe to Naturalism, which holds that philosophy is, or should
be, continuous with science) could succeed in finding a clean break which
allows them to rigorously isolate semantics in empirical data.  Empirical work,
for its part, suffers from
undergeneralization: lots of interesting facts are available, but a coherent
theory is not forthcoming (this is of course a very general problem of
neuroscience and, to an even greater extent, psychology).

{\bf A reading comprehension experiment} could reveal how underspecified Esperanto truly is in
practice. My intuition is that Esperanto relies more on context than on linguistic
resources such as markedness, compared to ethnic languages with stronger cultural
transmission (Tomasello: ratchet effect), but that despite this it is an effective
tool for communication (which is what century-long practice seems to show).

Esperanto has relatively little polysemy, because of the principle of ``one root, one sense,''
just as it follows ``one spelling, one pronunciation'' in phonology. The set of distinctions
that Esperanto makes lexically is unique and large compared to other Indo-European
languages.

Generative linguistics assumes the competence-performance distinction, just as
structuralism follows the distinction between {\em langue} and {\em parole}.
Linguistics, according to these two theories at least, should study
only the first member of each pair. The way to get at it is with recourse to speaker
intuitions, for example judgments of grammaticality or meaning. Initially these
intuitions were considered the only reliable source of data about the presupposed
abstract rule system that is grammar, since it was hypothesized that actual language use
is riddled with performance errors: disfluencies, ungrammaticality, etc.

Contrary to this are the findings in psychology demonstrating that introspection is a
completely unreliable source of data. %cite.
Another problem is that there can be no independent corroboration if the
researcher starts with intuitions and verifies these with yet more intuitions
(Stokhof 2008).

When linguistics (and hence semantics) employs corpus studies and questionnaires coupled
with sophisticated statistical approaches, it has much more empirical leverage, and hence
more potential for success.  However, the problem that will rear its head is
that of integrating disparate sources of evidence, and the fact that there may
be lots of it (e.g., the BNC comprises 100 million words). A neat, parsimonious
theory may not obtain. Then there is the problem of sparsity, which is especially 
relevant for endangered languages. Humans seem to have powerful generalization strategies
to deal with such situations, as evidenced by the fact that it has been possible for
anthropologists to learn the language of natives without the help of any common language.

Empirical rather than {\em a priori} approaches to semantics are available in the
form of Distributional Semantic Models (DSMs), such as Latent Semantic Analysis (LSA),
which exploit statistics on word co-occurrences to reveal information such as semantic
similarity in a truly unsupervised way (i.e., not biased by {\em a priori} linguistic
theorizing), since they only exploit the concordances of surface forms in corpora.
Even in this case, however, the corpora need to be carefully chosen, which is challenging
for Esperanto since speakers come from different backgrounds and may be at odds with
each other\footnote{Wim Jansen (personal communication) informs me that there seem to
be convergent intuitions on frequent and basic constructions among speakers from
different backgrounds, but that intuitions are less clear for potential constructions
which are not widely attested (e.g., clefts).}. It goes without saying that it is
possible to let DSMs loose on the whole literature of Esperanto, but there is undoubtedly
a similar kind of diglossia between written and spoken communication as attested
in other languages, which makes such a project less general than desired for a complete
semantics.
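
As a minimal sketch of what such a model computes (the notation is mine and is
not tied to any particular DSM), suppose each word $w$ is represented by a
vector $\mathbf{v}_w$ of co-occurrence counts with $n$ context words. Semantic
similarity between two words is then commonly estimated by the cosine of the
angle between their vectors:
\[
\operatorname{sim}(w_1, w_2) =
\frac{\mathbf{v}_{w_1} \cdot \mathbf{v}_{w_2}}
     {\lVert \mathbf{v}_{w_1} \rVert \, \lVert \mathbf{v}_{w_2} \rVert}
\]
A value near 1 indicates that the two words occur in similar contexts. LSA
first reduces the dimensionality of the count matrix with a truncated singular
value decomposition and then compares the reduced vectors in this way.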

Statistically, one may conjecture that an almost exclusively \textsc{l2} language
such as Esperanto makes much stronger independence assumptions than a language
for which an extensive collection of marked constructions is handed down
through generations. This speaks to the high degree of
compositionality in Esperanto, which is good news for formal semantics, which delights
in such systematicity. Stochastically, however, it means that stronger independence
assumptions come into play: the fragments of exemplars that are combined are
smaller, so that more fragments are required for a single derivation, with less
corroboration by spurious derivations using fragments of different sizes. (In
the limit the fragments are words, which is the case when the principle of
compositionality holds unconditionally and the lexicon is neatly composed
only of words.)  This data sparsity is not problematic for international
communication and literature, as evidenced by the practice of Esperanto, but it
should become problematic when a language is stressed by the scrutiny applied
in court rooms and close readings.
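
To make the independence assumptions explicit, consider a simplified
exemplar-based model (an illustration of mine, with hypothetical notation): a
derivation $d$ combines fragments $f_1, \ldots, f_k$, each chosen independently
of the others, and an analysis $a$ may have several derivations. Then
\[
P(d) = \prod_{i=1}^{k} P(f_i), \qquad
P(a) = \sum_{d \text{ derives } a} P(d).
\]
Smaller fragments mean a larger $k$, so more independence assumptions enter
each product, and fewer alternative derivations contribute to the sum.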

{\bf Formal semantics assumes that semantics is universal}, be it tacitly or
explicitly.  Whether this assumption is warranted is a difficult question, but
Esperanto sheds light on it from a unique angle. %cite antonym work

Miner (2008) raises fascinating questions as to the normativity of a language without
native speakers. In his seemingly structuralist opinion normativity without native
speakers simply does not obtain. I beg to differ, as the experience of Esperantists
from wildly different backgrounds seems to show that norms may be negotiated dynamically,
which allows the language to evolve (albeit slowly as a consequence of its relatively
non-intensive usage). Miner's point about negative evidence being absent is of course
correct, but I do not see why Esperanto is a special case in a crucial way. Negative
evidence is always a nebulous concept in philosophy and science, and
soliciting judgments from speakers, be they native or not, does not solve this. Note
that in modern Bayesian approaches there is actually no principled difference between
positive and negative evidence. %cite.
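
The point can be illustrated with Bayes' rule (a textbook formulation, not
specific to any account discussed here). For a hypothesis $h$ about the
language and an observation $e$,
\[
P(h \mid e) = \frac{P(e \mid h)\, P(h)}{P(e)}.
\]
Whether $e$ counts as positive or negative evidence for $h$ depends only on
whether $P(e \mid h)$ is larger or smaller than $P(e)$; the update rule is the
same in both cases.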

Children acquire native languages without any considerable negative evidence; why should
linguistics be any different in this regard? The fact that the judgments of \textsc{l2}
speakers are less authoritative (I refuse to consider this distinction as categorical)
does not detract from the apparent fact that they {\em exist}, and thus should be
exploited in an empirically responsible theory. Judgments can be aggregated according
to voting theory and sophisticated statistical methods. This is of course more
work than the armchair linguistics that current \textsc{l1} speakers and
researchers engage in, but it should be well worth the effort if the aim is to
get a more complete and trustworthy picture of a language.



%\section{Argument}
%When formal semantics is not being
%Anglocentric but takes a comparative approach instead, it usually implies a
%more or less explicit assumption that semantics is universal.

\section*{Conclusion}
{\bf Esperanto and formal semantics are incompatible. This is the fault of formal
semantics, not Esperanto.} Much work remains to be done for an 
{\em intellectual} theory of everything,\footnote{Please excuse the pompous
physics envy -- physicists and journalists alike seem to have no qualms about
their hubris when they employ such slogans. This decade we will find the ``God
particle!''} in which meaning should be a first-class citizen. I am confident
that men of science will make great strides in this respect during the
current century.

\vspace{5em}
{\bf Acknowledgments}: 
I am grateful for a critical discussion with In\'es Crespo which has sharpened my
terminology and thinking. I am indebted to Professor Wim Jansen for
extensive correspondence and discussions on this topic. Nevertheless the opinions
expressed in this essay are entirely my own responsibility and by and large 
synthesized {\em de novo}.

\end{multicols}

\end{document}

Bibliography:

Miner 2008
Krifka; varieties
Stokhof; hand or hammer
Montague PTQ.
DSM, LSA: Steyvers, (cf. ESSLI course)
Dual Code (neuroscience) (cf. blackboard, zeevat course, langopt essay)
more?

SLA. Rekta metodo (ostensive).

% miner: negative evidence, underspecification (test?!)
% dasgupta, me: high compositionality, systematicity
% stokhof: logical form as ideal language, 
%	esperanto contrasted with ideal language
% me: semantics as skill not fact; is it possible to conceive of a formal
%	semantics describing a skill?! eg. one can objectively study bicycle
%	riding, but it's still practically 100% a skill to humans (hard to conceive of someone learning bicycle riding from a scientific study).
%	look up stuff about facts vs. skills, pragmatism etc.
% intuitive argument: if language is not ambiguous and underspecified enough
%	(eg. FOL), then it is not practically useful for domain-general spoken
%	communication, because the language will not exploit the role of
%	context, but everything will have to be spelled out (legalese)

%particle movement is obligatory when object NP is a pronoun
% (1) *he threw out it.
% (2) he threw out the garbage

%Esperanto is strongly exemplar-based 
% - but due to sparsity, strong independence assumptions come into play
%    => high compositionality

% fact-value dichotomy: objectivity versus normativity;

% -------------------------------------------------------------------
Dear Andreas,
Yes, I know Miner's article from the Festlibro, and I have started working in
earnest in exactly that field, i.e. in the fields of 'transparency' and
'opacity'. On Thursday, as it happens, I have a meeting about a chapter on the
transparency of Esperanto, which I wrote for a 'tutorial' being written by
Kees Hengeveld.
I will do my best to comment on your ideas before the 9th, and I will probably
manage it at the last minute. On Saturday I have a meeting of a doctoral
committee, for which I must prepare very seriously and thoroughly. I have
already finished reading the dissertation, but I foresee a fierce battle among
the committee members.
Your suggestion about Miner is extremely interesting! You will hear from me.
Kind regards,
Wim

On Mon, 31 May 2010 17:26:32 +0200                                                        
"Wim Jansen" <wimjansen@casema.nl> wrote:                                                 
                                                                                          
> All right, the matter is clear. Yes, I received your message of the 17th and
> am sorry that I did not confirm it with a reply. Good, we will talk it over.
> About Thursday the 10th: I have another meeting at 10 o'clock in the
> Bungehuis, and I assume it will last until the lunch break. Shall we perhaps
> simply set 14:00 for our meeting? You know that I do not have a proper
> office, but in the Bungehuis we will no doubt find a meeting room (the
> doorman knows me well and is very helpful); in the worst case we will stay in
> the canteen or go to the pub across the street.
                                                                                          
14:00 it is.

Since the 9th is the date by which I must finish an essay on semantics, I will
not be able to ask for advice about it then, so I will do so now. I am planning
an essay on the (im)possibility of a linguistics, and specifically a semantics,
of Esperanto, in reaction to an article by Ken Miner [1]. Do you know that
article? In my opinion it is too pessimistic, but I do not have very clear and
concrete arguments for that; I have tried to summarize my thoughts somewhat in
a presentation about the article [2], but my knowledge of linguistics is
limited (e.g., I suspect that his thesis concerns structuralist linguistics
specifically, but I am not sure).

I think I agree with Miner about the "vagueness" of Esperanto (though vagueness
is not the right word; the right word is imprecision, or
"underspecification"), but accepting this has the consequence that one must
explain why Esperanto works so well in practice, which seems paradoxical.
Perhaps we can conclude that Esperanto needs more context to resolve ambiguity
and imprecision, and that this is a sacrifice of linguistic precision for the
sake of ease. I am very curious about your thoughts, because I remember you
claiming that in Esperanto you can express exactly what you feel, instead of
only what you are able to say, as in a foreign language. Are familiarity and a
sense of expressive power related to precision?

Andreas

 > A sizeable minority has Esperanto as a mother tongue                                   
                                                                                          
   "Sizeable" is very elastic, but you probably know that this concerns at
   most 1000 speakers, and that this count is based on the number of
   international marriages between Esperantists; I have yet to find a list
   anywhere showing that this estimate is plausible, or that it refers to 1000
   speakers alive now rather than a historically accumulated total.
                                                                                          
 > But: children acquire any first language without negative evidence!                    
 >                                                                                        
 > (including meanings)                                                                   
                                                                                          
   I think so too. Parents do offer their children simplified language,
   initially, I would say, even ungrammatical language, but as the child
   progresses the language becomes not only more complete but also more
   correct. Moreover, children pick up a great deal indirectly without being
   addressed directly (conversations between others), and that language
   contains only positive evidence (well, we all make mistakes).

   I am therefore in favor of devising good negative-evidence experiments for
   Esperanto. It is after all the case that Esperanto speakers usually learn
   their language imperfectly and apply it only in very limited ways. There
   may well be constructions that are possible but do not yet exist. For
   instance, there are grammar purists who object to the cleft (Dutch:
   `gekliefde zin') in syntax, while there are speakers who happily use this
   construction and are understood as well. In my doctoral research I also
   looked for indications that hanging topics and left dislocations are
   possible. I could not make that a main theme, but I smuggled it in, and it
   is clear that this definitely deserves more attention.
                                                                                          
 > Also: intuitions are only crucial for competence linguistics, and they                 
 >                                                                                        
 > are just as suspect as introspection in psychology                                     
                                                                                          
   You will have to explain this to me.
                                                                                          
 > Esperanto's lexical semantics is very precise                                          
                                                                                          
   There is much to say and to investigate about that. The same Festlibro for
   Humphrey Tonkin contains a nice article by Ilona Koutny on this subject.
   Moreover, any dictionary will show you that there are synonyms and
   homonyms.

 > Very rich morphology and systematic word building                                      
 >                                                                                        
 > (perfectly compositional)                                                              
                                                                                          
   What is not in parentheses is correct. Compositionality is open to
   discussion (with Koutny, above). Example: vortaro is very specifically
   `dictionary', to the exclusion of `collection of words'; libraro is very
   generically `collection of books', to the exclusion of, e.g., `library' or
   `bookshop'; and there are many more such examples, which in many cases
   (all?) have grown historically. What I myself am working on is the
   phenomenon of derivation without derivational means: we know what akvo
   means, but who determines what akvi is? Connected to this is the whole
   phenomenon of redundancy, which in turn is connected to
   transparency/opacity between the different levels of linguistic analysis
   (one-to-one mappings or not), e.g. at the semantics-morphosyntax interface.
                                                                                          
 > Committee of average Esperanto speakers supplies consensus intuitions after all         

   I agree with you completely. There is a community (what you call it does
   not matter much), and it has indeed developed a consensus.
                                                                                          
 > But: this seems to amount to giving up formalism                                       
                                                                                          
   This, again, you will have to explain to me.

