\documentclass[10pt,a4paper]{article}
\usepackage{linguex}
\usepackage{verbatim}
\usepackage[pdftex]{graphicx}
\usepackage[english]{babel}
\usepackage{hyperref}
\usepackage{fullpage}
\usepackage[utf8x]{inputenc}
\begin{document}

\begin{center}
{\em 
Andreas van Cranenburgh 0440949 \\
MoL project:
Empirically motivated \\
lexical representations in Lexical Semantics
 
University of Amsterdam, January 2010}
\end{center}

%Notes:
% advantage of esperanto/mal is that we know the antonym pairs explicitely


\begin{center}
{\bf \LARGE Antonym morphology in Esperanto}
\end{center}

\section{Introduction}

Esperanto is a constructed language with a simple grammar and transparent
morphology. While it is a constructed language not traditionally spoken by a
single human community, it does fulfill other criteria of natural language:
it has a speech community, a continuous history of use and written literature,
and it accords with linguistic universals (Jansen 2007). The fact that it started out
as a constructed language is not sufficient to dismiss it as a natural
language, because it has since developed and evolved to meet the needs of its
speakers.  It is the most successful planned language, with an estimated number
of speakers ranging from 100,000 to 1 million.

Although Esperanto is clearly unique in its success as a constructed language,
it has received relatively little attention from academic linguistics. This may
be related to the fact that Esperanto has no normative community of native
speakers, so it is not possible to consult intuitions on an arbitrary
construction, except when it occurs in the existing literature which serves as
a model to be emulated. The consequence of this is that while it is possible to
attest what is grammatical through corpora and speaker knowledge, no Esperanto
speaker is in a position to say what is not allowed, except for the more
trivial cases.

Although the lack of a community of native speakers, mixed vocabulary and
simple grammar conform to the features of a pidgin, Esperanto is not a pidgin,
because it was deliberately planned and published with a grammar, vocabulary
and sample translations. Although there do exist native speakers of Esperanto
who have been taught the language as a secondary L1 at home, this phenomenon is
not comparable to the complete immersion that comes with national languages.
And since these native speakers are a minority and do not use Esperanto as
their main, day-to-day language, it cannot be said that Esperanto has
`creolized' because of them; the vast majority of speakers still learn the
language as a second language.

%Leĝo de Tonjo: “Ju pli longas reta diskuto en Esperanto, des pli la        
%probableco, ke ĝi deflankiĝos al diskuto pri gramatikaĵoj aŭ pri la            
%uzata vortigo, (asimptote) proksimiĝas al 1.”
\subsection{Research question}

Gradable antonym pairs are characterized by several properties that distinguish
a marked and an unmarked item of the pair, or alternatively, a positive and a
negative polarity antonym. Lehrer (1985) discusses these properties with
regard to English and concludes:

\begin{quote}
[...] we see that markedness [is] not a general structural property of antonymy;
rather it consists of a number of independent properties that are imperfectly
correlated.  However, none of these is in fact true of all antonym pairs.
\end{quote}

However, since in Esperanto most antonyms are systematically formed using the
very productive affix \emph{mal}-, perhaps their markedness is more clearly
related to their antonymy. The fact that antonyms are morphologically marked
will make it trivial to find antonym pairs. 

This relates to the controversial hypothesis of decomposable antonyms, also
known as the syntactic negation theory of antonymy (Heim 2008), which states
that instead of being specified in the lexicon, antonyms are formed by a
predicate negation operator {\bf little}, hidden away in the logical form of
words such as `short.' In Esperanto this is ostensibly the case, since most
antonyms are formed by an explicit version of this operator, which makes it
an interesting case in point.

Research question: To what extent is antonym morphology in Esperanto indicative
of or related to semantic negativity?

My hypothesis is that the antonym prefix of Esperanto (\emph{mal}-) is
predictive of negative polarity, and more systematically so than in English.  
I will employ the following tests for negative polarity:

\begin{itemize}
\addtolength{\itemsep}{-.35\baselineskip}
\item Measure phrases require positive polarity (Kennedy 2001)
\item Ratio modifiers, `twice' / `x times', occur more with positive polarity
	adjectives (Sassoon 2008) 
\item Nominalizations of positive polarity adjectives
	are more frequent than those of negative polarity (Lehrer 1985)
\end{itemize}

While there are other tests, such as that positive antonyms are neutralized
in questions or that negative antonyms produce stronger entailments, I will
employ these tests because they are suitable for corpus study.

The outline of this paper is as follows. First I will present some
evidence of the productivity of the antonym morphology in Esperanto. The
main part of the paper then applies the three tests for markedness to antonym
pairs. I will conclude with some additional measurements.

I will make use of two corpora. The first is an approximately 1 million word 
corpus compiled from Gutenberg sources, containing 36 works, translated and
original literature, with a total of 743,623 words. The second corpus, the
Tekstaro (corpus), is an approximately 4.3 million word corpus available
online (Wennergren 2003), containing translated and original literature but
also magazine articles.

\subsection{Frequency spectrum of \emph{mal}-antonyms and base forms}

In order to see why the \emph{mal}-prefix is a suitable object of study,
I will demonstrate its pervasiveness and productivity in language usage.

To determine the productivity of an affix, it is useful to look at the
frequency spectrum of a class of words. Baayen and Lieber (1997) describe a
technique where the frequencies of words with certain affixes are compared.  By
looking at the frequencies of frequencies it is possible to gauge the
productivity of an affix. In other words, we abstract from the distribution of
the words and instead look at the distribution of their frequencies. There is a
certain number of words which occur once, a number that occur twice, etc. The
relation with productivity is that a highly productive affix can be recognized
by the fact that most of its frequencies are in the lower end, because newly
produced words will have low frequencies. An unproductive affix will, on the
other hand, have a set of words associated with it that are frequent enough to
be memorized as correct by speakers. Unproductive affixes can also be irregular
and semantically non-transparent, as a consequence of having to be memorized:
if a word has to be stored in the mental lexicon in its own entry, it
might as well take on a life of its own. 
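
The notion of a frequency spectrum can be stated compactly. The notation below
is introduced here purely for illustration and is not taken from the sources
cited: write $V(m,N)$ for the number of word types with a given affix that
occur exactly $m$ times in a sample containing $N$ tokens of that affix. The
total number of types $V(N)$ and the sample size $N$ then follow from the
spectrum:

\[
V(N) = \sum_{m \geq 1} V(m,N), \qquad N = \sum_{m \geq 1} m \cdot V(m,N).
\]

The hapax legomena are the types counted by $V(1,N)$. On this view a
productive affix is one whose types are concentrated at small values of $m$.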

Baayen's productivity index can be calculated by dividing the number of hapax
legomena (word types which occur only once) of a given word formation process
by the total number of its tokens (Hay \& Baayen 2002). For the \emph{mal}-
prefix this is 1017 / 11025 = 0.0922. This index corresponds to the rate at
which new types are expected when further tokens are sampled. This index is
exceedingly high, predicting almost 1 new \emph{mal}-type for every 10
\emph{mal}-tokens that are sampled. Compare this to the productivity index for
the English un- prefix: 0.005 (Hay \& Baayen 2002).
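
Written out, with $n_1$ for the number of hapax legomena and $N$ for the number
of tokens formed with the affix (symbols introduced here only for this
illustration), the calculation for \emph{mal}- is:

\[
P = \frac{n_1}{N} = \frac{1017}{11025} \approx 0.0922, \qquad
\frac{1}{P} \approx 10.8 .
\]

Read as a rate, this amounts to roughly one new type per 10.8 tokens, against
one new type per $1 / 0.005 = 200$ tokens for the English un- prefix. This
reading assumes that the two indices are directly comparable across corpora of
different sizes, which is only approximately true.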

Figure \ref{dist} shows the frequency distribution and density of
all \emph{mal}-antonyms, versus their corresponding base forms. The x-axis
shows frequencies, while the y-axis shows the frequency of a given frequency 
(i.e., the number of word types that have the frequency on the x-axis). 

%Since the graph for the mal-words is more skewed to the left than the graph for
%the base forms, it can be concluded that the mal-words are characterized by a
%relatively high number of low-frequency words, which is a characteristic of
%productive affixes. Also, the fact that the distribution of mal-words appears to be
The graph for the frequency densities shows that the density of
\emph{mal}-words is smooth and has only one peak, i.e., the distribution is
unimodal, which suggests that all derivations are transparent. This contrasts
with the graph in Baayen and Lieber (1997), which shows a bimodal distribution
caused by the presence of both opaque and transparent uses of a Dutch prefix. 

Finally, it is striking that for the \emph{mal}-words the number of hapax
legomena (1017) is almost 8 times as large as that of the corresponding
non-\emph{mal}-words (129), in spite of the fact that the
corresponding non-\emph{mal}-words have a total count more than 7 times
higher than that of the \emph{mal}-words (83871 versus 11242). This suggests a
high degree of productivity for the \emph{mal}- prefix, but it could also mean
that the \emph{mal}- prefix is preferably applied to more common words.

\begin{figure}
\includegraphics[scale=0.4]{maldist.pdf}
\includegraphics[scale=0.4]{maldens.pdf}
\caption{Left: frequency spectrum of \emph{mal}-words in the Gutenberg corpus;
the x-axis has a logarithmic scale.
Right: the solid line shows the frequency density of the first 100
frequencies of \emph{mal}-words, the dashed lines the frequency density of the
corresponding non-\emph{mal}-words, also in the Gutenberg corpus. Note that
while the frequencies start at 1, the density curve is smoothed to the left.
}
\label{dist}
\end{figure}

The next two graphs show vocabulary growth curves for \emph{mal}-words. The
x-axis shows the number of \emph{mal}-word tokens considered, the y-axis the
number of types encountered so far. For example, within the first 2000 tokens,
500 different types have been encountered.

Figure \ref{vgchap} shows the empirical growth curve (thick line) together with
the number of hapax legomena (thin line), as well as a comparison with an
interpolated version. Since the interpolated curve does not diverge from the
empirical curve there is no reason to believe that there are non-random
patterns in the data. 

\begin{figure}
\includegraphics[scale=0.4]{vgchap.pdf}
\includegraphics[scale=0.4]{vgcint.pdf}
\caption{Left: Empirical growth curve of \emph{mal}-words. The thin line is the
number of hapax legomena. From Gutenberg corpus. Right: Empirical growth curve
with interpolated (expected) curve. From Gutenberg corpus.} \label{vgchap}
\end{figure}

What is striking is that the growth curve is still very steep in the end, which
predicts that the number of types will keep rising steadily as more tokens are
considered. Since the curve for the hapax legomena is also quite steep, it is
clear that it is not just a fixed group of words larger than the sample that is
being encountered, but a steady stream of truly productive one-off words.

\subsection{\emph{Mal}-marked antonyms versus alternative forms}
%Do the neologisms for mal- adjectives behave the same?
Since the publication of the language, many new words have been coined.
Among these are neologisms intended as alternatives to the
derived \emph{mal}-antonyms. This started out chiefly in poetry, where the
derived antonyms were felt to be too long and unnatural. If these neologisms
were in general use, \emph{mal}-antonyms could not be assumed, as this study
does, to be representative of antonymy in general in Esperanto.

Table \ref{adjalt} compares some of the most frequent \emph{mal}-adjectives
with alternative forms, with counts from the large corpus:

\begin{table}\begin{center}
\begin{tabular}{lrlr}
\emph{mal}-marked 		& \# 	  	& alternative	& \# \\ \hline
\emph{malbona} (bad) 		& 1349    	& \emph{mava}		& 4 \\
\emph{malkara}	(cheap) 	& 30	  	& \emph{ĉipa}		& 3 \\ % malmultekosta
\emph{malsama} (different) 	& 268 	  	& \emph{diferenca} 	& 52 \\
\emph{maljuna} / \emph{malnova} (old) & 274 + 230	& \emph{olda}		& 61 \\
\emph{mallonga} (short) 	& 587		& \emph{kurta}		& 29 \\
\emph{malalta} (low) 		& 356		& \emph{basa}		& 19
\end{tabular}
\caption{\emph{Mal}-marked antonyms and alternative forms, in Tekstaro corpus.}
\label{adjalt}
\end{center}\end{table}

From these numbers it appears that there is a clear preference to use the
original, \emph{mal}-marked antonyms, instead of the alternatives.

All of these alternative forms, except `\emph{diferenca}' and `\emph{basa}',
are neologisms introduced specifically to replace the \emph{mal}-forms.
`\emph{Diferenca}' is derived from the noun `\emph{diferenco}' (difference).
`\emph{Basa}' was originally restricted to low as in `a low voice,' but has
acquired some new connotations:

\begin{itemize}
\item ``la landoj basaj'', pli-malpli la nunaj Nederlando, Belgio kaj [...] \\
(``the low countries'', more or less the current Netherlands, Belgium and
[...])
\item sed la plej basaj, putraj, koruptaj agmanieroj \\
(but the lowest, dirtiest, most corrupt behavior)
\end{itemize}

While it is impossible to generalize from this single figurative usage in the
corpus, it seems entirely plausible that the derivational transparency of
`malalta' blocks the figurative use to express `low manners', which affords
`basa' the opportunity to take it on as an additional meaning. 
%This could suggest that mal-words generally 

With these results in mind it is clear that solely focusing on the
\emph{mal}-antonyms is justified, as the alternatives play a marginal role.


\section{Antonym pairs and markedness}
\subsection{Adjectives and antonyms}

Table \ref{adjgut} lists the ten most frequent adjectives in the Gutenberg
corpus, with their frequencies in the larger corpus on the right.  The
\emph{mal}-antonyms of these adjectives occur significantly less often than the
basic forms in the large corpus (Wilcoxon signed rank test, p-value =
0.001953). In other words, the difference in frequencies between adjectives
with and without \emph{mal}- is larger than could be expected from chance. This
is the first indication that \emph{mal}- occurs with negative adjectives. While
it is not an actual test for polarity, the lower frequency is considered a
side-effect of other properties of negativity, such as non-neutrality (Lehrer
1985).
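
The reported p-value is what an exact two-sided signed rank test yields when
all ten pairs differ in the same direction, as they do in the Tekstaro counts
of table \ref{adjgut}. As a sketch, assuming that the exact version of the test
was used and that there are no ties or zero differences: under the null
hypothesis each of the $2^{10}$ assignments of signs to the ranked differences
is equally likely, and the two most extreme assignments are those in which all
signs agree, so

\[
p = \frac{2}{2^{10}} = \frac{2}{1024} \approx 0.001953 .
\]

This is the smallest p-value such a test can return for ten pairs.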

%in gutenberg corpus
\begin{table}\begin{center}
\begin{tabular}{lrrr}
               &  adj   & \emph{mal}-ant.  & ratio \\
\hline
  \emph{grand}- (big) &  1390  & 481      & 0.346043 \\
  \emph{tut}- (whole) &  1023  & 0        & 0 \\
  \emph{bon}- (good)  &  872   & 198      & 0.227064 \\
  \emph{kar}- (precious) & 494 & 9	  & 0.018219 \\
  \emph{sam}- (same)  & 470    & 29       & 0.061702 \\
  \emph{bel}- (beautiful) & 551& 42       & 0.076225 \\
  \emph{jun}- (young)  &  382  & 274      & 0.717277 \\
  \emph{long}- (long)  &  477  & 124      & 0.259958 \\
  \emph{nov}- (new)    & 498   & 230      & 0.461847 \\
  \emph{alt}- (high)   &  261  & 66       & 0.252874
\end{tabular}
\begin{tabular}{lrrr}
      adj & \emph{mal}-ant. & ratio (ant/adj) \\
\hline
    8160  &  3191   & 0.391054 \\
    6413  &  0      & 0.000000 \\
    4777  &  1349   & 0.282395 \\
    1042  &  30     & 0.028791 \\
    2685  &  268    & 0.099814 \\
    2786  &  216    & 0.077531 \\
    2638  &  1288   & 0.488249 \\
    1950  &  587    & 0.301026 \\
    4405  &  1371   & 0.311237 \\
    1660  &  356    & 0.214458
\end{tabular}
\caption{Adjectives in Gutenberg corpus (left) and Tekstaro corpus (right).}
\label{adjgut}
\end{center}\end{table}

Comparing the ratios from the two corpora indicates that they are not
significantly different. A Wilcoxon signed rank test results in a p-value of
0.9057. Taking all the counts from the corpora and dividing them by the
respective number of tokens in the corpus yields relative frequencies, which
makes it possible to do a correlation test. The correlation is 0.924812
(Spearman's rank correlation, p-value $<$ 0.001). This indicates that the
two corpora are sufficiently representative of each other for our purposes.

The total number of adjectives in the large corpus is 343120 (not counting
73341 participles with adjective marking, e.g., `mi estas skribanta', `I am
writing'), with 321757 non-antonyms and 21363 \emph{mal}-antonyms. This means
that the ratio of \emph{mal}-antonyms to all adjectives is 6.2261 \%, which
will be used as a rough first estimate of expected frequencies to compare with
further counts. 
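
As a check on the arithmetic, using only the counts just given:

\[
321757 + 21363 = 343120, \qquad
\frac{21363}{343120} \approx 0.062261 = 6.2261\,\% .
\]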

Furthermore it seems that there is a particularly high number of adjectives,
namely 8.041686 \% of the total number of words. This is more than three times
as high as the count for English in the BNC\footnote{British National Corpus
(Davies 2004)} (2.421849 \%). This is probably not an artefact of the large
amount of fiction in the Esperanto corpus, because when the search in the BNC
is restricted to fiction, the percentage remains as low as 2.229581 \%.
Stranger yet, when the search in the Esperanto corpus is restricted to
magazines (at 1.4 million words still a sizeable part of the corpus), the
percentage of adjectives is even higher at 10.53739 \%. From these observations
we can conclude that Esperanto has a genuinely high usage of adjectives
compared to English. A very tentative explanation for this could be that
adjectives make up for a less specific and less extensive vocabulary, while
at the same time being easier to understand because they are more descriptive.
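
The factor of `more than three' follows directly from the two overall
percentages reported above:

\[
\frac{8.041686}{2.421849} \approx 3.32 .
\]

This comparison takes the adjective counts of both corpora at face value. It
assumes that the two counts delimit the class of adjectives in a comparable
way, which has not been verified here.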

%raw counts:
%\b\w*aj?n?\b	1087726
%\b((ne)?.iaj?n?|la|ja|da|kaj|tra|iaj?n?|ajn)\b 666473
%\biliaj?n?\b 5473
%\b\w*[aoi]n?taj?n?\b  73341
%\bmal\w*[aoi]n?taj?n?\b  1902
%\b(pia|fia|dia)j?n?\b 681
%\bmal\w*aj?n?\b 23265


% tekstaro magazines only:
%\b\w*aj?n?\b	
%\b((ne|i)?.iaj?n?|la|ja|da|kaj|tra|iaj?n?|ajn|\w*[aoi]n?taj?n?)\b 209152
%\b(pia|fia|dia)j?n?\b 138


%BNC
\begin{comment}
\begin{table}\begin{center}
\begin{tabular}{lrlrr}
         adj   & count  & ant    & count   & ant/adj  \\
\hline
	 big   & 24852  & small  & 43118   & 1.734991 \\
        good   & 80204  & bad    & 14935   & 0.186212 \\
       young   & 32325  & old    & 52485   & 1.623666 \\
 	 new   & 123706 & old    & 52485   & 0.424272 \\
        long   & 50890  & short  & 18420   & 0.361957 \\
        high   & 38188  & low    & 16654   & 0.436105 \\
   beautiful   & 8394   & ugly   & 1299    & 0.154753
 % {\em total this list}  & 146911 &        & 505470  & 0.290642
 % total ADJ 2382220. ie., 2.38222 % of all words, versus 8.041686 % in Tekstaro
\end{tabular}
\caption{Adjectives in BNC}
\label{adjbnc}
\end{center}\end{table}

\begin{table}\begin{center}
\begin{tabular}{lrlrr}
word    & english       & esperanto \\ 
\hline
big     & 0.00024852    & 0.00191245502742 \\ 
good    & 0.00080204    & 0.0011195830473 \\ 
beautiful&8.394e-05     & 0.000652953395393 \\ 
young   & 0.00032325    & 0.000618266711072 \\ 
long    & 0.0005089     & 0.000457020502877 \\ 
new     & 0.00123706    & 0.00103239759753 \\ 
high    & 0.00038188    & 0.000389053351167 \\ 
small   & 0.00043118    & 0.000747873038298 \\ 
bad     & 0.00014935    & 0.000316164440196 \\ 
ugly    & 1.299e-05     & 5.06238095495e-05 \\ 
short   & 0.0001842     & 0.00013757488984 \\ 
old     & 0.00052485    & 0.000623188470334 \\ 
low     & 0.00016654    & 8.34355379612e-05
\end{tabular}
\caption{Adjective frequencies of English (BNC) and Esperanto (Tekstaro) 
compared. Note that Esperanto distinguishes between two senses of `old' (not
new and not young), whose frequencies have been combined for comparison.}
\label{adjcomp}
\end{center}\end{table}

> cor.test(d[,1], d[,2],paired=T, method='spearman')

        Spearman's rank correlation rho

data:  d[, 1] and d[, 2]
S = 146, p-value = 0.03306
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho
0.5989011

> wilcox.test(d[,1], d[,2],paired=T)

        Wilcoxon signed rank test

data:  d[, 1] and d[, 2] 
V = 20, p-value = 0.08032
alternative hypothesis: true location shift is not equal to 0
\end{comment}

\subsection{Measure phrases}

Measure phrases in Esperanto occur in at least three varieties, namely with
adjectives, nouns and verbs:

\begin{enumerate}
\addtolength{\itemsep}{-.35\baselineskip}
\item adjective: ``... proksimume dek centimetrojn longa'' (approximately 10 centimeters long)
\item noun: ``... havas ordinare la altecon de 3-4 metroj'' (ordinarily has the height of 3-4 meters)
\item verb: ``mi eltrovis, ke la tegmento altiĝas 12 metrojn aŭ plu'' 
(I found out that the roof becomes-the-height of 12 meters or more)
\end{enumerate}

\begin{table}[htbp]
\begin{center}
\begin{tabular}{p{6cm}r}
\begin{tabular}{lr}
	adj	& count \\ \hline
     \emph{longa} (long) & 18 \\
     \emph{alta} (high) & 13 \\
     \emph{larĝa} (broad) & 9 \\
     \emph{dika} (thick) & 1
\end{tabular}

\vskip 1em
\begin{tabular}{lr}
  verb & count \\ \hline
      \emph{longas} (has-the-length of) & 2 \\
      \emph{malleviĝas} (descends) & 2 \\
      \emph{altiĝas} (becomes-the-height of) & 1
\end{tabular}
&
\begin{tabular}{lr}
	noun	& count \\ \hline
	\emph{alteco} (height)	& 6 \\
        \emph{longeco} (length) & 3 \\
       \emph{diametro} (diameter) & 2 \\
   \emph{distanco} 	(distance) & 2 \\
  \emph{interspaco}	(space in between) & 2 \\
	\emph{longo}   (length) & 1 \\
	\emph{dikeco}  (thickness) & 1 \\
	\emph{alto}	(height) & 1 \\
	\emph{spaco}	(space) & 1 \\
   \emph{profundecon} 	(depth) & 1 \\
 \emph{malproksimeco}  (away-ness) & 1 \\
 \emph{fokusdistanco}  (focus distance) & 1 \\
        \emph{larĝeco} (width) & 1 \\
\end{tabular}
\end{tabular}
\caption{Breakdown of measure phrases by word type}
\label{mes}
\end{center}\end{table}


Out of the 508 occurrences of `meters' (including centimeters and
kilometers etc.), 276 are prefixed by a quantity. Of these measure phrases, 41
are with an adjective, 23 with a noun, and 5 with a verb. See table \ref{mes}
for a breakdown by word type. Almost all of the measure phrases are with
non-\emph{mal}-words such as long (24), high (21) and wide (10). The
exceptions to this:

\ex. \label{m1} ``Kiam estas refluo kaj la akvo malleviĝas per du metroj aŭ pli'' 
	(When there is low tide and the water falls by two meters or more)

\ex. \label{m2} ``Ni ne povis vidi pli malproksime ol unu metron''
        (We couldn't see farther than one meter)

The first two sentences, \ref{m1} and \ref{m2}, are not true exceptions, because the
\emph{mal}-word and the measure phrase are not in the same clause. Instead, the
measure phrase modifies the verb in \ref{m1}, and relates to the \emph{mal}-word
only implicitly in \ref{m2}, if we take an elliptic reading (`farther than one
meter far').

\ex. \label{m3} ``mi falis tre rapide tridek metrojn malsupren'' 
	(I fell very rapidly thirty meters downwards)

\ex. \label{m4} ``... milojn da kilometroj malproksime'' 
	(thousands of kilometers away)

The third sentence, \ref{m3}, contains a measure phrase, but the \emph{mal}-word
is marked with an accusative of direction/movement. This is comparable to the
English `into' instead of `in'. Although the \emph{mal}-word clearly modifies
the measure `meters', it could be the accusative which licenses the
\emph{mal}-marked antonym.

The last sentence, \ref{m4}, with the adverb
\emph{malproksime} (far / away), seems to be a counterexample to the hypothesis.
The word `\emph{malproksime}' is an exception to the otherwise successful
hypothesis that measure phrases require positive, non-antonym forms. There are no
measure phrases of meters with `\emph{proksime}' (nearby).

Perhaps conceptually `\emph{malproksime}' is positive, and conceptual
considerations take priority over morphological ones. Interestingly, the word
from which `\emph{malproksime}' is derived, `\emph{proksima}' is defined as
``Apartigita per malgranda distanco''\footnote{Reta Vortaro de Esperanto,
\url{www.reta-vortaro.de}, a multi-lingual dictionary of Esperanto maintained
by volunteers.} (separated by a small distance), so its definition references a
negative antonym. Looking at the frequencies of `\emph{proksime}' and
`\emph{malproksime}' reveals another insight: they are almost the same, the
former occurring 560 times, the latter 569 times. This confirms the irregular
nature of the word, because other antonym pairs conform to the expectation that
one is more frequent than the other.

It can be concluded that the typical measure phrase of `quantity measure
adjective' occurs only with non-\emph{mal}-adjectives, as would be expected
from the hypothesis. Except for \emph{malproksime} / \emph{malproksimeco}, this
also goes for measure phrases with verbs and nouns. 


\subsection{Ratio modifiers}

In Esperanto `twice as \textsc{adj}' and `x times as \textsc{adj}' are both
expressed using the same regular suffix, -\emph{oble}, attached to a numeral
and followed by the comparative-forming word `\emph{pli}'. Examples:

\ex. la distanco al Stokholmo estas multoble pli longa. \\
(the distance to Stockholm is many times longer)

\ex. Lia loĝejo efektive estis dudekoble pli granda, ol la loĝejo de la muso \\
(His living quarters were actually twenty times as big as those of the mouse.)

Without the comparative forming `\emph{pli}', what is expressed is often not a
degree, but more something along the lines of `for two distinct reasons':

\ex. ĉar pirati ies plum-frukton sen ties permeso kaj poste eĉ nei sian
misfaron estas konduto duoble neakceptebla. \\ 
(because plagiarizing someone's
pen labor and later even denying one's mistake is conduct twice unacceptable)

\ex. Mi estas duoble bonŝanca, ĉar mi naskiĝis kiel anglo [...] kaj judo. \\
(I am twice lucky, because I was born as an Englishman [...] and Jew.)

Because of this I will restrict the following counts to those with
``\emph{pli},'' which guarantees that results are only about gradable
adjectives.

Of the 88 matches (see table \ref{obleadj}) in the large corpus for the phrase
`twice as \textsc{adj}' and `x times as \textsc{adj}' in Esperanto, only 5 of them were with a
\emph{mal}-antonym:

\ex. \label{fort} ``estas ankoraŭ dekoble pli malforta'' (``was still ten times weaker'')

\ex. \label{bon} ``tio estis centoble pli malbona ol'' \\
 ``that was a hundred times worse than'' 

\ex. \label{facil} ``multoble pli malfacilan'' (``many times harder'')

\ex. \label{ghoj} ``Tio estis unu el tiuj ridetoj, kiuj estas milionoble pli malĝojaj ol larmoj'' \\
        That was one of those smiles that are a million times sadder than tears

\ex. \label{felich} ``oni trovas ke la reĝo estas sepcent-dudek-naŭ-oble pli feliĉa ol la 
        tirano kaj la tirano samoble pli malfeliĉa ol la reĝo.'' \\
 ``one finds that the king is seven hundred and twenty nine times happier than
the tyrant and that the tyrant is the same number of times more unhappy than
the king.''


\begin{table}\begin{center}
\begin{tabular}{lp{1cm}p{1cm}p{1cm}p{1cm}rrr}
			& A	& B	& C	& D	& A/B	& C/D	& $\frac{A/B}{A/B+C/D}$ \\
adj			& \emph{-oble pli} \textsc{adj} & \emph{pli} \textsc{adj}&  \emph{-oble pli mal-}\textsc{adj}  & \emph{pli mal}-\textsc{adj} &  &  &  \\
\hline
\emph{granda} (big)            & 37    & 758   & 0      & 64    &  4.8813 \%   & 0 \%      & 100 \% \\
\emph{bela} (beautiful)        & 11    & 149   & 0      & 2     &  7.3826 \%   & 0 \%      & 100 \% \\
\emph{multa} (much)            & 5     & 121   & 0      & 4     &  4.1322 \%   & 0 \%      & 100 \% \\
\emph{longa} (long)            & 5     & 111   & 0      & 12    &  4.5045 \%   & 0 \%      & 100 \% \\
\emph{alta} (high)             & 4     & 227   & 0      & 37    &  1.7621 \%   & 0 \%      & 100 \% \\
\emph{fia} (shameful)          & 3     & 5     & 0      & 0     & 60.0000 \%   & -         & - \\
\emph{feliĉa} (happy)          & 3     & 53    & 1      & 11    &  5.6604 \%   & 9.0909 \% & 38.3721 \% \\
\emph{bona} (good)             & 2     & 525   & 1      & 99    &  0.3810 \%   & 1.0101 \% & 27.3859 \% \\
\emph{potenca} (powerful)      & 2     & 52    & 0      & 0     &  3.8462 \%   & -         & - \\
\emph{ĉarma} (charming)        & 2     & 11    & 0      & 0     & 18.1818 \%   & -         & - \\
\emph{forta} (strong)          & 1     & 182   & 1      & 15    &  0.5495 \%   & 6.6667 \% & 7.6142 \% \\
\emph{saĝa} (wise)             & 1     & 51    & 0      & 6     &  1.9608 \%   & 0 \%      & 100 \% \\
\emph{aĝa} (aged)              & 1     & 169   & 0      & 0     &  0.5917 \%   & -         & - \\
\emph{dika} (thick/fat)        & 1     & 12    & 0      & 8     &  8.3333 \%   & 0 \%      & 100 \% \\
\emph{oportuna} (opportune)    & 1     & 20    & 0      & 0     &  5.0000 \%   & -         & - \\
\emph{ofta} (frequent)         & 1     & 20    & 0      & 9     &  5.0000 \%   & 0 \%      & 100 \% \\
\emph{luma} (bright)           & 1     & 8     & 0      & 7     & 12.5000 \%   & 0 \%      & 100 \% \\
\emph{grandioza} (enormous)    & 1     & 6     & 0      & 0     & 16.6667 \%   & -         & - \\
\emph{efika} (efficient)       & 1     & 21    & 0      & 0     &  4.7619 \%   & -         & - \\
\emph{distanca} (distant)      & 1     & 1     & 0      & 0     & 100.0000 \%  & -         & - \\
\emph{ĝoja} (joyful)           & 0     & 5     & 1      & 5     &       0 \%   & 20.0000 \% & 0 \% \\
\emph{facila} (easy)           & 0     & 67    & 1      & 44    &       0 \%   & 2.2727 \%  & 0 \% 
\end{tabular}
\caption{All occurrences of `-oble \textsc{adj}' (twice / x times \textsc{adj})}
\label{obleadj}
\end{center}\end{table}

The remaining 83 matches were with non-\emph{mal}-antonyms, apparently conforming to the
expectation that `twice' and `x times' prefer positive items. However, the
overall proportion of \emph{mal}-antonyms among adjectives (6.2261 \%) already
predicts only about 5.5 occurrences among 88 matches. 
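
The arithmetic behind this estimate can be reconstructed as follows, assuming
that it applies the overall share of \emph{mal}-antonyms among adjectives in
the large corpus (6.2261 \%) to the 88 ratio modifier matches:

\[
88 \times 0.062261 \approx 5.48 \approx 5.5 .
\]

On this reading the 5 observed occurrences are close to what the base rate
alone would predict.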

But of the 5 occurrences that were found, \ref{felich} could be discounted
because the antonyms have the discourse function of contrast, and \ref{bon} could
be discounted because good/bad are outliers in other languages as well (see,
e.g., Sassoon (to appear), who reports that `twice as bad' is twice as
frequent as `twice as good').  The adjective in \ref{facil} is conceptually
positive (in the BNC `twice as hard' has 22 matches, `twice as easy' zero). The
remaining two antonyms in \ref{ghoj} and \ref{fort} seem to be genuinely
negative. Looking at the frequencies of the last three adjectives shows
that for \emph{facila} and \emph{ĝoja}, the corresponding \emph{mal}-antonyms are more
frequent than their base forms, and the distribution of \emph{feliĉa} and
\emph{malfeliĉa} roughly corresponds to the distribution of 3 to 1 in
table \ref{obleadj}. This means that these exceptions are not just exceptional
with ratio modifiers, but are generally atypical.
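
To make the last column of table \ref{obleadj} concrete, the entry for
\emph{feliĉa} can be worked out from the counts in that row:

\[
\frac{A}{B} = \frac{3}{53} \approx 5.6604\,\%, \qquad
\frac{C}{D} = \frac{1}{11} \approx 9.0909\,\%, \qquad
\frac{A/B}{A/B + C/D} \approx \frac{5.6604}{5.6604 + 9.0909} \approx 38.37\,\% .
\]

A value above 50 \% means that the ratio modifier is relatively more frequent
with the base form than with the \emph{mal}-form, and a value below 50 \% means
the reverse. With counts this small the percentages should be read as
indicative only.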

All in all, the results support the hypothesis, with a few exceptions.
%TODO: more.

\subsection{Nominalizations}

From the table of adjectives and their antonyms we can study the frequency of
their nominalizations. There are two possible nominalizations: directly
affixing a noun ending (-\emph{o}) to the root, or using the affix -\emph{eco},
denoting an abstract quality.  The affix -\emph{eco} serves to emphasize and
restrict the meaning to an abstract quality. It is comparable to the English
suffixes -ness, -ity and -ship. For adjectival roots the suffix is in some
cases superfluous, e.g., size can be expressed both as `\emph{grando}' and
`\emph{grandeco},' whereas `greatness' can only be expressed by
`\emph{grandeco}'. For noun-roots the suffix differentiates from the normal
meaning, as with `\emph{homo}' (human) and `\emph{homeco}' (humanity). The
advice in grammar textbooks is to use the suffix only when necessary. Some
examples:

\begin{itemize}
\addtolength{\itemsep}{-.35\baselineskip}
\item granda (big) $\rightarrow$ grando (size), grandeco (size, greatness)
\item longa (long) $\rightarrow$ longo (length), longeco (length / longness)
\item homo (human/\textsc{noun}) $\rightarrow$ homa (human/\textsc{adj}),  homeco (humanity)
\end{itemize}

%gutenberg corpus
\begin{comment}
\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrrlrrrr}
      &  -\emph{o}  & -\emph{eco}    & sum   & nom/adj  & ant. & -\emph{o} & -\emph{eco} & sum & nom/adj \\
\emph{grand} &  6   & 45      & 51    & 0.036691 &      & 0  & 2    & 2   & 0.004158 \\
\emph{tut}   &  15  & 1       & 16    & 0.015640 &      & 0  & 0    & 0   & -        \\ 
\emph{bon}   &  56  & 24      & 80    & 0.091743 &      & 72 & 3    & 75  & 0.378788 \\
\emph{kar}   &  0   & 7       & 7     & 0.014170 &      & 0  & 1    & 1   & 0.111111 \\
\emph{sam}   &  22  & 2       & 24    & 0.051064 &      & 1  & 4    & 5   & 0.172414 \\
\emph{bel}   &  5   & 57      & 62    & 0.112523 &      & 0  & 3    & 3   & 0.071429 \\
\emph{jun}   &  0   & 49      & 49    & 0.128272 &      & 0  & 7    & 7   & 0.025547 \\
\emph{long}  &  14  & 13      & 27    & 0.056604 &      & 0  & 1    & 1   & 0.008065 \\
\emph{nov}   &  0   & 3       & 3     & 0.006024 &      & 0  & 0    & 0   & - \\
\emph{alt}   &  10  & 43      & 53    & 0.203065 &      & 0  & 2    & 2   & 0.030303
\end{tabular}
\caption{Nominalizations in gutenberg corpus. The column `nom/adj' lists the
ratio of nominalizations (both kinds) to the frequency of the corresponding
adjective, as reported in table \ref{adjgut}}
\label{nomgut}
\end{center}\end{table}
\end{comment}

%tekstaro corpus
\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrrlrrrr}
      &  -\emph{o}  & -\emph{eco} & sum & nom/adj  & ant.& -\emph{o} & -\emph{eco} & sum & nom/adj \\
\emph{grand} & 12   & 156     & 168     & 0.020588 &    & 2  & 17   & 19  & 0.005954 \\
\emph{tut}   & 164  & 16      & 180     & 0.028068 &    & 0  & 0    & 0   & -       \\ 
\emph{bon}   & 452  & 174     & 626     & 0.131044 &    & 698 & 60  & 758 & 0.561897 \\
\emph{kar}   & 0    & 2       & 2       & 0.001919 &    & 0  & 2    & 2   & 0.066667 \\
\emph{sam}   & 190  & 5       & 195     & 0.072626 &    & 3  & 15   & 18  & 0.067164 \\
\emph{bel}   & 124  & 319     & 443     & 0.159009 &    & 8  & 10   & 18  & 0.083333 \\
\emph{jun}   & 6    & 224     & 230     & 0.087187 &    & 3  & 80   & 83  & 0.064441 \\
\emph{long}  & 143  & 49      & 192     & 0.098462 &    & 0  & 4    & 4   & 0.006814 \\
\emph{nov}   & 7    & 13      & 20      & 0.004540 &    & 1  & 5    & 6   & 0.004376 \\
\emph{alt}   & 104  & 111     & 215     & 0.129518 &    & 0  & 2    & 2   & 0.005618
\end{tabular}
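\caption{Nominalizations in the Tekstaro corpus. The column `nom/adj' lists the
ratio of nominalizations (both kinds) to the frequency of the corresponding
adjective; the columns after `ant.' give the same counts for the
\emph{mal}-antonym.}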
\label{nomtek}
\end{center}\end{table}

See table \ref{nomtek} for the nominalization counts in the larger corpus.
For most of the adjectives the basic form has a higher ratio of nominalizations
than the \emph{mal}-antonym. The one notable exception to this is `\emph{malbono}' (bad / evil).
The counts for the other exceptions are too low to draw conclusions: \emph{malkareco}
(cheapness)\footnote{I have omitted from this count one instance where `\emph{karoj}' occurred
as a pronunciation guide for a name, and another where `\emph{karo}' occurred in a
list of word forms not in use.}
and \emph{malsameco} (difference). For the latter this is probably related to the fact
that `difference' is preferably expressed as `\emph{diferenco}' in Esperanto.
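
As an illustration of how the `nom/adj' column in table \ref{nomtek} is
obtained, the ratios for \emph{grand}- can be recomputed from the counts,
assuming that the adjective frequencies are those listed for \emph{granda} and
\emph{malgranda} in table \ref{compb} (8160 and 3191):
\[
\frac{12 + 156}{8160} \approx 0.0206, \qquad \frac{2 + 17}{3191} \approx 0.0060 .
\]
Relative to its frequency, the base form is thus nominalized roughly three and
a half times as often as its \emph{mal}-antonym.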

Comparing the ratios of nominalizations of \emph{mal}-adjectives and
non-\emph{mal}-adjectives does not indicate a significant difference. However,
if we leave out the more irregular words \emph{tut}- and \emph{kar}- because
they have zero counts, and \emph{bon}- for which the antonym has a higher count
than the non-antonym, the difference is significant after all (Wilcoxon test,
p-value = 0.01563). This supports the hypothesis that nominalizations of
\emph{mal}-marked antonyms are less frequent than those of their base forms.
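
The reported p-value can be reconstructed under the assumption that the test
was an exact, two-sided Wilcoxon signed rank test on the seven remaining pairs
(\emph{grand}-, \emph{sam}-, \emph{bel}-, \emph{jun}-, \emph{long}-,
\emph{nov}-, \emph{alt}-). In table \ref{nomtek} the base form has the higher
ratio in each of these pairs, so the rank sum of the negative differences is
zero. Under the null hypothesis this is the most extreme of the $2^7$ equally
likely sign assignments:
\[
p = 2 \times \frac{1}{2^{7}} = \frac{1}{64} \approx 0.0156 .
\]
With seven pairs this is the smallest p-value such a test can yield, so the
outcome is sensitive to which words are excluded.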

Another phenomenon is that the \emph{mal}-antonyms apparently favor the use of the
-\emph{eco} suffix, except for the outliers `\emph{malbono}' (the bad) and `\emph{malbelo}' (the
ugly). This may be because some of the nominalizations without -\emph{eco}
solely express a neutral concept like length or size, which does not have a
common-sense antonym, while the nominalizations with -\emph{eco} can additionally
express the presence of the quality or characteristic described by the
adjective from which they are derived, which does have an antonym. However, a test
for statistical significance does not produce a significant result for the
ratios between -\emph{o} and -\emph{eco} nominalizations given \emph{mal}- or
non-\emph{mal}-adjectives (Wilcoxon signed rank test, p = 0.327), so either the
effect is not strong enough or the sample is too small.

From the attested nominalizations of antonyms it is clear that Esperanto is
more systematic and free in its word formation than a language such as English.
In English, words such as *smallness, *oldness and *lowness are not allowed, while size,
youth and height are, although they do not derive from productive affixes.

It can be concluded that, barring a few exceptions, Esperanto does accord with the
pattern that nominalizations of unmarked adjectives are more frequent than
those of marked antonyms, and that this is probably related to the fact that
nominalizations of non-\emph{mal}-adjectives are additionally tasked to express
neutral concepts, as opposed to only positively expressing the presence of the
quality denoted by the adjectival root.

\section{Additional data}

\subsection{Comparatives with and without antonyms}

\begin{table}[htbp] \begin{center}
\begin{tabular}{lrrr}
			 & A		& B		& B/(A+B) \\
			 & \emph{pli} (more) & \emph{malpli} (less) & ratio \\
\hline
non-\emph{mal}-adjective   & 7459       & 1004          & 11.8634 \% \\
\emph{mal}- adjective      & 616        & 12            & 1.91082 \% \\
total			 & 8075	      & 1016	      & 11.1758 \% \\
ratio ant./total	 & 7.6284 \%    & 1.1811 \%   & %^^ 14.60133 \%
\end{tabular}
\caption{More- and less-comparatives with and without antonyms, from
Tekstaro corpus}
\label{comp}
\end{center}\end{table}

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrr}
			 & A		& B		& B/(A+B) \\
                & more \textsc{adj} / comparative \textsc{adj}    & less \textsc{adj}  & ratio \\
\hline
`un-' adjective & 669 / 28	                & 135	    & 16.22596 \% \\
any adjective   & 38879 / 188582                & 7293      &  3.10665 \%  \\
ratio un-/total & 0.30642 \%			& 1.85109 \% &
\end{tabular}
\caption{Comparatives for English in the BNC}
\label{engcomp}
\end{center}\end{table}

Table \ref{comp} shows counts of the four possible configurations of
comparatives in Esperanto. A chi-squared test shows that the results are highly
significant (p $<$ 0.001), which means that the choice between
`\emph{pli}' and `\emph{malpli}' is not independent of whether the adjective is
a \emph{mal}-antonym. There is a strong tendency towards comparisons with
`more' (89\%), and an even stronger tendency towards comparisons with
non-antonyms (93\%).
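
A sketch of the computation, assuming a Pearson chi-squared test of
independence without continuity correction on the $2 \times 2$ counts in table
\ref{comp}: the expected count for each cell is the product of its row and
column totals divided by the grand total of 9091. For \emph{mal}-adjectives
with \emph{malpli} this gives
\[
E = \frac{628 \times 1016}{9091} \approx 70.2 ,
\]
against an observed count of 12. Summing over the four cells,
\[
\chi^2 = \sum \frac{(O - E)^2}{E} \approx 0.45 + 3.58 + 6.07 + 48.24 \approx 58.3 ,
\]
which is well above the critical value of 10.83 for $p = 0.001$ with one degree
of freedom. Most of the statistic comes from the scarcity of \emph{malpli} with
\emph{mal}-adjectives.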

If we consider the ratio of antonym versus non-antonym adjectives, which is
about 6.2\%, we find that the number of more/\emph{pli} comparisons with
antonyms, 616, exceeds the expected value by just one fifth given this ratio.
Antonyms with less/\emph{malpli} comparisons, on the other hand, occur more than
five times less often than expected, possibly due to the duplication of
\emph{mal}-: `less unhappy' is `\emph{malpli malfeliĉa}' in Esperanto. Due to
the systematic application of \emph{mal}- it might be the case that the two
occurrences of \emph{mal}- are dropped as a sort of `double negation
elimination.'
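
The expected values can be made explicit, assuming they are obtained by
applying the 6.2\% ratio to the column totals of table \ref{comp}:
\[
0.062 \times 8075 \approx 501, \qquad \frac{616}{501} \approx 1.23 ;
\]
\[
0.062 \times 1016 \approx 63, \qquad \frac{63}{12} \approx 5.25 .
\]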

Table \ref{engcomp} compares this with English, with counts from the BNC.

From the ratios it appears that English more strongly prefers more-comparisons
to less-comparisons than Esperanto: in English 97\% of comparisons are with
`more', in Esperanto 89\%.  Perhaps this is because of the asymmetry in English
of having a suffix for more-comparatives (harder) but not for the opposite
(less hard). 

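If we take the English prefix un- as a representative antonym marker, we can
compare comparatives with antonyms in English and Esperanto. Here it appears
that Esperanto more strongly prefers positive (`more') comparisons given the
prefix: only 1.9\% of comparatives with \emph{mal}-adjectives use
\emph{malpli}, against 16.2\% `less' comparatives with un-adjectives in
English. This is probably due to the already mentioned duplication of
\emph{mal}-.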

Table \ref{compb} lists a selection of counts of word types in the first 1000
comparatives, namely those that did not have zero counts for the
\emph{mal}-adjectives. The last column lists the ratio of
non-\emph{mal}-adjectives to \emph{mal}-adjectives, normalized for frequency.
The ratios vary widely, so these counts show no clear markedness effect in
comparatives, unlike the data in the previous sections.
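
To make the last column of table \ref{compb} concrete: $A/B$ is the number of
\emph{pli}-comparatives divided by the frequency of the base adjective, and
$C/D$ is the same quantity for the \emph{mal}-adjective. For \emph{granda} the
value works out as
\[
\frac{109/8160}{109/8160 + 10/3191} \approx \frac{0.01336}{0.01336 + 0.00313} \approx 0.81 ,
\]
while for \emph{profunda} it is
\[
\frac{2/676}{2/676 + 2/19} \approx 0.027 .
\]
A value of 50\% would mean that both forms are compared equally often relative
to their frequency.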

\begin{table}\begin{center} \begin{tabular}{lp{1cm}p{1cm}p{1cm}p{1cm}rrr}
	& A	& B	& C	& D	& A/B	& C/D	& $\frac{A/B}{A/B+C/D}$ \\
        & \emph{pli} \textsc{adj}	&	\textsc{adj}	&	\emph{pli mal}-\textsc{adj}	&	\emph{mal}-\textsc{adj}	&	&	&	\\ \hline
profunda (deep)	&	2	&	676	&	2	&	19	&	0.00296	&	0.10526	&	2.734 \%	\\
dika (thick/fat)	&	3	&	367	&	3	&	205	&	0.00817	&	0.01463	&	35.839 \%	\\
laŭta (loud)	&	3	&	207	&	2	&	117	&	0.01449	&	0.01709	&	45.882 \%	\\
longa (long)	&	4	&	1950	&	4	&	587	&	0.00205	&	0.00681	&	23.138 \%	\\
vasta (vast)	&	4	&	570	&	2	&	108	&	0.00702	&	0.01852	&	27.481 \%	\\
facila (easy)	&	6	&	383	&	6	&	521	&	0.01567	&	0.01152	&	57.633 \%	\\
saĝa (wise)	&	6	&	544	&	2	&	262	&	0.01103	&	0.00763	&	59.098 \%	\\
kara (precious)	&	7	&	1042	&	4	&	30	&	0.00672	&	0.13333	&	 4.797 \%	\\
riĉa (rich)	&	10	&	735	&	3	&	455	&	0.01361	&	0.00659	&	67.358 \%	\\
proksima (near)	&	11	&	587	&	13	&	462	&	0.01874	&	0.02814	&	39.975 \%	\\
forta (strong)	&	18	&	1199	&	2	&	365	&	0.01501	&	0.00548	&	73.260 \%	\\
juna (young)	&	32	&	2638	&	19	&	1288	&	0.01213	&	0.01475	&	45.125 \%	\\
alta (high)	&	40	&	1660	&	11	&	162	&	0.02410	&	0.06790	&	26.192 \%	\\
bona (good)	&	52	&	4777	&	18	&	1349	&	0.01089	&	0.01334	&	44.928 \%	\\
granda (big)	&	109	&	8160	&	10	&	3191	&	0.01336	&	0.00313	&	80.998 \%	\\
\end{tabular}
\caption{Breakdown of some of the first 1000 comparatives with \emph{pli}.}
\label{compb}
\end{center}
\end{table}

%\subsection{Equatives}
Similar observations can be made for equatives. Of the 82 matches for
`\emph{same} \textsc{adj} \emph{kiel}' (as \textsc{adj} as) in Esperanto, 7 are
with \emph{mal}-words. This is about two fifths more than the count expected
from the average frequency of \emph{mal}- with adjectives (5.1). Table
\ref{eqadj} summarizes some of the counts. Contrary to the comparatives, here it
does seem to be the case that non-\emph{mal}-words are usually preferred.
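
The expected count can be reconstructed, assuming it is based on the 6.2\%
share of \emph{mal}-adjectives that was used for the comparatives above:
\[
0.062 \times 82 \approx 5.1, \qquad \frac{7}{5.1} \approx 1.37 .
\]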
%explain?


\begin{table}\begin{center}
\begin{tabular}{lp{1cm}p{1cm}p{1cm}p{1cm}rrr}
			& A	& B	& C	& D	& A/B	& C/D	& $\frac{A/B}{A/B+C/D}$ \\
adj			& \emph{same} \textsc{adj} \emph{kiel} & \textsc{adj}&  \emph{same mal-}\textsc{adj}  \emph{kiel} & \emph{mal}-\textsc{adj} &  &  &  \\
\hline
granda (big)            & 4     & 8160  & 0      & 3191  & 0.000490      & 0.000000      & 100 \% \\
bona (good)             & 4     & 4777  & 0      & 1349  & 0.000837      & 0.000000      & 100 \% \\
bela (beautiful)        & 3     & 2786  & 3      & 216   & 0.001077      & 0.013889      & 7.1952 \% \\
alta (high)             & 3     & 1660  & 0      & 356   & 0.001807      & 0.000000      & 100 \% \\
trankvila (tranquil)    & 1     & 538   & 1      & 162   & 0.001859      & 0.006173      & 23.1429 \% \\
gaja (cheerful)         & 0     & 557   & 1      & 289   & 0.000000      & 0.003460      & 0.0 \% \\
nova (new)              & 0     & 4405  & 1      & 1371  & 0.000000      & 0.000729      & 0.0 \% \\
supera (superior)       & 0     & 245   & 1      & 45    & 0.000000      & 0.022222      & 0.0 \% \\
\end{tabular}
\caption{Some occurrences of `\emph{same} \textsc{adj} \emph{kiel}', including all \emph{mal}-words, excluding
63 (non-\emph{mal}) adjectives occurring only once.} 
\label{eqadj}
\end{center}\end{table}


%Does the mal- prefix have different senses?
%Distributional data

%No, all of its use is morphologically transparent.

\subsection{Degree modifiers}

Kennedy \& McNally (2005) present a table demonstrating that the words which
the degree modifiers `very,' `well' and `much' apply to are largely
complementary.  Table \ref{degree} shows similar counts for Esperanto.

% in Tekstaro corpus
\begin{table}\begin{center}
\begin{tabular}{lrrr}
			& bone		& tre / treege 	& multe \\
			& (well)	& (very / VERY) & (much) \\
\hline
informita (informed) 	& 8 		& 0		 & 0	\\
edukita (educated) 	& 17	 	& 0	  	 & 0	\\
konata (known) 		& 57 		& 15		 & 2	\\
ŝatata (liked)		& 0		& 12		 & 1	\\
surprizita (surprised)	& 0		& 6	      	 & 1	\\
konfuzita (confused)	& 0		& 5	      	 & 0	\\
bezonata (needed)	& 0		& 5	      	 & 0	\\
uzata (used)		& 2		& 1	      	 & 4    \\
influita (influenced)	& 0		& 0	      	 & 3
\end{tabular}
\caption{Distribution of degree modifiers with deverbal adjectives in Tekstaro corpus}
\label{degree}
\end{center}\end{table}

It appears that the distributions of `well' and `very' are complementary in
much the same way as in English. For the word corresponding to `much', however,
things are not as clear-cut: judging from the small survey I performed,
\emph{multe} does not seem to be a frequently used degree modifier.


% experiment: took all sentences with a mal-word and those without,
%	stored them as sets of words (minus the mal-words),
% 	trained naive bayes classifier to predict whether sentence
%	was a sentence with or without a mal-word.
%	test sentences (20%) of total scored 49% correct (less than chance).


\section{Conclusion}

The research question can be answered in the affirmative: 
it does seem to be the case that in Esperanto antonym morphology co-occurs 
with all the symptoms of negative polarity adjectives that we reviewed. This
can be attributed to the fact that most of the adjectival roots in Esperanto
are conceptually positive, leading the corresponding antonym to be negative;
but it also strengthens the case that Esperanto behaves very much like other
natural languages.

However, and perhaps surprisingly, Esperanto is not more systematic in this
regard than English. There are exceptions such as the words `\emph{facila}'
(easy) and `\emph{proksima}' (nearby), which are conceptually negative in most
languages. But these exceptions are not confined to a single test of
negativity; they diverge from the other predictions for marked antonyms as well.

The case for \emph{mal}-antonyms and negativity could be strengthened in
future research by questionnaires, which should study neutralization in
questions, evaluativity and entailments.

%Furthermore it appears that the application of antonym morphology is fully
%systematic and transparant.

\subsection{References}

\begin{description}
\item[Baayen, R. H. \& Lieber, R.] (1997).
	``Word Frequency Distributions and Lexical Semantics.''
	\emph{Computers and the Humanities}, vol. 30, pp. 281--291.

\item[Davies, M.] (2004-) `BYU-BNC: The British National Corpus.' 
	Available online at \url{http://corpus.byu.edu/bnc}

\item[Hay, J. \& Baayen, R. H.] (2002).
	``Parsing and productivity.''
	In Booij, G. \& van Marle, J. (eds.), \emph{Yearbook of Morphology 2001}.
	Kluwer Academic Publishers, pp. 203--235.

\item[Heim, I.] (2008). ``Decomposing Antonyms?''
	\emph{Proceedings of Sinn und Bedeutung 12}. Oslo.

\item[Jansen, W.] (2007). ``Woordvolgorde in het Esperanto: normen, taalgebruik
	en universalia'' (Word-order in Esperanto: norms, usage and universals). 
	PhD thesis, LOT Utrecht.

\item[Kennedy, C.] (2001). 
	``Polar opposition and the ontology of `degrees'\,''.
	\emph{Linguistics and Philosophy}, vol. 24, pp. 33--70.

\item[Kennedy, C. \& McNally, L.] (2005). 
	``Scale structure, degree modification, 
	and the semantics of gradable predicates.'' 
	\emph{Language}, vol. 81, number 2.

\item[Lehrer, A.] (1985). 
	``Markedness and Antonymy.''
	\emph{Journal of Linguistics}, vol. 21, no. 2, pp. 397--429.

\item[Sassoon, G. W.] (2010).
	``The Degree Functions Of Negative Adjectives.''
	\emph{Natural Language Semantics}, to appear.

\item[Wennergren, B.] (2003-). ``Tekstaro de Esperanto.'' 
	Esperantic Studies Foundation. Available online at \url{www.tekstaro.com}

\end{description}

\end{document}