Classifier details

Basic data
Classifier ID: 73
Type: Markov
Size of n-grams: 3
Title: Language detection
Description: Detects 75 different languages. Trained with short abstracts from DBpedia.
Status: checked
Language: n.a.
Created: 2014-02-13 19:31:06 MET
Distinct texts: 74999
Precision: 97.2746% (sample standard deviation: 0.17644%)
Precision has been measured with 10-fold cross-validation.
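
The page gives only the classifier type (Markov), the n-gram size (3) and the evaluation method (10-fold cross-validation), not the implementation. The following is a minimal sketch of how such a setup could look, not the service's actual code. It assumes character-level trigrams, additive smoothing, and per-fold scoring as the fraction of correctly labeled held-out texts. The function names, the `alpha` value and the `alphabet_size` constant are all hypothetical.

```python
import math
import statistics
from collections import Counter, defaultdict

N = 3  # n-gram size, as listed under "Size of n-grams"


def ngrams(text, n=N):
    """Return character n-grams; left-pad so the first characters get a context."""
    padded = " " * (n - 1) + text.lower()
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples):
    """samples: list of (text, label). Count n-grams and their (n-1)-char contexts."""
    grams = defaultdict(Counter)
    contexts = defaultdict(Counter)
    for text, label in samples:
        for g in ngrams(text):
            grams[label][g] += 1
            contexts[label][g[:-1]] += 1
    return grams, contexts


def classify(model, text, alpha=0.1, alphabet_size=256):
    """Pick the label that maximizes the sum of log P(char | previous n-1 chars).

    alpha and alphabet_size are assumed smoothing parameters, not page values.
    """
    grams, contexts = model
    best_label, best_score = None, -math.inf
    for label in grams:
        score = 0.0
        for g in ngrams(text):
            num = grams[label][g] + alpha
            den = contexts[label][g[:-1]] + alpha * alphabet_size
            score += math.log(num / den)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(samples, k=10):
    """Return (mean, sample standard deviation) of the per-fold hit rate.

    Needs k >= 2 and at least k samples so that no fold is empty.
    """
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train(training)
        hits = sum(classify(model, text) == label for text, label in held_out)
        scores.append(hits / len(held_out))
    return statistics.mean(scores), statistics.stdev(scores)
```

Under these assumptions, calling `cross_validate(samples, k=10)` on a list of `(text, label)` pairs would yield a mean and a sample standard deviation of the same kind as the two figures reported above. `statistics.stdev` uses the n-1 denominator, which matches the "sample standard deviation" wording.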

Training class distribution

Precision per class

Class Precision % of mistakes
en 100.0% 12.182%
af 100.0% 3.96282%
ka 100.0% 0.0%
ml 100.0% 0.0%
el 100.0% 0.0%
fa 100.0% 0.782779%
gd 100.0% 0.0%
he 99.9% 0.0%
ta 99.9% 0.0%
mg 99.9% 0.0%
hy 99.9% 0.0489237%
bg 99.9% 1.27202%
kk 99.9% 0.0%
bn 99.9% 0.0%
cs 99.9% 10.3718%
ar 99.8% 0.244618%
lt 99.8% 0.0978474%
th 99.8% 0.0978474%
lv 99.8% 0.146771%
fi 99.8% 1.0274%
is 99.8% 0.0489237%
hu 99.8% 0.0489237%
my 99.8% 0.195695%
kn 99.7% 0.0%
te 99.7% 0.0%
ru 99.7% 1.07632%
eo 99.7% 0.0489237%
vi 99.7% 0.0%
ga 99.6% 0.0489237%
pl 99.6% 0.0978474%
uk 99.6% 0.0489237%
br 99.6% 0.0%
cy 99.6% 0.391389%
gu 99.5% 0.0%
de 99.5% 0.880626%
eu 99.5% 0.0%
be 99.5% 0.0%
et 99.5% 0.489237%
ky 99.5% 0.0489237%
fr 99.5% 1.07632%
gl 99.4% 2.6908%
tr 99.4% 2.39726%
it 99.3% 2.64188%
es 99.3% 5.91977%
am 99.3% 0.0489237%
ne 99.1% 0.342466%
da 99.1% 2.39726%
sq 99.1% 0.0%
mr 99.0% 0.53816%
sw 99.0% 0.195695%
tg 98.8% 0.0%
yo 98.8% 0.0489237%
ur 98.7% 0.0978474%
vo 98.7% 0.0489237%
nn 98.6% 1.12524%
hi 98.4% 0.0%
ro 98.3% 0.0489237%
ca 98.2% 0.195695%
uz 98.1% 0.0489237%
mk 98.0% 1.17417%
id 97.9% 13.3072%
pt 97.6% 0.293542%
ko 97.5% 0.0489237%
sv 96.9% 1.61448%
sl 96.5% 0.684932%
ja 94.5% 1.07632%
nl 93.7% 0.146771%
az 93.1% 0.0489237%
sr 90.4% 0.0978474%
ia 85.2% 0.0%
hr 80.6807% 14.9217%
zh 79.6% 0.0%
sk 79.0% 0.146771%
ms 73.0% 0.684932%
bs 72.8% 12.2309%
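
The "% of mistakes" column is not defined on the page. One reading that fits the numbers is that it gives each class's share of all misclassified texts: the column sums to roughly 100%, and the smallest nonzero entry (0.0489237%) is about 1/2044, which matches the error count implied by the overall precision. The short check below works through that arithmetic. It is an interpretation under that assumption, and `mistakes_for` is a hypothetical helper.

```python
distinct_texts = 74999        # "Distinct texts" from the basic data
overall_precision = 0.972746  # "Precision" from the basic data
smallest_share = 0.0489237    # smallest nonzero "% of mistakes", in percent

# Total errors implied by the overall precision.
total_mistakes = round(distinct_texts * (1 - overall_precision))

# Total errors implied if the smallest share corresponds to exactly one error.
implied_total = round(100 / smallest_share)


def mistakes_for(share_percent, total=total_mistakes):
    """Convert a "% of mistakes" entry into an absolute error count (assumed reading)."""
    return round(share_percent / 100 * total)


print(total_mistakes, implied_total)  # both 2044
print(mistakes_for(14.9217))          # hr: 305
print(mistakes_for(13.3072))          # id: 272
print(mistakes_for(12.2309))          # bs: 250
```

Read this way, hr, id, en, bs and cs together account for well over half of the roughly 2044 errors.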

(c) 2017 netEstate GmbH