Impact, Characteristics, and Detection of Wikipedia Hoaxes

Wikis are ubiquitous in organisational and private use and provide a wealth of textual data. Maintaining the currency of this textual data is important and ...

TS Wikipedia Corpus - LDC Catalog
ABSTRACT. Wikipedia is commonly viewed as the main online encyclopedia. Its content quality, however, has often been questioned.
Statistical Measure of Quality in Wikipedia
An n-gram in turn is a substring of n tokens of t, where a token can be a character, a word, or a part-of-speech (POS) tag. The Term Frequency × Inverse ...
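The n-gram definition in the snippet above can be sketched in a few lines. This is a minimal illustration only; the function name and the example inputs are mine, not taken from the cited paper:

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams (tuples of n tokens) of a token sequence.

    Per the definition above, a token may be a character, a word,
    or a POS tag; the function is agnostic to which.
    """
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Character trigrams of a word (a string is a sequence of character tokens):
char_trigrams = ngrams("wiki", 3)        # [('w','i','k'), ('i','k','i')]

# Word bigrams of a short phrase:
word_bigrams = ngrams("the free encyclopedia".split(), 2)
```

The same function covers all three token types, since it only relies on slicing a sequence.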
Identifying Featured Articles in Spanish Wikipedia - SEDICI
... character set ... Word processors or HTML. Markdown was created by John Gruber in 2004 and is the default mechanism for documenting ...
GitHub Wiki Design and Implementation
It introduces the most relevant definitions and the related work for the research fields of semantic relatedness, named entity recognition, word sense ...
Utilising Wikipedia for Text Mining Applications - SciSpace
Model that uses both local (exact matching of n- grams of characters) and distributed (word embeddings) representations to compute a relevance score (Mitra ...
Design and Implementation of the Sweble Wikitext Parser
It presents the design and implementation of a parser for Wikitext, the wiki markup language of MediaWiki. We use parsing expression grammars where most ...
Cross-domain Text Classification using Wikipedia
Abstract: Traditional approaches to document classification require labeled data in order to construct reliable and accurate classifiers.
How-To Wiki - iGEM
Compactness: The single-page WIF needs to encode all structural information of a wiki, e.g. nested lists, headlines, tables, nested paragraphs, emphasised or ...
Towards a Wiki Interchange Format (WIF) - CEUR-WS.org
MediaWiki syntax allows authors to append or prepend text directly to the link to the effect that the pre- or postfix will be rendered as part of the link.
Design and Implementation of Wiki Content Transformations and ...
Articles from the source language Wikipedia are translated into the target language in advance and then transformed into training data TDS. In ...
Lindicle D2.1 Cross-lingual Infobox Alignment
... automatic method, which primarily consists of word labeling and feature vector generation, to generate the training data set TD = {(x, g(x))} from these.
Edinburgh Research Explorer - Transfer Learning Based Cross ...
... cross-modal retrieval learned in a cross-modal way. The titles of Wikipédia articles, which are also likely to contain the nature ...