APERTIUM(1) - Linux manual page online | User commands

apertium(1) apertium(1)


apertium - This application is part of (apertium). This tool is part of the apertium machine translation architecture: http://aper‐


apertium [-d datadir] [-f format] [-u] [-a] {language-pair} [infile [outfile]]


apertium is the application that most people will use, as it simplifies the use of the apertium/lt-toolbox tools for machine translation purposes. This tool tries to ease the use of lt-toolbox (which contains all the lexical processing modules and tools) and apertium (which contains the rest of the engine) by providing a single front-end to the end user.

The modules behind the apertium machine translation architecture are, in order:

· de-formatter: Separates the text to be translated from the format information.
· morphological-analyser: Tokenizes the text into surface forms.
· part-of-speech tagger: Chooses one of the possible analyses of each homograph.
· lexical transfer module: Reads each source-language lexical form and delivers a corresponding target-language lexical form.
· structural transfer module: Detects fixed-length patterns of lexical forms (chunks or phrases) that need special processing because of grammatical divergences between the two languages, and performs the corresponding transformations.
· morphological generator: Delivers a target-language surface form for each target-language lexical form, by suitably inflecting it.
· post-generator: Performs orthographical operations such as contractions and apostrophations.
· re-formatter: Restores the format information encapsulated by the de-formatter into the translated text and removes the encapsulation sequences used to protect certain characters in the source text.
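The ordering of the modules listed above can be sketched as a chain of functions, each consuming the output of the previous one. This is only an illustration of the data flow: the stage bodies are identity placeholders, and the names `make_stage` and `run_pipeline` are hypothetical helpers, not part of apertium or lt-toolbox.

```python
from functools import reduce
from typing import Callable, List

# Stage names in the order given in the description above.
STAGE_NAMES: List[str] = [
    "de-formatter",
    "morphological-analyser",
    "part-of-speech tagger",
    "lexical transfer module",
    "structural transfer module",
    "morphological generator",
    "post-generator",
    "re-formatter",
]


def make_stage(name: str, trace: List[str]) -> Callable[[str], str]:
    """Return a placeholder stage that records its name and passes text through.

    Assumption: a real module would transform the text; this one does not.
    """
    def stage(text: str) -> str:
        trace.append(name)
        return text
    return stage


def run_pipeline(text: str, trace: List[str]) -> str:
    """Feed the text through every stage, first to last."""
    stages = [make_stage(name, trace) for name in STAGE_NAMES]
    return reduce(lambda acc, stage: stage(acc), stages, text)


if __name__ == "__main__":
    visited: List[str] = []
    run_pipeline("hola", visited)
    print(" -> ".join(visited))
```

The point of the sketch is that the de-formatter and re-formatter bracket the linguistic stages, so format information is set aside before analysis and restored only after generation.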


-d datadir
    The directory holding the linguistic data. By default it will use the expected installation path.

language-pair
    The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

-f format
    Specifies the format of the input and output files, which can have these values:
    · txt (default value): Input and output files are in text format.
    · html: Input and output files are in "html" format. This "html" is the one accepted by the vast majority of web browsers.
    · html-noent: Input and output files are in "html" format, but preserving native encoding characters rather than using HTML text entities.
    · rtf: Input and output files are in "rtf" format. The accepted "rtf" is the one generated by Microsoft WordPad (C) and Microsoft Office (C) up to and including Office-97.

-u
    Disable marking of unknown words with the '*' character.

-a
    Enable marking of disambiguated words with the '=' character.


-m memory.tmx
    Use a translation memory to recycle translations.

-o direction
    Translation direction using the translation memory; by default, 'direction' is used instead.

-l
    Lists the available translation directions and exits.

direction
    Typically LANG1-LANG2, but see modes.xml in the language data.

These are the two files that can be used with this command:

infile
    Input file (stdin by default).

outfile
    Output file (stdout by default).
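As a rough illustration of how the synopsis and the arguments described above fit together, the sketch below assembles an argument list for the command without running it. The function name `build_apertium_command` and its parameter names are hypothetical; only the flags -d, -f, -u and -a and the positional order come from this page, and the -m, -o and -l options are deliberately left out.

```python
from typing import List, Optional

# Format names taken from the -f option description on this page.
KNOWN_FORMATS = ("txt", "html", "html-noent", "rtf")


def build_apertium_command(
    language_pair: str,
    fmt: str = "txt",
    datadir: Optional[str] = None,
    mark_unknown: bool = True,
    mark_disambiguated: bool = False,
    infile: Optional[str] = None,
    outfile: Optional[str] = None,
) -> List[str]:
    """Build an argv list following the synopsis; nothing is executed."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    if outfile is not None and infile is None:
        # Both are positional, so outfile cannot be given without infile.
        raise ValueError("outfile requires infile")

    cmd = ["apertium"]
    if datadir is not None:
        cmd += ["-d", datadir]
    cmd += ["-f", fmt]
    if not mark_unknown:
        cmd.append("-u")  # -u disables the '*' mark on unknown words
    if mark_disambiguated:
        cmd.append("-a")  # -a enables the '=' mark on disambiguated words
    cmd.append(language_pair)
    if infile is not None:
        cmd.append(infile)
        if outfile is not None:
            cmd.append(outfile)
    return cmd


if __name__ == "__main__":
    print(" ".join(build_apertium_command("es-ca", fmt="html", mark_unknown=False)))
```

When infile and outfile are omitted, the resulting command reads from stdin and writes to stdout, so the list could be handed to a process launcher with the text supplied on standard input.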


lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).


Lots of...lurking in the dark and waiting for you!


(c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.
2006-03-08 apertium(1)