A Simplified English Grammar and Style Checker/Corrector

Objective

Many companies now operate in an international, multilingual environment. In such an environment, fast and effective communication in the language of the customer is a key factor for success. Using a reduced subset of a language, companies are able to produce consistent, unambiguous and easily understandable handbooks, technical documentation and training material. Technical writers who use such a simplified language would find their task much easier if they could rely on writing tools designed specifically for it. The SECC project intends to develop precisely such a tool, namely a grammar and style checker for simplified/controlled English (SE).

The SECC tool is meant to serve two purposes. In the first place, SECC will be a writing tool for technical writers who have to produce easily readable, unambiguous texts (manuals, for instance) that should be understandable by a wide audience of non-native speakers/writers of English. In the second place, the tool should be able to serve as a front-end or pretranslator for machine translation products, helping to improve translation quality and reduce post-editing work by simplifying the input.

The tool will perform syntactic and lexical (terminological) checking, on all levels of the text. As such, it goes beyond the usual upper boundary of the sentence: the syntax (layout) of paragraphs, sections and overall text will also be checked against the SE rules governing those levels. Special attention will be paid to mistakes by non-native (viz. Dutch, French and German) writers of SE.

The different interfaces for the user will form an important AI/IT subpart of the project. The SECC tool will run both in batch mode (checking of completed texts) and in interactive mode (checking of subparts of a text while it is written), from within the Interleaf 5 DTP package on Sun workstations.

Besides the major objective of developing the tool, the project will also set up an industrial interest group working together on international developments related to controlled language, and organize an international SE workshop, bringing together developers and users to propagate the use of SE.

Approach and Methodology

The SECC tool will be based on existing NLP technologies, being built within an existing machine translation framework, and it will reuse NLP and linguistic resources. The task of the tool will be to translate from English to a subset of it (SE); in this respect, SECC will not limit itself to the output of diagnoses of mistakes, it will also attempt to correct (translate) erroneous sentences as much as possible. The tool will reuse the analysis component for English (grammar and lexicon) of the Metal MT system in order to do a thorough syntactic analysis of the input. For the SE rules and lexicon, existing sources will again be reused. A solid 140-rule grammar of SE developed in the context of the telecommunication subdomain of telephony, as well as a union of electronically available existing basic SE lexicons plus technical terminology, will together form the "transfer" modules of the checker/corrector. This MT approach to syntax checking has already been successfully applied (albeit only experimentally) to German in the context of the ESPRIT TWB project, using the same system.
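The idea of treating checking as a "translation" from English into SE can be illustrated with a very small sketch. It is not the SECC or Metal implementation: it does no syntactic analysis at all, and the word lists, the sentence-length limit and every name in it (`APPROVED_WORDS`, `PREFERRED_TERMS`, `MAX_WORDS`, `check_sentence`) are hypothetical. It only shows the two outputs the text describes, a list of diagnoses and a corrected sentence.

```python
import re
from dataclasses import dataclass

# Hypothetical toy data, not the SECC lexicon or rule set.
APPROVED_WORDS = {"the", "operator", "must", "close", "valve", "before", "start", "a", "test"}
PREFERRED_TERMS = {"shut": "close", "commence": "start", "prior to": "before"}
MAX_WORDS = 20  # assumed sentence-length limit


@dataclass
class Diagnosis:
    rule: str
    message: str


def check_sentence(sentence):
    """Return (diagnoses, corrected_sentence) for one input sentence."""
    diagnoses = []
    corrected = sentence

    # "Transfer" step: replace non-approved terms with their SE equivalents.
    for term, preferred in PREFERRED_TERMS.items():
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        if pattern.search(corrected):
            diagnoses.append(Diagnosis("lexicon", f"replace '{term}' with '{preferred}'"))
            corrected = pattern.sub(preferred, corrected)

    # Report remaining words that cannot be corrected automatically.
    words = re.findall(r"[A-Za-z]+", corrected)
    for word in words:
        if word.lower() not in APPROVED_WORDS:
            diagnoses.append(Diagnosis("lexicon", f"'{word}' is not in the approved lexicon"))

    # A simple style rule on sentence length.
    if len(words) > MAX_WORDS:
        diagnoses.append(Diagnosis("length", f"sentence has more than {MAX_WORDS} words"))

    return diagnoses, corrected


diagnoses, corrected = check_sentence("The operator must shut the valve prior to the test.")
print(corrected)
for d in diagnoses:
    print(d.rule, "-", d.message)
```

A real checker of the kind described would apply its rules to a parse of the sentence rather than to a flat word list, which is why the project reuses a full analysis grammar.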

Interface developments will include the complexities of communication between the DTP package and the NLP application, user-friendly interfaces using the Motif standard, hypertext-like presentation of the checker's output, and internal representation of the output using the SGML standard. SGML-related tools will also be used to develop the checking and correcting modules beyond the sentence level.
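As an illustration of what an SGML-style internal representation of the checker's output might look like, the sketch below serializes one checked sentence as tagged text. The element names (`sentence`, `original`, `diagnosis`, `corrected`) and the function names are hypothetical; the project's actual document type is not described here. The sketch also assumes that the entities `amp`, `lt` and `gt` are declared in whatever DTD is used.

```python
def escape_text(text):
    # Replace markup-significant characters with entity references.
    # Assumption: the DTD in use declares the amp, lt and gt entities.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def diagnosis_to_markup(sentence_id, original, corrected, messages):
    """Serialize one checked sentence as SGML-style tagged text."""
    lines = [f'<sentence id="s{sentence_id}">']
    lines.append(f"<original>{escape_text(original)}</original>")
    for message in messages:
        lines.append(f"<diagnosis>{escape_text(message)}</diagnosis>")
    lines.append(f"<corrected>{escape_text(corrected)}</corrected>")
    lines.append("</sentence>")
    return "\n".join(lines)


print(diagnosis_to_markup(
    1,
    "Shut valves A & B.",
    "Close valves A & B.",
    ["replace 'shut' with 'close'"],
))
```

Keeping the original, the diagnoses and the correction in one tagged unit is one way a hypertext-like presentation could link each message back to the sentence it concerns.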

Exploitation and Future Prospects

The SE Grammar and Style Checker/Corrector will be a powerful tool that can be used by any organization with strong needs for efficient communication (text production and translation) in an international context or market.

First of all, the tool will be used in house by one of the partners in the production process of texts for technical courses and telecom user documentation. In addition, it is planned to offer SECC as a product all over Europe through the commercial divisions of the partners involved.

In order to get as broad a dissemination of results as possible, descriptive results (user requirements, overall system approach, academically interesting results relating to grammar, lexicon, restricted languages, etc.) will be made public.

As to the future technological prospects, SECC will be part of the EUROLANG developments, aiming at offering a widespread European NLP platform, based on the same technology as SECC.

Coordinator

Siemens Nixdorf Software Center Liege (CSL)
Address
Rue Des Fories 2
4020 Liege
Belgium

Participants (4)

Alcatel BELL
Belgium
CAP Gemini Innovation
France
Sietec Systemtechnik München
Germany
Address
München
University of Leuven (KUL CCL)
Belgium