Summary

Data-to-text systems are among the most promising tools for providing accessible, intelligent and user-friendly access to information through the increasingly sophisticated multimodal computing devices available on the market, such as smartphones, tablets and wearable devices. One of the strongest challenges in this area is the interaction with visual information using natural language. The LiDViS project (LInguistic Description of Visual Information using data mining and Soft computing techniques) aims to contribute to this challenge by providing a significant step towards the automatic generation of linguistic descriptions of images, expressed in natural language terms that are close to the user.

Generating linguistic descriptions of data involves two main tasks: a knowledge extraction task, which in a broad sense can be regarded as a process of knowledge discovery in databases (KDD), and a linguistic expression task, which enhances the understandability and usefulness of the acquired knowledge by rendering it as an appropriate narrative text in natural language. The integration of data mining techniques based on soft computing with natural language generation has proven to be an appropriate approach for the design and development of systems that automatically generate such linguistic descriptions.
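As an illustration only (not the project's implementation), the following minimal Python sketch shows how the two tasks can be chained: a soft-computing-style knowledge extraction step that evaluates a fuzzy linguistic summary of the form "Q of the X are P", followed by a simple linguistic realization step. All membership functions, thresholds and names are assumptions made for this example.

```python
# Hypothetical two-stage data-to-text sketch: (1) knowledge extraction with
# fuzzy quantifiers, (2) linguistic realization of the extracted knowledge.

def mu_high(value, low=0.6, high=0.9):
    """Fuzzy membership of 'high', modelled as a simple linear ramp."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

def mu_most(proportion):
    """Fuzzy quantifier 'most', modelled as a ramp between 0.3 and 0.8."""
    return max(0.0, min(1.0, (proportion - 0.3) / 0.5))

def extract_summary(data):
    """Stage 1: degree of truth of the summary 'most of the values are high'."""
    proportion = sum(mu_high(v) for v in data) / len(data)
    return {"quantifier": "most", "property": "high", "truth": mu_most(proportion)}

def realize(summary, subject="the measured values"):
    """Stage 2: render the extracted knowledge as a natural language sentence."""
    hedge = "clearly" if summary["truth"] > 0.8 else "roughly"
    return f"{summary['quantifier'].capitalize()} of {subject} are {hedge} {summary['property']}."

data = [0.95, 0.88, 0.70, 0.40, 0.92, 0.85]
print(realize(extract_summary(data)))
```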

The linguistic description of images is a particular case of the linguistic description of data, with the peculiarity that, since images are unstructured data, a preliminary step is needed to provide a structured representation of the visual information. Since this representation is to be used as input for building a linguistic description, it should consist of semantic representations of visual concepts that humans could use to describe their perception of the information contained in the image.
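The sketch below, again purely illustrative and with hypothetical names, suggests what such a structured representation of visual concepts might look like and how it could feed a simple description step; the image-analysis output shown is invented for the example and does not come from any real detector.

```python
# Hypothetical structured representation of human-meaningful visual concepts,
# used as the input of a simple linguistic description step.

from dataclasses import dataclass

@dataclass
class VisualConcept:
    label: str          # semantic category a person would use ("person", "tree")
    confidence: float   # degree to which the concept is present in the image
    position: str       # coarse spatial term close to human perception

def describe(concepts, threshold=0.5):
    """Verbalize the concepts perceived with enough confidence."""
    parts = [f"a {c.label} on the {c.position}"
             for c in concepts if c.confidence >= threshold]
    if not parts:
        return "Nothing salient was recognized in the image."
    return "The image shows " + ", ".join(parts) + "."

# Invented output of a preliminary image-analysis step:
scene = [VisualConcept("person", 0.92, "left"),
         VisualConcept("bicycle", 0.81, "right"),
         VisualConcept("dog", 0.35, "center")]
print(describe(scene))
```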