- Digitalization of knowledge
- Human language technologies
- Bioinformatics
- Data mining
- Spatial data
- Mathematical modelling
Application of Human Language Technologies
"Human Language Technologies" denotes a set of software algorithms, tools and resources for processing texts written in natural languages. These activities can be seen as one form of digitalization of knowledge, namely knowledge about language; however, because of their specific nature and wide applicability, we have chosen to treat them separately from other types of digitalization.
Application of human language technologies in biological and biotechnological sciences involves:
- development of terminological dictionaries and taxonomies,
- setting up standards and specifications in textual data processing,
- development of metadata and schemas for annotating data in biological and biotechnological sciences,
- development and application of ontologies (ordered sets of terms and expressions, with clearly defined relations and hierarchy),
- automatic acquisition of knowledge from texts in different formats, and its organization, archiving, etc.
All of these activities are technically demanding. In biotechnology, ontologies are used to annotate data such as sequences, genes, and experiments. Annotated data can then be used in a number of ways: for linking databases, for building complex search systems, for knowledge transfer, and so on. There are several well-developed ontologies and thesauri in the biological and agricultural domains, such as AGROVOC, Gene Ontology, EUROVOC, Plant Ontology and others.
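As a rough illustration of how ontology-based annotation supports search, the following minimal sketch uses a toy ontology. The term identifiers (`TOY:0001` and so on), the gene names, and the function names are all invented for this example and do not come from Gene Ontology or any other real resource. The sketch shows that a query for a general term can also retrieve data annotated with more specific terms.

```python
# Toy ontology: each term maps to its parent terms ("is_a" relation).
# Term IDs and labels are invented for illustration only.
IS_A = {
    "TOY:0001": [],            # biological process (root)
    "TOY:0002": ["TOY:0001"],  # response to stress
    "TOY:0003": ["TOY:0002"],  # response to drought
    "TOY:0004": ["TOY:0001"],  # growth
}

# Hypothetical annotations: gene identifier -> set of ontology terms.
ANNOTATIONS = {
    "geneA": {"TOY:0003"},
    "geneB": {"TOY:0004"},
    "geneC": {"TOY:0002"},
}


def ancestors(term, is_a=IS_A):
    """Return the term itself plus all terms reachable via is_a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def genes_annotated_with(term, annotations=ANNOTATIONS):
    """Find genes annotated with the term or with any more specific term."""
    return sorted(
        gene
        for gene, terms in annotations.items()
        if any(term in ancestors(t) for t in terms)
    )
```

Calling `genes_annotated_with("TOY:0002")` in this toy setup returns both the gene annotated directly with "response to stress" and the gene annotated with the narrower "response to drought", which is the kind of hierarchy-aware lookup that plain keyword search cannot provide.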
Text mining is one of the most recent developments in text processing. It includes techniques for extracting relevant, specific information and knowledge from large text corpora (scientific articles, encyclopedias, etc.), and it is especially valuable for processing medical and biological literature.
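One of the simplest text mining techniques is dictionary-based term extraction. The sketch below is only an illustration under stated assumptions: the mini-dictionary, the normalized term names, and the function `extract_terms` are hypothetical, and real systems typically add tokenization, morphological analysis, and disambiguation on top of this.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: surface form (lowercase) -> normalized term.
TERM_DICTIONARY = {
    "drought": "drought stress",
    "drought stress": "drought stress",
    "maize": "Zea mays",
    "zea mays": "Zea mays",
}


def extract_terms(text, dictionary=TERM_DICTIONARY):
    """Count dictionary terms found in text, preferring the longest match."""
    # Longest surface forms first so "drought stress" wins over "drought".
    forms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(form) for form in forms) + r")\b",
        flags=re.IGNORECASE,
    )
    counts = Counter()
    for match in pattern.finditer(text):
        counts[dictionary[match.group(1).lower()]] += 1
    return counts
```

Exact string matching of this kind works reasonably for languages with little inflection; for morphologically rich languages the surface forms would first need to be reduced to lemmas.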
When it comes to Serbia and the Serbian language, the situation is quite different. In our country, only a few researchers are familiar with these technologies. Researchers from other scientific fields who could benefit from human language technologies are mostly unaware of the possibilities these techniques offer. Moreover, there are fewer resources for Serbian than for languages such as English. The reasons are economic (far fewer people speak Serbian than English) and language-specific (a rich morphological system). Some of the resources developed for Serbian can be found on the Human Language Technologies Group website.