Data Mining

Data mining is a scientific subfield responsible for the efficient processing of large sets of data. It combines mathematics, statistics and computer science. The rapid development of IT has also produced a lot of data and it is often hard to find any connection between the observed parameters. If we are not sure what we are looking for it is impossible to use classical statistical methods. With data mining however, it is possible to seek data sequences, classify, predict, cluster and summarize the data. Even though these methods primarily rely on statistics, they are also enriched with algorithms suitable for the processing of big sets of data.

One type of data mining is text mining. Text mining is important because it helps users and scientists extract different kind of information from a big number of texts and scientific articles. For example, the use of some herbicide is most likely described in various texts and from each of them we can find out about its different characteristics such as its efficiency, side effects and others. Since it is impossible to read all of the articles and work out the facts 'manually', we can instead use algorithms for automatic processing of large text corpora, such as text mining and get everything we need quickly and efficiently.