Software tools and technologies for analyzing the flow of Internet memes

Memetics of social networks is a popular area of scientific research. The article addresses the problems of meme propagation, mathematical modeling of propagation processes, and tools for socio-political research. It is shown that the life cycle of a flow of Internet memes, and of an individual meme, has its own specifics and ecology. Identifying the current stage of the life cycle (LC) is much more difficult than for the economic LC of an enterprise. In general, the problem is ill-posed and depends on the availability of data about the selected meme flow in the network. Network monitoring for identifying the meme LC relies on a query system and on technology for automatically building a database (knowledge base), which is then used for forecasting based on inference by analogy using neural network approaches. The initial stage of a research project on the flow of Internet memes is considered.

Online social network communities act as consumers of information flows that build the “artificial intelligence” of their members. Among the different types of information circulating within social networks, Internet Memes (IM) are of particular interest. They are presented in a visual, easy-to-understand, image-based form and spread virally. The already established process of IM flow propagation contributes to the formation of both positive and negative stereotypes. The social and political effects of IM are difficult to manage. The information field self-organizes according to the principle of least resistance in a destructive direction, and great effort is required to manage such a process. Comprehensive interdisciplinary studies of the IM flow are therefore relevant and in high demand. An emerging area of interdisciplinary research called “memetics” influences algorithms for solving NP-hard problems of discrete optimization, in the form of evolutionary algorithms related to the viral nature of information propagation on the Internet. Conversely, the methods for studying high-dimensional complex networks, together with the associated optimization problems, are applied in the analysis of processes occurring in online networks.
While it is possible to examine the life cycle of a single meme, its circulation often gives rise to a flow of derivative memes. Depending on the chosen notion of semantic proximity (analogy, precedence), the calculated flow rates and intensity parameters will differ. Even given the probabilistic nature of the process, it is still important to be able to work with a single meme or a small number of memes. Typically, the life cycle is displayed qualitatively as the graph of a function of time, with a characteristic increase, a maximum value, a period of stabilization, and degradation (the function value tends to zero). It is difficult to determine which stage the IM flow under examination is in. The required parameters can be extracted from a closed data set relevant to the analyzed memes, obtained by regular monitoring of the process. At the same time, quantitative characteristics must be measured in different parts of the circulation network, which complicates the task. As a result, identifying the life cycle of the IM flow requires expert communities, mathematical modeling, and Big Data and Data Mining technologies. Mathematical models of the spread of viral diseases, rumors, diffusion processes, and so on, which are based on the logic of dynamical systems, require adaptation to networks that change over time. In the simplest case, it might be sufficient to obtain statistical data on the quantity, frequency, and similar characteristics of the flow of tested memes, followed by regression and factor analyses. On the other hand, as with high-dimensional dynamical systems, one can expect the presence of channels and jokers, that is, low-dimensional models that can qualitatively reflect the ongoing processes of IM propagation.
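The qualitative life cycle curve described above can be illustrated with a minimal sketch. It assumes a classical discrete-time SIR epidemic model on a static, fully mixed population, which is a textbook simplification and not the adapted time-varying network model the project calls for. The function names, parameter values (`beta`, `gamma`, population size), and stage thresholds (`plateau`, `floor`) are all hypothetical and chosen only for illustration.

```python
def simulate_sir(beta=0.5, gamma=0.1, n=10_000, i0=10, steps=120):
    """Discrete-time SIR model (illustrative assumption, not the paper's model).

    Returns the number of users actively spreading the meme at each step.
    beta: per-step sharing rate, gamma: per-step rate of losing interest.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    active = []
    for _ in range(steps):
        active.append(i)
        new_spreaders = beta * s * i / n   # susceptible users who start sharing
        new_recovered = gamma * i          # spreaders who lose interest
        s -= new_spreaders
        i += new_spreaders - new_recovered
        r += new_recovered
    return active


def label_stages(series, plateau=0.9, floor=0.05):
    """Label each time step with a life cycle stage relative to the maximum.

    Thresholds are hypothetical: values within `plateau` of the maximum count
    as stabilization; values below `floor` of the maximum after the peak
    count as degradation.
    """
    peak_val = max(series)
    peak_idx = series.index(peak_val)
    labels = []
    for t, value in enumerate(series):
        if value >= plateau * peak_val:
            labels.append("stabilization")
        elif t < peak_idx:
            labels.append("increase")
        elif value > floor * peak_val:
            labels.append("decline")
        else:
            labels.append("degradation")
    return labels


active = simulate_sir()
stages = label_stages(active)
# Print the step at which each stage first appears.
for name in ("increase", "stabilization", "decline", "degradation"):
    if name in stages:
        print(name, "starts at step", stages.index(name))
```

On real monitoring data the series would be noisy and incomplete, so a labeling of this kind would at best be a starting point for the regression and expert analysis mentioned above.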
The identification and prediction of the IM flow life cycle are primarily centered on studying the effects of IM on the activist youth audience and on the effective management needed to eliminate possible destructive influences. For example, the life cycle of the IM “cats” would let us study the audience most sensitive to such an influence and the corresponding cluster of related communities. Note that only indirectly measured data would be available for further analysis. Creating and propagating one’s own IM flow must comply with current legal prohibitions and regulations. Of most interest is finding a prospective test IM, which appears to be quite doable given the contingent nature of meme emergence. Thus, studying IM life cycles is of great importance and implies creating relevant tools for accumulating data and analyzing the processes of IM propagation, as well as a corresponding software product that helps process memes automatically.

The paper presents the results of the initial stage of work on a project designed to study the spread dynamics of Internet memes. The importance of developing specific tools for collecting, processing, and studying the IM life cycle is emphasized. We have developed a general structure, visualization methods, and ways of implementing a software product for life cycle analysis. For its further development, we intend to implement a neural network approach to the intelligent processing of the flow of Internet memes in order to estimate their impact on the audience of Internet communities.