THE METHODOLOGY OF TEXT CLUSTERING APPLICATION ON THE BASE OF LINGUISTIC RULES FOR RESEARCHING THE NEEDS OF THE POPULATION IN SOCIAL PROTECTION AND SOCIAL SECURITY

Table 1. Internet sources whose texts were used for the analysis

Number | Name of the source | Resource address | Number of texts
1 | UkrInform | | 50
2 | Public. News | | 25
3 | Website of the international scientific publication "Financial and credit activity: problems of theory and practice" | | 7
4 | The newspaper «Government Courier», the official printed publication of the Cabinet of Ministers of Ukraine | | 30
5 | The official website of the Kyiv Regional Council of Professional Unions | | 5
6 | Official website of the National Bank of Ukraine | | 10
7 | The official site of the magazine «Forbes Ukraine» | | 15
8 | Website of the electronic publication «Sudovo-yuridychna Gazeta» | | 20

Table 2. Statistical characteristics of the classification model of the studied texts

Statistics | Training data set | Test data set
TP (true positive) | 30 | 11
TN (true negative) | 67 | 26
FP (false positive) | 10 | 6
FN (false negative) | 7 | 5
MISC, % (proportion of incorrectly classified values) | 15 | 23
Gini | 0.82 | 0.71
ROC | 0.79 | 0.67
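As a quick consistency check, the MISC values in Table 2 can be recomputed from the four confusion-matrix counts. The sketch below assumes that MISC is defined as (FP + FN) divided by the total number of texts in the data set; the class and variable names are illustrative and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for one data set (hypothetical helper)."""
    tp: int
    tn: int
    fp: int
    fn: int

    @property
    def total(self) -> int:
        # Every text falls into exactly one of the four cells.
        return self.tp + self.tn + self.fp + self.fn

    def misclassification_rate(self) -> float:
        # Assumed definition: share of texts assigned to the wrong class.
        return (self.fp + self.fn) / self.total


# Counts taken from Table 2.
training = ConfusionCounts(tp=30, tn=67, fp=10, fn=7)
test = ConfusionCounts(tp=11, tn=26, fp=6, fn=5)

for name, counts in (("training", training), ("test", test)):
    rate = counts.misclassification_rate()
    print(f"{name}: {counts.total} texts, MISC = {rate:.1%}")
```

Under this assumption the rates come out at about 14.9% and 22.9%, which round to the 15% and 23% reported in the table. The two data sets together contain 114 + 48 = 162 texts, which equals the sum of the text counts listed in Table 1.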

Fig. 1. ROC curve for the built classification model

The obtained values were normalized to a scale from 0 to 100 according to the formula:

Table 3. Results of cluster analysis of textual information on issues of social protection and social security by regions of Ukraine

Name of the region

Popularity of the texts of the corresponding cluster

Cluster 1 (pension reform)

Cluster 2 (accrual and payment of pensions and social benefits by the Pension Fund of Ukraine)

Cluster 3 (problems of social protection of internally displaced persons)

Cluster 4 (issues related to losses due to military conflict)

Cluster 5 (issues of social protection and social security of refugees)

Cluster 6 (issues related to victims of the accident at the Chernobyl nuclear power plant)

Vinnytsia region: 94, 65, 24, 72, 79
Volyn region: 87, 57, 20, 100, 63
the city of Kyiv: 82, 49, 32, 37, 26, 33
the city of Sevastopol: -, -, -, -, -, -
Dnipropetrovsk region: 58, 39, 43, 33, 14
Donetsk region: 27, 32, 59, 37
Zhytomyr region: 94, 73, 19, 34, 62, 88
Transcarpathian region: 67, 45, 29, 40, 75
Zaporizhzhia region: 58, 39, 90, 30
Ivano-Frankivsk region: 87, 66, 24, 63, 72
Kyiv region: 84, 42, 28, 37, 37, 100

Kirovohrad region: 92, 88, 32, 73, 46
Autonomous Republic of Crimea: -, 1, 1, -, -, -
Luhansk region: 22, 8
Lviv region: 73, 45, 20, 60, 57
Mykolayiv region: 76, 70, 64, 47, 18
Odesa region: 32, 24, 27, 13, 13
Poltava region: 75, 63, 32, 73, 42, 77
Rivne region: 100, 64, 17, 81, 100
Sumy region: 92, 100, 52, 43, 30
Ternopil region: 50, 56, 24, 63
Kharkiv region: 47, 35, 100, 15, 9

Kherson region: 71, 62, 89
Khmelnytskyi region: 87, 55, 28, 78, 73
Cherkasy region: 87, 47, 30, 50, 55, 74
Chernihiv region: 81, 58, 24, 50, 29
Chernivtsi region: 83, 31, 25, 1, 61

Fig. 2. Cluster 1, popularity of texts on the topic of "Pension reform" by regions of Ukraine

Fig. 3. Cluster 2, popularity of texts on the topic of "Questions related to the pension fund in general" by regions of Ukraine