Izmeritel`naya Tekhnika

Generalization of Jaccard index for interval data analysis

https://doi.org/10.32446/0368-1025it.2022-12-15-22

Abstract

This article addresses the analysis of data samples with interval uncertainty. We propose using the Jaccard measure (index), widely used for comparing sets in various application areas, as a measure (functional) of the compatibility of interval values and of their samples. Background on interval analysis and on classical and complete (Kaucher) interval arithmetic is presented. For interval quantities, the necessary concepts and definitions of operations are introduced, in particular generalizations of the intersection and union of sets. The Jaccard measure is generalized to data with interval uncertainty and to samples of interval data. The possible relations between intervals are described in detail, from coincidence to incompatibility. Several definitions of the Jaccard measure, both symmetric and non-symmetric with respect to the operands, are considered. The connection of the proposed measure with the interval mode and its use for estimating the results of computations with twins are discussed. A practical example of finding the information set of an interval problem using the new measure is given.
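The abstract does not state the formulas, so the following Python sketch shows only one plausible reading of the symmetric variant. It assumes that the generalized intersection is the lattice meet with respect to inclusion in Kaucher arithmetic (which may be an improper interval with lower bound above upper bound), that the generalized union is the interval hull, and that the measure is the ratio of their widths. All function names are hypothetical and are not taken from the article.

```python
from typing import Sequence, Tuple

# An interval is a (lower, upper) pair. lower > upper denotes an improper
# (Kaucher) interval, which appears here only as the result of meet().
Interval = Tuple[float, float]


def meet(a: Interval, b: Interval) -> Interval:
    """Generalized intersection (assumed): may be improper if a and b are disjoint."""
    return (max(a[0], b[0]), min(a[1], b[1]))


def join(a: Interval, b: Interval) -> Interval:
    """Generalized union (assumed): the interval hull of a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def width(a: Interval) -> float:
    """Width of an interval; negative for improper intervals."""
    return a[1] - a[0]


def jaccard_interval(a: Interval, b: Interval) -> float:
    """Assumed symmetric measure: wid(meet) / wid(join), with values in [-1, 1]."""
    hull_width = width(join(a, b))
    if hull_width == 0:
        # Both proper operands are the same point; treat them as coinciding.
        return 1.0
    return width(meet(a, b)) / hull_width


def jaccard_sample(xs: Sequence[Interval]) -> float:
    """The same ratio taken over a whole sample of proper intervals."""
    lo_meet = max(x[0] for x in xs)
    hi_meet = min(x[1] for x in xs)
    lo_join = min(x[0] for x in xs)
    hi_join = max(x[1] for x in xs)
    hull_width = hi_join - lo_join
    if hull_width == 0:
        return 1.0
    return (hi_meet - lo_meet) / hull_width
```

Under these assumptions, coinciding intervals such as (1, 3) and (1, 3) score 1, the overlapping pair (0, 2) and (1, 3) scores 1/3, and the disjoint pair (0, 1) and (2, 3) scores -1/3. A negative value signals incompatibility, and its magnitude grows with the gap between the intervals.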

About the Authors

A. N. Bazhenov (Alexander N. Bazhenov)
Ioffe Institute; Peter the Great St. Petersburg Polytechnic University, Institute of Applied Mathematics and Mechanics
St. Petersburg, Russian Federation

A. Yu. Telnova (Anna Yu. Telnova)
Ioffe Institute
St. Petersburg, Russian Federation


For citations:


Bazhenov A.N., Telnova A.Yu. Generalization of Jaccard index for interval data analysis. Izmeritel`naya Tekhnika. 2022;(12):15-22. (In Russ.) https://doi.org/10.32446/0368-1025it.2022-12-15-22


ISSN 0368-1025 (Print)
ISSN 2949-5237 (Online)