Statistics is the science of data. As it has developed, it has become a means of informing decisions under the uncertainty found across the sciences and humanities. It can play this role because it is built on the concepts of randomness, variability, error, and probability. Statistics is now also developing as part of interdisciplinary fields such as data science and biostatistics.
Prof. Dr. Dra. Titin Siswantining, DEA, through her research entitled “The Role of Statistics in Data Science in Predicting Healthcare Intelligence to Welcome the Era of Society 5.0”, revealed that data science emerged as a combination of STEM and social sciences. The sciences that are the main support in data science consist of mathematics, statistics, computer science, information systems, management, and communication science. Data science uses statistics to collect, review, analyze, and draw conclusions from data, as well as apply measured mathematical models to appropriate variables.
Meanwhile, Society 5.0 is a concept in which technology and humans coexist in order to improve the quality of human life in a sustainable manner. Data science is at the root of these technologies. Its usefulness can be felt in various fields, including healthcare, where the role of statistics is supported by machine learning methods.
Machine learning is a field of science that develops algorithms or models that can extract knowledge from data, much like the human learning process. It can be used in place of humans, especially where the data is large and complex and a quick response is needed, as in healthcare.
Healthcare that uses technology and digitization to record, process, and predict is called intelligent healthcare. Before statistics and machine learning are applied in healthcare, it is strongly recommended to check whether the data is ready to be processed. The data collected is often incomplete or illegible, or contains outliers, i.e. values that are unusually low or high.
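A readiness check of this kind can be sketched in a few lines of Python. The example below is only an illustration, not a method taken from the research: it counts missing entries in one numeric column and flags outliers with the common 1.5 × IQR rule. The function name `readiness_report` and the blood pressure readings are hypothetical.

```python
import statistics

def readiness_report(values):
    """Summarize missing entries and IQR-based outliers in one numeric column.

    `values` is a list in which None marks a missing measurement.
    """
    observed = [v for v in values if v is not None]
    missing = len(values) - len(observed)

    # First and third quartiles of the observed values.
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Values outside [low, high] are flagged for review, not deleted.
    outliers = [v for v in observed if v < low or v > high]
    return {"missing": missing, "outliers": outliers}

# Hypothetical systolic blood pressure readings (mmHg); None = not recorded.
readings = [118, 122, None, 125, 119, 121, 300, 120, None, 123]
report = readiness_report(readings)
print(report)
```

Flagged values are only candidates for review. Whether a reading such as 300 is a recording error or a genuine clinical value is a decision for someone who knows the data.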
Intelligent healthcare covers applications of machine learning that are integrated with care management and utilization processes and that accommodate the needs of the target population. The concrete role of statistics and data science lies in applying clustering, prediction, and data imputation methods. The input data in healthcare intelligence is diverse, ranging from microarray data, DNA sequences, and CT scans to patient data and protein interaction data.
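Data imputation, one of the methods mentioned above, fills in missing values so that incomplete records can still be analyzed. As a minimal sketch, assuming missing entries are marked with None and every column is numeric, the simplest variant replaces each gap with the median of the observed values in that column. The function name `impute_median` and the patient records are hypothetical, and practical work often calls for more careful, model-based imputation.

```python
import statistics

def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        medians.append(statistics.median(observed))
    # Build new rows so the original records are left unchanged.
    return [
        [medians[c] if row[c] is None else row[c] for c in range(n_cols)]
        for row in rows
    ]

# Hypothetical patient records: [age in years, fasting glucose in mg/dL].
records = [[54, 98], [61, None], [None, 110], [47, 104]]
completed = impute_median(records)
print(completed)
```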
Machine learning is divided into supervised learning, unsupervised learning, and reinforcement learning. Research in the health sector that uses supervised learning methods includes Classification of Diabetic Retinopathy Stages Using Histogram of Oriented Gradients and Shallow Learning (2018); Feature Selection Using Random Forest Classifier for Predicting Prostate Cancer (2019); Ovarian Cancer Classification Using Bayesian Logistic Regression (2019); Multiclass Classification of Acute Lymphoblastic Leukemia Microarrays Data Using Support Vector Machine Algorithms (2020); Kernel PCA and SVM-RFE Based Feature Selection for Classification of Dengue Microarray (2020); and Covid-19 Classification Using X-Ray Imaging with Ensemble Learning (2021).
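What these supervised studies have in common is that a model is fitted on labeled examples and then used to label new cases. The sketch below illustrates the idea with a nearest-centroid classifier, a deliberately simple stand-in rather than any of the methods named in the studies above (random forest, Bayesian logistic regression, support vector machines, ensemble learning). The function names, features, and labels are hypothetical.

```python
import math

def fit_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

# Hypothetical training data: two measured features per sample, with known labels.
train_x = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.2], [3.2, 2.9]]
train_y = ["low", "low", "high", "high"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, [1.1, 0.9]))
print(predict(centroids, [2.9, 3.1]))
```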
Meanwhile, research related to unsupervised learning is carried out through clustering, a method of grouping unlabeled data. Clustering has evolved into biclustering and triclustering. Biclustering is a data mining technique that clusters the rows and columns of a matrix simultaneously. A tricluster, by contrast, is built from two datasets by selecting a subset of features from each dataset and a subset of the rows they share. Triclustering thus extends clustering and biclustering to three-dimensional data.
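To make the idea of clustering rows and columns together more concrete, the sketch below scores a candidate bicluster with the mean squared residue, a coherence measure widely used in the biclustering literature. A score near zero means the selected rows follow the same pattern across the selected columns. This is only an illustration under that assumption, not the procedure used in the research described here, and the function name and expression matrix are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by `rows` and `cols`.

    Each residue is a[i][j] - row_mean[i] - col_mean[j] + overall_mean,
    so a perfectly additive pattern scores 0.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r

    total = 0.0
    for i in range(n_r):
        for j in range(n_c):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_r * n_c)

# Hypothetical expression matrix: rows are genes, columns are samples.
expr = [
    [1.0, 2.0, 9.0, 3.0],
    [2.0, 3.0, 1.0, 4.0],
    [5.0, 0.0, 7.0, 2.0],
    [4.0, 5.0, 2.0, 6.0],
]

# Genes 0, 1, 3 rise and fall together across samples 0, 1, 3.
coherent = mean_squared_residue(expr, [0, 1, 3], [0, 1, 3])
whole = mean_squared_residue(expr, [0, 1, 2, 3], [0, 1, 2, 3])
print(coherent, whole)
```

The selected submatrix scores essentially zero while the full matrix scores much higher, which is the kind of contrast a biclustering algorithm looks for. A triclustering method applies the same idea with a third dimension, such as time points.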
The research titled “Triclustering Method for Finding Biomarkers in Human Immunodeficiency Virus-1 Gene Expression Data” reflects the role of statistics in data science in predicting healthcare intelligence in the era of Society 5.0. In general, Society 5.0 is a combination of IoT (Internet of Things), Big Data, and AI (Artificial Intelligence). Biostatistics is a tool for predicting the healthcare intelligence of a patient in a hospital.
“We hope that in the future a research platform or incubator and Biostatistics laboratory will be created in FMNS UI as a pioneer of Artificial Intelligence-Quality Improvement (AI-QI) in Indonesia to welcome the era of society 5.0 that we will face,” said Prof. Titin.
After the speech, Prof. Titin was officially inaugurated as a Permanent Professor of Statistics, Faculty of Mathematics and Natural Sciences (FMNS), Universitas Indonesia (UI). The inauguration was led by UI Rector, Prof. Ari Kuncoro, SE, MA, Ph.D., and broadcast live through the UI Teve YouTube channel.
The event, which was held on Saturday (6/8), was attended by invited guests, including Pangdam Iskandar Muda, Major General Moh. Hasan; Head of the Center for Social Welfare Data and Information, Prof. Dr. Agus Zainal Arifin, S.Kom., M.Kom.; Chairman of the Adhoc PAK UI Team, Prof. Heru Suhartanto, Drs, M.Sc., Ph.D.; UGM Professor, IndoMS Advisor, Prof. Dr. Sri Wahyuni, S.U.; CEO of Global Risk Management (GRM), Rinaldi Anwar, S.Si, MM, FSAI; Ph.D. Supervisor, Professor at Queensland University of Technology (QUT), Australia, Prof. Dr. Kevin Burrage; Professor at Perdana University (Malaysia), Prof. Mohammad Asif Khan, Ph.D.; Professor at UNPAD, Advisor IndoMS, Prof. Dr. Budi Nurani Ruchjana, MS; Chairman SAU UNAND, Prof. Dr. Syafrizal; and Professor at ITB, Perhimpunan Biomathematics Indonesia, Prof. Edy Soewono, Ph.D.
Prof. Dr. Dra. Titin Siswantining, DEA is a Lecturer at the Department of Mathematics, FMNS UI. She completed her Bachelor of Statistics studies at the Sepuluh Nopember Institute of Technology, Surabaya in 1984; DEA en Mathematique Applique, EHESS – Universite de Paris V in 1990; and Doctoral Degree in Statistics, Bogor Agricultural University in 2013.
Some of her published scientific works are Pathway-Based Triclustering and Gene Ontology in Analyzing Gene Sample Time in Cancer Data (2022); Genomic Study with The Application of Triclustering Algorithm to Predict Chronic Diseases Using Machine Learning Method (2020/2021); Parallel Clustering Algorithms and Implementations for Big Data Analytic (2020/2021); Implementation of 3D Microarray Gene Expression Data using δ-Trimax, EDISA, and OPTricluster Algorithms (2020/2021); and Computer-Aided Diagnosis (CAD) for Early Detection of Diabetic Retinopathy (2020/2021).