The concept of big data in healthcare has been introduced previously. It refers to the massive volumes of data generated by the healthcare industry that are too large and complex to manage with conventional database management systems. Big data analytics can help make sense of this data, improving healthcare quality, reducing costs, and supporting timely decision-making. Most reviews focus on the sources and benefits of big data in healthcare, but rarely on its challenges and the strategies to overcome them. This review aims to provide an overview of the characteristics of big data, its potential benefits and challenges, and its impact on the future of medicine. It also explores strategies to overcome the challenges of healthcare big data, which include data quality assessment, data governance and privacy, cloud computing, data analytics, data sharing and collaboration, interoperability, patient engagement, data visualization, data security and protection, and continuous improvement. Interoperability can help manage and analyze data, and patient engagement can increase the volume and diversity of data available for analysis. Data visualization techniques can help healthcare organizations better understand and communicate insights from big data. As the quantity of healthcare data continues to grow, so will the opportunity to use it to improve healthcare delivery and outcomes. Ultimately, big data in healthcare has the potential to lead to better patient outcomes, cost reduction, faster medical research, improved population health, and personalized medicine.
Keywords: Big data, Healthcare data, Electronic medical records, Big data analytics (BDA), Data sources, Patients.