In this blog post, we explore the innovative possibilities of big data and how to weigh them against the ethical dilemmas of privacy infringement and control over personal information.
Big data has enormous potential, as shown by the fact that companies such as Facebook, Twitter, and Google, whose businesses are built on digital data, rank among the world's most valuable companies by market capitalization. Google, for example, was reportedly able to estimate the timing of flu outbreaks faster than the US health authorities simply by analyzing Americans' search queries, an approach that had not been studied before. In addition, police in the United States have analyzed crime data (the time, frequency, and location of crimes) to build a prediction platform, which they report having used to anticipate crimes and significantly reduce the crime rate in their jurisdiction. Big data is thus having a powerful impact in many fields, including medicine, society, education, and business, heralding the arrival of a new era.
First, let’s look at the definition of big data. The book Big Data World defines big data as data that exceeds a certain capacity, while another book, Big Data, defines it as “data generated in a digital environment that is vast in scale, short in generation cycle, and includes not only numerical data but also text and image data.” The market research firm IDC defines big data as the portion of a vast body of data that has economic value. The meaning and value of big data therefore vary depending on who defines it, how it is viewed, how it is used, and the environment in which it is used. To understand these different kinds of value, we will first look at the advantages of big data and how it is used in society.
New scientific tools have repeatedly enabled research that left a significant mark on the history of science. The telescope led to tremendous advances in astronomy, and the microscope revealed cells and other structures invisible to the naked eye, opening a new paradigm in biology and chemistry. These tools let us see things we could not see before, and that led to revolutionary shifts in thinking. The advantages and future role of big data can be understood in the same way. Countless trivial actions and events that we overlook in our daily lives are recorded and analyzed as big data, letting us observe what was previously invisible. Furthermore, unlike earlier scientific tools, big data can be applied in virtually every field, so it has the potential to bring about positive changes throughout society.
The social sciences have contributed to the analysis of various social phenomena and the establishment of many theories. However, the research methods they rely on, such as surveys, direct observation, and aptitude tests, have changed little over a very long time. These methods are also open to different interpretations depending on the researcher, and there are clear limits to the scale of research they can support. To address these limits, MIT researchers developed the sociometer, a sensor that records social behavior, and used it to analyze various kinds of human interaction. A representative example is the “speed dating experiment,” in which the success of dates was predicted with a high degree of accuracy from data on social cues alone, without reference to the content of the conversations. The same approach has made it possible to conduct objective, data-driven analyses of topics that were previously very hard to study, such as salary negotiations and corporate culture.
As the example above shows, big data collects and analyzes information about individuals, which enables research that was previously impossible and creates new value. However, recording all of an individual’s characteristics and behaviors raises ethical issues related to the protection of personal information. In one widely reported case, a large retailer in the United States repeatedly sent coupons for pregnancy-related products to a high school girl based on its analysis of her purchase data. Her father, who was offended, called to complain, but some time later he learned that his daughter was in fact pregnant and called back to apologize. That a company holding big data could infer the daughter’s pregnancy raises not only privacy issues but also ethical ones, since the company learned of such an important matter before the family did.
Such issues surrounding personal information are morally problematic, and they also act as a major obstacle to technological development. Many efforts have been made to integrate individual medical records in order to prevent unnecessary or duplicate prescriptions and tests and to provide better personalized medical services, but these efforts have struggled to get off the ground because of personal information protection laws. For example, South Korea’s SK Group launched a project intended to start a medical revolution, but it was abandoned due to legal restrictions. The US and China, on the other hand, have changed their stance on this issue and are actively promoting open data policies and precision medicine plans in order to become global leaders in personalized healthcare services. Even so, the scope of data disclosure and the extent of the authority to be granted remain major social issues in each country.
The initial approach to these personal information issues was data shadowing, that is, anonymizing the owner of the data. However, even if names are deleted, individuals can still be identified easily when a large amount of other personal information remains, so this method does not fully solve the problem. Alex “Sandy” Pentland took a different approach. He argued that personal data should be made available for use, but only after the data provider and the data user have agreed on how it will be used. On this basis, he proposed three principles: “prior notice and prior consent,” “the right of individuals to control their own data,” and “data should be aggregated before being sent to third parties.” Aggregating data means grouping people with similar tendencies so that no individual can be identified from the data, and then analyzing and using only the group-level data. The UK’s big data policy, MIDATA, is based on similar logic and has been well received by society.
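The difference between simply deleting names and aggregating data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records, the field names (age_band, region, monthly_spend), and the minimum group size of 3 are hypothetical and are not taken from MIDATA or any other real policy. The sketch first counts how many name-free records are still unique on their remaining attributes, then releases only group-level averages and suppresses groups that are too small.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records with names already removed (illustrative data only).
RECORDS = [
    {"age_band": "20-29", "region": "North", "monthly_spend": 120},
    {"age_band": "20-29", "region": "North", "monthly_spend": 90},
    {"age_band": "20-29", "region": "North", "monthly_spend": 150},
    {"age_band": "30-39", "region": "South", "monthly_spend": 200},
    {"age_band": "30-39", "region": "South", "monthly_spend": 220},
    {"age_band": "30-39", "region": "South", "monthly_spend": 180},
    {"age_band": "30-39", "region": "South", "monthly_spend": 240},
    {"age_band": "60-69", "region": "North", "monthly_spend": 75},
]


def unique_combinations(records, keys):
    """Count records whose combination of attributes appears exactly once.

    Such records can point to one person even though no name is stored.
    """
    counts = defaultdict(int)
    for rec in records:
        counts[tuple(rec[k] for k in keys)] += 1
    return sum(1 for c in counts.values() if c == 1)


def aggregate(records, keys, value_field, min_group_size=3):
    """Group records by `keys` and release only group-level statistics.

    Groups smaller than `min_group_size` (an assumed threshold) are
    suppressed, because a very small group can still reveal an individual.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in keys)].append(rec[value_field])

    released = {}
    for group_key, values in groups.items():
        if len(values) >= min_group_size:
            released[group_key] = {"count": len(values), "mean": mean(values)}
    return released


if __name__ == "__main__":
    keys = ("age_band", "region")
    print("Records still unique without names:", unique_combinations(RECORDS, keys))
    for group_key, stats in aggregate(RECORDS, keys, "monthly_spend").items():
        print(group_key, stats)
```

In this toy data set, the single person in the 60-69 North group is identifiable from age band and region alone, so that group is withheld, while the two larger groups are released only as a count and an average. Real systems would need a more careful choice of grouping attributes and threshold than this sketch assumes.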
Today, big data is becoming ever more powerful and has become one of the defining keywords of the 21st century. It is helping in many fields by making possible what was not possible before, from plans to prevent accidents and ease traffic congestion by collecting data from vehicles on the road to build an intelligent transportation system, to plans to detect potential health risks early based on the physiological data of newborns. However, for big data to deliver better services, the ethical issues arising from the use of personal information must be addressed at the societal level, and how to do so is currently the subject of intense debate. If sound policies for protecting personal information are proposed and social consensus is reached, big data will contribute to improving people’s lives and bring about beneficial changes across society.