Information Seeking With Big Data

CHAPTER 5

Information Seeking With Big Data: Not Just the Facts

INTRODUCTION

Data are everywhere. They are in every part of our lives. At no point in history have such massive amounts of data been collected at such unprecedented rates. We experience it every day, whether through Amazon recommending purchases, Facebook’s “Friends,” Google’s “auto fill,” police predicting when and where crimes might occur, or banks making lending decisions based on the massive amounts of customer data that they have received. These are all uses of big data. In this chapter, we will focus on big data and its implications for libraries (Fig. 5.1).

WHAT IS BIG DATA?

As the name implies, big data is huge. SAS Institute refers to big data as a large volume of data that is both structured and unstructured and that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It is what organizations do with the data that matters (https://www.sas.com/en_us/insights/big-data/what-is-big-data.html). Mayer-Schönberger and Cukier define big data in their book Big Data: A Revolution That Will Transform How We Live, Work, and Think as follows:

Big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationships between citizens and governments and more.

Figure 5.1 Here is big data.

Emerging Library Technologies DOI: https://doi.org/10.1016/B978-0-08-102253-5.00005-8

© 2018 Elsevier Ltd. All rights reserved.

Big data is often characterized by the “3Vs” (volume, velocity, and variety). Doug Laney defined the “3Vs” of data management in his 2001 article, 3D Data Management: Controlling Data Volume, Velocity, and Variety. Volume refers to the amount of data, velocity refers to the speed at which the data are processed, and variety refers to the number of types of data (WhatIs.com, http://whatis.techtarget.com/definition/3Vs). Big data refers to our ability to crunch a vast quantity of information, analyze it instantly, and draw sometimes astonishing conclusions from it (Mayer-Schönberger and Cukier, 2013). The term also describes the enormous amount of data that inundates businesses on a constant basis, as well as the industry that has grown around attempts to collect, analyze, and act upon that data (https://www.statista.com/statistics/254266/global-big-data-market-forecast/). There are many definitions of big data, but just how big is big data?

HOW BIG IS BIG DATA?

Big data is so large that it is not usefully measured in megabytes (books or photos) or gigabytes (movies). Instead, it is measured in terabytes (all the books in the world), petabytes or exabytes (all the books in multimedia formats in the world), and zettabytes or yottabytes (everything recorded in human history) (Fig. 5.2). By one estimate, the total amount of data stored globally is now measured in thousands of exabytes (an exabyte is equal to a billion gigabytes) (Ford, Rise of the Robots). Martin Ford writes in Rise of the Robots: Technology and the Threat of a Jobless Future that Google’s servers alone handle about 24 petabytes (a petabyte is equal to a million gigabytes) each and every day, primarily information about what its millions of users are searching for.
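
The unit relationships quoted above can be checked with a short calculation. The following Python sketch assumes decimal (SI) prefixes, in which each unit is 1,000 times the previous one; binary conventions (1,024 per step, as used in the glossary entry for terabytes) give slightly different figures. The names UNITS and convert are illustrative and do not come from the chapter's sources.

```python
# Decimal (SI) storage units expressed in bytes.
# Assumption: each prefix is 1,000 times the previous one.
UNITS = {
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount of data from one unit in UNITS to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    # One exabyte expressed in gigabytes (a billion).
    print(convert(1, "exabyte", "gigabyte"))
    # One petabyte expressed in gigabytes (a million).
    print(convert(1, "petabyte", "gigabyte"))
    # The 24 petabytes a day attributed to Google, in gigabytes.
    print(convert(24, "petabyte", "gigabyte"))
```

Keeping the units in a single table makes it easy to switch to binary multiples by replacing the powers of 10 with powers of 1,024.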

Figure 5.2 Sheer volume of big data.

HISTORY OF BIG DATA

Big data has been around longer than many imagine, and it has had a long buildup. The term first appeared in an Institute of Electrical and Electronics Engineers publication in 1997 that discussed the challenge of working with blocks of information as large as 100 gigabytes. By the late 2000s, breakthroughs in storage capacity and computing power were leading to euphoric proclamations about the transformational potential of data and analytics in almost every aspect of business and society (http://insights.som.yale.edu/insights/what-is-the-impact-of-big-data).

APPLICATIONS OF BIG DATA

Big data is having a revolutionary impact in a wide range of areas including business, politics, medicine, and nearly every field of natural and social science (Ford, Rise of the Robots). Big data is about predictions: applying math to huge quantities of data in order to infer probabilities, such as the likelihood that an email message is spam or that the typed letters “teh” are supposed to be “the” (Mayer-Schönberger and Cukier, Big Data). Big data can be applied to diagnosing illnesses, recommending treatments, and perhaps even identifying “criminals” before they actually commit a crime (Mayer-Schönberger and Cukier, Big Data).
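
To make the idea of inferring probabilities from data concrete, here is a minimal sketch of a naive Bayes spam estimate in Python. It is an illustration only: the six training messages are invented, the function names are hypothetical, and a production filter would learn from millions of messages and use far richer features.

```python
import math
from collections import Counter

# Hypothetical toy training data, invented for this illustration.
SPAM = ["win money now", "free money offer", "win a free prize"]
HAM = ["meeting at noon", "library hours today", "see you at the meeting"]


def word_counts(messages):
    """Count how often each word appears across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.lower().split())
    return counts


def spam_probability(message, spam=SPAM, ham=HAM):
    """Estimate P(spam | message) with naive Bayes and add-one smoothing."""
    spam_counts, ham_counts = word_counts(spam), word_counts(ham)
    vocabulary = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())

    # Start from the prior: the share of training messages in each class.
    log_spam = math.log(len(spam) / (len(spam) + len(ham)))
    log_ham = math.log(len(ham) / (len(spam) + len(ham)))

    for word in message.lower().split():
        if word not in vocabulary:
            continue  # words never seen in training carry no evidence
        log_spam += math.log((spam_counts[word] + 1) / (spam_total + len(vocabulary)))
        log_ham += math.log((ham_counts[word] + 1) / (ham_total + len(vocabulary)))

    # Turn the two log scores into a probability between 0 and 1.
    return 1 / (1 + math.exp(log_ham - log_spam))


if __name__ == "__main__":
    for text in ("free money", "meeting today"):
        print(text, round(spam_probability(text), 3))
```

The more messages the counts are drawn from, the more reliable the estimated probabilities become, which is why this kind of prediction benefits from very large datasets.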

Through big data, Amazon can recommend the ideal book, Google can rank the most relevant websites, Facebook knows our likes, and LinkedIn divines whom we know (Mayer-Schönberger and Cukier, Big Data). Major retailers use big data to track customer shopping preferences in order to make precisely targeted offers that increase revenue while helping to build customer loyalty. Police departments across the globe are turning to algorithmic analysis to predict the times and locations where crimes are most likely to occur and then deploying their forces accordingly. The City of Chicago’s data portal allows residents to see energy usage, crime, performance metrics for transportation, schools, and healthcare, and the number of potholes patched in a given period of time (Ford, Rise of the Robots).

Big data relies on all the information, or at least as much as possible. It allows us to look at details or explore new analyses without the risk of blurriness (Mayer-Schönberger and Cukier, Big Data). In law, big data is being applied to predict the likely outcome of cases, especially in supreme courts. Credit card companies are using big data to understand and evaluate the risk of default. Law enforcement agencies are using big data to predict where and when crimes might occur and to allocate resources accordingly. CancerLinQ, a health information technology platform aimed at improving the treatment of cancer, is using big data to collect data on the care of hundreds of thousands of cancer patients and to help guide the treatment of other patients across the healthcare system (https://cancerlinq.org/how-it-works).

CHALLENGES AND OPPORTUNITIES FOR BIG DATA

With any new technology, there are challenges and opportunities that need to be addressed. Big data is no exception.

CHALLENGES FOR BIG DATA

Big data is used for analyzing voluminous amounts of data in order to make well-informed decisions. Those decisions can sometimes be based on erroneous data or human error. With big data, algorithms will predict the likelihood that one will get a heart attack (and pay more for health insurance), default on a mortgage (and be denied a loan), or commit a crime (and perhaps get arrested in advance). An innocent person might also be put on a no-fly list and, as a result, lose out on opportunities for business, school, or even personal family matters.

Errors in analysis and prediction: A major challenge for big data arises when users cannot understand the analysis and misinterpret the data. Users should be able to see not just the results but also why they are seeing those results.

Volume and transfer speed: The sheer volume of data is challenging. As the name implies, big data is massive, and transfer speeds have not kept pace with its size.

More robust processing power: Processors have become more powerful, but not at the same rate as the volume of data that they need to process. Therefore, the infrastructure must be developed, but at higher costs. Compressing huge volumes of data and then analyzing them is a tedious process that might ultimately prove to be ineffective (Tole, Big Data Challenges).

Speed of data transfer rate: Transfer rates are limited, but requests are unlimited, so streaming data in real time is a big challenge (Tole, Big Data Challenges).

Relevancy and redundancy: A big data system receives data in all types of unsorted formats. The challenge is to sort through these huge numbers of data files for relevancy and “readability” and then to analyze the data and make accurate, informed decisions.

Data privacy and security: Data privacy and data security are huge concerns with big data. People are fearful about having to share their personal data, especially when data from multiple sources are linked. There are many ways that a user’s location, identity, and affiliations can be tracked. For example, the user’s location can be tracked through cell tower locations. A person’s presence at a political event can suggest their political affiliation. Health-related purchases might reveal a person’s illness or condition, as in the case of the retailer Target, whose very accurate pregnancy prediction tool revealed a teen’s pregnancy to her family based on purchases that she made. Target was able to predict the pregnancy by analyzing about 25 products together and assigning a “pregnancy prediction” score (Duhigg, 2012, http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html).

Pressing challenges: Employees and executives working with big data have listed security, cost, and a lack of technical big data expertise as some of their most pressing concerns, and many executives believe that maintaining the quality of collected data remains a significant challenge (https://www.statista.com/statistics/254266/global-big-data-market-forecast/).

Data quality: Dirty data costs companies in the United States $600 billion every year. Common causes of dirty data that must be addressed include user input errors, duplicate data, and incorrect data linking. In addition to meticulous maintenance and cleaning of data, big data algorithms can themselves be used to help clean data (https://www.qubole.com/resources/big-data-challenges/).
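
Two of the causes of dirty data named above, user input errors and duplicate records, can be illustrated with a small cleaning routine. This Python sketch is a hedged example under simple assumptions: the record layout (a name and an email address), the validity rule (an address must contain "@"), and the function names are all hypothetical.

```python
def normalize(record):
    """Tidy one record: collapse stray spaces, fix name casing, lowercase the email."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def clean(records):
    """Return (cleaned, rejected): unique valid records and records that failed validation."""
    seen_emails = set()
    cleaned, rejected = [], []
    for raw in records:
        record = normalize(raw)
        if "@" not in record["email"]:
            rejected.append(raw)  # likely an input error; set aside for review
            continue
        if record["email"] in seen_emails:
            continue  # duplicate of a record already kept
        seen_emails.add(record["email"])
        cleaned.append(record)
    return cleaned, rejected


if __name__ == "__main__":
    patrons = [
        {"name": " ada  lovelace", "email": "Ada@Example.org "},
        {"name": "Ada Lovelace", "email": "ada@example.org"},
        {"name": "Bob", "email": "bob-at-example.org"},
    ]
    kept, set_aside = clean(patrons)
    print(kept)
    print(set_aside)
```

Normalizing before comparing is the key design choice: the first two sample records differ only in spacing and capitalization, so they are recognized as the same person only after cleanup.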

OPPORTUNITIES FOR BIG DATA

Just as there are many challenges for big data, there are myriad opportunities as well. According to Agrawal et al. in Challenges and Opportunities with Big Data, “A major investment in Big Data, properly directed, can result in major scientific advances, but also lay the foundation for the next generation of advances in science, medicine, and business.”

Increase competitive advantage: Organizations using big data analysis can speed up their processes and reveal patterns that can improve their competitive advantage. They can receive information in real time about their customers and use that information for planning and forecasting.

Forecasting: With predictive analytics, organizations can use big data to forecast where they are heading. Tole writes in Big Data Challenges that “a telecommunications company can use data stored from length of call, average text message sent, average bill amount to see which customers are likely to discard their services.”

Identifying student risk: School districts can analyze big data to identify at-risk students, put programs in place to help them achieve adequate yearly progress, and provide educators with the resources to ensure that students are successful.

New products and services: Entrepreneurs have also capitalized on big data technology to create new products and services (https://www.qubole.com/resources/big-data-challenges/).
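
The telecommunications example quoted from Tole can be sketched as a simple scoring function. The Python code below is an assumption-laden illustration, not Tole's method: it combines the three quoted measures with a logistic function, and the weights, bias, field names, and sample customers are all invented. A real model would fit its weights to historical customer data.

```python
import math

# Hypothetical, hand-picked weights; a real model would learn these from past data.
WEIGHTS = {"avg_call_minutes": -0.30, "avg_texts_per_day": -0.05, "avg_bill": 0.02}
BIAS = 0.5

# Invented sample customers for illustration.
CUSTOMERS = [
    {"id": "A", "avg_call_minutes": 1, "avg_texts_per_day": 2, "avg_bill": 80},
    {"id": "B", "avg_call_minutes": 10, "avg_texts_per_day": 30, "avg_bill": 40},
]


def churn_probability(customer):
    """Combine the usage measures into a probability of leaving (logistic function)."""
    z = BIAS + sum(weight * customer[field] for field, weight in WEIGHTS.items())
    return 1 / (1 + math.exp(-z))


def likely_to_leave(customers, threshold=0.5):
    """Return the ids of customers whose estimated probability meets the threshold."""
    return [c["id"] for c in customers if churn_probability(c) >= threshold]


if __name__ == "__main__":
    for customer in CUSTOMERS:
        print(customer["id"], round(churn_probability(customer), 3))
    print(likely_to_leave(CUSTOMERS))
```

With these invented weights, low usage combined with a high bill raises the score, so customer A is flagged and customer B is not.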

INDUSTRIES IMPACTED BY BIG DATA

Education: Innovators in the education industry are using big data to identify and predict how to improve student learning and to engage students as fully as their screens and smart devices do.

Healthcare: Doctors, hospitals, and other healthcare providers are leveraging big data to analyze and answer a myriad of questions, from which blood pressure range is normal or how much sugar patients should consume each day to far more complex ones. In the future, as more patients use the healthcare functionality of their smartphones and wearable devices, doctors will be able to better understand, analyze, and make more informed healthcare predictions for their patients.

Agriculture: Big data is used in agriculture to predict the weather and to squeeze the maximum productivity out of the land. Its use will continue to grow and become more essential for feeding a growing world population.

Transportation: Transportation companies such as Amtrak and Southwest are leveraging the power of big data to keep their planes, trains, and automobiles running on time and as efficiently as possible.

Finance: The banking industry uses big data to determine how customers use their accounts and to identify potential security risks (http://blog.syncsort.com/2017/02/big-data/big-data-industries-dataanalytics/).

BIG DATA IMPLICATIONS FOR LIBRARIES

As experts at searching, retrieving, and analyzing data, librarians are uniquely suited to work with big data. Librarians possess a unique set of interpersonal and technical skills that can turbo-charge research and development of new datasets and also reduce the amount of startup work involved (Huwe, 2017, p. 12). Mary Ellen Bates writes in “Big Data Ain’t So Big” that “We info pros were using Big Data (which is what value-added online services are) long before most of our colleagues knew what online research even meant.”

Big data creates competitive advantages for organizations, and librarians can make big data visible, accessible, and usable by creating taxonomies, designing metadata, and developing systematic retrieval methods. In addition, librarians can use big data tools to analyze datasets and make them simple, searchable, and usable. Big data can be used in many areas of the information sciences, including data management, curation and archiving, search and retrieval, interdisciplinary research, and the LIS curriculum. Other areas of growth for big data in library and information science include high-intensity performance computing, advanced statistical and computational methods, virtual reality systems, management of data in diverse formats, digital preservation, and curation (African Journal of Library, Archives, and Information Science, Vol. 26, No. 2, October 2016, pp. 93-96). Huwe writes that “prestigious university libraries, including the Universities of California, Michigan, Pittsburgh, and Washington, have launched data management as a core service” (Huwe, 2017, p. 11).

BIG DATA LIBRARY EXAMPLES

Many academic libraries are experimenting with big data. They are examining their own data metrics in order to make key decisions.

UNIVERSITY OF CALIFORNIA BERKELEY LIBRARIES

The University of California Berkeley libraries host several data initiatives on their campus, including the D-Lab, a social sciences-focused program; the Berkeley Institute for Data Science (BIDS); and the California Policy Lab.

UC Berkeley D-Lab

The D-Lab helps social scientists and humanists collect, process, and visualize data. It collaborates with data-intensive social scientists and with partners in industry, the social sector, and government (http://dlab.berkeley.edu).

Berkeley Institute for Data Science (BIDS)

Founded in 2013, the Berkeley Institute for Data Science (BIDS) is a central hub of research and education at UC Berkeley designed to facilitate and nurture data-intensive science. It brings together broad constituents of the data science community, including domain experts from the life, social, and physical sciences and methodological experts from computer science, statistics, and applied mathematics (https://bids.berkeley.edu).

California Policy Lab

Researchers in UC Berkeley’s California Policy Lab were awarded a $1 million grant from the Laura and John Arnold Foundation to produce cutting-edge policy research on issues from education and criminal justice to social services and labor. The lab partners with several government agencies to create a new secure data warehouse that links administrative data at the city, county, and state levels, allowing researchers to do major longitudinal analyses of California’s economic, social service, education, and criminal justice systems (Unlocking Government Administrative Data with New California Policy Lab, Public Affairs, UC Berkeley, 11/29/2016, http://news.berkeley.edu/story_jump/unlocking-governmentadministrative-data-with-new-california-policy-lab/).

NEW YORK UNIVERSITY ELMER HOLMES BOBST LIBRARY

New York University’s Elmer Holmes Bobst Library offers training, support, and consulting expertise through the entire research data life cycle. It provides several tools and services to support quantitative, qualitative, and geographical research at NYU. Its Data Services include access to specialty software packages for statistical analysis, Geographic Information Systems (GIS), and qualitative research support, among others (New York University, http://www.nyu.edu/life/information-technology/research-and-data-support/data-services.html).

HARVARD UNIVERSITY LIBRARY ANALYTICS TOOLKIT

The Library Analytics Toolkit is a dashboard that pulls library data together in a way that allows both librarians and library users to identify and respond to trends and changes in collections, usage, and other data. It enables libraries to understand, analyze, and visualize the patterns of activities, including checkouts, returns, and recent acquisitions, and to do so across multiple libraries (Harvard Library Lab, https://osc.hul.harvard.edu/liblab/projects/library-analytics-toolkit).
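
The kind of roll-up such a dashboard performs, counting activities by type across multiple libraries, can be sketched in a few lines of Python. This is not the toolkit's code: the event records, library names, and function names below are hypothetical and serve only to show the aggregation step.

```python
from collections import Counter, defaultdict

# Hypothetical circulation events as (library, activity) pairs.
EVENTS = [
    ("Main", "checkout"),
    ("Main", "return"),
    ("Main", "checkout"),
    ("Science", "checkout"),
    ("Science", "acquisition"),
]


def summarize(events):
    """Count each activity type separately for every library."""
    summary = defaultdict(Counter)
    for library, activity in events:
        summary[library][activity] += 1
    return {library: dict(counts) for library, counts in summary.items()}


def totals(events):
    """Count each activity type across all libraries combined."""
    return dict(Counter(activity for _, activity in events))


if __name__ == "__main__":
    print(summarize(EVENTS))
    print(totals(EVENTS))
```

Per-library counts support comparisons between branches, while the combined totals show system-wide trends; a dashboard would typically chart both over time.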

MASSACHUSETTS INSTITUTE OF TECHNOLOGY (MIT) LIBRARIES

The Massachusetts Institute of Technology (MIT) Libraries’ Data Management Service emphasizes the library as a partner in organizing and managing data. It helps MIT faculty and researchers manage, store, and share the data they produce. It provides assistance with creating data management plans, individual consultations, and workshops that teach how to manage data more efficiently and share data with others (MIT Libraries, Data Management, https://libraries.mit.edu/data-management/services/).

UNIVERSITY OF MICHIGAN LIBRARY

The University of Michigan Library provides a full array of data services through its Research Data Services and its Deep Blue Data services. It provides a suite of services as well as a repository that will support researchers throughout all phases of the research data lifecycle, which includes planning, creation, organization, sharing, and preservation. The library’s Research Data Services is a network of tools and expertise that “the library is uniquely equipped to provide,” says Elaine Westbrooks, associate university librarian for research.

Research Data Services: Research Data Services is a network of services throughout the Library that assists patrons through all phases of the research data lifecycle. It provides services in the following areas:
• Data Management Planning;
• Discovery and Access;
• Data Organization and Management;
• Metadata and Documentation;
• Data Sharing and Publication;
• Preservation;
• Data Visualization.
(https://www.lib.umich.edu/research-data-services)

Deep Blue Data: Deep Blue Data is an expansion of Deep Blue, the university’s institutional repository, which was established in 2006 and currently holds more than 110,000 deposits. It offers a new platform specialized for datasets that enables University of Michigan researchers to meet data-sharing mandates and achieve their goals of making their research datasets more readily available to colleagues and peers throughout the world.

Volker Sick, professor of mechanical engineering and associate vice president for the Office of Research, has been making extensive use of Deep Blue since 2011. He writes, “Our extensive experimental data are used by researchers worldwide.” (https://record.umich.edu/articles/library-launches-research-dataservices-and-deep-blue-data).

CONCLUSION

Data-driven decision making using big data touches every facet of our lives, and its use continues to increase. Managing terabytes, petabytes, or exabytes of data might have been unheard of five or ten years ago. Today it is becoming common. There are many challenges in managing big data, such as privacy, security, and infrastructure, but there continue to be great opportunities in every area of our society, and we have only scratched the surface. Librarians are excellent at researching, analyzing, and presenting information to support informed decisions. They can use this expertise to help patrons learn what big data can and cannot do. Librarians can help lead the big data movement in their libraries and communities through collaborations with other libraries, departments, communities, schools, and businesses to harness the great potential and power of big data. Big data is everywhere, continues to grow every day, and is here to stay.

QUESTIONS FOR FURTHER DISCUSSION

1. Are you familiar with big data and how you might use it in your library?
2. Are you currently using big data in your library? If so, in what capacity?
3. Where will you obtain the data and how will you manage the data?
4. Who is responsible for the data decision making?
5. Do you have the necessary hardware, software, infrastructure, and bandwidth to support your big data initiatives?
6. Have you investigated privacy issues? How will you protect people’s privacy?
7. Have you investigated security issues? How will you secure the information?
8. Do you have the proper resources, such as personnel and training, to support your big data initiatives?
9. Have you engaged your legal department to discuss the implications of implementing big data?
10. Which schools, universities, departments, agencies, organizations, and others can you partner with for your big data initiatives?

CONSIDERATIONS FOR IMPLEMENTATION

Big data is an emerging technology that is used in almost every part of our daily lives, from predicting which students might be more likely to fail and drop out of school, to what people might be interested in reading based on their past reading experiences, to what people will purchase on Amazon or other online retail websites. There are several considerations that you should address before implementing big data in your library. My suggestions are listed below.

1. Obtain stakeholder buy-in: Research and build your case for how you can use the power of big data in your academic, public, or school library. Be prepared to present your case to your stakeholders, whether they are administration, principals, upper management, a board of directors, or trustees.
2. Know your audience: Do a needs analysis to determine who your audience is for big data. It may include patrons, librarians, teachers, information professionals, and anyone interested in learning about big data or in careers in data analytics.
3. Costs: Research and determine all of your costs for implementing big data in your library. Where will you find the money? Who will pay for these initiatives? Do a cost-benefit analysis and determine what your overall costs will be. Write grants and partner with other organizations.
4. Personnel: Do you have a staff member who is an expert in, or good at, analyzing very large datasets? Do you have anyone on staff who can fill this role, or do you need to consider hiring an additional staff member? Will you need to hire a full-time person or a contractor? What will be the cost?
5. Training: What type of training will you offer staff? Will you use the train-the-trainer model? Will you need to hire a big data expert? How much will this cost? What training can you obtain for free or at a significantly reduced cost?
6. Build strategic relationships: Build strategic partnerships with other libraries, departments, librarians, schools, community colleges, universities, government officials, and others who can assist you with your big data projects.
7. Market programs and resources: How will you market your big data initiatives at your library? How will you use social media and print media to market and advertise your big data initiatives? What are the costs? Can you obtain any services for free?
8. Do the research: Locate as much information on big data as is available and share it with your patrons and colleagues. Do you currently have resources on big data in both print and online formats? Are there departments, libraries, and other organizations that have already implemented big data resources, programming, and content with which you can partner and share materials? Are there materials on big data online that you can obtain?
9. Programming and workshops: What resources, programming, and workshops will you offer on big data?
10. Safety and security: You will need to determine and plan for how secure your data will be and how you will verify that the analyses produced by the person(s) working with the data are accurate.

PROPOSAL

After you have addressed these “Considerations for Implementation,” write them into a proposal and submit it to your stakeholders, legal department, and anyone else who can support and fund this proposal to implement big data in your library.

GLOSSARY

3Vs (volume, variety, and velocity): The three defining properties or dimensions of big data. Volume refers to how much data, variety refers to the various types of data, and velocity refers to how fast the data are processed (WhatIs.com).

Algorithm: A procedure or formula for solving a problem, based on conducting a sequence of specified actions. It is a set of instructions designed to perform a specific task.

API (application programming interface): A defined protocol that allows computer programs to use functionality and data from other software systems.

Big data: The capability to manage a huge volume of disparate data, at the right speed and within the right time frame, to allow real-time analysis and reaction. Big data is typically broken down into the 3Vs (volume, velocity, and variety).

Byte: A basic and physical unit of information in computing and digital communications.

Data mining: The process of exploring and analyzing large amounts of data to find patterns.

Exabytes: An exabyte (EB) is a large unit of computer data storage that is approximately 1 billion gigabytes or 1000 petabytes. 1 exabyte = 2 million personal computers; 5 exabytes = all words spoken by mankind; 15 exabytes = total data held by Google (whatsabyte.com).

Hadoop: Hadoop is designed to parallelize data processing across computing nodes to speed computations and hide latency. Hadoop has two major components: a massively scalable distributed file system that can support petabytes of data and a massively scalable MapReduce engine that computes results in batch.

Open source: A movement in the software industry that makes programs available along with the source code used to create them so that others can inspect and modify how programs work. Changes to source code are shared with the community at large.

Petabytes: A petabyte (PB) is a multiple of the unit byte for digital information. A petabyte encompasses about 1000 terabytes. 20 petabytes = the amount of data processed by Google on a daily basis (whatsabyte.com).

Predictive analytics: A statistical or data-mining solution consisting of algorithms and techniques that can be used on both structured and unstructured data to determine future outcomes. It can be deployed for prediction, optimization, forecasting, simulation, and many other uses.

SOAP (Simple Object Access Protocol): A protocol specification for exchanging data. Along with REST, it is used for storing and retrieving data in the Amazon storage cloud.

Structured data: Data that have a defined length and format. Examples of structured data include numbers, dates, and groups of words and numbers called strings (e.g., a customer’s name, address, telephone number, etc.).

Terabytes (TB): A measure of computer storage capacity that is approximately one trillion bytes. A terabyte is 1,024 gigabytes (GB). The prefix tera is derived from the Greek word for monster (Searchstorage.techtarget.com). 1 terabyte = 3.6 million 300-kilobyte images; 1 terabyte = 300 hours of good-quality video; 1 terabyte = 1000 copies of the Encyclopedia Britannica; 10 terabytes = the printed collection of the entire Library of Congress (whatsabyte.com).

Unstructured data or unstructured information: Data or information that does not follow a specified data format. Unstructured data can be text, video, images, and other content. Unstructured information can be text heavy but may also contain data such as dates, numbers, and facts (Wikipedia).

SUGGESTIONS FOR FURTHER READING

Several technologies for managing big data are described below. I have provided links to additional information about each technology.

MapReduce: Programmers often use this when they are confronted with large amounts of data. It is a tool for mapping and reducing datasets.
https://www.ibm.com/analytics/us/en/technology/hadoop/mapreduce/
https://en.wikipedia.org/wiki/MapReduce
https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

Hadoop: The most popular open source software used for mining and sorting data. It is used by Facebook, Google, and other large companies. It provides massive storage for any kind of data, enormous processing power, and the ability to handle virtually limitless concurrent tasks or jobs. It is an open source, Java-based programming framework that supports the processing and storage of large datasets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
https://hadoop.apache.org
http://searchcloudcomputing.techtarget.com/definition/Hadoop
https://www.sas.com/en_us/insights/big-data/hadoop.html

IBM DB2: A fast and solid data manipulation system. DB2 is a database product from IBM. It is a Relational Database Management System (RDBMS) designed to store, analyze, and retrieve data efficiently. DB2 has been extended with support for object-oriented features and for nonrelational structures with XML.
https://www.ibm.com/analytics/us/en/db2/
https://www-03.ibm.com/systems/power/software/i/db2/
https://www.tutorialspoint.com/db2/db2_introduction.htm

Oracle: Oracle provides a complete, top-to-bottom solution for managing large amounts of data, based on NoSQL (Not only SQL). Its approach follows the sequence ACQUIRE → ORGANIZE → ANALYZE → DECIDE (Oracle Big Data Strategy Guide).
http://www.oracle.com/us/technologies/big-data/big-data-strategy-guide-1536569.pdf
https://www.oracle.com/big-data/index.html
http://www.oracle.com/technetwork/topics/entarch/articles/oea-big-data-guide-1522052.pdf

SAS Viya: Provides a high-performance analytics solution oriented toward software that helps companies benefit from the data that they have stored.
https://www.sas.com/en_us/insights/articles/business-intelligence/a-brave-new-world-of-analytics.html
https://www.sas.com/en_us/insights/big-data.html

IFLA Big Data SIG: https://www.ifla.org/about-big-data

IFLA Trend Report, Big Data: https://trends.ifla.org/literature-review/big-data

Big Data TED Talks: https://www.ted.com/search?q=big+data
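
The MapReduce idea listed above, mapping records to key-value pairs and then reducing the values for each key, can be shown with the classic word-count example. The Python sketch below runs in a single process and only imitates the programming model; it does not use Hadoop or any of its APIs, and the function names are illustrative. In a real cluster, the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in a document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a collection of documents."""
    pairs = (pair for document in documents for pair in map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big libraries"]))
```

Because each document is mapped independently and each key is reduced independently, the work can be split across machines without the steps needing to coordinate, which is what makes the model suitable for very large datasets.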

BIBLIOGRAPHY

Agrawal, D., Bernstein, P., Bertino, E., Davidson, S., Dayal, U., Franklin, M., et al., 2012. Challenges and Opportunities with Big Data: A white paper prepared for the Computing Community Consortium committee of the Computing Research Association. http://cra.org/ccc/resources/ccc-led-whitepapers/. http://cra.org/ccc/wp-content/uploads/sites/2/2015/05/bigdatawhitepaper.pdf.

Duhigg, C., 2012. How companies learn your secrets. The New York Times. Available from: http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html.

Ford, M., 2015. Rise of the Robots: Technology and the Threat of a Jobless Future. Basic Books, Perseus Books, Philadelphia, Pennsylvania.

Huwe, T.K., May/June 2017. Librarians and Data: Curator, Creator, or Both? Onlinesearcher.net.

Laney, D., February 2001. 3D Data Management: Controlling Data Volume, Velocity, and Variety. META Group Research. Available from: http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-andVariety.pdf.

Mayer-Schönberger, V., Cukier, K., 2013. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt, Boston.

Mutula, S., October 2016. Big data industry: implication for the library and information sciences. Afr. J. Library Arch. Inf. Sci. 26 (2), 93-96.

Pentland, A., 2014. Social Physics: How Good Ideas Spread: The Lessons from a New Science. The Penguin Press, New York.

“The Big Data Conundrum: How to Define It?” MIT Technology Review. October 3, 2013. https://www.technologyreview.com/s/519851/the-big-data-conundrum-howto-define-it/.

Tole, A.A., 2013. Big data challenges. Database Syst. J. IV (3). http://www.dbjournal.ro/archive/13/13_4.pdf.

Winslow, R., 2013. ‘Big Data’ for cancer cure. Wall Street J. Available from: https://www.wsj.com/articles/SB10001424127887323466204578384732911187000.