Social media-based sleeping beauties: Defining, identifying and features


Journal of Informetrics 14 (2020) 101012

Contents lists available at ScienceDirect

Journal of Informetrics journal homepage: www.elsevier.com/locate/joi

Regular article

Social media-based sleeping beauties: Defining, identifying and features

Jianhua Hou a,∗, Xiucai Yang b

a Sun Yat-sen University, Guangzhou University, Huandong Road, No. 132 Waihuan East Rd., Guangzhou University City, Guangzhou, 510006, China
b School of Economics and Management, Dalian University, Dalian 116622 Dalian Economic Technological Development Zone, Dalian 116622, China

Article info

Article history: Received 21 August 2019; received in revised form 14 January 2020; accepted 15 January 2020

Keywords: Social media–based sleeping beauties; Citation-based sleeping beauties; Identification; Awakening mode

Abstract

The sleeping beauties in science signify a unique knowledge diffusion trajectory created by citations after the publication of scientific literature. However, in social media, scientific knowledge creates a new diffusion trajectory through social media metrics such as View, Save, Discussed, and Recommended. This study aims to define social media–based sleeping beauties (S-SB) in science by using social media metrics, in contrast to the sleeping beauties identified through citations, which we term citation-based sleeping beauties (C-SB). We constructed a quantitative method to identify S-SB and conducted an empirical study of 4019 articles of all types published in PLOS Biology. Comparison of the S-SB and C-SB results revealed that, from the perspective of social media metrics, C-SB literature takes the form of the S-total elements early gradual awakening type, the S-total elements delay gradual awakening type, and the S-early sudden awakening type. Moreover, the awakening time of C-SB literature under the action of social media metrics was found to be 4–5 years earlier than that under the action of citation-based indicators. Both C-SB and S-SB included a significant share of “Editorial Material,” establishing that “Editorial Material” type literature is noteworthy in promoting the diffusion of scientific knowledge. Overall, this study extends the perspective of sleeping beauties in science. © 2020 Elsevier Ltd. All rights reserved.

∗ Corresponding author. E-mail address: [email protected] (J. Hou). https://doi.org/10.1016/j.joi.2020.101012 1751-1577/© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

Several factors, such as the novelty, application value, and influence of a paper, determine the trajectory of knowledge diffusion through citations. In the academic community, the citation network created by citing papers primarily determines the diffusion trajectory of scientific literature. However, on social media platforms, the influence and application value of the paper mainly determine the diffusion trajectory of scientific literature. Hence, compared with the diffusion trajectory through the citation network, the diffusion trajectory of scientific literature has markedly changed on social media platforms, becoming even more complicated.

From the 1960s to the 1980s, researchers revealed that some scientific papers were rarely cited at the beginning of publication but were suddenly cited extensively after a period; this phenomenon was termed “existed discovery” (Barber, 1961; Cole, 1976), “delayed recognition” (Garfield, 1980), and “premature discovery” (Stent, 1972). Based on the citation diffusion trajectory after the publication of scientific literature, van Raan (2004) proposed the concept of Sleeping Beauty in Science to describe scientific papers that did not gain attention (citation) for some time after publication but suddenly garnered attention (citation) later on. However, this “Sleeping Beauty in Science” phenomenon is also noted in the diffusion trajectory on social media platforms after the publication of scientific literature.

Since the concept of sleeping beauty was proposed, studies have investigated the “Sleeping Beauty in Science” phenomenon in various fields, revealing that “sleeping beauties” exist extensively in scientific research. Meanwhile, some studies also highlighted that some papers exhibited “sleeping-awakening” characteristics different from those of sleeping beauties. Among those different types of citation diffusion trajectories, typical citation tracks included “citation classics” or “sticky knowledge claims” and “transient knowledge claims” (Baumgartner & Leydesdorff, 2014); “hits” (Lange, 2005); “shooting stars” (Minger, 2007); “Smart Girl” or “flash in the pan” (Li, 2014; van Dalen & Henkens, 2005; Ye & Bornmann, 2018); “black–white swans” (Zeng, Qi, Li, Stanley, & Fred, 2017); “swan-groups” (Zhang, Zuccala, & Fred, 2019); “all-elements-sleeping-beauties” (Li & Ye, 2014); and “zero-cited documents” (Egghe, Guns, & Rousseau, 2011; Van Noorden, 2017).

Previous studies investigating the special types of scientific literature mentioned above mainly identified “Sleeping Beauties,” “Smart Girl,” “Flash in the Pan,” and other types of literature in different research fields based on (a) the change law of the citation diffusion trajectory after the literature publication; (b) whether to set specific threshold parameters; and (c) quantification. Moreover, the studies investigated the reasons, rules, and mechanisms of the awakening of sleeping beauties literature. The research methods and processes were based on the citation diffusion trajectory of the literature; we term these sleeping beauties in science citation-based sleeping beauties (C-SB).
Besides the diffusion trajectory of scientific knowledge based on the citation network, scientific knowledge can also be spread through other forms of scientific exchange, such as the “invisible college,” academic conference reports, literature reading, and curriculum learning. However, quantitatively exploring these trajectories of knowledge diffusion with statistical data remains challenging. Nevertheless, with the emergence of social media, the forms of scientific communication and knowledge diffusion are becoming increasingly diversified. In addition, the trajectory of scientific knowledge diffusion based on social media can be examined statistically and quantitatively using the data records of social media metrics. The impact of a published paper includes not only the academic impact based on citations but also the social media impact captured by social media metrics or Altmetrics (Bornmann, Haunschild, & Adams, 2019). The diffusion trajectory of scientific literature on social media platforms is significant in revealing the evolution mechanism of scientific knowledge.

Amid the recent and rapid advancement of social media, several studies have investigated the relevance of Altmetrics indicators and citation indicators, reporting the role of social media metrics in the impact assessment of papers or scholars. Thus, in the study of literature influence, examining only citation-based academic influence is not comprehensive; it is essential to combine academic influence with social influence, as observed on social media platforms, to illustrate the influence of literature comprehensively. Accordingly, the concept, identification method, and evolution mechanism of sleeping beauties in science warrant reexamination from the social media perspective.
We term the sleeping beauties formed by the social media metrics of scientific literature on social media platforms social media–based sleeping beauties (S-SB). Prior studies on the identification of sleeping beauties in science primarily considered two crucial types of scientific literature, “Article” and “Review.” Later, the “Editorial Material” type of literature was also noted to play a crucial role in the diffusion of scientific knowledge (Rousseau, 2009; van Leeuwen, Costas, Calero-Medina, & Visser, 2013). Numerous academic journals, such as Science, Proceedings of the National Academy of Sciences of the United States of America (PNAS), and PLOS Biology, have published a large amount of “Editorial Material” type literature (van Leeuwen et al., 2013), creating significant research interest in whether S-SB also exists in the “Editorial Material” type literature.

Based on the questions and research interests mentioned above, this study primarily addresses the following questions:

(1) How do we define S-SB from the perspective of the social media platform? What are the criteria for their identification? How can S-SB be identified precisely by quantitative methods?
(2) What are the new characteristics of S-SB compared with C-SB? In particular, what is the difference between the “sleeping-awakening” mode of S-SB and that of C-SB?
(3) What are the different characteristics of S-SB in different document types? Is the “Editorial Material” type of S-SB literature worth exploring?

2. Literature review

Traditionally, the diffusion trajectory of citations forms the basis of research on the diffusion of scientific literature. Various citation diffusion trajectories reveal the process of knowledge diffusion after the publication of scientific literature.
Usually, scientific papers are cited by other papers in the years after publication and reach their citation peak gradually, followed by gradually declining citations until they are forgotten; such a citation trajectory is called the classical (or normal) citation trajectory. However, several special types of citation trajectories are involved in the process of scientific development and are prominent in the evolution of scientific knowledge. Among them, “delayed recognition” is a specific subgroup among “citation classics” papers because the lasting impact is not combined with (substantial) citation impact shortly after publication (Ye & Bornmann, 2018). The “delayed recognition” type scientific papers are significant for major scientific discoveries and technological inventions and for hastening the advancement of science and technology; such papers are normally indispensable to science (Hu & Wu, 2014; Wang, Ma, Chen, & Rao, 2012) and reveal the mechanisms of scientific information flow through citations (Braun, Glänzel, & Schubert, 2010). However, an excessive presence of scientific sleeping beauties may also cause idleness and waste of knowledge (Wang et al., 2012).

Since the 1960s, studies have focused on the “delayed recognition” phenomenon of scientific papers (Barber, 1961), which is also known as “premature discovery” (Stent, 1972). After the 1980s, some studies suggested basic methods to identify the “delayed recognition” literature (Garfield, 1980, 1989a, 1989b, 1990). Later, some studies presented several quantitative identification methods and empirical studies of “delayed recognition” (Glänzel, Schlemmer, & Thijs, 2003; Glänzel & Garfield, 2004; Lachance & Larivière, 2014).

Fig. 1. The distribution of publication time of papers based on the quantitative identification method of sleeping beauty in science (partially).

2.1. Previous studies of sleeping beauties in science

In 2004, van Raan defined the citation trajectory of “delayed recognition” as “Sleeping Beauty in Science” and proposed the criteria and parameters to identify the Sleeping Beauty in Science. Since then, Sleeping Beauty in Science has garnered considerable attention from the academic world. For example, “all-elements-sleeping-beauties in science” (Li & Ye, 2014) and “Sleeping Beauty in Patent” (Hou & Yang, 2019) were proposed subsequently. Previous studies on sleeping beauties mostly cover the following aspects: (a) exploring the identification methods of the sleeping beauty literature; (b) examining the identification of the Prince's literature; (c) identifying the scientific sleeping beauty literature in different fields; and (d) the causes and characteristics of the sleeping beauty literature and the Prince's literature. van Raan (2004) proposed a reference standard and measurement indexes to determine the scientific sleeping beauty literature, which became the reference standard for identifying the scientific sleeping beauty literature in most follow-up studies.
Then, Li and Ye (2016) proposed the criteria for distinguishing sleeping beauties. In the latest work, Hu and Rousseau (2019) proposed that under-cited influential articles could experience delayed recognition or be sleeping beauties. On the other hand, quantitative recognition methods without parameters received considerable attention in the study of recognizing scientific sleeping beauties: quartile-based criteria (Costas, Leeuwen, & Raan, 2010); “all-elements-sleeping-beauties” (Li & Ye, 2014); “heartbeat spectrum” (Li, Shi, Zhao, & Ye, 2014); “beauty coefficient” (denoted as B; Ke, Ferrara, Radicchi, & Flammini, 2015); Gs index (Sun, Min, & Li, 2016); K value indicator (Teixeira, Vieira, & Abreu, 2017); dynamic citation angle β (Ye & Bornmann, 2018); dynamically normalized citation impact scores (Bornmann, Ye, & Ye, 2018); and “beauty coefficient percentage” (Bcp index; Du & Wu, 2018; Fig. 1). In addition, studies have identified sleeping beauties in several disciplines and study fields like psychology (Ho & Hartley, 2017), medical and biological engineering (Chhapola, Tiwari, Deepthi, & Kanwal, 2018; Huang, Hsu, & Ciou, 2015; Ohba & Nakao, 2012), philosophy of science (Comins & Leydesdorff, 2016), physics, chemistry, engineering, and computer science (Dey, Roy, Chakraborty, & Ghosh, 2017; van Raan, 2015, 2017), besides studying Nobel laureates’ papers in physics (Li, 2014) and meme diffusion (Zhang, Xu, & Zhao, 2017). Currently, the existing research on sleeping beauties in science shares a common feature: all studies used the citation-based diffusion trajectory of literature influence to identify sleeping beauties in science. However, with the swift rise of social media platforms, the diffusion trajectories of scientific papers based on the citation indexes have been supplemented markedly.
The diffusion trajectories of scientific papers include not only those based on the citation indexes but also those based on social media indexes, including View, Save, Discussed, and Recommended.

2.2. The influence of scientific literature based on Altmetrics

Altmetrics constitutes the measurement indicators used to measure the diffusion trajectory of social influence in scientific literature (Priem et al., 2010). In addition, subsequent studies have reported different definitions of Altmetrics (Bornmann et al., 2019; Erdt, Nagarajan, Sin, & Theng, 2016). Whether Altmetrics indicators can measure the impact of scientific literature remains debatable to date (Adie, 2014; Bornmann, 2015a, 2014; Das & Mishra, 2014; Zahedi, Costas, & Wouters, 2014). Of note, general consensus exists that academic impact is reflected in citation analysis, whereas “measuring societal impact is problematic” (Moed, 2017). Altmetrics primarily reflects “attention” rather than influence or impact (De Winter, 2015; Haustein, Peters, Sugimoto, Thelwall, & Larivière, 2014; Moed, 2017). Reportedly, the correlation between some Altmetrics and citations is weak (Costas, Zahedi, & Wouters, 2015; Waltman & Costas, 2014). For example, Haustein et al. (2014) analyzed 1,431,576 biomedical papers from 2010 to 2012 and reported a weak correlation between the number of tweets and citations, indicating that the number of tweets did not necessarily denote the true impact of the study. Likewise, De Winter (2015) concurred with Haustein et al. (2014).

However, most studies proposed that Altmetrics can measure the impact of the scientific literature because people who participate in online academic discussions tend to be well educated (Forkosh-Baruch & Hershkovitz, 2012; Mohammadi, Thelwall, Haustein, & Larivière, 2015; Puschmann & Mahrt, 2012). Using empirical analysis, some studies validated the relevance of different Altmetrics indicators and citation indicators (Bornmann, 2015a, 2015b; Bornmann & Haunschild, 2018; Costas et al., 2015; De Winter, 2015; Eysenbach, 2011; Ebrahimy, Mehrad, Setareh, & Hosseinchari, 2016; Mohammadi et al., 2015; Sud & Thelwall, 2014; Thelwall, Haustein, et al., 2013). Moreover, studies have reported that Altmetrics indicators (save, discussion, download, read in Mendeley [save], number of readers in Mendeley, recommendation measures, the number of tweets, F1000, and bookmarks) strongly correlate with the citation indicators. The F1000 studies by Bornmann and Leydesdorff (2013), Shema, Bar-Ilan, and Thelwall (2014), and Wardle (2010) reported a high correlation between citations of recommended articles in blogs and Altmetrics. Subsequently, Bornmann and Haunschild (2018) further examined the correlation between Altmetrics and the quality of scientific papers using F1000Prime, revealing that citation-based metrics and readership counts correlated more strongly with quality than tweets did.
Zahedi, Costas, and Wouters (2014a, 2014b) reported a correlation of r = 0.49 between the number of Mendeley readers (saves) and the citation index, and the literature with Mendeley readership had a higher citation rate than that without Mendeley readership. Moreover, Ebrahimy et al. (2016) discussed the mediating role of save, discussion, and recommendation measures in the correlation between visibility and citation in 2009–2013 biomedical papers, revealing that these indicators exerted a significant impact on the number of future citations. Based on 45 journals in the field of physics from 2004 to 2008, Haustein and Siebenlist (2011) established that the correlation between the saving and citation of articles was r = 0.215. Mazarei (2013) and Bar-Ilan, Shema, and Thelwall (2013) ascertained that bookmarks on the Web in the field of information science and Mendeley bookmarks positively correlated with citation indicators. Furthermore, Shuai, Pepe, and Bollen (2012), Eysenbach (2011), Bornmann (2015b), and Bornmann (2014) reported a significantly strong correlation between microblog counts and citations. Having examined the correlation between Altmetrics and citation indicators, researchers started assessing literature or scholars based on Altmetrics (Adie, 2014; Bornmann, 2013, 2016; Das & Mishra, 2014; Derrick & Samuel, 2016).

2.3. The influence of document types on the diffusion trajectory of scientific literature

During the diffusion of scientific literature, document types other than “Article” and “Review” also affect the diffusion of scientific knowledge. The “Editorial Material” type is one that is crucial to this diffusion (Ding, Ahlgren, Yang, & Yue, 2016; Donner, 2017; Garfield, 1987; van Leeuwen et al., 2013). Within the Web of Science Core Collection, the “Editorial Material” document type denotes an article that provides the opinions of a person, group, or organization, including editorials, interviews, commentary, discussions between individuals, post-paper discussions, round table symposia, and clinical conferences (van Leeuwen et al., 2013).

Research on Editorial Material began with Garfield's (1987) study of highly cited journals in the medical field. To investigate the composition of Editorial Material, Rousseau (2009) proposed an identification algorithm for Editorial Material and examined the characteristics of Editorial Material in highly cited medical journals. Later, van Leeuwen et al. (2013) reported that some Editorial Materials played a crucial role as citation subjects. Likewise, Ding et al. (2016) reported that during 1999–2014, the number of “Article” type documents in Nature and Science declined, whereas the number of Editorial Material type documents increased. Meanwhile, Ding et al. (2016) highlighted some errors in the classification of document types in the Web of Science, which significantly affected the accuracy of the analysis. Then, Donner (2017) validated the match between the document type data in the Web of Science and the document type data on the official website of the journal, reporting that 94 % of the selected 791 scientific samples were correct.
Meanwhile, Haustein, Costas, and Larivière (2015) explored the mutual influence of document types and citation patterns and revealed that Editorial Material plays a crucial role in the diffusion path on social media. In particular, studies analyzed the types of documents frequently used in social media and the impact of different document types on their citations (Braun, Glänzel, & Schubert, 1989; Campanario, Carretero, Marangon, Molina, & Ros, 2011; Ding et al., 2016; Frandsen, 2008; Haustein et al., 2015; Iefremova, Wais, & Kozak, 2018; Rousseau, 2009; Sigogneau, 2000; van Leeuwen, Moed, & Reedijk, 1999; van Leeuwen et al., 2013). Thus, in the study of sleeping beauties in scientific literature, we should not only identify sleeping beauties in science of the “Article” and “Review” types but also focus on their identification and characteristics in the Editorial Material type document, which is a beneficial supplement to the existing study of sleeping beauties in science.

In the following section, we define S-SB in science through social media metrics. Furthermore, we propose a quantitative method to identify S-SB. Based on the analysis of all types of papers in PLOS Biology, we perform a comparative analysis of S-SB and C-SB in the research results, revealing the new features of S-SB and the “sleeping-awakening” mode of S-SB compared with C-SB.

Table 1. Social media metrics and data sources of literature in PLOS Biology used in this paper.

Indicator name    Data source
View              PLOS, Figshare, PubMed Central
Save              CiteULike, Mendeley, ORCID, and so on
Discussed         Nature Blogs, Science Seeker, Research Blogging, Wordpress.com, Twitter, Facebook, Reddit, news media, blogs, reference material, and institutions
Recommended       F1000Prime

Fig. 2. Annual distribution of the number of papers published in PLOS Biology (2003–2018).

3. Data and methods

3.1. Data collection and processing

The data were retrieved from the PLOS Biology Open Access Platform and the Web of Science Core Collection database, and we analyzed the literature of the journal PLOS Biology. The academic influence indexes (citation indicators) of the literature were based on the citation information of articles published annually in PLOS Biology and included in the Web of Science Core Collection. In the Web of Science Core Collection, we searched for the journal “PLOS Biology” and selected all of its literature from 2003 to 2018. In addition, the number of citations obtained annually by each document was extracted by creating the citation report of the search results.

The social media impact of literature (social media metrics) primarily included the View, Save, Discussed, and Recommended indicators (Moed, 2017; Haustein et al., 2015; Sugimoto, Work, Larivière, & Haustein, 2017), which were derived from the open access data of the PLOS Biology website. The View indicator was the sum of page views and downloads on PLOS, Figshare, and PubMed Central. The Save indicator was the number of times a document had been saved in document managers like CiteULike, Mendeley, and ORCID. The Discussed indicator was based on the following: (a) the number of times the literature had been shared on blogs (Nature Blogs, Science Seeker, Research Blogging, and Wordpress.com); (b) the number of times the paper was shared on social media (e.g., Twitter, Facebook, and Reddit); (c) the number of “likes,” “shares,” “comments,” and so on; and (d) the number of online discussions, such as general article coverage (news media, blogs, reference material, and institutions), online encyclopedias, journal comments, and the like.
The Recommended indicator was provided by the PLOS Publishing Group through platforms (such as the online recommendation channel) and source data on the official recognition of PLOS research papers (such as the F1000 Prime platform; Table 1).

To collect data, we used “jsoup” to examine the uniform resource locator (URL) of each article on the PLOS Biology website and obtained the specific location of each tag. Notably, jsoup is a Java HTML (hypertext markup language) parser that can directly parse a URL address and HTML text content. In this process, we opened each article on the website sequentially in a loop to obtain the JavaScript Object Notation (JSON) response. After getting all URLs and the specific location of each tag, we assembled the data fields we wanted from the JSON. First, the data were grouped by the three fields “name,” “display name,” and “group name.” We defined “group name” based on the indicator name in Table 1 and characterized “name” and “display name” according to the data sources listed in Table 1. Then, for each article, identified by its Digital Object Identifier (DOI), we extracted the data of the “by month” field under the different groups in these three fields, including the following data: “year,” “month,” “PDF,” “HTML,” “readers,” “comments,” “likes,” and “total.” Afterwards, the data extracted from the JSON were transformed into an Excel table. In this way, we extracted the relevant data of each article in PLOS Biology and collected a total of >2 million pieces of data through the DOI of the literature. Next, we cleaned the data obtained and removed irrelevant data like Withdrawal, Correction, Letter, and some Editorial Material. Finally, over 2 million related data points of the remaining 4019 documents were used as the target data of this study (Figs. 2–4), and these target data were classified and calculated with Excel and MATLAB 2018b.
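The grouping and monthly extraction described above can be sketched in Python (rather than the Java/jsoup pipeline the authors used). This is only an illustration under stated assumptions: the JSON layout, the key names `group_name`, `display_name`, and `by_month`, the sample values, and the placeholder DOI are all hypothetical, since the text does not give the actual response format.

```python
import json

# Hypothetical sample of one article's metrics response; the real schema may differ.
SAMPLE = """
[
  {"name": "mendeley", "display_name": "Mendeley", "group_name": "saved",
   "by_month": [{"year": 2015, "month": 1, "readers": 3, "total": 3},
                {"year": 2015, "month": 2, "readers": 5, "total": 5}]},
  {"name": "twitter", "display_name": "Twitter", "group_name": "discussed",
   "by_month": [{"year": 2015, "month": 1, "comments": 2, "total": 2}]}
]
"""


def flatten_monthly(doi, sources):
    """Turn the nested per-source records into one flat row per source and month."""
    rows = []
    for source in sources:
        for entry in source.get("by_month", []):
            rows.append({
                "doi": doi,
                "group_name": source["group_name"],
                "name": source["name"],
                "year": entry["year"],
                "month": entry["month"],
                "total": entry.get("total", 0),
            })
    return rows


def monthly_totals(rows):
    """Sum the 'total' field per (indicator group, year, month)."""
    totals = {}
    for row in rows:
        key = (row["group_name"], row["year"], row["month"])
        totals[key] = totals.get(key, 0) + row["total"]
    return totals


# Placeholder DOI, used only to label the rows.
rows = flatten_monthly("10.1371/journal.pbio.0000000", json.loads(SAMPLE))
totals = monthly_totals(rows)
print(len(rows), totals[("saved", 2015, 2)])
```

Flat rows of this kind can be written to a spreadsheet directly, which mirrors the JSON-to-Excel step in the text.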


Fig. 3. Annual distribution of the number of Discussed and Recommended obtained by papers published in PLOS Biology (2003–2018). (Note: The number of Y-axis of each node is the number of discussed or recommended obtained by papers published in the year of X-axis from their publication to March 2019).

Fig. 4. Annual distribution of the number of Viewed and Saved obtained by papers published in PLOS Biology (2003–2018). (Note: The number of Y-axis of each node is the number of Viewed and Saved obtained by papers published in the year of X-axis from their publication to March 2019).

3.2. Concept and metrics

In this study, we identified sleeping beauties on social media from a social media metrics perspective. To facilitate the research, it is essential to redefine some concepts.

3.2.1. Citation-based sleeping beauties

Sleeping beauties in science identified through the academic (citation) diffusion trajectory of scientific literature are called citation-based sleeping beauties (C-SB). Among them, the “all-elements sleeping beauties in scientific literature” are called citation-based all-elements sleeping beauties (Ca-SB).

3.2.2. Social media-based sleeping beauties

Sleeping beauties in science identified through the diffusion trajectory of scientific literature measured by social media metrics are called social media-based sleeping beauties (S-SB). Among them, those following the social media diffusion trajectory of “all-elements sleeping beauties in scientific literature” are termed social media–based all-elements sleeping beauties (Sa-SB).

3.2.3. Social media influence index

The social media influence index is the function value of the community effect of social media influence generated monthly after the publication of the literature. We used View (V), Save (S), Discussed (D), and Recommended (R) to describe changes in the social media diffusion trajectory of a published paper.


Thus, after the publication of a paper, the social media influence index (IS index) in month i is as follows:

$$IS_i = Wt_{v-s} \cdot V_i + Wt_{s-s} \cdot S_i + Wt_{d-s} \cdot D_i + Wt_{r-s} \cdot R_i$$

where $Wt_{v-s}$, $Wt_{s-s}$, $Wt_{d-s}$, and $Wt_{r-s}$ are the corresponding weights of the V, S, D, and R indicators, respectively; i denotes time, that is, the i-th month after the document's publication; $IS_i$ is the social media influence in the i-th month after the paper's publication; $V_i$ denotes the amount of View in the i-th month after the paper's publication; $S_i$ denotes the amount of Save in the i-th month after the paper's publication; $D_i$ denotes the volume of Discussed in the i-th month after the paper's publication; and $R_i$ denotes the amount of Recommended in the i-th month after the paper's publication.

To determine $Wt_{v-s}$, $Wt_{s-s}$, $Wt_{d-s}$, and $Wt_{r-s}$, based on the analytic hierarchy process, we constructed a structure matrix according to the degree of influence of the four types of indicators on the social media diffusion trajectory of a document and assigned different weight values to each index. The basic steps of the analytic hierarchy process are as follows: to compare the influence of n factors (X1, X2, ..., Xn) at a certain level on one factor (influence) at the next level, we compare the contribution (or importance) of each pair Xi and Xj to that factor and assign Xi/Xj on a scale of 1–9 (Hou & Yang, 2019). The pairwise comparisons used to determine the weights $Wt_{v-s}$, $Wt_{s-s}$, $Wt_{d-s}$, and $Wt_{r-s}$ are:

Xij    Vi     Sa     Re     Di
Vi     1      1/2    1/7    1/7
Sa     2      1      1/5    1/5
Re     7      5      1      1
Di     7      5      1      1

$$N = \begin{bmatrix} 1 & 1/2 & 1/7 & 1/7 \\ 2 & 1 & 1/5 & 1/5 \\ 7 & 5 & 1 & 1 \\ 7 & 5 & 1 & 1 \end{bmatrix}$$

We obtained the maximum eigenvalue and the eigenvector corresponding to it:

$$\lambda_{max} = 4.0159, \quad (0.0912,\ 0.1528,\ 0.6958,\ 0.6958)^T$$

We tested the results for consistency, where RI is the random consistency index (when n = 4, RI = 0.9):

$$CI = \frac{\lambda_{max} - n}{n - 1} = \frac{4.0159 - 4}{4 - 1} = 0.0053$$

$$CR = \frac{CI}{RI} = \frac{0.0053}{0.9} = 0.0059$$
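The consistency check above can be reproduced numerically. The following is a minimal sketch, not the authors' MATLAB code: it estimates the principal eigenvalue of N by power iteration in plain Python, and the function names and iteration count are illustrative choices. The vector it returns is scaled to sum to 1, so it differs from the reported eigenvector only by a scale factor.

```python
# Pairwise comparison matrix N from the text (row/column order: Vi, Sa, Re, Di).
N = [
    [1, 1 / 2, 1 / 7, 1 / 7],
    [2, 1, 1 / 5, 1 / 5],
    [7, 5, 1, 1],
    [7, 5, 1, 1],
]
RI_N4 = 0.9  # random consistency index for n = 4, as stated in the text


def principal_eigen(matrix, iterations=100):
    """Power iteration; returns (lambda_max, eigenvector scaled to sum to 1)."""
    n = len(matrix)
    weights = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        lam = sum(product)  # weights sum to 1, so sum(N w) estimates lambda_max
        weights = [x / lam for x in product]
    return lam, weights


def consistency(lam, n, random_index):
    """Return (CI, CR) for an n x n comparison matrix."""
    ci = (lam - n) / (n - 1)
    return ci, ci / random_index


def influence_index(view, save, discussed, recommended, w):
    """IS for one month; w follows the matrix order (View, Save, Recommended, Discussed)."""
    return w[0] * view + w[1] * save + w[3] * discussed + w[2] * recommended


lam_max, weights = principal_eigen(N)
ci, cr = consistency(lam_max, len(N), RI_N4)
print(round(lam_max, 4), round(ci, 4), round(cr, 4))
print([round(w, 4) for w in weights])
```

Running the sketch should give values close to those reported in the text; a CR below the conventional 0.1 threshold indicates an acceptably consistent matrix.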

Hence, the matrix passed the consistency test (CR < 0.1). Then, we normalized the eigenvector $(0.0912,\ 0.1528,\ 0.6958,\ 0.6958)^T$ to get the weight vector $(0.0558,\ 0.0934,\ 0.4254,\ 0.4254)^T$, namely, $Wt_{v-s} = 0.0558$, $Wt_{s-s} = 0.0934$, $Wt_{d-s} = 0.4254$, and $Wt_{r-s} = 0.4254$. Hence, the social media influence index (IS) can be expressed as follows:

$$IS_i = 0.0558 V_i + 0.0934 S_i + 0.4254 D_i + 0.4254 R_i$$

3.3. Redefining the “sleeping and awakening” of sleeping beauties on social media

In studies of delayed recognition and sleeping beauties in science, the definitions of sleeping and awakening were similar among different researchers. Based on the definitions of the sleeping period and awakening period of sleeping beauties in science (Garfield, 1989a, 1989b, 1990; Glänzel et al., 2003; Glänzel & Garfield, 2004; Ke et al., 2015; van Raan, 2004), researchers generally believed that the characteristics of sleeping beauties (delayed recognition) literature were as follows: (a) sleeping period: an average of one to two citations per year over 3–5 years; (b) awakening period: during a certain period (>4 years) after the sleeping period, the paper was cited widely (cited >20 times). Whether a paper is in the sleeping or awakening period is judged from its state in a given year and the duration of this state. If a paper remains in a certain state (awake or sleeping) for a long time, it is considered to be in this state during this period. From the perspective of citation-based sleeping beauties, researchers generally believed that literature that awakened after >4 years of sleep should be called a sleeping beauty (delayed recognition).

The time unit used in C-SB research is the year. However, annual statistics introduce a large time gap between documents published in January and those published in December of the same year, because many documents are published on a monthly cycle.
If a document published in January revives after 4-year sleep, its sleeping period will be ≥47 months, whereas if the paper is published in December, the period of awakening will be ≥36 months after 4-year sleep. However, literature diffusion on social media platforms is faster, and an influence diffusion trajectory compiled monthly reflects the dynamic diffusion process of literature on these platforms more precisely. Thus, under the comprehensive diffusion trajectory, we took Th = 36 months as the standard; that is, when a document wakes up after a sleep period of ≥36 months, we considered it an S-SB (delayed recognition). In this study, we assessed the state of a document at different times by how long that state persisted. However, unlike the identification of sleeping beauties in science based on citation indicators, we introduced social media indicators for dynamic monitoring and identification of sleeping beauties on social media. Thus, we redefined the “sleeping and awakening” states of literature from the social media perspective. Next, we explain the states of documents and indicators.

3.3.1. Awakening

In the identification studies of C-SB, the time unit is usually a year, and the scientific literature needs to be awake for four consecutive periods (years), which is called real wake up. However, depending on the time of month, it does not need to be continuous. To ensure the study consistency, we also selected four consecutive time intervals to identify the literature recovery. On social media platforms, we defined awakening as the impact value ($IS_i$) being higher than the average comprehensive influence of all articles in this journal in each month ($\overline{Ab}$) for 4 consecutive months: $(IS_n \cdots IS_{n-3}) > \overline{Ab}$. The $\overline{Ab}$ indicator was selected in the study of the sleeping beauty’s awakening state because it is unfair to compare publications from different scientific disciplines, as a considerable difference exists in the speed and frequency of citation accumulation across different fields of science (Mcallister, Narin, & Corrigan, 1983; Ravenscroft, Liakata, Clare, & Duma, 2017; Waltman, 2016).

3.3.2. Sleeping

The average social media impact of 4 consecutive months in the literature is $\le \overline{Ab}/2$ ($IS_n \le \overline{Ab}/2$).

3.3.3. Dogsleep

During a specific period, when no continuous change occurs in the awakening or sleeping state of a paper (a persistent awake or sleeping state does not exceed 4 months), the paper is considered to be in the state of “dogsleep”; this state is often ignored by researchers.

3.3.4. Awakening time (Tw-k)

The duration of the waking state, where k indexes the successive waking periods; when k = 1, it is the time of the first awakening.

3.3.5. Dogsleep time (Tb-k)

The duration of the dogsleep state, where k indexes the successive dogsleep periods; when k = 1, it denotes the time of the first dogsleep.

3.3.6. Sleeping time (Ts-k)

The duration of the sleeping state, where k indexes the successive sleeping periods; when k = 1, it is the initial sleep time.

3.3.7. Awakening intensity

To illustrate the characteristics of the literature in different stages, we used two variables: the mean $\overline{IS}_{(j-i)}$ and the standard deviation $\sigma$. We hypothesized that the process of literature awakening is “sleeping/dogsleep–waking–sleeping/dogsleep–waking–sleeping/dogsleep–waking . . .” (for the “Flash in the pan” type of literature, the first slumber is 0). For a paper in the sleep period, when $\overline{IS}_{(j-i)} \le \overline{Ab}/2$, the smaller $\overline{IS}_{(j-i)}$ and $\sigma$ are, the higher its sleeping intensity. For a paper in the wake-up period, when $\overline{IS}_{(j-i)} \ge \overline{Ab}$, the higher $\overline{IS}_{(j-i)}$ and $\sigma$ are, the higher its recovery intensity. For a paper in the dogsleep period, when $\overline{Ab}/2 < \overline{IS}_{(j-i)} < \overline{Ab}$, the higher $\overline{IS}_{(j-i)}$ and $\sigma$ are, the higher its dogsleep intensity. When $\overline{IS}_{(j-i)} > \overline{Ab}$, the paper is considered to be in a discontinuous waking state.

3.4. Definition of the awakening type of S-SB

We classified and expressed the literature in different states based on the indicators mentioned above (Table 2) to describe the degree of sustained change of literature in different states and the awakening type of literature.

4. Results

We conducted an empirical study on the papers retrieved from PLOS Biology to verify the validity of the definitions and indicators mentioned above. Based on different knowledge diffusion trajectories of the literature, we identified the S-SB and

Table 2 Identification criteria for different types of S-SB.

| Group | Document type | First Sleep/Dog Sleep Period: Time | First Sleep/Dog Sleep Period: IS (j-i) | First Awakening Period: Time | First Awakening Period: IS (j-i) | Second Sleep/Dog Sleep Period: Time | Second Sleep/Dog Sleep Period: IS (j-i) | Second Awakening Period: Time | Second Awakening Period: IS (j-i) |
|---|---|---|---|---|---|---|---|---|---|
| Sleeping all the time | – | Published to date | ≤Ab/2 | – | – | – | – | – | – |
| Keep awakening | – | – | – | Published to date | ≥Ab | – | – | – | – |
| First-generation “Sleeping-Awakening” | S-Early sudden awakening | Ts-1 < Th | ≤Ab/2 | Tw-1 ≥ 4 | ≥Ab | Dog Sleep/Sleep | – | – | – |
| First-generation “Sleeping-Awakening” | S-Early gradual awakening | Tb-1 < Th | >Ab/2 | Tw-1 ≥ 4 | ≥Ab | Dog Sleep/Sleep | – | – | – |
| First-generation “Sleeping-Awakening” | S-Delay gradual awakening | Tb-1 ≥ Th | >Ab/2 | Tw-1 ≥ 4 | ≥Ab | Uncertainty | – | – | – |
| First-generation “Sleeping-Awakening” | S-SB | Ts-1 ≥ Th | ≤Ab/2 | Tw-1 ≥ 4 | ≥Ab | Uncertainty | – | – | – |
| Second-generation “Sleeping-Awakening” | S-total-element early sudden awakening | Ts-1 < Th or Tb-1 < Th | ≥0 | Tw-1 ≥ 4 | ≥Ab | Ts-2 < Th | ≤Ab/2 | Tw-2 ≥ 4 | ≥Ab |
| Second-generation “Sleeping-Awakening” | S-total elements early gradual awakening | Ts-1 < Th or Tb-1 < Th | ≥0 | Tw-1 ≥ 4 | ≥Ab | Tb-2 < Th | >Ab/2 | Tw-2 ≥ 4 | ≥Ab |
| Second-generation “Sleeping-Awakening” | S-total elements delay gradual awakening | Ts-1 < Th or Tb-1 < Th | ≥0 | Tw-1 ≥ 4 | ≥Ab | Tb-2 ≥ Th | >Ab/2 | Tw-2 ≥ 4 | ≥Ab |
| Second-generation “Sleeping-Awakening” | Sa-SB | Ts-1 < Th or Tb-1 < Th | ≥0 | Tw-1 ≥ 4 | ≥Ab | Ts-2 ≥ Th | ≤Ab/2 | Tw-2 ≥ 4 | ≥Ab |
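The state definitions of Section 3.3 that feed Table 2 can be sketched as a small routine over a monthly IS series. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, awakening is taken as at least 4 consecutive months with IS strictly above Ab, the period before it is labeled sleep or dogsleep by comparing its mean IS with Ab/2, and the default Ab is the value the paper reports for PLOS Biology. The sample series is invented.

```python
from statistics import mean

AB = 5.38   # journal-wide monthly mean IS (value reported for PLOS Biology)
RUN = 4     # consecutive months above AB required for a real awakening


def first_awakening(series, ab=AB, run=RUN):
    """Return (start, end) of the first run of >= `run` months with IS > ab.

    `end` is exclusive. Returns None if the document never really wakes up.
    """
    start = None
    for month, value in enumerate(series):
        if value > ab:
            if start is None:
                start = month
        else:
            if start is not None and month - start >= run:
                return start, month
            start = None  # run was too short: not a real awakening
    if start is not None and len(series) - start >= run:
        return start, len(series)
    return None


def first_generation_summary(series, ab=AB, run=RUN):
    """Summarize the first sleep/dogsleep period and the first awakening."""
    span = first_awakening(series, ab, run)
    if span is None:
        return None
    start, end = span
    before = series[:start]
    level = mean(before) if before else 0.0
    state = "sleep" if level <= ab / 2 else "dogsleep"
    return {"T1": start, "IS1": level, "state": state, "Tw1": end - start}


# Hypothetical trajectory: a 3-month early peak, a long quiet period,
# then 5 months above the threshold.
example = [6.0] * 3 + [1.0] * 40 + [6.0] * 5
summary = first_generation_summary(example)
print(summary)
```

In this invented series the early 3-month peak is too short to count as an awakening, so the whole 43-month stretch before the later run is treated as the first sleep period.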

Fig. 5. The diffusion trajectory based on the citation and social media of Gross (2007a). (Notes: The horizontal axis is in months. The number of nodes corresponding to the vertical axis represents the number of citations and IS in the corresponding month).

Fig. 6. The diffusion trajectory based on the citation and social media of Stavenga and Arikawa (2008). (Notes: The horizontal axis is in months. The number of nodes corresponding to the vertical axis represents the number of citations and IS in the corresponding month).

C-SB literature, respectively, and then compared and analyzed the two types of SB to describe the new characteristics of different types of literature.

4.1. Identification of different types of documents under social media

Based on the definition of the social media–based sleeping beauty index and related concepts, we estimated the social media influence diffusion trajectories of all articles in PLOS Biology each month. First, we evaluated the average of the combined effects, $\overline{Ab} = 5.38$. Thus, based on the papers published in PLOS Biology, the awakening state was defined as $(IS_n \cdots IS_{n-3}) > 5.38$, and the sleeping state as $IS_n \le 2.69$. We evaluated the social media impact value of 4019 articles each month since publication and classified the articles based on the degree of continuous change of their impact value (Appendix A). The results revealed that among 4019 articles, 1143 were sleeping all the time, and 32 were awake all the time. In addition, 1029 articles fell into the first-generation “sleeping-awakening” group, and 1815 into the second-generation “sleeping-awakening” group. Among them, the largest share of the first-generation recovery literature was the S-early sudden awakening literature, whereas the largest share of the second-generation recovery literature was the S-total elements early gradual awakening literature. We identified three S-SB papers: Gross (2007a), Stavenga and Arikawa (2008), and Del Cul, Baillet, and Dehaene (2007) (Figs. 5–7). Of these, the social media metrics trajectory of Stavenga and Arikawa (2008) had a high peak in the early stage, but because its social media metrics value exceeded 5.38 for only 3 consecutive months, it fell short of the 4 consecutive months required by the definition of awakening. Hence, this paper is not considered to be awake at that time. Meanwhile, as the average social media metrics value of Stavenga and Arikawa (2008) is 2.458 (<2.69) in the period from publication to actual awakening, this paper is considered to have been in deep sleep before actual awakening; therefore, it belongs to S-SB. On the other hand, we also identified 17


Fig. 7. The diffusion trajectory based on the citation and social media of Del Cul et al. (2007). (Notes: The horizontal axis is in months. The number of nodes corresponding to the vertical axis represents the number of citations and IS in the corresponding month).

Fig. 8. The diffusion trajectory based on the citation and social media of Neugebauer (2006). (Notes: The horizontal axis is in months. The number of nodes corresponding to the vertical axis represents the number of citations and IS in the corresponding month).

Sa-SB papers, such as Neugebauer (2006), Baltimore (2011), Weber, Gerton, and Polancic (2004), Gross (2007b), and Norio and Schildkraut (2004) (Fig. 8; Appendix B).

4.2. Comparative analysis of S-SB and C-SB

Using the trajectory of social media metrics, we distinguished S-SB literature and Sa-SB literature. In the further analysis of these documents, we used the citation indicators to check whether the same documents are identified. If they are not, what are the new features of C-SB from the new perspective, how do the trajectories differ, and which awakening type do the C-SB documents belong to?

4.3. Is the S-SB consistent with the C-SB?

Following van Raan (2004) and Li (2014) and their definitions of C-SB and Ca-SB, we identified five C-SB articles in PLOS Biology but did not find any Ca-SB articles (Tables 3 and 4). We compared these five articles with the S-SB and Sa-SB articles identified under the social media metrics trajectory and found that the two sets were completely inconsistent. The S-SB and Sa-SB articles identified by the social media metrics trajectory were published between 2003 and 2011. The C-SB articles identified by citation-based indicators were published in 2003–2004: the two C-SB articles from 2003 were Gibson (2003) and Market and Papavasiliou (2003), and the three from 2004 were McKay (2004), Rodríguez-Gironés and Santamaría (2004), and Servedio (2004). Among them, McKay (2004), Rodríguez-Gironés and Santamaría (2004), and Market and Papavasiliou (2003) are S-total elements early gradual awakening type from the

Table 3 The sleeping-awaken information of 5 C-SB papers identified in PLOS Biology by the criteria of van Raan (2004).

| Papers | McKay (2004) | Servedio (2004) | Gibson (2003) | Rodríguez-Gironés and Santamaría (2004) | Market and Papavasiliou (2003) |
|---|---|---|---|---|---|
| Ts/Tb-1 (month) | 0 | 0 | 0 | 0 | 0 |
| Tw-1 (month) | 44 | 6 | 130 | 9 | 14 |
| Tw-1-IS | 16.3012 | 17.1989 | 17.5768 | 23.8948 | 24.0219 |
| Ts/Tb-1-IS | 0 | 0 | 0 | 0 | 0 |
| Tw-1-σ | 31.7778 | 11.9118 | 26.7923 | 28.1432 | 37.0509 |
| Ts/Tb-1-σ | 0 | 0 | 0 | 0 | 0 |
| Ts/Tb-2 (month) | 3 | 91 | 56 | 2 | 1 |
| Tw-2 (month) | 9 | 5 | 0 | 4 | 78 |
| Tw-2-IS | 8.5808 | 9.0731 | 0 | 6.5286 | 13.7244 |
| Ts/Tb-2-IS | 5.2452 | 2.8971 | 5.4700 | 4.3524 | 4.5198 |
| Tw-2-σ | 1.8162 | 3.8648 | 0 | 0.5375 | 8.7052 |
| Ts/Tb-2-σ | 0.9777 | 1.1286 | 5.51 | 0.4735 | 0 |

Table 4 The numbers of pages, references, and citations of 5 C-SB papers identified in PLOS Biology by the criteria of van Raan (2004).

| Delay Identification Method | Name of document | DOI | Sleeping duration (years) | Number of citations in 4 years of awakening / total citations | Pages | References |
|---|---|---|---|---|---|---|
| van Raan (2004) | McKay (2004) | 10.1371/journal.pbio.0020302 | 4 | 24 / 68 | 4 | 24 |
| van Raan (2004) | Servedio (2004) | 10.1371/journal.pbio.0020420 | 4 | 22 / 52 | 4 | 30 |
| van Raan (2004) | Gibson (2003) | 10.1371/journal.pbio.0000015 | 5 | 19 / 27 | 2 | 6 |
| van Raan (2004) | Rodríguez-Gironés and Santamaría (2004) | 10.1371/journal.pbio.0020350 | 5 | 21 / 79 | 5 | 23 |
| van Raan (2004) | Market and Papavasiliou (2003) | 10.1371/journal.pbio.0000016 | 6 | 23 / 61 | 4 | 17 |

perspective of social media metrics trajectory. Furthermore, Servedio (2004) is S-total elements delay gradual awakening type, and Gibson (2003) belongs to S-early sudden awakening type.

4.4. Causes of awakening of C-SB under social media metrics trajectory

Based on the social media metrics trajectory, we found that all five papers awakened early; however, four papers, McKay (2004), Rodríguez-Gironés and Santamaría (2004), Servedio (2004), and Market and Papavasiliou (2003), awakened twice, whereas Gibson (2003) awakened only once. We found that, from the perspective of social media, social media metrics augmented the awakening of literature (Table 3). The average first awakening time of the five papers under the social media metrics trajectory was 4–5 years earlier than that under the trajectory of citation-based indicators. Moreover, the early awakenings were all caused by social media metrics: the papers received massive views, downloads, and shares in the early stage after publication. Although Gibson (2003) had only one awakening, it lasted for 130 months; during this period, the average value of its social media influence was 17.5768, the awakening intensity was 26.7923, and the awakening time was much longer than that of the other C-SB papers under social media. Gibson (2003) did not fall into a deep sleep directly after being awake for 130 months but entered a state of dogsleep (pseudosomnia). Market and Papavasiliou (2003), McKay (2004), Rodríguez-Gironés and Santamaría (2004), and Servedio (2004) did not make the transition from the first to the second recovery period through sleep (“wake–sleep–wake”) but through dogsleep periods of different lengths (“wake–dogsleep–wake”). Furthermore, the average value and intensity of influence in the first awakening period of these four articles were higher than those in the second awakening period.

4.5. The characteristics of document types in C-SB

Based on the type and times cited of documents, all the identified C-SB documents were of the Editorial Material type and had high citation frequency, whereas no C-SB documents were found among the “Article” and “Review” types; this finding corroborated the research of van Leeuwen et al. (2013): “Given this change, editorial materials have somewhat more time to gain momentum in impact development. The number of received citations is only established after a number of years.” Regarding page length and the number of references, McKay (2004), Servedio (2004), and Market and Papavasiliou (2003) each had four pages, with 24, 30, and 17 references, respectively. Rodríguez-Gironés and Santamaría (2004) had five pages and 23 references. Gibson (2003) had two pages and 6 references (Table 4). Overall, the longer a document is, the more references it has and the more frequently it is cited (van Leeuwen et al., 2013; Haustein et al., 2015).
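The social media awakening types assigned to the five C-SB papers in Section 4.3 can be retraced from Table 3 and the Table 2 criteria. The sketch below is an illustration only: the dictionary layout and function name are hypothetical, the thresholds are the values reported in the paper (Ab = 5.38, Th = 36 months, at least 4 months for an awakening), and it assumes that a paper with a second awakening is typed by its second sleep/dogsleep period, otherwise by its first.

```python
AB = 5.38   # journal-wide monthly mean IS reported in Section 4.1
TH = 36     # delayed-recognition threshold Th, in months
RUN = 4     # minimum length of a real awakening, in months

# Type names keyed by (delayed, deep): `delayed` means the sleep/dogsleep
# period lasted >= TH months; `deep` means its mean IS was <= AB / 2.
FIRST_GENERATION = {
    (False, True): "S-early sudden awakening",
    (False, False): "S-early gradual awakening",
    (True, False): "S-delay gradual awakening",
    (True, True): "S-SB",
}
SECOND_GENERATION = {
    (False, True): "S-total elements early sudden awakening",
    (False, False): "S-total elements early gradual awakening",
    (True, False): "S-total elements delay gradual awakening",
    (True, True): "Sa-SB",
}

# (Ts/Tb-1, Ts/Tb-1-IS, Ts/Tb-2, Ts/Tb-2-IS, Tw-2) transcribed from Table 3.
TABLE_3 = {
    "McKay (2004)": (0, 0.0, 3, 5.2452, 9),
    "Servedio (2004)": (0, 0.0, 91, 2.8971, 5),
    "Gibson (2003)": (0, 0.0, 56, 5.4700, 0),
    "Rodriguez-Girones and Santamaria (2004)": (0, 0.0, 2, 4.3524, 4),
    "Market and Papavasiliou (2003)": (0, 0.0, 1, 4.5198, 78),
}


def awakening_type(t1, is1, t2, is2, tw2):
    """Map period lengths and mean IS values to a Table 2 type name."""
    if tw2 >= RUN:  # a second awakening exists
        return SECOND_GENERATION[(t2 >= TH, is2 <= AB / 2)]
    return FIRST_GENERATION[(t1 >= TH, is1 <= AB / 2)]


types = {paper: awakening_type(*row) for paper, row in TABLE_3.items()}
for paper, label in types.items():
    print(f"{paper}: {label}")
```

Under these assumptions the output agrees with the types stated in the text: three early gradual, one delay gradual, and one early sudden awakening.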


4.6. Literature characteristics of S-SB and Sa-SB

4.6.1. The time characteristics of “sleeping-awakening” in S-SB and Sa-SB

To further explore the characteristics of the S-SB and Sa-SB documents identified under the social media diffusion trajectory, we analyzed the 20 documents identified in PLOS Biology (Table 5). Based on the sleep time and wake time of these 20 articles, the average sleep time in the delayed recognition stage was 67.33 months for the S-SB literature and 83.29 months for the Sa-SB literature; that is, the Sa-SB literature slept longer on average than the S-SB literature. Both groups had longer sleep durations than the C-SB articles. On the other hand, the wake-up time after delayed recognition differs between the two types of literature. First, in the S-SB literature, Gross (2007a) had a relatively short sleep time of 37 months but a prolonged wake-up time of 57 months. Moreover, Gross (2007a) woke up for another 33 months after a month’s sleep. Del Cul et al. (2007) woke up for 5 months after 88 months of sleep, and again for 26 months after a month of dogsleep. However, after 77 months of sleep, Stavenga and Arikawa (2008) only woke up for 4 months and slipped into a deep sleep again. Second, in the Sa-SB literature, the first wake-up times were similar, but after different sleep periods, the wake-up time varied. Derisi et al. (2003) had the longest wake-up time after delayed recognition, 71 months. In the other documents, the wake-up time after delayed recognition was shorter, no more than 20 months. On average, S-SB documents stayed awake after delayed recognition considerably longer than Sa-SB documents: 22 months for the S-SB literature versus 10.7 months for the Sa-SB literature.
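The averages quoted in this subsection follow directly from the month counts in Table 5. As a small check, assuming the delayed recognition stage is the first sleep and awakening period for S-SB (Ts/Tb-1, Tw-1) and the second for Sa-SB (Ts/Tb-2, Tw-2), the figures can be recomputed as follows; the variable names are illustrative.

```python
from statistics import mean

# Months transcribed from Table 5, in table order.
# S-SB: first sleep period (Ts/Tb-1) and first awakening (Tw-1).
s_sb_sleep = [37, 77, 88]
s_sb_wake = [57, 4, 5]

# Sa-SB: second sleep period (Ts/Tb-2) and second awakening (Tw-2).
sa_sb_sleep = [61, 41, 98, 93, 92, 44, 113, 108, 72, 93, 83, 89, 74, 102, 58, 94, 101]
sa_sb_wake = [6, 10, 4, 11, 71, 4, 4, 8, 10, 8, 4, 17, 5, 8, 4, 4, 4]

print(round(mean(s_sb_sleep), 2), round(mean(sa_sb_sleep), 2))
print(mean(s_sb_wake), round(mean(sa_sb_wake), 1))
```

The 71-month entry in `sa_sb_wake` is Derisi et al. (2003); every other Sa-SB value is below 20 months, as the text states.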

4.6.2. The awake characteristics of S-SB and Sa-SB

In the awakening after delayed recognition, the S-SB literature had an average influence value of 10.0286 and an awakening intensity of 3.6; both were higher than the average influence value (7.9716) and awakening intensity (1.9890) of the Sa-SB literature. Conversely, the average influence value (21.4244) and the average awakening intensity (19.7931) of the Sa-SB literature in their first-generation awakening were much higher than those in their second-generation awakening. Hence, the average awakening intensity of the Sa-SB literature in the first generation was higher than both that of the S-SB literature in the first generation and that of the Sa-SB literature in the second generation. Based on the times cited and awakening intensity of the 20 articles, the times cited displayed no direct correlation with the awakening intensity of articles. Besides highly cited literature, zero- and low-cited literature also exists among the S-SB and Sa-SB literature (Figs. 5–8), which is markedly different from the traditional C-SB literature. Moreover, the first-generation average influence value and awakening intensity of zero-cited literature are higher than those of other literature. For example, Gross (2007a), a zero-cited document of the S-SB type, has an average influence value (14.8625) and awakening intensity (8.4497) in the awakening period that are higher than those of Stavenga and Arikawa (2008) and Del Cul et al. (2007). Likewise, the average influence value and awakening intensity of Baltimore (2011) and Robinson (2009), zero-cited documents of the Sa-SB type, were higher than those of the other Sa-SB literature.
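The group means quoted above can likewise be recomputed from the IS and σ columns of Table 5. The sketch assumes the same pairing as in Section 4.6.1 (first awakening for S-SB, second awakening for Sa-SB); the list names are illustrative.

```python
from statistics import mean

# S-SB, first awakening: Tw-1-IS and Tw-1-sigma from Table 5.
s_sb_is = [14.8625, 8.2772, 6.9462]
s_sb_sigma = [8.4497, 1.5173, 0.8434]

# Sa-SB, second awakening: Tw-2-IS and Tw-2-sigma from Table 5, in table order.
sa_sb_is = [11.1414, 5.9948, 5.6219, 8.7735, 10.2674, 8.5164, 8.2818, 7.4333,
            10.7289, 6.8490, 6.4536, 7.2383, 6.6053, 7.7316, 6.5971, 6.7047,
            10.5789]
sa_sb_sigma = [4.2000, 0.2979, 0.1603, 3.1696, 3.4159, 4.5296, 2.1832, 1.2550,
               4.7178, 1.5659, 0.4153, 1.6132, 0.7039, 1.9121, 1.5633, 1.0438,
               1.0664]

print(round(mean(s_sb_is), 4), round(mean(s_sb_sigma), 1))
print(round(mean(sa_sb_is), 4), round(mean(sa_sb_sigma), 4))
```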

4.6.3. The characteristics of document types of S-SB and Sa-SB

After classifying the 20 documents according to document type (Table 6), we found that the S-SB and Sa-SB literature of the Article type had a higher cited frequency, whereas the cited frequency of the Editorial Material, Comment, and Biographical-Item types was relatively low. Of the three zero-cited documents, two belonged to the Comment type and one to the Biographical-Item type; this phenomenon is contrary to the characteristics of the C-SB literature of the “Editorial Material” type identified by the citation-based index. Thus, social media indicators hastened the recovery of literature (van Leeuwen et al., 2013). In the diffusion trajectory of social media metrics, literature of the “Editorial Material, Comment, Biographical-Item” types does not need a long time to accumulate citations; it can exert considerable influence and recover sooner. Based on the page length and references of the eight S-SB and Sa-SB documents of the “Editorial Material, Comment, Biographical-Item” types, their characteristics are similar to those of the C-SB documents of the “Editorial Material” type under the citation index: the more pages and references a document has, the higher its citation frequency.

5. Conclusions

This study extends the perspective of sleeping beauties in science from the diffusion trajectory based on citation indicators to the diffusion trajectory based on social media metrics. We defined C-SB and S-SB and their recognition indicators and standards. The study enriches the identification and classification of sleeping beauties in science in a way that is more in line with the actual diffusion and evolution of scientific literature, reflecting the dynamic and real-time diffusion trajectory of scientific literature on social media platforms. Meanwhile, we constructed a quantitative method to identify the S-SB literature, proposed the IS index to identify sleeping beauties in social media, and defined eight

Table 5 List of the document types, awaking time, and sleeping time of S-SB and Sa-SB in PLOS Biology based on the social media.

| Item | Document Name | Times Cited | Document Type | Ts/Tb-1 (month) | Tw-1 (month) | Tw-1-IS | Ts/Tb-1-IS | Tw-1-σ | Ts/Tb-1-σ | Ts/Tb-2 (month) | Tw-2 (month) | Tw-2-IS | Ts/Tb-2-IS | Tw-2-σ | Ts/Tb-2-σ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S-SB | Gross (2007a) | 0 | Comment | 37 | 57 | 14.8625 | 1.0451 | 8.4497 | 0.6942 | 1 | 33 | 35.4821 | 3.5854 | 25.3076 | 0.0000 |
| S-SB | Stavenga and Arikawa (2008) | 11 | Editorial Material | 77 | 4 | 8.2772 | 2.4584 | 1.5173 | 14.6451 | 52 | 0 | 0.0000 | 1.8249 | 0.0000 | 2.9747 |
| S-SB | Del Cul et al. (2007) | 302 | Article | 88 | 5 | 6.9462 | 2.5140 | 0.8434 | 4.3595 | 1 | 26 | 9.1216 | 5.2800 | 2.4622 | 0.0000 |
| Sa-SB | Neugebauer (2006) | 6 | Editorial Material | 0 | 6 | 22.9059 | 0.0000 | 30.5074 | 0.0000 | 61 | 6 | 11.1414 | 2.0591 | 4.2000 | 1.0550 |
| Sa-SB | Baltimore (2011) | 0 | Biographical-Item | 0 | 4 | 44.4813 | 0.0000 | 43.9803 | 0.0000 | 41 | 10 | 5.9948 | 2.1104 | 0.2979 | 2.5128 |
| Sa-SB | Weber et al. (2004) | 107 | Article | 0 | 9 | 14.0368 | 0.0000 | 11.4340 | 0.0000 | 98 | 4 | 5.6219 | 2.1124 | 0.1603 | 1.3451 |
| Sa-SB | Rechkoblit, Malinina, and Cheng (2006) | 106 | Article | 0 | 4 | 24.6078 | 0.0000 | 20.8348 | 0.0000 | 93 | 11 | 8.7735 | 2.1631 | 3.1696 | 1.5087 |
| Sa-SB | Derisi, Kennison, and Twyman (2003) | 4 | Editorial Material | 0 | 8 | 24.7403 | 0.0000 | 33.2142 | 0.0000 | 92 | 71 | 10.2674 | 2.2049 | 3.4159 | 1.8318 |
| Sa-SB | Gross (2007b) | 1 | Comment | 0 | 18 | 17.8736 | 0.0000 | 26.3559 | 0.0000 | 44 | 4 | 8.5164 | 2.2160 | 4.5296 | 2.2707 |
| Sa-SB | Norio and Schildkraut (2004) | 51 | Article | 0 | 4 | 9.3605 | 0.0000 | 4.4587 | 0.0000 | 113 | 4 | 8.2818 | 2.4100 | 2.1832 | 1.4845 |
| Sa-SB | Genc et al. (2004) | 38 | Article | 0 | 6 | 13.1595 | 0.0000 | 5.4315 | 0.0000 | 108 | 8 | 7.4333 | 2.4518 | 1.2550 | 1.5708 |
| Sa-SB | Tsuriel et al. (2006) | 107 | Article | 0 | 4 | 20.7716 | 0.0000 | 12.0100 | 0.0000 | 72 | 10 | 10.7289 | 2.4847 | 4.7178 | 1.0786 |
| Sa-SB | Dutta et al. (2005) | 23 | Article | 0 | 7 | 18.9959 | 0.0000 | 13.8529 | 0.0000 | 93 | 8 | 6.8490 | 2.5031 | 1.5659 | 1.5711 |
| Sa-SB | Nordstrom et al. (2006) | 80 | Article | 0 | 5 | 21.1482 | 0.0000 | 16.8409 | 0.0000 | 83 | 4 | 6.4536 | 2.5313 | 0.4153 | 1.5186 |
| Sa-SB | De Blas et al. (2005) | 81 | Article | 0 | 7 | 20.6221 | 0.0000 | 16.6849 | 0.0000 | 89 | 17 | 7.2383 | 2.6071 | 1.6132 | 1.4827 |
| Sa-SB | Mori et al. (2007) | 92 | Article | 0 | 4 | 22.2642 | 0.0000 | 10.4421 | 0.0000 | 74 | 5 | 6.6053 | 2.6092 | 0.7039 | 1.3667 |
| Sa-SB | Chao et al. (2005) | 77 | Article | 0 | 6 | 11.7273 | 0.0000 | 5.7142 | 0.0000 | 102 | 8 | 7.7316 | 2.6207 | 1.9121 | 1.4484 |
| Sa-SB | Robinson (2009) | 0 | Comment | 0 | 4 | 37.9022 | 0.0000 | 34.8332 | 0.0000 | 58 | 4 | 6.5971 | 2.6612 | 1.5633 | 2.9860 |
| Sa-SB | Miller et al. (2005) | 119 | Article | 0 | 5 | 17.1418 | 0.0000 | 14.4596 | 0.0000 | 94 | 4 | 6.7047 | 2.6621 | 1.0438 | 1.4558 |
| Sa-SB | Powell (2007) | 13 | Editorial Material | 0 | 17 | 22.4765 | 0.0000 | 35.4282 | 0.0000 | 101 | 4 | 10.5789 | 2.6820 | 1.0664 | 1.5783 |

Table 6 List of the times cited, pages, and references of S-SB and Sa-SB of Editorial Material, Comment, Biographical-Item types in PLOS Biology based on the social media.

| Document Name | Document Type | Times Cited | Pages | References |
|---|---|---|---|---|
| Stavenga and Arikawa (2008) | Editorial Material | 11 | 3 | 21 |
| Neugebauer (2006) | Editorial Material | 6 | 3 | 7 |
| Powell (2007) | Editorial Material | 13 | 6 | 11 |
| Derisi et al. (2003) | Editorial Material | 4 | 2 | 1 |
| Gross (2007a) | Comment | 0 | 2 | 1 |
| Gross (2007b) | Comment | 1 | 2 | 1 |
| Robinson (2009) | Comment | 0 | 1 | 1 |
| Baltimore (2011) | Biographical-Item | 0 | 2 | 0 |

types of literature awakening according to the “sleeping-awakening” of literature under the social media diffusion trajectory of literature. The main conclusions of this study are as follows: (1) From the perspective of the social media platform, this study furthered the sleeping beauties in science. In addition, the sleeping beauty in social media metrics is proposed and the differences between S-SB and C-SB are analyzed. Using a quantitative method, we constructed the IS index of literature under the social media metrics and identified the S-SB and Sa-SB under the social media perspective. Furthermore, the open access data of PLOS Biology was used as an example to validate the feasibility of this method. (2) Based on the IS index, we identified the actual diffusion trajectories of 4019 articles in PLOS Biology and detected 3 S-SB articles, 17 Sa-SB articles, and other types of articles. The new features of the S-SB and Sa-SB literature from the perspective of social media platform are further studied, revealing that the influence of social media–based index plays a leading role in the awakening process of sleeping beauties from the perspective of social media. This finding differs from the diffusion trajectory recognition of sleeping beauties based on the citation indicators. We found that 12 articles are of “Article” type and 4 of “Editorial Material” type, establishing that the “Editorial Material” type of literature is also of great value to the diffusion of scientific knowledge. (3) This study reveals differences between C-SB literature and S-SB literature in terms of “sleeping-awakening” mode, length of sleep, and document types. From the perspective of social media metrics, C-SB has become the literature of S-total elements early gradual awakening type, S-total elements delay gradual awakening type, and S-early sudden awakening type. 
Social media metrics hastened the awakening of the literature, and the first-generation awakening time of C-SB literature under the action of social media metrics was 4–5 years earlier than that under the action of citation-based indicators. All early awakening was caused by social media–based indicators. Moreover, the C-SB literature in PLOS Biology is of the Editorial Material type, re-establishing that “Editorial Material” type literature is worth noting in the process of promoting the diffusion of scientific knowledge.

6. Discussion

Scientific research is an open system, and it is often characterized by cross-fertilization between different research areas. Thus, invisible colleges often trespass the disciplinary boundaries of a specific scientific field (or specialty; Sedita, Caloffi, & Lazzeretti, 2018). Based on social media, a new invisible college is established, which differs from the traditional invisible college in its mechanism for diffusing and exchanging scientific knowledge. The new invisible college is not only more convenient and timelier in executing scientific exchanges but also more dispersed in the geographical distribution of its members. In addition, its research subjects and focus are clearly diversified. Moreover, on social media, all scientific communication and knowledge diffusion behaviors, such as View, Discussed, and even Download and Save, can be recorded and saved as data, providing the basis for statistical and quantitative research. The invisible college built by the informal scientific communication system of social media indicators plays a major role in the diffusion of scientific knowledge and in the scientific communication system. Hence, the sleeping beauties in science based on social media have a crucial value in the research of the invisible college.
Invisible college, which was first formed in the 17th century by the Royal Society of England, is a form of scientific communication based on scientists’ sharing research interests and geographical proximity (Bartle, 1995; Lievrouw, 1990; Lingwood, 1969; Price, 1963). It is a fairly organized system in which scientists’ information sharing and cooperation can be predicted (1972, Crane, 1969; Griffith & Mullins, 1980). However, Price (1963,1986) identified clusters of scientists in the citation network of research topics through bibliometrics and scientometrics. The clusters of these scientists built an invisible college (Sandstrom, 1998; Zuccala, 2006). Lievrouw (1990) identified a major problem concerning the invisible college—the structure versus social process problem. Using the comparative analysis of sleeping beauty on social media (informal scientific communication) and sleeping beauty in science based on the citation network (formal scientific communication), this study discusses the differences of knowledge diffusion patterns in formal and informal scientific communication systems. In addition, this study proves that the diffusion trajectory of scientific knowledge in the invisible college based on the citation network is relatively “stable,” and the structure of the invisible college connected by knowledge diffusion is


more “determined” and “stable,” with longer lasting time; however, its influence is “centralized.” Nevertheless, the invisible college formed by scientific knowledge based on the social media metrics is faster, and the trajectory and form of knowledge diffusion are more “diversified.” Furthermore, the structure of the invisible college connected by such scientific knowledge is relatively “loose,” not “stable,” and the speed of disappearance is faster but the influence is more “extensive.” In this study, whether the sleeping beauty based on the social media or the sleeping beauty in science based on the citation network, we think it is the diffusion trajectory of scientific knowledge, as our research is based on the perspective of the diffusion trajectory of scientific knowledge, that is, whether scientific literature is “concerned” or “discovered.” If more researchers “concerned” or “discovered” a scientific literature, perhaps, this scientific literature could be “awakened”; it is not about whether there are leads to use in new developments (even in the perspective of the citation network because of the diversity of citation motivation and behavior, there is no guarantee that there are leads to use in new developments). Thus, the concepts of sleeping beauty based on the social media and sleeping beauties in science based on the citation network are similar from the perspective of the diffusion trajectory of scientific knowledge, which can be compared and analyzed. Moreover, the diffusion trajectory of scientific knowledge based on the social media can be recorded and statistically and quantitatively analyzed. Hence, this study further expands the research on the phenomenon of sleeping beauty on the social media. In the future, we can continue investigating sleeping beauties in science, sleeping beauties in patent, S-SB, and Altmetrics-based sleeping beauties. 
In addition, the diffusion of scientific knowledge can also be conducted through forms such as “invisible college,” academic conferences reports, literature reading, and curriculum learning (however, these corresponding data are hard to find, and it is difficult to perform accurate and quantifiable data analysis). With the increasing diversification of scientific communication, the diffusion of scientific knowledge is also becoming increasingly diversified. Different diffusion trajectory of scientific knowledge are crucial factors for whether sleeping beauties in science can “wake up.” The evolution of a scientific specialty cannot be observed simply by examining co-citation or bibliographic coupling analysis. As Chubin (1976) suggested, it must be performed by describing the content of the knowledge transfer. In the follow-up study, we aim to combine the two, from a comprehensive perspective (Altmetrics perspective) to explore the knowledge diffusion trajectory of literature and explore their similarities and differences. In this study, from the perspective of the entire lifecycle of the diffusion trajectory of scientific knowledge, we only discussed the literature characteristics and influencing factors of the first- and secondgeneration awakening of the literature. However, we found that some studies have more than three-generation awakening types. Will these multilevel awakening characteristics be different from that of the current research remain unclear. In the process of multigeneration awakening, which indicators play a key role and what is the mechanism of awakening also needs to be explored. In the future, we will focus on exploring these issues to further reveal the awakening mechanism of different types of sleeping beauty in science. Author contributions Jianhua Hou: Conceived and designed the analysis; Collected the data; Performed the analysis; Wrote the paper. Xiucai Yang: Contributed data or analysis tool; Performed the analysis. 

Acknowledgments

Jianhua Hou acknowledges the support of the National Social Science Foundation of China under Grant 17BGL031.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.joi.2020.101012.

References

Adie, E. (2014). Taking the alternative mainstream. Professional De La Informacion, 23(4), 349–351. Baltimore, D. (2011). Lennart Philipson (1929–2011): A warrior has passed obituary. PLoS Biology, 9(9), e1001153. Barber, B. (1961). Resistance by scientists to scientific discovery. Science, 134, 596–602. Bar-Ilan, J., Shema, H., & Thelwall, M. (2013). Bibliographic references in web 2.0. In B. Cronin, & C. Sugimoto (Eds.), Bibliometrics and beyond: Metrics-based evaluation of scholarly research. Cambridge: MIT Press, in press. Bartle, R. G. (1995). A brief history of the mathematical literature. Publishing Research Quarterly, 11, 3–9. Baumgartner, S. E., & Leydesdorff, L. (2014). Group-based trajectory modeling (GBTM) of citations in scholarly literature: Dynamic qualities of “transient” and “sticky knowledge claims”. Journal of the Association for Information Science and Technology, 65(4), 797–811. Bornmann, L. (2013). What is societal impact of research and how can it be assessed? A literature survey. Journal of the American Society for Information Science and Technology, 64(2), 217–233. Bornmann, L. (2014). Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime. Journal of Informetrics, 8(4), 935–950. Bornmann, L. (2015a). Usefulness of altmetrics for measuring the broader impact of research: A case study using data from PLOS and F1000Prime. Aslib Journal of Information Management, 67(3), 305–319. Bornmann, L. (2015b). Alternative metrics in scientometrics: A meta-analysis of research into three altmetrics. Scientometrics, 103(3), 1123–1144. Bornmann, L. (2016). Scientific revolution in scientometrics: The broadening of impact from citation to societal. In C. R. Sugimoto (Ed.), Theories of informetrics and scholarly communication (pp. 347–359). Berlin, Germany: De Gruyter.

Bornmann, L., & Haunschild, R. (2018). Do altmetrics correlate with the quality of papers? A large-scale empirical study based on F1000Prime data. PloS One, 13(5), e0197133. Bornmann, L., & Leydesdorff, L. (2013). The validation of (advanced) bibliometric indicators through peer assessments: A comparative study using data from InCites and F1000. Journal of Informetrics, 7(2), 286–291. Bornmann, L., Ye, Y. A., & Ye, F. Y. (2018). Identifying “hot papers” and papers with “delayed recognition” in large-scale datasets by using dynamically normalized citation impact scores. Scientometrics, 116(2), 655–674. Bornmann, L., Haunschild, R., & Adams, J. (2019). Do altmetrics assess societal impact in a comparable way to case studies? An empirical test of the convergent validity of altmetrics based on data from the UK research excellence framework (REF). Journal of Informetrics, 13(1), 325–340. Braun, T., Glänzel, W., & Schubert, A. (1989). Some data on the distribution of journal publication types in the Science Citation Index database. Scientometrics, 15(5–6), 325–330. Braun, T., Glänzel, W., & Schubert, A. (2010). On sleeping beauties, princes and other tales of citation distributions. Research Evaluation, 19(3), 195–202. Campanario, J. M., Carretero, J., Marangon, V., Molina, A., & Ros, G. (2011). Effect on the journal impact factor of the number and document type of citing records: A wide-scale study. Scientometrics, 87(1), 75–84. Chao, Y., Shiozaki, E. N., Srinivasula, S. M., Rigotti, D. J., Fairman, R., & Shi, Y. (2005). Engineering a dimeric caspase-9: A re-evaluation of the induced proximity model for caspase activation. PLoS Biology, 3(6), 1079–1087. Chhapola, V., Tiwari, S., Deepthi, B., & Kanwal, S. K. (2018). Citation classics in pediatrics: A bibliometric analysis. World Journal of Pediatrics, 14(6), 607–614. Chubin, D. E. (1976). State of the field: The conceptualization of scientific specialties. The Sociological Quarterly, 17(4), 448–476.
http://dx.doi.org/10.1111/j.1533-8525.1976.tb01715.x Cole, S. (1976). Professional standing and the reception of scientific discoveries. The American Journal of Sociology, 76, 286–306. Comins, J. A., & Leydesdorff, L. (2016). RPYS i/o: Software demonstration of a web-based tool for the historiography and visualization of citation classics, sleeping beauties and research fronts. Scientometrics, 107(3), 1509–1517. Costas, R., Zahedi, Z., & Wouters, P. (2015). Do “altmetrics” correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective. Journal of the Association for Information Science and Technology, 66(10), 2003–2019. Costas, R., Leeuwen, T. N. V., & Raan, A. F. J. V. (2010). Is scientific literature subject to a ‘sell-by-date’? A general methodology to analyze the ‘durability’ of scientific documents. Journal of the American Society for Information Science and Technology, 61(2), 329–339. Crane, D. (1969). Social structure in a group of scientists: A test of the “invisible college” hypothesis. American Sociological Review, 34(3), 335–352. Crane, D. (1972). Invisible colleges: Diffusion of knowledge in scientific communities. Chicago: The University of Chicago Press. Das, A. K., & Mishra, S. (2014). Genesis of altmetrics or article-level metrics for measuring efficacy of scholarly communications: Current perspectives. Journal of Scientometric Research, 3(2), 82–92. De Blas, G. A. D., Roggero, C. M., Tomes, C. N., & Mayorga, L. S. (2005). Dynamics of SNARE assembly and disassembly during sperm acrosomal exocytosis. PLoS Biology, 3(10), 1801–1812. De Winter, J. C. F. (2015). The relationship between tweets, citations, and article views for PLOS ONE articles. Scientometrics, 102(2), 1773–1779. Del Cul, A., Baillet, S., & Dehaene, S. (2007). Brain dynamics underlying the nonlinear threshold for access to consciousness. PLoS Biology, 5(10), e260. Derisi, S., Kennison, R., & Twyman, N. (2003). 
The what and whys of DOIs. PLoS Biology, 1(2), e57. Derrick, G. E., & Samuel, G. N. (2016). The evaluation scale: Exploring decisions about societal impact in peer review panels. Minerva, 54(1), 75–97. Dey, R., Roy, A., Chakraborty, T., & Ghosh, S. (2017). Sleeping beauties in computer science: Characterization and early identification. Scientometrics, 113(5439), 1–19. Ding, J., Ahlgren, P., Yang, L., & Yue, T. (2016). Document type profiles in nature, science, and PNAS: Journal and country level. Journal of Data and Information Science, 1(3), 27–41. Donner, P. (2017). Document type assignment accuracy in the journal citation index data of Web of Science. Scientometrics, 113(1), 219–236. Du, J., & Wu, Y. S. (2018). A parameter-free index for identifying under-cited sleeping beauties in science. Scientometrics, 116(2), 959–971. Dutta, D., Shaw, S., Maqbool, T., Pandya, H., & VijayRaghavan, K. (2005). Drosophila heartless acts with Heartbroken/Dof in muscle founder differentiation. PLoS Biology, 3(10), 1789–1800. Ebrahimy, S., Mehrad, J., Setareh, F., & Hosseinchari, M. (2016). Path analysis of the relationship between visibility and citation: The mediating roles of save, discussion, and recommendation metrics. Scientometrics, 109(3), 1497–1510. Egghe, L., Guns, R., & Rousseau, R. (2011). Thoughts on uncitedness: Nobel laureates and Fields medalists as case studies. Journal of the American Society for Information Science and Technology, 62(8), 1637–1644. Erdt, M., Nagarajan, A., Sin, S. C. J., & Theng, Y. L. (2016). Altmetrics: An analysis of the state-of-the-art in measuring research impact on social media. Scientometrics, 109(2), 1117–1166. Eysenbach, G. (2011). Can tweets predict citations? Metrics of social impact based on Twitter and correlation with traditional metrics of scientific impact. Journal of Medical Internet Research, 13(4), e123. Forkosh-Baruch, A., & Hershkovitz, A. (2012). 
A case study of Israeli higher-education institutes sharing scholarly information with the community via social networks. The Internet and Higher Education, 15(1), 58–68. Frandsen, T. F. (2008). On the ratio of citable versus non-citable items in economics journals. Scientometrics, 74(3), 439–451. Garfield, E. (1980). Premature discovery or delayed recognition—why? Current Contents, 21(26), 5–10. Garfield, E. (1987). Why are the impacts of the leading medical journals so similar and yet so different? Item-by-item audits reveal a diversity of editorial material. Current Contents Clinical Medicine, 2, 7–13. Garfield, E. (1989a). Delayed recognition in scientific discovery: Citation frequency analysis aids the search for case histories. Current Contents, 23(June 5), 3–9. Reprinted: Essays of an Information Scientist, 12: 154–160. Philadelphia: ISI Press. Garfield, E. (1989b). More delayed recognition. Part 1. Examples from the genetics of color blindness, the entropy of short-term memory, phosphoinositides, and polymer rheology. Current Contents, 38(September 18), 3–8. Reprinted: Essays of an Information Scientist, 12: 264–269. Philadelphia: ISI Press. Garfield, E. (1990). More delayed recognition. Part 2. From inhibin to scanning electron microscopy. Current Contents Clinical Medicine, 9(February 26), 3–9. Reprinted: Essays of an Information Scientist, 13: 68–74. Philadelphia: ISI Press. Genc, B., Ozdinler, P. H., Mendoza, A. E., et al. (2004). A chemoattractant role for NT-3 in proprioceptive axon guidance. PLoS Biology, 2(12), e403. Gibson, G. (2003). Microarray analysis. PLoS Biology, 1(1), e15. Glänzel, W., & Garfield, E. (2004). The myth of delayed recognition. Scientist (Philadelphia, Pa), 18(11), 8–9. Glänzel, W., Schlemmer, B., & Thijs, B. (2003). Better late than never? On the chance to become highly cited only beyond the standard time horizon. Scientometrics, 58(3), 571–586. Griffith, B. C., & Mullins, N. C. (1980). Coherent social groups in scientific change. 
In B. C. Griffith (Ed.), Key papers in information science (pp. 52–57). White Plains, NY: Knowledge Industry. Gross, L. (2007a). Autoimmunity: a barrier to gene flow in plants? PLoS Biology, 5(9), e262. Gross, L. (2007b). Amyloid peptide toxicity in an animal model of Alzheimer disease. PLoS Biology, 5(11), e313. Haustein, S., & Siebenlist, T. (2011). Applying social bookmarking data to evaluate journal usage. Journal of Informetrics, 5(3), 446–457. Haustein, S., Costas, R., & Larivière, V. (2015). Characterizing social media metrics of scholarly papers: The effect of document properties and collaboration patterns. PloS One, 10(3), e0120495. Haustein, S., Peters, I., Sugimoto, C. R., Thelwall, M., & Larivière, V. (2014). Tweeting biomedicine: An analysis of tweets and citations in the biomedical literature. Journal of the Association for Information Science and Technology, 65(4), 656–669. Ho, Y. S., & Hartley, J. (2017). Sleeping beauties in psychology. Scientometrics, 110, 301–305.

Hou, J. C., & Yang, X. C. (2019). Patent sleeping beauties: Evolutionary trajectories and identification methods. Scientometrics, 120(1), 187–215. Hu, X., & Rousseau, R. (2019). Do citation chimeras exist? The case of under-cited influential articles suffering delayed recognition. Journal of the Association for Information Science and Technology, 70(5), 499–508. Hu, Z., & Wu, Y. (2014). Regularity in the time-dependent distribution of the percentage of never-cited papers: An empirical pilot study based on the six journals. Journal of Informetrics, 8, 136–146. Huang, T. C., Hsu, C., & Ciou, Z. J. (2015). Systematic methodology for excavating sleeping beauty publications and their princes from medical and biological engineering studies. Journal of Medical and Biological Engineering, 35(6), 749–758. Iefremova, O., Wais, K., & Kozak, M. (2018). Biographical articles in scientific literature: Analysis of articles indexed in Web of Science. Scientometrics, 117(3), 1695–1719. Ke, Q., Ferrara, E., Radicchi, F., & Flammini, A. (2015). Defining and identifying sleeping beauties in science. Proceedings of the National Academy of Sciences of the United States of America, 112(24), 7426. Lachance, C., & Larivière, V. (2014). On the citation lifecycle of papers with delayed recognition. Journal of Informetrics, 8(4), 863–872. Lange, L. L. (2005). Sleeping beauties in psychology: Comparisons of “hits” and “missed signals” in psychological journals. History of Psychology, 8(2), 194–217. Li, J. (2014). Citation curves of all-elements-sleeping-beauties: flash in the pan first and then delayed recognition. Scientometrics, 100(2), 595–601. Li, J., & Ye, F. Y. (2014). The phenomenon of all-elements-sleeping-beauties in scientific literature. Scientometrics, 92(3), 795–799. Li, J., & Ye, F. Y. (2016). Distinguishing sleeping beauties in science. Scientometrics, 108(2), 821–828. Li, J., Shi, D. B., Zhao, S. X., & Ye, F. Y. (2014). A study of the “heartbeat spectra” for “sleeping beauties”. Journal of Informetrics, 8(3), 493–502.
Lievrouw, L. A. (1990). Reconciling structure and process in the study of scholarly communication. In Borgman (Ed.), Scholarly communication and bibliometrics (pp. 59–69). Newbury Park, CA: Sage. Lingwood, D. A. (1969). Interpersonal communication, research productivity and invisible colleges (Unpublished doctoral dissertation). Stanford, CA: Stanford University. Market, E., & Papavasiliou, F. N. (2003). V(D)J recombination and the evolution of the adaptive immune system. PLoS Biology, 1(1), e16. Mazarei, Z. (2013). Review of relationship between recognition of scientific products and marking them on Citeulike in the field of knowledge and information science during 2004 to 2012 (Master’s thesis in Knowledge & Information Science). Shiraz, Iran: Shiraz University (in Persian). McAllister, P. R., Narin, F., & Corrigan, J. G. (1983). Programmatic evaluation and comparison based on standardized citation scores. IEEE Transactions on Engineering Management, 30(4), 205–211. McKay, C. P. (2004). What is life—and how do we search for it in other worlds? PLoS Biology, 2(9), e302. Miller, P., Zhabotinsky, M., Lisman, E., & Wang, X.-J. (2005). The stability of a stochastic CaMKII switch: Dependence on the number of enzyme molecules and protein turnover. PLoS Biology, 3(4), 705–717. Minger, S. J. (2007). Shooting stars and sleeping beauties: The secret life of citations. The 22nd European Conference on Operational Research (Euro XXII), 8–11. Moed, H. F. (2017). Applied evaluative informetrics. Heidelberg, Germany: Springer. Mohammadi, E., Thelwall, M., Haustein, S., & Larivière, V. (2015). Who reads research articles? An altmetrics analysis of Mendeley user categories. Journal of the Association for Information Science and Technology, 66(9), 1832–1846. Mori, T., Williams, D. R., Byrne, M. O., et al. (2007). Elucidating the ticking of an in vitro circadian clockwork. PLoS Biology, 5(4), 841–853. Neugebauer, K. M. (2006). 
Keeping tabs on the women: Life scientists in Europe. PLoS Biology, 4(4), e97. Nordstrom, K., Barnett, P. D., & O’Carroll, D. C. (2006). Insect detection of small targets moving in visual clutter. PLoS Biology, 4(3), 378–386. Norio, P., & Schildkraut, C. L. (2004). Plasticity of DNA replication initiation in Epstein-Barr virus episomes. PLoS Biology, 2(3), e152. Ohba, N., & Nakao, K. (2012). Sleeping beauties in ophthalmology. Scientometrics, 93(2), 253–264. Powell, K. (2007). Going against the grain. PLoS Biology, 5(12), e338. Price, D. J. de Solla. (1963). Little science, big science. New York: Columbia University Press. Price, D. J. de Solla. (1986). Little science, big science and beyond. New York: Columbia University Press. Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). Altmetrics: A manifesto. Retrieved from http://altmetrics.org/manifesto/ Puschmann, C., & Mahrt, M. (2012). Scholarly blogging: A new form of publishing or science journalism 2.0. In A. Tokar, M. Beurskens, S. Keuneke, M. Mahrt, I. Peters, C. Puschmann, T. van Treeck, & K. Weller (Eds.), Science and the internet (pp. 171–181). Düsseldorf: Düsseldorf University Press. Ravenscroft, J., Liakata, M., Clare, A., & Duma, D. (2017). Measuring scientific impact beyond academia: An assessment of existing impact metrics and proposed improvements. PloS One, 12(3), e0173152. Rechkoblit, O., Malinina, L., Cheng, Y., Kuryavyi, V., Broyde, S., Geacintov, N. E., et al. (2006). Stepwise translocation of Dpo4 polymerase during error-free bypass of an oxoG lesion. PLoS Biology, 4(1), 25–42. Robinson, R. (2009). Feedback system protects inner ear. PLoS Biology, 7(1), e1000012. Rodríguez-Gironés, M. A., & Santamaría, L. (2004). Why are so many bird flowers red? PLoS Biology, 2(10), e350. Rousseau, R. (2009). The most influential editorials. In Celebrating scholarly communication studies: A festschrift for Olle Persson at his 60th birthday (pp. 47–53). Sandstrom, P. E. (1998). 
Information foraging among anthropologists in the invisible college of human behavioral ecology: An author cocitation analysis Unpublished doctoral dissertation. Bloomington: Indiana University. Sedita, S., Caloffi, A., & Lazzeretti, L. (2018). The invisible college of cluster research: A bibliometric core – Periphery analysis of the literature. Industry and Innovation, 1–23. Servedio, M. R. (2004). The what and why of research on reinforcement. PLoS Biology, 2(12), e420. Shema, H., Bar-Ilan, J., & Thelwall, M. (2014). Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics. Journal of the Association for Information Science and Technology, 65(5), 1018–1027. Shuai, X., Pepe, A., & Bollen, J. (2012). How the scientific community reacts to newly submitted preprints: Article downloads, twitter mentions, and citations. PloS One, 7(11), e47523. Sigogneau, A. (2000). An analysis of document types published in journals related to physics: Proceeding publications recorded in the Science Citation Index database. Scientometrics, 47(3), 589–604. Stavenga, D. G., & Arikawa, K. (2008). One rhodopsin per photoreceptor: Iro-C genes break the rule. PLoS Biology, 6(4), 675–677. Stent, G. S. (1972). Prematurity and uniqueness in scientific discovery. Scientific American, 227(6), 84–93. SUD, P., & Thelwall, M. (2014). Evaluating altmetrics. Scientometrics, 98(2), 1131–1143. Sugimoto, C. R., Work, S., Larivière, V., & Haustein, S. (2017). Scholarly use of social media and altmetrics: A review of the literature. Journal of the Association for Information Science and Technology, 68(9), 2037–2062. Sun, J., Min, C., & Li, J. (2016). A vector for measuring obsolescence of scientific articles. Scientometrics, 107(2), 745–757. Teixeira, A. A. C., Vieira, P. C., & Abreu, A. P. (2017). Sleeping Beauties and their princes in innovation studies. Scientometrics, 110(2), 541–580. 
Thelwall, M., Haustein, S., Larivière, V., & Sugimoto, C. R. (2013). Do altmetrics work? Twitter and ten other social web services. PloS One, 8(5), e64841. Tsuriel, S., Geva, R., Zamorano, P., Dresbach, T., Boeckers, T., Gundelfinger, E. D., et al. (2006). Local sharing as a predominant determinant of synaptic matrix molecular dynamics. PLoS Biology, 4(9), e271. van Dalen, H. P., & Henkens, K. (2005). Signals in science-on the importance of signaling in gaining attention in science. Scientometrics, 64(2), 209–233. van Leeuwen, T., Costas, R., Calero-Medina, C., & Visser, M. (2013). The role of editorial material in bibliometric research performance assessments. Scientometrics, 95(2), 817–828.

van Leeuwen, T. N., Moed, H. F., & Reedijk, J. (1999). Critical comments on Institute for Scientific Information impact factors: A sample of inorganic molecular chemistry journals. Journal of Information Science, 25(6), 489–498. Van Noorden, R. (2017). The science that’s never been cited. Nature, 552(7684), 162–164. van Raan, A. F. J. (2004). Sleeping beauties in science. Scientometrics, 59(3), 467–472. van Raan, A. F. J. (2015). Dormitory of physical and engineering sciences: Sleeping beauties may be sleeping innovations. PloS One, 10(10), e0139786. van Raan, A. F. J. (2017). Sleeping beauties cited in patents: is there also a dormitory of inventions? Scientometrics, 110(3), 1123–1156. Waltman, L. (2016). A review of the literature on citation impact indicators. Journal of Informetrics, 10(2), 365–391. Waltman, L., & Costas, R. (2014). F1000 Recommendations as a potential new data source for research evaluation: A comparison with citations. Journal of the Association for Information Science and Technology, 65(3), 433–445. Wang, J. C., Ma, F. C., Chen, M. J., & Rao, Y. Q. (2012). Why and how can “sleeping beauties” be awakened? The Electronic Library, 30(1), 5–18. Wardle, D. (2010). Do Faculty of 1000 (F1000) ratings of ecological publications serve as reasonable predictors of their future impact? Ideas in Ecology and Evolution, 3, 11–15. Weber, S. A., Gerton, J. L., Polancic, J. E., DeRisi, J. L., Koshland, D., & Megee, P. C. (2004). The kinetochore is an enhancer of pericentric cohesin binding. PLoS Biology, 2(9), E260. Ye, F. Y., & Bornmann, L. (2018). “Smart girls” versus “sleeping beauties” in the sciences: The identification of instant and delayed recognition by using the citation angle. Journal of the Association for Information Science and Technology, 69(3), 359–367. Zahedi, Z., Costas, R., & Wouters, P. (2014). How well developed are altmetrics? A cross-disciplinary analysis of the presence of ‘alternative metrics’ in scientific publications. 
Scientometrics, 101(2), 1491–1513. Zahedi, Z., Costas, R., & Wouters, P. (2014). Assessing the impact of publications saved by Mendeley users: Is there any different pattern among users? Proceedings of the IATUL Conferences, Paper 4. http://docs.lib.purdue.edu/iatul/2014/altmetrics/4 Zeng, C. J., Qi, E. P., Li, S. S., Stanley, H. E., & Ye, F. Y. (2017). Statistical characteristics of breakthrough discoveries in science using the metaphor of black and white swans. Physica A: Statistical Mechanics and Its Applications, 487, 40–46. Zhang, H. H., Zuccala, A. A., & Ye, F. Y. (2019). Tracing the ‘swan groups’ of physics and economics in the key publications of Nobel laureates. Scientometrics, 119(1), 425–436. Zhang, L., Xu, K., & Zhao, J. (2017). Sleeping beauties in meme diffusion. Scientometrics, 112(1), 1–20. Zuccala, A. (2006). Modeling the invisible college. Journal of the American Society for Information Science and Technology, 57(2), 152–168.