Digital.CSIC: Making the Case for Open Access at CSIC
Isabel Bernal
Available online 11 January 2011
The Spanish National Research Council (CSIC in Spanish, http://www.csic.es) is the largest public institution dedicated to research with more than a hundred centers and institutes nationwide. Its main objective is to develop and promote research that will help bring about scientific and technological progress and to collaborate with Spanish and foreign entities in order to achieve it. Digital.CSIC was created to maximize the international visibility, accessibility and impact of CSIC research by taking advantage of innovations in the field of information management and by participating actively in the open access movement.
Serials Review 2011; 37:3–8. © 2010 Elsevier Inc. All rights reserved.
CSIC, An Important Player in the Scientific Development in Spain
The Spanish National Research Council (CSIC in Spanish, http://www.csic.es) was created in 1939 to replace the Board for Advanced Studies and Scientific Research (JAE in Spanish), formed in 1907. Today CSIC is the largest public institution dedicated to research in Spain, a huge organization consisting of more than a hundred research centers and institutes distributed nationwide, with a staff of more than 15,500,[1] of whom almost 9,600[2] are directly devoted to research activities, including permanent and hired researchers and fellows. Attached to the Spanish Ministry of Science and Innovation through the Secretary of State for Research, CSIC has as its main objective to develop and promote research that will help bring about scientific and technological progress and to collaborate with Spanish and foreign entities in order to achieve that progress. CSIC is best known for the excellence of its scientific production: 6% of the scientific community based in Spain works at CSIC, and together these researchers generate approximately 20% of the country's scientific output each year. It also manages important and modern facilities, as well as the most complete and extensive network of scientific libraries in the country. In recent years the number of joint research units in partnership with universities and other scientific institutions across Spain has increased remarkably. The multidisciplinary and multi-sectorial nature of CSIC spans scientific and technical research in eight broad areas: Humanities and Social Sciences; Biology and Biomedicine; Natural Resources; Agricultural Sciences; Physical Science and Technologies; Materials Science and Technology; Food Science and Technology; Chemical Science and Technology. Other CSIC principal functions are:
• Contribution to creation of technologically-based companies
• Training of specialized staff
• Management of infrastructures and large facilities
• Promotion of a culture of Science
• Scientific representation of Spain at the international level
By Royal Decree CSIC became a State Agency in 2007, thus gaining official recognition that backs its history as a main player in fostering scientific innovation in the country. The Strategic Plan for 2010–2013 builds on previous structures, with the dissemination, internationalization, and evaluation of CSIC scientific production as its immediate priorities. Within this framework, increased efforts are being channelled into making CSIC science more widely available worldwide. Its institutional repository Digital.CSIC (http://digital.csic.es/) plays a crucial role in this endeavor.
Digital.CSIC: A Unifying Project in a Complex Organization
Digital.CSIC is a direct consequence of the signing of the Berlin Declaration by the CSIC Presidency in 2006, which marked the official commitment of the institution to disseminating its research via open access. Two institutional projects resulted. First, the CSIC Press Department started Revistas CSIC (http://revistas.csic.es/) in June 2007. Revistas is a major publishing initiative that involves migrating the full collection of thirty-five CSIC scientific journals to an open access model. Next, the CSIC Libraries Coordination Unit publicly launched the institutional repository Digital.CSIC in January 2008 with the ambition of becoming the memory archive of all CSIC scientific and technical production and of providing open access to it. Today both projects have entered a consolidation phase and have managed to establish stable infrastructures with a growing agenda of cooperation, internally and elsewhere. Digital.CSIC was thus built to maximize the international visibility, accessibility and impact of CSIC research by taking advantage of innovations in the field of information management and by participating actively in the open access movement. In order to do so, Digital.CSIC has taken the geographical distribution of CSIC across the country as an opportunity to grow an institutional project where all research centers
• Advice to other scientific and technical bodies
• Transfer of results to the business sector
Bernal manages Digital.CSIC, a project under the CSIC Libraries Coordination Unit, CSIC, Madrid and serves as a regional editor for Serials Review; e-mail: Isabel.[email protected].
[1] 2008 figures, Plan de Actuación 2010–2013.
[2] 2008 figures, Plan de Actuación 2010–2013.
0098-7913/$ – see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.serrev.2010.11.001
and institutes and the whole network of seventy-eight libraries participate in the pursuit of a common goal. As far as the involvement of CSIC libraries is concerned, the project has benefited to a large degree from the established culture of cooperation that began in 1990, when the Libraries Coordination Unit was created to facilitate interlibrary loans, supervise library automation at an institutional level, develop and maintain a centralized catalog and oversee institutional licensing of electronic scholarly resources. Over the years, this Unit has diversified the projects in which CSIC libraries come together to share expertise, resources, and mutual interests. Digital.CSIC is another project that falls within this culture of cooperation and, on this occasion, libraries also become central enablers within CSIC for widely disseminating the outputs produced by scientists in the centers and institutes where they belong. Digital.CSIC places all these libraries at the forefront of the strategy for scientific dissemination at CSIC.
The other fundamental pillar of the institutional repository is the CSIC scientific community itself. From the beginnings of Digital.CSIC, it was clear that its long-term sustainability depended greatly on the active involvement of all CSIC centers and institutes, something that could not be taken for granted given their considerable degree of autonomy and their distribution across all regions of Spain, with a few settlements abroad. In fact, CSIC comprises nineteen centers and institutes devoted to Humanities and Social Sciences (http://www.csic.es/web/guest/humanidades-y-cienciassociales); twenty-five to Biology and Biomedicine (http://www.csic.es/web/guest/biologia-y-biomedicina); thirty to Natural Resources (http://www.csic.es/web/guest/recursos-naturales); fifteen to Agricultural Sciences (http://www.csic.es/web/guest/cienciasagrarias); twenty-eight to Physical Science and Technologies (http://www.csic.es/web/guest/ciencia-y-tecnologias-fisicas); thirteen to Materials Science and Technology (http://www.csic.es/web/guest/ciencia-y-tecnologia-de-materiales); ten to Food Science and Technology (http://www.csic.es/web/guest/ciencia-ytecnologia-de-alimentos); and fifteen to Chemical Science and Technology (http://www.csic.es/web/guest/ciencia-y-tecnologiasquimicas). In addition to these centers and institutes, CSIC has a few special infrastructures devoted to research, including Calar Alto Astronomical Observatory (http://www.caha.es/), Doñana Biological Station (http://www.ebd.csic.es/website1/Principal.aspx), Hespérides Ocean Research Vessel (http://www.utm.csic.es/hesperides.asp?switchlang=2), Sarmiento de Gamboa Ocean Research Vessel (http://www.utm.csic.es/sarmiento_his.asp?switchlang=2), Juan Carlos I Antarctic Base (http://www.utm.csic.es/bae.asp) and the Microelectronics White Room (http://www.imb-cnm.csic.es/index.php?option=com_content&view=article&id=25&Itemid=70&lang=en), and it is one of the institutional members of the Max von Laue/Paul Langevin Institute (http://www.ill.eu/) and the European Synchrotron Radiation Facility (http://www.esrf.eu/). These represent quite a varied group of institutes, some with roots in the 1950s, the 1940s or even earlier, while others are in their infancy. Each is dedicated to a rich range of research areas and topics, with diverging priorities, funding capabilities, human resources and equipment; however, all share a common interest in disseminating their research outputs. Therefore, building one repository that would organize, preserve and facilitate access to all CSIC research was deemed the most productive approach within a clear institutional context and affiliation. Important, too, was showing the connections between centers and institutes while giving each enough space to retain its individual identity and develop its own collections of knowledge. The very consolidation of a project like this in such a complex organization is a success in itself.
Last, but not least, the third pillar of the repository is its Technical Office, a small team of librarians and IT staff located within the CSIC Libraries Coordination Unit. The Office designs the repository's policies, main development strategy and work agenda; oversees the functioning of the platform; carries out technological innovations and adds new services for end-users; trains CSIC researchers and librarians in how to use the repository; raises awareness of open access in general; and embarks on a number of dissemination and partnership activities at national and international levels. The Technical Office also undertakes a large part of the deposits and is the only body with administrator permissions to work across all communities, sub-communities and collections.
Content and Structure of Digital.CSIC
The CSIC scientific repository mirrors the institutional organization in eight scientific and technical areas to which a ninth one, CSIC
Volume 37, Number 1, 2011
Central Services, has been added. Developed on DSpace software, the repository has just migrated from version 1.4.2 to version 1.6.2, and staff now have the opportunity to make some layout changes to the Web site for easier navigation and richer discovery of resources and information for end-users. Content growth is channelled through the coordination of the repository's Technical Office, staff from the network of CSIC libraries, and a growing number of researchers who self-archive. Digital.CSIC seeks not only to provide seamless and permanent access to current and future research produced and/or financed by CSIC, but also to make visible, organize, and preserve as much as possible of the science CSIC has produced throughout its long history. The repository is a useful platform for this enterprise because it guarantees permanent URLs alongside a stable infrastructure and a clear agenda and, more importantly, has the Presidency's commitment to make this happen.
The benefits derived from a digital archive like this are enormous on all fronts (institution, center, library, or individual researcher). In a nutshell, a record of the research the institution has produced over the years is a very valuable tool for conducting analyses of different kinds, whether to evaluate production and its degree of dissemination, to study research patterns and the most productive areas of interest, or to gain insight into the history of centers and institutes through their outputs, staff, and achievements. For CSIC libraries, this content adds value to the rich collections of scientific resources put at the disposal of end-users.
CSIC enjoys healthy publication habits. By way of illustration, Scopus indexes more than 85,000 publications by CSIC researchers. Official figures give a more precise idea of the potential for development of Digital.CSIC. In 2009 alone, CSIC research centers and institutes produced 9,741 SCI/SSCI/AHCI articles, 1,950 non-SCI/SSCI/AHCI articles, 368 books, 1,784 book chapters, 104 other monographs, 4,634 proceedings and 3,409 posters in international conferences, 2,384 proceedings and 1,618 posters in national conferences, 793 PhD theses and 180 patents.[3] Between 2002 and 2007 CSIC researchers produced more than 60,000[4] scientific outputs, leaving dissemination material aside. Trends in 2008 were similar, with a harvest of 8,754 SCI/SSCI articles, 1,762 non-SCI/SSCI articles, 314 books and 672 PhD theses, and a total of 19,725[5] scholarly citations obtained. The challenge is to enrich the digital repository not only with all of this but also with the huge volume of research from past decades that can still be of interest to scholars, as the usage statistics often show.
Digital.CSIC surpassed 26,000 items at the end of October 2010 and more than 82% of the content is available in full text. At the outset of the project, the main emphasis was on articles and conference papers authored by CSIC researchers; progressively other types of research outputs have been incorporated, as explained in our content policy (http://digital.csic.es/politicas/#politica1). To date, Digital.CSIC contains articles and conference papers, theses and dissertations, book chapters/parts and books, patents, software, reviews, posters, music compositions, datasets, photographs, videos and other multimedia material. Likewise, there is room for so-called grey literature, scientific output that is rather difficult to find on the Web and worth organizing and exposing through the repository. At the request of researchers and libraries, the Technical Office considers the creation of new collections if none of the existing ones can accommodate a specific sort of research. Equally valuable, Digital.CSIC reflects the evolution of centers and institutes while keeping content organized according to clear criteria. Alongside sub-communities that correspond to existing institutes appear other categories that house the research produced by centers and institutes that no longer exist or have changed names (http://digital.csic.es/community-list). The main languages are English and Spanish, and the United States, Spain and the United Kingdom are the top three user countries.
Main typologies                        Number of items
Articles                               19,331
Conference papers and proceedings      1,889
Books, chapters and/or extracts        1,240
Patents                                1,014
Working papers                         777
Theses and dissertations               453
Music compositions                     279
Presentations                          168
Posters                                167
Divulgative and learning material      169
Technical Reports                      91
Videos                                 21
Software                               8
Maps                                   7
Datasets                               4
Figures as of October 28, 2010.
Statistics as of October 29, 2010.
In June 2010 the staff revisited the Digital.CSIC strategy for content development, which can be divided into the following action lines:
• focus on CSIC centers and institutes with the least presence in the repository;
• launch a campaign to promote new types of material, which includes the search for and capture of the so-called “hidden pearls,” works created in past decades but still highly valuable;
• expand marketing and dissemination channels targeting CSIC researchers;
• provide reinforced assistance for those centers and institutes without an individual library.
[3] Data supplied by Office of CSIC Presidency.
[4] Plan de Actuación 2010–2013.
[5] Plan de Actuación 2010–2013.
champion researchers, and provide the scientific community with a system to measure the impact of their research available through Digital.CSIC. As a result, this new module includes not only usage statistics by country, month and year, but also general statistics that give an overview of total figures, content typologies, the most active self-archiving researchers and the distribution of content by geographical and scientific area. The third component of this module generates center/institute data, with information on the evolution of content by month and year, its classification by typology, the most downloaded and visited items authored by researchers in the center/institute, and so on. An article explaining the structure of the module was published in September (http://digital.csic.es/handle/10261/27913) and, as this set is intended as a contribution to DSpace development, the plan is to release the code in the near future. See examples that follow.
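The kind of aggregation described for the statistics module can be illustrated with a minimal Python sketch. It is not the code of the Digital.CSIC module, which is not shown in this article; it assumes that usage events are available as (date, country code, item handle) records, and the sample events and function names below are hypothetical.

```python
from collections import Counter

# Hypothetical download events: (ISO date, country code, item handle).
# The events are invented for illustration only.
EVENTS = [
    ("2010-09-03", "ES", "10261/23139"),
    ("2010-09-17", "US", "10261/23139"),
    ("2010-10-02", "US", "10261/27913"),
    ("2010-10-11", "GB", "10261/23139"),
    ("2010-10-29", "ES", "10261/27913"),
]


def downloads_by_country(events):
    """Count download events per country code."""
    return Counter(country for _, country, _ in events)


def downloads_by_month(events):
    """Count download events per year-month (first 7 characters of the date)."""
    return Counter(date[:7] for date, _, _ in events)


def most_downloaded(events, n=1):
    """Return the n most downloaded item handles with their counts."""
    return Counter(handle for _, _, handle in events).most_common(n)


if __name__ == "__main__":
    print(downloads_by_country(EVENTS))
    print(downloads_by_month(EVENTS))
    print(most_downloaded(EVENTS))
```

Restricting the input events to the items of one center or institute before calling these functions would give the center-level view described above.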
The software upgrade also falls within this content development strategy, and thus staff will start making systematic use of SWORD to automate deposits from other platforms.
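As a rough illustration of what an automated SWORD deposit involves, the following Python sketch builds (but does not send) a deposit request in the style of the SWORD 1.x profile: an HTTP POST of a zipped package to a collection's deposit URL. The endpoint URL, credentials and file name are hypothetical, and the packaging identifier is the one commonly used for DSpace METS packages; the actual Digital.CSIC configuration is not described in this article.

```python
import base64
import urllib.request

# Packaging identifier commonly used for DSpace METS packages in SWORD 1.x
# (an assumption about the target repository's configuration).
METS_DSPACE_SIP = "http://purl.org/net/sword-types/METSDSpaceSIP"


def build_sword_deposit(collection_url, package, filename, user, password, dry_run=True):
    """Build a SWORD 1.x style deposit request without sending it.

    collection_url: deposit URL of the target collection (hypothetical here).
    package: bytes of the zipped deposit package.
    dry_run: if True, set X-No-Op so a compliant server validates the
             deposit without storing it.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": METS_DSPACE_SIP,
        "X-No-Op": "true" if dry_run else "false",
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(collection_url, data=package, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder endpoint and credentials, for illustration only.
    request = build_sword_deposit(
        "https://repository.example.org/sword/deposit/123456789/1",
        b"zip bytes would go here",
        "item.zip",
        "depositor",
        "secret",
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the deposit; keeping the dry-run flag on until the package format has been validated is a cautious default.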
Nonetheless, all CSIC scientific and technical areas are already represented in Digital.CSIC and the gaps among them are being closed at a steady pace. In this regard, it may appear striking that the highest number of items corresponds to an area that is usually labeled as rather technophobic, that is, Humanities and Social Sciences, an area encompassing nineteen institutes. The Tomás Navarro Tomás Library (http://biblioteca.cchs.csic.es/), which serves all of those centers based in Madrid, has played a key role by encouraging researchers to self-archive, and its systematic upload of content on their behalf has been instrumental in attaining this result. Next come Natural Resources, Physical Sciences, and Agricultural Sciences, with small differences in the number of items. CSIC Central Services is a cross-sectional community in the repository, created “ad hoc” to house material produced by “non-scientific” bodies within CSIC, namely, the Press Department, Office of Technology Transfer, Libraries Coordination Unit, the Vice-presidency of Organization and Scientific Culture and the Vice-presidency of Scientific and Technical Research.

Number of items in scientific and technical areas
Agrarian Sciences                      3,771
Biology and Biomedicine                1,895
Chemical Sciences and Technologies     2,818
CSIC Central Services                  203
Food Science and Technologies          813
Humanities and Social Sciences         6,917
Materials Science and Technologies     1,902
Natural Resources                      4,356
Physical Sciences and Technologies     4,111
Most downloaded items within the collections of the Institute of Economic Analysis, as of October 29, 2010.
Figures as of October 28, 2010.
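To make the relative weight of each area easier to read, the per-area item counts listed above can be turned into percentage shares. The short Python sketch below does only that arithmetic on the published figures; the variable and function names are illustrative.

```python
# Item counts per scientific and technical area, as listed above
# (figures as of October 28, 2010).
ITEMS_BY_AREA = {
    "Agrarian Sciences": 3771,
    "Biology and Biomedicine": 1895,
    "Chemical Sciences and Technologies": 2818,
    "CSIC Central Services": 203,
    "Food Science and Technologies": 813,
    "Humanities and Social Sciences": 6917,
    "Materials Science and Technologies": 1902,
    "Natural Resources": 4356,
    "Physical Sciences and Technologies": 4111,
}


def area_shares(counts):
    """Return each area's percentage share of all items, largest first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
    return {area: 100 * n / total for area, n in ranked}


if __name__ == "__main__":
    for area, share in area_shares(ITEMS_BY_AREA).items():
        print(f"{area:<36} {share:5.1f}%")
```

On these figures Humanities and Social Sciences accounts for roughly a quarter of all items, with Natural Resources, Physical Sciences and Technologies, and Agrarian Sciences following at similar levels.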
By taking a closer look at figures at center/institute level, both the Technical Office and individual institutes can keep track of growth on a monthly and yearly basis.

Top 5 institutes at Digital.CSIC                             Number of items
Institute of History (Madrid)                                1,259
Aula Dei Experimental Station (Zaragoza)                     1,100
Institute of Marine Sciences (Barcelona)                     977
Institute of Natural Products and Agrobiology (Tenerife)     899
Corpuscular Physics Institute (Valencia)                     701
Figures as of October 28, 2010.
Graph: Representation of all CSIC scientific areas in the institutional repository.
In fact, in spring 2010 a set of locally developed statistics was released to enrich the existing usage statistics. A number of CSIC libraries requested this service and felt that usage statistics would be very helpful to monitor their degree of participation in the repository, the trends in content growth in their institutes, identify
Digital.CSIC rests on a hybrid work model: CSIC librarians and researchers may upload content themselves, or authors can use the so-called Delegated Archiving Service and submit their works to their center's library to have them deposited on their behalf. The Technical Office has set clear policies on issues related
use the institutional repository, with the official CSIC Web site or the repository platform itself being the most frequent access points. Most within this group (62.9%) regularly visit the repository to search for and download works of interest, while around half of them self-archive their works at a pace of around five works yearly. To a large extent, answers in this section are consistent with the primary findings concerning open access in general: namely, researchers who self-archive or submit their works to their libraries or the Technical Office are primarily encouraged by the prospect of gaining enhanced visibility and accessibility. They also do so to support the open access movement and follow CSIC institutional recommendations. The final free-text comments section of the survey indicated that more services and more content are the desired improvements. Search functionalities ranked very high, with many asking for advanced search options, including links to topically related open access resources and links to researchers' CVs on the results list. Almost 40% also supported adding digitized material to the collections of the repository, whereas a large majority strongly favored integrating Digital.CSIC and other CSIC databases containing information about scientific production into one system. Other suggestions for services included more training sessions and awareness-raising about open access, as well as Digital.CSIC assistance with copyright issues. From the outset, Digital.CSIC has been keen on incorporating add-ons and homegrown functionalities into the platform to better serve its community of users.
A number of functionalities have paved the way towards gradual improvement of the repository, including linking to the CSIC Virtual Library through the SFX resolver whenever an item under restricted access is accessible through CSIC licensed material; a tool for showing news on the repository's home page; the so-called “auto-complete” functionality, which makes the insertion of some metadata easier and quicker when depositing items; and the “Digital.CSIC on your web” API, which enables searches of Digital.CSIC content from external Web sites. A new wave of add-ons will follow shortly, based on the Technical Office agenda and on the survey responses.
to the upload of content in the repository (http://digital.csic.es/politicas/) and promotes wide participation by conducting frequent training workshops for CSIC librarians and researchers and by producing handbooks, manuals and useful resources that explain in plain language the issues at stake when filling out metadata fields (http://digital.csic.es/handle/10261/20101) and when checking copyright issues (http://digital.csic.es/copyright/). At present, the annual average of deposit uploads is distributed as follows: 45% comes from the libraries network, 37% from the repository's Technical Office, and the remaining 18% from individual researchers.
Engagement with CSIC Researchers
Digital.CSIC is a tool for CSIC researchers. As such, all value-added services, innovations and improvements aim to meet their needs and to enhance the dissemination and accessibility of their research outputs. After some initial commonplace skepticism, mostly derived from copyright concerns, lack of knowledge about open access and repositories and a considerable degree of technophobia (reactions which are not unknown to most institutional repositories), a growing number of CSIC researchers have become familiar with the project. In this regard, the abovementioned set of statistics is highly valued and utilized by researchers. As a success story in this respect, it is worth mentioning the case of two CSIC researchers who, a few months ago, expressed interest in uploading the massive “SPEIbase: a global 0.5° gridded SPEI data base” (see http://digital.csic.es/handle/10261/23139), which shows the evolution of climate change in the world over the last century. Since it became available through Digital.CSIC, its usage statistics have continued to rise at a dramatic pace and its authors report a visibility and accessibility that they did not expect at all. Since its beginnings, Digital.CSIC has made efforts to cultivate an open dialogue with the CSIC scientific community and to make sure that their priorities and views are addressed. A major move along these lines was the survey that the Technical Office conducted in spring 2010 to analyze how researchers value the repository after almost three years of existence and to place that value in context by learning more about their publication habits and their attitudes towards open access to scholarly information. From a list of 6,879 names, 832 researchers responded to the survey. The primary findings have turned out to be very insightful. A summary in English is available at http://digital.csic.es/handle/10261/28547.
Generally speaking, open access is still little known among CSIC researchers and their publication trends follow mainstream behavior, with Elsevier, Springer and Wiley ranking high among publishers. The most cited open access benefits are enhanced visibility and accessibility, and the ensuing impact on citations and professional careers, as well as the ethical value of disseminating via open access research funded with public monies; however, only a minority is already making good use of the many quality open access resources and initiatives. Scientists in the areas of Physical Sciences, Biology and Biomedicine and related areas are the most active users of open access as a publishing platform. For them PLoS, Biomed Central, PubMed Central, arXiv, SPIRES, NASA ADS and other platforms are familiar, whether as readers or as active contributors. Within this framework, it is also noteworthy that some researchers reported having paid open access fees from their own pockets, a remarkable and telling fact not to be overlooked or ignored by a publicly funded institution such as CSIC. Answers from this survey have also helped greatly to verify whether the work agenda matches suggestions by the respondents. A large part of the survey centered on the evaluation of Digital.CSIC. One third (33%) of surveyed researchers claimed to
The Way Forward
Digital.CSIC has recently embarked upon a dissemination campaign intended to make the institutional repository and the open access movement more widely known within the institution. The redesign of the platform's Web site to make room for spaces dedicated to open access issues for researchers and librarians, the preparation of promotional brochures (http://digital.csic.es/handle/10261/28538) and comics (https://digital.csic.es/handle/10261/28333), and a number of studies underway are lines of work that all pursue the same goal. Another recent initiative is the bulletin CSIC Abierto (http://digital.csic.es/revistacsic-abierto/?locale=en), issued every three months. It highlights research at CSIC centers and institutes and describes how that research is freely available through the institutional repository. We have just published the second issue (http://digital.csic.es/handle/10261/28605), coinciding with the celebration of Open Access Week 2010. It features our champion researcher, scientist and veterinarian Angel Mantecón, with 422 works (his entire scientific CV), and the work done by the institute's library and the direction of his center, The Experimental Agrarian Station, to enable open access to knowledge. The next major enterprise will be the building of an integrated information system. At present, CSIC has three major databases with detailed information about the research produced by its community of scientists and scholars. This undertaking is not
free of technical, organizational and information management challenges; however, the outcome will be the development of a unified, and therefore ergonomically and economically efficient, system from which it will be possible to evaluate the scientific production of CSIC, analyze it from different perspectives and implement the priority of the ongoing CSIC Action Plan: that of measuring the impact of its own scientific production. In order to do so, the institutional platform that accumulates current information on scientific publications across all centers and institutes, which are evaluated yearly, will be opened to applications from the main bibliographic databases to allow for the systematic capture of thousands of metadata records, a process that will avoid the manual data entry that has been the rule up to now. Through the integration of these platforms, Digital.CSIC will benefit greatly, as efforts that are now concentrated solely on metadata creation can be channeled into the development of new services for end-users. Further, through this project Digital.CSIC, and open access with it, will be placed at the core of the CSIC plan dealing with the evaluation of its scientific production. The design of this ambitious project is closely linked to two nationwide projects in which CSIC takes part and which are currently being developed under the umbrella of the Spanish Foundation of Science and Technology: on the one hand, the development of a standardized scientific CV (https://cv.normalizado.org/index.jsp) and, on the other hand, the building of a persistent author identifier for Spanish scholars and researchers that will comply with international standards.