Atomic Data and Nuclear Data Tables 79, 185–186 (2001) doi:10.1006/adnd.2001.0875, available online at http://www.idealibrary.com
EDITORIAL
DATA JOURNAL ARTICLES IN THE ELECTRONIC PUBLISHING ENVIRONMENT
Since the inception of this journal, many data tables published in Atomic Data and Nuclear Data Tables (ADNDT) have had a close connection to the computer. Even early on, when the journal’s production processes were based on creating “camera-ready tables,” tables of theoretical values were being calculated and printed in photo-ready form with computers prior to their submission for publication. Over time, experimental data tables were being generated as computer printouts in increasing numbers as well. Thus, ADNDT was often printing in hard copy the numerical output of computers, a fact which led many to suggest that we should provide our readers with data tables in electronic form as well as on paper. But, though the topic of electronic handling of data material was much discussed by this journal’s editors and publisher, it was only when the technology of Internet-based electronic publishing of scientific content became relatively mature that an online version of ADNDT became available along with online versions of other physics journals. Even so, tables in the PDF file format of the standard online article are not digitally readable as numerical data.
A trend concomitant with the growth and development of electronic publishing is, of course, the proliferation of computational data themselves. In the current environment, virtually unlimited amounts of numerical data can be generated, as for instance from a code including ever more atomic or nuclear configurations or ever finer grids of kinematic variables. Moreover, increasingly more powerful and sophisticated experimental methods have often provided impetus and rationale for these detailed calculations with their associated large data sets. Of such abundant data, then, what should be published in a journal article, how, and why?
The more general question of why publish in a scholarly journal at all when research results can be so readily communicated directly via the Internet is one we will not be revisiting here. By now, the research community appears to have come to terms with a complementary relationship between rapid electronic communications in the form of preprints and data files directly from authors and eventual publication of these works in refereed journals.
There are, however, particular considerations which have always guided editorial judgments on what to publish in this data journal, how, and why. Principal among these considerations is that the data should be of broad utility and should be presented in a manner that facilitates their correct and convenient usage. The question of usability in turn directs the decision on the best form in which to provide these data. A “form” particularly tied to the use of data as input to other computations is, of course, the electronic file in ASCII, as opposed to the tabulation in hard copy or the PDF form of the online article.
With the publication of this final issue of the year 2001, there are now 17 articles in ADNDT which have associated ASCII tables posted as supplementary material accompanying the online articles. In many instances, the tables contained in the published articles are simply replicated in ASCII format for computer-readability. In other instances, reflecting the still growing propensity for producing large data sets and an awareness of the need for alternate ways of presenting such large bodies of data, the published articles have adopted a style where synoptic overviews of the data are given precedence over explicit detailed tabulations. Such a style of presentation is not a departure from this journal’s design and purpose in any fundamental sense; it does, however, represent an extension in thinking about how publishing large sets of numerical data in a useful way can be approached.
The task of presenting large sets of numerical data in a useful way can be thought of in terms of three agendas: (i) informing the reader of the scientific basis and authority for the data, (ii) representing the data as fully as possible and in the most economical way commensurate with preserving the information’s accessibility in the various circumstances the data might be used, and (iii) instructing the reader in the proper use of the data. The first and third of these agendas are in fact relevant to all ADNDT articles: They are achieved through the introductory text of the article, the Explanation of Tables, and Examples of Use of Tables as may be deemed helpful. It is the second agenda that is more distinctly pertinent to very large data sets.
0092-640X/01 $35.00 © 2001 Elsevier Science. All rights reserved.
Atomic Data and Nuclear Data Tables, Vol. 79, No. 2, November 2001
A. LI-SCHOLZ
To implement (ii), the approach to take again depends on the issue of utility. One begins by asking how a tabulation is likely to be used: as a look-up table or as input data to other calculations. When the primary use is to look up particular values or to survey a pattern, priority generally should be given to printing at least a representative subset of the data as data tables within the article. This can be done by printing an abbreviated grid of variables or by printing the data from selected regions of greatest user interest. The article should additionally make clear what data omitted in the published article are available to be looked up in the electronic ASCII tables. If, on the other hand, the main use of the data is as input to subsequent calculations, or if the data are too voluminous to be represented in a meaningful way even by making a selection, then the approach to take is to print only samples of the ASCII tables with the article and to provide detailed explanations of the tabulated quantities to guide in the use of the ASCII tables. Regardless of how the data are expected to be used, the article should give a full account of the range and scope of the extended ASCII tables’ contents. Often, an overview of even quite large data sets can be conveyed economically and well through graphical representations which readily reveal systematic patterns and also permit users requiring only a low level of precision to read off approximate values.
The various presentation approaches described above have all been employed in the recent ADNDT articles dealing with large data sets. Full numerical listings for these articles are provided as ASCII files posted as supplementary material to the online papers. ASCII files available with any ADNDT article can be called up by the article’s digital object identifier (doi) with a suffix /dat (as in http://www.idealibrary.com/links/doi/10.1006/adnd.2001.0868/dat). Data users can access the ASCII files directly, without necessarily downloading the online article. Thus, in this publication format, the traditional presentation of data information in a paper is now complemented by Internet delivery of associated computer-readable data.
Over the course of the 37 years of ADNDT’s publication, articles have communicated “data” in many different ways, through tabulations, graphically, using analytical and semiempirical expressions from which to calculate data, and so forth, thereby extending the idea of data publishing beyond the specific form of the printed numerical tables of the journal’s original conception. The approach described above, delivering information through an article that fully characterizes the data and tabulations that are computer-readable, has expanded the “data” and “tables” connection in Atomic Data and Nuclear Data Tables yet again. As the technology surrounding electronic data delivery and data generation, particularly with respect to large data sets, has advanced, so, too, has this journal evolved to ensure that what and how we publish are of continued benefit to both authors and audiences of the data article.
Angela Li-Scholz, Editor, 1983–2001
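As an illustration of the access scheme mentioned in the editorial (an article's doi followed by the suffix /dat), the address of an article's ASCII tables could be assembled as in the minimal Python sketch below. The function name `ascii_tables_url` and the constant `IDEAL_DOI_BASE` are hypothetical names introduced here. The base address is simply read off the one example URL given in the text, so the pattern is an assumption generalized from that example, and the sketch only builds the string without contacting the server.

```python
# Hypothetical helper: builds the address of an article's ASCII tables
# from its DOI, following the single example URL given in the editorial.
# Assumption: every ADNDT article uses the same base address and "/dat" suffix.
IDEAL_DOI_BASE = "http://www.idealibrary.com/links/doi/"


def ascii_tables_url(doi: str) -> str:
    """Return the supplementary ASCII-table address for a DOI string."""
    cleaned = doi.strip()
    # Accept the "doi:10.1006/..." form used in the journal's header line.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:]
    cleaned = cleaned.strip().strip("/")
    if not cleaned:
        raise ValueError("empty DOI")
    return f"{IDEAL_DOI_BASE}{cleaned}/dat"


if __name__ == "__main__":
    # Reproduces the example address quoted in the editorial.
    print(ascii_tables_url("10.1006/adnd.2001.0868"))
```

Keeping the base address in one constant means that only a single line would need to change if the hosting service were to move.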