Digital Library Opportunities (1)
by Clifford Lynch

The Journal of Academic Librarianship, Volume 29, Number 5, pages 286–289

Clifford Lynch is Executive Director, Coalition for Networked Information, 21 Dupont Circle, Suite 800, Washington, DC 20036 <[email protected]>.

(1) This is an edited version of the transcript of my talk at the Digital Libraries Symposium; it is not a formal paper, and in editing the transcript I have preserved the informal nature of the talk. My thanks to Shelley Sperry of CNI for help in editing that transcript.

In developing my comments, I have had a couple of advantages. One is that I had a reasonably good idea of the scope of the other two talks that Deanna Marcum and Ann Okerson have presented. And so I am not going to spend much time at all on issues around collections, which I think are very critical and have been well addressed in the two prior talks that you’ve heard. I took the opportunity to think about the topic that Karen Hunter asked us to address a little bit differently. I chose not to try to develop fully articulated visions of the future, but rather to stress constructive opportunities that we might take and, in particular, opportunities that did not necessarily call, at least in their early stages, for massive intergalactic collaborations and the construction of new national and international programs.

There are, unquestionably, places for such initiatives. And I think Deanna Marcum, in particular, really built the case for the importance of dealing with the enormous body of legacy material we have in various physical forms and thinking through how we are going to bring that into the digital world. This area is just a superb opportunity for constructing systematic collective action. I thought instead I would focus on places where maybe we can make a difference through grassroots-level individual institutional initiatives, and places where intellectual shifts are important, where we really need to consider how the world is changing and to recognize that the opportunities for the future call for thinking very differently than we have thought about things in the past.

So, what I am going to offer are some sketches of a couple of opportunities that are very much on my mind as I think about how we engage the future. And these are not just opportunities, but are opportunities that I believe are not getting enough attention. Overcoming the obstacles to progress in these areas could have a high payoff.

As I thought about what to cover in this talk, I saw someone with a tee-shirt with a slogan on it, one of these clever sayings that I had heard before but had not thought about recently, and the slogan has been very much on my mind as I have put this together. This is the statement: “The future is already here; it’s just not uniformly distributed.” (I’m not sure who this saying comes from, perhaps the science fiction writer William Gibson, or perhaps someone from the MIT Media Lab community.) I think there is a great deal of truth to this slogan, however. If we look hard, we can find fragments of futures already here in the good work that various people are doing. What we need to do is make these innovations more evenly distributed, better known, more accepted, and more universal.

I have three areas that I want to focus on. The first area is being sensitive to the whole question of how scholarly communication is changing. Now, I think if we are really honest, if we look at how we have been making technology investments over the past 20 or 30 years, we have mostly been modernizing print-based scholarly publication. We now create things that are designed for print, store and move them around on the Web, print them on demand, search them on-line and do all kinds of things like that. But, at some level, the basic concepts of authoring (of making arguments, of conveying information and insight) haven’t yet changed very much in the digital world.

I think that if we look hard we can start seeing those changes happen. We can start seeing pioneers and explorers out there, interestingly enough, perhaps more commonly in the humanities than in the hard sciences, and more commonly in transformations of the idea of the monograph rather than the journal article. But we can see authors starting to grapple with what our various kinds of literary and scholarly genres should turn into in the digital world: how we write, how we explain and build arguments that are based on complex interactive models, databases and datasets, how we document those arguments, and how we make explicit linkages in ways that are much deeper than we have done in the past between data, algorithms, and arguments. We see these authors grappling with where interpretation and assessment fit into these new constructs.

We are just starting to see these changes take place. In fact, it is striking to me that most of them (and I choose the term “scholarly communication” very deliberately here in opposition to “scholarly publishing”) are happening entirely or largely outside of any of the vehicles and systems for scholarly publishing. These are individual efforts. We really don’t understand how to put them on a routine economic footing at this point. Many of them are simply being underwritten in one form or another. But they point the way to how the next couple of generations of scholars, who have grown up from the very beginning authoring in the digital environment, will behave, moving farther and farther away from preconceived notions about how a word processor is just a better way of producing traditional printed articles and books, and moving toward a world where video, images, sound and interactive things are on a much more equal footing with text, and text is no longer privileged in the ways it has been historically.

This is going to change what libraries do. It is going to change what scholars do.
It is going to change the character of the information that we strive to manage and curate and provide access to. It is going to change, I believe, relationships among scholars, people who know about information management, and libraries, publishers, and authors. This is going to really shake things up, I think, in the long run; the “long run” here being maybe as long as the next 20 years. We need to pay very close attention to this. Libraries don’t drive change in
scholarly communications. They can facilitate it. And I would note that there are things going on today that are very important steps. You have heard mention of institutional repositories, I believe, in both of the earlier talks. To my mind, one very fundamental way to characterize investment in institutional repositories is that libraries and universities are investing broadly to make the world safer and more hospitable for new forms of scholarly communication that exploit the digital medium. That’s what institutional repositories do. They are giving us a safety net both for access and preservation that has been missing, and that has not been there with these works that are being haphazardly produced outside of well-established systems of publication and preservation and dissemination. That is an important step in many ways, including a step towards legitimizing these innovations as credible scholarship. I think it is very easy, though, to underestimate how wide-reaching these changes are going to be and the extent to which they are going to require libraries both to support and to re-think what they are doing as the whole nature of scholarly literature changes. There is a wonderful recent essay by Jerome McGann, “Interpretive Networks,” Chronicle of Higher Education, December 13, 2002, that looks at the future of literary texts in the digital world. One of the observations the author made that really shook me up a little bit— because I recognized how true it is—is that, in the next 20 years or so, we probably are going to have to redo critical editions of just about every major literary text for the digital world. 
Critical editions in the digital world are going to be very different beasts than what we are used to in the print world—just as we are starting to learn that encyclopedias, dictionaries, thesauri and other kinds of tools are not the same in the digital world as they are in print, and that just digitizing the pages is a very far cry from reconceptualizing these things in a digital networked information environment. So this is one area where I think there are enormous opportunities and enormous potential, and where libraries need to be investing and building partnerships with faculty, with creative explorers, trying to make investments in things like institutional repositories, and trying to help people think through the structure of information in these settings. Remember another thing here: there is a very strong bias built into our staid,
comfortable, traditional world of print publication that says that our readers are human beings. That’s not entirely true. Our readers in the digital world are both human beings and various kinds of programs that want to mine, correlate, index, update, compile, and analyze the various sorts of structured data and information. We are going to need to continue to progress into markup formats for certain kinds of works that explicitly recognize that we are writing for both an audience of programs and an audience of people. That is going to be a big shift. There are other shifts at work here. It’s hard, sometimes, to recognize these for what they are. I don’t need to tell those of you from higher education institutions how learning management systems, or course management systems if you prefer, are starting to really proliferate and roll out in higher education, and I think, inevitably, they will reach down over time into K-12 schools, as well. Now, one way to think about these is as a new authoring environment with all of its own peculiarities and idiosyncrasies. It will produce new genres of learning materials. We have already seen this with an earlier generation of tools, most notably Microsoft PowerPoint. There is now a well-established series of genres that are built upon PowerPoint: the classroom presentation, the sales presentation, and the business presentation. Horrible as these may be, they are real. I am told that PowerPoint (not Word!) is the most heavily used piece of software for K-12 students today. And we are going to see the constraints and capabilities of learning management systems and of the networked information environment more broadly shape a whole generation of descendants of the textbook. We need to be very aware of what is going on here. Now for my second set of opportunities—and here I am going to stay at a very high level, because there are a whole lot of intertwined opportunities. We have heard a lot about collections. 
The topic of this symposium is digital libraries. As we gain experience with the digitization of material, I think that we are starting to understand that digital collections and digital libraries are not coterminous; they are not equivalent. Digital libraries are more than digital collections; they are software systems that are underpinned, in part, by digital collections, but, in fact, there is not a one-to-one relationship. We may have many digital libraries presenting material that is drawn from many digital collections in very complicated ways, over time. We understand digital collections, I would say, a whole lot better than we understand digital libraries. Digital libraries get us into a lot of terra incognita, and they invite us to go places where libraries have been very scared to go, where they have not gone historically. I think the questions raised in this environment are complicated, profound, tantalizing. And they demand that we reexamine some of our assumptions.

I will just give you a couple of examples here. To what extent should a digital library appropriately incorporate analytic and authoring tools? Surely, when we think of digital libraries as they shade into collaboratories (collaborative work spaces, analysis and research, and learning environments), analytic tools, authoring tools, and annotation tools become very important. These are things that libraries typically have not gotten deeply involved in. They are things that are different from collections and organizational or access services, but in the world of digital libraries may be very tough to disentangle.

Let me give you just one other example of the opportunities involved in rethinking the set of services and functionalities that comprise a digital library system. Look at what we have learned so far about successful digital libraries, and I use the term “digital library” in its most general, its broadest, its most catholic interpretation, to cover things that need have no connection with libraries as institutions. Digital libraries, as I frame them, include all kinds of haphazard commercial and not-for-profit ventures that have popped up and, sadly, sometimes have died on the Web, often without the active collaboration or engagement of libraries as the institutions that we understand them to be.
One of the things that we have learned is that there is a large class of these things that are successful because they combine content with community: community contribution, community involvement, and community annotation; they harness collections of people with common interests, not just content. Now, how can we exploit this knowledge if we take a view that says that privacy is such an overarching imperative, such a non-negotiable fundamental value, that the only kinds of systems we are willing to build are anonymous systems? I would suggest that we have seen a lot of evidence that, in fact, people are sometimes not just willing but eager to be identified, to become part of a community, and that this kind of technology is one of the strengths of digital libraries.

I think that, as we go down the path to digital libraries, we are going to need to revisit some of these assumptions about privacy and anonymity. I’m not suggesting that there is no role for anonymity. I absolutely believe that there are situations, substantial numbers of them, in which people want to interact anonymously and, in fact, look to libraries as one of the very, very few places left in this society where they have any hope of doing that. But I would suggest that, alongside those situations, there are other scenarios in which people want to be known, particularly in benign settings where their personal information won’t necessarily be remarketed, where opt-in can actually be trusted, where they believe that a stated personal data use policy might be seriously intended. I think that we’re going to need to become much more nuanced in our assumptions about identity, privacy, and informed consent, as we really start to explore digital libraries.

I could go on for a long time about the different kinds of opportunities that digital libraries raise. But I am not going to do that here. The core message is that the bundle of services and features that make digital libraries appealing, effective, successful, and engaging to users may be quite different than the set of things that we have done in physical libraries in the past. And we need to be open to some very different thinking about that.

The third and final area of opportunity that I want to raise today builds on this question of being very flexible and open-minded in our thinking about what a digital library is and what it does. The opportunity is the demand that users are starting to make to collapse information spheres that historically have been very independent or, if not collapse them, at least make them interpenetrate more gracefully.
We now have people building rather substantial collections of personal and unpublished information of their own, some of which they are willing to share in various controlled settings on their machines. The personal computer truly has become personal for many people. And it’s not just a personal computational device; it’s a personal filing device. It’s a personal database. There is a lot of information there. People are starting to roam around with rather extensive collections of e-mail and downloaded mailing lists and other goodies. Personal machines
also, of course, hold and manage personal trails of interactions with information resources, as well as local information. We are seeing a recognition in organizational settings that organizational information, some of which is internal to the organization and unpublished, is an important asset and an important tool for people trying to work within organizations. We are starting to see much more structured approaches to that. And then there is the world of more public information, be it information that moves along commercial lines or information that is just given away in a truly public way, the domain of libraries, among other players. These are three very independent worlds right now: personal, organizational, and public. You use different tools to deal with them. You organize things or don’t organize them in very different ways in these three spheres. Indeed, even the legal structuring arrangements are very different. In the public sphere, we have copyrights and certain kinds of licensing. In the corporate world or the organizational world, we have different structurings. (Note also that a single individual may participate in many different organizational or corporate spheres.) In the personal world, we have yet other structurings. There is nothing that says that there is an absolute requirement that these always be separate spheres. Indeed, there are people, mostly coming out of the computer science world, who have enough hubris now to ask questions about how we can develop tools that usefully help people span these three historically insular and independent universes of information. And people need to span them. This is going to be very important for people as they try to become more effective in dealing with an overwhelming universe of digital information. This, again, is an area most libraries—at least the ones I’m familiar with— have tended to stay away from. But it is a place that I think libraries, in the future, ignore at their peril. 
Note that libraries are not the only players in denial; information science has ignored this issue to a scandalous degree, for example. I think there is an enormous opportunity there, and an enormous risk in not thinking about this area. In all three of the areas I’ve discussed, the questions are new enough and complicated enough that we don’t yet know the answers. We don’t even know all the questions. We know, I think, that there is something
there. There is an opportunity or many opportunities. These are opportunities that, I think, lend themselves to exploration, lend themselves to a diversity of initiatives and explorations at this point, in hopes that we can better understand
some of the questions and at least a few of the answers. And as I look at promising, yet manageable, approachable opportunities for moving forward the vision of truly useful digital libraries— indeed, digital libraries that will
transform the way people learn, work, research, communicate, and organize information—these are three paths that I think could be very fruitful, and I’m grateful to have the opportunity to talk with you today about them.
September 2003