On the promises of multimedia authoring


Information and Software Technology 1994 36 (4) 243-245

Stephen W Smoliar
Institute of Systems Science, National University of Singapore, Heng Mui Keng Terrace, Kent Ridge, Singapore 0511

'Multimedia authoring', for all its popularity, is still a very ill-defined concept. Yet there is considerable interest in the development of authoring technology to facilitate multimedia communication. A clearer understanding of what multimedia authoring is all about should begin with an examination of just how the needs of communication are being served. Such an examination is undertaken in the form of a case study of the IMPACT multimedia authoring system. It is seen that the architecture of this system is based on three key premises which support the objective of surmounting difficulties for the 'video amateur', but a critical review reveals significant logical weaknesses in each of these premises. The conclusion is that the lot of such amateurs will not necessarily be improved by more powerful multimedia software tools of the sort provided by IMPACT. Those amateurs require access to intelligent agents which embody the judgemental expertise without which those tools cannot be used effectively. The prospect of being able to develop such agents is briefly reviewed.

Keywords: multimedia authoring, expository writing, creativity

'Multimedia authoring', for all its popularity, is still a very ill-defined concept. This paper is based on a case study of the IMPACT multimedia authoring system¹, under the assumption that a better understanding of authoring is based on its contribution to communication. IMPACT was selected because its creators explicitly defined goals for facilitating communication, but each goal is based on a premise which justifies the implemented solution. The next section examines both these explicit goals and their associated premises. The following section then provides a critical review of each of these premises and its implications for communication. The final section concludes that communication will not be as enhanced by IMPACT's power tools as by intelligent agents with the expertise to use those tools effectively.

IMPACT

System objectives

IMPACT's primary ambition may best be stated in the words of its authors: 'to offer a system non-professional users can use to edit and create motion picture data as they desire'¹. It is assumed that these users wish to communicate through a document in which text and static images are supplemented with motion pictures. They are 'professionals' within the provenance of the contents of those documents who should not also have to be professional film-makers.

0950-5849/94/040243-03 © 1994 Butterworth-Heinemann Ltd

The development of IMPACT was based on the identification of three key problems:

(1) editing: extracting and arranging motion picture data from existing resources.
(2) filming: synthesizing motion picture data.
(3) creativity: conceiving of a communication document.

However, behind each of these problems is a premise which already suggests its solution. Therefore, before IMPACT can be assessed, it is necessary to identify just what these premises are.

The premises behind the objectives

Editing. The first premise is that editing can be facilitated by operating on specifications, rather than the artefacts (i.e. film objects) themselves. Editing is a time-consuming process, demanding a prodigious memory to keep track of myriad scraps of film. The 'non-professional user' who has to present, say, the virtues of a new copy machine design, cannot worry about such details while developing his arguments. Since he has to do some editing, he may benefit from approaching the task in terms of something other than the film itself: specifications.

Filming. The second premise is that movie composition will be facilitated by image processing tools which analyse prerecorded material for structure and content. Thus, one need only specify those objects necessary for a presentation and how those objects need to behave. Technology should be able to take care of the rest, even if it involves having that new copy machine operated by Madonna.

Creativity. Creativity lies in the ability to conceive of Madonna-as-operator as the element which really 'sells' that copy machine design. The third premise is that anyone is capable of having such bright ideas; but most never articulate them because they are too busy with 'non-creative chores'. Put another way, if none of us had to worry about persuading Madonna to be in a film (let alone meeting the cost of her time), we could all have ideas as brilliant as using her to promote new copy machine designs.

These premises presage a bold vision of new technologies. The question is whether that vision brings us closer to our original goal or distances us. Lest we get too dazzled by the promise of Madonna in future documents that we author, we need to ask whether these premises are really about facilitating communication.

Reviewing the premises

Editing based on specifications

Good writing, like good software, should be well-structured. Indeed, authors frequently benefit from 'specifications' of their texts, usually in the form of an outline which is then embellished by itemizing the points which need to be made and relationships among them. One may even take a 'top-down' approach to writing such a text, beginning with the thesis statement, developing those points which are required to support the thesis, developing the points required to support the supporting points, and so on.

However, a multimedia document is more akin to a film than to expository writing; and film production is a far more complex process. This complexity is captured in the three 'phases of production' formulated by David Bordwell and Kristin Thompson²:

(1) Preparation. The idea for the film is conceived and developed.
(2) Shooting. The images and sounds are recorded.
(3) Assembly. The material collected during shooting is composed into final form.

IMPACT fails to acknowledge that each of these phases is a different problem domain with its own expertise. Indeed, most professional film production requires many individuals with different skills participating in each phase. Effective production is more a matter of coordinating those individuals than of enabling a single person to perform all the necessary tasks.

A specification-based approach to multimedia authoring thus embodies two 'top-down' assumptions. One is that the act of communicating may be approached in a top-down manner: a well-structured hierarchy of goals and subgoals. The other is that the implementation of a communicating product may be similarly approached: the output of a single creator who simply needs enough control over his resources to see his bidding done. The former assumption is consistent with constructing outlines prior to writing; but it overlooks that writing really communicates through more subtle rhetorical techniques, rather than by just enumerating and supporting key points. The latter assumption holds for geniuses like Charlie Chaplin, who knew enough to command all the resources at their disposal²; but film production is usually a complex social process involving many individuals with a wide variety of skills. Unfortunately, the management of such a socially rich environment is rarely served by a simple top-down approach to either the product or the process that yields the product³.

Composition based on image processing

IMPACT's user is unlikely to have much expertise in any of the three phases of film production. Nevertheless, the second premise assumes that, with the appropriate tools, any user can contribute productively to the second phase (shooting). All that is required is some basic decision-making regarding what goes into a shot and a resource library of images. If the cut-and-paste techniques of 'clip art' already make it possible for Madonna's face to appear on your letterhead, IMPACT will allow you to 'cut-and-paste' Madonna, as an actress, in your own movie.

Unfortunately, there is more to shooting a film than cutting and pasting its elements. The real Madonna cannot simply be inserted into some environment (say, standing in front of a copy machine) and told to 'be Madonna'. Actors need direction; and the means by which directors communicate is not very well understood⁴. Lest one object that one need not work with actors, it is still the case that even a static image often requires the trained eye of a professional in order to communicate effectively.

Being a communicating author is not a matter of choosing the right objects to cut and paste. How those objects are pasted is the act of composition, which is the multimedia analog of rhetorical writing. It requires creative judgement which may not be well served by tools which simply extend the scope of what may be cut and pasted.

Eliminating 'non-creative chores'

This raises the final premise, that multimedia document production is beset by too many 'non-creative chores', but this may reflect inadequate experience in creation. Tasks which may appear non-creative in terms of the final product may actually provide a necessary service: for example, they may divert the mind, freeing it to wander through thoughts not intended for the creative task. Valuable ideas are often discovered during such wandering⁵.

Trying to develop tools for creativity is a risky business. People are creative because they work with their material in creative ways. Authoring can never do more than provide new material. To assume that such material can be fashioned to encourage creativity, particularly through eliminating 'non-creative chores', is arrogant folly; the result is more likely to encourage conformity than creativity.

Conclusions

IMPACT may thus fall short of its objectives of surmounting difficulties for the 'video amateur' in the areas of editing, filming, and creativity, simply because its designers have not really analysed the nature of the tasks in each of these areas. So is IMPACT pursuing a realistic objective; and, if so, is it pursuing it along a viable path? Certainly it is realistic to assume that there are many authors whose communication goals would significantly benefit from incorporating video effectively. Such authors generally know what material would serve their purposes, so the real problem resides in its effective use. This latter task is not a collection of routine activities which may be relegated to suitably designed power tools. Those tools only become powerful when they exercise a level of professional judgement which is beyond the reach of the amateur. The amateur does not need tools he is ill-equipped to use; he needs the ability to interact with intelligent agents who are equipped to use those tools.

Is this a realistic vision? Designing such agents is a primary objective of the Video Classification Project⁶. We are currently developing agents which contribute to the following tasks:

• Parsing. Video source is segmented into camera shots which serve as elemental units; when a suitable model is available, individual images may be decomposed into semantic primitives based on that model.
• Indexing. Camera shots are maintained in a knowledge base which facilitates their classification on the basis of available semantic models; knowledge base frames allow each clip to be augmented with the descriptive information based on those models.
• Retrieval and browsing. The knowledge base may be accessed through queries based on text and/or visual examples; or it may be browsed through interaction with displays of meaningful icons. The results of a retrieval query may be similarly browsed. Both retrieval and browsing appeal to the user's visual intuitions.

Such agents have capabilities similar to some of those in IMPACT; but we wish to explore them as a path towards capturing professional expertise, rather than as power tools for 'video amateurs'.
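The way the parsing, indexing and retrieval tasks hand material to one another can be illustrated with a minimal sketch. It is not the Video Classification Project's implementation, and everything in it is an assumption made for illustration: each frame is reduced to a single numeric 'signature', a shot boundary is declared wherever consecutive signatures differ by more than a threshold, the 'knowledge base' is a plain list of annotated shots, and the names Shot, parse_shots and ShotIndex are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Shot:
    """One camera shot: the elemental unit produced by parsing."""
    start: int  # index of the first frame in the shot (inclusive)
    end: int    # index of the last frame in the shot (inclusive)
    labels: set = field(default_factory=set)  # descriptive annotations


def parse_shots(frame_signatures, threshold):
    """Segment a sequence of per-frame signatures into shots.

    Assumption: a boundary falls wherever the absolute difference between
    consecutive signatures exceeds `threshold`. The paper does not state
    which segmentation rule its agents use.
    """
    shots = []
    if not frame_signatures:
        return shots
    start = 0
    for i in range(1, len(frame_signatures)):
        if abs(frame_signatures[i] - frame_signatures[i - 1]) > threshold:
            shots.append(Shot(start, i - 1))
            start = i
    shots.append(Shot(start, len(frame_signatures) - 1))
    return shots


class ShotIndex:
    """A toy stand-in for the knowledge base of annotated shots."""

    def __init__(self):
        self.shots = []

    def add(self, shot, labels=()):
        """Store a shot, augmenting it with descriptive labels."""
        shot.labels.update(labels)
        self.shots.append(shot)

    def retrieve(self, *query_labels):
        """Return the shots that carry every label in the text query."""
        wanted = set(query_labels)
        return [s for s in self.shots if wanted <= s.labels]


if __name__ == "__main__":
    # Hypothetical signatures for eight frames drawn from three shots.
    shots = parse_shots([10, 11, 10, 50, 52, 51, 90, 91], threshold=20)
    index = ShotIndex()
    index.add(shots[0], {"office"})
    index.add(shots[1], {"office", "copier"})
    index.add(shots[2], {"outdoor"})
    for shot in index.retrieve("copier"):
        print(shot.start, shot.end, sorted(shot.labels))
```

The sketch covers only the mechanical side of each task. The choice of labels, and the judgement about which retrieved shot actually serves a communicative purpose, are the parts that call for the expertise discussed above.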

References

1 Ueda, H, Miyatake, T and Yoshizawa, S 'IMPACT: An interactive natural-motion-picture dedicated multimedia authoring system' in Images beyond imagination, Monte Carlo, Monaco (January 1992) CNC
2 Bordwell, D and Thompson, K Film art: an introduction (4th edn) McGraw-Hill (1993)
3 McGregor, D The human side of enterprise McGraw-Hill (1960)
4 Cole, S L Directors in rehearsal: a hidden world Routledge (1992)
5 Boden, M A The creative mind: myths and mechanisms Basic Books (1990)
6 Zhang, H J and Smoliar, S W 'Developing power tools for video indexing and retrieval' in Symposium on electronic imaging science and technology: storage and retrieval for image and video databases II, San Jose, CA (February 1994) IS&T/SPIE. To appear.
