Concept-based indexing of annotated images using semantic DNA


Engineering Applications of Artificial Intelligence 25 (2012) 1644–1655


Syed Abdullah Fadzli a,b, Rossitza Setchi a,*

a Cardiff University, School of Engineering, The Parade, Cardiff CF24 3AA, UK
b Faculty of Informatics, University of Sultan Zainal Abidin, Gong Badak Campus, 21300 Kuala Terengganu, Terengganu, Malaysia

Article history: received 28 July 2011; received in revised form 3 January 2012; accepted 7 February 2012; available online 28 February 2012.

Keywords: concept design; image indexing; semantics; lexical ontology; design creativity.

Abstract

One of the challenges in image retrieval is dealing with concepts which have no visual appearance in the images or are not used as keywords in their annotations. To address this problem, this paper proposes an unsupervised concept-based image indexing technique which uses a lexical ontology to extract semantic signatures called 'semantic chromosomes' from image annotations. A semantic chromosome is an information structure which carries the semantic information of an image; it is the semantic signature of an image in a collection, expressed through a set of semantic DNA (SDNA), each representing a concept. Central to the concept-based indexing technique discussed is the concept disambiguation algorithm developed, which identifies the most relevant SDNA by measuring the semantic importance of each word/phrase in the annotation. The concept disambiguation algorithm is evaluated using crowdsourcing. The experiments show that the algorithm has better accuracy (79.4%) than that demonstrated by other unsupervised algorithms (73%) in the 2007 Semeval competition. It is also comparable with the accuracy achieved in the same competition by supervised algorithms (82–83%), which, contrary to the approach proposed in this paper, have to be trained with large corpora. The approach is currently applied to the automated generation of mood boards used as an inspirational tool in concept design. © 2012 Elsevier Ltd. All rights reserved.

1. Introduction

The 'semantic gap', i.e. the discrepancy between the limited descriptive power of low-level image features and the richness of the user's interpretation of the same image, has been the focus of image retrieval research for the last decade. Most approaches use keywords that either correspond to identifiable items describing the visual content of an image or relate to the context and the interpretation of that image. The user may be searching for an image of something (e.g. 'Japanese garden') or an image about something (e.g. 'Japanese philosophy'). Advances in image analysis, object detection and classification techniques may facilitate the automatic extraction of keywords which describe identifiable entities like a garden. However, keywords belonging to the second category are unlikely to be automatically obtained from images (Liu et al., 2007; Ferecatu et al., 2008) because, from the point of view of feature representation, there is no single visual feature which best describes the image content (Tsai, 2007). A particularly challenging aspect in this context is dealing with concepts which have no visual appearance in the images (e.g. 'intellectual curiosity'). This includes concepts related to categories such as time, space, events and their significance, as well as abstract

terms and emotions (Enser et al., 2007). Such concepts could be extracted from annotations or the text complementing the image. This paper proposes an unsupervised concept-based image indexing technique which uses a lexical ontology to extract a semantic signature called a 'semantic chromosome' from an image annotation. Central to this technique is the concept disambiguation algorithm developed, which identifies the most relevant concept(s), or 'semantic DNA' (SDNA), by measuring the semantic importance of each word/phrase in the annotation. Although this paper focuses on image retrieval only, the proposed approach is also applicable to any resource that contains a text description, including audio, video, multimedia content, speech, technical documents and web sites. The remainder of the paper is organized as follows. Section 2 describes recent work in semantic indexing of images and the evaluation of semantic systems. Section 3 introduces the concept of SDNA and the lexical ontology used in this research, and describes the indexing technique developed. Its evaluation is presented in Section 4. Section 5 concludes the paper by suggesting directions for further research.

2. Literature review

2.1. Semantic indexing of images


* Corresponding author. Tel.: +44 2920875720; fax: +44 2920874716. E-mail address: [email protected] (R. Setchi).

0952-1976/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.engappai.2012.02.005

Traditional text-based image retrieval systems predominantly employ indexing techniques which use keyword occurrences to identify important terms in annotations and the text accompanying images. The keywords used to index the images are normally weighted to indicate their relative importance. Several weighting functions have been proposed, including statistical factors such as term frequency (TF), inverse document frequency (IDF), the product of TF and IDF (TF-IDF), and document length normalization (Salton and McGill, 1983; Salton and Buckley, 1988; Fuhr and Buckley, 1991; Lee, 1995). However, most keyword-based indexing methods do not consider the semantic context of the documents/annotations. The relationship between words and concepts is considered a complex issue due to the use of synonyms ('different words, same meaning') and homonyms ('same word, different meaning'). An alternative approach is offered by the content-based image retrieval (CBIR) community, who use image processing techniques to extract low-level image features and means for the semantic interpretation of these features. However, the use of visual features on their own does not solve the problem of the semantic gap, i.e. the discrepancy between the low-level features contained in an image and its high-level description that is meaningful to the human mind (Smeulders et al., 2000; Boujemaa et al., 2001). A number of researchers work on narrowing the semantic gap by combining CBIR with high-level semantics using various techniques, including ontology associations (Mezaris et al., 2003; Ren et al., 2003), supervised and unsupervised machine learning (Chen et al., 2003; Vasconcelos, 2004) and relevance feedback (Lu et al., 2000; Doulamis and Doulamis, 2004). Eugenio et al. (2002) use low-level features and provide a semantic representation of images based on a combination of geometric shapes. Other approaches use semantic templates (Chang et al., 1998) and textual information on the Web to support high-level image retrieval (Feng et al., 2004).
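The statistical weighting factors mentioned above can be illustrated with a minimal TF-IDF sketch. This is a generic formulation with invented toy annotations, not the code of any of the systems cited:

```python
import math

def tf_idf(term, doc, corpus):
    """Weight of a term in one annotation: term frequency x inverse document frequency."""
    tf = doc.count(term) / len(doc)            # relative frequency within this annotation
    df = sum(1 for d in corpus if term in d)   # number of annotations containing the term
    idf = math.log(len(corpus) / df)           # rarer terms receive higher weight
    return tf * idf

# Toy annotations (invented): 'tea' occurs in fewer annotations than 'japanese',
# so it receives a higher weight in the first annotation.
corpus = [["japanese", "garden", "tea"],
          ["japanese", "temple"],
          ["garden", "flowers"]]
weight = tf_idf("tea", corpus[0], corpus)
```

As the paragraph above notes, such weights capture term importance but not the semantic context of the annotation.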
Concept-based image retrieval is an alternative approach that combines text document retrieval with semantic technologies to analyze the annotation or text surrounding the image and extract high-level concepts. Instead of using keywords only, it represents both the image and the query using these concepts, and performs retrieval in the concept space. The use of high-level concepts as dimensions in a vector space model reduces the dependency on the specific terms used in the annotation and the query, which yields better retrieval performance (Styltsvig, 2006). This approach is capable of producing good results even when different words are used in the query and the text annotation to communicate the same meaning. It thus addresses the synonymy and homonymy problem and increases recall. Similarly, if the correct concept is extracted to represent a polysemic word, non-relevant results can be eliminated, which also increases precision. In concept-based image retrieval, concepts are mapped to an existing knowledge base which is populated with real-life concepts understandable by humans (Voorhees and Harman, 1999; Gauch et al., 2003). Alternatively, concepts can be automatically generated based on overlapping relations between terms or probabilities of term occurrences, which are not necessarily interpretable by humans (Hofmann, 1999; Yi and Allan, 2009). The former approach is preferable as it is better aligned with human understanding, which is the most important aspect in narrowing the semantic gap. In recent years, the use of semantic technologies and metadata languages has expanded, as they offer means for defining class terminologies with well-defined semantics and flexible data models for representing metadata descriptions (Hyvönen et al., 2002). In particular, controlled vocabularies, taxonomies, free text descriptions and annotations are employed to describe or classify the images in order to improve the retrieval.
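The idea of performing retrieval in a concept space rather than a keyword space can be sketched as follows: both the image and the query are represented as vectors over concept dimensions and matched by cosine similarity, so a query term and an annotation term with the same meaning map to the same dimension. This is a generic illustration with invented concept identifiers and weights, not the system developed later in this paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dict: concept -> weight)."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

# Dimensions are concepts, not surface keywords: 'garden' in the query and
# 'tea garden' in the annotation can resolve to the same concept dimension.
image = {"garden": 2.0, "permanence": 1.0}   # invented concept weights
query = {"garden": 1.0}
score = cosine(image, query)
```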
Other approaches rely on the use of ontologies to provide different views for navigation and terminology for creating the metadata or the annotations of the images (Hyvönen et al., 2002; Dill et al., 2003; Zhang et al., 2006; Staab et al., 2008). It must be noted, however, that different ontologies may not have the same degree of formality. Controlled vocabularies, dictionaries, thesauri and taxonomies are some of the most lightweight ontology types widely used in annotations. These forms of vocabularies are not strictly formal, and the annotations produced using them are normally pointers to terms in the vocabulary, which can be used to improve the search by using synonyms, antonyms, hyponyms and hypernyms.

2.2. Evaluation of semantic systems

Evaluating the relevancy of retrieved results is a difficult and expensive task due to the use of different test sets, linguistic inventories and knowledge resources. Most systems are evaluated using in-house, mostly small-scale, data sets. Competitions like Senseval (Edmonds, 2002) and Semeval provide common ground for the comparative evaluation of word sense disambiguation and semantic analysis of text. However, although they are the best reference for studying recent developments in the area, it is difficult to use the data sets provided because of the different dictionaries adopted for the ground truth creation (i.e. HECTOR, WordNet 1.7, WordNet 1.7.1 and WordNet 2.1). Furthermore, the subjectivity in perceiving and interpreting visual content makes it difficult to determine what is considered relevant in the context of a specific query. A set of query results may or may not be relevant to different people, depending on their personal understanding. Currently, there are no publicly available semantic datasets that represent the 'ground truth' and could be used as an evaluation benchmark. A recently emerged evaluation method is based on using crowdsourcing as a means of collective human judgment (Snow et al., 2008; Alonso and Mizzaro, 2009; Akkaya et al., 2010).
The word 'crowdsourcing', introduced by Jeff Howe (2006), describes "the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call". In crowdsourcing, a large task is divided into smaller tasks which are then distributed among a large group of people who do not necessarily know each other. Crowdsourcing normally involves payment in exchange for the task being performed. The cost, speed and quality of crowdsourcing results are reported by many researchers to be impressive (Snow et al., 2008; Akkaya et al., 2010; Corney et al., 2010). Although spammers are the main concern in crowdsourcing, Akkaya et al. (2010) have found that their input is minimal and the results are highly reliable. Another experimental study, by Corney et al. (2010), concludes that, with the right question and enough information, crowdsourcing can provide high-quality results. Their crowdsourcing approach, applied to a two-dimensional strip packing task, demonstrates a better efficiency rate than the best algorithm available in the literature. These findings are consistent with the study by Snow et al. (2008) on the evaluation of experts and non-experts conducting five natural language processing tasks. The study found that on average only four non-expert answers are needed to emulate an expert opinion. Callison-Burch (2009) also shows that a non-expert group produces judgments that are similar to those of experts. The evaluation results produced in that study have a stronger correlation with expert judgments than the Bleu algorithm (Papineni et al., 2002), which approximates human judgment in evaluating machine translation. This paper further advances the state of the art in the field of image retrieval by proposing a novel concept-based retrieval approach which automatically maps image annotations to high-level concepts contained in a comprehensive lexical ontology.
The approach is evaluated using crowdsourcing.


3. Semantic DNA and semantic chromosomes

3.1. Definitions

Owners of professional collections of photographs provide a large number of keywords in an attempt to make their photographs easier to find by those looking for images to illustrate books, develop web sites, and create visual designs. Most annotations manually added to images are:

a. descriptive text, a brief explanation, or comments used to label the image (e.g. "a tea ceremony at a garden in the center of Tokyo"),
b. metadata in the form of free keywords (e.g. "tea ceremony, drinking tea, Japanese way of life, Japanese tradition, tea garden, water, rocks, bridge, island, sado"),
c. controlled vocabulary to make the categorization easier (e.g. "culture", "Japan", "tourism", "garden"), and/or
d. database information (e.g. date, title, name of the photographer who took the picture).

The approach in this paper uses image annotations to extract the semantic signature of each image in a collection. The semantic signature, called in this research a semantic chromosome, is composed of a number of semantic DNA. Scientists use the concepts of DNA and chromosomes to describe the organization of genetic information in living organisms. Following the same analogy, a semantic chromosome is defined in this research as an information structure which carries the semantic information of an image. It is the image's semantic signature, expressed through a set of semantic DNA (SDNA), where each SDNA in the set represents one semantically distinguishable concept. For example, an image depicting a tea house in a traditional Japanese garden might be represented through a set of concepts such as 'tea garden', 'Japan', 'tradition' and 'ceremony'. Each of these four words represents a semantically distinguishable concept. Used together (and represented in a coded way), they form the semantic signature or the semantic chromosome of this image and could be used to represent its meaning.
Although a semantic chromosome may look like an annotation, it is very different, as it is a formal representation of the semantic meaning of that image. This means that the semantic chromosome is extracted in a formal way, using terminology with well-defined semantics, and is linked to semantic resources. In particular, the use of ontologies is very beneficial, as ontologies are currently the only widely accepted paradigm for the management of sharable and reusable knowledge in a way that allows its automatic interpretation. In the context of this research, the use of ontologies provides some formalization of the content as a prerequisite for more comprehensive indexing, retrieval and use.

3.2. OntoRo

The ontology used in this research is OntoRo, a lexical ontology built using the electronic version of Roget's Thesaurus (Hart and Newby, 2003) and employed in the development of two other concept tagging algorithms (Setchi and Tang, 2007; Setchi et al., 2010). Roget's Thesaurus (Davidson, 2003) is a well-known resource mainly used to facilitate the expression of ideas and assist in literary composition. In information retrieval it is employed to expand search items with other closely related words. Unlike a dictionary, which explains the meanings of words, Roget's groups words based on ideas and their semantic similarity. It has a well-established structure, where the words/

phrases are grouped and linked by their meaning and associations. The current version of OntoRo, also available as a web application (OntoRo, 2011), includes 68,920 unique words and 228,130 entries classified into 6 classes, 39 sections, 95 subsections, 990 heads, 4 part-of-speech (POS) categories and a number of paragraphs within each concept. Monosemic words, which have a single sense, appear in one concept only. Most words, however, are polysemic: they have several meanings and are linked to a corresponding number of concepts. For example, the word 'tradition' has 6 senses and is related to 6 OntoRo concepts, representing the meaning of tradition as something from the past (#127: oldness), lasting quality (#144: permanence), means of sharing information (#524: information), statement of facts (#590: description), habit or second nature (#610: habit) and religious faith (#973: religion). (This example also explains why OntoRo contains 68,920 unique words and many more entries: 228,130.) In this paper, each of the 990 concepts (called head groups in Roget's) is labeled through its number in OntoRo and the first word in the list of all words and phrases belonging to that concept. For example, the concept #127: oldness is represented in OntoRo with 233 words, some of which are shown in Box 1 below. It is clear that all these words can be used to describe different aspects of 'oldness'. Most of them are related to history and mythology, but there are clear connotations of decline, decay and aging, and some not entirely expected negative associations and comparisons (e.g. 'old fossil' and 'moth-eaten'). It is clear, however, that this particular sense of the word 'tradition' would be inappropriate to use in relation to an image of a traditional Japanese garden. Paragraphs in OntoRo are further divided into POS categories and sub-paragraphs, which group words with closer relationships in terms of contextual meaning.
This example shows that the semantic signature of an image should be a more complex and meaningful structure than a list of words if it were to be used for indexing and information retrieval.

Box 1. oldness, primitiveness, beginning, …, antiquity, maturity, mellowness, autumn, decline, rust, decay, senility, old age, eldership, seniority, archaism, antiquities, …, thing of the past, relic of the past, listed building, ancient monument, museum piece, antique, heirloom, bygone, Victoriana, dodo, dinosaur, fossil, oldie, golden, old fogy, old fossil, …, tradition, lore, folklore, mythology, inveteracy, custom, prescription, …, vintage, venerable, patriarchal, archaic, ancient, timeworn, ruined, prehistoric, mythological, heroic, classic, Hellenic, Byzantine, feudal, medieval, …, historical, past, …, geological, pre-glacial, fossil, Paleozoic, secular, Eolithic, Paleolithic, Mesolithic, Neolithic, …, ancestral, traditional, time-honored, habitual, …, old as the hills, …, old as history, old as time, age-old, lasting, antiquated, of other times, of another age, …, prior, anachronistic, archaistic, archaizing, retrospective, fossilized, ossified, static, permanent, behind the times, out of date, out of fashion, dated, …, conservative, Victorian, old-fashioned, old-school, …, outdated, outmoded, old hat, gone by, past, decayed, perished, dilapidated, rusty, moth-eaten, crumbling, mildewed, moss-grown, moldering, decomposed, fusty, …, belong to the past, have had its day, be burnt out, end, age, grow old, decline, fade, …, rot, rust, decay, decompose, anciently, since the world was made, …, before the Flood, formerly (233 words in total; not all are included in this box)


The next section shows how semantic chromosomes are constructed using semantic DNA.

3.3. Semantic chromosomes

As already mentioned, a semantic chromosome is defined as the semantic signature of an image, expressed through a set of SDNA, each representing a semantically distinguishable concept. Each SDNA is formally represented as a chain of numbers corresponding to the structural elements of the OntoRo hierarchy. The format of a SDNA is as follows: Class #—Section #—Sub-section #—Head #—Label for POS (1 = noun, 2 = adjective, 3 = verb, 4 = adverb)—Paragraph #. For example, a suitable sense for the word 'tradition' in an image annotation of a Japanese garden would be 'lasting quality' (belonging to concept #144: permanence); its SDNA is 1-7-24-144-1-1. Its semantic representation, following the SDNA format (i.e. OntoRo's hierarchical structure), is shown in Fig. 1, where 1 is used to show the POS group of the word, i.e. noun. Each SDNA carries semantic information, including part of speech, high-level concept name and contextual meaning, and points to other words that can be used in the same context. These words are not always synonyms; they are related words that can be used to express the same idea or concept. Table 1 lists all SDNA corresponding to each contextual meaning of the word 'tradition'. Only one of these SDNA is meaningful within a certain context for a given image, and only it will be included in the semantic chromosome of that image. The selection of the most meaningful SDNA and their use to index images is explained in the next section.
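The six-level SDNA chain can be held in a small record type. The following sketch parses the dash-separated format described above; the field names are one reading of the format, not taken from the paper:

```python
from typing import NamedTuple

POS_LABELS = {1: "noun", 2: "adjective", 3: "verb", 4: "adverb"}

class SDNA(NamedTuple):
    cls: int         # OntoRo class
    section: int
    subsection: int
    head: int        # concept number, e.g. 144 (permanence)
    pos: int         # part of speech: 1=noun, 2=adjective, 3=verb, 4=adverb
    paragraph: int

def parse_sdna(code: str) -> SDNA:
    """Parse a dash-separated SDNA string such as '1-7-24-144-1-1'."""
    return SDNA(*(int(part) for part in code.split("-")))

tradition = parse_sdna("1-7-24-144-1-1")  # 'tradition' in the sense of lasting quality
```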


Given an image annotation, a, the approach involves four processes:

i. Parsing the text of the annotation and removing punctuation and unrecognized characters.
ii. Mapping the keywords in the annotation to a lexical ontology. This involves identifying relevant words or phrases which are part of the ontology. Each relevant word or phrase is referred to as a term ti, where Ta = {t1, t2, …, tn} and n is the total number of terms in the annotation.
iii. Extracting the SDNA, si, of each term using the ontology hierarchy. Each SDNA represents a different sense of the term, where Senses(ti) = {tis1, tis2, …, tism} and m is the total number of SDNA from all the terms in the image annotation.
iv. Weighting each SDNA, si, and disambiguating them to select the SDNA which represents the most accurate sense of each term.

The selected SDNA, referred to as the semantic chromosome of image annotation a, are then used to represent the image in the image matrix. The image in Fig. 3 will be used to illustrate the algorithm. The annotation of the image contains 18 keywords/phrases and all of them have been identified in step ii as relevant (i.e. they all exist in the lexical ontology). Table A.1 (see Appendix) lists all SDNA related to each of these terms, as extracted in step iii.
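The four processes above can be sketched as a simple pipeline. The toy ontology, the invented SDNA string for 'garden', and the placeholder disambiguation function are assumptions for illustration only; the actual weighting and disambiguation are described in the rest of this section:

```python
import re

def index_annotation(annotation, ontology, choose_sense):
    """Sketch of steps i-iv; `ontology` is a toy dict mapping a term to its SDNA list."""
    # i. Parse: lower-case the text and strip punctuation/unrecognized characters.
    words = re.findall(r"[a-z]+", annotation.lower())
    # ii. Map: keep only the terms that exist in the lexical ontology.
    terms = [w for w in words if w in ontology]
    # iii. Extract: collect every candidate SDNA (sense) of each term.
    candidates = {t: ontology[t] for t in terms}
    # iv. Disambiguate: choose_sense() selects one weighted SDNA per term;
    #     the chosen SDNA together form the semantic chromosome.
    return {t: choose_sense(senses, candidates) for t, senses in candidates.items()}

toy_ontology = {"tradition": ["1-6-22-127-1-3", "1-7-24-144-1-1"],
                "garden": ["2-11-40-235-1-1"]}  # SDNA string for 'garden' is invented
chromosome = index_annotation("Tradition, garden!", toy_ontology,
                              lambda senses, ctx: senses[0])  # placeholder disambiguation
```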

Fig. 2. Semantic DNA indexing.

3.4. Semantic DNA indexing

Fig. 2 shows the conceptual model of the SDNA indexing approach proposed by the authors in an earlier publication (Fadzli and Setchi, 2010, 2011). The formal representation of the semantic indexing algorithm is outlined below.

Class #1: Abstract Relation
Section #7: Change
Sub-section #24: Social
Head #144: Permanence
Part of speech #1: noun
Paragraph #1: permanence, permanency, no change, status quo, invariability, unchangeability, …, lasting quality, persistence, perseverance, endurance, duration, durability, …, sustenance, maintenance, conservation, preservation, continuance, …, standing, long standing, inveteracy, oldness, tradition, custom, practice, habit, …, static condition, quiescence, traditionalist, conservative, …, obstinate person (63 words in total, not all are included in this example)

Fig. 1. Semantic representation of the word 'tradition' in the context of lasting quality.

Fig. 3. An image from the collection: the Golden Temple in Kyoto, Japan. Annotation: Golden, temple, Japan, Far East, travel, architecture, wooden, shrine, religion, historic, tradition, water, peace, garden, world, heritage, site, tourism.

Table 1. Semantic DNA of the word 'tradition'.

Semantic DNA | Concept | Sense | Paragraph content
1-6-22-127-1-3 | Oldness | Tradition | 17 words semantically related to 'tradition' as 'something from the past'
1-7-24-144-1-1 | Permanence | Permanence | 63 words semantically related to 'tradition' as 'lasting quality'
4-24-57-524-1-1 | Information | Information | 123 words semantically related to 'tradition' as 'means of sharing information'
4-25-58-590-1-2 | Description | Narrative | 87 words semantically related to 'tradition' as 'statement of facts'
5-26-59-610-1-1 | Habit | Habit | 610 words semantically related to 'tradition' as 'habit' or 'second nature'
6-39-92-973-1-4 | Religion | Theology | 57 words semantically related to 'tradition' as 'religious faith'


In step iv, each SDNA is weighted using the reverse Hamming distance (RHD) by measuring the similarity between any two SDNA, si and sj:

sim(si, sj) = L − hamming_distance(si, sj)   (1)

where L is the total number of levels used in the SDNA, which is 6. For example, the similarity between s8: 2-10-35-209-1-4 (one of the senses of 'temple') and s14: 2-10-36-226-1-10 (one of the senses of 'Japan') is 2. The total similarity for each SDNA is then calculated by cumulating the similarities between it and all other strings of SDNA:

totalsim(si) = Σ sim(si, sj), summed over j = 1, …, |s|, j ≠ i   (2)

where |s| is the total number of SDNA extracted from all terms (132 in this case, see Table A.1). A normalization method based on the Okapi BM25 model (Robertson et al., 1999), integrated with information content (IC), is used to balance the SDNA weight, SW(), for annotations of different lengths:

SW(si) = totalsim(si)(k + 1) / (k(1 − b + b·IC(si)) + totalsim(si))   (3)

where IC(si) is the information content of SDNA si, obtained by dividing the number of images in the collection containing this SDNA by the total number of images in the collection, and taking the negative logarithm of the quotient. k and b are two tuning parameters which are adjustable according to the requirements of the specific application. k is a positive parameter that calibrates the frequency scaling: a k value of 0 corresponds to a binary model with no term frequency, and a large value corresponds to using the raw term frequency. b is another tuning parameter which determines the scaling by document length, where b ∈ [0, 1]; b = 1 yields full scaling of the term weight by the document length, while b = 0 yields no document length normalization. The values k = 9 and b = 0.75 have been used in this example. Table 2 lists the SDNA related to every possible sense of the terms t1: 'golden' and t2: 'temple' and the computed values for totalsim, IC and SW. As shown, the term 'golden' has 5 senses, while 'temple' has 8. Based on the SW values, s2: yellow and s7: house are selected as the most accurate SDNA representing t1 and t2. Table A.2 (see the Appendix) lists all selected SDNA for Fig. 3. Related words from OntoRo are also included to help the understanding of the context of each sense. The selected SDNA form the semantic chromosome of the image (Table A.2): "yellow, house, blacken, farness, land travel, structure, wooden, tomb, public worship, reputable, theology, cultivate, euphoria, cultivate, land, posterity, place, land travel". It can be seen that it adequately represents the contextual meaning of the image using high-level concepts. In addition, as mentioned earlier, the chromosome can be used as a pointer to a great number of semantically related words and phrases which can be used to match a search query with images. The SDNA disambiguation algorithm is crucial in this approach as it influences the performance of the indexing and retrieval process. Selecting the correct sense for each keyword could eliminate non-relevant results in the retrieval process, thus increasing precision. The performance of the proposed SDNA disambiguation algorithm is evaluated in the next section.

4. SDNA disambiguation evaluation

4.1. Materials

Research collaboration with VisconPro Ltd. has provided this study with 157,639 digital images. VisconPro Ltd. is one of the UK's leading online companies and hosts an image stock website called fotoLIBRA (VisconPro, 2011). The website currently hosts 392,728 high-quality images covering a broad range of topics. The images, owned by more than 20,000 photographers, have already been manually annotated by them. The evaluation used in this study is based on collective human evaluation using Amazon Mechanical Turk (MTurk). The evaluation tasks are divided into micro-tasks which are offered to a large number of people who do not know each other. Every task offered through MTurk is called a human intelligence task (HIT). The people who perform the tasks are called workers. They are paid according to the number of HITs they have completed.

4.2. Evaluation protocol

The main objective of the evaluation is to measure the accuracy of the proposed SDNA disambiguation algorithm in selecting the most accurate sense for each term in an annotation. The experiment includes two tasks which are conducted by different groups of workers. In Task 1, the workers are given a group of words from an image annotation (Fig. 4). They are asked to consider the context of these words and select the most accurate sense for each of them. The three keywords with the highest SDNA weight are chosen to be scored by the workers. For each keyword, the worker is provided with two choices: (i) the sense selected by the proposed approach and (ii) a randomly selected sense from all other possible senses. In addition, a list of related words is provided to help the workers understand the meaning of each sense. The workers are also offered an 'all of the above' choice if they agree with both senses given, and a 'none of the above' choice if they agree with neither sense. Table 3 lists the scores for each choice selected by the worker.

Table 2. List of SDNA for the terms 'golden' (t1) and 'temple' (t2).

Senses | SDNA | Concept | Sense | totalsim(si) | IC(si) | SW(si)
t1s1 | 1-6-22-127-1-2 | Oldness | Archaism | 27 | 1.6237 | 6.2722
t1s2 | 3-15-48-433-2-1 | Yellowness | Yellow | 65 | 1.3776 | 7.7724
t1s3 | 5-27-63-644-2-4 | Goodness | Valuable | 23 | 1.7606 | 5.8206
t1s4 | 5-30-69-730-2-3 | Prosperity | Palmy | 21 | 1.7172 | 5.6751
t1s5 | 6-36-80-852-2-2 | Hope | Promising | 27 | 1.7030 | 6.1961
t2s6 | 1-8-28-164-1-3 | Production | Building | 33 | 0.8957 | 7.3562
t2s7 | 2-9-33-192-1-6 | Abode | House | 35 | 0.8668 | 7.4643
t2s8 | 2-10-35-209-1-4 | Height | High structure | 32 | 0.9290 | 7.2775
t2s9 | 2-10-35-213-1-3 | Summit | Head | 32 | 1.0300 | 7.1677
t2s10 | 5-27-63-662-1-1 | Refuge | Refuge | 23 | 1.0619 | 6.5983
t2s11 | 6-36-82-866-1-4 | Repute | Honors | 29 | 0.8908 | 7.1812
t2s12 | 6-39-95-990-1-1 | Temple | Temple | 39 | 1.3250 | 7.1705
t2s13 | 6-39-95-990-1-3 | Temple | Church | 38 | 1.3225 | 7.1346
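The weighting scheme of Eqs. (1)–(3) in Section 3.4, which produces the totalsim and SW values shown in Table 2, can be sketched as follows. Two caveats: the code follows the literal reading of Eq. (1) (L minus the Hamming distance over all six levels), whereas the paper's worked example for 'temple'/'Japan' gives 2, which would instead correspond to counting the matching levels from the top of the hierarchy, so the exact distance variant is an assumption; the natural logarithm for IC is also an assumption, as the base is not stated:

```python
import math

L = 6  # number of levels in an SDNA

def sim(si, sj):
    """Eq. (1): reverse Hamming distance between two SDNA strings."""
    mismatches = sum(a != b for a, b in zip(si.split("-"), sj.split("-")))
    return L - mismatches

def totalsim(si, all_sdna):
    """Eq. (2): cumulative similarity of si against every other extracted SDNA."""
    return sum(sim(si, sj) for sj in all_sdna if sj is not si)

def ic(images_with_sdna, total_images):
    """Information content: negative log of the fraction of images carrying the SDNA."""
    return -math.log(images_with_sdna / total_images)

def sw(si, all_sdna, ic_value, k=9, b=0.75):
    """Eq. (3): BM25-style normalization of the SDNA weight."""
    ts = totalsim(si, all_sdna)
    return ts * (k + 1) / (k * (1 - b + b * ic_value) + ts)
```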


Fig. 4. Task 1: Selecting the most suitable sense.

Table 3. Scoring for Task 1.

Choice | Score
Proposed sense | 1
Non-proposed sense | 0
All of the above | 1
None of the above | 0

To observe the statistical relationship between the SDNA disambiguation score from Task 1 and the accuracy of the annotations, a second task is designed (Fig. 5) which measures the accuracy of each annotation according to the image context. In Task 2, the workers are provided with an image together with its annotation. Based on the image, the workers are asked to rate the accuracy of the annotation, from 'not accurate' to 'very accurate'. Table 4 lists the scores for each category. The main challenge in using the MTurk service is filtering out low-quality results from irresponsible and careless workers. To help manage worker accuracy, MTurk provides the ability to review HIT results prior to approving or rejecting HIT submissions. Answers such as identical answers for different HITs, a completion time of less than 30 s per task (which is considered too fast), or incomplete answers are rejected without any payment made. This affects the workers' approval rate, which indicates their reliability for performing future tasks. By restricting the workers' approval rate to a certain threshold, requesters can make sure that workers with a bad reputation are not able to accept their tasks.

4.3. Evaluation results

In Task 1, 500 images with their annotations were randomly selected from the collection for evaluation. A total of 5000 HITs were offered with payment of USD 0.02 per HIT. Each HIT was scored by 10 different workers. This is consistent with the

experiment conducted by Snow et al. (2008), which involved a word sense disambiguation task. 203 workers accepted the tasks; each of them completed on average 24.6 HITs. A HIT took an average of 54.09 s to complete. During the review of the results, 263 HITs were rejected because of unreliable answers; these HITs were offered to other workers.

For each image, three keywords were considered, resulting in a total of 1500 keywords for the 500 images selected for evaluation. Table 5 and Fig. 6 show the scoring of the 1500 keywords in Task 1. The column 'Code' shows the ratio of 'Y' and 'N' votes, where the number preceding the letter 'Y' indicates the number of workers who agree with the proposed sense, while the number preceding the letter 'N' indicates the number of workers who disagree with it. For example, the category '7Y 3N' includes 7 'Y' and 3 'N' votes; 18.1% of all keywords (272 keywords) belong to this category.

For every keyword evaluated in a HIT, a simple majority score from the 10 workers is taken as the final score. A majority score is defined as at least 6 out of 10 workers in agreement. Therefore, of the 1500 keywords considered in this experiment, 1191 keywords (79.4%) have a majority score of 1, with only 253 keywords (16.9%) having a majority score of 0. In other words, 79.4% of the senses proposed by the approach are agreed with by a majority of the workers, which indicates the accuracy of the proposed SDNA disambiguation algorithm. This result of 79.4% is considerably better than the 73% accuracy achieved in the Semeval 2007 competition, which compared the accuracy of various unsupervised algorithms whose participants used WordNet as a lexicon (Navigli, 2009). It is also comparable with the accuracy achieved in the same competition (82–83%) by the supervised algorithms which, contrary to the approach proposed in this paper, have to be trained with large corpora.
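The majority scoring described above can be reproduced directly from the vote distribution reported in Table 5. A minimal sketch:

```python
def majority_score(yes_votes, total=10):
    """Majority score for one keyword: 1 if at least 6 of 10 workers
    agree with the proposed sense, 0 if at least 6 disagree,
    None when there is no majority (the 5Y 5N case)."""
    if yes_votes >= 6:
        return 1
    if total - yes_votes >= 6:
        return 0
    return None

# Vote distribution from Table 5: yes-votes -> number of keywords
distribution = {0: 0, 1: 9, 2: 40, 3: 71, 4: 133, 5: 56,
                6: 278, 7: 272, 8: 262, 9: 235, 10: 144}

agreed = sum(n for y, n in distribution.items() if majority_score(y) == 1)
disagreed = sum(n for y, n in distribution.items() if majority_score(y) == 0)
accuracy = agreed / sum(distribution.values())
print(agreed, disagreed, round(accuracy * 100, 1))  # 1191 253 79.4
```

Summing the five majority-agreement categories (6Y 4N to 10Y 0N) recovers the 1191 keywords (79.4%) reported above; the 56 keywords in the 5Y 5N category have no majority either way.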
In task 2, a subset of 50 images from the 500 images used in task 1 was selected for evaluation. Using the same approach, each image annotation was scored by 10 different workers. A total of

500 HITs were offered with payment of USD 0.02 per HIT. A total of 47 workers accepted the tasks; on average each of them completed 10.6 HITs. An average score from the 10 workers is taken as the final score for each image annotation. Table 6 shows the average scores for the 50 images considered. Fig. 7 shows examples of images with (a) high annotation accuracy and (b) low annotation accuracy. The annotation of the first image (Fig. 7a) has a high average annotation accuracy score of 2.7, as it contains words that can easily be associated with objects in the picture. The annotation of the second image (Fig. 7b) has a low average annotation accuracy score of 1.3, as it contains many irrelevant words such as customs, society, earthwork, horizontal, windmill, beaker, horseshoe, igneous and heel.

To find the statistical correlation between the accuracy of the SDNA disambiguation algorithm and the accuracy of the annotations, Pearson's correlation was computed on the basis of the 50 images used in task 2. Fig. 8 shows the correlation graph between the average SDNA disambiguation score and the average annotation score.

Fig. 5. Task 2: Accuracy of the annotations.

Table 4
Scoring for task 2.

Choice | Score
Very accurate | 3
Accurate | 2
Fair | 1
Not accurate | 0

Table 5
Task 1: results.

Code | Score | Keyword count | Percentage
0Y 10N | 0.0 | 0 | 0.0%
1Y 9N | 0.1 | 9 | 0.6%
2Y 8N | 0.2 | 40 | 2.7%
3Y 7N | 0.3 | 71 | 4.7%
4Y 6N | 0.4 | 133 | 8.9%
5Y 5N | 0.5 | 56 | 3.7%
6Y 4N | 0.6 | 278 | 18.5%
7Y 3N | 0.7 | 272 | 18.1%
8Y 2N | 0.8 | 262 | 17.5%
9Y 1N | 0.9 | 235 | 15.7%
10Y 0N | 1.0 | 144 | 9.6%
Total | | 1500 | 100%

Fig. 6. Vote distribution for task 1.

Table 6
Task 2: results.

Score | Vote count | Percentage
0 <= x < 1 | 1 | 2%
1 <= x < 2 | 16 | 32%
2 <= x < 3 | 33 | 66%
x = 3 | 0 | 0%
Total | 50 | 100%
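The Task 2 scoring (the 3 to 0 scale of Table 4) and the binning of average scores used in Table 6 can be sketched as follows; the ratings shown are illustrative, not data from the experiment:

```python
def final_score(ratings):
    """Average the ten workers' ratings (3 = very accurate ...
    0 = not accurate, as in Table 4) into the final annotation score."""
    return sum(ratings) / len(ratings)

def table6_bin(x):
    """Bin an average score into the ranges reported in Table 6."""
    if x == 3:
        return "x = 3"
    return "%d <= x < %d" % (int(x), int(x) + 1)

# Illustrative ratings from ten workers for one image annotation
ratings = [3, 3, 3, 3, 2, 3, 2, 3, 2, 3]
avg = final_score(ratings)
print(avg, table6_bin(avg))  # 2.7 2 <= x < 3
```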


Fig. 7. Two images assessed in Task 2. (a) Annotation: horse, riding, water, holidays, lifestyle, ride, fun, splash, country, scene, village, gallop, holiday, life, stream, lower, slaughter (Average Score: 2.7) and (b) Annotation: customs, society, stone, circle, earthwork, sandstone, horizontal, lintel, world, heritage, site, bronze, age, windmill, beaker, horseshoe, igneous, rocks, heel, astronomy (Average Score: 1.3).

Fig. 8. Correlation graph between average annotation score and average SDNA disambiguation score.

Table 7
Terms with concept numbers.

The analysis shows a positive correlation value of 0.5779 between the average SDNA disambiguation score and the average annotation accuracy score which suggests that there is a positive relationship between the quality of the image annotations and the quality of the SDNA disambiguation result proposed by this approach. In other words, the approach selects an accurate sense for each keyword when the annotation accuracy is high. As one may expect, lower quality annotations make it hard for the approach to propose the correct sense for each keyword. For example, the result from the disambiguation of the annotation of the image in Fig. 7a scored 2.7 while the corresponding score for the one shown in Fig. 7b is 1.3.
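Pearson's product-moment correlation used in this analysis can be computed with a few lines; the score pairs below are illustrative, not the 50 image pairs from the experiment:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (disambiguation score, annotation score) pairs
disamb = [0.9, 0.7, 0.8, 0.4, 0.6]
annot = [2.7, 2.1, 2.4, 1.3, 2.0]
r = pearson_r(disamb, annot)
```

A value near +1 indicates that images with well-disambiguated SDNA also tend to have accurate annotations; the 0.5779 reported above indicates a moderate positive relationship.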

5. Comparison with previous work

The concept-based indexing approach proposed in this paper is based on previous research conducted within an EU-funded project called TRENDS (Setchi and Bouchard, 2010; Setchi et al., 2011). The approach tags images with a ranked set of concept numbers extracted by analyzing web content (i.e. the text surrounding the images). The TRENDS algorithm uses concepts from two ontologies: a generic lexical ontology called OntoRo and a domain-specific ontology for designers called Conjoint Trend Analysis (CTA). The weight of the concepts in TRENDS is calculated using (3):

$$w_{c_k}(d_j) = \sum_{i=0}^{n} k_{CTA} \, w_{tf\text{-}idf}(t_i, d_j) \, \frac{1}{C_k(t_i)} \qquad (3)$$
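Formula (3) can be sketched as follows (the CTA coefficient defaults to 1, as it is ignored in the worked example later in this section). Applied to the tf-idf values and concept links of Table 7, it reproduces the top two weights in Table 8:

```python
from collections import defaultdict

def concept_weights(tfidf, term_concepts, cta_concepts=frozenset()):
    """TRENDS concept weighting, Eq. (3): each term spreads its tf-idf
    weight evenly over the Ck(ti) concepts it is linked to; concepts in
    the CTA ontology are boosted by k_CTA = 1.5, the rest use k_CTA = 1."""
    weights = defaultdict(float)
    for term, concepts in term_concepts.items():
        share = tfidf[term] / len(concepts)  # wtf-idf(ti, dj) / Ck(ti)
        for c in concepts:
            k_cta = 1.5 if c in cta_concepts else 1.0
            weights[c] += k_cta * share
    return dict(weights)

# A fragment of Table 7: 'Far East' and 'Historic' are monosemic,
# while 'Temple' is linked to seven concepts
w = concept_weights(
    tfidf={"Far East": 0.1626, "Historic": 0.1007, "Temple": 0.1101},
    term_concepts={"Far East": [199],
                   "Historic": [866],
                   "Temple": [164, 192, 209, 213, 662, 866, 990]})
print(round(w[199], 4), round(w[866], 4))  # 0.1626 0.1164 (cf. Table 8)
```

The sketch makes the bias discussed below visible: the monosemic term 'Far East' passes its entire tf-idf weight to concept #199, while each of the seven concepts of 'Temple' receives only one seventh of that term's weight.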

Term (ti) | tf-idf | Ck(ti) | Concept numbers
Golden | 0.1176 | 5 | 127, 433, 644, 730, 852
Temple | 0.1101 | 7 | 164, 192, 209, 213, 662, 866, 990
Japan | 0.1301 | 6 | 226, 226, 357, 428, 428, 844
Far East | 0.1626 | 1 | 199
Travel | 0.0715 | 6 | 265, 267, 277, 298, 589, 981
Architecture | 0.0724 | 7 | 56, 62, 164, 243, 331, 551, 844
Wooden | 0.1260 | 4 | 366, 576, 602, 820
Shrine | 0.1524 | 3 | 364, 988, 990
Religion | 0.0955 | 7 | 449, 529, 973, 979, 981, 982, 984
Historic | 0.1007 | 1 | 866
Tradition | 0.1232 | 6 | 127, 144, 524, 590, 610, 973
Water | 0.0644 | 21 | 43, 156, 163, 171, 301, 302, 319, 335, 339, 341, 346, 350, 369, 370, 382, 387, 422, 633, 648, 654, 730
Peace | 0.1213 | 10 | 24, 60, 266, 376, 399, 710, 717, 719, 730, 826
Garden | 0.0876 | 10 | 156, 194, 235, 366, 368, 370, 383, 396, 449, 841
World | 0.1063 | 10 | 3, 32, 52, 154, 183, 319, 321, 344, 494, 708
Heritage | 0.0900 | 4 | 124, 170, 773, 777
Site | 0.1138 | 3 | 184, 186, 187
Tourism | 0.0819 | 2 | 267, 837

where w_{c_k}(d_j) is the weight of a concept c_k in a document d_j; k_{CTA} is a coefficient with two values: 1.5 (if concept c_k is domain-specific, i.e. it exists in the CTA ontology) or 1 (if the concept is not domain-specific and therefore not part of the CTA ontology); w_{tf-idf}(t_i, d_j) is the tf-idf weight of a term t_i in a document d_j; and C_k(t_i) is the number of concepts the term t_i is related to.

The algorithm is reported to demonstrate good performance in semantic-based retrieval and higher precision in concept search compared to traditional keyword-based retrieval. However, it was originally formulated for long text documents dealing with hundreds of concepts per document. Based on empirical evidence, the algorithm is considerably less efficient when dealing with short texts such as image annotations. The lack of a word disambiguation function and the extensive use of tf-idf weighting have led to irrelevant concept numbers being tagged to images, producing considerable noise in the index table. Further analysis shows that C_k(t_i) has a high impact on the concept weights, as any concept related to terms which are less ambiguous (i.e. have a small number of senses) will most probably receive a high weighting.

For example, consider Fig. 3 and its annotation. It contains 18 unique terms, including words and phrases. Most terms are polysemic (i.e. they have more than one possible meaning) and are linked to a number of ontological concepts. Table 7 lists all


Table 8
The top six concept numbers tagged to the image.

Concept number | wck(dj)
#199: distance | 0.1626
#866: repute | 0.1164
#990: temple | 0.0665
#267: land travel | 0.0529
#364: interment | 0.0508
#988: ritual | 0.0508

Table 9
Top five SDNA strings tagged to the image example.

Term | SDNA | Concept name | Sense | totalsim()
Garden | 3-15-47-370-3-1 | Agriculture | Cultivate | 8.19717
World | 3-14-46-344-1-1 | Land | Land | 7.99745
Golden | 3-15-48-433-2-1 | Yellowness | Yellow | 7.77245
Architecture | 3-14-45-331-1-1 | Composition | Composition | 7.71918
Travel | 2-12-40-267-1-1 | Motion | Motion | 7.71901
Tourism | 2-12-40-267-1-1 | Land travel | Land travel | 7.71901
Peace | 3-15-48-376-1-3 | Pleasure | Euphoria | 7.64018

concept numbers related to these terms. For example, the word 'temple' is linked to seven concept numbers because it is semantically related to the concepts of production (concept number #164), abode (#192), height (#209), summit (#213), refuge (#662), repute (#866) and temple (#990). The word 'water' is related to 21 different concepts, while the named entity 'Far East' is related to only 1 concept. Table 8 lists the concepts with the highest ranking computed using formula (3). For the purposes of this example, the CTA coefficient is ignored, as it has been used in the TRENDS project to highlight the importance of some high-impact semantic adjectives for the designers of concept cars. The analysis shows that concept #199 is ranked as the highest weighted concept as a result of being the only concept linked to the term 'Far East'. The low value of C_k(t_i) makes a high impact on the weight of the concept. Concept #199, named 'distance', contains 219 related terms, including: distance, astronomical distance, light years, depths of space, space, measured distance, mileage, food miles, footage, length, focal distance, elongation, greatest elongation, aphelion, apogee, far distance, horizon, false horizon, skyline, offing, background, rear, periphery, circumference, outline, drift, dispersion, deviation, reach, grasp, compass, span, stride, giant's stride, range, far cry, long way, fair way, tidy step, day's march, long long trail, marathon, farness, far distance, remoteness, aloofness, removal, separation, antipodes, pole, contraposition, world's end, ends of the earth, Pillars of Hercules, ne plus ultra, back of beyond, Far West, Far East, foreign parts, extraneousness, outpost, seclusion, purlieus, outskirts, exteriority, outer edge, frontier, limit, unavailability, absence. This concept, which represents the idea of being in a distant or remote area, is hardly the strongest concept to be associated with the annotation context.
The second highest concept, #866: repute, is also the only concept associated with the monosemic word 'historic'. The next three concepts in the ranked list (#990: temple, #364: interment and #988: ritual) are all related to the term 'shrine', which has the second highest tf-idf value. Although these three concepts indicate three different meanings of the term 'shrine', they are all tagged as important concepts for the image regardless of the actual context. This example shows the need for concept disambiguation, especially when dealing with annotations or short texts. An experiment using 5000 random image annotations from the fotoLIBRA dataset reveals that 80.9% of the highest ranked concepts tagged using the TRENDS algorithm are linked to terms with no more than two senses (C_k(t_i) <= 2). This result shows that, when dealing with short texts, the TRENDS algorithm is biased towards monosemic terms, regardless of the semantic context of the text. The concept senses tagged by the proposed algorithm to the same image example are shown in Table 9. The highest weighted SDNA, 3-15-47-370-1-2, is selected from 10 different senses of the term 'garden'. This SDNA represents the idea of a botanical farm or flora garden. It is semantically associated with the 124 terms listed below:

cultivate, bring under cultivation, make fruitful, farm, ranch, garden, grow, till, till the soil, scratch the soil, dig, double-dig, trench, bastard trench, delve, spade, dibble, seed, sow, broadcast, scatter the seed, set, plant, prick out, dibble in, puddle in, transplant, plant out, bed, plough, disk, harrow, rake, hoe, weed, prune, top and lop, thin out, deadhead, shorten, graft, engraft, implant, layer, take cuttings, force, vernalize, fertilize, top dress, mulch, dung, manure, invigorate, grass over, sod, rotate the crop, leave fallow, not use, harvest, gather in, store, glean, reap, mow, cut, scythe, cut a swathe, bind, bale, stook, sheaf, flail, thresh, winnow, sift, bolt, separate, crop, pluck, pick, gather, tread out the grapes, ensile, ensilage, improve one's land, make better, fence in, enclose, ditch, drain, reclaim, water, irrigate.

This paper proposes an approach to sense disambiguation which is based on calculating the occurrences of each concept sense. The SDNA strings tagged to the image carry rich semantic information linked to the high-level concepts represented in the annotation. Unlike the TRENDS algorithm, which uses concept groups as a means of semantic representation, the proposed approach considers the subgroups of each concept number in the lexical ontology. The innovation at the center of this research is the idea that these subgroups provide more precise sense disambiguation, which results in improved image indexing.
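An SDNA string such as 3-15-47-370-3-1 encodes a path through the lexical ontology down to a concept subgroup and sense. A minimal parser is sketched below; the field names are illustrative, as the text only confirms that the fourth field is the OntoRo concept number and that the trailing fields identify progressively finer subgroups within that concept:

```python
def parse_sdna(sdna):
    """Split a hyphen-separated SDNA string into its six numeric fields.
    Field names are illustrative: only the fourth field (the OntoRo
    concept number) is explicitly described in the text."""
    fields = [int(f) for f in sdna.split("-")]
    if len(fields) != 6:
        raise ValueError("expected six fields, got %r" % sdna)
    keys = ("class", "section", "subsection", "concept", "subgroup", "sense")
    return dict(zip(keys, fields))

sdna = parse_sdna("3-15-47-370-3-1")
print(sdna["concept"])  # 370, the agriculture concept selected for 'garden'
```

Because the fields are ordered from coarse to fine, two SDNA strings that share a longer common prefix sit closer together in the ontology, which is what allows the subgroup-level comparison described above.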

6. Conclusion

This paper proposes a concept-based indexing method using Semantic DNA (SDNA) to represent text semantics. The approach tags images by first analyzing the annotation keywords, extracting the candidate SDNA strings from the annotation and selecting the most accurate SDNA for each keyword. The set of SDNA produced for each image, called the semantic chromosome, is then used to index the image. Although the study focuses only on image indexing, the approach is also applicable to other resources which contain text descriptions, including audio, video and multimedia content. The paper reports good results from using crowdsourcing to evaluate the accuracy of the SDNA disambiguation algorithm. The experiments show that the algorithm has better accuracy (79.4%) than the accuracy demonstrated by other unsupervised algorithms (73%) in the 2007 Semeval competition. It is also comparable with the accuracy achieved in the same competition by the supervised algorithms (82–83%) which, contrary to the approach proposed in this paper, have to be trained with large corpora. A further experiment shows a positive correlation value of 0.5779, indicating that the performance of the SDNA disambiguation algorithm depends on the quality of the text/annotation. The approach proposed in this paper is currently applied to the automated generation of mood boards used as an inspirational tool in concept design. Future work includes combining SDNA


Table A1
Terms with their SDNAs.

t1: Golden (5 senses): s1: 1-6-22-127-1-2; s2: 3-15-48-433-2-1; s3: 5-27-63-644-2-4; s4: 5-30-69-730-2-3; s5: 6-36-80-852-2-2
t2: Temple (8 senses): s6: 1-8-28-164-1-3; s7: 2-9-33-192-1-6; s8: 2-10-35-209-1-4; s9: 2-10-35-213-1-3; s10: 5-27-63-662-1-1; s11: 6-36-82-866-1-4; s12: 6-39-95-990-1-1; s13: 6-39-95-990-1-3
t3: Japan (6 senses): s14: 2-10-36-226-1-10; s15: 2-10-36-226-3-4; s16: 3-14-46-357-1-4; s17: 3-15-48-428-1-3; s18: 3-15-48-428-3-1; s19: 6-36-79-844-3-1
t4: Far East (1 sense): s20: 2-10-34-199-1-2
t5: Travel (8 senses): s21: 2-12-40-265-1-1; s22: 2-12-40-265-3-1; s23: 2-12-40-267-1-1; s24: 2-12-40-267-3-1; s25: 2-12-41-277-3-1; s26: 2-12-43-298-3-1; s27: 4-25-58-589-1-2; s28: 6-39-94-981-3-2
t6: Architecture (7 senses): s29: 1-3-12-56-1-1; s30: 1-4-13-62-1-1; s31: 1-8-28-164-1-1; s32: 2-11-37-243-1-1; s33: 3-14-45-331-1-1; s34: 4-25-58-551-1-3; s35: 6-36-79-844-1-2
t7: Wooden (4 senses): s36: 3-15-47-366-2-3; s37: 4-25-58-576-2-1; s38: 5-26-59-602-2-1; s39: 6-35-77-820-2-1
t8: Shrine (3 senses): s40: 3-15-47-364-1-6; s41: 6-39-95-988-1-10; s42: 6-39-95-990-1-1
t9: Religion (8 senses): s43: 4-16-49-449-1-2; s44: 4-24-57-529-1-1; s45: 6-39-92-973-1-1; s46: 6-39-93-979-1-1; s47: 6-39-93-979-1-1; s48: 6-39-94-981-1-7; s49: 6-39-94-982-1-1; s50: 6-39-94-984-1-1
t10: Historic (1 sense): s51: 6-36-82-866-2-1
t11: Tradition (6 senses): s52: 1-6-22-127-1-3; s53: 1-7-24-144-1-1; s54: 4-24-57-524-1-1; s55: 4-25-58-590-1-2; s56: 5-26-59-610-1-1; s57: 6-39-92-973-1-4
t12: Water (26 senses): s58: 1-3-11-43-3-1; s59: 1-8-26-156-3-1; s60: 1-8-27-163-1-3; s61: 1-8-27-163-3-2; s62: 1-8-28-171-3-1; s63: 2-12-43-301-1-29; s64: 2-12-43-301-3-4; s65: 2-12-43-302-1-4; s66: 3-13-44-319-1-4; s67: 3-14-46-335-1-1; s68: 3-14-46-335-1-1; s69: 3-14-46-335-1-1; s70: 3-14-46-339-1-1; s71: 3-14-46-339-3-1; s72: 3-14-46-341-3-4; s73: 3-14-46-346-1-1; s74: 3-14-46-350-3-3; s75: 3-15-47-369-3-3; s76: 3-15-47-370-3-1; s77: 3-15-48-382-1-3; s78: 3-15-48-387-1-1; s79: 3-15-48-422-1-1; s80: 5-27-63-633-3-1; s81: 5-27-63-648-1-4; s82: 5-27-63-654-3-2; s83: 5-30-69-730-3-3
t13: Peace (10 senses): s84: 1-2-8-24-1-1; s85: 1-4-13-60-1-1; s86: 2-12-40-266-1-2; s87: 3-15-48-376-1-3; s88: 3-15-48-399-1-1; s89: 5-29-68-710-1-1; s90: 5-29-68-717-1-1; s91: 5-29-68-719-1-2; s92: 5-30-69-730-1-2; s93: 6-36-78-826-1-1
t14: Garden (14 senses): s94: 1-8-26-156-1-3; s95: 2-9-33-194-1-22; s96: 2-10-36-235-1-1; s97: 3-15-47-366-1-2; s98: 3-15-47-366-1-7; s99: 3-15-47-366-3-1; s100: 3-15-47-368-1-1; s101: 3-15-47-370-1-2; s102: 3-15-47-370-2-1; s103: 3-15-47-370-3-1; s104: 3-15-48-383-1-2; s105: 3-15-48-396-1-1; s106: 4-16-49-449-1-3; s107: 6-36-79-841-1-2
t15: World (14 senses): s108: 1-1-2-3-1-1; s109: 1-3-10-32-1-2; s110: 1-3-11-52-1-1; s111: 1-3-11-52-2-4; s112: 1-7-25-154-1-2; s113: 2-9-31-183-1-1; s114: 2-9-31-183-2-2; s115: 3-13-44-319-1-1; s116: 3-13-44-321-1-1; s117: 3-13-44-321-1-2; s118: 3-13-44-321-2-4; s119: 3-14-46-344-1-1; s120: 4-20-53-494-1-1; s121: 5-29-68-708-1-4
t16: Heritage (4 senses): s122: 1-6-22-124-1-1; s123: 1-8-28-170-1-1; s124: 5-34-73-773-1-1; s125: 5-34-73-777-1-4
t17: Site (5 senses): s126: 2-9-31-184-2-1; s127: 2-9-32-186-1-1; s128: 2-9-32-187-1-3; s129: 2-9-32-187-1-3; s130: 2-9-32-187-3-1
t18: Tourism (2 senses): s131: 2-12-40-267-1-1; s132: 6-36-78-837-1-6

with content-based image retrieval (CBIR) methods to improve the indexing and retrieval performance.

Acknowledgment

The authors are grateful to the Ministry of Higher Education of Malaysia and Sultan Zainal Abidin University in Malaysia for sponsoring this research. They also wish to acknowledge the support of all members of the Knowledge Engineering Systems group at Cardiff who have provided a vibrant and intellectually stimulating environment for this research. Special thanks to Viscon Pro Ltd. for supporting this research by providing access to their fotoLibra collection of images.

Appendix

See Tables A1 and A2 for more details.

Table A2
Semantic chromosome.

Golden
  Possible concepts: Oldness, yellowness, goodness, prosperity, hope
  SDNA: 3-15-48-433-2-1 (concept: Yellowness; sense: Yellow)
  Related words: Pale yellow, acid yellow, lemon yellow, primrose yellow, jasmine, citrine, chartreuse, champagne, canary yellow, sunshine yellow, bright yellow, sulphur yellow, mustard yellow, golden, aureate, gilt

Temple
  Possible concepts: Production, abode, height, summit, refuge, repute, temple
  SDNA: 2-9-33-192-1-6 (concept: Abode; sense: House)
  Related words: House, building, edifice, house of God, temple, abode, home, residence, dwelling, dwelling house, messuage

Japan
  Possible concepts: Covering, covering, unctuousness, blackness, ornamentation
  SDNA: 3-15-48-428-3-1 (concept: Blackness; sense: Blacken)
  Related words: Blacken, black, black lead, Japan, ink, ink in, dirty, blot, smudge, smirch, make unclean, deepen, darken, singe, char, burn

Far East
  Possible concepts: Distance
  SDNA: 2-10-34-199-1-2 (concept: Distance; sense: Farness)
  Related words: Far West, Far East, foreign parts, extraneousness, outpost, seclusion, purlieus, outskirts, exteriority, outer edge, frontier, limit, unavailability, absence

Travel
  Possible concepts: Motion, motion, land travel, land travel, velocity, egress, book, worship
  SDNA: 2-12-40-267-1-1 (concept: Land travel; sense: Land travel)
  Related words: Land travel, travel, traveling, wayfaring, seeing the world, globe-trotting, country hopping, tourism, walking, hiking, riding, driving, motoring, cycling, biking, journey, voyage

Architecture
  Possible concepts: Composition, arrangement, production, form, structure, representation, ornamentation
  SDNA: 3-14-45-331-1-1 (concept: Structure; sense: Structure)
  Related words: Works, workings, nuts and bolts, architecture, tectonics, architectonics, fabric, work, brickwork, stonework, woodwork, timberwork, studwork, materials, substructure, infrastructure, superstructure, building, scaffold, framework

Wooden
  Possible concepts: Vegetable life, inelegance, obstinacy, insensibility
  SDNA: 3-15-47-366-2-3 (concept: Vegetable life; sense: Wooden)
  Related words: Wooden, wood, treen, woody, ligneous, ligniform, hard-grained, soft-grained

Shrine
  Possible concepts: Interment, ritual, temple
  SDNA: 3-15-47-364-1-6 (concept: Interment; sense: Tomb)
  Related words: Shaft tomb, barrow, earthwork, cromlech, dolmen, menhir, monument, shrine, aedicule, memorial, cenotaph

Religion
  Possible concepts: Thought, news, religion, piety, worship, idolatry, occultism
  SDNA: 6-39-94-981-1-7 (concept: Worship; sense: Public worship)
  Related words: Meeting for prayer, gathering for worship, assembly, prayer meeting, revival meeting, open-air service, mission service, street evangelism, revivalism, temple worship, state religion, religion

Historic
  Possible concepts: Repute
  SDNA: 6-36-82-866-2-1 (concept: Repute; sense: Reputable)
  Related words: Famous, fabled, legendary, famed, far-famed, historic, illustrious, great, noble, glorious, excellent, notorious, disreputable, known as, well-known

Tradition
  Possible concepts: Oldness, permanence, information, description, habit, religion
  SDNA: 6-39-92-973-1-4 (concept: Religion; sense: Theology)
  Related words: Symbolics, credal theology, liberation theology, tradition, deposit of faith, teaching, doctrine, religious doctrine

Water
  Possible concepts: Mixture, causation, weakness, productiveness, eating, eating, excretion, materiality, fluidity, water, moisture, lake, stream, animal husbandry, agriculture, refrigeration, insipidity, transparency, provision, cleanness, improvement, prosperity
  SDNA: 3-15-47-370-3-1 (concept: Agriculture; sense: Cultivate)
  Related words: Rotate the crop, leave fallow, not use, harvest, gather in, store, glean, reap, mow, cut, scythe, cut a swathe, bind, bale, stook, sheaf, flail, thresh, winnow, sift, bolt, separate, crop, pluck, pick, gather, tread out the grapes, ensile, ensilage, improve one's land, make better, fence in, enclose, ditch, drain, reclaim, water, irrigate

Peace
  Possible concepts: Agreement, order, quiescence, pleasure, silence, concord, peace, pacification, prosperity, pleasurableness
  SDNA: 3-15-48-376-1-3 (concept: Pleasure; sense: Euphoria)
  Related words: Luxuries, superfluity, lap of luxury, wealth, feather bed, bed of down, bed of roses, velvet, cushion, pillow, softness, peace, quiet, rest, repose, quiet dreams, sleep, painlessness, euthanasia

Garden
  Possible concepts: Causation, receptacle, enclosure, vegetable life, botany, agriculture, furnace, fragrance, thought, beauty
  SDNA: 3-15-47-370-3-1 (concept: Agriculture; sense: Cultivate)
  Related words: Cultivate, bring under cultivation, make fruitful, farm, ranch, garden, grow, till, till the soil, scratch the soil, dig, double-dig, trench, bastard trench, delve, spade

World
  Possible concepts: Substantiality, greatness, whole, event, space, materiality, universe, land, truth, party
  SDNA: 3-14-46-344-1-1 (concept: Land; sense: Land)
  Related words: Land, dry land, terra firma, earth, ground, crust, earth's crust, world, continent, mainland, heartland, hinterland, midland, inland, interior, interiority, peninsula

Heritage
  Possible concepts: Futurity, posterity, possession, property
  SDNA: 1-8-28-170-1-1 (concept: Posterity; sense: Posterity)
  Related words: Seed, litter, farrow, spawn, young creature, fruit of the womb, children, grandchildren, family, aftercomers, succession, heirs, inheritance, heritage, posteriority, rising generation, youth

Site
  Possible concepts: Spatial, situation, location
  SDNA: 2-9-32-187-3-1 (concept: Location; sense: Place)
  Related words: Place, collocate, assign a place, arrange, situate, position, site, locate, relocate, base, center, localize, narrow down, pinpoint, pin down, find the place

Tourism
  Possible concepts: Land travel, amusement
  SDNA: 2-12-40-267-1-1 (concept: Land travel; sense: Land travel)
  Related words: Land travel, travel, traveling, wayfaring, seeing the world, globe-trotting, country-hopping, tourism, walking, hiking, riding, driving, motoring, cycling, biking, journey, voyage

References

Akkaya, C., Conrad, A., Wiebe, J., Mihalcea, R., 2010. Amazon Mechanical Turk for subjectivity word sense disambiguation. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. Association for Computational Linguistics, Los Angeles, pp. 195–203.
Alonso, O., Mizzaro, S., 2009. Can we get rid of TREC assessors? Using Mechanical Turk for relevance assessment. In: SIGIR 2009 Workshop on the Future of IR Evaluation, Boston, Massachusetts, pp. 15–16.
Boujemaa, N., Fauqueur, J., Ferecatu, M., Fleuret, F., Gouet, V., Saux, B.L., Sahbi, H., 2001. Ikona: interactive generic and specific image retrieval. In: Proceedings of the International Workshop on Multimedia Content-Based Indexing and Retrieval (MMCBIR'2001).
Callison-Burch, C., 2009. Fast, cheap, and creative: evaluating translation quality using Amazon's Mechanical Turk. In: EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Morristown, NJ, USA, pp. 286–295.
Chang, S.-F., Chen, W., Sundaram, H., 1998. Semantic visual templates: linking visual features to semantics. In: International Conference on Image Processing (ICIP), Workshop on Content Based Video Search and Retrieval, vol. 3, pp. 531–534.


Chen, Y., Wang, J.Z., Krovetz, R., 2003. An unsupervised learning approach to content-based image retrieval. In: Proceedings of the IEEE International Symposium on Signal Processing and its Applications, pp. 197–200.
Corney, J.R., Torres-Sanchez, C., Jagadeesan, A.P., Yan, X.T., Regli, W.C., Medellin, H., 2010. Putting the crowd to work in a knowledge-based factory. Adv. Eng. Inform. 24, 243–250.
Dill, S., Eiron, N., Gibson, D., Gruhl, D., Guha, R., Jhingran, A., Kanungo, K., McCurley, S., Rajagopalan, S., Tomkins, A., Tomlin, J.A., Zien, J.Y., 2003. A case for automated large-scale semantic annotation. Web Semantics: Science, Services and Agents on the World Wide Web 1 (1), 115–132.
Doulamis, A.D., Doulamis, N.D., 2004. Generalized nonlinear relevance feedback for iterative content-based retrieval and organization. IEEE Trans. CSVT 14 (5), 656–671.
Edmonds, P., 2002. SENSEVAL: the evaluation of word sense disambiguation systems. ELRA Newsl. 7 (3), 5–14.
Enser, P.G.B., Sandom, C.J., Hare, J.S., Lewis, P.H., 2007. Facing the reality of semantic image retrieval. J. Doc. 63 (4), 465–481.
Eugenio, D.S., Francesco, M.D., Marina, M., 2002. Structured knowledge representation for image retrieval. J. Artif. Intell. Res. 16, 209–257.
Fadzli, S.A., Setchi, R., 2010. Semantic approach to image retrieval using statistical models based on a lexical ontology. In: Proceedings of the 14th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, Cardiff, UK, vol. IV, pp. 240–250.
Fadzli, S.A., Setchi, R., 2011. Ontology-based indexing of annotated images using semantic DNA and vector space model. In: Proceedings of STAIR'11: International Conference on Semantic Technology and Information Retrieval, Putrajaya, Malaysia.
Feng, H., Shi, R., Chua, T.-S., 2004. A bootstrapping framework for annotating and retrieving WWW images. In: Proceedings of the ACM International Conference on Multimedia.
Ferecatu, M., Boujemaa, N., Crucianu, M., 2008. Semantic interactive image retrieval combining visual and conceptual content description. ACM Multimedia Syst. J. 13 (5–6), 309–322.
Fuhr, N., Buckley, C., 1991. A probabilistic learning approach for document indexing. ACM Trans. Inform. Syst. 9 (3), 223–248.
Gauch, S., Madrid, J.M., Induri, S., Ravindran, D., Chadlavada, S., 2003. KeyConcept: a conceptual search engine. Tech. Report TR-8646-37, University of Kansas.
Hart, M., Newby, G., 2003. Project Gutenberg. http://www.gutenberg.org/ (accessed 30.11.10).
Hofmann, T., 1999. Probabilistic latent semantic indexing. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '99). ACM, Berkeley, California, pp. 50–57.
Howe, J., 2006. The rise of crowdsourcing. Wired Magazine 14 (6). http://www.wired.com/wired/archive/14.06/crowds.html (accessed 15.05.2011).
Hyvönen, E., Styrman, A., Saarela, S., 2002. Ontology-based image retrieval. In: Proceedings of the XML Finland 2002 Conference, vol. 16, pp. 15–27.
Lee, J.H., 1995. Combining multiple evidence from different properties of weighting schemes. In: Proceedings of the 18th SIGIR Conference, pp. 180–188.
Liu, Y., Zhang, D., Lu, G., Ma, W.-Y., 2007. A survey of content-based image retrieval with high-level semantics. Pattern Recognit. 40, 262–282.


Lu, Y., Hu, C., Zhu, X., Zhang, H., Yang, Q., 2000. A unified framework for semantics and feature based relevance feedback in image retrieval systems. In: Proceedings of the ACM International Conference on Multimedia, pp. 31–37.
Mezaris, V., Kompatsiaris, I., Strintzis, M.G., 2003. An ontology approach to object-based image retrieval. In: Proceedings of the ICIP, vol. II, pp. 511–514.
Navigli, R., 2009. Word sense disambiguation: a survey. ACM Computing Surveys 41 (2), 1–69.
OntoRo, 2011. http://kes.engin.cf.ac.uk/sdna/ontoro/ (accessed 06.06.2011).
Papineni, K., Roukos, S., Ward, T., Zhu, W.J., 2002. BLEU: a method for automatic evaluation of machine translation. In: ACL-2002: 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318.
Ren, J., Shen, Y., Guo, L., 2003. A novel image retrieval based on representative colors. In: Proceedings of Image and Vision Computing New Zealand, pp. 102–107.
Robertson, S.E., et al. (Eds.), 1999. Okapi at TREC-7. In: Proceedings of the Seventh Text REtrieval Conference, Gaithersburg, USA.
Davidson, E. (Ed.), 2003. Roget's Thesaurus of English Words and Phrases. Penguin, UK.
Salton, G., Buckley, C., 1988. Term-weighting approaches in automatic text retrieval. Inf. Process. Manage. 24 (5), 513–523.
Salton, G., McGill, M.J., 1983. Introduction to Modern Information Retrieval. New York.
Setchi, R., Tang, Q., 2007. Concept indexing using ontology and supervised machine learning. Trans. Eng. Comput. Technol., 221–226.
Setchi, R., Tang, Q., Stankov, I., 2011. Semantic-based information retrieval in support of concept design. Adv. Eng. Inform. 25 (2), 131–146.
Setchi, R., Bouchard, C., 2010. In search of design inspiration: a semantic-based approach. ASME J. Comput. Inf. Sci. Eng. 10 (3), 031006 (23 pages).
Smeulders, A., Worring, M., Santini, S., Gupta, A., Jain, R., 2000. Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22 (12), 1349–1380.
Snow, R., O'Connor, B., Jurafsky, D., Ng, A.Y., 2008. Cheap and fast—but is it good? Evaluating non-expert annotations for natural language tasks. In: EMNLP '08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 254–263.
Staab, S., Scherp, A., Arndt, R., Troncy, R., Grzegorzek, M., Saathoff, C., Schenk, S., Hardman, L., 2008. Semantic multimedia. LNCS 5224, 125–170.
Styltsvig, H.B., 2006. Ontology-based Information Retrieval. Ph.D. Thesis, Dept. of Computer Science, Roskilde University, Denmark.
Tsai, C.-F., 2007. A review of image retrieval methods for digital cultural heritage resources. Online Inf. Rev. 31 (2), 185–198.
Vasconcelos, N., 2004. On the efficient evaluation of probabilistic similarity functions for image retrieval. IEEE Trans. Inf. Theory 50 (7), 1482–1496.
VisconPro, 2011. fotoLIBRA. http://www.fotolibra.com (accessed 30.06.2011).
Voorhees, E.M., Harman, D., 1999. Overview of the Eighth Text REtrieval Conference (TREC-8). In: Proceedings of the Eighth Text Retrieval Conference (TREC-8). NIST, Gaithersburg, MD, pp. 1–24.
Yi, X., Allan, J., 2009. A comparative study of utilizing topic models for information retrieval. In: Proceedings of the 31st European Conference on IR Research (ECIR). Springer, Toulouse, France, pp. 29–41.
Zhang, R., Zhang, Z., Li, M., Ma, W.-Y., Zhang, H.-J., 2006. A probabilistic semantic model for image annotation and multi-modal image retrieval. Multimedia Syst. 12 (1), 27–33.