Video Content Retrieval Using Histogram Clustering Technique

Available online at www.sciencedirect.com

ScienceDirect Procedia Computer Science 50 (2015) 560 – 565

VIDEO CONTENT RETRIEVAL USING HISTOGRAM CLUSTERING TECHNIQUE

D. Saravanan (a), Vaithyasubramanian (b), K.N. Jothi Vengatesh (c)

(a) Associate Professor, IBS University, Hyderabad, 501 203, Telangana, India.
(b) Assistant Professor, Sathyabama University, Chennai, 600 119, Tamil Nadu
(c) Assistant Professor, Sathyabama University, Chennai, 600 119, Tamil Nadu

Abstract
In recent years, video content management and mining have become essential because of the increasing quantity of digital video data. With advances in multimedia and networking technology, digital video content is widely available over the web. Clustering and indexing are active research areas in multimedia, and many applications have recently been created for categorizing, indexing and retrieving digital video content. These applications are used to handle large quantities of video content. Moreover, advanced technologies need to be developed for mining and searching the large amount of video that is accessible on the web. Clustering is a useful technique for obtaining information from a given dataset: it maps similar data items into clusters. This paper focuses on the fast retrieval of video data using histogram clustering; the experimental results show that the proposed method works efficiently.

Key Terms: Data Mining, Clustering, Video Clustering, Segmentation, Frames, CHAMELEON, Multimedia Data.

1. Introduction
The information retrieval process often requires effective and efficient organization of high-dimensional data. Clustering, an unsupervised learning technique, is most commonly used to retrieve information from such data. Clustering and classification aim at grouping videos into various classes. In (Faloutsos.C et. al 2002; Kimiaki Shirahama 2011), a video is represented as a cube that captures both the temporal and the spatial information in the video. The videos are then classified into news and commercials by applying Independent Component Analysis (ICA) to the n x n x n cube. In (Davidson.J.G et. al; Kimiaki Shirahama 2011), the researchers represent a soccer video in terms of shots, where each shot is depicted as a mosaic. In (Kimber.D and Zhou. H 2006; Kimiaki Shirahama 2011), a Gaussian Mixture Model (GMM) is used to represent human movements in an example video. The human movements are then grouped into clusters of similar activities based on the GMM parameters, so each cluster consists of a small number of activities.

1.1 Clustering
Clustering is a useful technique for discovering information in a dataset. It maps each data item to one cluster; the clusters are natural groupings of the data items, formed mainly according to a probability density model or a similarity metric [7, 3]. Clustering is an unsupervised learning procedure that aims at identifying the structure in a collection of unlabeled data (Safadi .B and Quenot. G 2011). In other words, clustering is the process of dividing data into groups of similar objects. According to (Akilandeswar. S and Anita Elavarasi 2011; Asaad Hakeem and Mubarak Shah 2007), clustering algorithms can be classified into five categories (Anandan and Irani. M 1998; Ville Viitaniemi 2002).

1.2 Existing Video Clustering
Video clustering differs in some respects from conventional clustering. Because video data is unstructured, computer vision and image processing techniques are used to preprocess it and obtain features in a structured format. A further difference is that time must be considered when processing video. A video is usually synchronized audio and visual information over time, so the time factor is important. Conventional clustering algorithms are categorized into hierarchical and partitional clustering (2003) [7, 13]. A hierarchical clustering algorithm groups the dataset using a stopping criterion and a tree-structured dendrogram, in which each cluster node has child clusters. Sibling clusters partition the points covered by their parent. This approach permits the data to be explored at various levels of granularity. Hierarchical clustering methods are of two types: agglomerative and divisive. Hierarchical clustering is easy to use in video clustering: it easily handles the similarity of features mined from video, and it can represent granularity and depth through the levels of the tree.

1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of scientific committee of 2nd International Symposium on Big Data and Cloud Computing (ISBCC’15). doi:10.1016/j.procs.2015.04.084

D. Saravanan et al. / Procedia Computer Science 50 (2015) 560 – 565

2. Experimental Work
2.1 Proposed Histogram Clustering Framework

Fig 1. Proposed Histogram Clustering Framework (block labels: Unprocessed Video Database; Frame Mining; Frames Clustering)

At the beginning of the process, the video is first converted into a sequence of frames. The video clustering algorithm is then applied; it involves two search processes. The first search operates on the image matrix and identifies the centroids so that duplicate frames in the video can be removed. The second search operates on the image pixels and is used to create the clusters. This pixel-level search takes less time to cluster, which addresses the time-complexity issue. A novel matrix-based indexing technique first converts the video into a number of frames. Each input frame is then split into rows and columns. A histogram is calculated for each matrix cell and is used to retrieve the video, or the query image, from the video database. The proposed framework is expected to give better results than the existing techniques.

2.2 Frames
The following sample videos are converted into frames in order to compare the clustering performance of existing techniques with that of the proposed technique.

Fig 2. Cartoon Video File Frames

Fig 3. Graphics Video File Frames

Fig 4. Meeting Video File Frames

Fig 5. Song Video File Frames

Fig 6. News Video File Frames

2.3 Duplicate Removal
After frame conversion, duplicate content is removed using the following technique.

Algorithm: Repeated Frame Removal using Grey Scale Value
Input: Segmented video frames
Output: Grey value of each frame; repeated images found from the difference between two adjacent frames.

Pseudo code:
    Grey = 0
    For i = 0 to Frame height
        For j = 0 to Frame width
            PicPixel = pixel(i, j)
            GreyValue = (0.299 * PicPixel.R) + (0.587 * PicPixel.G) + (0.114 * PicPixel.B)
            Grey = Grey + GreyValue
    ('Grey' gives the grey value of the whole image.)
    GreyDifference = | Image1.Grey - Image2.Grey |
    If GreyDifference < threshold then
        Duplicate image
    Else
        Needed image
    End if

GreyDifference is the difference between the grey values of two adjacent pictures. By assuming a threshold for this difference, the duplicate images are identified. The duplicate files are then eliminated using file handling methods. The table below shows a sample grey value calculation.

Tab 1. Gray Value Vs Number of Frames.

Frame    Grey Value
1        3213382
2        3241749
3        3284197
4        3302771
5        3351903
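The duplicate test above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: the function names grey_sum and keep_mask are hypothetical, frames are assumed to be nested lists of (R, G, B) tuples, and the threshold of 20000 is an assumed value chosen only for illustration, since the paper does not state the threshold it used. The grey values are the ones listed in Tab 1.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]        # (R, G, B), each in 0..255
Frame = Sequence[Sequence[Pixel]]   # rows of pixels (assumed layout)


def grey_sum(frame: Frame) -> float:
    """Sum the luminance-weighted grey value over every pixel of a frame."""
    return sum(0.299 * r + 0.587 * g + 0.114 * b
               for row in frame for (r, g, b) in row)


def keep_mask(grey_values: Sequence[float], threshold: float) -> List[bool]:
    """Return True for frames to keep.

    A frame is treated as a duplicate when its grey sum differs from the
    grey sum of the previous adjacent frame by less than `threshold`.
    The first frame has no predecessor and is always kept.
    """
    mask = [True] * len(grey_values)
    for k in range(1, len(grey_values)):
        if abs(grey_values[k] - grey_values[k - 1]) < threshold:
            mask[k] = False
    return mask


# Grey values from Tab 1; the threshold is a hypothetical choice.
table1 = [3213382, 3241749, 3284197, 3302771, 3351903]
print(keep_mask(table1, threshold=20000))
# prints [True, True, True, False, True]
```

With this assumed threshold only frame 4 is flagged, because its grey value differs from that of frame 3 by 18574; a larger threshold would remove more frames, so the value has to be tuned per video.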

2.4 User Side Data Training
After image features such as texture are extracted and the matrix conversion is performed, the pixel values are trained in the database by labeling the features of the images. The matrix conversion is done by taking the intensity at each point (x, y) and finding the RGB values. A matrix with M rows and N columns is formed.

Fig 7. Flow graph for user Data Training (steps shown: Features are extracted; Color features: images are represented in RGB; Histogram is formed; The images are labeled using threshold value)
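The matrix conversion and histogram formation described in this section can be illustrated as follows. This is only a sketch under stated assumptions: the function name cell_histograms is hypothetical, the frame is assumed to be a nested list of (R, G, B) tuples, and a grey-level histogram per cell is used as a simplification, since the paper does not specify the bin count or whether the histograms are computed per color channel.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]        # (R, G, B), each in 0..255
Frame = Sequence[Sequence[Pixel]]   # rows of pixels (assumed layout)


def cell_histograms(frame: Frame, rows: int, cols: int,
                    bins: int = 4) -> List[List[int]]:
    """Split a frame into rows x cols cells and build one histogram per cell.

    Cells are returned in row-major order. Each histogram counts the grey
    levels of the pixels that fall inside the cell.
    """
    height, width = len(frame), len(frame[0])
    hists = [[0] * bins for _ in range(rows * cols)]
    for y in range(height):
        cell_row = min(y * rows // height, rows - 1)
        for x in range(width):
            cell_col = min(x * cols // width, cols - 1)
            r, g, b = frame[y][x]
            grey = 0.299 * r + 0.587 * g + 0.114 * b
            bin_index = min(int(grey * bins / 256), bins - 1)
            hists[cell_row * cols + cell_col][bin_index] += 1
    return hists


# A tiny 2 x 2 frame: black top row, white bottom row.
black, white = (0, 0, 0), (255, 255, 255)
frame = [[black, black],
         [white, white]]
print(cell_histograms(frame, rows=2, cols=1, bins=2))
# prints [[2, 0], [0, 2]]
```

Keeping one histogram per cell, rather than one per frame, preserves some spatial layout, which is presumably why the frame is split into rows and columns before the histogram is formed.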

2.5 Image Retrieval
The two images are matched by using their features and recognizing the objects. The steps of the process are shown in the flow graph below.

Fig 8. Flow graph for image retrieval (steps shown: User image query; Feature extraction by image feature; Recognition of frame by using frame value; Digitizer will retrieve image by using frame value; Retrieve the relevant image)
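A retrieval step of this kind can be sketched as a nearest-histogram search. The paper does not state which similarity measure is used, so the L1 (sum of absolute differences) distance below is an assumption, and the names l1_distance and retrieve as well as the frame labels and histogram values in the example database are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def retrieve(query_hist: Sequence[float],
             database: Dict[str, Sequence[float]],
             top_k: int = 1) -> List[Tuple[str, float]]:
    """Rank stored frames by histogram distance to the query (closest first)."""
    ranked = sorted(
        ((name, l1_distance(query_hist, hist)) for name, hist in database.items()),
        key=lambda item: item[1],
    )
    return ranked[:top_k]


# Hypothetical stored histograms, one per indexed frame.
database = {
    "cartoon_001": [8, 2, 0, 0],
    "news_004": [1, 3, 4, 2],
    "song_010": [0, 0, 5, 5],
}
print(retrieve([1, 2, 5, 2], database, top_k=2))
# prints [('news_004', 2), ('song_010', 6)]
```

Any other histogram measure, such as histogram intersection, could be substituted for l1_distance without changing the structure of the search.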

3. Experimental Results

Fig 9. Scale Values for removing duplicates

Fig 10. Cluster Formation (Time Take in Sec)

Fig 11. Cluster Formation for Song Video file

Fig 12. Cluster Formation of Cartoon Video file

Tab 2. Output of cluster for various video files

S.No.    Video name    Number of input frames    Number of output frames    Duplicate frames removed
1        Cartoon       7                         5                          2
2        Sports        16                        10                         6
3        Debate        15                        14                         1
4        News          15                        13                         2
5        Song          15                        14                         1

Fig 13. Result of Chameleon Histogram Cluster

Fig 14. Performance Graph of Chameleon Histogram Cluster

The formation of clusters for the various video files with respect to time (in milliseconds) is depicted in the CHAMELEON graph above. Blue represents the cartoon video file, red the sports video file, rosy brown the song video file, brown the news video file and cyan the news video file.

4. Conclusion
The information retrieval process often requires effective and efficient organization of high-dimensional data. Clustering, an unsupervised learning technique, is most commonly used to retrieve information from such data. Clustering and classification aim at grouping videos into various classes. The accuracy of clustering depends on the similarity parameter used to group the objects. Clustering algorithms generally have no background knowledge about the data domain; because they need no prior knowledge of the application area, they can be used for a wide range of problems in various application areas. Here we propose a clustering technique for video data files, and the experimental data support its effectiveness.

4.1 Future Enhancement
The new clustering algorithm overcomes the limitations of existing agglomerative hierarchical clustering algorithms, but the following issues remain:
1. It should also maintain a static model.
2. Execution cost should be minimized.
3. It should be implemented and tested for various application domains.

5. References
[1] D. Saravanan, Dr. S. Srinivasan (2013). Matrix Based Indexing Technique for Video Data, Journal of Computer Science, 9(5), 2013, 534-542.
[2] D. Saravanan, Dr. S. Srinivasan (2012). Video Image Retrieval Using Data Mining Techniques, Journal of Computer Applications (JCA), Vol. V, Issue 01, 2012, 39-42.
[3] D. Saravanan, V. Soma sundaram. "Matrix Based Sequential Indexing Technique for Video Data Mining", Journal of Theoretical and Applied Information Technology, 30th September 2014, Vol. 67, No. 3.
[4] JungHwan Oh, Babitha Bandi. "Multimedia Data Mining Framework for Raw Video Sequence", in Proc. MDM/KDD 2002: International Workshop on Multimedia Data Mining (with ACM SIGKDD 2002).
[5] George Karypis, Eui-Hong (Sam) Han, Vipin Kumar. CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling. IEEE.
[6] D. Saravanan, Dr. S. Srinivasan (2010). Indexing and Accessing Video Frames by Histogram Approach, in Proc. of International Conference on RSTSCC 2010, 196-1999.
[7] John R. Smith (1997). "Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression", a thesis in the Graduate School of Arts and Sciences, Columbia University.
[8] Winston H. Hsu (2007). "An Information-Theoretic Framework towards Large-Scale Video Structuring, Threading, and Retrieval", Ph.D. Thesis, Columbia University.
[10] Gupta A, Jain R, Santini S, Smeulders A.W.M and Worring M (2000). "Content-based Image Retrieval at the End of the Early Years", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pages 1349–1380.
[11] D. Saravanan, Dr. S. Srinivasan (2013). "Video information retrieval using: CHEMELEON Clustering", International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 02, Issue 01, January–February 2013, pages 166-170.
[12] D. Saravanan, A. Ramesh Kumar. "Content Based Image Retrieval using Color Histogram", International Journal of Computer Science and Information Technology (IJCSIT), Volume 4(2), 2013, pages 242-245, ISSN: 0975-9646.
[13] Orengo. M and Stricker. M (1995). "Similarity of color images", in the Proceedings of SPIE Storage and Retrieval for Image and Video Databases, pp. 01-12.
[14] Ma. W, Ma. Y, Mei. T and Zhang (2005). "Sports video mining with mosaic", in Proc. of MMM 2005, pp. 107–114.