Optimising sampling rates for accelerometer-based human activity recognition


Accepted Manuscript

Aftab Khan, Nils Hammerla, Sebastian Mellor, Thomas Plötz

PII: S0167-8655(16)00004-0
DOI: 10.1016/j.patrec.2016.01.001
Reference: PATREC 6422

To appear in: Pattern Recognition Letters

Received date: 3 July 2015
Accepted date: 6 January 2016

Please cite this article as: Aftab Khan, Nils Hammerla, Sebastian Mellor, Thomas Plötz, Optimising sampling rates for accelerometer-based human activity recognition, Pattern Recognition Letters (2016), doi: 10.1016/j.patrec.2016.01.001

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Research Highlights

• Activity recognition with wearable sensing typically uses non-optimal sampling rates.
• Sampling rates are up to 57% higher than necessary leading to waste of resources.
• We develop a method for automated, task specific optimisation of sampling rates.
• Using our method we can maintain recognition performance for the optimal rates.
• Experimental validation through recognition experiments using classification.

Pattern Recognition Letters
journal homepage: www.elsevier.com

Aftab Khan^a, Nils Hammerla^b, Sebastian Mellor^b, Thomas Plötz^b

^a Telecommunications Research Laboratory, Toshiba Research Europe Limited, Bristol, UK
^b Open Lab, School of Computing Science, Newcastle University, UK

ABSTRACT

Real-world deployments of accelerometer-based human activity recognition systems need to be carefully configured regarding the sampling rate used for measuring acceleration. Whilst a low sampling rate saves considerable energy, as well as transmission bandwidth and storage capacity, it is also prone to omitting relevant signal details that are of interest for contemporary analysis tasks. In this paper we present a pragmatic approach to optimising sampling rates of accelerometers that effectively tailors recognition systems to particular scenarios, thereby only relying on unlabelled sample data from the domain. Employing statistical tests we analyse the properties of accelerometer data and determine optimal sampling rates through similarity analysis. We demonstrate the effectiveness of our method in experiments on 5 benchmark datasets where we determine optimal sampling rates that are each substantially below those originally used whilst maintaining the accuracy of reference recognition systems.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction


When Mark Weiser developed his vision of pervasive computing, one of its key promises was the prospect of disappearing technologies that “weave themselves into the fabric of everyday life until they are indistinguishable from it” [1]. The research fields of ubiquitous, pervasive, and wearable computing have since matured rapidly, and many of the original visions have become a reality for many, massively accelerated by the advent and ubiquitous uptake of smartphones and, more recently, wearable sensing and computing platforms such as smart watches. Smart environments, living labs, and especially mobile computing now constitute the central paradigm of what many call the third generation of computing [2]. As an enabling technology, automatic inference of the context and especially of the activities humans are engaged in, typically referred to as Human Activity Recognition (HAR), plays a central role in the majority of ubiquitous and mobile computing applications. Over the years a multitude of application scenarios have been developed for activity recognition, with the majority using wearable, miniaturised inertial measurement units (IMUs).

e-mail: [email protected] (Aftab Khan)

Of particular interest have been tri-axial accelerometers, which are used in domains as diverse as novel interaction techniques [3], situated support in smart environments [4], automated health assessments [5, 6], and health care automation [7, 8, 9], to name but a few. Beyond the mere recognition of certain activities of interest, a number of application domains now even require the analysis of the quality of a person’s activities [10]. Such skill assessment is, for example, relevant for progress assessment in physical rehabilitation [11], or for coaching in certain sports [12].

Real-world applications of human activity recognition require careful configuration of wearable accelerometers to balance accurate and reliable analysis of movement data with practical constraints surrounding battery life and storage requirements. The sampling rate, i.e., the temporal resolution at which tri-axial acceleration is measured, represents the most critical parameter. It directly affects power consumption, data storage, and power or bandwidth requirements in the case of wireless transmission. The dilemma lies in the fact that poorly chosen sampling rates can lead either to excessive energy and memory consumption or, conversely, to missing detail in the activity data to be analysed. To avoid this, the choice of an optimal sampling frequency has to balance the information content against the amount of time a sensor can record for. For example, in various applications accelerometry data is recorded for a long duration, and it is crucial to be able to reliably determine the logging time whilst also considering the amount of information content.

Fig. 1: Optimised sampling rate selection procedure presented in the context of an activity recognition system. Q represents the original sampling rate of the raw data (a); q_i represents lower sampling rates (b), while q̂ is the optimal sampling rate calculated using a two-sample Kolmogorov-Smirnov (KS) test between a dataset at its original sampling rate and subsampled versions of it (c). (d) represents the evaluation step of a standard activity recognition backend.

In this paper, we investigate optimised sampling rates with a statistical approach that analyses the properties of raw accelerometer signals (cf. Fig. 1 for an overview of the developed approach). We employ statistical tests for determining those sampling rates that are optimal for particular application scenarios. Through assessments on small, unlabelled sample sets, sampling rates can be optimised automatically using our approach. In an experimental evaluation we demonstrate the effectiveness of our method on 5 different datasets, each covering a wide variety of human activities. We aim to determine optimised sampling rates for each dataset, which is a prerequisite for subsequent successful activity analysis. For a concrete evaluation, we also perform activity recognition on these datasets to show that the selected sampling rate is optimal in the given settings.

The developed method is universally applicable, i.e., without restrictions regarding the actual activities to be recognised in a deployment of human activity recognition systems. A possible application scenario would consist of running small-scale pilot studies using the chosen sensing solution to collect small amounts of unlabelled accelerometry data that is representative of the application domain. By employing our analysis method, practitioners could then automatically determine the sampling rate that is optimal for their application scenario and configure their sensing and analysis settings accordingly for the most effective human activity recognition in their field. In our experience this resembles a typical deployment setting, specifically in the health domain, where non-experts often struggle to configure their sensing and analysis framework such that it produces optimal results.

2. Related Work

Previous work suggests that sampling rates of approximately 20Hz are reasonable for “standard” human activities (e.g., [13, 14]). Such standard activities usually correspond to periodic movements such as walking, running, or cycling, i.e., fairly regular and less complex movements. The motivation for sampling at 20Hz typically comes from the Shannon-Nyquist theorem [15], which states that for a successful, i.e., loss-less, reconstruction of a particular signal, the data needs to be sampled at a rate of at least twice its highest frequency. It is assumed that voluntary human movements do not typically exceed 10Hz and thus, according to Shannon-Nyquist, ≥ 20Hz would be a reasonable choice when recording accelerometer data using wearable sensing platforms.

However, recent developments in HAR have moved the field towards the analysis of more complex movements and activities, and even beyond the recognition of activities [10]. For example, a number of methods now focus more on quality analysis of recorded activities and thus aim at a more fine-grained assessment of the details of the movement data. For such application cases a sampling rate that is too low would effectively hide essential details such that automatic signal analysis w.r.t. quality parameters and differences thereof becomes very difficult, if not impossible (e.g., mechanical vibrations may be insightful when assessing tool use [16]). In the case of human activities, ≥ 20Hz cannot simply be considered a natural sampling rate, as it is difficult to reliably represent human activities at this rate. This could be linked to a lack of periodicity in accelerometry signals, which also means that signals at this rate are significantly different to those at a much higher rate. It is for this reason that, in many cases, accelerometer data is now recorded at much higher sampling rates. For example, datasets like Opportunity and Daphnet Gait were recorded at 30Hz [17, 18], but in health and sports assessment scenarios 100Hz was used [19, 20], and in other domains sampling rates as high as 250Hz have been used [21]. Certainly, the availability of inexpensive sensing hardware (such as ADXL335 tri-axial accelerometers) has also contributed to the wider use of higher sampling rates [22].

However, higher sampling rates come at a price for real-world deployments that rely on long-term operation: the higher the sampling rate, the shorter the deployment time. In practice, sampling rates are chosen based on the activities of interest and usability constraints such as battery life. Consequently, generalised heuristics such as the aforementioned Shannon-Nyquist criterion are not as universally applicable as one might hope. This problem has been recognised in previous work, and optimisation approaches have been developed. For example, [23] proposes a Genetic Programming based algorithm for sampling rate optimisation under various accuracy requirements. Alternatively, [24] proposed activity prediction for resource optimisation through sampling rate variation. Retrospectively adjusting sampling rates after system development can be difficult, as it is unclear whether a system that works well with a high sampling rate would also work well with a lower rate. It is therefore not common practice.

Fig. 2: Overview of battery discharge curves for a wearable accelerometer configured to record at varying sampling rates. (a1) shows the battery voltage (V) over time (in hours) for individual sampling rates ranging from 12Hz to 200Hz; (a2) shows the same for 400Hz to 3.2kHz. (b) shows the voltage discharge rates (V/hour) for individual sampling rates (log scale). It can be seen in this graph that decreasing the sampling rate from 100Hz to 25Hz would more than double battery life.

3. Optimising Sampling Rates

An optimal sampling rate can be defined as the minimum sampling rate at which all relevant characteristics of activities

that are of interest in a particular scenario can be captured. Note that in many scenarios not only the activities themselves are of interest but also additional parameters such as movement quality. Below the scenario-specific optimal sampling rate, modelling and hence recognition become difficult, as important activity-related information (for example, impact peaks in the signal) might be lost. Above this sampling rate, there is little additional information gain, resulting in no significant improvement in activity recognition performance.

Unnecessarily high sampling rates waste resources, which limits deployment times “in the wild”. Figure 2 shows the effect of varying sampling rates on the battery consumption over time of a wearable logging accelerometer. It can be clearly seen that higher sampling rates result in a much quicker battery drain. Figure 2(b) shows that reducing the logging sampling rate from 100Hz to 25Hz reduces the battery consumption by more than half. The devices used to produce Figure 2 contain a 150mAh rechargeable lithium polymer battery, 512MB NAND flash non-volatile memory, and a MEMS tri-axial accelerometer. Each device was configured to record for a set period of 75 hours or until the battery was depleted or memory ran out. Before the experiment each device was charged to full capacity.

The key to our optimisation approach lies in similarity analysis of signals with different sampling representations using a statistical test protocol. With this approach we measure the significance of the information difference between sampling rates.

3.1. Analysis Approach – Overview

In our investigations we determine optimal sampling rates through systematic subsampling of signals that have originally been sampled at higher frequencies. Our method assumes

that the original sampling rate of the data is sufficient for successful activity recognition, which is justified by the activity recognition results reported in the corresponding literature for each dataset [25][26][27][21][28]. For a given dataset we analyse overlapping frames of a fixed duration that are extracted from the complete signal and then sub-sampled to create variants that mimic differences in sampling rates (cf. Figure 1(b) for an illustration of how these frames are aligned across different sampling frequencies). We then apply a statistical test between differently sampled variants of corresponding analysis frames in order to test for similarity (or breach thereof). Differences between frames at their original sampling frequency and at each test frequency are computed using a two-sample Kolmogorov-Smirnov (KS) test, which is a standard non-parametric test for equality of continuous probability distributions [29]. One particular advantage of the KS test is that it makes no assumption regarding the underlying distribution of the data. An optimal sampling rate is then achieved when the distributions of corresponding time frames at a given sampling rate and at the original rate become statistically similar. The relationship between similarity and frequency is shown in Figure 1(c). This process is unsupervised, i.e., no activity labels are required, which makes it immediately applicable to any scenario: only unlabelled pilot datasets are required for optimising the sampling rates.

3.2. Formal Description

For a given dataset D consisting of N samples {D_d(t), d = [1, 2, 3], t = 0 ... N−1} of 3-dimensional accelerometer data we extract F pairs of frames covering the same time window w, with one frame, A, sampled at the original sampling frequency Q and the other, B, at the test frequency q ∈ N+, q < Q. We denote down-sampled data as D_{d,q}(t), where D_{d,Q}(t) is the original data D_d(t).

    A_{d,f} = \left( D_d(fw),\; D_d(fw+1),\; \ldots,\; D_d(fw+w-1) \right)^T    (1)

    B_{d,f,q} = \left( D_{d,q}(fw),\; D_{d,q}(fw+1),\; \ldots,\; D_{d,q}(fw+w-1) \right)^T    (2)

    \text{with } f = 0, \ldots, F-1.    (3)

For every pair of frames we then perform a two-sample KS test:

    KS_{d,f,q} = \sup_x \left| \mathrm{cdf}(A_{d,f}(x \mid n)) - \mathrm{cdf}(B_{d,f,q}(x \mid n')) \right|    (4)

where cdf represents the empirical distribution function and sup is the supremum function; n and n' represent the number of samples in frames A_{d,f} and B_{d,f,q}, respectively. The normalised distance metric, M, of a given test frequency q for a given axis d and frame index f is computed as follows:

    M_{d,f,q} = \frac{\#\{KS \geq KS_{d,f,q}\}}{\text{total \# permutations}}    (5)

for all permutations of the data within each frame.

This distance metric gives us a probabilistic measure of likelihood for an observed KS-test value, given the data, i.e., the number of possible KS-test values equal to or exceeding our observed value, relative to the total number of permutations of the data. This enables the test to be agnostic to the underlying distribution and to provide a measure of significance. The overall measure of similarity in information S_q(D) between the original sampling rate, Q, and the test sampling rate, q, can then be defined as the mean over all values of M:

    S_q(D) = \frac{1}{3F} \sum_{d=1}^{3} \sum_{f=0}^{F-1} M_{d,f,q}    (6)

with S ∈ [0, 1], where S_q = 0 represents completely dissimilar frames and S_q = 1 represents frames that are identical under the KS test.

Algorithm 1 Sampling rate selection algorithm

    procedure OPTIMAL SAMPLING RATE
        Q ← original sampling rate
        i ← 4:2:Q
        T ← similarity threshold
        for i do
            if S_i(Data, i|Q) > T then
                return q̂ ← i
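The analysis described in this section (frame-wise two-sample KS comparison, a permutation-based significance value per frame, averaging into a similarity score, and selection of the first rate that passes a threshold) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, frames are assumed to be already extracted at the original rate (one list of samples per axis and frame), sub-sampling is done by keeping evenly spaced samples, the permutations are read as relabellings of the pooled samples of both frames, and only a random subset of permutations is drawn instead of enumerating all of them. The smoothing spline applied to the similarity curve in the experiments is omitted.

```python
import bisect
import random


def subsample(frame, Q, q):
    """Keep evenly spaced samples of a frame recorded at Q Hz to mimic q Hz (assumed scheme)."""
    n_out = max(2, round(len(frame) * q / Q))
    step = len(frame) / n_out
    return [frame[int(i * step)] for i in range(n_out)]


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    for x in sa + sb:
        cdf_a = bisect.bisect_right(sa, x) / len(sa)
        cdf_b = bisect.bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_metric(a, b, n_perm=200, rng=None):
    """Approximate M: share of random relabellings whose KS value is >= the observed one."""
    rng = rng or random.Random(0)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Small tolerance so that ties with the observed value are counted.
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed - 1e-12:
            hits += 1
    return hits / n_perm


def similarity(frames, Q, q, n_perm=200, seed=0):
    """S_q: mean of M over all supplied frames (frames of all axes in one list)."""
    rng = random.Random(seed)
    values = [permutation_metric(f, subsample(f, Q, q), n_perm, rng) for f in frames]
    return sum(values) / len(values)


def optimal_sampling_rate(frames, Q, threshold=0.99, n_perm=200):
    """Return the first q in 4, 6, ..., Q whose similarity exceeds the threshold."""
    for q in range(4, Q + 1, 2):
        if similarity(frames, Q, q, n_perm) > threshold:
            return q
    return Q  # fall back to the original rate if no lower rate qualifies
```

Because the number of sampled permutations trades run time against the resolution of M, a pilot analysis would typically start with a small `n_perm` and increase it once the rough location of the threshold crossing is known.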

4. Experiments

To validate our method, we perform an experimental evaluation on 5 benchmark datasets. We determine lower and upper boundaries of sampling rates, which define an “operating area”: beyond the upper boundary no additional information would be gained, so higher rates would only waste energy and memory, whereas the lower boundary defines the minimum frequency that is required for recognition without losing precision. For each dataset, we down-sampled from the original frequency to test frequencies in the range {4, 6, 8, ..., Q}Hz, where Q represents the maximum sampling frequency of the given dataset. We use step sizes of 2Hz in order to avoid frames with odd numbers of samples, which would otherwise negatively affect the test statistic and result in a noisier normalised distance metric between odd/even frame shifts. We then extract the analysis frames, each containing consecutive sensor readings for 1s of signal, with subsequent frames overlapping by 50% (corresponding to the standard parameterisation of sliding window procedures in HAR [30]). Following the procedure described above, we then calculate the normalised distance metric S using the KS test, followed by smoothing spline curve fitting. We choose as optimal those sampling rates q̂ at which S reaches the similarity threshold T (we use T = 0.95 and T = 0.99 in our experiments). These points indicate where differences between the frames at the selected and at the maximum sampling rate of a given dataset become statistically insignificant. The trend in the smoothed curve allows us to reliably select an optimal frequency, and higher similarity thresholds can be chosen if a greater degree of similarity is desired.
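The experimental protocol above has three mechanical ingredients: the grid of test frequencies, the extraction of 1s frames with 50% overlap, and the choice of the lowest frequency whose similarity reaches the threshold. The following sketch illustrates them in plain Python. The function names are hypothetical, the similarity values are assumed to be supplied by the KS-based analysis, and the smoothing spline fit is not reproduced.

```python
def candidate_rates(Q, lowest=4, step=2):
    """Test frequencies {4, 6, 8, ..., Q} Hz for a dataset recorded at Q Hz."""
    return list(range(lowest, Q + 1, step))


def sliding_frames(signal, rate_hz, window_s=1.0, overlap=0.5):
    """Split a 1-D signal into fixed-duration frames with fractional overlap."""
    width = int(round(window_s * rate_hz))
    hop = max(1, int(round(width * (1.0 - overlap))))
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, hop)]


def first_rate_reaching(similarity_by_rate, threshold):
    """Lowest test frequency whose similarity value reaches the threshold, or None."""
    for rate in sorted(similarity_by_rate):
        if similarity_by_rate[rate] >= threshold:
            return rate
    return None


# Example: 2.5 s of single-axis data at 100 Hz gives four 1 s frames with a 50-sample hop.
frames = sliding_frames(list(range(250)), rate_hz=100)
print(len(frames), len(frames[0]))
```

With an even original rate and an even test rate, both the frame length and the hop are whole numbers of samples, which is the practical reason for the 2Hz step size mentioned in the text.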

Fig. 3: (a)-(g) show the similarity metric S for each of the 5 datasets (panels: Skoda, PAMAP2-Hand, PAMAP2-Chest, PAMAP2-Ankle, USC-HAD, PHealth, and Walk8) from the lowest frequency, 4Hz, up to the original frequency Q (x-axis: frequency q in Hz; y-axis: distance metric S using a two-sample KS-test). Additionally, 3 markers indicate those frequencies for which S = 0.95, S = 0.99, and the original sampling rate.

In order to validate the applicability and relevance for real-world tasks, we then use a standard classification backend as typically used in HAR applications and evaluate recognition accuracy (mean F1-scores) for the particular datasets at the chosen sampling rates q̂. We use the same parameterisation as before for sliding-window-based frame extraction (window length: 1s; 50% overlap) and subsequent frame-wise calculation of a total of 23 statistical features as outlined in [8, 30]. These include the mean and standard deviation of each axis, pitch and roll, entropy, energy, and inter-axis correlation coefficients. Subsequently we apply a set of one-vs-all support vector machines (SVMs) using an RBF kernel. The scaling parameter is set to 1/|features| and the cost is set to 1. We evaluate the performance in 10-fold stratified cross-validation.

4.1. Datasets

We use 5 benchmark datasets containing a variety of human activities, recorded at different sampling rates using body-worn 3D accelerometers (see Table 1). For comparability we have only included accelerometers and excluded other modalities. Unless otherwise specified, all of the datasets refer to a wrist-worn accelerometer, which corresponds to the most common approach in wearable computing. With these datasets, our main objective is to find scenario-specific optimal sampling rates that are suitable for the given set of activities, enabling researchers to use these instead of a power-consuming, memory-intensive higher sampling rate whilst at the same time maintaining all relevant signal information. We used the following datasets:

i) The PHealth dataset [25] contains accelerometer data of people with Parkinson’s disease with activities like sitting, walking, running and gait freezing.

ii) The USC-HAD dataset [26] contains 12 activities including walking, going up/down stairs, jumping, sitting and elevator up/down.

iii) PAMAP2 [27] is a physical monitoring dataset; we use a subset of this dataset containing various physical activities such as running, cycling, playing soccer, vacuum cleaning, house cleaning, and car driving.

iv) The Walk8 dataset [21] contains walking activities for gait recognition. Acceleration is measured using a sensor placed above the right knee.

v) The Skoda dataset [28] contains activities related to quality control in car manufacturing.

In addition to these benchmark datasets, we also collected another dataset related to activities of daily living using an accelerometer sampling at 100Hz. Accelerometers were worn on the wrist for two days by 4 subjects (right wrist for two subjects and left wrist for the other two) and no restrictions were imposed on the types of activities.

Table 1: Overview of the datasets, number of classes and associated optimal sampling rates with varying thresholds for the normalised distance metric S.

Dataset        #Classes  Original Q  q̂ (S = 0.95)  q̂ (S = 0.99)
Skoda          11        96Hz        12Hz           22Hz
PAMAP2-Hand    13        100Hz       32Hz           56Hz
PAMAP2-Chest   13        100Hz       33Hz           57Hz
PAMAP2-Ankle   13        100Hz       42Hz           63Hz
USC-HAD        12        100Hz       17Hz           30Hz
PHealth        10        100Hz       15Hz           26Hz
Walk8          4         250Hz       18Hz           35Hz

4.2. Optimal Sampling Rates in the Benchmark Datasets

Figures 3(a)-3(g) show the results for all of the 5 datasets, with sampling rates on the x-axis and, on the y-axis, the normalised distance metric S using the two-sample KS test for frames of the same time length captured at frequency q and at the maximum frequency Q of the same dataset. Points at which the statistical similarity determined by the normalised distance metric reaches S = 0.95 and S = 0.99 are also highlighted with the corresponding sampling rates. These are the sampling rates that we later use for classification and validate against the baseline (i.e., by using the same dataset with a sampling rate close to the original sampling rate).

A range of sampling rates is found to be optimal depending on the activities in the dataset. For example, from the smoothed curves the statistical similarity between frames captured at q̂ = 32Hz and Q = 96Hz for the Skoda dataset has S = 0.99, suggesting that 32Hz, i.e., a third of the original 96Hz, should be the optimal sampling rate to use. In the case of the PAMAP2 dataset, relatively higher sampling rates are chosen, q̂ = 56-63Hz (depending on the sensor location) at S = 0.99. This indicates that activities performed in this particular dataset require a relatively higher sampling rate, but again one much lower than the rate at which this data was originally recorded.

Fig. 4: Overall classification performance measured as mean F1-score (left). Absolute difference in performance with varying S in relation to the standard deviation at the original sampling rate (right).
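Before turning to the classification results, the frame-wise statistical features used by the recognition backend can be illustrated with a small sketch. The block below computes only a subset of the listed features for one frame of tri-axial data: the per-axis mean and standard deviation and the three inter-axis correlation coefficients. It is an illustration under assumptions, not the feature set of [8, 30]: the function names are hypothetical, the population standard deviation is an arbitrary choice, and pitch, roll, entropy and energy are left out because their exact definitions are not given here.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def frame_features(frame):
    """frame: list of (x, y, z) samples. Returns 9 values:
    mean and standard deviation per axis, then the xy, xz and yz correlations."""
    axes = list(zip(*frame))
    feats = []
    for axis in axes:
        feats.append(statistics.fmean(axis))
        feats.append(statistics.pstdev(axis))
    feats.append(pearson(axes[0], axes[1]))
    feats.append(pearson(axes[0], axes[2]))
    feats.append(pearson(axes[1], axes[2]))
    return feats
```

Each frame yields one such feature vector, and the vectors of all frames form the input to the classifier.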

4.3. Classification Results

Classification results for the sampling rates q̂ are illustrated in Figure 4 (left), showing the overall performance measured as mean F1-scores and the standard deviation across the 10 folds of the cross-validation. Figure 4 (right) illustrates the absolute difference in performance for the different settings in relation to the standard deviation at the original sampling rate. On all datasets the mean performance of the sampling rate selected at the 0.99 level lies within one standard deviation, except for PAMAP2-ankle and PAMAP2-all. A star indicates that the p-value of a two-sample t-test is below 0.01, highlighting a significant difference in performance. It can be seen that for all datasets (except PAMAP2-ankle) there is no significant difference in performance at S = 0.99. Example confusion matrices for the USC-HAD dataset are shown in Figure 5 (top).

In the case of the PAMAP2 dataset, recognition performance is calculated using a) only a single sensor modality each, i.e., hand, chest, or ankle, and b) all sensors combined (with the highest observed sampling rates, i.e., q̂ = 42Hz at S = 0.95 and q̂ = 63Hz at S = 0.99). When only the ankle sensor is used to perform activity recognition, there are significant differences in performance even at S = 0.99. This is due to the fact that activities in the PAMAP2 dataset cannot be easily classified using a single sensor modality. In practice, all of the available sensors should be used for activity recognition, indicated in Figure 4 as PAMAP2-all. Confusion matrices for the PAMAP2-ankle dataset are shown in Figure 5 (bottom). It is also important to note that we have only used the default set of parameters for the SVM’s RBF kernel for the sake of comparison. These parameters can also be tuned for a given dataset using a grid-search procedure.

Fig. 5: Confusion matrices (annotated class versus predicted class) for the USC-HAD (top row) and PAMAP2-Ankle (bottom row) datasets. Recognition results at S = 0.95, S = 0.99, and S = 1 are shown from left to right.

In all of the datasets, optimal sampling rates are at least 43% and at most 86% less than the original sampling rates (at S = 0.99), with unchanged recognition results. This is of high practical relevance, as all considered scenarios would immediately benefit from substantial reductions in the sampling rates of body-worn accelerometers, leading to substantial savings in energy and resource consumption.
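The relative reductions quoted for S = 0.99 follow directly from the original and optimal sampling rates in Table 1. As a worked check, the sketch below recomputes them; the dictionary simply transcribes the Table 1 values, and the names `TABLE_1` and `reduction_percent` are illustrative.

```python
# (original rate Q in Hz, optimal rate at S = 0.99 in Hz), transcribed from Table 1
TABLE_1 = {
    "Skoda": (96, 22),
    "PAMAP2-Hand": (100, 56),
    "PAMAP2-Chest": (100, 57),
    "PAMAP2-Ankle": (100, 63),
    "USC-HAD": (100, 30),
    "PHealth": (100, 26),
    "Walk8": (250, 35),
}


def reduction_percent(original_hz, optimal_hz):
    """Relative reduction of the sampling rate, in percent of the original rate."""
    return 100.0 * (original_hz - optimal_hz) / original_hz


for name, (Q, q_hat) in TABLE_1.items():
    print(f"{name}: {Q}Hz -> {q_hat}Hz, reduction {reduction_percent(Q, q_hat):.0f}%")
```

The smallest and largest values among the configurations without a significant performance difference are 43% (PAMAP2-Chest) and 86% (Walk8). PAMAP2-Ankle, the one single-sensor configuration for which a significant difference was reported, comes out at 37%.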

4.4. Optimal Sampling Rates in the Recorded Dataset

For the recorded dataset, the optimal sampling rate results are shown in Figure 6. For all of the 4 subjects, the minimum sampling rate selected, at which S = 0.99, is approximately 45Hz, much less than the original 100Hz. Using the optimal sampling rate would roughly double the duration of data that could be stored on-board. In practice, pilot data such as this should be captured before the main data collection. An optimal sampling rate will also greatly affect battery consumption, as higher sampling rates are generally related to higher energy consumption.

5. Summary and Further Work

We have developed and evaluated an effective approach for automatically finding optimal sampling rates for accelerometry-based human activity recognition. It allows us to substantially reduce sampling rates, tailored to particular scenarios, whilst maintaining unchanged recognition accuracy. Reduced sampling rates imply more efficient resource use in real-world deployments of wearable HAR systems, and thus lead to longer runtimes “in the wild”. Given the minimal requirements of our approach it is universally applicable. This was achieved by using an unsupervised statistical approach followed by a standard HAR approach for the classification of human activities in 5 benchmark datasets. In all of these experiments, we found that the datasets were originally recorded at a much higher sampling rate than necessary, as evidenced not only by the statistical similarities in fixed-length time windows but also by the classification results.

There are several other means to reduce the overall sampling rate (and thus battery consumption) that should be considered in future work, including more dynamic methods (on-board the logging device) such as activity-based compressive sensing, i.e., dynamically controlling sampling rates. Similarly, more complex methods for determining the information within the data could also be used, based on either the energy content of the signal or PCA-based thresholding. Our work could supplement these methods by combining dynamic sampling rates with our method to determine an optimal rate at a given time.

Acknowledgment

Parts of this work have been funded by the RCUK Research Hub on Social Inclusion through the Digital Economy (SiDE) project.

References

[1] M. Weiser, The computer for the 21st century, Scientific American.
[2] G. D. Abowd, What next, Ubicomp? Celebrating an intellectual disappearing act, in: Proc. Int. Conf. Ubiquitous Comp. (UbiComp), 2012.
[3] D. Kim, O. Hilliges, S. Izadi, A. D. Butler, J. Chen, I. Oikonomidis, P. Olivier, Digits: Freehand 3d interactions anywhere using a wrist-worn gloveless sensor, in: Proc. UIST, ACM, 2012. doi:10.1145/2380116.2380139.

[Figure 6: four panels (Recorded data: RW1, RW2, LW3, LW4). x-axis: Frequency (q Hz), 0 to 100; y-axis: Distance metric S using a 2-sample KS test (q, Q = 100Hz), 0.5 to 1. Marked points: S = 0.95 at q = 26 to 29Hz, S = 0.99 at q = 44 to 46Hz, and the original rate Q = 100Hz.]

Fig. 6: (a)-(d) show the similarity metric S for each of the 4 subjects. Sampling rates were varied from 4Hz up to Q = 100Hz. The 3 markers show the frequencies for which S = 0.95 and S = 0.99, and the original sampling rate, respectively.


[4] J. Hoey, T. Plötz, D. Jackson, A. F. Monk, C. Pham, P. Olivier, Rapid specification and automated generation of prompting systems to assist people with dementia, PMC 7 (3) (2011) 299–318.
[5] T. Plötz, N. Hammerla, A. Rozga, A. Reavis, N. Call, G. D. Abowd, Automatic Assessment of Problem Behavior in Individuals with Developmental Disabilities, in: Proc. UbiComp, 2012.
[6] N. Hammerla, J. Fisher, P. Andras, L. Rochester, R. Walker, T. Plötz, PD Disease State Assessment in Naturalistic Environments using Deep Learning, in: Proc. AAAI, 2015.
[7] A. Avci, S. Bosch, M. Marin-Perianu, R. Marin-Perianu, P. Havinga, Activity recognition using inertial sensing for healthcare, wellbeing and sports applications: A survey, in: Proc. ARCS, VDE, 2010.
[8] T. Plötz, P. Moynihan, C. Pham, P. Olivier, Activity Recognition and Healthier Food Preparation, in: Activity Recognition in Pervasive Intelligent Environments, Atlantis Press, 2010.
[9] C. Marcroft, A. Khan, N. Embleton, M. Trenell, T. Plötz, Movement recognition technology as a method of assessing spontaneous general movements in high risk infants, Frontiers in Neurology 5 (284). doi:10.3389/fneur.2014.00284.
[10] A. Khan, S. Mellor, E. Berlin, R. Thompson, R. McNaney, P. Olivier, T. Plötz, Beyond activity recognition: Skill assessment from accelerometer data, in: Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp ’15, ACM, New York, NY, USA, 2015, pp. 1155–1166. doi:10.1145/2750858.2807534.
[11] A. Moeller, L. Roalter, S. Diewald, M. Kranz, N. Hammerla, P. Olivier, T. Plötz, GymSkill: A Personal Trainer for Physical Exercises, in: Proc. PerCom, 2012.
[12] C. Ladha, N. Y. Hammerla, P. Olivier, T. Plötz, ClimbAX: skill assessment for climbing enthusiasts, in: Proc. UbiComp, ACM, 2013.
[13] H. Junker, P. Lukowicz, G. Tröster, Sampling frequency, signal resolution and the accuracy of wearable context recognition systems, in: Proc. ISWC, 2004.
[14] X. Yang, A. Dinh, L. Chen, Implementation of a wearable real-time system for physical activity recognition based on naive bayes classifier, in: Bioinformatics and Biomedical Technology (ICBBT), 2010 International Conference on, 2010, pp. 101–105. doi:10.1109/ICBBT.2010.5479000.
[15] A. Jerri, The shannon sampling theorem–its various extensions and applications: A tutorial review, Proceedings of the IEEE 65 (11) (1977) 1565–1596. doi:10.1109/PROC.1977.10771.
[16] P. Lukowicz, J. A. Ward, H. Junker, M. Stäger, G. Tröster, A. Atrash, T. Starner, Recognizing workshop activity using body worn microphones and accelerometers, in: Proc. Pervasive Computing, Springer Berlin Heidelberg, 2004.
[17] R. Chavarriaga, H. Sagha, A. Calatroni, S. T. Digumarti, G. Tröster, J. del R. Millán, D. Roggen, The opportunity challenge: A benchmark database for on-body sensor-based activity recognition, Pattern Recognition Letters 34 (15) (2013) 2033–2042.
[18] M. Bächlin, M. Plotnik, D. Roggen, I. Maidan, J. M. Hausdorff, N. Giladi, G. Tröster, Wearable assistant for parkinson’s disease patients with the freezing of gait symptom, IEEE Trans. Info. Tech. Biomed. 14 (2) (2010) 436–446.
[19] T. Plötz, N. Y. Hammerla, A. Rozga, A. Reavis, N. Call, G. D. Abowd, Automatic assessment of problem behavior in individuals with developmental disabilities, in: Proc. UbiComp, 2012.
[20] C. Ladha, N. Y. Hammerla, P. Olivier, T. Plötz, Climbax: Skill assessment for climbing enthusiasts, in: Proc. UbiComp, 2013.
[21] K. Van Laerhoven, A. K. Aronsen, Memorizing What You Did Last Week: Towards Detailed Actigraphy With A Wearable Sensor, in: Proc. Distributed Computing Systems Workshops, 2007.
[22] H. Piitulainen, M. Bourguignon, R. Hari, V. Jousmäki, MEG-compatible pneumatic stimulator to elicit passive finger and toe movements, NeuroImage 112 (0) (2015) 310–317. doi:10.1016/j.neuroimage.2015.03.006.
[23] X. Qi, M. Keally, G. Zhou, Y. Li, Z. Ren, Adasense: Adapting sampling rates for activity recognition in body sensor networks, in: Real-Time and Embedded Technology and Applications Symposium (RTAS), 2013 IEEE 19th, 2013, pp. 163–172. doi:10.1109/RTAS.2013.6531089.
[24] D. Gordon, J. Czerny, T. Miyaki, M. Beigl, Energy-efficient activity recognition using prediction, in: Proc. ISWC, 2012.
[25] M. Stikic, T. Huynh, K. van Laerhoven, B. Schiele, Adl recognition based on the combination of rfid and accelerometer sensing, in: Proc. PervasiveHealth, 2008.
[26] M. Zhang, A. A. Sawchuk, Usc-had: A daily activity dataset for ubiquitous activity recognition using wearable sensors, in: Proc. UbiComp, 2012.
[27] A. Reiss, D. Stricker, Introducing a new benchmarked dataset for activity monitoring, in: Proc. ISWC, 2012.
[28] T. Stiefmeier, D. Roggen, G. Ogris, P. Lukowicz, P. Lukowicz, Wearable activity tracking in car manufacturing, IEEE Pervasive Computing 7 (2) (2008) 42–50.
[29] G. W. Corder, D. I. Foreman, Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach, John Wiley & Sons, 2009.
[30] A. Bulling, U. Blanke, B. Schiele, A tutorial on human activity recognition using body-worn inertial sensors, ACM Computing Surveys 46 (3) (2014) 1–33. doi:10.1145/2499621.