Activity recognition using wearable sensors is a widely investigated research topic. Recent work in this area combines different sensing modalities, such as capacitive sensors and accelerometers, to improve recognition of more challenging activities. This work presents an approach to classify various activities involving restless leg movements in real-life settings, using machine learning methods on data collected by wearable sensors. The prototype consists of a supervised machine learning model that performs multiclass classification (involving six classes) to recognize leg activities such as kicking, fidgeting, and rubbing, based on combined sensor values from an accelerometer, capacitive sensors, and a gyroscope. Additionally, this work presents an interactive Graphical User Interface (GUI) with the classifier model at the backend. The GUI can load and analyze the input data, and visualize and save the classifier output (the predicted activities).
Human activity recognition has emerged as a powerful means of observing behavioral patterns or indicators in research studies and healthcare monitoring. In this work, restless leg activity recognition is formulated as a multiclass classification problem. The classes representing the various leg activities are (1) kicking, (2) fidgeting, (3) rubbing one leg on another, (4) crossing and uncrossing legs, (5) gas pedal action, (6) flexing the foot against a surface, (7) stretching, and (8) idle. The data is collected by emulating these actions in separate training and testing sessions. A leg band with embedded sensors is worn throughout these activities.
The procedure shown in the conceptual diagram (Figure 1) prepares the raw data for the machine learning model. It involves sensor selection, data acquisition, and feature selection and extraction. For classification, the Random Forest Classifier is used, since in this scenario it gave the best results with very little hyperparameter tuning compared to the other classifiers.
The primary aim of the project is to provide a machine learning software platform for the BIT-RL (Behavioral Indicators Test – Restless Legs) validation study in a nursing home. In the BIT-RL study, the patient is observed for 20 to 30 minutes. The purpose of the observation is to note whether any of the behavioral indicators (in this case, seven different restless leg movements) occur within each 2-minute interval. The software platform provides a means of validating this manual observation process. The patient wears a sensor leg band during the observation, and the data collected by the leg band is given as input to the machine learning model.
The 3-axis accelerometer senses acceleration along three perpendicular axes. By sensing the amount of dynamic acceleration, one can analyze the directional movement of the leg. The accelerometer gives discriminative values that distinguish a leg position close to the ground (as in fidgeting) from a leg position at some height (as in kicking).
Recently, capacitive sensing technologies have been embedded into wearable sensors in combination with accelerometers to enhance the accuracy and application range of accelerometer-based activity recognition systems. In this scenario, the capacitor plates are made of conductive textiles sewn into the fabric. Capacitive sensors can sense the movement of bodies at a distance. The three capacitive sensors in the leg band, placed on the front, right, and left sides of the ankle, show a change in capacitance as the proximity of the other leg changes.
The gyroscope adds a further dimension to the sensing information provided by the accelerometer by tracking rotation: it measures angular velocity. With this additional information about the tilt or lateral orientation of the ankle or leg, the gyroscope helps differentiate activities such as crossing and uncrossing (which involve tilt) from activities such as fidgeting (with no significant tilt).
For the sensor hardware, a leg band consisting of a three-axis accelerometer, three textile-based capacitive sensors, and a three-axis gyroscope is used. The sensor arrangement on the leg band is shown in Figure 2. The capacitive sensors are textile-based, while the accelerometer and gyroscope are embedded on a microcontroller board. The band is worn on the ankle of the right leg. Figure 3 shows a volunteer subject wearing the band. The microcontroller board, to which all the sensors are attached, is Bluetooth enabled. To acquire data from the band, a Bluetooth connection must be established between the PC and the band. Once the connection is established, a Python script collects the data and saves it as CSV files. Figure 4 shows the raw format of the data.
Data is collected in a controlled environment with 4 voluntary participants. Three different testing datasets are collected, each containing 30 minutes of data. The data is acquired at a sampling rate of 25 Hz, which is in the typical range for capturing human actions, fine gestures, and subtle activity differences. Volunteer subjects are instructed to remain seated on a chair or sofa throughout the experiment, and they are told how each activity is to be performed.
The raw data obtained from the capacitive sensors is normalized, because the textile-based sensors might not be well calibrated, which can significantly change the range of capacitance values each time data is acquired.
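The text does not state which normalization is applied. As a minimal sketch, assuming per-session min-max scaling of each capacitive channel (the function name and the choice of min-max scaling are illustrative assumptions, not the authors' stated method), the step could look like this:

```python
import numpy as np

def min_max_normalize(session, eps=1e-12):
    """Rescale each column (sensor channel) of one recording session to [0, 1].

    Assumption: normalization is done per session and per channel, so that a
    shift in the capacitance range between sessions does not reach the model.
    """
    session = np.asarray(session, dtype=float)
    col_min = session.min(axis=0)
    col_range = session.max(axis=0) - col_min
    # eps guards against division by zero for a channel that never changes
    return (session - col_min) / np.maximum(col_range, eps)
```

Scaling each session separately means that two recordings with different absolute capacitance ranges end up on a comparable scale before features are computed.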
For the IMU sensors, the high-frequency component, also known as the AC component, is related to the dynamic motion the user is performing (e.g., kicking or crossing), whereas the low-frequency component, known as the DC component, is related to the gravitational force and can be neglected. In addition to the DC component, the raw sensor data contains a significant amount of noise that is not useful for further analysis. Therefore, a bandpass filter with a passband of 2 to 12 Hz is applied. The filter is a Butterworth bandpass filter of order 6.
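A filter with these parameters can be sketched with SciPy as follows. The use of SciPy, the second-order-sections form, and zero-phase filtering with `sosfiltfilt` are assumptions made for this illustration; the text only specifies the filter type, order, and passband. Note that SciPy's band-pass design with `N=6` produces a filter with 12 poles.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS_HZ = 25.0  # sampling rate stated in the text

def bandpass(signal, low_hz=2.0, high_hz=12.0, order=6, fs=FS_HZ):
    """Butterworth band-pass filter applied along the time axis (axis 0)."""
    # Second-order sections are numerically safer than (b, a) coefficients,
    # especially with the upper edge (12 Hz) close to Nyquist (12.5 Hz).
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering (an assumption) avoids phase distortion.
    return sosfiltfilt(sos, signal, axis=0)
```

For example, filtering a 5 Hz sine wave with a constant offset removes the offset (the gravity-like DC component) while leaving the 5 Hz motion component largely intact.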
Classification is performed on features obtained from the preprocessed data. Features are computed on a sliding window with a constant size of 75 samples (3 seconds at 25 Hz). As every activity takes 1 to 3 seconds to perform once, a 3-second window is long enough to contain a complete instance of an activity rather than only part of one.
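The windowing step can be sketched as below. The window size of 75 samples comes from the text; the step of 8 samples between window starts is an assumption chosen so that roughly three windows begin per second at 25 Hz.

```python
import numpy as np

def sliding_windows(samples, window=75, step=8):
    """Return a list of overlapping windows over the first axis of `samples`.

    window: 75 samples = 3 s at 25 Hz (from the text)
    step:   hypothetical hop size; only full windows are returned
    """
    samples = np.asarray(samples)
    starts = range(0, len(samples) - window + 1, step)
    return [samples[s:s + window] for s in starts]
```

With these defaults, a 100-sample recording yields four windows starting at samples 0, 8, 16, and 24.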
The statistical features calculated on a window of raw data are (1) mean, (2) variance, (3) root mean square, (4) harmonic mean, and (5) skew. The frequency-domain features calculated on a window of raw data are (1) spectral centroid and (2) signal energy. Additionally, one more feature is calculated on the preprocessed data window: the number of peaks above a threshold of 60% of the maximum peak value.
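The features listed above could be computed per window and per channel roughly as follows. This is a sketch under stated assumptions: the exact definitions are not given in the text, so the harmonic mean is taken over absolute values (sensor readings can be negative), the spectral centroid is the magnitude-weighted mean frequency, and the function and key names are hypothetical.

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import skew

def window_features(x, fs=25.0, peak_fraction=0.6, eps=1e-12):
    """Compute the eight features for one window of one sensor channel."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)

    # Count local maxima that reach at least 60% of the largest peak.
    # Assumes the largest peak is positive, as for band-pass filtered data.
    peaks, _ = find_peaks(x)
    if len(peaks):
        n_peaks = int(np.sum(x[peaks] >= peak_fraction * x[peaks].max()))
    else:
        n_peaks = 0

    return {
        "mean": x.mean(),
        "variance": x.var(),
        "rms": np.sqrt(np.mean(x ** 2)),
        # Assumption: harmonic mean of |x|, with eps to avoid division by zero
        "harmonic_mean": len(x) / np.sum(1.0 / (np.abs(x) + eps)),
        "skew": skew(x),
        "spectral_centroid": np.sum(freqs * spectrum) / (np.sum(spectrum) + eps),
        "signal_energy": np.sum(x ** 2),
        "n_peaks": n_peaks,
    }
```

Applying this function to every channel of every window and concatenating the results gives one feature vector per window.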
The selection of relevant features plays an important part in the process of training. A large number of irrelevant features can result in increased training time and overfitting of the model.
An algorithm based on decision trees, such as Random Forest, can be used to estimate the importance of the computed features. The ‘feature_importances_’ attribute of a fitted Random Forest in scikit-learn gives the score of each feature in the form of a vector. A larger score signifies greater importance of the feature. After obtaining the score vector, the features with negligible scores are eliminated from training to improve the computation speed.
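A minimal sketch of this selection step is shown below. The cut-off `min_importance=0.01` and the function name are assumptions for illustration; the text does not say what counts as a negligible score.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def select_features(X, y, min_importance=0.01, random_state=0):
    """Fit a Random Forest and return the indices of features worth keeping.

    min_importance is a hypothetical threshold for a "negligible" score.
    """
    forest = RandomForestClassifier(n_estimators=100, random_state=random_state)
    forest.fit(X, y)
    importances = forest.feature_importances_  # non-negative, sums to 1
    keep = np.flatnonzero(importances >= min_importance)
    return keep, importances
```

The returned indices can then be used to slice the feature matrix, for example `X[:, keep]`, before training the final classifier.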
Commonly used supervised machine learning algorithms for classification tasks are Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Random Forest ensembles (RF), and Stochastic Gradient Descent (SGD) classifiers, where SGD is used for very large training sets (> 100,000 instances). With the present dataset, 10,000 training instances are obtained after feature extraction. Considering the complex task of 8-class classification, an ensemble classifier proved to be a better choice, since ensemble methods combine many weak estimators to form a strong estimator. Random Forest is one of the most popular ensemble algorithms for multiclass classification. The properties that make it a good choice are its easy implementation and minimal hyperparameter tuning. The Random Forest algorithm used in the present model is an ensemble of decision trees trained with the bagging method. The general idea behind bagging is to combine the results of several estimators, each trained on a random sample of the training data, to increase overall performance. The word ‘Random’ here refers to searching for the best feature within a random subset of features when splitting a node.
Random forest is a collection of decision trees. A decision tree makes predictions by testing the attributes (features) of an instance. Given a training set with features, the decision tree algorithm learns a set of decision rules on those attributes.
The Random Forest ensemble classifier creates a collection of decision trees. Each decision tree is trained on a random subset of the total dataset. At each node, one feature is selected to make a decision that separates the instances. The result of each tree is one of the training classes. A majority vote over the classes predicted by all the decision trees gives the final prediction for that instance.
One of the important hyperparameters of the Random Forest is the number of trees in the forest. Increasing the number of trees generally improves accuracy, with diminishing returns and a higher computational cost. Figure 6 shows the concept of the Random Forest ensemble classifier.
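As an illustration of the classifier described above, the following sketch trains a Random Forest with scikit-learn and reports held-out accuracy. The 75/25 split, the default of 100 trees, and the function name are assumptions, not values from the study. Note also that scikit-learn combines the trees by averaging their predicted class probabilities rather than by a hard majority vote.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def train_forest(X, y, n_trees=100, random_state=0):
    """Train a Random Forest and return it with its held-out accuracy."""
    X, y = np.asarray(X), np.asarray(y)
    # Hypothetical 75/25 split, stratified so every class appears in both parts
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state)
    forest = RandomForestClassifier(n_estimators=n_trees,
                                    random_state=random_state)
    forest.fit(X_train, y_train)
    return forest, forest.score(X_test, y_test)
```

Calling `train_forest` with several values of `n_trees` is one simple way to see where additional trees stop improving accuracy on a given dataset.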
In multiclass classification, a larger number of classes tends to decrease the overall accuracy of the model. One commonly used remedy is to distribute the classes using a hierarchical approach, which is used in the present model to improve accuracy. Initially, the model is trained without hierarchy to identify the classes that are confused with one another. The confusion matrix in Figure 11 shows these initial results. Class 3 (‘Rubbing’) and class 4 (‘Crossing’) are confused with each other, and therefore a separate binary classifier is trained at the second level. This binary classifier is trained on the data of classes ‘3’ and ‘4’ only and predicts between those two classes. Its predictions are merged with the rest of the predictions to form the final results. The conceptual diagram of the hierarchical classifier is shown in Figure 7.
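One possible reading of this two-level scheme is sketched below. It assumes that the first-level classifier is trained on all classes and that every instance it labels as class 3 or 4 is passed to the second-level binary classifier, whose answer replaces the first-level label. The text does not spell out this routing, so the routing and the function names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

CONFUSED = (3, 4)  # 'Rubbing' and 'Crossing', as numbered in the text

def fit_hierarchy(X, y, random_state=0):
    """Train the first-level (all classes) and second-level (3 vs 4) forests."""
    X, y = np.asarray(X), np.asarray(y)
    level1 = RandomForestClassifier(random_state=random_state).fit(X, y)
    pair = np.isin(y, CONFUSED)
    level2 = RandomForestClassifier(random_state=random_state).fit(X[pair], y[pair])
    return level1, level2

def predict_hierarchy(level1, level2, X):
    """Predict with level 1, then re-predict the 3/4 cases with level 2."""
    X = np.asarray(X)
    pred = level1.predict(X)
    redo = np.isin(pred, CONFUSED)
    if redo.any():
        pred[redo] = level2.predict(X[redo])
    return pred
```

Because the second-level classifier only has to separate two similar activities, it can specialize on the features that distinguish them.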
The primary aim of the data analysis study is to validate behavioral indicators of patients through a machine learning classifier. The data for the classifier is obtained from wearable sensor leg bands worn by the patients. The proposed model is to be used for the analysis of leg movements of patients in the nursing home. To make the analysis accessible to nursing home staff, a Graphical User Interface (GUI) is built to analyze the data. Figure 8 shows the design of the GUI. The GUI is developed with the ‘Tkinter’ library in Python. It provides the following functionalities:
1) Load File: The button ‘Load File’ uploads the test data in CSV format.
2) Run Analysis: The button ‘Run Analysis’ runs the trained classifier model in the backend to predict the results for the test data. The results are shown as a table in the GUI. In the result table, the columns represent the 7 activities and each row represents a time interval. The commonly used time interval for this study is 2 minutes. For example, if the total observation time is 30 minutes, the result table will have 15 rows of 2 minutes each. Initially, all the entries in the result table are ‘No’. If a subject performs any of the 7 activities within a 2-minute interval, the entry in that row under the column of that activity is updated to ‘Yes’. The number of rows populated after the analysis is dynamic and depends on the total duration of the test dataset. Figure 9 shows the GUI with the result matrix populated after the analysis.
3) Interval: A drop-down menu is provided in the GUI to select the interval of observation. For example, if the selected interval is 3 minutes and the total time span of the test data is 30 minutes, 10 rows will be populated after the analysis.
4) Plot Data: A plotting functionality is included in the GUI for visual analysis of the test data. Figure 10 shows the preprocessed plot of test data collected over 20 minutes.
5) Save file: The button ‘Save file’ saves the result table as a CSV file. The user can choose the location and an appropriate name for the file.
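The result table described under ‘Run Analysis’ and ‘Interval’ can be sketched independently of the GUI. The function name, the short activity names, and the assumption of a fixed number of predictions per second are illustrative; class 8 (idle) is left out of the table because it is not one of the 7 activities.

```python
# Hypothetical column names for the 7 activities (class 8, idle, has no column)
ACTIVITIES = {1: "kicking", 2: "fidgeting", 3: "rubbing", 4: "crossing",
              5: "gas pedal", 6: "flexing", 7: "stretching"}

def build_result_table(predictions, preds_per_second, interval_min,
                       activities=ACTIVITIES):
    """Turn time-ordered class predictions into a Yes/No table per interval."""
    per_interval = int(round(preds_per_second * 60 * interval_min))
    n_rows = -(-len(predictions) // per_interval)  # ceiling division
    # Every entry starts as 'No'
    table = [{name: "No" for name in activities.values()} for _ in range(n_rows)]
    for i, label in enumerate(predictions):
        name = activities.get(label)  # None for idle or unknown labels
        if name is not None:
            table[i // per_interval][name] = "Yes"
    return table
```

With 3 predictions per second, a 30-minute session gives 5,400 predictions, which this function groups into 15 rows for a 2-minute interval or 10 rows for a 3-minute interval.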
Based on the feature files generated from the raw training data, a set of classifiers was trained to analyze the performance of multiclass classification. Initially, a single Random Forest Classifier was trained. Figure 11 shows the confusion matrix for this classifier, whose accuracy is 83.25%. From the confusion matrix, it is evident that classes ‘3’ (Rubbing) and ‘4’ (Crossing) are the classes with the most confusion, whereas the remaining classes are predicted more reliably. Therefore, a second-level classifier is used to separate classes ‘3’ and ‘4’.
Hierarchical classification is a combination of classifiers at different levels. In this case, class ‘3’ (Rubbing) and class ‘4’ (Crossing) are handled together at the second level because these activities are performed in a similar way. At the second level, a Random Forest Classifier is used as a binary classifier, trained on the data of classes ‘3’ and ‘4’. Figure 12 shows the results obtained from the hierarchical classification. The accuracy of this model is 86.12%, an improvement in overall accuracy of around 3 percentage points over the single classifier (83.25%).
As the trained model is to be used in nursing homes for the study of behavioral indicators in dementia patients, the GUI is designed to make the analysis accessible. The primary goal of the study at the nursing home is to keep track of whether a subject has performed any of the seven activities in each 2-minute span of the 30-minute session. A regular confusion matrix is difficult for nursing home staff to interpret for this purpose. The results are therefore shown in the GUI as a binary matrix: if the subject has performed an activity, the entry for that time span is set to ‘Yes’ after the analysis, and all other entries are set to ‘No’.
Here, to improve the onset accuracy of this binary matrix, a maximum vote strategy is used.
In this method, the maximum vote across a 3-second window is taken. As the window length for feature extraction is 3 seconds (75 samples), the mode (the most frequently predicted class) is taken across the same window length. With a window length of 75 samples and a sliding overlap of about 90%, 3 values are predicted every second.
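A minimal sketch of the maximum vote step is given below. It assumes that the vote is taken over consecutive, non-overlapping blocks of 9 predictions (3 seconds at 3 predictions per second); whether the original implementation uses non-overlapping or sliding blocks is not stated, and the function name is hypothetical.

```python
from collections import Counter

def max_vote(predictions, block=9):
    """Replace each block of predictions by its most frequent label.

    block=9 assumes 3 predictions per second over a 3-second window.
    """
    smoothed = []
    for start in range(0, len(predictions), block):
        chunk = predictions[start:start + block]
        # On ties, the label seen first within the chunk wins
        winner = Counter(chunk).most_common(1)[0][0]
        smoothed.extend([winner] * len(chunk))
    return smoothed
```

A single stray prediction inside an otherwise idle block is voted out this way, so it never sets a ‘Yes’ entry in the result table.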
Using the maximum vote strategy, the onset accuracy is improved to 98.10%. The strategy also keeps isolated false predictions from being populated in the GUI, making it easier for the nursing home staff to analyze the results.
In the present work, various techniques to improve the performance of multiclass classification are implemented. The raw data is filtered with a bandpass filter and normalized to avoid the problems caused by unstable calibration of the sensors in the leg band. Time-domain features, frequency-domain features, and raw data features are calculated on a constant-length sliding window of raw data. These features are used in combination according to the relevance of each feature, and the redundant features are eliminated. The model is initially trained with a single Random Forest Classifier to analyze the confusion between classes. The performance of the model is improved by implementing a hierarchical (2-level) classifier with a binary Random Forest Classifier at the second level. The accuracy achieved with the hierarchical classification is 86.12%. The GUI is designed from scratch so that a user can easily compute the predictions for test data. The GUI can load the testing data, run the analysis, display the result table, select the interval, and save the results in CSV format. The result table displayed in the GUI is obtained after taking the maximum vote across the window length. This approach to multiclass classification is useful when training data is limited and the classification task has a high number of classes.