

Decision Tree and Parametrized Classifier for Estimating Occupancy in Energy Management


A new supervised learning approach is proposed to estimate the number of occupants in a room, so that these estimates can be used for improved energy management. It introduces the concept of a parametrized classifier, which relies on the predetermined structure of a supervised learning classifier; any classifier could be used to evaluate the approach. The parameters are adjusted according to the incoming sensor data (e.g. CO2 concentration, acoustic pressure) by a tuning mechanism that depends on an optimization process. This paper considers different supervised learning methods (e.g. decision tree, random forest) to determine the structure required by the parametrized classifier approach.

The decision tree structure has been chosen because it represents the classification rules explicitly, and the depth of the tree is limited to facilitate generalization. To evaluate the generalization ability of a supervised learning approach (here, a decision tree), results are extrapolated from office H358 to a similar office, H355. The knowledge is extracted from a decision tree built on office H358, then applied and tuned for H355 using the parametrized classifier approach. Moreover, experiments that combine occupancy estimation with hot water production control show that energy efficiency can be increased by about 6% over known optimal control techniques and by more than 26% over rule-based control, while maintaining occupant comfort standards. The building efficiency gain is strongly connected with the occupancy estimation accuracy.

Keywords: human behavior, optimization, building performance, office buildings, machine learning, data mining, management and control


As building standards become more constraining, buildings and their appliances are becoming more and more efficient. As a result, consumption related to human activity is relatively much larger than before. In addition, demand response in both electric and heat grids leads to variable tariffs that the occupants of living areas have to take into account in their everyday life. Occupants' presence, number and activities now matter, and the picture is becoming much more complex than before. Designing a new building should focus not only on building physics and HVAC systems, but also on human behavior, which relies on energy management and monitoring systems (EMMS). These systems can provide advice and information to occupants about the relevance of their behavior regarding the current state of a dwelling and its connected grids.

In addition, they should modify the dwelling settings accordingly. Therefore, advanced EMMS need to estimate the number and activities of occupants, and building simulation has to take them into account in order to consider EMMS at the design step, reducing the so-called performance gap with reality. Nevertheless, human behavior is of interest not only during the design step but also during operation. It is indeed useful for diagnostic analyses, to discriminate human misbehavior from building system performance, and for energy management, where strategies depend on human activities and, in particular, on the number of occupants in a zone. Research on buildings has recently turned to investigating occupant behavior, and this paper tackles this issue. It proposes an occupancy estimation approach based on a parametrized classifier using a predetermined "if-then" structure from a decision tree.


Occupants' behavior is one of the major influences on building energy consumption. (Honga et al., 2015) introduced methods for modeling occupant behavior and quantifying its impact on building energy use. The major themes include advancements in data collection techniques, analytical and modeling methods, and simulation applications, which provide insights into behavior-related energy savings potential and impact. There is a large gap between the predicted energy demand and the actual consumption once the building is in use. One cause could be that occupant behavior might not fit with the energy concept and thus cause counterproductive effects (JKarin Schakib-Ekbatan, 2015). (Ebadat et al., 2013) focuses on how to estimate the number of occupants in a room by processing CO2 concentration, temperature and HVAC actuation levels in order to identify a dynamic model. In (Dong et al., 2010), hidden Markov models have been used for estimating occupancy using a wireless ambient sensing system as well as wired carbon dioxide sensors and a wired camera network in order to establish actual occupancy levels.

The most popular approach for determining a person's posture, motions and activity is to use external tracking methods that employ cameras, RF beacons or similar sensors that monitor the body or markers on the body (Yang Song and Pietro Perona, 2000). This method is fairly precise, but also the most demanding in terms of setting up the infrastructure, maintaining the hardware, algorithmic complexity and privacy issues. The problem of real-time estimation of occupancy in a commercial building has also been investigated in (Liao and Barooah, 2010), where merging sensor data with model predictions was essential. Additionally, real-time estimation of building occupancy is extremely valuable during emergency egress. In (Tomastik et al., 2010), an extended Kalman filter, which combines sensor readings and a dynamic stochastic model of people movements, was used. However, for various applications such as activity recognition or context analysis within a larger office space, information regarding the presence or absence of people is not sufficient, and an estimate of the number of people occupying the space is essential. (Lam et al., 2009) investigates this problem in open offices, estimating occupancy and human activities using a multitude of ambient information, and compares the performance of HMMs, SVMs and artificial neural networks.

However, none of these methods generates human-understandable rules, which may be very helpful to building managers. Fine-grained knowledge of human occupancy in buildings can result in better control of HVAC and hot water systems (Kazmi et al., 2017). Perhaps the earliest practical example of such active control can be found in the 'Neural Network house', which used extensive data logging to control HVAC and hot water production. Such levels of data gathering are usually impractical because of both economic and privacy concerns. Nevertheless, HVAC efficiency gains using occupancy estimates are now well documented, with up to 20% savings demonstrated, and up to 15% in Agarwal (2010). In general, an occupancy count algorithm that fully exploits information available from low-cost, non-intrusive environmental sensors and provides meaningful information is an important yet little explored problem in office buildings.


A new methodology for occupancy estimation has been investigated using a parametrized classifier approach. A parametrized classifier is a process that depends on a predetermined structure for occupancy estimation (e.g. a decision tree). It uses a predetermined classifier structure with parameters to be adjusted according to the incoming sensor data. The tuning problem can be solved by adjusting the classifier parameters (i.e. the node thresholds of the decision tree) in the final structure according to each updated record set and how much it differs from the previous one. An objective function is defined to minimize the distance between the actual number of occupants in the room (coming from the camera) and the estimated number (coming from the classifier). Optimization covers a required training period in the studied area. Any classifier could be used in this approach, but it is important to choose a general structure for the sake of adaptability. Additionally, the number of parameters should be low, because the tuning mechanism relies on an optimization process that may become inefficient when complexity increases.
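The tuning idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two features (acoustic pressure and motion), the leaf values, the toy data and the plain random perturbation search are all hypothetical choices made for the example, not the structure or optimizer reported by the authors.

```python
import random

# Hypothetical occupancy value attached to each leaf of the fixed structure.
LEAF_VALUES = {("low", "low"): 0.0, ("low", "high"): 1.0,
               ("high", "low"): 2.0, ("high", "high"): 3.0}


def estimate(sample, thresholds):
    """Apply the fixed if-then structure; only the thresholds are tunable."""
    acoustic, motion = sample
    t_acoustic, t_motion = thresholds
    a_level = "high" if acoustic > t_acoustic else "low"
    m_level = "high" if motion > t_motion else "low"
    return LEAF_VALUES[(a_level, m_level)]


def objective(thresholds, samples, actual):
    """Mean absolute distance between actual and estimated occupancy."""
    total = sum(abs(estimate(s, thresholds) - y) for s, y in zip(samples, actual))
    return total / len(actual)


def tune(samples, actual, start, step=0.1, iterations=2000, seed=0):
    """Random perturbation search: keep a candidate when it is no worse."""
    rng = random.Random(seed)
    best = tuple(start)
    best_error = objective(best, samples, actual)
    for _ in range(iterations):
        candidate = tuple(t + rng.uniform(-step, step) for t in best)
        error = objective(candidate, samples, actual)
        if error <= best_error:
            best, best_error = candidate, error
    return best, best_error


# Toy data: (acoustic pressure, motion) pairs with a "camera" occupant count.
samples = [(0.1, 0.2), (0.4, 0.2), (0.1, 0.9), (0.4, 0.9)]
actual = [0.0, 2.0, 1.0, 3.0]
tuned, error = tune(samples, actual, start=(0.5, 0.5))
print(tuned, error)
```

Because only two thresholds are tuned, the search space stays small, which matches the remark that the number of parameters should be kept low.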


To find the number of occupants, a relation has to be discovered between the office environment and the number of people in it. The office environment can be represented as a set of state variables, $A_t = [A_1, A_2, \ldots, A_m]_t$. This set of state variables $A$ at any instant of time $t$ must be indicative of occupancy. A state variable can be termed a feature, and therefore the set of features a feature vector. Similarly, the m-dimensional space that contains all possible values of such a feature vector is the feature space. The underlying approach for the experiments is to formulate the classification problem as a map from a feature vector into some feature space that comprises several classes of occupancy or activities. Therefore, the success of such an approach heavily depends on how good the selected features are. In this case, features are attributes from multiple sensors accumulated over a time interval. The choice of interval duration is highly context dependent, and has to be made according to the granularity required. Features are the information extracted from the data, e.g. acoustic pressure from a microphone, time slot, occupancy from power consumption, door or window position, motion counting, day type and indoor temperature.
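As a rough illustration of how sensor attributes accumulated over a time interval become feature vectors, the following Python sketch groups event-based readings into fixed slots. The reading format, the function name `build_feature_vectors`, the 1800-second slot length and the use of a per-slot mean are assumptions made for the example.

```python
from collections import defaultdict


def build_feature_vectors(readings, interval_s=1800):
    """Group (timestamp_s, sensor_name, value) readings into time slots.

    Returns {slot_index: {sensor_name: mean value in that slot}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, sensor, value in readings:
        slot = int(timestamp // interval_s)
        sums[slot][sensor] += value
        counts[slot][sensor] += 1
    return {
        slot: {s: sums[slot][s] / counts[slot][s] for s in sums[slot]}
        for slot in sorted(sums)
    }


# Hypothetical readings: two motion events and one acoustic sample in the
# first slot, one motion event in the second slot.
readings = [
    (0, "motion", 1.0),
    (900, "motion", 0.0),
    (100, "acoustic", 0.2),
    (1800, "motion", 1.0),
]
vectors = build_feature_vectors(readings)
print(vectors)
```

A shorter interval gives finer granularity but fewer readings per slot, which is the trade-off behind the context-dependent choice of interval duration.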

One quantitative measure of the usefulness of a feature is information gain, which depends on the concept of entropy (Arora et al., 2015). Information gain helps to identify, among a large set of features, those most worthwhile to consider for occupancy estimation. A supervised learning approach has been used. Occupancy was determined before applying a classification algorithm: occupancy counts were manually annotated using a video feed from two cameras strategically positioned in an office, in order to simulate the occupant replies, determine the structure of the parametrized classifier and validate interactive learning results. The following supervised learning methods have been chosen:

A. Decision tree with parametrized classifier

The decision tree classification technique has been selected because it provides very good results and these results are easy to analyze and adapt. The decision tree algorithm selects a class by descending a tree of decision nodes. Each internal node represents a comparison of a single feature value with a learned threshold. The aim of the decision tree algorithm is to select the features that are most useful for classification.

    if Xi ≤ threshold then
        left child node
    else
        right child node
    end if

The decision tree, one of the most important supervised learning methods for activity recognition, is used to evaluate the parametrized classifier approach. According to our earlier work (Amayri et al., 2016), decision trees give human-readable results which can be analyzed and easily adapted by building managers. Additionally, the possibility of limiting the depth of the tree in order to simplify the analysis of the "if-then" rules enables users to quickly extract useful information about occupancy estimation.
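To make the information gain criterion and the single-feature threshold test more concrete, here is a minimal Python sketch. The function names and the toy data are assumptions for illustration; the candidate thresholds are taken as midpoints between consecutive distinct feature values, which is one common choice and not necessarily the one used by the authors.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing brings no information
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; keep the best gain."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates, key=lambda t: information_gain(values, labels, t))


# Toy data: a motion feature and a binary occupancy label.
motion = [0.1, 0.2, 0.8, 0.9]
occupied = [0, 0, 1, 1]
print(best_threshold(motion, occupied))
```

Repeating this search over every candidate feature and keeping the split with the highest gain is what lets a tree rank features by usefulness.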


The case studies are carried out in two similar offices, H358 and H355, located at Grenoble Institute of Technology (figure 1). The maximum number of occupants in both offices is six. Office H358 has more frequent visits, with many meetings and presentations throughout the week, compared to office H355, which is mostly limited to the presence of its three official student occupants. The sensor network setup in both offices is illustrated in figure 1 and includes:

  • Video cameras for recording real occupancy numbers and activities.
  • An ambiance sensing network, which measures temperature, relative humidity (RH), motions, CO2 concentration, power consumption of 3 laptops, door and window positions, and acoustic pressure from a microphone. Data are sent using the ENOCEAN protocol on significant value change events.
  • A centralized database with a web application for continuously retrieving data from different sources.

Fewer sensors are installed in office H355 than in H358, but the best sensors for the occupancy estimation model according to (Amayri et al., 2016) are still available.

A. Average error versus accuracy

Generally, for machine learning classifiers, the estimated results are validated by considering the accuracy (precision). In this research, the concept of average error is proposed instead, because of the dependency between the levels of occupancy (see figure 2) and because occupancy takes floating values when it is collected over a time quantum (i.e. 30 minutes, see figure 3) to build the required training data. The error is the distance between actual points and estimated points. Indeed, average error is more informative than accuracy for validating occupancy estimation in buildings: average error takes into account how large the deviation in occupancy is, while accuracy only considers whether the estimated occupancy level is correct or not.

B. Defining the occupancy level

In this section, a method for choosing the number of levels (L) of occupancy for classification purposes is discussed. This number is not fixed and can be changed in accordance with the required average error. The levels of occupancy can be defined by applying K-means clustering, which is a popular data clustering algorithm. However, one of its drawbacks is that the number of clusters k must be specified before the algorithm is applied, and the performance of a clustering algorithm may be affected by the chosen k value. Therefore, instead of using a single predefined k, a set of values might be adopted. It is important for the number of values considered to be reasonably large, to reflect the specific characteristics of the data sets. At the same time, the selected values have to be significantly smaller than the number of objects in the data sets, which is the main motivation for performing data clustering. In our case study, different values of k were used, i.e. 2 to 6, and the value giving the least average error in occupant estimation was chosen as the best k.
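The difference between the two validation measures, and the way K-means can turn floating occupancy values into level centers, can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the function names, the toy occupancy values, the absolute difference used as the distance, and the simple one-dimensional Lloyd iteration with centers initialized at evenly spaced order statistics.

```python
def accuracy(actual, estimated):
    """Fraction of estimates that match the actual level exactly."""
    hits = sum(1 for a, e in zip(actual, estimated) if a == e)
    return hits / len(actual)


def mean_absolute_error(actual, estimated):
    """Average distance between actual and estimated occupancy."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)


def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalar values; returns the sorted cluster centers."""
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


# Two estimators with the same accuracy but very different average error.
actual = [0, 1, 2, 4]
near_miss = [0, 1, 2, 3]
far_miss = [0, 1, 2, 0]
print(accuracy(actual, near_miss), mean_absolute_error(actual, near_miss))
print(accuracy(actual, far_miss), mean_absolute_error(actual, far_miss))

# Floating occupancy values (e.g. averaged over a time quantum) clustered
# into two levels.
occupancy = [0.0, 0.1, 0.0, 2.0, 2.1, 1.9]
print(kmeans_1d(occupancy, k=2))
```

Both estimators are right three times out of four, so accuracy cannot separate them, whereas the average error shows that the second one misses by four occupants instead of one. Selecting k as in the text would mean repeating the clustering for k from 2 to 6 and keeping the value whose classifier yields the lowest average error.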


Figure 4 shows the average errors associated with each number of levels when applying the decision tree and random forest procedures. Accordingly, 5 levels of occupancy was the best option for occupancy classification, and this choice is compatible with K-means clustering, as shown in figure 4. Table 1 shows the proposed centers for each level from the K-means method.

A depth of 2 is the limit chosen for the subsequent analysis of occupancy estimation, because of the low average error of the resulting decision tree and the small number of thresholds to adjust. Additionally, the tree is readable and the rules are quite general. Note that the "if-then" rules can now easily be extracted from the tree structure and applied in a tuning context. A depth-limited decision tree classifier has therefore been selected. The number of state variables has been limited to two (acoustic pressure and motion) to facilitate the optimization mechanism.

Figure caption: Thresholds after adaptation by the optimization method, with the sensor data distribution (acoustic pressure threshold = 0.33, motion detector threshold = 0.7).

The tuning problem can be solved by adjusting the classifier parameters (the node thresholds of the decision tree) in the final structure according to each updated data set and how much it differs from the previous one. The labels low/high in the structure refer to the thresholds, which have been determined by the decision tree. An objective function is defined to minimize the distance between the actual number of occupants in the room (coming from the video camera during the training period) and the estimated number (coming from the classifier). Optimization covers a required training period. The thresholds predetermined by the decision tree in office H358 are 0.033 for acoustic pressure and 0.5 for the motion detector. New data arrive at each time quantum of 30 minutes (office H355), and an optimization method, here basin-hopping, then adapts these thresholds to the new data. Basin-hopping is a stochastic algorithm suited to nonlinear and possibly non-convex problems; it attempts to find the global minimum of a smooth scalar function of one or more variables, here the objective function (the distance between actual and estimated occupancy in the room).

Figure caption: Distribution of the parametrized classifier error due to the random initial parameters used by the optimization method, with an average error of 0.26 (H355 case study).

To validate the method, a dataset covering one month from 01-March-2016 has been used. The parametrized classifier for occupancy estimation has been run 100 times to show the distribution of the error, because the optimization method (basin-hopping) uses random initial parameter values, as shown in figure 8. The new thresholds have changed slightly compared to H358, i.e. 0.022 for acoustic pressure and 0.7 for the motion detector, as can be seen in figure 7. The acoustic pressure threshold decreased because the windows are better installed in office H355 than in H358: in H358 there is a poorly performing window connected to the corridor. The threshold of the motion detector increased slightly because its position for detecting all users in office H355 is less effective than the one in office H358. The estimation is almost similar in both offices, with an average estimation error of almost 0.19 in H358, as can be seen in figure 5. The parametrized classifier method is a good approach for a similar context, but it is difficult to extend to a different environment (e.g. a residential area), while the decision tree is more general.

A. Energy efficient hot water production using occupancy estimates

Typically, hot water production systems consume energy through regular reheat cycles following a naïve rule-based controller. Here $a_t$ refers to the control action of the heating element at time $t$: when it is set to 1, it reheats the storage vessel, and it remains idle when the mode is 0.

This decision is based on temperature sensor information from the storage vessel, $T_s$, which is compared against a threshold $T_{tg} - \Delta T$, where $T_{tg}$ is the temperature to which the vessel is reheated during a reheat cycle, and $\Delta T$ is the amount by which the storage vessel temperature is allowed to fall relative to $T_{tg}$ before a reheat cycle is initiated. This behaviour can be optimized by making the reheat behaviour dependent on the remaining energy content in the storage vessel and the (predicted) behaviour of the human occupant:

$$a_t = \begin{cases} 1, & \text{if } W_t < \sum_{k=i}^{i+n} W_k + M \\ 0, & \text{if } W_t \geq \sum_{k=i}^{i+n} W_k + M \end{cases}$$

Here $W_t$ is the hot water volume left in the vessel, defined here as the amount above 45 °C. Since the energy content in the storage vessel is not directly observable, $W_t$ can only be approximated using a model of the storage vessel. This model can be built using thermodynamic knowledge of the storage vessel or can be learned directly from sensor data. The model then defines the remaining volume of hot water in the storage vessel, which is compared against $\sum_{k=i}^{i+n} W_k$, the amount of hot water predicted to be consumed over the next $n$ time steps.

Furthermore, a safety margin $M$ is introduced in the control formulation to account for unexpected draws caused by stochastic human behaviour; it is usually taken as a fixed threshold value. This greedy approach to reheating the storage vessel belongs to the family of just-in-time control strategies, where both late and early heating are penalized. In doing so, it improves the energy efficiency of hot water production while maintaining occupant comfort. This strategy is demonstrably optimal for climate-agnostic heating elements such as electric resistance or gas boilers, given no additional knowledge about human behaviour. For heat pumps, where ambient temperature affects the efficiency, it performs sub-optimally (since an earlier reheat cycle might benefit from a higher ambient temperature), thereby necessitating more sophisticated optimization strategies such as metaheuristics for planning.
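The two control rules discussed in this section can be contrasted with a small Python sketch. The function and parameter names are hypothetical, the numeric values are invented for the example, and the rule-based controller is assumed to switch on when the storage temperature falls below $T_{tg} - \Delta T$, which is how the threshold comparison in the text is read here.

```python
def rule_based_action(t_storage, t_target, delta_t):
    """Naive rule: reheat (1) once the vessel drops below T_tg - dT, else idle (0).

    Assumption: the comparison direction is inferred from the description of
    the threshold T_tg - dT; it is not given as an explicit formula here.
    """
    return 1 if t_storage < t_target - delta_t else 0


def occupancy_aware_action(w_remaining, predicted_draws, margin):
    """Just-in-time rule: reheat (1) only when the remaining hot water W_t
    cannot cover the predicted consumption over the horizon plus the margin M.
    """
    return 1 if w_remaining < sum(predicted_draws) + margin else 0


# Hypothetical values: litres of hot water above 45 degrees C, three-step horizon.
predicted = [10.0, 10.0, 5.0]
print(rule_based_action(t_storage=50.0, t_target=60.0, delta_t=5.0))
print(occupancy_aware_action(w_remaining=40.0, predicted_draws=predicted, margin=10.0))
print(occupancy_aware_action(w_remaining=30.0, predicted_draws=predicted, margin=10.0))
```

In this toy case the rule-based controller reheats as soon as the temperature threshold is crossed, while the occupancy-aware rule stays idle as long as the vessel still holds enough hot water for the predicted draws, which is where the energy saving would come from.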


In this paper, a methodology has been proposed and implemented to validate the occupancy estimation model when manual labeling is done by an expert. The methodology starts by collecting sensor data. The concepts of decision tree and parametrized classifier have then been used to determine the most relevant sensors for the occupancy estimation model. Test data have been collected to assess the performance of the estimation model in terms of average error. Experiments integrating occupancy estimation and hot water production control have also been carried out. They show that energy efficiency can be increased by up to 5% when occupancy estimates are incorporated into the optimal control formulation. This paper presented important results in estimating occupancy and reducing energy demand and, in doing so, it contributes to the growing body of literature illustrating the energy-occupancy nexus. The work can be extended towards activity recognition in an office context.
