Feature Selection Technique in The Network Traffic Dataset

Categories: Cyber Security Identity Theft Network Security

Published: Jul 15, 2020

Security is a major concern in today's digital world. Use of the internet, computers, mobile phones, and tablets has become ubiquitous, and cyber-attacks have grown rapidly alongside it. Common kinds of cyber-attack include spoofing, sniffing, denial-of-service, phishing, evil twins, pharming, click fraud, and malware. Malicious software is harmful to both individual computers and networks. Cyber-attacks compromise systems, steal valuable information, and destroy important infrastructure, producing vast losses; each incident costs $345 on average.

The growth of internet use is not the only source of digital threats; the rising number of new malware samples is another. More than 317 million new pieces of malware were created in 2014. Conventional anti-virus software and intrusion detection systems cannot detect zero-day attacks. According to the Symantec Internet Security Threat Report 2010, more than 5 million malware samples were circulating on the internet. As a result, security specialists are devoted to developing efficient malware detection methods. In this work we describe several feature selection techniques for detecting malware in a network traffic dataset using machine learning algorithms, because feature selection is a very important task in malware detection. Malware can be detected through static and dynamic features. Anti-virus software is built on malware signatures, so it fails when a zero-day malware attack occurs. A malware detection system captures a network traffic dataset in order to distinguish malware from goodware, that is, suspicious activity from normal activity.

A network traffic dataset contains many packets, each described by a large number of features. Some features may be very important, but others may not be relevant to the decision. Irrelevant features increase processing time and decrease the efficiency of the malware detection system. The main purpose of a feature selection technique is therefore to reduce the dimensionality of the feature space by removing redundant and irrelevant features from the network traffic dataset.
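As a rough illustration of removing irrelevant and redundant features, the sketch below applies two simple filter steps: it drops features that never vary, then drops any feature that is almost perfectly correlated with one already kept. The feature names, the sample flow records, and both thresholds are hypothetical and chosen only for the example; they do not come from any of the studies discussed here.

```python
import math

def variance(col):
    """Population variance of one feature column."""
    mean = sum(col) / len(col)
    return sum((x - mean) ** 2 for x in col) / len(col)

def pearson(a, b):
    """Pearson correlation of two columns; 0.0 if either is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if sa == 0 or sb == 0:
        return 0.0
    return cov / (sa * sb)

def select_features(rows, names, min_variance=1e-9, max_corr=0.95):
    """Return the names of the features kept after both filter steps."""
    columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}
    # Step 1: drop irrelevant features that carry no information (constant).
    varying = [n for n in names if variance(columns[n]) > min_variance]
    # Step 2: drop redundant features, i.e. those highly correlated
    # with a feature that has already been selected.
    selected = []
    for n in varying:
        if all(abs(pearson(columns[n], columns[s])) < max_corr for s in selected):
            selected.append(n)
    return selected

# Hypothetical flow records (assumed feature names, made-up values).
names = ["duration", "bytes_sent", "packets_sent", "protocol_flag"]
rows = [
    [1.0, 100.0, 10.0, 1.0],
    [2.0, 250.0, 25.0, 1.0],
    [0.5, 40.0, 4.0, 1.0],
    [3.0, 90.0, 9.0, 1.0],
]
selected = select_features(rows, names)
print(selected)  # ['duration', 'bytes_sent']
```

Here protocol_flag is dropped because it is constant, and packets_sent is dropped because it is an exact multiple of bytes_sent. A filter of this kind ignores the class label; label-aware criteria are a common alternative.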

Many approaches have been developed to address the growing volume of malware that emerges every day. Hansen et al. introduced an approach based on a Random Forests classifier for detecting and classifying the large volume of malware that comes from known or unknown malware families. This approach reduces the feature space considerably. Cuckoo sandbox was also used to obtain behavioral traces of the analyzed samples, which helped achieve a high malware detection rate and accurate family classification.

Tian et al. used logs of API calls to distinguish malware from cleanware by scrutinizing behavioral features. Their work addressed both malware family classification and detection by applying pattern recognition algorithms in a virtual environment. They achieved approximately 97% accuracy on a dataset of 1,368 malware and 456 cleanware samples. Another study discussed the applicability of a sandbox environment for obtaining the run-time behavior of malware. That work differentiates malware using a heuristic method termed N-gram analysis and adopts the Information Gain feature selection technique to choose the best features for classification. Cuckoo sandbox examines the behavior of malware running on a virtual machine. The authors found that SPegasos achieved the highest accuracy and the best detection rate across feature lengths of 200, 400, and 600.
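The combination of N-gram analysis and Information Gain described above can be sketched as follows. Each API call trace is turned into a set of n-grams, each n-gram is treated as a binary feature (present or absent), and features are ranked by how much they reduce the entropy of the class label. The API call names, the four traces, and the labels are hypothetical, and this is only a minimal sketch of the general technique, not the cited study's implementation.

```python
import math
from collections import Counter

def ngrams(seq, n):
    """Set of contiguous n-grams (as tuples) found in one call sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(present, labels):
    """H(labels) minus H(labels | feature) for one binary feature."""
    with_f = [y for p, y in zip(present, labels) if p]
    without_f = [y for p, y in zip(present, labels) if not p]
    n = len(labels)
    conditional = ((len(with_f) / n) * entropy(with_f)
                   + (len(without_f) / n) * entropy(without_f))
    return entropy(labels) - conditional

def top_ngrams(samples, labels, n=2, k=3):
    """Rank every n-gram by information gain and return the best k."""
    grams = [ngrams(s, n) for s in samples]
    vocab = sorted(set().union(*grams))
    scored = [(information_gain([g in gs for gs in grams], labels), g)
              for g in vocab]
    # Highest gain first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:k]

# Hypothetical API call traces: 1 = malware, 0 = cleanware.
samples = [
    ["open", "write", "connect", "send"],
    ["open", "write", "connect", "close"],
    ["open", "read", "close"],
    ["open", "read", "write", "close"],
]
labels = [1, 1, 0, 0]

best = top_ngrams(samples, labels, n=2, k=3)
for score, gram in best:
    print(f"{score:.3f}", gram)
```

In this toy data the three selected bigrams each separate the two classes perfectly, so they all receive the maximum gain of one bit. The parameter k plays the role of the feature length mentioned above.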

Another group of authors proposed a bilayer abstraction method based on dynamic analysis of API sequences for malware detection. Behavioral features are abstracted into low-layer and high-layer behavior. They also proposed an enhanced support vector machine named OC-SVM Neg, which makes use of available benign software samples to improve the false alarm rate. A total of 14,863 malware and 2,623 benign programs were collected from VXHeaven and Malheur. This work reported good results in detecting unknown malware.

Santos et al., on the other hand, developed a hybrid malware detector that detects unknown malware using features obtained both statically and dynamically. To test their system, they collected malware and benign programs from two different sources: VXHeaven for the malware samples and their own setup for the benign programs. Their feature vector was built from opcode sequences, system calls, exceptions, and similar attributes. This hybrid approach is efficient at extracting features both statically and dynamically.

Another study introduced a supervised system for detecting malware. The authors extracted 972 behavioral features from different observation areas and used Naïve Bayes, decision tree (J48), and random forest as machine learning algorithms to reach a decision. According to that paper, unknown malware could be detected within one month if static rules were pre-defined by Snort or Suricata systems.

Fukishima et al. implemented a prototype for malware detection. The authors evaluated suspicious process behavior on Windows OS in order to avoid false positives. This behavior-based method detected malware with about 60% accuracy and no false positives. They used 83 malware and 41 goodware samples for the evaluation.

Nari et al. proposed an automated method for classifying malware based on its network activity. They created a behavioral graph that characterizes not only a sample's network behavior but also the dependencies between its network flows. This method was efficient for classifying malware samples.

Other authors presented a data mining technique for detecting new malicious executables. Three types of features were used for feature extraction: Portable Executable (PE) features, byte-sequence n-grams, and string features. Their dataset consists of 3,265 malware and 1,001 clean programs, 4,266 programs in total. For malware classification they also used a multi-Naïve Bayes method, which achieved the highest detection rate, 97.76%, on unfamiliar programs.

In another study, the authors developed an efficient malware classification technique based on the string information contained in executables. They extracted printable strings from 1,367 samples comprising viruses, unpacked Trojans, and clean files. They achieved 97% classification accuracy on unpacked malicious files under k-fold cross-validation, and also used Random Forest as an effective classifier.
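Extracting printable strings from an executable and turning them into features could look roughly like the sketch below. It scans raw bytes for runs of printable ASCII characters of a minimum length, then encodes each file as a binary vector over the combined string vocabulary. The byte blobs, the minimum length of 4, and the binary encoding are assumptions made for illustration; the study's exact procedure is not described in this text.

```python
import re

def printable_strings(data: bytes, min_len: int = 4):
    """Return runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]

def to_feature_vectors(string_lists):
    """Encode each file as 0/1 flags over the sorted string vocabulary."""
    vocab = sorted({s for strings in string_lists for s in strings})
    vectors = []
    for strings in string_lists:
        found = set(strings)
        vectors.append([1 if v in found else 0 for v in vocab])
    return vocab, vectors

# Hypothetical byte blobs standing in for two executables.
blob_a = (b"\x00\x01MZ\x90\x00kernel32.dll\x00\xff\xfeCreateFileA\x00"
          b"ab\x00http://example.invalid/x\x00")
blob_b = b"\x00kernel32.dll\x00ReadFile\x00"

strings_a = printable_strings(blob_a)
strings_b = printable_strings(blob_b)
vocab, vectors = to_feature_vectors([strings_a, strings_b])

print(strings_a)  # ['kernel32.dll', 'CreateFileA', 'http://example.invalid/x']
print(vocab)
print(vectors)
```

Short runs such as "MZ" and "ab" fall below the minimum length and are ignored. The resulting vectors could then be passed to any classifier, with k-fold cross-validation used to estimate accuracy.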

R. Islam et al. introduced a classification system that integrates static and dynamic features. For this work they assembled two datasets: the first was collected between 2003 and 2007, and the second between 2009 and 2010. Using a Random Forest classifier, they achieved an accuracy of 97%.

Ahmed et al. combined two kinds of dynamic features, drawn from the spatial and temporal information available in run-time API calls, to detect malware in a sandbox. They achieved a classification accuracy of 96.3% using 516 executable files. Similarly, Wagener et al. executed a small number of malware files (104) to generate lists of API calls and then calculated the similarity between pairs of API call sequences using a similarity matrix. They reported 93% accuracy.
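One way to picture a similarity matrix over API call sequences is sketched below. The text does not say which similarity measure was used, so this example assumes a normalized edit distance: the number of insertions, deletions, and substitutions needed to turn one trace into another, scaled by the longer trace's length. The traces and call names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of API call names."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def similarity(a, b):
    """1.0 for identical traces, 0.0 for traces with nothing aligned."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest

def similarity_matrix(traces):
    """Pairwise similarity of every trace against every other trace."""
    return [[similarity(a, b) for b in traces] for a in traces]

# Hypothetical API call traces from three executed samples.
traces = [
    ["open", "write", "connect", "send"],
    ["open", "write", "connect", "close"],
    ["open", "read", "close"],
]
matrix = similarity_matrix(traces)
for row in matrix:
    print(row)
# [1.0, 0.75, 0.25]
# [0.75, 1.0, 0.5]
# [0.25, 0.5, 1.0]
```

The matrix is symmetric with ones on the diagonal, so samples whose rows contain high off-diagonal values can be grouped as behaving alike.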

This essay was reviewed by

Dr. Oliver Johnson

Cite this Essay

Feature Selection Technique In The Network Traffic Dataset. (2020, July 14). GradesFixer. Retrieved July 16, 2026, from https://gradesfixer.com/free-essay-examples/feature-selection-technique-in-the-network-traffic-dataset/
