Early Bird (before 30th April): SGD2999
Normal (after 1st June): SGD3899
Seats Available: 8
Overview
Making & Breaking Machine Learning Systems is a fast-paced session on machine learning from the infosec professional’s point of view. The class is designed to give students a hands-on introduction to machine learning concepts and systems, as well as to making and breaking security applications powered by machine learning.
The lab session is designed with security use-cases in mind, since using machine learning in security is very different from using it in other situations. Students will get first-hand experience cleaning data, implementing machine learning security programs, and performing penetration tests of these systems.
Each attendee will be provided with a comprehensive virtual machine programming environment that is preconfigured for the tasks in the class, as well as for any machine learning experimentation and development they do afterwards. This environment consists of the most essential machine learning libraries and programming environments, and is friendly even to machine learning novices.
At the end of the class, students will be put through a CTF challenge that tests, in a realistic environment, the machine learning development and exploitation skills they have learned during the course.
Key learning objectives:
Familiarizing yourself with popular machine learning algorithms and how to adapt these for different problems
How to clean and sanitize data using powerful data processing libraries in Python
How to build a spam classifier and online anomaly detection system in Python
How to do performance evaluations of machine learning classifiers
Examples for using machine learning in intrusion detection, botnet detection, phishing detection, web vulnerability analysis, malware classification, and behavioural analysis
Perform tuning of machine learning systems to improve classification/detection results
Perform security evaluations and penetration tests on machine learning systems
Fuzzing machine learning classifiers
How to avoid vulnerabilities in machine learning system and algorithm design
How to use Apache Spark to design scalable and distributed real-time machine learning systems
Write your own machine learning captcha solver
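To give a flavour of the performance evaluation objective above, here is a minimal sketch using only the Python standard library. The labels and predictions are hypothetical and are not taken from the course material. The sketch shows why accuracy alone can mislead when malicious samples are rare, and why precision and recall are usually reported alongside it.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive class (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical data: 1 = malicious, 0 = benign.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

On this toy data the classifier is right 8 times out of 10, yet it misses one of the three malicious samples, which is the number a defender tends to care about most.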
Who should attend:
Security Professionals
Web Application Pentesters
Software/application developers
People interested in starting to use machine learning for security
Hardware/Software requirements:
Latest version of VirtualBox installed
Administrative access on your laptop with external USB allowed
At least 20 GB free hard disk space
At least 4 GB RAM (the more the better)
Agenda:
Day 1
Introduction to machine learning
Hands-on guided exploration of Python machine learning libraries:
Data-wrangling using Numpy and Pandas
Scikit-learn’s functions and capabilities
Data visualization using Matplotlib/Seaborn
Walkthrough of the most commonly used machine learning algorithms (with quick hands-on examples/visualizations for select algorithms)
Supervised learning algorithms
Linear/logistic regression
Support Vector Machines
Unsupervised learning algorithms
Hierarchical/k-Means clustering
Decision trees/Random forests
Semi-supervised learning
2-hour example: Building (and bypassing) an email spam filter with scikit-learn
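The build-then-bypass idea behind the spam filter example can be sketched in a few lines. This is not the class exercise: it swaps scikit-learn for a hand-written multinomial naive Bayes so that it runs with the standard library alone, and the six training messages are a hypothetical toy corpus. The bypass shown is a "good word" attack, where an attacker with full knowledge of the model pads the spam with words the filter associates with legitimate mail.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes over word counts with Laplace smoothing."""

    def fit(self, messages, labels):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        return self

    def log_score(self, text, label):
        """Log prior plus summed log likelihood of each known word."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for word in tokenize(text):
            if word not in self.vocab:
                continue  # words never seen in training carry no evidence
            count = self.word_counts[label][word] + 1  # Laplace smoothing
            score += math.log(count / (total_words + len(self.vocab)))
        return score

    def predict(self, text):
        spam = self.log_score(text, "spam")
        ham = self.log_score(text, "ham")
        return "spam" if spam > ham else "ham"


# Hypothetical toy corpus; a real exercise would use a labeled email corpus.
messages = [
    "win free money now",
    "free prize claim now",
    "cheap pills free offer",
    "meeting agenda for monday",
    "please review the project report",
    "lunch on monday with the team",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
spam_filter = NaiveBayesSpamFilter().fit(messages, labels)

original = "claim your free prize"
# Good word attack: same payload, padded with words common in legitimate mail.
padded = original + " monday meeting project report the team"

print(original, "->", spam_filter.predict(original))
print(padded, "->", spam_filter.predict(padded))
```

The payload is unchanged in the padded message, but each appended word shifts the score toward the legitimate class until the verdict flips. This weakness follows from the independence assumption that makes naive Bayes simple to train.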
Day 2
Loading data efficiently
Using a labeled email/spam corpus split into training and test sets, extract salient features to build a word model of spam
Model tuning, cross-validation, and evaluation process
With complete knowledge of the system, manually craft a piece of spam to bypass the filter
Lecture on application of machine learning in the security/abuse space
Spam, fraud, malware, phishing, and intrusion detection short examples
Principles behind selecting the best machine learning models for different use-cases
Considerations when using machine learning in adversarial/malicious networks
Using Keras/TensorFlow for anomaly detection with convolutional neural networks
Choosing the appropriate model for different types of problems: an efficacy comparison of machine learning techniques for anomaly detection, and the other considerations that affect the choice
2-hour example: Building a simple network intrusion detection system with 2 different machine learning models
Importance of understanding the data and the threat model before designing a solution for the problem
Model tuning, cross-validation, and evaluation process
Guided comparisons of the performance characteristics for each implementation
Visualizing and presenting the data for ease of analysis by security operations professionals
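As a point of reference for the anomaly detection material above, the following is a minimal statistical baseline, not one of the two models used in the class (the outline does not name them). It assumes a single hypothetical traffic feature, such as bytes per second, and flags readings that sit more than a chosen number of standard deviations from the running mean. The class name, threshold, and warm-up length are all illustrative choices.

```python
import math


class StreamingZScoreDetector:
    """Online anomaly detector using Welford's running mean and variance."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a reading is flagged
        self.warmup = warmup        # readings to learn from before scoring
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against history, then learn from it if it looks normal.

        Returns True when x is flagged as anomalous.
        """
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Welford update: one pass, constant memory.
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return is_anomaly


# Hypothetical bytes-per-second readings with one obvious spike.
readings = [100, 102, 98, 101, 99, 103, 97, 100, 500, 101]
detector = StreamingZScoreDetector(threshold=3.0, warmup=5)
flagged = [i for i, value in enumerate(readings) if detector.update(value)]
print("anomalous indices:", flagged)
```

Flagged readings are deliberately excluded from the running statistics so that a single spike cannot widen the baseline. A patient adversary can still shift the baseline gradually with readings that stay just under the threshold, which is one reason the threat model matters before choosing a design.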
Day 3
Streaming pipelines for machine learning using Apache Spark MLlib (PySpark)
Overview of Apache Spark
General architecture
Distributed, scalable machine learning deployments with Spark
Guided example of a streaming architecture for network anomaly detection using reinforcement learning on Spark
Evaluating the security of machine learning systems
Techniques and guided example of fuzzing a classifier and regressor to find blind spots in the model
Evaluation of an intelligent learning system architecture that is resilient to model poisoning by an adversary
Machine Learning CTF challenge – captcha bypass challenges (using captcha character classification starter code provided)
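The idea of fuzzing a classifier to find blind spots, listed under Day 3 above, can be illustrated with a toy example. Everything here is an assumption made for illustration: `toy_model` is a hand-written linear rule standing in for a trained classifier, and the two features (`special_chars`, `length`) are hypothetical. The fuzzer keeps the part of the input the attack needs and randomly mutates the part the attacker controls for free, recording every mutation the model waves through.

```python
import random


def toy_model(features):
    """Hypothetical stand-in for a trained linear classifier.

    Returns True when the request is classified as malicious.
    """
    score = (0.9 * features["special_chars"]
             - 0.02 * features["length"]
             - 0.5)
    return score > 0


def mutate(seed, rng):
    """Mutate only what the attacker can change freely: padding length.

    special_chars is left alone because the payload depends on it.
    """
    mutated = dict(seed)
    mutated["length"] = seed["length"] + rng.randint(0, 1000)
    return mutated


def fuzz(model, seed, trials=200, rng_seed=0):
    """Return every mutated sample that the model classifies as benign."""
    rng = random.Random(rng_seed)
    blind_spots = []
    for _ in range(trials):
        candidate = mutate(seed, rng)
        if not model(candidate):
            blind_spots.append(candidate)
    return blind_spots


# A sample the model catches before any mutation.
seed = {"special_chars": 12, "length": 80}
assert toy_model(seed)

blind_spots = fuzz(toy_model, seed)
shortest = min(sample["length"] for sample in blind_spots)
print(f"{len(blind_spots)} of 200 mutations evaded detection")
print("shortest evading request length:", shortest)
```

The shortest evading length approximates where the model's decision boundary lies along the padding axis. A model that lets an unrelated feature cancel out a strongly malicious one has a blind spot there, regardless of how well it scored on its test set.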