Posts

Ensemble Methods

Ensemble Methods

abstract Ensemble methods is the art of combining diverse set of ‘weak’ classifiers together to create a strong classifier. They use multiple learning algorithms to obtain a better predictive performance than what could be obtained from any of the constituent learning algorithms alone. It means you build several different models then “fuse” the prediction of each …

Read More Read More

Support Vector Machines

Support Vector Machines

abstract Recall that algorithms like the perceptron look for a separating hyperplane. However, there are many separating hyperplanes and none of them are unique. Intuitively, for an optimal answer, you want an algorithm that finds a hyperplane that maximizes the margin between different classes so that it can perform better on new data. Support Vector Machines …

Read More Read More

Neural Networks

Neural Networks

abstract Artificial neural networks are powerful computational models based off of biological neural networks. The neural network has an input layer, hidden layer, and output layer, all connected by weights. The inputs, with interconnection weights, are fed into the hidden layer. Within the hidden layer, they get summed and processed by a nonlinear function. The outputs …

Read More Read More

Perceptron

Perceptron

abstract The perceptron is a supervised learning algorithm that only works on linearly separable data, as its goal is to find a hyperplane to separate different classes. It is known as the single-layer neural network and is the most basic version of an artificial neuron. The perceptron takes in inputs with weights and runs it …

Read More Read More

Logistic Regression

Logistic Regression

abstract Logistic regression, despite its name, is a linear model for classification rather than regression. Logistic regression is a supervised classification algorithm where the dependent variable (label) is categorical, i.e. yes or no. It takes a linear combination of features with weights (parameters) and feeds it though a nonlinear squeezing function (sigmoid) that returns a …

Read More Read More

Linear Regression

Linear Regression

abstract Linear regression is the most basic type of regression and commonly used in predictive analysis. Unlike the previous algorithms, linear regression can only be used for regression as it returns a real predicted value, i.e. 567 dollars per share, or predicting your son grows to be 6ft4.  It models the relationship between dependent variable and one …

Read More Read More

Naive Bayes

Naive Bayes

abstract Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem. It’s actually fairly intuitive and simple to compute. Instead of looking for (probability of the label, given the data) directly like discriminative models, we calculate (probability of the label) and  (probability of the data, given the label) to find . …

Read More Read More

Decision Tree

Decision Tree

Abstract Decision Trees are supervised learning methods used for classification and regression. In a tree structure, nodes represent attributes of the data, branches represent conjunctions of features, and the leaves represent the labels. Using the C4.5 tree generation algorithm, at each node, C4.5 chooses the attribute of the data that most effectively splits its set …

Read More Read More

K-Means Clustering

K-Means Clustering

abSTRACT K-Means is an unsupervised learning algorithm, meaning that you don’t have the label. The purpose of the algorithm is to partition the data into K (K can be set to any number) different groups so you can find some unknown structure and learn what they belong to. First, you start off with K clusters …

Read More Read More

K-Nearest Neighbors

K-Nearest Neighbors

Abstract K-Nearest Neighbors (KNN) is one of the easiest (but effective) algorithms in machine learning used for classification and regression. When I have a new piece of data that I want to classify, I simply take that new piece of data and compare it to every piece of data in my training set. I then store K (K …

Read More Read More