data-learn is an online machine learning system. Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. It evolved from pattern recognition and computational learning theory in artificial intelligence, is closely related to statistics, and has strong ties to mathematical optimization.

Machine learning explores the construction of algorithms that can learn from data and make predictions on it. Such algorithms build a model from example inputs and use it to make data-driven predictions, rather than following strictly static program instructions. Machine learning tasks can be divided into supervised and unsupervised learning.


Supervised Learning

In supervised learning, we are given a data set in which the correct output for each input is already known, and we assume that there is a relationship between the input and the output. These problems are categorized into regression and classification problems.

In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.


Example:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.
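
The house price example above can be sketched in a few lines of Python. This is a minimal illustration, not part of data-learn: the data points and the names fit_line, predict_price and sells_above_asking are hypothetical. The regression step is an ordinary least-squares line fit. The classification step is a simplification that thresholds the regression prediction against the asking price, where a real classifier would be trained on labeled examples.

```python
# Hypothetical toy data: house sizes (square metres) and sale prices (thousands).
sizes = [50.0, 70.0, 90.0, 110.0, 130.0]
prices = [150.0, 210.0, 270.0, 330.0, 390.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(size, slope, intercept):
    """Regression: map a size to a continuous price."""
    return slope * size + intercept


def sells_above_asking(size, asking_price, slope, intercept):
    """Classification (simplified): map a house to one of two discrete categories."""
    return predict_price(size, slope, intercept) > asking_price


slope, intercept = fit_line(sizes, prices)
predicted = predict_price(100.0, slope, intercept)
label = sells_above_asking(100.0, 280.0, slope, intercept)
print(predicted, label)
```

The same input (size) feeds both tasks. Only the type of output changes: a number for regression, a category for classification.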


Unsupervised Learning

Unsupervised learning allows us to approach problems without knowing what our results should look like. No labels are given to the learning algorithm, leaving it on its own to find structure and discover hidden patterns in its input.

We can derive structure from data by clustering it based on relationships among variables, without necessarily knowing their effects. Clustering can be used for market segmentation, social network analysis, organizing computer clusters, and astronomical data analysis.
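
As a rough illustration of clustering, here is a minimal k-means sketch in plain Python. It is not taken from data-learn: the function name kmeans, the sample points, and the choice to start from the first k points are all assumptions made to keep the example short and deterministic. Practical implementations usually use randomized initialization and a convergence check.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters by alternating assignment and update steps."""
    # Assumption: use the first k points as initial centroids (deterministic).
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments


# Hypothetical unlabeled data: two visibly separate groups of 2D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, assignments = kmeans(points, k=2)
print(assignments)
```

No labels are passed to kmeans. The grouping comes only from the distances between the points.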


Example:

Suppose a doctor, over years of experience, forms associations between patient characteristics and the illnesses those patients have. When a new patient arrives, the doctor uses this patient’s characteristics, such as symptoms, family medical history, physical attributes, and mental outlook, to suggest one or more possible illnesses based on what the doctor has seen before in similar patients.


Usage

Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms is infeasible, such as:

  • Adaptive websites
  • Advertising
  • Affective computing
  • Bioinformatics
  • Brain-machine interfaces
  • Cheminformatics
  • Computer vision
  • Finance
  • Fraud detection
  • Game playing
  • Information retrieval
  • Machine perception
  • Medical diagnosis
  • Natural language processing
  • Optimization and metaheuristics
  • Recommender systems
  • Robot locomotion
  • Search engines
  • Sentiment analysis
  • Sequence mining
  • Software engineering
  • Spam filtering
  • Speech and handwriting recognition
  • Stock market analysis
  • Structural health monitoring
  • Syntactic pattern recognition