
How to learn machine learning from scratch?

Machine learning is the buzzword these days, and everybody wants to know something about it. In times to come, machine learning will be a concept that everyone who needs to stay competitive will have to know about.

What is Machine Learning?

Traditional programs take data as input and produce data as output. A machine learning algorithm, however, takes data as input but produces a program as output. This machine-generated program can then take new data, process it and produce output data.

Machine learning algorithms automate the process of creating programs

using historical data. In simple words, it gives computers the capability to

extract knowledge from data and store it for future judgement.
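
The "data in, program out" idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the temperature readings are made-up sample data, the names `traditional_program` and `learn_program` are hypothetical, and the "learning" step is an ordinary least-squares line fit rather than any particular library's algorithm.

```python
def traditional_program(celsius):
    """Traditional program: the programmer writes the rule by hand."""
    return celsius * 9 / 5 + 32


def learn_program(inputs, outputs):
    """Learning step: fit y = slope * x + intercept from historical data
    and return the fitted rule as a new function (the generated 'program')."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x

    def learned_program(x):
        return slope * x + intercept

    return learned_program


# Made-up historical data: Celsius readings and the matching Fahrenheit values.
historical_celsius = [0, 10, 20, 30, 40]
historical_fahrenheit = [32, 50, 68, 86, 104]

model = learn_program(historical_celsius, historical_fahrenheit)

# Both programs can now process data they have not seen before.
print(traditional_program(100))
print(model(100))
```

Nobody typed the conversion formula into `model`; it was recovered from the historical data, which is the sense in which the output of learning is itself a program.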

Wikipedia defines machine learning as…

Machine learning is the subfield of computer science that, according to

Arthur Samuel, gives “computers the ability to learn without being explicitly

programmed”.

This brings us to our first introspective question…

Which kind of machine learner are you?

Depending on the role you have in your organization or the role you aspire to, you might fall into one of three categories of machine learners.

1. Business Users

A business user is involved in the day-to-day running of a business. Business users run the operations and are responsible for defining and executing the business processes of a company. In traditional companies, the executives, operations team and managers fall into this category.

For this kind of user, a high-level (non-technical) understanding of what machine learning can and can’t do is beneficial. They need just enough information to determine whether they will see a return on investment from machine learning.

Here are some examples of successful use cases of machine learning:

● Traditionally, operations teams spend a lot of money on costly human resources for customer support. Machine learning can automate menial operational tasks like customer support.
● Machine learning can analyze tons of usage data (big data) and make remarkable suggestions on business tactics that can be applied to increase revenue.
● To know more interesting use cases, read our blog 8 great applications of Machine Learning

2. Machine Learning Engineers & Data Scientists


This learner is someone who will apply machine learning to real-life problems. They are the consumers of machine learning frameworks like Watson, Spark or scikit-learn.

They will have a flair for playing with data. They will love to gather data, clean it, augment it with missing information and then use it for machine learning. They are also called machine learning engineers or data scientists.

Industry is hungry for machine learning engineers and data scientists who can apply machine learning algorithms to help a business reduce costs or expand its revenue streams.

This group will not be responsible for creating algorithms. They will be users of existing machine learning libraries to solve the problem at hand.

They will be required to understand the strengths and weaknesses of different machine learning algorithms. They will have to know how a given algorithm behaves in different situations and which algorithm is best suited to a given type of problem.

They will be responsible for using programming (mostly Python, R, Scala, Java, etc.) to gather data from across the organization and public sources, clean it and then massage it so it can be fed into the machine learning algorithms.
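
As a rough illustration of the "gather, clean and massage" step, here is a minimal Python sketch that uses only the standard library. The customer records, the column names and the `clean_rows` helper are all hypothetical; real projects would more likely use a library such as Pandas and rules specific to their own data.

```python
import csv
import io

# Made-up raw export with typical problems: stray spaces, inconsistent
# capitalization, a missing age, a duplicate row and a non-numeric value.
raw_csv = """customer,age,monthly_spend
Alice,34,120.50
bob ,,80
Alice,34,120.50
Carol,29,not available
"""


def clean_rows(text):
    """Parse CSV text, tidy names, drop duplicates and rows with unusable spend."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row["customer"].strip().title()
        try:
            spend = float(row["monthly_spend"])
        except ValueError:
            continue  # skip rows whose spend is not a number
        # Keep a missing age as None so a later step can decide how to fill it.
        age = int(row["age"]) if row["age"].strip() else None
        record = (name, age, spend)
        if record in seen:
            continue  # skip exact duplicates
        seen.add(record)
        cleaned.append({"customer": name, "age": age, "monthly_spend": spend})
    return cleaned


rows = clean_rows(raw_csv)
for row in rows:
    print(row)
```

Whether to drop, keep or fill in incomplete rows is a judgment call that depends on the business problem; the choices made here are only one possibility.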

These users will combine art and science to solve the given business problem. A lot of their time will go into trial and error before they arrive at an optimum solution.

3. Theorists and Researchers

If you belong to this group, then you are working on creating cutting edge

machine learning libraries that will be used by the machine learning

engineers and the data scientists.


You would typically be an aspiring or current student of computer science or mathematics.

An example would be the group of students from the University of Waikato, New Zealand. They have created an open source machine learning library called Weka.

The core team of IBM that developed ​Watson​ would also belong to this

group.

Machine learning enthusiasts who contribute to open source projects like Spark and scikit-learn would also be a part of this group.

How to go about understanding Machine learning?

The method of learning depends on which of the above types of learner you are.

If you are a business user who needs a high-level understanding, then your best bet is online research. There are plenty of resources available on YouTube to give you that preliminary understanding.


We at MCAL Global did a webinar that summarizes machine learning. We

recommend you watch it. It is available at the following link:

Machine learning with Python webinar

Machine learning engineers and data scientists will also find the above webinar useful. It gives an overview of machine learning concepts and their applications. It also details the topics that need to be learned to start this journey.

However if you are an aspiring machine learning engineer or data scientist

you will need professional training.

Just reading free stuff online and watching free videos will not give you the

kind of depth you are looking for. You will have to invest in some form of

structured training with a mentor.

We at MCAL Global have an online instructor-led training called “Machine learning using Python”, which helps people hone their machine learning and data science skills. It is a weekend course, so working professionals can also enroll in it. For detailed information on this course, email us at ml@mcal.in

If you plan to become a researcher who is looking to create machine

learning algorithms, we recommend that you enroll in a college for a long

term formal education in Computer Science or Mathematics.

If you are at a point where you are trying to make a decision whether to go

down the path of machine learning or not then try to answer two questions:

1. Does machine learning interest me?


2. What are the future prospects in the field of machine learning?

You yourself are the best judge on your interest in this field. You will have

to evaluate it yourself. No one can make that decision for you.

But if you have doubts about the future prospects of machine learning, then look at the following infographic:


What should I do next?

Use the table below to identify which kind of learner you are. Then review the learning approach and the relevant resources recommended for you.

What kind of learner am I? Business Learner – For business users, executives & managers
What should be my approach? Online research & videos
Suggested resources:
● Machine Learning with Python Webinar
● Contact us at ml@mcal.in with your questions

What kind of learner am I? Machine Learning Practitioner – For IT engineers and data scientists
What should be my approach? Online self-paced courses or instructor-led courses
Suggested resources:
● Machine Learning with Python Webinar
● Instructor-led machine learning using Python
● Stanford course on Coursera
● Contact us at ml@mcal.in with your questions

What kind of learner am I? Theorists & researchers – Machine learning providers
What should be my approach? Enroll in a masters or postgraduate program in a university
Suggested resources:
● Download the book Elements of Statistical Learning
● Contact us at ml@mcal.in with your questions

Do I need to learn programming?

If you are a business learner then you don’t need to learn any programming. In fact, you don’t even have to know how the various algorithms work.

All you need to know is what machine learning is at a high level, what its strong points are and where it doesn’t work. Armed with this information you will be able to make strategic decisions.

Any problem that can be solved by machine learning should have the

following characteristics:

● There should be a pattern in the data. Without this basic assumption machine learning doesn’t work. Machine learning doesn’t work on random data, so it’s crucial that the data collected for solving the problem has some pattern hidden in it, that is, some underlying correlation.
● The pattern or correlation should not be known. There should be a general sense of a pattern, but the exact pattern should be unknown. If the pattern is already known, then what’s the point of machine learning?
● There should be lots of relevant data. Machine learning algorithms are data hungry and work well when lots of relevant data exists for the algorithms to analyze and detect the patterns. Just as human beings learn from experience, machines learn from data. The more data you have, the more experienced your machine learning model will be.
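
The first characteristic can be checked with a small experiment. The sketch below is only an illustration on synthetic data: it fits a straight line to one data set that hides a pattern (`y = 3x + 2` plus noise, a relationship assumed purely for the example) and to one that is pure noise, then compares how much of the variation the fitted line explains (the R-squared score). The helper name `r_squared` is hypothetical.

```python
import random


def r_squared(xs, ys):
    """Fit a straight line by least squares and return the share of the
    variation in ys that the line explains (between 0 and 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(0, 10) for _ in range(500)]

patterned = [3 * x + 2 + rng.gauss(0, 1) for x in xs]  # hidden pattern plus noise
pure_noise = [rng.gauss(0, 1) for _ in xs]             # no relationship to xs

r2_patterned = r_squared(xs, patterned)
r2_noise = r_squared(xs, pure_noise)

print(f"patterned data: R^2 = {r2_patterned:.3f}")
print(f"random data:    R^2 = {r2_noise:.3f}")
```

On the patterned data the score should come out close to 1, while on the random data it should stay close to 0: there is simply nothing for the algorithm to learn.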

A good overview of these concepts is in our machine learning webinar:

Machine learning with Python webinar

Which programming language should I learn?

If you want to become a data scientist or a machine learning engineer then

you will have to pick a programming language.

Without the knowledge of programming you will not be able to use machine

learning or create new algorithms. At some point in time you will have to

delve into the programming side of the world.

For machine learning you don’t need to understand heavy-duty GUI-intensive programming, web-based programming or socket programming. All you need to know is how to read, write and manipulate data, and how to write the mathematical logic behind the algorithms.

In a nutshell your use of programming will be targeted towards machine

learning. Now here is where the biggest question arises…


Which programming language should I learn?

There is a plethora of programming languages: Java, C, C++, .Net, Scala, Ruby, Python, R, etc. It gets confusing quite fast when it comes to deciding which one to use.

If you do a little bit of online research, it will become clear to you that Python is emerging as one of the leaders in the machine learning space. It is followed by R.

You will notice that analysts who want to apply machine learning to solve real-world problems are jumping on the Python train.

Those who want limited programming and plan to stick to academia are going for R.

A quick Google search on the top programming languages will convince you that Python is in the top 5 list.

R or Python?

To help you make a decision we made a side by side comparison of the

strengths and the weaknesses of Python and R.


Python Programming Language

Strengths
● Open source
● Platform independent
● Amazing data manipulation capabilities
● Object-oriented programming
● Top 3 programming languages of the world
● Default programming language of data scientists
● Amazing out-of-the-box scientific libraries
● Huge community & fan following
● Last but not the least, the easiest language to learn & use

Weaknesses
● To learn it, one must put in more effort than for R
● I am thinking very hard but can’t think of any other weakness

R Programming Language

Strengths
● Open source
● Loved by statisticians
● Lots of out-of-the-box capabilities & algorithms
● Huge community & fan following
● Puts emphasis on model interpretability rather than predictive analytics
● Much easier than Python

Weaknesses
● Not as flexible in data manipulation or data munging
● Does not have the full flexibility of a programming language

We suggest you take a deep breath and analyze your needs before picking your programming language. If you are confused and not able to make a decision then just go for Python :). It is a good choice.

What topics should I cover in Programming?


In general learning a programming language is an ongoing process and

involves a lifetime of learning.

Luckily for machine learning and data science we don’t need to learn it all.

We recommend focusing on the following topics in whichever programming language you decide to pick.

● Programming Basics – Keywords, Statements, Operators, Data Types
● Flow Control – If else, For loop, While loop
● Functions – User defined functions, Arguments, Return value
● File Handling – List files, Read files, Write to files
● Miscellaneous items – Exception Handling, Logging
● Amazing Libraries (applicable to Python) – Matplotlib, NumPy, Pandas and Scikit-learn
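
To give a feel for how little programming this covers, here is a short Python sketch that touches most of the topics above: data types, a loop, a user-defined function, file handling, exception handling and logging. The file name `scores.txt`, its contents and the helper names are made up for the example, and the third-party libraries are left out so that it runs with the standard library alone.

```python
import logging
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ml_prep")  # hypothetical logger name


def average(values):
    """User-defined function: takes an argument and returns a value."""
    total = 0.0
    for value in values:  # flow control: for loop
        total += value
    return total / len(values) if values else 0.0  # flow control: if else


def read_numbers(path):
    """File handling plus exception handling: read one number per line."""
    numbers = []
    with open(path) as handle:
        for line in handle:
            try:
                numbers.append(float(line))
            except ValueError:
                logger.warning("Skipping unreadable line: %r", line.strip())
    return numbers


with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "scores.txt"
    data_file.write_text("10\n20\noops\n30\n")            # write to a file
    files_found = [p.name for p in Path(folder).iterdir()]  # list files
    scores = read_numbers(data_file)                      # read the file
    result = average(scores)

logger.info("Files: %s, scores: %s, average: %s", files_found, scores, result)
```

The line containing `oops` is logged and skipped rather than crashing the script, which is the kind of defensive handling real data files tend to need.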

Enough about Programming, what topics should I cover in Machine Learning?

Machine learning at its core is made up of Data Structures, Algorithms,

Statistics, Linear Algebra, Probability theory and Calculus.


If you are beginning your career as a data scientist you will need to learn about the following topics:

● Statistics – Mean, Median, Mode, Standard Deviation, Normal Distribution, Z-Score, Histograms
● Probability – Basics, Bayesian Probability
● Calculus – Gradient Descent, Root mean square, Distance
● Machine Learning – Step Zero – Supervised Learning, Unsupervised Learning, Regression, Classification, Clustering
● Machine Learning – Step One – Linear Regression, Polynomial Regression, Regularization, Ridge Regression, LASSO Regression, Logistic Regression
● Machine Learning – Step Two – Decision Trees, KNN algorithm, K Means Clustering, Principal Component Analysis, Linear Discriminant Analysis, Quadratic Discriminant Analysis
● Machine Learning – Final Step – Deep Learning and other advanced algorithms.
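
Several of these topics meet in even the simplest model. As a hedged illustration, the sketch below fits a linear regression with gradient descent in plain Python. The data points, the learning rate and the number of steps are assumptions chosen for the example, and in practice a library such as Scikit-learn would normally do this work for you.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by minimizing the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors with the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step against the gradient to reduce the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up training data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = gradient_descent(xs, ys)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

The fitted values should come out very close to a slope of 2 and an intercept of 1. A learning rate that is too large makes the updates diverge instead of converge, which is one reason the calculus behind the algorithm is worth understanding.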

For more details please visit our website: https://mcalglobal.com/

We would love to hear from you about this article or any other questions you may have. Please email us at ml@mcal.in with your questions and feedback. Feel free to reach out to us with your training or consulting needs as well.
