Imagine a world where computers learn from data, make predictions, and can even outperform humans at some tasks. That’s the fascinating realm of machine learning (ML)! It’s a field that’s transforming industries, from healthcare to finance, allowing doctors to uncover hidden medical issues in images and helping investors make sense of stock markets and property trends. ML has an incredible ability to tackle complex problems that can leave us humans scratching our heads.
This is the start of my series on mastering machine learning. I’ll cover all the essential machine learning concepts in this collection of blog posts. In this article, I’ll start by discussing what machine learning is and how it relates to the programming concepts you’ve likely encountered in your computer science journey. So, let’s dive in and explore the fundamentals of machine learning together!
The Basics
Let’s start with the fundamentals. Programming is like giving instructions to a computer to perform a task. Over time, computer scientists have developed concepts like “data structures” (ways to organize data) and “algorithms” (step-by-step problem-solving guides) to make programming better. With traditional programming, you can create all sorts of software applications, from dynamic websites to user-friendly command-line tools. These are the building blocks of the digital world.
Machine learning, on the other hand, is a part of artificial intelligence (AI). Instead of giving a computer manual instructions, ML involves creating algorithms and models that allow computers to analyze data, recognize patterns, and make their own smart decisions or predictions. It is all about using math and statistics to help your computer make decisions like a human would. Machine learning and programming have a lot in common: ML takes programming a step further by adding statistical algorithms into the mix.
Note: In programming, the code you write is called a program, while in machine learning it’s referred to as a model.
A machine learning model is like a living entity that can make decisions based on data. So, what’s crucial in machine learning is DATA, which brings us to the foundational approaches in programming and machine learning.
Let’s Talk About Approach
Programming relies on rule-based or logic-driven approaches: you must explicitly define the rules and logic that guide the computer’s behavior. In contrast, machine learning employs a “data-driven” approach. You provide data to the model, which then extracts and learns the underlying patterns using techniques such as regression, classification, or clustering. It’s similar to how humans learn from experience; machines learn from data.
The cool part is that, once trained, the model can predict outcomes with pretty good accuracy even when you give it entirely new data that it hasn’t seen before.
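To make the contrast concrete, here is a minimal sketch in plain Python. The temperature-conversion task, the sample values, and every function name are my own assumptions for illustration. The first function hard-codes the rule; the second approach estimates the same relationship from example pairs using ordinary least squares, which is one simple form of regression.

```python
# Rule-based: the programmer writes the conversion formula explicitly.
def celsius_to_fahrenheit_rule(celsius):
    return celsius * 9 / 5 + 32


# Data-driven: estimate slope and intercept of y = slope * x + intercept
# from example pairs using ordinary least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: the same temperatures measured in both units.
celsius_samples = [-10.0, 0.0, 15.0, 30.0, 100.0]
fahrenheit_samples = [14.0, 32.0, 59.0, 86.0, 212.0]

slope, intercept = fit_line(celsius_samples, fahrenheit_samples)


def celsius_to_fahrenheit_learned(celsius):
    return slope * celsius + intercept


# 37 degrees Celsius is not in the training data; both approaches should agree closely.
print(celsius_to_fahrenheit_rule(37.0))
print(celsius_to_fahrenheit_learned(37.0))
```

Nobody told the second version what the formula is. It recovered the slope and intercept from the examples alone, which is the essence of the data-driven approach.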
Data and Learning
Now, when it comes to data in machine learning, there are two main types: “labeled” and “unlabeled.” Labeled data contains inputs mapped to outputs, so the model can learn the patterns and check its predictions against the known answers to improve. Learning from this kind of data is called supervised learning, and it usually needs human effort to provide the labels. Supervised learning often achieves high accuracy because the model can learn the pattern directly from the input-to-output mappings.
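Here is a small sketch of what learning from labeled data can look like, using a 1-nearest-neighbor classifier written in plain Python. The animal measurements are made up, and the function names are hypothetical; treat this as one simple illustration of supervised learning rather than the standard way to do it.

```python
# Labeled data: each input (height in cm, weight in kg) is paired with an output label.
# The numbers are invented for illustration.
labeled_examples = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((23.0, 4.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
    ((65.0, 35.0), "dog"),
]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features, examples):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    _, label = min(examples, key=lambda ex: squared_distance(ex[0], features))
    return label


def accuracy(test_examples, train_examples):
    """Fraction of held-out examples whose predicted label matches the true label."""
    correct = sum(predict(f, train_examples) == label for f, label in test_examples)
    return correct / len(test_examples)


# Held-out labeled examples let us check the predictions against known answers.
held_out = [((22.0, 4.2), "cat"), ((58.0, 28.0), "dog")]

print(predict((24.0, 5.0), labeled_examples))
print(accuracy(held_out, labeled_examples))
```

The labels do double duty here: they are what the model learns from, and they are also what lets us measure how often it is right.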
However, with unlabeled data, there are no known outputs to learn from. The model needs to find its own patterns and group similar items together based on their features. A common technique for this is “clustering”, and this type of learning is known as unsupervised learning.
Both have their uses, and which one you choose depends on the situation and the data you have. There are also additional methods such as semi-supervised learning, which combines labeled and unlabeled data, and self-supervised learning, which creates its own training signal from unlabeled data. You can refer to this article for more information on types of machine learning.
Machine learning models are only as good as the data they are trained on. It is important to use high-quality data that is relevant to your problem. Even a small error in the data can lead to significant inaccuracies in the model’s predictions. In my own experience, I’ve seen a case where changing the dataset I used boosted the model’s accuracy significantly.
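As a rough illustration of grouping without labels, here is a basic k-means loop for one-dimensional data in plain Python. k-means is just one clustering algorithm I picked for the sketch; the purchase amounts, the starting centers, and the function name are all assumptions.

```python
def k_means_1d(values, centers, iterations=10):
    """Group 1-D values around the given starting centers (a basic k-means loop)."""
    for _ in range(iterations):
        # Assignment step: attach every value to its nearest center.
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(cluster) / len(cluster) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical, unlabeled purchase amounts: nobody has said which group each belongs to.
purchases = [12.0, 15.0, 14.0, 95.0, 102.0, 99.0]

centers, clusters = k_means_1d(purchases, centers=[0.0, 50.0])
print(centers)
print(clusters)
```

The algorithm is never told what the groups mean. It only discovers that the small amounts sit together and the large amounts sit together; interpreting those groups is still up to you.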
Computational Demands
Let’s look at the computational side. Central Processing Units (CPUs) are built to execute instructions largely sequentially, working through only a few tasks at a time. Traditional programming typically deals with text-based inputs and outputs and doesn’t need many computational resources.
In the case of machine learning, models often need large datasets for training, sometimes including images and videos. While CPUs are good at handling a wide variety of tasks, they are not particularly efficient at the highly repetitive calculations that training involves.
Graphics Processing Units (GPUs) were originally designed for graphics and can perform repetitive calculations quickly and in parallel. This makes them ideal for training machine learning models faster. Without GPUs, training complex models might take hours or even days.
Furthermore, Tensor Processing Units (TPUs) were introduced by Google specifically for deep learning workloads, making the computation even faster and more efficient for certain AI applications. You can find more information about types of computation resources in this video.
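To see why this kind of work suits parallel hardware, consider a matrix-vector product, a calculation that shows up constantly in model training. The sketch below is plain Python with made-up numbers and hypothetical function names. It is only meant to show that each output value can be computed independently of the others; Python threads will not actually make this faster, whereas a GPU runs many such independent calculations at once in hardware.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, vector):
    return sum(a * b for a, b in zip(row, vector))


def matvec_sequential(matrix, vector):
    # One row after another, the way a single core would work through it.
    return [dot(row, vector) for row in matrix]


def matvec_parallel(matrix, vector, workers=4):
    # Each row's result depends only on that row and the vector,
    # so the rows can be handed to different workers independently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: dot(row, vector), matrix))


# Made-up weights and inputs for illustration.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
inputs = [10.0, 1.0]

print(matvec_sequential(weights, inputs))
print(matvec_parallel(weights, inputs))
```

Both versions give the same answer. The point is that nothing in one row's calculation has to wait for another row, and that independence is what parallel hardware exploits.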
“Machine learning has high computational demands.”
You can train models on an ordinary computer, but training powerful models might take months. For learning purposes, I prefer to use cloud platforms like Google Colab or GitHub Codespaces because they provide ready-to-use computing resources, and Colab also offers access to GPUs. This not only saves you valuable time but also allows you to experiment and learn more effectively.
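If you want to check what hardware your environment gives you, a quick sketch like the following can help. It assumes you are using PyTorch, which may not be installed where you run it, so the helper (a hypothetical name of my own) falls back gracefully. `torch.cuda.is_available()` reports whether PyTorch can see a CUDA-capable GPU.

```python
def detect_accelerator():
    """Report whether PyTorch can see a CUDA-capable GPU in this environment."""
    try:
        import torch  # third-party library; may not be installed
    except ImportError:
        return "pytorch-not-installed"
    return "gpu" if torch.cuda.is_available() else "cpu"


print(detect_accelerator())
```

On a cloud notebook with a GPU runtime enabled you would expect this to report a GPU; on a typical laptop without one it will report the CPU.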
Tools of the Trade
You can write code in many ways, and as the field of programming has evolved, different paradigms for creating optimized and scalable programs have emerged. Two of these are functional programming and object-oriented programming. Functional programming is a way of writing code that builds programs from simple functions and reuses them when needed. Object-oriented programming is a way of writing code that is organized around objects, which hold both data and the instructions for what to do with that data.
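Here is the same small task written in both styles, as a rough sketch in Python. The shopping-basket scenario and all the names are hypothetical; the point is only to show where the data lives in each style.

```python
# Functional style: small functions with no hidden state, reused and combined.
def scale(values, factor):
    return [v * factor for v in values]


def total(values):
    return sum(values)


# Object-oriented style: the data and the operations on it live together in one object.
class Basket:
    def __init__(self, prices):
        self.prices = list(prices)

    def apply_discount(self, factor):
        self.prices = [p * factor for p in self.prices]

    def total(self):
        return sum(self.prices)


prices = [10.0, 20.0, 30.0]

# Functional: pass data through functions; the original list is left unchanged.
functional_total = total(scale(prices, 0.5))

# Object-oriented: the basket holds its own copy of the data and updates it.
basket = Basket(prices)
basket.apply_discount(0.5)
object_total = basket.total()

print(functional_total, object_total)
```

Both versions arrive at the same total. In the functional version the data flows through the functions, while in the object-oriented version the basket owns the data and changes it in place.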
You can also choose among different programming languages, from lower-level ones like C to higher-level ones like C++ and Java. These languages give you different levels of control, speed, and ways to write code, so you can pick what fits your project best. Many languages support both paradigms, and which one you use depends on the task at hand.
When it comes to machine learning, the concept is not confined to a single programming language. You have the flexibility to implement machine learning models using a variety of languages, depending on your preferences. However, Python stands out as the most popular choice in the field of machine learning, thanks to its rich ecosystem of libraries and ease of use. Python’s machine learning libraries, such as TensorFlow and PyTorch, provide pre-written code and tools that simplify the development process of complex models.
Conclusion
Machine learning is a powerful tool that can be used to solve a wide range of problems. It is a data-driven approach that typically requires large amounts of data to train models, and it needs substantial computational power to process that data. Machine learning models can be implemented in many programming languages, and pre-written libraries make the process easier and faster, especially for complex models.
Find out how to use Python for machine learning in the next post.