Today’s smart software is based on all kinds of technologies. For example, do you know the difference between artificial intelligence, machine learning and deep learning? And what is big data all about? And what exactly does a data scientist do? In this article, we explain all these terms and what they can mean for your organisation.
Artificial intelligence (AI) is simply what the term says: the same kind of intelligence that we know from humans and animals, but recreated with computers. However, behind this simple term lies great complexity, because scientists and philosophers still haven’t quite figured out what intelligence is, let alone how to create an artificial form of it.
We’re still a long way from copying general intelligence, but we can break intelligence down into various sub-domains. Reasoning, problem solving, planning, learning, using language, seeing; these are all forms of intelligence. And for each of these forms there is also a domain within AI aimed at integrating human capabilities into a computer.
A large – and probably the best-known – domain within AI is machine learning. This domain aims at reproducing human beings’ ability to learn. As human beings, we constantly learn from situations in our daily lives. For example, if a child looks at a cat and the child’s parents say ‘cat’, repeating it over and over again, even with other cats, the child will learn to recognise cats over time.
Machine learning techniques work in the same way, based on examples. You give such a system numerous examples of pictures of cats with the description ‘cat’, and from that training the system learns what all those pictures have in common. If you then show the system a picture of a cat that it has not seen yet, it will also recognise the cat in this picture.
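To make the idea of learning from examples concrete, here is a minimal sketch in Python. It is not how a real image recogniser works: we assume each picture has already been reduced to two made-up numbers (ear pointiness and snout length), we add a second label, ‘dog’, purely for contrast, and we use one of the simplest techniques available, a nearest-centroid classifier. All names and values are hypothetical.

```python
# Minimal sketch: learn from labelled examples, then classify an unseen one.
# The features, labels and numbers are invented for illustration.
from math import dist


def train(examples):
    """Average the feature vectors per label (a nearest-centroid classifier)."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose average example lies closest to the new input."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical features per picture: (ear pointiness, snout length), both 0..1.
training_examples = [
    ((0.90, 0.20), "cat"),
    ((0.80, 0.30), "cat"),
    ((0.85, 0.25), "cat"),
    ((0.30, 0.80), "dog"),
    ((0.40, 0.90), "dog"),
    ((0.20, 0.70), "dog"),
]

model = train(training_examples)
new_picture = (0.88, 0.22)  # a picture that is not in the training set
print(predict(model, new_picture))
```

The system is never told what makes a cat a cat. It only sees examples with their labels and works out for itself what the examples of each label have in common.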
In your own business, of course, you will probably not be looking to recognise cats. But there are countless things a machine learning system can learn for you. For example, you can develop a predictive model that estimates the probability of a first or repeat purchase. The model then ranks prospects by the likelihood that they will buy, so you can focus your sales pitches on the most promising prospects and raise their success rate.
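A purchase model of this kind could look roughly like the sketch below. It assumes a tiny, invented customer history with two hypothetical features, and it uses logistic regression trained with stochastic gradient descent, which is only one of many possible techniques. The prospect names, feature choices and training settings are all assumptions for illustration.

```python
# Sketch only: the customer data, features and training settings are invented.
from math import exp


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit weights and a bias with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, bought in zip(rows, labels):
            p = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
            error = p - bought
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def purchase_probability(model, features):
    """Estimated probability that a customer with these features buys again."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Hypothetical history: (purchases last year, years since the last purchase).
history = [(5, 0.1), (4, 0.2), (3, 0.3), (1, 0.9), (0, 1.0), (1, 0.8)]
bought_again = [1, 1, 1, 0, 0, 0]
model = train_logistic(history, bought_again)

prospects = {
    "Prospect A": (4, 0.2),
    "Prospect B": (1, 0.9),
    "Prospect C": (2, 0.5),
}
ranking = sorted(
    prospects,
    key=lambda name: purchase_probability(model, prospects[name]),
    reverse=True,
)
for name in ranking:
    print(name, round(purchase_probability(model, prospects[name]), 2))
```

The ranked list is what a sales team would actually work with: the prospects at the top are the ones the model considers most likely to buy.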
(Artificial) neural networks are an important approach to machine learning. As their name suggests, this approach is inspired by the workings of the human brain. After all, the human brain consists of a network of connected neurons, the brain cells, which transmit electrical pulses to each other. Similarly, an artificial neural network consists of artificial neurons, which can receive input from other neurons and pass output to other neurons.
Usually, the neurons in an artificial neural network are arranged in layers. You then have an input layer, one or more intermediate (hidden) layers and an output layer.
We speak of deep learning when the network has a large number of hidden layers. Deep learning has led to major advances in AI over the last decade in areas such as object recognition, speech recognition, quality assurance, fraud detection and even medical diagnosis.
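The layered structure can be sketched in a few lines of Python. The example below only shows how input flows through the layers (the so-called forward pass). The weights are random and untrained, and the layer sizes and the sigmoid activation are arbitrary choices for illustration, so the output has no meaning yet.

```python
# Sketch of a layered network's forward pass. Weights are random and untrained;
# the layer sizes and the sigmoid activation are arbitrary illustrative choices.
import random
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def make_layer(n_inputs, n_neurons, rng):
    """A layer is a list of neurons, each with one weight per input and a bias."""
    return [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_neurons)
    ]


def build_network(sizes, seed=0):
    """sizes lists the neurons per layer, starting with the input layer.

    The input layer has no weights of its own, so a network described by
    n sizes is stored as n - 1 weighted layers (hidden layers plus output).
    """
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, inputs):
    """Each neuron sums its weighted inputs and passes the result onwards."""
    activations = inputs
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations


shallow = build_network([3, 4, 1])           # input, one hidden layer, output
deep = build_network([3, 4, 4, 4, 4, 4, 1])  # the same, with five hidden layers

x = [0.5, -0.2, 0.1]
print(forward(shallow, x), forward(deep, x))
```

In code, the only difference between the shallow and the deep network is the number of hidden layers. Training such a network, meaning adjusting the weights on the basis of examples, is left out of this sketch.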
Deep learning is therefore one specific machine learning technique among several. And machine learning, in turn, is one of the domains of artificial intelligence. Visually, we can represent it like this:
This representation also contains two other concepts: big data and data science. Let’s start with the first one.
We speak of big data when a data collection is too large or too complex to process with a traditional database management system. A machine learning problem in which you have to analyse hundreds of thousands of photos, for example, is a big data problem. Deep learning lends itself well to tackling such problems.
In practice, you switch to big data techniques the moment you can no longer run the machine learning system on a PC. Often, you will run into limitations such as too little memory or too little processor power for the large amount of training data. The solution is then to run the deep learning software on clusters of computers in a data centre. The task is then divided and executed in parallel on several computers.
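The principle of dividing a task and executing the parts in parallel can be sketched as follows. In this example, threads on a single machine stand in for the computers in a cluster, and computing an average stands in for the real work. An actual big data platform distributes the chunks over separate machines, which this sketch does not attempt. The function names and the data are hypothetical.

```python
# Sketch of "divide the task, run the parts in parallel, combine the results".
# Threads on one machine stand in for the computers of a cluster.
from concurrent.futures import ThreadPoolExecutor


def split(items, n_parts):
    """Cut a non-empty list into at most n_parts chunks of roughly equal size."""
    size = -(-len(items) // n_parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Stand-in for real work on one part of the data: a partial sum and count."""
    return sum(chunk), len(chunk)


def parallel_mean(values, n_workers=4):
    """Compute the mean by combining the partial results of all workers."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, split(values, n_workers)))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


measurements = list(range(1, 100_001))  # invented data
print(parallel_mean(measurements))
```

Each worker only ever sees its own chunk, so no single worker needs enough memory for the full data set. That is the property that makes this approach suitable for data that no longer fits on one PC.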
So what is data science? This is an interdisciplinary field that uses scientific methods and algorithms to gather knowledge from large data collections in order to solve problems in all kinds of application areas.
In the image above, you can see that data science overlaps with AI, especially with machine learning. Data science includes the field of big data and part of deep learning. But data science also includes aspects that fall outside the domain of AI.
What does a data scientist do?
Exactly what these aspects of data science outside AI are becomes clearer when we analyse what a data scientist does. The skills of a data scientist fall into four areas:
- AI: knowledge of machine learning and related techniques, as well as statistical models and related mathematics.
- Computer science: knowledge of programming languages such as Python and R, databases such as SQL and NoSQL and deployments in the cloud.
- Domain knowledge: knowledge of the business context of the applications in which AI solves problems.
- Communication: the skills to explain and visualise data-driven insights.
Example: Detecting presence of impurities
An example will help clarify things. A waste management company wants to use an AI-based camera along the conveyor belts to detect impurities. The data scientist uses her domain knowledge of the business to translate the company’s assignment into an algorithmic problem. She draws on her knowledge of AI to use deep learning to train a model on photos of pure and impure plastic streams. To achieve this, she writes code in Python, which she runs on a cloud platform. Finally, she uses her communication skills to visualise and explain the results.
From data to action
Of course, companies have been able to extract insights from data for some time. This is often referred to as Business Intelligence (BI), with reports and interactive dashboards as tools.
Usually, however, BI insights are limited to looking back: What happened and why? There are also quite a few manual processes involved and the users still have to make a lot of decisions in these processes themselves.
Advanced analytics and especially AI go one step further and allow you to look ahead. The results are predictions (What will happen?) and recommendations (What should I do?). In addition, only minimal manual processes and decisions are needed to take action.
The most advanced forms of AI also automate these last human steps. This allows the system to perform actions fully automatically on the basis of data.
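The four steps described above (looking back, predicting, recommending and acting automatically) can be illustrated with a deliberately naive sketch. The sales figures, the stock level and the forecasting rule are all invented. A real system would use a trained model rather than a simple average.

```python
# Illustrative only: the sales figures, stock level and rules are invented.
from statistics import mean

monthly_sales = [120, 135, 128, 150, 162, 171]
stock_on_hand = 140


def what_happened(sales):
    """BI-style look back: growth between the first and the last month."""
    return sales[-1] - sales[0]


def what_will_happen(sales, window=3):
    """Prediction: a naive forecast, the average of the most recent months."""
    return mean(sales[-window:])


def what_should_i_do(forecast, stock):
    """Recommendation: how many units to reorder to cover the forecast."""
    return max(0, round(forecast - stock))


def act_automatically(order_quantity, place_order):
    """Automation: carry out the recommendation without a human step."""
    if order_quantity > 0:
        place_order(order_quantity)


forecast = what_will_happen(monthly_sales)
order = what_should_i_do(forecast, stock_on_hand)
act_automatically(order, lambda quantity: print(f"Ordering {quantity} units"))
```

The `place_order` argument is a hypothetical hook. In practice it would call your own ordering system, and you decide how much of that last step you want to hand over to the software.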
Getting started with data science
Do you want to improve your business results by using advanced data technologies such as AI, machine learning, deep learning and big data, but don’t know where to start?
Cegeka has an extensive team of data scientists who will help you in your quest to move from data to action. We offer a total package of data solutions and use a proven approach that goes further than just a proof of value.