Updated: Sep 16
Over the last year, artificial intelligence (AI) has moved from hype to real-world usage, thanks to applications like ChatGPT and Bard. In this blog we review what AI is, where it stands today, and its relationship to machine learning and deep learning. The last part of the blog examines how AI seems to be the next logical phase in the human history of information processing.
Right now, for most people, AI falls into one of two categories: either it is a wonderful technology that will solve a lot of problems, or it is an evil technology that cannot be trusted. Let us look at what it really is and is not.
What is artificial intelligence?
Artificial intelligence is the use of data, math and computer programs to try to do what humans do with information.
What are some human behaviors that a computer might be able to replicate?
As humans, we talk, listen, understand and act based on that knowledge. In computer science, speech recognition has been around for a while. Alexa and Siri can talk as well.
As humans, we can read and write. In computer science, this falls under the topic of natural language processing.
We can see images and remember them. In computer science, the field of computer vision addresses this area.
Let us expand the definition we wrote earlier. Artificial intelligence is the use of machines, including hardware and software, to mimic the human brain and human behavior. Artificial intelligence is a broad concept that includes machine learning, robotics, deep learning and many other topics, as shown in the figure below.
But then, what are machine learning and deep learning?
Given data, humans can classify it and deduce patterns. But there are limits to what the human brain can do. Machine learning is the field of computer science where computer programs learn from data and then predict or classify. What do we mean by "learning"?
In machine learning, the program is fed a large number of examples (input, output) in what is called training. The training produces what is called a model. The model is then able to predict results for inputs whose results are unknown. A model is nothing but an algebraic function like y = mx + c (a simple linear function); in most cases it will be a more complex function. So you see, the terms "model" and "learning" are quite misleading.
Let us use an example to show how this technology helps.
The graph below in figure 2 plots sales of products as a function of page views.
Given a function with 2 variables, you can visualize it and predict values of y (sales) given x (views). However, if I just gave you a number of values of x and y, without giving you the function represented by the line, what would you do? Well, a mathematician can come up with the function using known techniques like linear regression. Once you have a function, you can predict y for a new value of x. This is essentially what machine learning is.
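To make this concrete, here is a small sketch in Python with NumPy (the page-view and sales numbers are invented for illustration) that fits a line y = mx + c to a handful of points and then predicts sales for an unseen number of views:

```python
import numpy as np

# Hypothetical training data: page views (x) and product sales (y)
views = np.array([100, 200, 300, 400, 500], dtype=float)
sales = np.array([12, 22, 31, 43, 52], dtype=float)

# "Training": fit y = m*x + c by least squares (degree-1 polynomial fit)
m, c = np.polyfit(views, sales, deg=1)

# The "model" is just the function y = m*x + c;
# use it to predict sales for a new number of page views.
predicted = m * 600 + c
print(round(m, 3), round(c, 3), round(predicted, 1))
```

The "learning" here is nothing more than solving for the m and c that best fit the given points.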
But we know that sales depend on other things as well, besides views. Let us add another variable z, which might be the number of times the product was added to a shopping cart. Now it becomes harder to visualize the three-dimensional graph and arrive at the function.
But it is still manageable. Real-world problems, however, have a lot more variables. What happens when we have 4, 5, 20, 100 ... 1000 ... 10000 variables? It goes beyond what the human brain can comprehend and imagine. That is where computers excel at doing the work for us.
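The same least-squares idea extends to any number of input variables; the computer does not care how many columns there are. Here is a sketch with two inputs, views x and cart-adds z (again with invented numbers), using NumPy's general least-squares solver:

```python
import numpy as np

# Hypothetical inputs: page views (x), cart adds (z); output: sales (y)
X = np.array([[100, 10], [200, 25], [300, 30], [400, 50], [500, 55]], dtype=float)
y = np.array([13, 25, 32, 47, 54], dtype=float)

# Append a column of ones so the model is y = m1*x + m2*z + c
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
m1, m2, c = coef

# Predict sales for a new (views, cart adds) pair
predicted = m1 * 600 + m2 * 60 + c
print(predicted)
```

Nothing changes conceptually with 10 or 10,000 input columns; only the size of the matrix grows, which is exactly the kind of work computers handle easily and we cannot.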
In another example, the program is fed emails that are tagged as SPAM or NOT SPAM. Later, when the program is given a new email, it can tag it as SPAM or NOT SPAM. This is a form of machine learning called classification.
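A real spam filter would use a statistical model trained on millions of emails, but the flavor of classification can be shown with a toy word-counting sketch (the emails and labels below are made up):

```python
from collections import Counter

# Tiny hand-made training set: (email text, label)
training = [
    ("win a free prize now", "SPAM"),
    ("free money claim your prize", "SPAM"),
    ("meeting agenda for monday", "NOT SPAM"),
    ("project status and meeting notes", "NOT SPAM"),
]

# "Training": count how often each word appeared under each label
counts = {"SPAM": Counter(), "NOT SPAM": Counter()}
for text, label in training:
    counts[label].update(text.split())

def classify(text):
    # Score each label by how often it has seen this email's words before
    scores = {
        label: sum(counter[word] for word in text.split())
        for label, counter in counts.items()
    }
    return max(scores, key=scores.get)

print(classify("claim your free prize"))  # leans SPAM on this toy data
print(classify("monday meeting notes"))   # leans NOT SPAM
```

The principle is the same as in the sales example: training turns examples into a function, and the function then labels inputs it has never seen.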
Neural networks and Deep learning
Deep learning is a specialized form of machine learning. If you understand machine learning as producing a function that takes input and produces output, then think of deep learning as a network of such functions: multiple layers of nodes, where each node takes an input and produces an output. Each layer has multiple nodes, and the output of each layer is the input to the next. In the figure below, the nodes are analogous to the neurons in the brain, and the input-output pathways are like the neural network in our brain. Deep learning can solve much more complex problems than traditional machine learning. Think of each layer as solving the problem a little more and getting you closer to the final solution.
A classic example for deep learning is being able to recognize handwritten numbers or text.
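To see what "a network of functions" means, here is a minimal sketch in Python/NumPy (with random, made-up weights rather than trained ones) of a forward pass through two layers; each layer multiplies its input by a weight matrix and applies a simple non-linearity:

```python
import numpy as np

def relu(x):
    # A simple non-linearity: negative values become 0
    return np.maximum(0, x)

# Made-up weights for a tiny network: 3 inputs -> 4 hidden nodes -> 2 outputs
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 2))

x = np.array([1.0, 0.5, -0.2])  # one input example

hidden = relu(x @ W1)  # layer 1: each hidden node is a function of all inputs
output = hidden @ W2   # layer 2: the outputs are functions of the hidden values

print(hidden.shape, output.shape)
```

In a real digit recognizer the input would be pixel values, the output would be ten scores (one per digit), and training would adjust W1 and W2 instead of leaving them random.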
For more details, see the video listed in reference 1, which is an excellent introduction to deep learning.
Of late, the buzz is not about ML or deep learning but generative AI. Generative AI uses the above techniques to detect patterns in large amounts of data and generate new content. It uses a combination of supervised and unsupervised machine learning, deep learning and other new techniques. The most popular generative AI application today is ChatGPT (Generative Pre-trained Transformer). It can take questions in plain language and produce answers. At the very least, it is a better search engine, because it gives you direct answers (as opposed to making you sift through the hundreds of links that a typical search returns). Not only that, it can generate essays and images, write code and much more. For example, I can ask ChatGPT to "Write a sample Spring Boot application that consumes messages from RabbitMQ" and it will generate sample code. As of today, it is not always accurate, but there is a lot of potential.
The essence of AI
The essence of how most AI works is captured in three concepts: the cost function, gradient descent and backpropagation. I will not explain them here, but in the resources section I list two good videos that explain these topics much better than I ever could. If you understand these three concepts, at least intuitively, you are well on your way to understanding how AI works under the hood.
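As a small taste of the first two concepts, here is a sketch (with made-up data where the true slope is 3) that fits the slope m of y = m*x by gradient descent: the cost function is the mean squared error, and each step nudges m in the direction that lowers the cost. Backpropagation is, roughly, how deep networks compute these same gradients layer by layer.

```python
# Made-up data that follows y = 3*x exactly
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

m = 0.0              # initial guess for the slope
learning_rate = 0.01

for _ in range(1000):
    # Cost: mean of (m*x - y)^2; its gradient w.r.t. m is 2*mean(x*(m*x - y))
    grad = 2 * sum(x * (m * x - y) for x, y in zip(xs, ys)) / len(xs)
    m -= learning_rate * grad  # step "downhill" on the cost surface

print(round(m, 4))  # converges close to the true slope, 3
```

Training a deep network is this same loop, repeated over millions of weights instead of a single slope.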
Where computers excel, far beyond the human mind, is in processing, absorbing, remembering and detecting patterns in large amounts of data. So in some ways computers and AI can enable us to do more than we can with our own brains. Computers and AI, however, cannot do anything without data. The human brain, on the other hand, can think and do well even when there is insufficient data.
Looking at it from a historical angle, artificial intelligence seems like just another step in the evolution of human information processing.
The first generation was when humans developed scripts and gained the ability to write and read information. This happened about 5000 years ago, when humans developed symbols and started carving them on stones. Around the same time, numbers were also developed. Humans were no longer limited by what their brains could store. Information could be stored outside the human brain and retrieved when needed.
The second generation was publishing on paper. With paper and ink, the pace at which information could be published and shared increased substantially. Early writings were done by hand. By the 15th century, printing presses had automated the publishing of copies of books.
The third generation was the invention of computers to digitally store information (1940s/50s).
The fourth generation is the modern internet era, where the scale of digital information has increased many times over. The scale and size of the data is simply beyond the human brain's capacity to learn and comprehend.
Every step of evolution expanded the human mind and enabled it to do more.
I see artificial intelligence as the fifth generation of this evolution, where AI tools process all the massive digitized data and help humans make better decisions and become more productive.
In summary, artificial intelligence is a field based on math and computer science that tries to mimic what the human brain does. You do not need to be an expert to use and benefit from artificial intelligence. As you have seen in the case of ChatGPT, the experts will develop the tools. Whether you are a doctor, salesperson, certified accountant or stockbroker, you will use AI tools to make yourself more productive. But it is very useful to understand the underlying technologies, at least at a high level. That will put you in a better position to decide when to rely on AI and when not to.
Artificial intelligence will not replace human functions that require even a little bit of thinking. But it will certainly increase our productivity.
Resources
1. Neural networks and deep learning
2. Spelled out introduction to neural networks and back propagation
3. Sapiens: A brief history of humankind