Hands-On Artificial Intelligence for Banking

What is AI?

AI, also known as machine intelligence, is all about creating machines that demonstrate the kind of intelligence usually displayed by humans in the form of natural intelligence. John McCarthy coined the term artificial intelligence in 1955.

AI has witnessed two winters so far: the first in the 1970s, with the reduction of funding by the Defense Advanced Research Projects Agency, or DARPA (https://www.darpa.mil/), then known as ARPA; and the second in the late 1980s, with the abandonment of expert systems by major IT corporations such as Texas Instruments (http://www.ti.com/) and Xerox (https://www.xerox.com/).

In a way, AI aids in the process of transferring decision making from humans to machines, based on predefined rules. In the field of computer science, AI is also defined as the study of intelligent agents. An intelligent agent is any device that learns from the environment and makes decisions based on what it has learned to maximize the probability of achieving its predefined goals.

AI is capable of solving an extremely broad range of problems. These problems include, but are not limited to, simple mathematical puzzles, finding the best route from one location to another, understanding human language, and processing huge amounts of research data to produce meaningful reports. The following is a list of capabilities that the system must have in order to solve these problems along with a brief description of what each means:

  • Reasoning: The ability to solve puzzles and make logic-based deductions
  • Knowledge representation: The ability to process knowledge collected by researchers and experts
  • Planning: The ability to set goals and define ways to successfully achieve them
  • Learning: The ability to improve algorithms by experience
  • Natural Language Processing (NLP): The ability to understand human language
  • Perception: The ability to use sensors and devices, such as cameras, microphones, and more, in order to acquire enough input to understand and interpret different features of the environment
  • Motion: The ability to move around

How does a machine learn?

Let's take a quick look at the basics of machine learning. There are three methods that a machine can use in order to learn: supervised learning, unsupervised learning, and reinforcement learning, as described in the following list:

  • Supervised learning is based on the concept of learning from labeled training data. The training data is represented as a pair consisting of the supplied input (also known as a feature vector, a vector of numbers that represents the input data numerically as features) and the expected output data (also known as labels). Each pair is tagged with a label. The following diagram illustrates the supervised learning method (a short code sketch contrasting supervised and unsupervised learning appears after this list):
  • Unsupervised learning is based on a situation where the training data is provided without any underlying information about the data, which means the training data is not labeled. The unsupervised learning algorithm will try to find the hidden meaning for this training data. The following diagram illustrates the unsupervised learning method:
  • Reinforcement learning is a machine learning technique that does not rely on labeled training data. This method is based on two things: an agent and a reward for that agent. The agent is expected to draw on its experience in order to earn a reward. The following diagram depicts the reinforcement learning method:
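To make the distinction concrete, here is a minimal, hypothetical sketch using scikit-learn (a library choice assumed for illustration, not prescribed by this book): the supervised model is given feature vectors together with labels, while the unsupervised algorithm receives only the feature vectors and has to find structure on its own. The tiny dataset and column meanings are made up.

```python
# A minimal sketch contrasting supervised and unsupervised learning.
# The in-line dataset and the use of scikit-learn are illustrative assumptions.
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Feature vectors: [monthly_income, outstanding_debt] (made-up numbers)
X = [[5000, 200], [1200, 900], [7000, 150], [900, 1100]]
y = [0, 1, 0, 1]  # labels: 0 = repays on time, 1 = defaults

# Supervised: the model learns the mapping from features to labels
clf = LogisticRegression().fit(X, y)
print(clf.predict([[4000, 300]]))  # predicted label for a new customer

# Unsupervised: only the features are provided; the algorithm finds groupings
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)  # cluster assignments discovered without any labels
```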

Software requirements for the implementation of AI

The open source movement (which will be discussed in the Importance of accessible banking section) propels software development. The movement is coupled with improvements in hardware (for example, GPUs, CPUs, storage, and networking), and it is supported by countless engineers who work on improving hardware performance and internet connectivity. Together, these advances have brought AI algorithms to the point where they deliver near-human performance.

The following diagram depicts the typical technology stack that we should consider whenever we implement software to perform machine learning projects:

The following table breaks down several key technologies that contribute to the different software components mentioned in the preceding diagram:

With our brief introduction to the tools, technologies, and packages that we will use throughout the course of this book complete, let's now move on to explore an important area of AI—deep learning. The following section will explain deep learning and neural networks in detail.

Neural networks and deep learning

In addition to the open source movement, research breakthroughs in neural networks have played a big role in improving the accuracy of decision making in AI algorithms. You can refer to Deep Learning (https://www.deeplearningbook.org/) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville for a more mathematical and formal introduction, and to Deep Learning with Keras (https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-keras?utm_source=github&utm_medium=repository&utm_campaign=9781787128422) by Antonio Gulli and Sujit Pal for a concise, developer-oriented treatment.

Deep learning is a specialized subfield of machine learning. Its methods are built on neural networks: computing systems loosely inspired by the structure of the human brain.

Online customer support by banks via a mobile or web application chatbot is an excellent example of deep learning in banking. Such applications (that is, chatbots) are powerful when it comes to understanding the context of customer requests, preferences, and interests. The chatbot is connected to backend applications that interact with data stores. Based on the customer's inputs or selection of services, the chatbot presents to the customer various alternative sub-services to choose from.

Chatbots, like other deep learning applications, work in layers. This can be compared to learning a language. For instance, once a person masters the alphabet by rigorously learning how to identify each letter, they can move on to the next layer of complexity: words. The person starts with short words and then longer ones. Upon mastering words, the person starts forming sentences, understanding grammatical concepts at different layers of complexity. Once they reach the top of this hierarchy of layers, the person has mastered the language.

You might have noticed that in each phase or layer of the hierarchy, the learning becomes more complex. Each layer builds on the learning or knowledge gathered from the previous layer of complexity. This is how deep learning works: the program keeps learning, forming more knowledge at each new layer of complexity based on the knowledge received from the previous layer. This layered complexity is where the word deep comes from. Deep learning is not tied to one learning method; deep networks can be trained in supervised, unsupervised, or reinforcement settings.
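The layered idea maps directly onto how a deep network is written in code. The following is a minimal sketch using Keras (the library covered in the reference above); the layer widths, the 10-feature input, and the binary target are illustrative assumptions, not a model from this book.

```python
# A minimal sketch of a layered (deep) feed-forward network in Keras.
# Layer widths and the 10-feature input are illustrative assumptions.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(10,)),                    # input: 10 features per record
    keras.layers.Dense(32, activation="relu"),   # first hidden layer: simple patterns
    keras.layers.Dense(16, activation="relu"),   # second layer: combinations of patterns
    keras.layers.Dense(1, activation="sigmoid"), # output: probability of the target class
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()  # each layer builds on the representation produced by the previous one
```

Each Dense layer consumes the output of the layer before it, which is the code-level analogue of the letters-to-words-to-sentences hierarchy described above.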

The major impact of deep learning is that model performance improves because the model can accommodate more complex reasoning. We want financial decisions to be made accurately. This makes it more cost-effective to give the shareholders of banks a reasonable return while balancing the interests of the bank's clients.

What we expect from a smart machine is as simple as input, process, and output, as shown in the following diagram:

In most financial use cases, we deploy supervised learning, which resembles the process of training an animal: you reward a correct outcome and discourage an incorrect one. That's why we need the outcome (that is, the target variable) for training to happen.
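As a hypothetical illustration of what "having the outcome" looks like in practice, the snippet below builds a tiny labeled table with pandas; the column names, values, and the churn scenario are all made up for the example.

```python
# A hypothetical labeled training set: every row carries the outcome
# (target variable) that supervised learning needs in order to train.
import pandas as pd

training_data = pd.DataFrame({
    "account_age_months": [24, 3, 60, 12],
    "avg_monthly_balance": [3500.0, 120.0, 9800.0, 450.0],
    "churned": [0, 1, 0, 1],  # target variable: 1 = the customer left the bank
})

X = training_data.drop(columns="churned")  # feature vectors
y = training_data["churned"]               # outcomes used to score each prediction
```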

Hardware requirements for the implementation of AI

While setting the budget for the hardware required by a bank, you need to ensure that it covers the right configurations. This will allow you to deliver what was promised in terms of financial results or time to market, especially now that you are about to start a bank from scratch!

You'd better be sure that every penny works, given that the economic pressures on banks are pretty high. To do any of this, we need to understand the contribution that hardware makes to AI so that we can secure the right resources.

Graphics processing units

Besides the software and algorithms, the use of a Graphics Processing Unit (GPU) and a Solid-State Drive (SSD) helps to speed up machine learning. Together, GPUs and SSDs make it feasible for a computer to train and run models at practical speeds.

A GPU is a specially designed circuit that can process calculations in parallel. This applies to computer graphics, where many pixels need to be processed simultaneously in order to produce a full picture. To visualize this, suppose that there are 10 pixels to be processed. We can either process each of the 10 pixels one by one, or we can process all 10 simultaneously.
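The contrast can be sketched in code. The snippet below uses NumPy vectorization on the CPU as a stand-in for GPU-style parallelism; the pixel values and the brightening operation are assumptions made purely for illustration.

```python
# A minimal sketch of the sequential-versus-parallel idea from the text,
# using NumPy vectorization as an analogy for GPU parallelism.
import numpy as np

pixels = np.arange(10, dtype=np.float32)  # ten pixel values to brighten

# Sequential: process each pixel one by one
brightened_seq = np.empty_like(pixels)
for i in range(len(pixels)):
    brightened_seq[i] = pixels[i] * 1.5

# Parallel-style: one vectorized operation applied to all pixels at once;
# on a GPU (for example, via CuPy or a deep learning framework) the same
# expression is executed across many threads simultaneously.
brightened_vec = pixels * 1.5

assert np.allclose(brightened_seq, brightened_vec)
```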

The CPU has the unique strength of a fast processing time per pixel, while the GPU has the strength of many threads that can handle flat data all at once. Both CPUs and GPUs can perform parallel data processing to varying degrees. The following table shows the difference between sequential and parallel data processing:

Aside from being great at processing images, GPUs are also leveraged for deep learning. Although the word deep refers to the number of layers in a neural network, deep networks also tend to be wide, with many variables (features) describing each input record, which is exactly the kind of workload that parallel hardware handles well.

When used in combination with a GPU, an SSD also improves the speed at which data is read and written for the CPU/GPU to process.

Solid-state drives

Another hardware requirement for machine learning is a storage device called an SSD. A traditional hard disk has a mechanical motor that positions the head that reads or writes data at a designated location on the magnetic platter. In contrast, an SSD reads and writes data using electric currents on a circuit, with no moving parts. Compared with the mechanical movement of motors, an SSD retrieves data roughly 20 times faster.

For students of operations research, comparing the two is as simple as identifying the hardware capacity, which is akin to how we design a factory: find the capacity and reduce the bottlenecks as much as possible!

Modeling approach—CRISP-DM

CRISP-DM stands for the Cross-Industry Standard Process for Data Mining. Data mining is the process of exploring large amounts of data to identify patterns that can be applied to new data to generate the desired output. To create the models in this book, we will use the CRISP-DM modeling approach. This will help us maintain a uniform method of implementing machine learning projects. The following diagram depicts project execution using the CRISP-DM approach in a machine learning project:

As you can see in the preceding diagram, there are various phases of the CRISP-DM approach. We can explain them in detail, as follows:

  1. Business Understanding: This phase involves defining the business objectives for the project. During this phase, you clarify the questions related to the core business objectives. For example, a core business objective may be to predict when customers will leave a particular website, using historical data of their interactions with the website. A relevant question to clarify might be whether the current payment interface is the reason customers navigate away from the website. Business success criteria are also laid out during this phase of the project execution.
  2. Data Understanding: This phase involves understanding historical data that is mined in the database or data store. The data is analyzed for its size, format, quantity, number of records, significance in relation to business, fields, source of data, and more.
  3. Data Preparation: This phase involves raising the quality of the data to the level required for the machine learning algorithms to process it. Examples of data preparation include formatting data in the desired format, rounding numbers to an acceptable degree of precision, and preparing derived attributes.
  4. Modeling: This phase involves selecting a modeling technique or algorithm to be applied. A modeling algorithm is used to find a function that, when applied to an input, produces the desired output.
  5. Evaluation: This phase involves assessing the accuracy of the model trained in the previous phase. Any required revisions to the model are made in order to increase its efficiency and accuracy.
  6. Deployment: This phase involves defining a deployment strategy for the trained model so that it works on new data in the live environment. The deployed model is monitored for accuracy. A minimal code sketch mapping these phases onto a typical workflow follows this list.
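As a rough, hypothetical illustration of how the later phases translate into code, the following sketch walks a synthetic dataset through data understanding, data preparation, modeling, evaluation, and a stand-in for deployment using scikit-learn. The dataset, model choice, accuracy threshold, and file name are all assumptions made for the example, not steps prescribed by CRISP-DM itself.

```python
# A hypothetical walkthrough of the later CRISP-DM phases on synthetic data.
# The dataset, model choice, and accuracy threshold are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import joblib

# Data Understanding: inspect the (synthetic) dataset's size and shape
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
print(X.shape, y.shape)

# Data Preparation: split the data and scale the features
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Modeling: fit an algorithm that maps inputs to the desired output
model = LogisticRegression().fit(X_train, y_train)

# Evaluation: check accuracy on held-out data before releasing the model
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"accuracy: {accuracy:.2f}")

# Deployment (stand-in): persist the model so a live system can load and monitor it
if accuracy > 0.8:
    joblib.dump((scaler, model), "trained_model.joblib")
```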

Having briefly covered what AI is, how machines learn, and what implementing AI requires, it is now time to look at banking as a sector or industry. In the following section, we will explore the various types of banking and the challenges involved in the banking sector.