
Contents:
  1. Introduction
  2. Searching Strategies
  3. Local Search
  4. Adversarial Search
  5. Knowledge Based Agents
  6. First Order Logic
  7. Quantifying Uncertainty
  8. Fuzzy Sets

Introduction

Definition
AI can be characterized along two dimensions: thinking versus acting, and humanly versus rationally.
  • Thinking humanly — cognitive modeling. Systems should solve problems the same way humans do.
  • Thinking rationally — the use of logic. This approach needs to model uncertainty and deal with complexity.
  • Acting humanly — the Turing Test approach.
  • Acting rationally — the study of rational agents: agents that maximize the expected value of their performance measure given what they currently know.


Foundation of Artificial Intelligence
  • Philosophy: logic, methods of reasoning, foundations of learning, language, rationality
  • Mathematics: logic (formal representation and proof), algorithms, computation, (un)decidability, (in)tractability, probability
  • Economics: formal theory of rational decisions
  • Neuroscience: plastic physical substrate for mental activity
  • Psychology: adaptation, phenomena of perception and motor control, experimental techniques (e.g. psychophysics)
  • Computer Engineering: how do we build computers efficient enough to support AI?
  • Control Theory and Cybernetics: simple optimal agent designs
  • Linguistics: knowledge representation, grammar

An agent is something that acts and has its own behaviour. More precisely, it is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. A rational agent acts to achieve the best (expected) outcome. This rational-agent approach has two advantages: it is more general than the "laws of thought" approach, and it is more amenable to scientific development.

In building rational agents, the following, known as the PEAS description, must be defined:
  • Performance Measure
  • Environment
  • Actuators
  • Sensors

The structure of an agent is the architecture (hardware: sensors and actuators) plus the program (software). There are four basic types of agents, each of which can be generalized into a learning agent:
  • Simple Reflex Agent: Acts based on the current percept only, ignoring the rest of the percept history.
  • Model-Based Agent: Maintains a model of the world/environment built from the percept and action history, and acts based on that world model.
  • Goal-Based Agent: Acts based on information gathered from the world model combined with goal information.
  • Utility-Based Agent: Acts based on its own happiness/utility, which also serves as the measurement of the agent's performance.

Searching Strategies

Problem Solving Agents
Solving a problem can be simplified if the agent can adopt a goal and aim to satisfy it. Goal formulation, the first step in problem solving, is done at the beginning. Next comes problem formulation: formally defining the problem by five components (a code sketch follows the list):
  • Initial State: The state where the agent starts
  • Actions: Moves which the agent can use/take
  • Transition Model: A description of what each action does (the state that results from performing an action in a state)
  • Goal Test: Check whether a state is a goal state
  • Path Cost: Measure the cost to achieve the goal
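To make the five components concrete, here is a minimal Python sketch of a problem definition. The class and method names are illustrative assumptions for this post, not a standard library API:

```python
class Problem:
    """Illustrative problem definition: the five components map
    directly onto attributes and methods."""

    def __init__(self, initial_state, goal_states):
        self.initial_state = initial_state   # Initial State
        self.goal_states = set(goal_states)

    def actions(self, state):
        """Actions: the moves available in `state`."""
        raise NotImplementedError

    def result(self, state, action):
        """Transition Model: the state reached by doing `action` in `state`."""
        raise NotImplementedError

    def goal_test(self, state):
        """Goal Test: check whether `state` is a goal state."""
        return state in self.goal_states

    def step_cost(self, state, action, next_state):
        """Path Cost: the total cost is the sum of step costs along the path."""
        return 1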

Searching for Solution
A search tree can be used to model the possible sequences of actions:
  • Root: Initial state
  • Branches: The actions
  • Nodes: Result of an action
  • Expanding: The process of generating a node's child nodes by applying each available action.

The following terms are used frequently:
  • Leaf Node: Node that doesn't have children.
  • Frontier: The set of all leaf nodes available for expansion at any given point.
  • Explored Set: The set of all expanded nodes.

We can measure the performance of a search algorithm in four ways:
  • Completeness: Is the algorithm guaranteed to find a solution when there is one?
  • Optimality: Does the strategy find the optimal solution?
  • Time Complexity: How long does it take to find a solution?
  • Space Complexity: How much memory is needed to perform the search?

Uninformed Search Strategies and Informed Search Strategies
The former does not require any knowledge or information about the states beyond the problem definition. The latter, on the other hand, uses problem-specific knowledge to help find the solution more efficiently. This knowledge takes the form of an evaluation function f(n) and a heuristic function h(n).

Uninformed Search Strategies - Breadth-First Search (BFS)
The shallowest unexpanded node is expanded first, repeatedly, until the goal node is generated. At this moment, the solution path has been found.
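A minimal sketch of BFS, assuming the graph is given as an adjacency-list dict and applying the goal test when a node is generated, as described above:

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search. `graph` maps node -> list of neighbors."""
    if start == goal:
        return [start]
    frontier = deque([start])     # FIFO queue of leaf nodes awaiting expansion
    parent = {start: None}        # doubles as the set of generated nodes
    while frontier:
        node = frontier.popleft()             # shallowest node
        for child in graph.get(node, []):
            if child not in parent:
                parent[child] = node
                if child == goal:             # goal test on generation
                    path = [child]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                frontier.append(child)
    return None

# Example on a hypothetical graph:
# bfs({"S": ["A", "B"], "A": ["G"], "B": []}, "S", "G")  ->  ["S", "A", "G"]
```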

  Example 1 - BFS Simulation
Simulate the node expansion from S to G using BFS:
[Figure: BFS example graph]

Solution:
[Figure: BFS solution tree]

The path found is A → B → C → G.

Uniform-Cost Search (UCS)
Uniform-Cost Search is one of the uninformed search strategies. In this strategy, the node with the lowest path cost is expanded first, repeatedly, until the goal node is selected for expansion.
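A minimal UCS sketch using a priority queue; here the adjacency list is assumed to carry the step costs, and the goal test is applied when a node is selected for expansion (which is what makes UCS optimal):

```python
import heapq

def ucs(graph, start, goal):
    """Uniform-cost search. `graph` maps node -> list of (neighbor, step_cost)."""
    frontier = [(0, start, [start])]   # priority queue ordered by path cost g(n)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:               # goal test on expansion
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for neighbor, step in graph.get(node, []):
            if neighbor not in explored:
                heapq.heappush(frontier, (cost + step, neighbor, path + [neighbor]))
    return None
```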

  Example 2 - UCS Simulation
Simulate the node expansion from S to G using UCS:
[Figure: UCS example graph]

Solution:
[Figure: UCS solution tree]

The path found is A → D → F → G.

Depth-First Search (DFS)
Depth-First Search is one of the uninformed search strategies. In this strategy, the deepest unexpanded node is expanded repeatedly until the goal node is about to be expanded.
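A minimal DFS sketch using an explicit stack (a recursive formulation is equivalent), with the same adjacency-list assumption as before:

```python
def dfs(graph, start, goal):
    """Depth-first search: the most recently generated (deepest) node
    is always expanded next, thanks to the LIFO stack."""
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()       # deepest unexpanded node
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for child in reversed(graph.get(node, [])):
            if child not in visited:
                stack.append((child, path + [child]))
    return None
```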

  Example 3 - DFS Simulation
Simulate the node expansion from S to G using DFS:
[Figure: DFS example graph]

Solution:
[Figure: DFS solution tree]

The path found is A → B → C → G.

Other Uninformed Search Strategies
These uninformed strategies are similar to DFS, but with a twist. First, there is Depth-Limited Search (DLS). This is basically DFS with a depth limit l, where nodes at depth l are treated as having no successors. This limitation solves the infinite-path problem. Second, there is Iterative-Deepening Search (IDS). This is basically DLS, but the limit is gradually increased until the depth of the shallowest solution d is reached.
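A sketch of both ideas, again assuming an adjacency-list dict; the cycle check along the current path is a simplification for illustration:

```python
def depth_limited_search(graph, node, goal, limit, path=None):
    """DFS that treats nodes at depth `limit` as having no successors."""
    path = path or [node]
    if node == goal:
        return path
    if limit == 0:
        return None                    # cutoff reached
    for child in graph.get(node, []):
        if child not in path:          # avoid cycles along the current path
            result = depth_limited_search(graph, child, goal, limit - 1,
                                          path + [child])
            if result:
                return result
    return None

def iterative_deepening_search(graph, start, goal, max_depth=50):
    """Runs DLS with limit 0, 1, 2, ... until a solution is found."""
    for limit in range(max_depth + 1):
        result = depth_limited_search(graph, start, goal, limit)
        if result:
            return result
    return None
```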

Greedy
Greedy (best-first) search expands the node that appears to be closest to the goal; in other words, f(n) = h(n). It keeps expanding the node with the least h(n) until the goal node is about to be expanded. At this moment, the solution path has been found.

A* Search
A* search expands the node that minimizes the combination of the cost to reach the node (the path cost g(n)) and the estimated cost to get from the node to the goal (the heuristic h(n)); in other words, f(n) = g(n) + h(n). It keeps expanding the node with the least g(n) + h(n) until the goal node is about to be expanded. At this moment, the solution path has been found.
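A minimal A* sketch; the heuristic is assumed to be given as a dict, and replacing f(n) = g(n) + h(n) with h(n) alone turns this into greedy best-first search:

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search. `graph` maps node -> list of (neighbor, step_cost);
    `h` maps node -> estimated cost to the goal. Nodes are expanded
    in order of f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, node, path)
    explored = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in explored:
            continue
        explored.add(node)
        for neighbor, step in graph.get(node, []):
            if neighbor not in explored:
                new_g = g + step
                heapq.heappush(frontier,
                               (new_g + h[neighbor], new_g, neighbor,
                                path + [neighbor]))
    return None
```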

Local Search

In the search strategies discussed so far, the solution takes the form of a path. However, in many optimization problems, the goal state itself is the solution. For example, if we were to arrange 8 queens on a board such that none of them attacks another, the only thing that matters is the final board configuration, not the order in which the queens were added.

The main idea of local search is to keep a single node (state) and repeatedly move to one of its neighbors, improving the node until the best state is reached. Local search does not maintain a search tree, so it requires little memory, and it can often find reasonable solutions in large or infinite state spaces. Local search algorithms are useful for solving pure optimization problems, where an objective function is used to judge how good a state is.

Hill Climbing (Greedy Local Search)
It starts by choosing a random node and keeps moving in the direction of increasing value until a peak is reached. There are problems with this algorithm, though. First, the peak found could be a local maximum/minimum rather than the global one. Second, there could be more than one local maximum/minimum. Third, several adjacent nodes could have the same value, creating a plateau.
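A minimal sketch of the basic loop, parameterized by a neighbor generator and an objective function (both hypothetical stand-ins for a real problem):

```python
def hill_climbing(initial, neighbors, value):
    """Greedy local search: repeatedly move to the best-valued neighbor,
    stopping when no neighbor improves on the current state (a peak,
    possibly only a local maximum)."""
    current = initial
    while True:
        best = max(neighbors(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current             # peak (or plateau) reached
        current = best

# Illustrative objective: maximize -(x - 3)^2 over integer states.
# hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)  ->  3
```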

However, there are variants of hill climbing that overcome these problems:
  • Sideways Move: Escapes from a plateau where the best successor has the same value as the current state.
  • Random Restart: Conducts a series of hill-climbing searches from randomly generated initial states until a goal is found, keeping the best result.
  • Stochastic: Chooses at random from among the uphill moves, typically with probability proportional to the steepness of the move.

There are other search methods similar to this:
  • Local Beam Search: Maintains k states; at each step it generates all their successors, selects the k best, and shares information among the parallel searches.
  • Stochastic Beam Search: The k successors are chosen randomly (biased toward better ones), which helps alleviate the problem of the states clustering in the same part of the state space.

Genetic Algorithms
It is a variant of stochastic beam search inspired by natural selection. It starts with a population of k randomly generated states; successor states are produced by combining pairs of parent states (crossover) and randomly mutating the result. The objective function is called a fitness function: the higher the value, the better the state.
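A minimal sketch on bit strings, using fitness-proportional selection, single-point crossover, and per-bit mutation. The OneMax fitness (count the 1 bits) is just an illustration; fitness values are assumed non-negative:

```python
import random

def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=200,
                      p_mut=0.05):
    """A minimal genetic algorithm over bit strings."""
    population = [[random.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(ind) for ind in population]   # fitness-proportional
        new_population = []
        for _ in range(pop_size):
            mom, dad = random.choices(population, weights=weights, k=2)
            cut = random.randrange(1, n_bits)            # single-point crossover
            child = mom[:cut] + dad[cut:]
            child = [b ^ 1 if random.random() < p_mut else b
                     for b in child]                     # per-bit mutation
            new_population.append(child)
        population = new_population
    return max(population, key=fitness)

# OneMax: fitness is simply the number of 1 bits.
# best = genetic_algorithm(sum)   # converges toward the all-ones string
```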

Adversarial Search

Alpha-Beta Pruning
In games, finding the optimal strategy lets us play as well as possible. When finding our best possible moves, we assume that the enemy also plays optimally, so the minimax strategy can be used. Alpha-beta pruning can be seen as an optimized version of the minimax algorithm: both compute the same move, but alpha-beta pruning can prune, i.e. skip the evaluation of, branches that cannot influence the final decision.
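A minimal sketch of minimax with alpha-beta pruning; the `children` and `evaluate` callbacks are assumptions standing in for the game's rules and evaluation function:

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning. `children(node)` returns successor
    states and `evaluate(node)` scores leaves. Branches that can no longer
    affect the decision (beta <= alpha) are pruned."""
    succ = children(node)
    if depth == 0 or not succ:
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if beta <= alpha:
                break                  # beta cutoff: MIN never allows this branch
        return value
    else:
        value = math.inf
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if beta <= alpha:
                break                  # alpha cutoff
        return value

# Example on a tiny game tree stored as nested lists (leaves are numbers):
# children = lambda n: n if isinstance(n, list) else []
# evaluate = lambda n: n
# alphabeta([[3, 5], [2, 9]], 10, -math.inf, math.inf, True,
#           children, evaluate)  ->  3   (the leaf 9 is pruned)
```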

  Example 6 - Alpha-Beta Pruning Simulation
[Figure: alpha-beta pruning simulation]

Knowledge Based Agents

Introduction
A knowledge-based agent keeps track of what it knows in a knowledge base. We can tell it facts and ask it questions; answering involves inference, the derivation of new sentences from old ones. The agent takes a percept as input and returns an action.

Building a Knowledge Based Agent
  • Declarative Approach: Starting with an empty knowledge base, the agent designer tells the agent sentences one by one until the agent knows what to do.
  • Procedural Approach: Desired behavior is directly encoded as a program code.

Propositional Logic
It is the simplest logic. It uses five connectives: negation, conjunction, disjunction, implication, and biconditional (biimplication). A sentence can be an atomic sentence (a single proposition symbol, containing no connectives) or a complex sentence (built from atomic sentences using connectives).
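As a small illustration, a truth-table enumeration in Python; the sentence is encoded as an ordinary boolean function, using the identity (A → B) ≡ (¬A ∨ B):

```python
from itertools import product

def truth_table(symbols, sentence):
    """Enumerate all models of the proposition symbols and evaluate a
    complex sentence (here a plain Python function of a model dict)."""
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        print(model, "->", sentence(model))

# (P ∧ Q) → ¬R, rewritten as ¬(P ∧ Q) ∨ ¬R:
truth_table(["P", "Q", "R"],
            lambda m: (not (m["P"] and m["Q"])) or (not m["R"]))
```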

First Order Logic

Various Logics
Language            | Ontological Commitment              | Sentence Value
Propositional Logic | Facts                               | T/F/U
First Order Logic   | Facts, Objects, Relations           | T/F/U
Temporal Logic      | Facts, Objects, Relations, Times    | T/F/U
Probability Theory  | Facts                               | Degree of Belief ∈ [0, 1]
Fuzzy Logic         | Facts with Degree of Truth ∈ [0, 1] | Known Interval Value

Introduction
First-Order Logic consists of few basic elements:
  • Constants: Specific entities or objects (e.g. Jack, Tuna).
  • Predicates: Relations among objects (e.g. Owns, Kills).
  • Functions: Mappings from objects to objects.
  • Variables: x, y, a, ...
  • Connectives: The same connectives used in Propositional Logic.
  • Equality: =
  • Quantifiers: ∀ (universal), ∃ (existential)

  Example 7 - FOL, CNF and Resolution Tree
Given sentences as premises:
  1. Jack owns a dog.
  2. Every dog owner is an animal lover.
  3. No animal lover kills an animal.
  4. Either Jack or Curiosity killed the cat, who is named Tuna.

Write the FOL and CNF for each premise, and prove by resolution that Curiosity killed the cat!

Solution:
  1. Jack owns a dog.
    FOL: ∃x Dog(x) ∧ Owns(Jack, x)
    CNF: Dog(D), Owns(Jack, D) — where D is a Skolem constant
  2. Every dog owner is an animal lover.
    FOL: ∀x (∃y Dog(y) ∧ Owns(x, y)) ⇒ AnimalLover(x)
    CNF: ¬Dog(y) ∨ ¬Owns(x, y) ∨ AnimalLover(x)
  3. No animal lover kills an animal.
    FOL: ∀x AnimalLover(x) ⇒ (∀y Animal(y) ⇒ ¬Kills(x, y))
    CNF: ¬AnimalLover(x) ∨ ¬Animal(y) ∨ ¬Kills(x, y)
  4. Either Jack or Curiosity killed the cat, who is named Tuna.
    FOL: Kills(Jack, Tuna) ∨ Kills(Curiosity, Tuna), together with Cat(Tuna)
    CNF: Kills(Jack, Tuna) ∨ Kills(Curiosity, Tuna), and Cat(Tuna)

We also need the background fact that cats are animals: ∀x Cat(x) ⇒ Animal(x), whose CNF is ¬Cat(x) ∨ Animal(x).

Since we want to prove that Curiosity killed the cat, we add the premise ¬Kills(Curiosity, Tuna), the negation of what we want to prove, and derive the empty clause: resolving it with clause 4 gives Kills(Jack, Tuna); Cat(Tuna) and the background fact give Animal(Tuna); these resolve with clause 3 to give ¬AnimalLover(Jack); resolving with clause 2 gives ¬Dog(y) ∨ ¬Owns(Jack, y); and resolving with Dog(D) and then Owns(Jack, D) produces the empty clause, so Kills(Curiosity, Tuna) must hold.
[Figure: resolution tree]


Quantifying Uncertainty

Acting Under Uncertainty
An agent may need to deal with uncertainty, whether due to partial observability, nondeterminism, or both. Its knowledge can, at best, provide only a degree of belief in the relevant sentences. To deal with degrees of belief we use probability theory, which provides a way of summarizing the uncertainty.

Sample Space
In probability theory, the set of all possible worlds (all possible outcomes) is called the sample space Ω. Every possible world ω has a probability between 0 and 1, and the probabilities of all possible worlds sum to one:

0 ≤ P(ω) ≤ 1 for every ω, and Σ_{ω ∈ Ω} P(ω) = 1

Propositions
There are also propositions, which are sets of possible worlds. The probability of a proposition is the sum of the probabilities of the worlds in which it holds. For any proposition φ:

P(φ) = Σ_{ω ∈ φ} P(ω)

Unconditional & Conditional Probability
There are two kinds of probability: unconditional and conditional. The former doesn't require prior knowledge and is not affected by any event; when flipping a coin or rolling a die, the outcome is not influenced by anything else. The latter depends on prior knowledge of a specific event (hence the name conditional); once a red ball has been taken out of a bag of colored balls, the probability of drawing a ball of a given color changes.
Bayes Theorem
Let A and B be events, where B affects A. Then the probability of A happening given B is:

P(A|B) = P(B|A) · P(A) / P(B)

Random Variables
In probability theory, variables (like the sum of n dice rolls) are called random variables. Every random variable has a domain, the set of all possible values it can take. For instance, the domain of a single die roll is {1, 2, 3, 4, 5, 6}.
Probability Distribution
It is the set of probabilities of all possible values of a random variable.
Probability Density Function
It is a way to write a probability distribution when there are infinitely many values (a continuous domain).
Joint Probability Distribution
It is a way to write a probability distribution over more than one variable at once.

  Example 8 - Bayes Theorem
Suppose we have three rigged card packs. Each pack contains 5 cards, and:
  • Pack I contains 1 rare card, and 4 common cards.
  • Pack II contains 2 rare cards, and 3 common cards.
  • Pack III contains no rare cards, and 5 common cards.

Suppose we draw a random card from one of these card packs. We are only told that the card we drew is a rare card. What is the probability of that rare card drawn from Pack I?

Solution:
First, let's define the events:
  • C1 as the event of choosing pack 1.
  • C2 as the event of choosing pack 2.
  • C3 as the event of choosing pack 3.
  • R as the event of taking a rare card.

So, from the problem, we know that:
  • P(C1) = P(C2) = P(C3) = 1/3, since the pack is chosen at random
  • P(R|C1) = 1/5, P(R|C2) = 2/5, P(R|C3) = 0/5

What we want to calculate is P(C1|R). By Bayes theorem:

P(C1|R) = P(R|C1)P(C1) / (P(R|C1)P(C1) + P(R|C2)P(C2) + P(R|C3)P(C3))
        = (1/5)(1/3) / ((1/5)(1/3) + (2/5)(1/3) + (0)(1/3))
        = (1/15) / (3/15)
        = 1/3

Hence, the probability of the rare card to be drawn from Pack I is one third.
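The same computation can be checked numerically; the variable names below are just for illustration:

```python
priors = {"C1": 1/3, "C2": 1/3, "C3": 1/3}      # pack chosen uniformly at random
likelihood = {"C1": 1/5, "C2": 2/5, "C3": 0/5}  # P(R | pack)

# Total probability of drawing a rare card, then Bayes theorem for P(C1 | R):
p_rare = sum(priors[c] * likelihood[c] for c in priors)
posterior_c1 = priors["C1"] * likelihood["C1"] / p_rare
print(posterior_c1)   # 0.333... = 1/3
```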


Fuzzy Sets

Introduction
Traditional set theory requires an element to either be a member of a set or not. For instance, binary-valued logic only allows parameter values of either 0 or 1. Human reasoning isn't always this exact, so we can use fuzzy sets and fuzzy logic to approximate reasoning.

Fuzzy sets extend two-valued sets to handle the concept of partial truth, which models the gradations of certainty found in natural language. Every element of a fuzzy set has its own membership degree, which indicates how certainly the element belongs to that set.

Definition
Suppose X is the domain (universe of discourse) and x is a specific element of the domain X. The fuzzy set A is described by a membership function that maps every element to a degree between 0 and 1:

μ_A : X → [0, 1], so that A = {(x, μ_A(x)) | x ∈ X}

Fuzzy sets can be defined for discrete (finite) or continuous (infinite) domains. Compare the notations used for the two types of domain (the summation and integral signs are not interpreted algebraically; they merely enumerate membership/element pairs):
  • If the domain X is discrete, the fuzzy set can be denoted as: A = Σ_{x_i ∈ X} μ_A(x_i) / x_i
  • If the domain X is continuous, the fuzzy set can be denoted as: A = ∫_X μ_A(x) / x
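As a small illustration, a discrete fuzzy set can be stored as a value-to-membership mapping, and a continuous one as a membership function. The triangular shape and all the numbers below are arbitrary examples:

```python
def triangular(a, b, c):
    """Membership function of a triangular fuzzy set on a continuous domain:
    0 outside [a, c], rising linearly to 1 at b, then falling back to 0."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# Discrete fuzzy set "tall" over a finite domain of heights (illustrative degrees):
tall = {160: 0.1, 170: 0.4, 180: 0.8, 190: 1.0}

# Continuous fuzzy set "around 25 °C":
around_25 = triangular(20, 25, 30)
print(around_25(23))   # 0.6
```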