• Same day as lecture day
• Exam will only be available from 12:00am to 11:59pm on the exam day
• Once you start the exam, you must finish it in one attempt
• Format: Online
Final-term topics
• First-order Logic and Inference in First Order Logic
• Markov Decision Process
• Reinforcement Learning
• Learning: Supervised Learning, Linear Regression, Logistic Regression, Neural Networks
• Deep Learning: DNN, CNN, RNN, LSTM
• Artificial Intelligence – A Modern Approach, Third Edition, Stuart J. Russell and Peter Norvig
• First-order Logic + Inference in First-order Logic: Lecture Slides + Chapter 8 + Chapter 9
• Markov Decision Process: Lecture Slides + Chapter 17
• Reinforcement Learning: Lecture Slides + Chapter 21
• Learning: Lecture Slides + Chapter 18
• Deep Learning: Lecture Slides
Sample Question
• Suppose the discount factor γ = 0.5. Calculate the utility values for all grid
positions using Value Iteration from time t = 0 to time t = 3
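The grid for this question is in the slide figure and is not reproduced here. As a hedged illustration of the procedure only, here is a minimal sketch of value iteration on a hypothetical three-cell corridor. The cell layout, the rewards, and the deterministic moves are all assumptions made for the example, not the exam grid. Utilities start at 0 at t = 0 and are updated with U(s) = R(s) + γ max over actions of the expected next utility.

```python
GAMMA = 0.5                      # discount factor from the question
STATES = [0, 1, 2]               # hypothetical corridor: cells 0, 1, 2
TERMINAL = {2}                   # assumption: cell 2 is a terminal state
REWARD = {0: 0.0, 1: 0.0, 2: 1.0}  # assumed rewards, not the exam's
ACTIONS = {"Left": -1, "Right": +1}


def transitions(state, action):
    """Return [(probability, next_state)]; moves are deterministic here."""
    nxt = state + ACTIONS[action]
    if nxt not in STATES:
        nxt = state  # bumping into a wall leaves the agent in place
    return [(1.0, nxt)]


def value_iteration(steps):
    """Return the utility table at t = 0, 1, ..., steps."""
    utilities = {s: 0.0 for s in STATES}  # t = 0: all utilities are zero
    history = [dict(utilities)]
    for _ in range(steps):
        updated = {}
        for s in STATES:
            if s in TERMINAL:
                updated[s] = REWARD[s]  # terminal utility is just its reward
                continue
            best = max(
                sum(p * utilities[s2] for p, s2 in transitions(s, a))
                for a in ACTIONS
            )
            updated[s] = REWARD[s] + GAMMA * best
        utilities = updated
        history.append(dict(utilities))
    return history


history = value_iteration(3)
for t, table in enumerate(history):
    print(f"t={t}: {table}")
```

Each pass uses only the previous pass's table, so the +1 at the terminal cell spreads one cell further per step, halved each time by γ.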
Sample Question
Write the Bellman equation for state (1, 2)
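As a reminder of the general form, in the notation of Russell and Norvig (third edition), where R(s) is the reward in state s and P(s' | s, a) is the transition model, the Bellman equation for a state s can be written as below. The specific equation for state (1, 2) depends on the grid and transition probabilities in the figure, which are not reproduced here.

```latex
U(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, U(s')
```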
Sample Question
• Suppose i
is the policy shown in right fig:
– Then we have i
(1, 1)=Up, i
(1, 2)=Up
– Write the simplified Bellman equations for Ui
(1, 2)
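For reference, when the policy is fixed, the action in each state is already chosen, so the max over actions drops out of the Bellman equation and the result is linear in the utilities. In the textbook's notation, the general simplified form is shown below. The concrete equation for (1, 2) needs the transition probabilities and rewards from the figure.

```latex
U_i(s) = R(s) + \gamma \sum_{s'} P(s' \mid s, \pi_i(s))\, U_i(s')
```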
Sample Question- True/False
• In an MDP, the larger the discount factor, the more strongly favored are short-term rewards over long-term rewards. True or False. If false, correct the statement
• The utility U∗ of the optimal policy π∗ must satisfy a set of equations
called the Markov conditions. True or False. If false, correct the statement
• MDP instances with small discount factors tend to emphasize near-term rewards
• For reinforcement learning, we need to know the transition probabilities
between states before we start
Sample Question
• For each of the applications/scenarios described below, indicate
which technology is best suited: RL or MDP
• You are playing a game of Tic Tac Toe against a random opponent.
You can see the board and choose actions, but your opponent
chooses random actions.
Sample Question
• The law says that it is a crime for an American to sell weapons to hostile
nations. The country Nono, an enemy of America, has some missiles, and all
of its missiles were sold to it by Colonel West, who is American.
• Prove that Col. West is a criminal
• Prove by Forward Chaining and/or Backward Chaining
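Forward chaining on this problem can be sketched in code. The version below is a minimal illustration, not a full first-order inference engine: facts are tuples, variables are strings starting with "?", and the knowledge base follows the textbook's usual encoding of the story, with M1 as an assumed constant naming one of Nono's missiles. The function names are hypothetical.

```python
def is_var(term):
    return term.startswith("?")


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if bindings.setdefault(p, f) != f:
                return None  # variable already bound to something else
        elif p != f:
            return None
    return bindings


def all_matches(premises, facts, bindings):
    """Yield every set of bindings that satisfies all premises."""
    if not premises:
        yield bindings
        return
    for fact in facts:
        extended = match(premises[0], fact, bindings)
        if extended is not None:
            yield from all_matches(premises[1:], facts, extended)


def forward_chain(facts, rules, query):
    """Fire rules until the query is derived or nothing new appears."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for b in all_matches(premises, facts, {}):
                derived = tuple(b.get(t, t) for t in conclusion)
                if derived not in facts:
                    new.add(derived)
        if not new:
            return query in facts
        facts |= new
        if query in facts:
            return True


FACTS = [
    ("American", "West"),
    ("Enemy", "Nono", "America"),
    ("Owns", "Nono", "M1"),   # M1: assumed name for one missile
    ("Missile", "M1"),
]
RULES = [
    ([("American", "?x"), ("Weapon", "?y"),
      ("Sells", "?x", "?y", "?z"), ("Hostile", "?z")], ("Criminal", "?x")),
    ([("Missile", "?x"), ("Owns", "Nono", "?x")], ("Sells", "West", "?x", "Nono")),
    ([("Missile", "?x")], ("Weapon", "?x")),
    ([("Enemy", "?x", "America")], ("Hostile", "?x")),
]

print(forward_chain(FACTS, RULES, ("Criminal", "West")))
```

The first pass derives Weapon(M1), Sells(West, M1, Nono), and Hostile(Nono); the second pass fires the crime rule. Backward chaining would instead start from Criminal(West) and work back through the same rules.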
Sample Question
• What are the choices of weight vector [w0, w1, w2] that can classify y as y= x1 XOR x2?
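One way to build intuition for this question is a small brute-force check. The sketch below assumes the classifier is a single threshold unit that outputs 1 when w0 + w1·x1 + w2·x2 > 0 and 0 otherwise. The slide does not state the model, so that is an assumption, and the weight grid is arbitrary. It searches the grid for weights that reproduce XOR, and does the same for AND as a comparison.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def predict(w, x1, x2):
    """Single threshold unit (assumed model): 1 if w0 + w1*x1 + w2*x2 > 0."""
    w0, w1, w2 = w
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0


def separating_weights(target, grid):
    """Return every [w0, w1, w2] on the grid that classifies all four inputs."""
    return [
        w for w in product(grid, repeat=3)
        if all(predict(w, a, b) == target(a, b) for a, b in INPUTS)
    ]


grid = [i / 2 for i in range(-10, 11)]  # -5.0, -4.5, ..., 5.0
xor_solutions = separating_weights(lambda a, b: a ^ b, grid)
and_solutions = separating_weights(lambda a, b: a & b, grid)

print("XOR solutions found:", len(xor_solutions))
print("AND solutions found:", len(and_solutions))
```

The search finds weights for AND but none for XOR, which matches the algebra: XOR would need w0 ≤ 0, w0 + w1 > 0, w0 + w2 > 0, and w0 + w1 + w2 ≤ 0, and these four conditions cannot hold at once.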
Sample Question
• Consider a Convolutional Neural Network (CNN) that has an Input layer containing a 13 x 13
image that is connected to a Convolution layer using a 4 x 4 filter and a stride of 1 (i.e., the
filter is shifted horizontally and vertically by 1 pixel, and only filters that are entirely inside the
input array are connected to a unit in the Convolution layer). There is no activation function
associated with the units in the Convolution layer. The Convolution layer is connected to a
max Pooling layer using a 2 x 2 filter and a stride of 2. (Only filters that are entirely inside the
array in the Convolution layer are connected to a unit in the Pooling layer.) The Output layer
contains 4 units that each use a ReLU activation function, and these units are fully connected to the units in the Pooling layer.
Q1: How many units are in the Convolution layer?
Q2: How many distinct weights must be learned for the connections to the
Convolution layer?
Q3: How many units are in the Pooling layer?
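The layer sizes can be checked with the usual formula for a filter that must stay entirely inside its input: output side = (input side − filter side) / stride + 1. The sketch below applies it under a few assumptions the question does not spell out: a single-channel image, a single shared filter (one feature map), and weight sharing across positions. Whether a bias term counts as a weight is also an assumption, so both counts are shown.

```python
def output_side(input_side, filter_side, stride):
    """Side length of the output when the filter must fit inside the input."""
    return (input_side - filter_side) // stride + 1


# Q1: Convolution layer, 13 x 13 input, 4 x 4 filter, stride 1
conv_side = output_side(13, 4, 1)
conv_units = conv_side * conv_side

# Q2: with weight sharing, every unit reuses the same 4 x 4 filter
filter_weights = 4 * 4
filter_weights_with_bias = filter_weights + 1  # if one shared bias is counted

# Q3: max Pooling layer, 2 x 2 filter, stride 2, applied to the conv output
pool_side = output_side(conv_side, 2, 2)
pool_units = pool_side * pool_side

print("Convolution layer:", conv_side, "x", conv_side, "=", conv_units, "units")
print("Distinct weights:", filter_weights, "or", filter_weights_with_bias, "with bias")
print("Pooling layer:", pool_side, "x", pool_side, "=", pool_units, "units")
```

Under these assumptions the Convolution layer is 10 x 10, the filter has 16 distinct weights (17 with a bias), and the Pooling layer is 5 x 5.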
Short Questions
• Describe the Bellman equation using utility relationship between states
• Describe temporal difference learning. Why is it called “temporal difference”?
• What is Q-learning? What are the major differences between Q-learning and
SARSA learning?
• Mention the differences between MDP and Reinforcement Learning
• Describe forward chaining with example
• Describe backward chaining with example
• What is a knowledge base? Explain with an example
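For the temporal difference, Q-learning, and SARSA questions above, it may help to see the two update rules side by side. The sketch below is a minimal illustration with a dictionary Q-table. The function names, the learning rate alpha, and the example numbers are assumptions chosen for the demonstration. Both rules move Q(s, a) toward a target built from the next time step's estimate. Q-learning uses the best next action (off-policy), while SARSA uses the action the agent actually takes next (on-policy).

```python
def q_learning_update(q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy TD update: target uses the max over next actions."""
    best_next = max(q.get((s_next, a2), 0.0) for a2 in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy TD update: target uses the action actually taken next."""
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * q.get((s_next, a_next), 0.0) - old)


ACTIONS = ["Up", "Down"]
start = {("s1", "Up"): 1.0, ("s1", "Down"): 3.0}  # assumed current estimates

q_table = dict(start)
q_learning_update(q_table, "s0", "Up", 1.0, "s1", ACTIONS, alpha=0.5, gamma=0.5)

sarsa_table = dict(start)
sarsa_update(sarsa_table, "s0", "Up", 1.0, "s1", "Up", alpha=0.5, gamma=0.5)

print("Q-learning:", q_table[("s0", "Up")])
print("SARSA:     ", sarsa_table[("s0", "Up")])
```

With the same transition, Q-learning backs up the value of Down in s1 (the better action), while SARSA backs up the value of Up because that is the action the agent took next. Neither rule uses the transition probabilities.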
Short Questions
• Can you resolve the following two sentences using the most general unifier? If so,
what sentence results?
• Write a first-order logic sentence for:
– A sibling is another child of one’s parents
• Person(John) [read “John is a person”]
– Is it a predicate symbol, function symbol, or constant symbol?
• What is the difference between first-order logic and propositional logic?
Short Questions
• Describe backpropagation technique of Neural Networks
• What are the hyperparameters for training a neural network? Describe the
How to prepare?
• Review class materials
• Make sure that you understand the concepts

