TU Wien:Einführung in die Künstliche Intelligenz VU (Eiter, Tompits)/Zusammenfassung
Intelligent Agents
- Rational agent
- For each possible percept history, select an action that is expected to maximize its performance measure, given the evidence provided by the percept history and whatever built-in knowledge the agent has.
- PEAS
- Performance measure, Environment, Actuators, Sensors
- Environment Types
- fully observable vs. partly observable
- single-agent vs. multi-agent
- deterministic vs. stochastic
- episodic vs. sequential
- static vs. dynamic
- discrete vs. continuous
- known vs. unknown
- Agent Types
- simple reflex agents
- model-based reflex agents
- goal-based agents
- utility-based agents
Problem Solving and Search
Search strategies are evaluated along the following criteria:
- completeness: does it always find a solution if one exists?
- time complexity: number of nodes generated/expanded
- space complexity: maximum number of nodes in memory
- optimality: does it always find a least-cost solution?
Time and space complexity are measured in terms of
- $b$: maximum branching factor of the search tree
- $d$: depth of the least-cost solution
- $m$: maximum depth of the state space (may be $\infty$)
Uninformed Search Strategies
- Breadth-first search (BFS)
- Uniform-cost search (UCS)
- Depth-first search (DFS)
- Depth-limited search (DLS)
- Iterative deepening search (IDS)
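A minimal breadth-first search sketch in Python; the successor function, goal test, and state representation are placeholders, not from the lecture:

```python
from collections import deque

def breadth_first_search(start, is_goal, successors):
    """Return a path (list of states) from start to a goal state, or None."""
    frontier = deque([start])
    parents = {start: None}              # doubles as the explored set
    while frontier:
        state = frontier.popleft()       # FIFO queue: shallowest node first
        if is_goal(state):
            path = []
            while state is not None:     # walk parent pointers back to start
                path.append(state)
                state = parents[state]
            return list(reversed(path))
        for succ in successors(state):
            if succ not in parents:
                parents[succ] = state
                frontier.append(succ)
    return None
```

BFS is complete and optimal for unit step costs; swapping the FIFO queue for a stack gives depth-first search, and running a depth-limited variant with increasing limits gives iterative deepening.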
Heuristic Search
Use problem- or domain-specific knowledge during search.
- admissible heuristic
- For every node $n$ the following holds: (1) $h(n) \geq 0$ and (2) $h(n) \leq h^*(n)$,
- where $h^*(n)$ is the true cost from $n$ to the nearest goal
- $h(G) = 0$ for every goal $G$ (follows from 1 and 2)
- consistent heuristic
- For every node $n$, every successor $n'$ of $n$, and every operator $a$, $h(n) \leq c(n, a, n') + h(n')$ holds, where $c(n, a, n')$ are the path costs for $a$ from $n$ to $n'$.
We consider two algorithms:
- greedy search
- A* search
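A minimal A* sketch, assuming successors(s) yields (next_state, step_cost) pairs and h(s) is the heuristic estimate; with $h \equiv 0$ it behaves like uniform-cost search, and ordering the queue by $h$ alone would give greedy search:

```python
import heapq
import itertools

def a_star(start, is_goal, successors, h):
    """Best-first search ordered by f(n) = g(n) + h(n)."""
    counter = itertools.count()          # tie-breaker so states are never compared
    frontier = [(h(start), next(counter), 0, start, [start])]
    best_g = {start: 0}                  # cheapest known cost to each state
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, g               # optimal if h is admissible
        if g > best_g.get(state, float("inf")):
            continue                     # outdated queue entry
        for succ, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(frontier,
                               (g2 + h(succ), next(counter), g2, succ, path + [succ]))
    return None, float("inf")
```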
Local Search
historyless, one-step change
- hill-climbing
- always pick the highest-valued neighbor -> gets stuck on local maxima
- to escape shoulders we can allow sideways moves, but then we may loop forever on flat maxima -> limit the number of sideways moves
- simulated annealing
- allow some bad moves but gradually decrease their size & frequency
- genetic algorithm (GA)
- stochastic local beam search + successors from pairs of states
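A compact simulated-annealing sketch; the neighbour function, objective, and cooling schedule are illustrative assumptions:

```python
import math
import random

def simulated_annealing(start, value, random_neighbor, schedule, max_steps=10_000):
    """Maximize value(state); schedule(t) is the temperature at step t."""
    current = start
    for t in range(max_steps):
        T = schedule(t)
        if T <= 0:
            break
        candidate = random_neighbor(current)
        delta = value(candidate) - value(current)
        # Always accept improvements; accept a worse move with probability
        # e^(delta / T), which shrinks as the temperature decreases.
        if delta > 0 or random.random() < math.exp(delta / T):
            current = candidate
    return current

# Example schedule: exponential cooling from an initial temperature of 1.0.
cooling = lambda t: 1.0 * (0.995 ** t)
```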
Learning from Observations
The states of the world can be represented with different levels of granularity:
- atomic: states
- factored: attributes with values
- structured: dependence model between attributes
Feedback is important for learning; there are different modes:
- unsupervised learning
- reinforcement learning
- supervised learning
Inductive Learning
TBD
Decision Tree Learning
Entropy measures how uncertain the result of an event is.
- Shannon entropy: $H(V) = -\sum_k P(v_k) \log_2 P(v_k)$
If we have $p$ positive and $n$ negative examples, we need $B\left(\tfrac{p}{p+n}\right) = -\left(\tfrac{p}{p+n} \log_2 \tfrac{p}{p+n} + \tfrac{n}{p+n} \log_2 \tfrac{n}{p+n}\right)$ bits to classify an example.
For an attribute $A$ with $v$ possible values we can partition all examples by the attribute values; for each attribute value $i$ there are $p_i$ positive and $n_i$ negative examples.
We can calculate the expected amount of information still needed after testing the attribute with: $\mathit{remainder}(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n} \, B\left(\frac{p_i}{p_i + n_i}\right)$
The resulting information gain is $\mathit{Gain}(A) = B\left(\frac{p}{p+n}\right) - \mathit{remainder}(A)$.
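The same quantities as a small sketch; the example counts are the standard restaurant example from AIMA (attribute Patrons splitting 6 positive and 6 negative examples), used here only for illustration:

```python
import math

def B(q):
    """Entropy (in bits) of a Boolean variable that is true with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def information_gain(p, n, splits):
    """splits: one (p_i, n_i) pair of counts per attribute value."""
    remainder = sum((pi + ni) / (p + n) * B(pi / (pi + ni))
                    for pi, ni in splits if pi + ni)
    return B(p / (p + n)) - remainder

# Patrons splits the 12 examples into None (0+, 2-), Some (4+, 0-), Full (2+, 4-).
print(information_gain(6, 6, [(0, 2), (4, 0), (2, 4)]))   # ≈ 0.541 bits
```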
Neural Networks
input units, hidden units, output units
Different network types:
- feed-forward network: uni-directional
- recurrent neural network: directed cycles with delays
Expressiveness of Multi-Layer Perceptrons (MLPs):
- 1 layer (no hidden layer): all linearly separable functions
- 2 layers: all continuous functions (arbitrary precision)
- 3 layers: all functions
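As a small illustration of why a hidden layer matters: XOR is not linearly separable, so no single-layer perceptron can represent it, but a two-layer network of threshold units can (the weights below are hand-picked for illustration, not learned):

```python
def step(x):
    """Threshold activation."""
    return 1 if x >= 0 else 0

def xor_net(x1, x2):
    """Two-layer feed-forward network of threshold units computing XOR."""
    h1 = step(x1 + x2 - 0.5)        # hidden unit: OR(x1, x2)
    h2 = step(x1 + x2 - 1.5)        # hidden unit: AND(x1, x2)
    return step(h1 - h2 - 0.5)      # output: h1 AND NOT h2

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])   # [0, 1, 0, 1]
```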
- Perceptron Learning Rule
- $\mathit{Err} = y - h_w(x)$, where $y$ is the target output for input $x$
- $w_j \leftarrow w_j + \alpha \cdot \mathit{Err} \cdot x_j$, where $\alpha$ is the learning rate
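A minimal sketch of this learning rule for a single threshold unit; the data format, learning rate, and the AND example are assumptions chosen for illustration:

```python
def output(weights, x):
    """Threshold unit: 1 if the weighted sum (w[0] is the bias weight) is >= 0."""
    return 1 if weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)) >= 0 else 0

def train(examples, alpha=0.1, epochs=20):
    """examples: list of (x, y) pairs with x a feature tuple and y the target (0 or 1)."""
    n = len(examples[0][0])
    weights = [0.0] * (n + 1)
    for _ in range(epochs):
        for x, y in examples:
            err = y - output(weights, x)               # Err = y - h_w(x)
            weights[0] += alpha * err                  # bias input is fixed at 1
            for j in range(n):
                weights[j + 1] += alpha * err * x[j]   # w_j <- w_j + alpha * Err * x_j
    return weights

# AND is linearly separable, so the rule converges to a correct classifier.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train(data)
print([output(w, x) for x, _ in data])                 # [0, 0, 0, 1]
```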
There are many applications of neural networks, e.g. predicting pollution levels or stock price developments, and pattern, face, and speech recognition.
- deep learning
- neural networks with many layers, requires lots of data
Constraint Satisfaction Problems
Planning
Three problems can occur in reasoning about actions:
- the frame problem: how to represent things that are not changed by actions (frame axioms)
- the ramification problem: how to represent implicit effects of actions
- the qualification problem: how to represent all preconditions required for an action to succeed
We are only concerned with classical planning environments, which are
- fully observable
- deterministic
- finite
- static (change happens only when the agent acts)
- discrete (in time, actions, objects and effects)
Two planning languages:
- STRIPS (Stanford Research Institute Problem Solver)
- Closed-World Assumption: unmentioned literals are false
Example:
Action(Fly(p, from, to), Precond: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to), Effect: ¬At(p, from) ∧ At(p, to))
- ADL (Action Description Language)
- Open-World Assumption: unmentioned literals are unknown
- Contrary to STRIPS, it supports equality and variable types.
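A tiny sketch of how a STRIPS action progresses a state, using the Fly example above; representing a state as a set of ground atoms is an assumption (under the closed-world assumption, anything not in the set is false):

```python
def fly(state, p, frm, to):
    """Apply Fly(p, frm, to) to a state (a set of ground atoms), or return None."""
    precond = {f"At({p},{frm})", f"Plane({p})", f"Airport({frm})", f"Airport({to})"}
    if not precond <= state:
        return None                                        # preconditions not satisfied
    # Delete the negative effect, add the positive effect.
    return (state - {f"At({p},{frm})"}) | {f"At({p},{to})"}

s0 = {"Plane(P1)", "Airport(SFO)", "Airport(JFK)", "At(P1,SFO)"}
print(fly(s0, "P1", "SFO", "JFK"))   # At(P1,SFO) replaced by At(P1,JFK)
```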
Making Simple Decisions
nondeterministic, partially observable environments
The agent's preferences are expressed by a utility function $U(s)$.
Axioms of Utility Theory
- Orderability
- Transitivity
- Continuity
- Substitutability
- Monotonicity
- Decomposability
Decision networks are a general framework for supporting rational decisions.
Three types of nodes:
- chance nodes (ovals) represent random variables
- decision nodes (rectangles) represent points where a decision maker has a choice of actions
- utility nodes (diamonds) represent the agent's utility function
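A minimal sketch of choosing the action with maximum expected utility, as evaluating a decision network amounts to; the probabilities and utilities below are made-up numbers, not from the lecture:

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical umbrella decision with a 30% chance of rain.
actions = {
    "take_umbrella":  [(0.3, 70), (0.7, 80)],    # (P(rain), utility), (P(no rain), utility)
    "leave_umbrella": [(0.3, 0),  (0.7, 100)],
}
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))     # take_umbrella 77.0
```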