A model of language learning
- Fixed bug in the `learn!` function so that learning also occurs on strings in the intersection \(L_1 \cap L_2\) of the two languages.
- Fixed the set diagrams for \(L_1 \setminus L_2\) and \(L_2 \setminus L_1\).
Plan
- Starting this week, we will put programming to good use
- We’ll start with a simple model of language learning
- Here, learning = process of updating a linguistic representation
- It doesn’t matter whether the learner is a child or an adult
Grammar competition
- Assume two grammars \(G_1\) and \(G_2\) that generate languages \(L_1\) and \(L_2\)
- language = set of strings (e.g. sentences)
- In general, \(L_1\) and \(L_2\) will be different but may overlap:
[Figure: set diagram of two overlapping languages \(L_1\) and \(L_2\)]
Grammar competition
- Three sets of interest: \(L_1 \setminus L_2\), \(L_1 \cap L_2\) and \(L_2 \setminus L_1\)
[Figure: set diagrams of \(L_1 \setminus L_2\), \(L_1 \cap L_2\) and \(L_2 \setminus L_1\)]
Concrete example
- SVO (\(G_1\)) vs. V2 (\(G_2\)): an SVO grammar places the verb after the subject, while a V2 grammar places the finite verb in second position
- For example, a subject-initial sentence like “Mary bought a book” is generated by both grammars (it lies in \(L_1 \cap L_2\)), whereas a sentence with a fronted adverb and the verb in second position, like “Yesterday bought Mary a book”, is generated only by the V2 grammar
Grammar competition
- Suppose learner receives randomly chosen strings from \(L_1\) and \(L_2\)
- Learner uses either \(G_1\) or \(G_2\) to parse incoming string
- Define \(p =\) probability of use of \(G_1\)
- How should the learner update \(p\) in response to interactions with the environment?
Variational learning
- Suppose learner receives string/sentence \(s\)
- Then update is:
| Learner’s grammar | String received | Update |
|---|---|---|
| \(G_1\) | \(s \in L_1\) | increase \(p\) |
| \(G_1\) | \(s \in L_2 \setminus L_1\) | decrease \(p\) |
| \(G_2\) | \(s \in L_2\) | decrease \(p\) |
| \(G_2\) | \(s \in L_1 \setminus L_2\) | increase \(p\) |

In other words, a grammar that successfully parses the incoming string is rewarded and a grammar that fails is punished. Since \(G_2\)’s probability is \(1 - p\), rewarding \(G_2\) means decreasing \(p\).
Exercise
How can we increase/decrease \(p\) in practice? What is the update formula?
One possibility (which we will stick to):
- Increase: \(p\) becomes \(p + \gamma (1 - p)\)
- Decrease: \(p\) becomes \(p - \gamma p\)
The parameter \(0 < \gamma < 1\) is a learning rate
Why this form of update formula?
- Need to make sure that always \(0 \leq p \leq 1\) (it is a probability)
- Also notice:
- When \(p\) is increased, what is added is \(\gamma (1-p)\). Since \(1-p\) is the probability of \(G_2\), this means transferring an amount of the probability mass of \(G_2\) onto \(G_1\).
- When \(p\) is decreased, what is removed is \(\gamma p\). Since \(p\) is the probability of \(G_1\), this means transferring an amount of the probability mass of \(G_1\) onto \(G_2\).
- Learning rate \(\gamma\) determines how much probability mass is transferred.
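As a quick numeric check of the two formulas (the values of \(p\) and \(\gamma\) below are chosen purely for illustration):

```julia
p = 0.5        # current probability of G1
gamma = 0.1    # learning rate

p + gamma * (1 - p)   # increase: 0.55, a tenth of the way from p toward 1
p - gamma * p         # decrease: 0.45, a tenth of the way from p toward 0
```

Both results stay between 0 and 1, as required of a probability.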
Plan
- To implement a variational learner computationally, we need:
- A representation of a learner who embodies a single probability, \(p\), and a learning rate, \(\gamma\)
- A way to sample strings from \(L_1 \setminus L_2\), from \(L_1 \cap L_2\) and from \(L_2 \setminus L_1\)
- A function that updates the learner’s \(p\)
- Let’s attempt this now!
The struct
- The first point is very easy:
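```julia
mutable struct VariationalLearner
    p::Float64       # probability of using grammar G1
    gamma::Float64   # learning rate
end
```

The struct is declared `mutable` so that the learning function can update `p` in place.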
Sampling strings
- For the second point, note we have three types of strings, which occur with three corresponding probabilities
- Let’s refer to the string types as `"S1"`, `"S12"` and `"S2"`, and to the probabilities as `P1`, `P12` and `P2`:
| String type | Probability | Explanation |
|---|---|---|
| `"S1"` | `P1` | \(s \in L_1 \setminus L_2\) |
| `"S12"` | `P12` | \(s \in L_1 \cap L_2\) |
| `"S2"` | `P2` | \(s \in L_2 \setminus L_1\) |
- In Julia, sampling from a finite number of options (here, three string types) with corresponding probabilities is handled by a function called `sample()`, which lives in the `StatsBase` package
- First, install and load the package:

```julia
using Pkg
Pkg.add("StatsBase")
using StatsBase
```

Now to sample a string, you can do the following:
```julia
# the three probabilities (just some numbers I invented)
P1 = 0.4
P12 = 0.5
P2 = 0.1

# sample one string
sample(["S1", "S12", "S2"], Weights([P1, P12, P2]))
```

```
"S12"
```
Tidying up
- The above works but is a bit cumbersome – for example, every time you want to sample a string, you need to refer to the three probabilities
- Let’s carry out a bit of software engineering to make this nicer to use
- First, we encapsulate the probabilities in a struct of their own:
```julia
struct LearningEnvironment
    P1::Float64
    P12::Float64
    P2::Float64
end
```

- We then define the following function:
```julia
function sample_string(x::LearningEnvironment)
    sample(["S1", "S12", "S2"], Weights([x.P1, x.P12, x.P2]))
end
```

```
sample_string (generic function with 1 method)
```
- Test the function:
```julia
paris = LearningEnvironment(0.4, 0.5, 0.1)
sample_string(paris)
```

```
"S12"
```
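Since `P1`, `P12` and `P2` are meant to be the probabilities of three mutually exclusive string types, they should sum to one; `Weights` does not enforce this, as it only treats the numbers as relative weights. A minimal sketch of a validity check (the helper `is_valid` is my own addition, not part of the original code):

```julia
# hypothetical helper: check that the three probabilities form a distribution
is_valid(x::LearningEnvironment) = isapprox(x.P1 + x.P12 + x.P2, 1.0)

is_valid(paris)   # true, since 0.4 + 0.5 + 0.1 sums to 1
```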
Implementing learning
- We now need to tackle point 3, the learning function which updates the learner’s state
- This needs to do three things:
- Sample a string from the learning environment
- Pick a grammar to try and parse the string with
- Update \(p\) in response to whether parsing was successful or not
Exercise
How would you implement point 2, i.e. picking a grammar to try and parse the incoming string?
We can again use the `sample()` function from `StatsBase`, and define:

```julia
function pick_grammar(x::VariationalLearner)
    sample(["G1", "G2"], Weights([x.p, 1 - x.p]))
end
```

```
pick_grammar (generic function with 1 method)
```
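A quick sanity check (the learner `alice` is just for illustration; the output is random):

```julia
alice = VariationalLearner(0.9, 0.01)
pick_grammar(alice)   # returns "G1" about 90% of the time, since p = 0.9
```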
Implementing learning
- Now it is easy to implement the first two points of the learning function:
```julia
function learn!(x::VariationalLearner, y::LearningEnvironment)
    s = sample_string(y)
    g = pick_grammar(x)
end
```

```
learn! (generic function with 1 method)
```
- How to implement the last point, i.e. updating \(p\)?
Aside: conditional statements
- Here, we will be helped by conditionals:
```julia
if COND1
    # this is executed if COND1 is true
elseif COND2
    # this is executed if COND1 is false but COND2 is true
else
    # this is executed otherwise
end
```

- Note: only the `if` block is necessary; `elseif` and `else` are optional, and there may be more than one `elseif` block
Aside: conditional statements
- Try this for different values of `number`:

```julia
number = 1

if number > 0
    println("Your number is positive!")
elseif number < 0
    println("Your number is negative!")
else
    println("Your number is zero!")
end
```

Comparison \(\neq\) assignment
To compare equality of two values inside a condition, you must use a double equals sign, `==`. This is because the single equals sign, `=`, is already reserved for assigning values to variables.
```julia
if 0 = 1 # throws an error!
    println("The world is topsy-turvy")
end
```

```julia
if 0 == 1 # works as expected
    println("The world is topsy-turvy")
end
```

Exercise
- Use an `if ... elseif ... else ... end` block to finish off our `learn!` function
- Tip: logical “and” is `&&`, logical “or” is `||`
- Recall:
| Learner’s grammar | String received | Update |
|---|---|---|
| \(G_1\) | \(s \in L_1\) | increase \(p\) |
| \(G_1\) | \(s \in L_2 \setminus L_1\) | decrease \(p\) |
| \(G_2\) | \(s \in L_2\) | decrease \(p\) |
| \(G_2\) | \(s \in L_1 \setminus L_2\) | increase \(p\) |
Important! The following function, which we originally used, has a bug: it does not update the learner’s state with input strings from \(L_1 \cap L_2\). See below for the fixed version.
```julia
function learn!(x::VariationalLearner, y::LearningEnvironment)
    s = sample_string(y)
    g = pick_grammar(x)
    if g == "G1" && s == "S1"
        x.p = x.p + x.gamma * (1 - x.p)
    elseif g == "G1" && s == "S2"
        x.p = x.p - x.gamma * x.p
    elseif g == "G2" && s == "S2"
        x.p = x.p - x.gamma * x.p
    elseif g == "G2" && s == "S1"
        x.p = x.p + x.gamma * (1 - x.p)
    end
    return x.p
end
```

In the buggy version, a string of type `"S12"` matches none of the four branches, so \(p\) is never updated on intersection strings. The fix is to reward the learner’s current grammar whenever parsing succeeds: \(G_1\) parses everything except `"S2"`, and \(G_2\) parses everything except `"S1"`:

```julia
function learn!(x::VariationalLearner, y::LearningEnvironment)
    s = sample_string(y)
    g = pick_grammar(x)
    if g == "G1" && s != "S2"        # G1 parses "S1" and "S12": reward G1
        x.p = x.p + x.gamma * (1 - x.p)
    elseif g == "G1" && s == "S2"    # G1 fails on "S2": punish G1
        x.p = x.p - x.gamma * x.p
    elseif g == "G2" && s != "S1"    # G2 parses "S2" and "S12": reward G2 (decrease p)
        x.p = x.p - x.gamma * x.p
    elseif g == "G2" && s == "S1"    # G2 fails on "S1": punish G2 (increase p)
        x.p = x.p + x.gamma * (1 - x.p)
    end
    return x.p
end
```

```
learn! (generic function with 1 method)
```
Testing our code
- Let’s test our code!
```julia
bob = VariationalLearner(0.5, 0.01)
paris = LearningEnvironment(0.4, 0.5, 0.1)

learn!(bob, paris)
learn!(bob, paris)
learn!(bob, paris)
learn!(bob, paris)
learn!(bob, paris)
```

```
0.51489901495
```
```julia
trajectory = [learn!(bob, paris) for t in 1:1000]
```

```
1000-element Vector{Float64}:
 0.5097500248005
 0.514652524552495
 0.5195059993069701
 0.5143109393139004
 0.5191678299207614
 0.5239761516215538
 0.5287363901053382
 0.5334490262042848
 0.5381145359422419
 0.5327333905828194
 0.5374060566769913
 0.5420319961102213
 0.5466116761491191
 ⋮
 0.8043883364948524
 0.7963444531299039
 0.7983810085986048
 0.7903971985126188
 0.7924932265274927
 0.7945682942622178
 0.7966226113195956
 0.7986563852063996
 0.7906698213543356
 0.7927631231407922
 0.7948354919093843
 0.7968871369902905
```
Plotting the learning trajectory

```julia
using Plots
plot(1:1000, trajectory)
```
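If you want labeled axes, `plot` accepts keyword arguments (a sketch; the label text is my own choice):

```julia
plot(1:1000, trajectory,
     xlabel = "learning step",
     ylabel = "p (probability of G1)",
     legend = false)
```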
Summary
- You’ve learned a few important concepts today:
- Grammar competition and variational learning
- How to sample objects according to a discrete probability distribution
- How to use conditional statements
- How to make a simple plot of a learning trajectory
- You get to practice these in the homework
- Next week, we’ll take the model to a new level and consider what happens when several variational learners interact