Simulations on social networks
Plan
- Last time we learned about (social) networks
- Today, we’ll learn how to interface Graphs.jl with Agents.jl
- This allows ABM simulations on networks: our population of agents can now be structured in very interesting ways
- We’ll also talk about how to gather statistics from simulation runs
- For a concrete example, we will implement a simulation with variational learners on a social network without spatial effects
Requirements
- We will need the following packages today:
- Agents
- Graphs
- Plots
- Statistics
- DataFrames
- Now would be a good time to make sure you have all these installed
- Code for today’s lecture: VL2.jl under “Bric-a-brac”
Catching errors
- Before moving on, we need to discuss an important technicality
- Sometimes, your code throws an error
- For example, try:

```julia
rand([])
```

- This tries to draw a random element from an empty array, which of course won't work
- Often, you want to eliminate these errors
- For example, make sure that you never call things like `rand([])` on containers that may be empty
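- One way to do that is to guard the call with a check; a minimal sketch (the vector `xs` is just an illustration):

```julia
xs = Int[]           # a possibly empty vector
if !isempty(xs)
    x = rand(xs)     # safe: only reached when xs has elements
end
```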
- Other times, they may be unavoidable, and need to be caught
- For a concrete example, consider a network in which some node, say node number 5, has no neighbours
- Suppose we want to obtain a random neighbour of a given node
- If that node does have neighbours, then all is well
- But if the node happens to have no neighbours (like node 5), then we have a problem!
- If we don't want our code to crash, we need to catch the error
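- To make the problem concrete, here is a small sketch (the graph `g` is an illustrative stand-in):

```julia
using Graphs

g = path_graph(4)        # nodes 1–4 connected in a line
add_vertex!(g)           # add node 5, with no edges: it is isolated

neighbors(g, 5)          # an empty vector
rand(neighbors(g, 5))    # throws an ArgumentError!
```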
- Errors are caught using a `try ... catch ... end` block:

```julia
try
    rand([])
catch
    println("Trying to draw from an empty container!")
end
```

```
Trying to draw from an empty container!
```
- Here, Julia will try to execute everything found between the `try` and `catch` keywords
- If an error is thrown, then it is caught, and the code between the `catch` and `end` keywords is executed, without crashing
- You can also leave the catch block empty, if you just want to continue silently!

```julia
try
    rand([])
catch
end
```
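- If you do want to know what went wrong, the `catch` clause can also bind the error to a variable; a minimal sketch:

```julia
try
    rand([])
catch e
    # e is the error object thrown inside the try block
    println("Caught an error of type ", typeof(e))
end
```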
Strategy
- Recall the 5 steps to define a model in Agents.jl:
  1. Decide on model space
  2. Define agent type(s)
  3. Define rules that evolve the model
  4. Initialize your model
  5. Evolve, visualize and collect data
- For this particular application:
  1. Decide on model space — we'll use a graph
  2. Define agent type(s) — reuse existing code, with small modifications
  3. Define rules that evolve the model — reuse, with small modifications
  4. Initialize your model — just like before
  5. Evolve, visualize and collect data — we'll talk more about this
1. Space
- In a previous lecture, we used:

```julia
dims = (10, 10)
space = GridSpace(dims)
```

- Now, we use (for example):

```julia
G = erdos_renyi(100, 0.3)
space = GraphSpace(G)
```
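- As a quick sanity check, you can inspect the resulting graph with Graphs.jl accessors, for instance:

```julia
using Graphs

G = erdos_renyi(100, 0.3)   # 100 nodes; each possible edge present with probability 0.3
nv(G)                        # number of nodes: 100
ne(G)                        # number of edges: random, on average 0.3 * binomial(100, 2)
```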
2. Agent
- Previously, we defined:

```julia
@agent struct GridVL(GridAgent{2}) <: VariationalLearner
    p::Float64
    gamma::Float64
    P1::Float64
    P2::Float64
end
```

- We now add a new type:

```julia
@agent struct NetworkVL(GraphAgent) <: VariationalLearner
    p::Float64
    gamma::Float64
    P1::Float64
    P2::Float64
end
```
3. Rules
- We previously used:

```julia
function VL_step!(agent, model)
    interlocutor = random_nearby_agent(agent, model)
    interact!(interlocutor, agent)
end
```

- Here, `random_nearby_agent` returned a random agent from among the (up to 8) agents surrounding `agent` in the `GridSpace`
- `random_nearby_agent` also has a method for `GraphSpace`s
- However, in a graph, an agent may be neighbourless!
- We need to consider this, and so define:

```julia
function VL_step!(agent::NetworkVL, model)
    try
        interlocutor = random_nearby_agent(agent, model)
        interact!(interlocutor, agent)
    catch
    end
end
```
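- As an aside, the error can also be avoided rather than caught: assuming (as in recent versions of Agents.jl) that `random_nearby_agent` returns `nothing` when there are no neighbours, one could write the following sketch instead (the name `VL_step_checked!` is just for illustration):

```julia
function VL_step_checked!(agent::NetworkVL, model)
    interlocutor = random_nearby_agent(agent, model)
    isnothing(interlocutor) && return   # neighbourless: skip this step
    interact!(interlocutor, agent)
end
```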
4. Initialize model
- Previously, we used the following to initialize a model:

```julia
model = StandardABM(GridVL,
                    GridSpace((10, 10)),
                    agent_step! = VL_step!)
```

- Now we do (for example):

```julia
model = StandardABM(NetworkVL,
                    GraphSpace(erdos_renyi(10, 0.5)),
                    agent_step! = VL_step!)
```
Putting it all together
```julia
# create space
G = erdos_renyi(10, 0.5)
space = GraphSpace(G)

# initialize model
model = StandardABM(NetworkVL,
                    space,
                    agent_step! = VL_step!)

# add agents: one per node, with p = 0.01, gamma = 0.01, P1 = 0.4, P2 = 0.1
for i in 1:10
    add_agent_single!(model, 0.01, 0.01, 0.4, 0.1)
end
```
5. Evolve, visualize and collect data
- Previously, we used the `step!` function from Agents.jl to evolve our model
- We also wrote our own custom functions for retrieving summary statistics
- Agents.jl actually contains data collection functions which make this even easier
- The most important of these (for us) are `run!` and `ensemblerun!`
Using run! to collect data
- `run!` is like `step!`, except it also collects the model's state and returns it as a `DataFrame`
- `DataFrame`s are tables that are used to represent data
- Syntax: `run!(model, n; adata)`, where
  - `n` is the number of time steps we want to run the model for
  - `adata` is a keyword argument that specifies what data ("agent data") is gathered
- For reasons which are not superbly clear, `run!` has two return values, two `DataFrame`s
- We can safely ignore the second one
- Example: suppose we want to collect each agent's `p` field (their weight for grammar \(G_1\)) at each time step, over 4000 model iterations (each agent is updated 4000 times)
- This is achieved by:
```julia
data, _ = run!(model, 4000; adata = [:p])
```

- Note the two return values; since we are not interested in the second one, we store it in the `_` variable (think of this as a trash bin)
- Also note that `p` is prefixed with a colon (`:p`) – this is crucial!
- Also note that `adata` is a vector: this means you can specify more than one agent field to be collected, should you wish to do so, as sketched below
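- For instance, a hypothetical call collecting two fields at once might look like this:

```julia
# hypothetical: collect both p and gamma for every agent at every step
data, _ = run!(model, 4000; adata = [:p, :gamma])
```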
Note
Why does `p` have to be prefixed by a colon? Well, we couldn't just put `p` in there, as that would refer to a variable whose name is `p`. But that's not what we want. What we want is somehow to refer to each agent's internal `p` field. This can be achieved using so-called symbols. In Julia, symbols always begin with a colon (`:`). Think of them as labels. They are a bit like strings, but not quite (for example, a string is composed of characters but a symbol isn't!).
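A few illustrative lines make the distinction concrete:

```julia
s = :p              # a Symbol literal
s isa Symbol        # true
String(s)           # "p": a Symbol can be converted to a String...
Symbol("p") == :p   # true: ...and back, but the two types remain distinct
```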
- The variable `data` now contains a `DataFrame` with three columns: the time step, the agent's ID, and the agent's `p`:

```julia
data
```
```
40010×3 DataFrame
   Row │ time   id     p
       │ Int64  Int64  Float64
───────┼────────────────────────
     1 │     0      1  0.01
     2 │     0      2  0.01
     3 │     0      3  0.01
     4 │     0      4  0.01
     5 │     0      5  0.01
     6 │     0      6  0.01
     7 │     0      7  0.01
     8 │     0      8  0.01
     9 │     0      9  0.01
    10 │     0     10  0.01
    11 │     1      1  0.0099
    12 │     1      2  0.0099
    13 │     1      3  0.0099
     ⋮ │   ⋮      ⋮       ⋮
 39999 │  3999      9  0.999996
 40000 │  3999     10  1.0
 40001 │  4000      1  1.0
 40002 │  4000      2  1.0
 40003 │  4000      3  0.999999
 40004 │  4000      4  0.999999
 40005 │  4000      5  1.0
 40006 │  4000      6  1.0
 40007 │  4000      7  1.0
 40008 │  4000      8  1.0
 40009 │  4000      9  0.999996
 40010 │  4000     10  1.0
         39985 rows omitted
```
- We can now, for example, plot this:

```julia
using Plots
plot(data.time, data.p, group=data.id)
```

- Here, note that:
  - columns of a data frame are selected using the dot (`.`)
  - we use the `group` keyword argument of `plot` so that each agent gets its own trajectory in the plot
Collecting aggregated data
- Often we don't need to collect data for each agent individually
- For example: it is often enough to know how the average, or mean, of `p` evolves
- This is very easy to do with `run!`: all we need to do is to feed the `adata` keyword argument the tuple `(:p, mean)` instead of plain `:p`
- More generally, in place of `mean`, you can put any function that you want to aggregate over the agents (see the sketch at the end of this subsection)
- Example:

```julia
using Statistics

G2 = erdos_renyi(10, 0.5)
space2 = GraphSpace(G2)

model2 = StandardABM(NetworkVL, space2,
                     agent_step! = VL_step!)

for i in 1:10
    add_agent_single!(model2, 0.01, 0.01, 0.4, 0.1)
end

data2, _ = run!(model2, 4000; adata = [(:p, mean)])
```
- Now the returned dataframe contains a `mean_p` column:

```julia
data2
```
```
4001×2 DataFrame
  Row │ time   mean_p
      │ Int64  Float64
──────┼───────────────────
    1 │     0  0.01
    2 │     1  0.0099
    3 │     2  0.009801
    4 │     3  0.00970299
    5 │     4  0.00960596
    6 │     5  0.0105099
    7 │     6  0.0104048
    8 │     7  0.0103008
    9 │     8  0.0101977
   10 │     9  0.0100958
   11 │    10  0.00999481
   12 │    11  0.00989486
   13 │    12  0.00979591
    ⋮ │   ⋮        ⋮
 3990 │  3989  0.998819
 3991 │  3990  0.998831
 3992 │  3991  0.998842
 3993 │  3992  0.998854
 3994 │  3993  0.998865
 3995 │  3994  0.998877
 3996 │  3995  0.998888
 3997 │  3996  0.998899
 3998 │  3997  0.99891
 3999 │  3998  0.998921
 4000 │  3999  0.998932
 4001 │  4000  0.998942
        3976 rows omitted
```
```julia
plot(data2.time, data2.mean_p)
```
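- As noted above, `mean` is not the only possible aggregator. A minimal sketch with a second aggregated column (built on a fresh model, since `run!` mutates its input; by the naming pattern seen above, the new column should come out as `std_p`):

```julia
using Statistics

# a fresh model, analogous to model2 above
model3 = StandardABM(NetworkVL, GraphSpace(erdos_renyi(10, 0.5)),
                     agent_step! = VL_step!)
for i in 1:10
    add_agent_single!(model3, 0.01, 0.01, 0.4, 0.1)
end

# collect both the mean and the standard deviation of p over the agents
data3, _ = run!(model3, 4000; adata = [(:p, mean), (:p, std)])
```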
Exercise
- Recall that the constructor of a Watts–Strogatz network, `watts_strogatz(n, k, beta)`, takes three arguments:
  - `n`: number of nodes
  - `k`: initial degree
  - `beta`: rewiring probability
- Your task: simulate VLs in a Watts–Strogatz network, exploring how/if variation in `beta` affects the evolution of mean `p`
- Use these parameters for your learners:
  - initial value of `p`: 0.01
  - learning rate `gamma`: 0.01
  - `P1`: 0.2
  - `P2`: 0.1
- Use these for your network(s):
  - `n`: 50
  - `k`: 8
  - `beta`: 0.1 versus 0.5
Solution
It is useful to wrap the model construction in a function:

```julia
function make_model(beta)
    G = watts_strogatz(50, 8, beta)
    space = GraphSpace(G)

    model = StandardABM(NetworkVL, space,
                        agent_step! = VL_step!)

    for i in 1:50
        add_agent_single!(model, 0.01, 0.01, 0.2, 0.1)
    end

    return model
end
```
It is now easy to run both models:

```julia
model1 = make_model(0.1)
model2 = make_model(0.5)

data1, _ = run!(model1, 10_000; adata = [(:p, mean)])
data2, _ = run!(model2, 10_000; adata = [(:p, mean)])
```

And to visualize the results:

```julia
plot(data1.time, data1.mean_p, label="β = 0.1")
plot!(data2.time, data2.mean_p, label="β = 0.5")
```
Ensemble data with ensemblerun!
- It looks like there might be a small difference: the change is quicker with \(\beta = 0.5\) compared to \(\beta = 0.1\)
- But is this difference real, or just a random fluke?
- To answer this question, we need to run several repetitions of each simulation!
- This is most easily done using the dedicated `ensemblerun!` function
- It works like `run!` but, instead of a single model, takes a vector of models as input
- Like this:

```julia
models1 = [make_model(0.1) for i in 1:10]
models2 = [make_model(0.5) for i in 1:10]

data1, _ = ensemblerun!(models1, 10_000; adata = [(:p, mean)])
data2, _ = ensemblerun!(models2, 10_000; adata = [(:p, mean)])
```
- The returned data frames look like this:

```julia
data1
```

```
100010×3 DataFrame
    Row │ time   mean_p      ensemble
        │ Int64  Float64     Int64
────────┼─────────────────────────────
      1 │     0  0.01               1
      2 │     1  0.0101             1
      3 │     2  0.010199           1
      4 │     3  0.010297           1
      5 │     4  0.010194           1
      6 │     5  0.0100921          1
      7 │     6  0.00999118         1
      8 │     7  0.00989127         1
      9 │     8  0.00999235         1
     10 │     9  0.00989243         1
     11 │    10  0.00979351         1
     12 │    11  0.00969557         1
     13 │    12  0.00979862         1
      ⋮ │    ⋮        ⋮            ⋮
  99999 │  9989  0.989517          10
 100000 │  9990  0.989622          10
 100001 │  9991  0.989326          10
 100002 │  9992  0.989433          10
 100003 │  9993  0.989538          10
 100004 │  9994  0.989643          10
 100005 │  9995  0.989747          10
 100006 │  9996  0.989649          10
 100007 │  9997  0.989753          10
 100008 │  9998  0.989855          10
 100009 │  9999  0.989957          10
 100010 │ 10000  0.990057          10
          99985 rows omitted
```
- To plot these in a sensible way, we need to group by the ensemble column:

```julia
plot(data1.time, data1.mean_p,
     group=data1.ensemble, color=1, label="β = 0.1")
plot!(data2.time, data2.mean_p,
      group=data2.ensemble, color=2, label="β = 0.5")
```
- These results suggest that there may be a difference; however, it looks to be small
- To settle this question more conclusively, we need to:
  - Run more simulations!
  - Do some statistics on the simulation results (a starting point is sketched below)
- You get to practice both these things in this week's homework
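To give a flavour of the second point: the ensemble data frames can be post-processed with DataFrames.jl. A minimal sketch, assuming the `data1` produced above (the column name `final_p` is just an illustrative label):

```julia
using DataFrames, Statistics

# take each ensemble member's final mean_p (its value at the last time step)
finals = combine(groupby(data1[data1.time .== 10_000, :], :ensemble),
                 :mean_p => first => :final_p)

# summarize across the 10 repetitions
mean(finals.final_p), std(finals.final_p)
```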