Simulations on social networks

Lecture

Date: 25 June 2024

Plan

  • Last time we learned about (social) networks
  • Today, we’ll learn how to interface Graphs.jl with Agents.jl
  • This allows ABM simulations on networks: our population of agents can now be structured in very interesting ways
  • We’ll also talk about how to gather statistics from simulation runs
  • For a concrete example, we will implement a simulation with variational learners on a social network without spatial effects

Requirements

  • We will need the following packages today:
    • Agents
    • Graphs
    • Plots
    • Statistics
    • DataFrames
  • Now would be a good time to make sure you have all these installed
  • Code for today’s lecture: VL2.jl under “Bric-a-brac”

Catching errors

  • Before moving on, we need to discuss an important technicality
  • Sometimes, your code throws an error
    • For example, try: rand([])
    • This tries to draw a random element from an empty array, which of course won’t work
  • Often, you want to eliminate these errors
    • For example, make sure that you never use things like rand([])
  • Other times, they may be unavoidable, and need to be caught
[Figure: an example network on nodes 1–10; node 5 has no neighbours]
  • For a concrete example, assume the above network
  • Suppose we want to obtain a random neighbour of a given node
  • If that node does have neighbours, then all is well
  • But if the node happens to have no neighbours (such as node number 5), then we have a problem!
  • If we don’t want our code to crash, we need to catch the error
  • Errors are caught using a try ... catch ... end block:
try
  rand([])
catch
  println("Trying to draw from an empty container!")
end
Trying to draw from an empty container!
  • Here, Julia will try to execute everything found between the try and catch keywords
  • If an error is thrown, then it is caught and the stuff between the catch and end keywords is executed, without crashing
  • You can also leave the catch block empty, if you just want to continue silently!
try
  rand([])
catch
end
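  • As a minimal sketch of the neighbour scenario above (the small graph built here is an assumption, chosen so that node 5 ends up isolated), we can combine Graphs.jl’s neighbors function with a try ... catch block:
using Graphs

# a small example graph in which node 5 has no neighbours
G = SimpleGraph(10)
add_edge!(G, 1, 2)
add_edge!(G, 2, 3)
add_edge!(G, 3, 4)

try
  neighbour = rand(neighbors(G, 5))   # neighbors(G, 5) is empty, so rand throws
  println("Drew neighbour $neighbour")
catch
  println("Node 5 has no neighbours!")
end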

Strategy

  • Recall the 5 steps to define a model in Agents.jl:
    1. Decide on model space
    2. Define agent type(s)
    3. Define rules that evolve the model
    4. Initialize your model
    5. Evolve, visualize and collect data
  • For this particular application:
    1. Decide on model space — we’ll use a graph
    2. Define agent type(s) — reuse existing code, with small modifications
    3. Define rules that evolve the model — reuse, with small modifications
    4. Initialize your model — just like before
    5. Evolve, visualize and collect data — we’ll talk more about this

1. Space

  • Previously, we used:
dims = (10, 10)
space = GridSpace(dims)
  • Now, we use (for example):
G = erdos_renyi(100, 0.3)
space = GraphSpace(G)
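  • Here, erdos_renyi(100, 0.3) draws a random graph on 100 nodes in which each possible edge is present with probability 0.3
  • A quick sketch for inspecting the resulting graph before adding agents (nv and ne are standard Graphs.jl accessors):
using Graphs, Agents

G = erdos_renyi(100, 0.3)   # 100 nodes; each edge present with probability 0.3
space = GraphSpace(G)

nv(G)   # number of nodes: 100
ne(G)   # number of edges (random; roughly 0.3 * 4950 ≈ 1485 on average)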

2. Agent

  • Previously, we defined:
@agent struct GridVL(GridAgent{2}) <: VariationalLearner
  p::Float64
  gamma::Float64
  P1::Float64
  P2::Float64
end
  • We now add a new type:
@agent struct NetworkVL(GraphAgent) <: VariationalLearner
  p::Float64
  gamma::Float64
  P1::Float64
  P2::Float64
end
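  • The GraphAgent base type supplies the bookkeeping fields a GraphSpace needs (an id and the node the agent occupies), so the new type carries those plus our four learner fields
  • A quick check (a sketch; the exact bookkeeping fields are as documented for recent Agents.jl versions):
fieldnames(NetworkVL)
# expected: (:id, :pos, :p, :gamma, :P1, :P2)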

3. Rules

  • We previously used:
function VL_step!(agent, model)
  interlocutor = random_nearby_agent(agent, model)
  interact!(interlocutor, agent)
end
  • Here, random_nearby_agent returned a random agent from among the (up to 8) agents surrounding agent in the GridSpace
  • random_nearby_agent also has a method for GraphSpaces
  • However, in a graph, an agent may be neighbourless!
  • We need to consider this, and so define:
function VL_step!(agent::NetworkVL, model)
  try
    interlocutor = random_nearby_agent(agent, model)
    interact!(interlocutor, agent)
  catch
  end
end
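  • An alternative sketch that avoids try ... catch: random_nearby_agent returns nothing when there is no nearby agent (as documented for recent Agents.jl versions), so we can simply test for that (the name VL_step_checked! is just for illustration):
function VL_step_checked!(agent::NetworkVL, model)
  interlocutor = random_nearby_agent(agent, model)
  # if the agent has no neighbours, there is nobody to talk to this step
  if !isnothing(interlocutor)
    interact!(interlocutor, agent)
  end
end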

4. Initialize model

  • Previously, we used the following to initialize a model:
model = StandardABM(GridVL,
                    GridSpace((10, 10)),
                    agent_step! = VL_step!)
  • Now we do (for example):
model = StandardABM(NetworkVL,
                    GraphSpace(erdos_renyi(10, 0.5)),
                    agent_step! = VL_step!)

Putting it all together

# create space
G = erdos_renyi(10, 0.5)
space = GraphSpace(G)

# initialize model
model = StandardABM(NetworkVL,
                    space,
                    agent_step! = VL_step!)

# add agents
for i in 1:10
  add_agent_single!(model, 0.01, 0.01, 0.4, 0.1)
end
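A quick sanity check (sketch) that all ten agents really ended up in the model, each on its own node; nagents and allagents are standard Agents.jl accessors:

nagents(model)                     # should be 10
[a.pos for a in allagents(model)]  # ten distinct nodes, since we used add_agent_single!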

5. Evolve, visualize and collect data

  • Previously, we used the step! function from Agents.jl to evolve our model
  • We also wrote our own custom functions for retrieving summary statistics
  • Agents.jl actually contains data collection functions which make this even easier
  • The most important of these (for us) are run! and ensemblerun!

Using run! to collect data

  • run! is like step!, except it also collects the model’s state and returns it as a DataFrame
    • DataFrames are tables that are used to represent data
  • Syntax: run!(model, n; adata), where
    • n is the number of time steps we want to run the model for
    • adata is a keyword argument that specifies what data (“agent data”) is gathered
  • run! in fact has two return values, both DataFrames: the first holds the agent-level data requested through adata, the second holds model-level data
    • Since we are only collecting agent data, we can safely ignore the second one
  • Example: suppose we want to collect each agent’s p field (their weight for grammar \(G_1\)) at each time step, over 4000 model iterations (each agent is updated 4000 times).
  • This is achieved by:
data, _ = run!(model, 4000; adata = [:p])
  • Note the two return values; since we are not interested in the second one, we store it in the _ variable (think of this as a trash bin)
  • Also note that p is prefixed with a colon (:p) – this is crucial!
  • Also note that adata is a vector (this means you can specify more than one agent field to be collected, should you wish to do so)
Note

Why does p have to be prefixed by a colon? Well, we couldn’t just put p in there, as that would refer to a variable whose name is p. But that’s not what we want. What we want is somehow to refer to each agent’s internal p field. This can be achieved using so-called symbols. In Julia, symbols always begin with a colon (:). Think of them as labels. They are a bit like strings, but not quite (for example, a string is composed of characters but a symbol isn’t!).
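A few lines at the REPL make the distinction concrete (a minimal sketch):

typeof(:p)    # Symbol
typeof("p")   # String
:p == "p"     # false: a symbol and a string are different kinds of object
String(:p)    # "p", so a symbol can be converted to a string if needed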

  • The variable data now contains a DataFrame with three columns: the time step, the agent’s ID, and the agent’s p:
data
40010×3 DataFrame (39985 rows omitted)
   Row │ time   id     p
       │ Int64  Int64  Float64
───────┼────────────────────────
     1 │     0      1  0.01
     2 │     0      2  0.01
     3 │     0      3  0.01
     4 │     0      4  0.01
     5 │     0      5  0.01
     6 │     0      6  0.01
     7 │     0      7  0.01
     8 │     0      8  0.01
     9 │     0      9  0.01
    10 │     0     10  0.01
    11 │     1      1  0.0099
    12 │     1      2  0.0099
    13 │     1      3  0.0099
     ⋮ │     ⋮      ⋮     ⋮
 39999 │  3999      9  0.999996
 40000 │  3999     10  1.0
 40001 │  4000      1  1.0
 40002 │  4000      2  1.0
 40003 │  4000      3  0.999999
 40004 │  4000      4  0.999999
 40005 │  4000      5  1.0
 40006 │  4000      6  1.0
 40007 │  4000      7  1.0
 40008 │  4000      8  1.0
 40009 │  4000      9  0.999996
 40010 │  4000     10  1.0
  • We can now, for example, plot this:
using Plots
plot(data.time, data.p, group=data.id)
  • Here, note that:
    • columns of a data frame are selected using the dot (.)
    • we use the group keyword argument on plot so that each agent gets its own trajectory in the plot

Collecting aggregated data

  • Often we don’t need to collect data for each agent individually
  • For example: it is often enough to know how the average, or mean, of p evolves
  • This is very easy to do with run!: all we need to do is to feed the adata keyword argument with the tuple (:p, mean) instead of plain :p
  • More generally, in place of mean, you can put any function that you want to aggregate over the agents
  • Example:
using Statistics

G2 = erdos_renyi(10, 0.5)
space2 = GraphSpace(G2)

model2 = StandardABM(NetworkVL, space2,
                     agent_step! = VL_step!)

for i in 1:10
  add_agent_single!(model2, 0.01, 0.01, 0.4, 0.1)
end

data2, _ = run!(model2, 4000; adata = [(:p, mean)])
  • Now the returned dataframe contains a mean_p column:
data2
4001×2 DataFrame (3976 rows omitted)
  Row │ time   mean_p
      │ Int64  Float64
──────┼───────────────────
    1 │     0  0.01
    2 │     1  0.0099
    3 │     2  0.009801
    4 │     3  0.00970299
    5 │     4  0.00960596
    6 │     5  0.0105099
    7 │     6  0.0104048
    8 │     7  0.0103008
    9 │     8  0.0101977
   10 │     9  0.0100958
   11 │    10  0.00999481
   12 │    11  0.00989486
   13 │    12  0.00979591
    ⋮ │    ⋮       ⋮
 3990 │  3989  0.998819
 3991 │  3990  0.998831
 3992 │  3991  0.998842
 3993 │  3992  0.998854
 3994 │  3993  0.998865
 3995 │  3994  0.998877
 3996 │  3995  0.998888
 3997 │  3996  0.998899
 3998 │  3997  0.99891
 3999 │  3998  0.998921
 4000 │  3999  0.998932
 4001 │  4000  0.998942
plot(data2.time, data2.mean_p)
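  • Nothing is special about mean here: any aggregating function can go in the tuple, and several can be collected at once
  • A sketch (continuing model2 for a few more steps just for illustration; data3 is a made-up name), assuming the column naming follows the same pattern as mean_p:
using Statistics

data3, _ = run!(model2, 100; adata = [(:p, mean), (:p, std), (:p, maximum)])
# expected columns: time, mean_p, std_p, maximum_p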

Exercise

  • Recall that the constructor of a Watts–Strogatz network, watts_strogatz(n, k, beta), takes three arguments:
    • n: number of nodes
    • k: initial degree
    • beta: rewiring probability
  • Your task: simulate VLs in a Watts–Strogatz network, exploring how/if variation in beta affects the evolution of mean p
  • Use these parameters for your learners:
    • initial value of p: 0.01
    • learning rate gamma: 0.01
    • P1: 0.2
    • P2: 0.1
  • Use these for your network(s):
    • n: 50
    • k: 8
    • beta: 0.1 versus 0.5

It is useful to wrap the model construction in a function:

function make_model(beta)
  G = watts_strogatz(50, 8, beta)
  space = GraphSpace(G)

  model = StandardABM(NetworkVL, space,
                      agent_step! = VL_step!)

  for i in 1:50
    add_agent_single!(model, 0.01, 0.01, 0.2, 0.1)
  end

  return model
end

It is now easy to run both models:

model1 = make_model(0.1)
model2 = make_model(0.5)

data1, _ = run!(model1, 10_000; adata = [(:p, mean)])
data2, _ = run!(model2, 10_000; adata = [(:p, mean)])

And to visualize the results:

plot(data1.time, data1.mean_p, label="β = 0.1")
plot!(data2.time, data2.mean_p, label="β = 0.5")

Ensemble data with ensemblerun!

  • It looks like there might be a small difference: the change is quicker with \(\beta = 0.5\) compared to \(\beta = 0.1\)
  • But is this difference real, or just a random fluke?
  • To answer this question, we need to run several repetitions of each simulation!
  • This is easiest by using the dedicated ensemblerun! function
  • It works like run! but, instead of a single model, takes a vector of models as input
  • Like this:
models1 = [make_model(0.1) for i in 1:10]
models2 = [make_model(0.5) for i in 1:10]

data1, _ = ensemblerun!(models1, 10_000; adata = [(:p, mean)])
data2, _ = ensemblerun!(models2, 10_000; adata = [(:p, mean)])
  • The returned data frames look like this:
data1
100010×3 DataFrame (99985 rows omitted)
    Row │ time   mean_p      ensemble
        │ Int64  Float64     Int64
────────┼──────────────────────────────
      1 │     0  0.01               1
      2 │     1  0.0101             1
      3 │     2  0.010199           1
      4 │     3  0.010297           1
      5 │     4  0.010194           1
      6 │     5  0.0100921          1
      7 │     6  0.00999118         1
      8 │     7  0.00989127         1
      9 │     8  0.00999235         1
     10 │     9  0.00989243         1
     11 │    10  0.00979351         1
     12 │    11  0.00969557         1
     13 │    12  0.00979862         1
      ⋮ │    ⋮       ⋮           ⋮
  99999 │  9989  0.989517          10
 100000 │  9990  0.989622          10
 100001 │  9991  0.989326          10
 100002 │  9992  0.989433          10
 100003 │  9993  0.989538          10
 100004 │  9994  0.989643          10
 100005 │  9995  0.989747          10
 100006 │  9996  0.989649          10
 100007 │  9997  0.989753          10
 100008 │  9998  0.989855          10
 100009 │  9999  0.989957          10
 100010 │ 10000  0.990057          10
  • To plot these in a sensible way, we need to group by the ensemble column:
plot(data1.time, data1.mean_p, 
     group=data1.ensemble, color=1, label="β = 0.1")
plot!(data2.time, data2.mean_p, 
      group=data2.ensemble, color=2, label="β = 0.5")
  • These results suggest that there may be a difference; however, it looks to be small
  • To settle this question more conclusively, we need to:
    • Run more simulations!
    • Do some statistics on the simulation results
  • You get to practice both these things in this week’s homework
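  • As a very first taste of the second point, here is a sketch comparing the two conditions at the final time step, using ordinary DataFrames filtering and the mean from Statistics:
using DataFrames, Statistics

# keep only the final time step of every ensemble member
final1 = filter(row -> row.time == 10_000, data1)
final2 = filter(row -> row.time == 10_000, data2)

# average final mean_p over the 10 repetitions in each condition
mean(final1.mean_p), mean(final2.mean_p)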

© 2024 Henri Kauhanen. Reproduction of these materials without written permission from the author is prohibited.