A simple 4-layer contact network based on survey data

Assume we want to build a simple contact network model consisting of 4 location types: households, school classes, workplaces, and cities. To increase realism, we use survey data to create the population of actors and to define some of the location properties.

[1]:
import pop2net as p2n
from pop2net.data_fakers.soep import soep_faker

In this example, we only use fake survey data (but of course you should use real survey data here):

[2]:
df_soep = soep_faker.soep(size=1000)
df_soep.head()
[2]:
    age gender  work_hours_day  nace2_division   hid   pid
0  86.0   male        0.000000              -2  5199  8974
1  22.0   male        8.098868              86  5199  5637
2  48.0   male        7.740646              87  5199  9915
3  30.0   male        0.000000              -2  5199  9697
4  64.0   male        7.416154              78  9158  9346

The first contact layer, Home, is a location class where the actors of one household meet each other for 12 hours. We use the actor attribute hid (household ID), which is provided by the survey data, to group the actors into their empirical households.

[3]:
class Home(p2n.LocationDesigner):
    def split(self, actor):
        """Group the actors by their household id."""
        return actor.hid

    def weight(self, actor):
        """Weight the connection between the actor and the Home by 12."""
        return 12
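
To make the grouping step concrete, here is a minimal sketch in plain Python that does not use pop2net. It assumes that the value returned by split() acts as a key and that one location is built per distinct key. The dictionaries and the helper group_by_split are hypothetical stand-ins for illustration, not pop2net objects.

```python
from collections import defaultdict

# Stand-in actor records; only pid and hid are needed for this illustration.
actors = [
    {"pid": 8974, "hid": 5199},
    {"pid": 5637, "hid": 5199},
    {"pid": 9915, "hid": 5199},
    {"pid": 9346, "hid": 9158},
]


def group_by_split(actors, split):
    """Collect actor ids into one group per distinct split value."""
    groups = defaultdict(list)
    for actor in actors:
        groups[split(actor)].append(actor["pid"])
    return dict(groups)


# One group per household id, mirroring the role of Home.split().
homes = group_by_split(actors, split=lambda actor: actor["hid"])
print(homes)
```

Each key of homes corresponds to one household, so actors only share a Home with members of their own household.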

The second layer models workplaces. The actors are grouped by their NACE2 division, which is provided in the survey data, and n_actors = 10 sets the intended number of actors per workplace. The connection is weighted by each actor’s empirical daily work hours.

[4]:
class Work(p2n.LocationDesigner):
    n_actors = 10

    def filter(self, actor):
        """Ignore actors that have 0 work hours or an invalid NACE2 value."""
        return actor.work_hours_day > 0 and actor.nace2_division > 0

    def split(self, actor):
        """Group actors by NACE2 division."""
        return actor.nace2_division

    def weight(self, actor):
        """Weight the connection between the actor and the Work instance
        by the actor's daily work hours."""
        return actor.work_hours_day
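
The interplay of filter(), split(), and n_actors can be sketched without pop2net as follows. This is only an illustration under simple assumptions: non-matching actors are dropped, the rest are grouped by division, and each group is cut into chunks of at most n_actors. The helper build_workplaces and the toy population are hypothetical, and pop2net may handle leftover actors differently.

```python
from collections import defaultdict

# Hypothetical toy population: 25 actors in division 86, 3 in division 87,
# and 4 non-working actors (0 hours, division -2 as in the survey data).
actors = (
    [{"work_hours_day": 8.0, "nace2_division": 86} for _ in range(25)]
    + [{"work_hours_day": 7.5, "nace2_division": 87} for _ in range(3)]
    + [{"work_hours_day": 0.0, "nace2_division": -2} for _ in range(4)]
)


def build_workplaces(actors, n_actors):
    """Filter, group by division, then chunk each group into workplaces."""
    groups = defaultdict(list)
    for actor in actors:
        # Same condition as Work.filter().
        if actor["work_hours_day"] > 0 and actor["nace2_division"] > 0:
            groups[actor["nace2_division"]].append(actor)

    workplaces = []
    for division, members in groups.items():
        # Simple chunking; the last chunk of a division may be smaller.
        for start in range(0, len(members), n_actors):
            workplaces.append((division, members[start:start + n_actors]))
    return workplaces


workplaces = build_workplaces(actors, n_actors=10)
sizes = [(division, len(members)) for division, members in workplaces]
print(sizes)
```

In this sketch, small divisions produce small workplaces, which is one way a small population can lead to locations that stay below the intended size.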

The third location type is the city. We build 2 cities by setting n_locations = 2. Using stick_together(), we make sure that actors of the same household live in the same city.

[5]:
class City(p2n.LocationDesigner):
    n_locations = 2

    def stick_together(self, actor):
        """Keep actors of the same household together when assigning the actors to cities."""
        return actor.hid
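
The idea behind stick_together() can be illustrated with a small plain-Python sketch. It assumes that actors sharing the same returned value are treated as one unit and that whole units are distributed over the available locations. The round-robin rule and the helper assign_to_cities are assumptions chosen for simplicity; pop2net may balance locations in a different way.

```python
# Stand-in actor records; the household with hid 7000 is hypothetical.
actors = [
    {"pid": 8974, "hid": 5199},
    {"pid": 5637, "hid": 5199},
    {"pid": 9346, "hid": 9158},
    {"pid": 1001, "hid": 7000},
    {"pid": 1002, "hid": 7000},
]


def assign_to_cities(actors, n_locations):
    """Distribute whole households over n_locations cities (round-robin)."""
    households = {}
    for actor in actors:
        households.setdefault(actor["hid"], []).append(actor["pid"])

    cities = [[] for _ in range(n_locations)]
    for index, members in enumerate(households.values()):
        # A household is never split: all members go to the same city.
        cities[index % n_locations].extend(members)
    return cities


cities = assign_to_cities(actors, n_locations=2)
print(cities)
```

Because households are assigned as units, city sizes can differ slightly even when the population is divided as evenly as possible.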

The fourth contact layer models schools consisting of multiple classrooms, each containing actors of the same age. Using nest(), we ensure that children from the same city visit the same school.

[6]:
class School(p2n.LocationDesigner):
    n_actors = 15  # Intended number of actors per school class.

    def filter(self, actor):
        """Ignore actors younger than 6 or older than 18."""
        return 6 <= actor.age <= 18

    def split(self, actor):
        """Group the actors by age."""
        return actor.age

    def weight(self, actor):
        """Weight the connection between the actor and the School by 6."""
        return 6

    def nest(self):
        """Nest this location type within the location type City."""
        return "City"
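
One way to picture the effect of nesting is that classes are formed separately inside each city, so a class never mixes pupils from different cities. The following plain-Python sketch illustrates this by grouping on the pair (city, age). The records and the city field are hypothetical, and this is not a description of how pop2net implements nest() internally.

```python
from collections import defaultdict

# Hypothetical pupils with an already assigned city index.
pupils = [
    {"pid": 1, "age": 7, "city": 0},
    {"pid": 2, "age": 7, "city": 1},
    {"pid": 3, "age": 7, "city": 0},
    {"pid": 4, "age": 12, "city": 1},
    {"pid": 5, "age": 30, "city": 1},  # filtered out: not of school age
]

classes = defaultdict(list)
for pupil in pupils:
    # Same age condition as School.filter().
    if 6 <= pupil["age"] <= 18:
        # Grouping by (city, age) keeps every class inside one city.
        classes[(pupil["city"], pupil["age"])].append(pupil["pid"])
classes = dict(classes)
print(classes)
```

The two 7-year-olds in city 0 share a class, while the 7-year-old in city 1 ends up in a separate one.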

Create the necessary pop2net objects:

[7]:
env = p2n.Environment()
creator = p2n.Creator(env)
inspector = p2n.NetworkInspector(env)

In the following, we build the network. 100 rows are sampled from df_soep and translated into actors. The argument sample_level ensures that we always sample complete households. Using the argument location_designers, we define which contact layers are used to build the network.

[8]:
creator.create(
    df=df_soep,
    n_actors=100,
    sample_level="hid",
    location_designers=[
        Home,
        City,
        Work,
        School,
    ],
)

inspector.plot_networks(location_color="label")
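
Sampling at the household level can be sketched in plain Python as shown below. The sketch assumes that whole households are drawn with replacement until at least n_actors rows are collected; this rule, the helper sample_households, and the toy rows are assumptions for illustration, not pop2net's documented behavior. The assumption would be consistent with a sample that slightly exceeds the requested size whenever the last household drawn does not fit exactly.

```python
import random

# Hypothetical households: hid -> household size.
household_sizes = {1: 1, 2: 2, 3: 3, 4: 4, 5: 2}
rows = [
    {"hid": hid, "pid": hid * 10 + member}
    for hid, size in household_sizes.items()
    for member in range(size)
]


def sample_households(rows, level, n_actors, seed=0):
    """Draw complete units (e.g. households) until n_actors rows are reached."""
    rng = random.Random(seed)
    by_unit = {}
    for row in rows:
        by_unit.setdefault(row[level], []).append(row)

    unit_ids = list(by_unit)
    sample = []
    while len(sample) < n_actors:
        # Add every member of the drawn unit, never a partial household.
        sample.extend(by_unit[rng.choice(unit_ids)])
    return sample


sample = sample_households(rows, level="hid", n_actors=10)
print(len(sample))
```

Because units are drawn with replacement in this sketch, the same household can appear more than once, which is also how a sample could become larger than the source data.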
[9]:
inspector.eval_affiliations()


______________________________________
Number of locations
______________________________________

                count
location_label
Home               39
Work               29
School              9
City                2


______________________________________
Number of actors per location
______________________________________

                     mean       std   min    25%   50%    75%   max
location_label
City            50.500000  2.121320  49.0  49.75  50.5  51.25  52.0
Home             2.589744  1.312249   1.0   2.00   2.0   3.00   6.0
School           1.444444  0.527046   1.0   1.00   1.0   2.00   2.0
Work             1.655172  1.203443   1.0   1.00   1.0   2.00   6.0


______________________________________
Number of affiliated locations per actor
______________________________________

      n_affiliated_locations
mean                2.603960
std                 0.491512
min                 2.000000
25%                 2.000000
50%                 3.000000
75%                 3.000000
max                 3.000000

You may have noticed that some of the school classes and workplaces are underfilled because the overall population is too small. Let’s create a new network model and increase the population size to 5000 actors:

[ ]:
env = p2n.Environment()
creator = p2n.Creator(env)
inspector = p2n.NetworkInspector(env)

creator.create(
    df=df_soep,
    n_actors=5000,
    sample_level="hid",
    location_designers=[
        Home,
        City,
        Work,
        School,
    ],
)

The table below shows that the population is now large enough to fill most school classes and workplaces close to their intended sizes, although a few small locations remain.

[11]:
inspector.eval_affiliations()


______________________________________
Number of locations
______________________________________

                count
location_label
Home             1640
Work              226
School             35
City                2


______________________________________
Number of actors per location
______________________________________

                       mean       std     min      25%     50%      75%     max
location_label
City            2500.500000  3.535534  2498.0  2499.25  2500.5  2501.75  2503.0
Home               3.049390  1.513212     1.0     2.00     3.0     4.00     9.0
School            13.142857  3.835920     4.0    10.00    15.0    16.00    18.0
Work               9.623894  1.855959     2.0    10.00    10.0    10.00    13.0


______________________________________
Number of affiliated locations per actor
______________________________________

      n_affiliated_locations
mean                2.526895
std                 0.503316
min                 2.000000
25%                 2.000000
50%                 3.000000
75%                 3.000000
max                 4.000000
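
To compare the two runs, the mean number of actors per location can be related to the intended size set via n_actors. The short calculation below copies the mean values from the two eval_affiliations() outputs; the notion of a "fill rate" is introduced here only for illustration.

```python
# Intended sizes from the class definitions (n_actors).
targets = {"Work": 10, "School": 15}

# Mean actors per location, copied from the two output tables.
mean_small = {"Work": 1.655172, "School": 1.444444}   # n_actors=100
mean_large = {"Work": 9.623894, "School": 13.142857}  # n_actors=5000

fill_small = {label: mean_small[label] / target for label, target in targets.items()}
fill_large = {label: mean_large[label] / target for label, target in targets.items()}

for label in targets:
    print(f"{label}: {fill_small[label]:.0%} -> {fill_large[label]:.0%}")
# Work: 17% -> 96%
# School: 10% -> 88%
```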