# Information Theory and the Immune System

EE376A (Winter 2019)

I am a PhD candidate in the Cancer Biology program at Stanford. I’m fascinated by how the immune system works, and my primary motivation for my thesis work is to better understand its intricacies in order to improve patient care. You’ll see that my outreach activity and final project are both linked to these interests. If you have similar interests, or if anything I’ve shared below sparks your own interest, please feel free to reach out! You can reach me at matallah@stanford.edu.

## Outreach Activity

### How do we measure information?

Information is a decrease in uncertainty. For my outreach activity I wanted to convey this concept to students and help them understand how to qualitatively estimate how much information is given by each of a set of ‘clues’, based on how close each clue gets them to an answer. At each step I asked them how many possible answers there were and how close they felt they were to getting the right answer (they indicated this by sliding a marker along an answer bar).

### Concept Introduction

We began with a simple example: suppose they asked me what my favorite color was.

Now I give them the first clue: it’s not black.

And the second clue: it’s the color of the sky.

When I asked them which clue got them closer to the answer, and then which one had more information, all of the students correctly identified the second clue. The younger students often said that they had the final answer after the second clue, although one particularly precise student observed that the sky could be almost any color depending on the weather, atmospheric contents, and time of day!

### Concept Application

With the concept of information content thus understood, we moved on to the main activity. I had four stuffed monkeys on my table who didn’t feel well – each student was invited to choose a monkey, and they would try to figure out what was wrong with it. They were then given a clipboard with the monkey’s “medical record”, containing a table listing the four possible things that could be making the monkey feel ill. Each of the four illnesses had three characteristics: how a patient with that illness would feel, what the symptoms were, and which immune cells would be active (I briefly explained that immune cells are cells in your body that help fight disease).

I then gave them the clues one by one, and they were instructed to circle each clue wherever they saw it on the clipboard. Some clues applied to most of the diseases, therefore barely reducing uncertainty and providing little information, whereas others were specific to a single disease. By the end of the activity they had circled things in each column, but one column had many more circles than the others. Linking this back to information theory, each clue allowed them to narrow down the possible diagnoses until the uncertainty had been reduced enough for them to diagnose the patient. We then reviewed the relationship between information and a decrease in uncertainty (or, getting closer to an answer), and I asked them which clues provided the most information and which diagnosis they would give their chosen monkey. With the exception of the very youngest students, they were usually able to correctly identify the diagnosis as well as the clues that gave the most information.
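
For readers who want a quantitative version of the activity, here is a minimal sketch. It assumes that every illness still on the list is equally likely, so that a clue narrowing the list from N possibilities to M carries log2(N/M) bits. The illnesses, clues, and the names `RECORD` and `information_bits` are invented for illustration and are not the actual contents of the monkeys' medical records.

```python
import math

# Hypothetical medical record: the clues consistent with each illness.
# These entries are invented for illustration only.
RECORD = {
    "cold":        {"feels tired", "runny nose", "T cells active"},
    "allergy":     {"feels tired", "runny nose", "mast cells active"},
    "stomach bug": {"feels tired", "tummy ache", "T cells active"},
    "splinter":    {"feels sore", "red finger", "neutrophils active"},
}

def information_bits(candidates, clue):
    """Return (remaining candidates, bits gained) after hearing a clue.

    Assumes all remaining illnesses are equally likely, so the information
    in a clue is log2(number before / number after).
    """
    remaining = [d for d in candidates if clue in RECORD[d]]
    if not remaining:
        raise ValueError("clue rules out every candidate")
    return remaining, math.log2(len(candidates) / len(remaining))

candidates = list(RECORD)
for clue in ["feels tired", "runny nose", "mast cells active"]:
    candidates, bits = information_bits(candidates, clue)
    print(f"{clue!r}: {bits:.2f} bits, {len(candidates)} left")
```

In this toy record the vague clue ("feels tired") is worth about 0.42 bits, while the specific clue ("mast cells active") is worth a full bit. Together the three clues add up to log2(4) = 2 bits, which is exactly the uncertainty of choosing among four equally likely illnesses.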

#### Summary of Outreach

I really enjoyed the outreach event, and was impressed by how quickly the students were able to grasp and apply the concept of qualitative measurement of information. They were excited and engaged, and I myself enjoyed seeing all the different activities and displays other students in the class had to offer. It was also fun to think about all the concepts we learned in class through the lens of choosing one to present to kids in an interesting way.

## Project

Title: Analysis of the Immune Communications Network

Authors: Michelle Atallah

#### Introduction

The immune system is a complex and dynamic network of immune cells and components that are responsible for defending us from pathogenic threats and returning us to health after illness and infection. Too much immune activity can be harmful, as occurs in autoimmune diseases, while too little can leave one unable to fight off infections, resulting in death. The ability to precisely modulate immune activity is therefore of high interest in both research and clinical medicine.

An effective immune response is the result of highly coordinated activity across the many components of the immune system (Altan-Bonnet 2019). Although recent technological advances have enabled us to collect information about immune responses in great detail, there currently exists no method that allows us to integrate these measurements into a comprehensive, systems-level analysis of the immune response (Kidd 2014). As a result these studies are largely descriptive and provide very few mechanistic insights.

The fields of genetics and proteomics have long benefited from network analysis tools that interpret experimental data in the context of existing knowledge of genetic and proteomic biological networks (Huang 2009), recognizing that no component in these systems exists or functions in isolation. Despite equal recognition that the immune system also functions as an integrated system, no such tool exists for immunology. The goal of this project is to take a first step towards developing such a tool.

#### Preliminary Data

For the first part of my PhD I set out to build a network map of the immune system to serve as a knowledgebase upon which immune network analysis methods could be built. To this end I recently built and characterized a curated map of the immune interactome by manually extracting the immune interactions described in Janeway’s Immunobiology, 9th edition (Murphy 2017), a canonical immunology textbook (Figure 1a). In this network, entities that interact with the immune system or participate in an immune response are represented as nodes. Edges describe the interactions between the nodes, with information recorded for each edge including the type of interaction, the receptors or other molecules involved, and the immune processes in which the edge participates. By structuring the information presented in the textbook in this way, I have created what is, to the best of my knowledge, the first system-wide graphical representation of the human immune interaction network (Figure 1b).

This graph essentially represents the immune communications network, and makes it easier to examine how information is transmitted across the different components of the immune system. This is a large network, with 253 nodes and 1112 edges. With a low density of 0.02 and a short average path length of 3.25, the immune network is a sparse but very efficiently connected network. This makes sense: the low density reflects the specialization of immune components while the short path length is appropriate given the need for effective communication (in the event of pathogenic invasion, a rapid immune response is of the essence in containing and eliminating an infection).
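
As a rough check on these summary statistics, the sketch below computes density and average path length with the Python standard library only. It assumes the graph is directed with no self-loops, so density is edges / (nodes × (nodes − 1)). The function names and the four-node toy graph are illustrative and are not the curated immune network itself.

```python
from collections import deque

def directed_density(n_nodes, n_edges):
    """Fraction of possible directed edges (no self-loops) that are present."""
    return n_edges / (n_nodes * (n_nodes - 1))

def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs that are reachable."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from this source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else float("nan")

# Node and edge counts reported in the text.
print(round(directed_density(253, 1112), 4))  # 0.0174

# Toy directed graph, for illustration only.
toy = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
print(average_path_length(toy))
```

Under the directed formula, 253 nodes and 1112 edges give a density of about 0.017, which rounds to the 0.02 quoted above. An undirected formula would give roughly twice that value.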

#### Goals

Immune responses take place wherever there is a pathogen or infection, often in the tissues. However, it is difficult to collect samples of tissues from humans – oftentimes blood is the only sample available. The first goal for this project was therefore to establish whether levels of circulating immune cells in the blood accurately reflect active immune mechanisms in humans. To this end I found a dataset of longitudinal immune measurements taken from peripheral blood of patients who had received either a flu vaccine, pneumonia vaccine, or an inert saline injection. I examined these data for time-lagged correlations between levels of circulating immune cells, and asked whether they reflect known immune mechanisms (Goal 1) and whether I could map these correlations back onto my immune interaction network to reconstruct the path through the network that the immune system took to produce the current active immune response (Goal 2).

#### Datasets

The data used for this project were generated as part of an IRB-approved study (Obermoser 2013) and are publicly available on ImmuneSpace (immunespace.org; Brusic 2014) under accession number SDY180.

This dataset consists of 47 patients randomized into three cohorts, each of which received one of the following injections: an influenza vaccine (Fluzone), a pneumonia vaccine (Pneumovax23), or saline (control). Blood was collected from each patient at 9 timepoints ranging from before vaccination until 28 days post-vaccination, and assays were done to examine levels of circulating cytokines as well as populations of circulating immune cells.

#### Methods

The data described above were downloaded as CSV files and analyzed in R. The data were filtered to select only timepoints that were regularly spaced (-7, 0, 7, 14, 21, and 28 days post-vaccination) in order to enable normalized cross-correlation analysis. Cross correlation was calculated for each immune cell pair, and the highest correlation for each pair (along with its lag) was extracted. Patients were grouped according to treatment cohorts, and correlations were compared across the cohorts.
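
The lag search described above can be sketched as follows. The original analysis was done in R and its exact normalization is not stated, so this Python version is an assumption: it computes a Pearson correlation on the overlapping portion of the two series at each lag and keeps the largest signed value. The names `pearson` and `best_lag`, the `max_lag` default, and the example series are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant series
    return sxy / math.sqrt(sxx * syy)

def best_lag(x, y, max_lag=2):
    """Return (lag, correlation) maximizing corr(x[t], y[t + lag]).

    A positive lag means that changes in x precede changes in y.
    """
    best = None
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[: len(x) - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[: len(y) + lag]
        r = pearson(xs, ys)
        if not math.isnan(r) and (best is None or r > best[1]):
            best = (lag, r)
    return best

# Made-up levels of two cell types at six weekly timepoints;
# y repeats x one timepoint later.
x = [1, 5, 2, 8, 3, 6]
y = [4, 1, 5, 2, 8, 3]
lag, r = best_lag(x, y)
print(lag, round(r, 3))  # 1 1.0
```

With only six timepoints, a lag of 2 leaves just four overlapping samples, so correlations at larger lags rest on very little data and are best treated as exploratory.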

#### Results

Because I am interested in identifying how information is transmitted across the immune network and by which cells it is transmitted, I focused on strong correlations (correlation scores > 0.5) with a lag > 0. There were 1639 immune cell pairs that fit these criteria among flu patients, 942 pairs in pneumonia patients, and 809 pairs in patients who had received an inert saline injection. The count for the saline group is higher than expected, as these patients’ immune systems were not deliberately stimulated and therefore theoretically shouldn’t have initiated any immune response. However, it is difficult to control for environmental exposure in human subjects, and it’s possible that these patients’ immune systems were responding to something other than the experimental intervention.

Many of the correlation results were as expected. For example, CD3+ cells (all T cells) correlated nearly perfectly with CD4+ cells (a subset of CD3+ T cells) with a lag of 0.
In addition, plasmablasts (activated but not yet terminally differentiated B cells) correlated quite well (correlation score of 0.798) with plasma cells (terminally differentiated effector B cells) with a time lag of 1. Given that samples were taken once a week, this is consistent with the known timeline of B cell differentiation, which is estimated at 7-10 days (Murphy 2017).
These sanity checks recapitulated known dynamics of immune responses, providing a level of confidence that further analyses would be reliable.

##### Enriched cell types across patient cohorts

After finding these strong correlations for each patient group, I began by asking whether the cell types involved in these significant correlations differed across the groups. I found that they did, and that the cell types involved were more similar between the flu and pneumonia vaccine cohorts than between either vaccine cohort and the saline control group (Figure 2).
The saline cohort had a higher representation of many types of activated helper T cells. The only exception was T follicular helper cells, a cell type that aids in the activation and proliferation of B cells to initiate an antibody response. It therefore makes sense that this was the only T helper cell population enriched in both vaccine cohorts, as both vaccines elicit antibody responses. Likewise, the saline group showed far lower levels of antibody-secreting plasma cells.

##### Comparison of the influenza and pneumonia vaccines

I began comparing the treatment groups by looking for directional interactions (pairs of immune cells with a correlation above 0.5 and a time lag > 0) that were common to patients who had received the flu vaccine and those who had received the pneumonia vaccine. There were only 27 such interactions, of which the majority (n = 17) involved regulatory T cells and plasma cells (antibody-producing B cells). Regulatory T cells are responsible for limiting immune reactions to prevent excessive damage to the host, and both vaccines are known to produce protective antibodies (made by plasma cells). It is therefore expected that this part of the immune mechanism is shared between the two patient cohorts.

##### Pathway tracing and validation of multi-step pathways

Because each timepoint in my dataset is 7 days apart, correlations in which the lag is > 1 or < -1 suggest that there is an additional effector cell or immune component facilitating the interaction. I randomly selected several of these multi-step interactions and attempted to identify the intermediate effector components using my directional immune communications graph. An effector was identified in every instance. In some cases only a single mediator was found on the path between the two nodes (Figure 3a); in other cases a few possibilities existed (Figure 3b). Given the highly interconnected nature of the immune network, the number of possibilities expanded considerably with each additional lag step. Figure 3c illustrates the possible mechanisms behind the identified correlation of CD4+ T cells with neutrophils with a lag of 2 timepoints.
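
The search for intermediate effectors can be sketched as a path enumeration on a directed graph. This sketch assumes that a lag of k timepoints corresponds to a path of k edges, which is a simplification. The function `paths_of_length` and the edges in `toy_network` are hypothetical and are not taken from the curated network.

```python
def paths_of_length(adj, source, target, n_steps):
    """List all simple directed paths from source to target with n_steps edges."""
    found = []

    def extend(path):
        if len(path) - 1 == n_steps:
            if path[-1] == target:
                found.append(list(path))
            return
        for nxt in adj.get(path[-1], ()):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                path.append(nxt)
                extend(path)
                path.pop()

    extend([source])
    return found

# Hypothetical edges, for illustration only.
toy_network = {
    "CD4+ T cell": ["IL-17", "IFN-gamma", "B cell"],
    "IL-17": ["neutrophil"],
    "IFN-gamma": ["macrophage", "neutrophil"],
    "macrophage": ["neutrophil"],
    "B cell": ["plasma cell"],
}

for path in paths_of_length(toy_network, "CD4+ T cell", "neutrophil", 2):
    print(" -> ".join(path))
```

Even in this tiny example there are two candidate mediators for a two-step path. In a network with over a thousand edges, the number of candidate paths grows quickly with each added step, which matches the difficulty described above.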

#### Conclusions

This project confirmed that levels of circulating immune cells do appear to reflect active immune mechanisms in humans, which means that peripheral blood samples can reliably be used to assess immune system activity. Though I was able to identify time-lagged correlations between immune components, the high level of connectivity between components in the immune network made it difficult to map these correlations onto the network in a way that would output a suggested path that resulted in the current immune response.

#### Future Directions

I believe that the field of immunology will benefit greatly from increased application of quantitative techniques to biological problems. Many concepts from information theory are especially relevant, as our understanding of immune responses is essentially an understanding of how information is shared across the immune network. Two particularly promising directions that I would like to pursue in the future are below:

Calculate directed mutual information to measure the strength of edges in the network. Though we have a relatively solid understanding of the relationships between immune components, the relative strength of each relationship is not yet known. For example, it is now well established that in cancer patients, immune cells are often activated against tumor antigens but unable to perform their functions because the tumor suppresses the immune system (Allegrezza 2016). There are thus multiple activating mechanisms as well as multiple suppressive mechanisms at play, but how a cell integrates these conflicting signals and decides what to do is not known. Mutual information might be an approach that can help us discover how cells weigh and integrate these signals.

Quantify the information in each immune signal. Because the immune system needs to be able to do its job even in the midst of significant interference, there’s a significant amount of redundancy in the immune communication network. Continuing the example of suppression of the immune system by cancer, there exist many immune “checkpoints”, which are receptors on the surface of immune cells that tumors use to suppress immune function (Topalian 2015). While therapies targeting some of these checkpoints have had significant success in the clinic, others chosen based on knowledge of their near-identical functions have been completely ineffective in patients (Lowe 2018). If we could quantify the amount of information in each of these many seemingly similar signals, we could potentially use it to select the mechanism that is most likely to result in clinical benefit if targeted therapeutically.
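
Both directions rest on estimating entropy and mutual information from measurements. As a starting point, here is a minimal sketch of the simplest "plug-in" estimators for discrete data. It computes ordinary (undirected) mutual information; a directed version would additionally condition on each series' past. The function names and the example signals are assumptions made for illustration, and real immune measurements would first need to be discretized.

```python
import math
from collections import Counter

def entropy(xs):
    """Plug-in Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Made-up discretized signal and two candidate responses.
signal = ["on", "on", "off", "off"]
follows = ["high", "high", "low", "low"]   # fully determined by the signal
ignores = ["high", "low", "high", "low"]   # unrelated to the signal

print(mutual_information(signal, follows))  # 1.0
print(mutual_information(signal, ignores))  # 0.0
```

Plug-in estimates are biased when there are few samples relative to the number of distinct symbols, so with only a handful of timepoints per patient the resulting numbers should be interpreted cautiously.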

The principles of information theory apply to so many different fields, and I look forward to continuing to apply the concepts I’ve learned in this class to my research going forward.

#### References

Allegrezza MJ, Conejo-Garcia JR. Targeted Therapy and Immunosuppression in the Tumor Microenvironment. Trends Cancer. 2017 Jan;3(1):19-27. doi: 10.1016/j.trecan.2016.11.009. Epub 2016 Dec 23. Review. PubMed PMID: 28718424.

Altan-Bonnet G, Mukherjee R. Cytokine-mediated communication: a quantitative appraisal of immune complexity. Nat Rev Immunol. 2019 Feb 15. doi: 10.1038/s41577-019-0131-x. [Epub ahead of print] Review. PubMed PMID: 30770905.

Brusic V, Gottardo R, Kleinstein SH, Davis MM; HIPC steering committee. Computational resources for high-dimensional immune analysis from the Human Immunology Project Consortium. Nat Biotechnol. 2014 Feb;32(2):146-8. doi: 10.1038/nbt.2777. Epub 2014 Jan 19. PubMed PMID: 24441472; PubMed Central PMCID: PMC4294529.

Huang DW, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37:1-13.

Kidd BA, Peters LA, Schadt EE, Dudley JT. Unifying immunology with informatics and multiscale biology. Nat Immunol. 2014 Feb;15(2):118-27. doi: 10.1038/ni.2787. Review. Erratum in: Nat Immunol. 2014 Sep;15(9):894. PubMed PMID: 24448569; PubMed Central PMCID: PMC4345400.

Lowe, D. IDO appears to have wiped out. Science Translational Medicine, 2018. http://blogs.sciencemag.org/pipeline/archives/2018/05/01/ido-appears-to-have-wiped-out

Murphy K, Weaver C. Janeway’s Immunobiology. 9th ed. United States: Garland Science/Taylor & Francis Group; 2017.

Obermoser G, Presnell S, Domico K, Xu H, Wang Y, Anguiano E, Thompson-Snipes L, Ranganathan R, Zeitner B, Bjork A, Anderson D, Speake C, Ruchaud E, Skinner J, Alsina L, Sharma M, Dutartre H, Cepika A, Israelsson E, Nguyen P, Nguyen QA, Harrod AC, Zurawski SM, Pascual V, Ueno H, Nepom GT, Quinn C, Blankenship D, Palucka K, Banchereau J, Chaussabel D. Systems scale interactive exploration reveals quantitative and qualitative differences in response to influenza and pneumococcal vaccines. Immunity. 2013 Apr 18;38(4):831-44. doi: 10.1016/j.immuni.2012.12.008. PubMed PMID: 23601689; PubMed Central PMCID: PMC3681204.

Topalian SL, Drake CG, Pardoll DM. Immune checkpoint blockade: a common denominator approach to cancer therapy. Cancer Cell. 2015 Apr 13;27(4):450-61. doi: 10.1016/j.ccell.2015.03.001. Epub 2015 Apr 6. Review. PubMed PMID: 25858804; PubMed Central PMCID: PMC4400238.