Sunday, July 16, 2023

Day 4 of #100daysofnetworks

Welcome to day 4 of #100daysofnetworks.

If you would like to learn more about networks and network analysis, please buy a copy of my book!

You can find the code for today's exercise here.

Today, I am going to show you how to ZOOM IN on any part of a network. We've made good progress on this adventure, so far, and we're following a logical path.

  • Day 1: We discussed expectations for this journey
  • Day 2: We covered network basics and did whole network visualization
  • Day 3: We talked about centralities and other ways to identify important nodes and edges
  • Day 4: We are going to learn how to zoom in on those important nodes

Why would we want to ZOOM IN on important nodes? Well, there are different ways to look at any network:

  • Whole Network Analysis (WNA): you can learn about the overall shape and size of a network. All networks are unique. Even the same network will be unique, if looked at temporally, as networks evolve over time. Whole Network Analysis is just a snapshot in time.
  • Egocentric Network Analysis: this is what we are doing today. Zooming in on individual nodes will tell you about an ego node's connections (alters), and a bit about alters' connections too.
  • Community Analysis: If Whole Network Analysis is at the WHOLE NETWORK scale, and Egocentric Network Analysis is at the NODE neighborhood scale, then community analysis is zoomed out a bit from Egocentric Network Analysis. In community analysis, I'm looking at groups of nodes. I am less interested in single nodes. I am more interested in how nodes behave together, or collaborate.

But today's discussion is on Egocentric Network Analysis. We are going to ZOOM IN on nodes of interest. That is the simplest way to think about Egocentric Network Analysis. It is less complicated than it sounds.

Whole Network - Spot Check

It is ALWAYS a good idea to start any network analysis with Whole Network Analysis. We've looked at this network many times and know that it is small and simple enough to visualize, so let's do that and use our eyes for insights.


This should look familiar. We've looked at this a few times by now. You should be able to see a few key nodes and a few key groups, even without looking closer.
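As a quick refresher, here is a minimal sketch of a whole network spot check. It assumes NetworkX and the Les Miserables graph that ships with it (`nx.les_miserables_graph()`); the notebook linked above may load and draw the network differently. The `draw_whole_network` helper is a name I made up for this sketch, and it needs matplotlib only if you call it.

```python
import networkx as nx

# Assumption: the network is the Les Miserables character graph bundled with NetworkX.
G = nx.les_miserables_graph()

# Basic whole-network facts: size, density, and whether it is one connected piece.
print(G.number_of_nodes(), "nodes")
print(G.number_of_edges(), "edges")
print(f"density: {nx.density(G):.3f}")
print("connected:", nx.is_connected(G))


def draw_whole_network(graph):
    """Draw the whole graph with labels (hypothetical helper; needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for drawing

    pos = nx.spring_layout(graph, seed=42)  # fixed seed for a repeatable layout
    plt.figure(figsize=(12, 12))
    nx.draw_networkx(graph, pos, node_size=80, font_size=8, edge_color="gray")
    plt.axis("off")
    plt.show()
```

Call `draw_whole_network(G)` in a notebook to render the picture.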

Next, I am going to show you how to "zoom in" on any node in the network. Scroll up and try to identify all of Claquesous's connections. It's very difficult to do, because he is part of a denser area in the network. For this, we need to be able to look closer.

The best first option for looking closer is to look at a node's Ego Graph. In an Ego Graph, the node of interest (Claquesous) sits in the center and is known as the ego node. All of the node's direct connections are drawn around the ego node, and they are known as alters. The two things to keep in mind: ego and alters. The ego is in the center, the alters are around it.

A very cool thing that can happen in an Ego Graph is that you will also be able to see alters' connections to other alters. Rather than an Ego Graph simply being a star, sometimes there are other connections that can be interesting. In those cases, you can look closer with your eyes, or you can take another approach, which I do often: drop the center, and the alters will show as isolates and small groups.
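The ego and alter idea can be sketched in a few lines. This assumes NetworkX and its bundled Les Miserables graph, which may not be exactly how the notebook does it. `nx.ego_graph` with its default radius of 1 returns the ego, its alters, and any edges among the alters.

```python
import networkx as nx

G = nx.les_miserables_graph()  # assumed data source

ego = "Claquesous"
ego_net = nx.ego_graph(G, ego)  # default radius=1: ego, alters, and edges among them

# Alters are every node in the Ego Graph except the ego itself.
alters = sorted(n for n in ego_net if n != ego)

# Edges that do not touch the ego are alter-to-alter connections.
alter_edges = [(u, v) for u, v in ego_net.edges() if ego not in (u, v)]

print(f"{ego} has {len(alters)} alters: {alters}")
print(f"{len(alter_edges)} edges connect alters to each other")
```

If the second number is zero, the Ego Graph is a plain star. Anything above zero means there is structure among the alters worth a closer look.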

I will attempt to show all of this in this notebook. First, let's use PageRank to identify the most important nodes in this network.


In the code, I show an easy way to extract a list of the top N nodes and use them for Egocentric Network Analysis in our next steps. I also show how to do each of the individual visualizations shown below. Please get to know the code, and try it for yourself!
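Here is one way the top N extraction could look, as a sketch rather than the exact notebook code. It assumes NetworkX and its bundled Les Miserables graph, and `top_n_by_pagerank` is a helper name invented for this example. Note that recent NetworkX releases need SciPy installed for `nx.pagerank`, and that the function uses the edge `weight` attribute by default when one exists, so the ranking can differ from an unweighted run.

```python
import networkx as nx


def top_n_by_pagerank(graph, n=5):
    """Return the n node names with the highest PageRank, best first."""
    scores = nx.pagerank(graph)  # dict: node -> score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


G = nx.les_miserables_graph()  # assumed data source
top_nodes = top_n_by_pagerank(G, n=5)
print(top_nodes)
```

The resulting list can be fed straight into a loop that builds one Ego Graph per character.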

Now, let's look through the top five characters shown in the above visualization.

Valjean

Here is Valjean's Ego Graph. 


Even without clicking the image for a closer look, I can see that there is one CENTER node (ego: Valjean) and lots of peripheral nodes (alter nodes). I can see that this is not a simple star network, but that there is some clustering on the center left, bottom right, and top center right. These are groups that exist in this Ego Graph. 

When an Ego Graph has plenty of complexity and is interesting to look at, one of my favorite tricks is to DROP the center. By doing this, it drops the ego node (Valjean) out of the graph, causing the graph to shatter into pieces, exposing the groups that exist in the graph. Let's do that.


Even without clicking the image to look closer, I can see that the center node is gone and that the network has shattered into pieces. When a network shatters into pieces, it often exposes some of the things I've talked about previously:

  • Connected Components: there are often several clusters of nodes still linked together. Above, I can see one large cluster on the left, and one smaller cluster on the top right. Look for a few dots situated closely together on the top right. That's the second group.
  • Isolates: there are also often several isolate nodes, which are nodes with no connections whatsoever. Above, I can see five isolate nodes. They were only connected to Valjean. With Valjean removed from his own Ego Graph, these nodes became isolates.

But most importantly, we've identified that Valjean is connected to two separate groups. The differences between these groups could make for interesting analysis. Why are they not connected? What do they do differently? And why are none of the isolates connected to anything else? What makes the isolates so utterly unspecial or special that nothing is connected to them?
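Dropping the center and counting what is left can be sketched like this. It assumes NetworkX and its bundled Les Miserables graph; `drop_center` and `shatter_summary` are helper names I made up for the sketch. `nx.ego_graph` returns a copy, so removing the ego does not touch the original graph.

```python
import networkx as nx


def drop_center(graph, ego):
    """Build the Ego Graph for `ego`, then remove the ego node from it."""
    ego_net = nx.ego_graph(graph, ego)  # a copy, safe to modify
    ego_net.remove_node(ego)
    return ego_net


def shatter_summary(graph, ego):
    """Return (groups, isolates) left behind after dropping the ego."""
    dropped = drop_center(graph, ego)
    isolates = sorted(nx.isolates(dropped))  # nodes with no edges left
    # Connected components with more than one node are the surviving groups.
    groups = [sorted(c) for c in nx.connected_components(dropped) if len(c) > 1]
    groups.sort(key=len, reverse=True)
    return groups, isolates


G = nx.les_miserables_graph()  # assumed data source
groups, isolates = shatter_summary(G, "Valjean")
print(f"{len(groups)} groups with sizes {[len(g) for g in groups]}")
print(f"{len(isolates)} isolates: {isolates}")
```

Swap in any other character name to repeat the exercise for that node.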

Let's keep moving. I am going to do the same for the next four important characters. We could do this for every single node in the network, and it would take a very long time to analyze, but a tremendous amount of learning could be done about the story of Les Miserables if network analysis was used along with content analysis to dig deeper. Thus, the marriage of Network Science, Social Network Analysis, and Natural Language Processing is special and important to me. Moving on.

For these next characters, put your thinking caps on. Look at the images and try to answer the questions I ask.

Myriel


Myriel's Ego Graph is almost a star network, but there are three characters on the right who are connected with each other. Myriel has a high PageRank score because of the number of edges, but Myriel's Ego Graph is very simple. If we drop the center node, what do you think will happen? How many isolate nodes do you think we will see? How many groups will we see?


As expected, dropping the center node shattered the network and left one small group and several isolate nodes. I can see seven isolate nodes and one small group. What is this small group that Myriel was a part of? What do they believe and do? Who are their members? How do they know each other?

Gavroche


Like Valjean, Gavroche has a very interesting Ego Graph. I can see the one center node. How many groups do you see? A group can be two people. If we drop the center node, how many groups do you think we'll be able to see? How many isolates?


This graph actually tricked me. I expected that there'd be three separate groups, but that is because I simply was not looking closely enough. In the earlier image, it looked like there were three groups: top left, bottom left, and bottom right. There are three groups. However, a few people in the bottom group had connections with the top group, so dropping Gavroche was not enough to split these two groups. They have some cohesion.

Did you guess the number of groups correctly? How about the number of isolates?

One of the groups was Child 1 and Child 2. What is their relationship with Gavroche?

What is the isolate's relationship with Gavroche?

Finally, what is this larger cluster of characters? Why are two groups linked together, with or without Gavroche? Who are these people? How would the absence of Gavroche in the story affect these characters?

Marius


Marius' Ego Graph has some interesting complexity as well. I can see at least one densely connected group of nodes on the top left, and I get the feeling that this is actually two separate groups of people but that there is some cohesion with the top left group. I expect that this network will not shatter if we drop the center node and that there will be no isolates. What is your bet? Try to draw a mental picture of what will happen after Marius is dropped.


As I expected, the group remained intact, even with Marius removed. Valjean is an important node in this network, and he has helped keep it together, along with others. 

Who are these people? How do they know each other? Why is this network so resilient? If these characters are important, what would it take to eliminate their ability to work together? On the other hand, what would bolster the network? 

Javert


Javert's Ego Graph is the last we will do today, but we could go much further. Feel free to learn from my code and investigate every node in the network. It's a great way to explore and learn!

What do you see? I see two central nodes: Javert and Valjean. I see one group of nodes on the left, and they are connected with both Javert and Valjean. I see some characters on the right who have connections to characters on the left. Because of all of this, if the center node is dropped, I suspect that this network will be resilient and not shatter. None of the nodes have fewer than two edges, and dropping Javert removes at most one edge from each of them, so I expect we will have zero isolates, because 2 - 1 = 1. Every remaining node will have at least one edge to another node. The network will remain intact. Essentially, this is a large group of connected individuals.
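The degree argument can be checked before dropping anything. In an Ego Graph, every alter has at least one edge (the one to the ego), so the alters with exactly one edge are the ones that will become isolates. Here is a small sketch of that check, assuming NetworkX and its bundled Les Miserables graph; `predicted_isolates` is a hypothetical helper name.

```python
import networkx as nx


def predicted_isolates(graph, ego):
    """Alters whose only edge in the Ego Graph is the one to the ego."""
    ego_net = nx.ego_graph(graph, ego)
    return sorted(n for n, deg in ego_net.degree() if n != ego and deg == 1)


G = nx.les_miserables_graph()  # assumed data source
predicted = predicted_isolates(G, "Javert")

# Compare the prediction with what actually happens after the drop.
dropped = nx.ego_graph(G, "Javert")
dropped.remove_node("Javert")
actual = sorted(nx.isolates(dropped))

print("predicted isolates:", predicted)
print("actual isolates:   ", actual)
```

The two lists always match, because removing the ego lowers each alter's edge count by exactly one.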


As I suspected, the network did not shatter. After dropping the center node, all remaining nodes are still connected with other nodes. The dense group is a little more discernible.

What is this group? Why are two very important characters so central in this network? 

What are the Takeaways?

It is fun to explore any kind of social network, no matter the topic. You can learn a lot about any topic by exploring the social networks that exist inside that topic. In today's exercise, the topic is Les Miserables. We could have taken the raw text of this story, used the techniques from my book, and literally converted raw text into an explorable network. We can use the text of the book alongside the network to learn more about individual communities, and I will show how to do this at a later date. This is new material that is not included in the first edition of my book.

This exercise also showed that some network shapes are more resilient to attack than others. For instance, if you take a near-star network (like Myriel's, the second character) and drop the center node, the network shatters into pieces. If you take a more densely connected network and drop even the most important node, the network can still remain intact. What are the implications of that for cybersecurity, for leadership, for national security, for teamwork, or for your own life? What fragile networks exist in your own life? What resilient networks exist in your own life?
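The fragile versus resilient contrast is easy to see on two toy graphs. This sketch assumes NetworkX and uses its built-in `nx.star_graph` and `nx.complete_graph` generators as stand-ins for a fragile and a resilient network; `pieces_after_removal` is a name I made up for the example.

```python
import networkx as nx


def pieces_after_removal(graph, node):
    """Count connected components left after removing one node."""
    remaining = graph.copy()  # leave the original graph untouched
    remaining.remove_node(node)
    return nx.number_connected_components(remaining)


star = nx.star_graph(6)       # node 0 is the hub, nodes 1..6 are leaves
dense = nx.complete_graph(7)  # every node is linked to every other node

print(pieces_after_removal(star, 0))   # 6: every leaf becomes an isolate
print(pieces_after_removal(dense, 0))  # 1: the rest stays in one piece
```

Both graphs have seven nodes, and only the shape differs. Real networks fall somewhere between these two extremes.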

For instance, in my own life, I am part of the LinkedIn Data Science community, and regularly post content and participate in conversations. That is a densely connected network, and that network would not be affected by my absence, or any one person's absence. It would just continue to grow and evolve. That's a resilient network that exists in my life. How about a fragile network? I have very few friends I hang out with in person. In a small group, if one person is removed, the effect is devastating on the group.

Let's Zoom Out a Little

I hope you have learned a bit from these discussions. We've already covered enough material for you to jump into network analysis. We haven't talked at all about network construction, but we've found a network to use and learn from. I promise, very soon, we are going to construct our own Graphs, not use something pre-made. I enjoy using networks to explore reality, not just use someone else's networks for learning. 

We've learned how to construct a graph, render visualizations, identify important nodes, and zoom in on important nodes. These are fundamentals that you need, and you have them now, and we're only on day four. Getting the fundamentals out of the way in the beginning will leave us with a lot more time to explore. 

What are you waiting for?

If you find this content interesting, please jump in and give this a try! Install Jupyter or use Google Colab and start exploring. You don't need to know everything on day one. Just get started. Learning to work with networks and explore relationships is powerful, and this skill becomes tremendously useful the deeper you go.

That's Enough for Today

I hope you found this to be an enjoyable read, and I hope my explanations made sense. This blog post was written quickly. If you would like to learn more about networks and network analysis, please buy a copy of my book!

