Saturday, July 8, 2023

Welcome to Day 1 of #100daysofnetworks!

Hello everyone! Welcome to day 1 of the 2023 edition of #100daysofnetworks!

As some of you are probably aware, this is the SECOND iteration of this adventure, with the first taking place in 2020. I learned so much from that first adventure, and I'm excited to do this again.

This first post is going to be a bit long, as I'd like to tell a bit about who I am and what I have in mind.

Who am I?

My name is David Knickerbocker. I am a software engineer. Since childhood, I have been obsessed with getting computers to do interesting things. As a child, I was excited when I could programmatically get them to make beep boop noises, and then became interested in creating ASCII art after that. As a teenager, I spent a lot of time building (and breaking) computers. As an adult, my earliest obsession was web development, but that has shifted to data engineering and then to data science. My entire life has revolved around working with computers and getting them to do what I want them to do.

My career has been entirely in Information Technology (IT), and it has led me to work as a web developer, SQL developer, database administrator, data operations (dataops) engineer, data engineer, platform engineer, and now as chief engineer in a company I am helping build.

My entire career's emphasis has been to help people. This has led me to working in cybersecurity, in a hospital, in companies that help the U.S. Military community, and eventually to building a company to solve certain specific problems. For me, everything I do and build is about using technology to help humans. I don't care about cybersecurity for the sake of cybersecurity or because it pays well. Helping people is my central mission.

Now, my work focuses on Natural Language Processing (NLP) and Network Analysis. The things I will discuss as part of this adventure are the types of things I work on at work. This isn't a hobby; it has real-world applicability for solving problems. These days, I am most obsessed with network and NLP insights. I enjoy using NLP and networks to map out relationships and to find hidden insights. The marriage of NLP and networks makes this possible.

Outside of work, I enjoy hanging out with my cats (Eddie and Echo), playing guitar, collecting fancy rocks and minerals, and playing video games.


This is Eddie. :)

Why am I doing this?

I launched the first iteration of this adventure in 2020, after completing more than 100 days of #100daysofnlp. During the NLP adventure, I noticed just how much network data was available in the wild, and how incapable I was of doing more than surface-level network analysis. I decided that I wanted to go much deeper. After 100 days of Natural Language Processing, I had built up a lot of skill and confidence. So, I wanted to see how far I could go with networks.

The answer is that it took me very far. It took me so far that I got a book deal and a company out of it, and am now so comfortable with network analysis that it has become muscle memory.

However, I haven't been able to focus on learning new things about networks as I once did. There is more that I have been wanting to explore, but I haven't found the time or a reason to commit.

This adventure is my commitment to spending another hundred days (at least) learning new things about network analysis and network science. I am excited to see what I will learn in this next iteration, and I'm excited to get others excited about this stuff as well.

This is a creative outlet for me, but it is also research. The things I learned in the first adventure, I use at work, and they went into my book. The things I learn in this new adventure, I will use at work, and they will appear in the second edition of my book.

This adventure will lead to new insights and techniques, and it will build skill, confidence, and intuition, as well.

Finally, the last reason is selfish: I enjoy sharing knowledge. I enjoy talking about cool stuff I'm learning, and useful things I have learned.

What did I learn in 2020?

I can still remember day 1 of the earlier #100daysofnetworks. I had just completed #100daysofnlp and felt so inadequate in my ability to use large networks. The first days were very awkward for me. I had been working with network analysis since about 2018, when I was using network science for understanding dataflows as a data operations engineer, but I had never analyzed or visualized networks with thousands of nodes.

Back in 2018 or so, I had also built my first social network, using text from the book of Genesis as data. I wanted to do more stuff like that, so I did. I found that practically any text could be used for constructing networks. I wrote about that in my book. I will be showing more of that in this project. 
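
To make the idea of building a network from text concrete, here is a minimal sketch. It is not the method from the book, just one simple assumption-laden approach: treat two characters as connected whenever their names appear in the same sentence. The function name `cooccurrence_edges`, the naive sentence splitting, and the sample sentences (made up for illustration, not quoted from Genesis) are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(text, names):
    """Count how often each pair of names appears in the same sentence."""
    edge_weights = Counter()
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = set(re.findall(r"[A-Za-z]+", sentence))
        # Sort so that each pair is always stored in the same order.
        present = sorted(name for name in names if name in words)
        for a, b in combinations(present, 2):
            edge_weights[(a, b)] += 1
    return edge_weights


# Made-up sample sentences, used only to exercise the function.
sample_text = (
    "Abraham spoke with Sarah. Sarah laughed. "
    "Abraham and Isaac walked together. Isaac asked Abraham a question."
)

edges = cooccurrence_edges(sample_text, {"Abraham", "Sarah", "Isaac"})
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The result is a weighted edge list, which is all a graph library needs to build a network. Real text needs much more care (named entity recognition, aliases, pronouns), which is where NLP comes in.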

In 2020, I was still using networkx for network visualization, and I suspected that there were better libraries available. There's still no great way to visualize networks in Python, but it's getting better. I've shown how I do it in my book, and that'll be part of this project as well.

I learned so much from reading various books on Network Science, Social Network Analysis, and Natural Language Processing. I will do another post about my favorite books, in upcoming weeks.

I learned some cool stuff about network fragility, but didn't explore the topic much. I plan to go into that further during this next iteration.

I practiced everything I learned, from my own experience before 2020 to everything I read about in 2020. I built skill and confidence, described these techniques in my book, and use these skills in my company as well.

What am I doing differently in 2023?

Most importantly, I am setting this up properly.

As I have written a book on this, I've built up an understanding of how this can be taught from basics to advanced, and I will follow that path this time, rather than being as disorganized as I was before.


What is the plan?

I plan on teaching the following, roughly in this order, though I may jump around a bit to keep this organized but still flexible.

  • Introductions - this post
  • Basics w/ networkx pre-made graphs
  • Open datasets
  • Building networks, manually
  • Building networks, automatically, using text
  • Creating datasets
  • Cleaning networks
  • Graph metrics (centralities, etc)
  • Connected components
  • Subgraphs
  • Egocentric Network Analysis
  • Community Detection
  • Network weakness and destruction
  • Tons and tons of playing with interesting datasets
  • Graphs and machine learning

I am open to requests, but won't always say yes. If there's something that you'd like explored, participate on LinkedIn and let me know what you are curious to see.
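
As a small preview of a few items in the plan above (pre-made graphs, graph metrics, connected components, and egocentric analysis), here is a minimal sketch. It assumes the networkx library is installed and uses its built-in Zachary's Karate Club graph; the choice of degree centrality and of node 0 as the ego is arbitrary, purely for illustration.

```python
import networkx as nx

# A pre-made graph that ships with networkx: Zachary's Karate Club.
G = nx.karate_club_graph()
print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())

# Graph metric: degree centrality (fraction of other nodes each node touches).
centrality = nx.degree_centrality(G)
top_node = max(centrality, key=centrality.get)
print("most connected node:", top_node)

# Connected components: sets of nodes that can all reach each other.
components = list(nx.connected_components(G))
print("number of components:", len(components))

# Egocentric network: node 0 plus its direct neighbors.
ego = nx.ego_graph(G, 0)
print("ego network size:", ego.number_of_nodes())
```

Each of these one-liners hides a lot of detail, which is what the later days are for.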

What else will be covered?

If I'm teaching anything related to networks, Natural Language Processing will be included. NLP and Networks go together like <two things you like together>. Peanut butter and jelly, peanut butter and honey, steak and pepper, classical music and heavy metal.

I will also be discussing Network Science and Social Network Analysis, and differences between the two, and how they are used together.

I would also like to dabble with causal discovery, but am still exploring and learning. I will do my best to include it in this adventure. We have time.

My first use of Network Science was in mapping out dataflows and identifying critical points. I haven't discussed that much, and I'd like to show how I do this. I wrote very briefly about this in my book, but can spend a day or a few on it in this adventure.
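
To give a rough flavor of what "identifying critical points" in a dataflow can mean, here is a minimal sketch using only the standard library. It is one possible interpretation, not the approach from the book: a system counts as critical if removing it cuts some data source off from a destination it could previously reach. The dataflow, the system names, and the helper functions are all hypothetical.

```python
from collections import deque

# Hypothetical dataflow: each key feeds data to the systems in its list.
dataflow = {
    "crm_export": ["staging_db"],
    "web_logs": ["staging_db"],
    "staging_db": ["warehouse"],
    "warehouse": ["dashboard", "ml_features"],
    "dashboard": [],
    "ml_features": [],
}


def reachable(graph, start, removed=None):
    """Return every node reachable from start, ignoring the removed node."""
    if start == removed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def critical_points(graph, sources, sinks):
    """Find intermediate nodes whose removal cuts a source off from a sink."""
    critical = []
    for candidate in graph:
        if candidate in sources or candidate in sinks:
            continue
        for source in sources:
            before = reachable(graph, source) & sinks
            after = reachable(graph, source, removed=candidate) & sinks
            if before - after:  # some sink is no longer reachable
                critical.append(candidate)
                break
    return critical


critical = critical_points(
    dataflow,
    sources={"crm_export", "web_logs"},
    sinks={"dashboard", "ml_features"},
)
print(critical)
```

This brute-force check is fine for small dataflows. For larger undirected graphs, networkx offers `nx.articulation_points`, which finds nodes whose removal disconnects the graph.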

Finally, Machine Learning and Data Science are a given.

Final Thoughts

I think this is going to be so much fun. It's useful and therapeutic to have a creative outlet, and the structured approach will allow me to expand my knowledge of what I am interested in. I'm also excited to share this adventure with you, so that others can learn from this and get excited.

Network Analysis is a useful skill. If you have any data at all, building skill in Network Analysis will very likely give you new opportunities and perspectives in using that data.

I am going to disable comments on the blog, so that I do not have to spend my time moderating this. Please interact with me and the data science community on LinkedIn! Please join https://www.linkedin.com/feed/hashtag/?keywords=100daysofnetworks and learn with us! Feel free to use the #100daysofnetworks hashtag for your own adventure, and follow along!

Finally, if you like what I'm doing, please buy a copy of my book. You will learn from me during this adventure, but I also put a lot of time and energy into writing a book about it. And after reading, please leave a review on Amazon so that others will find my book! Thank you!

