I-Should-Know-This-But-I-Don't

In Big Data & 5Vs - Part 2, there is a section where I try explaining what Apache Kafka is, and there I mention how I should start a Things-I-should-know-but-I-don’t series.

Well, here I am, starting it.

Studying for AWS certifications meant having to learn about AWS services, and I remember being super puzzled when I read descriptions along the lines of:

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that enables you to build and run applications that use Apache Kafka to process streaming data. [1]

Now there’s nothing wrong with how AWS describes their MSK service. The issue was that I had absolutely no idea what Apache Kafka was, so I did the next reasonable thing: I asked Perplexity.

I asked what Apache Kafka is on Perplexity

Now I was really confused. What even is an “event streaming platform”? What does it mean for that to be “distributed”? What does it mean for systems to “handle” real-time data? Given the timeframe I had to study for my past 4 certs, I brute-force memorized the descriptions. Don’t get me wrong, I was able to answer exam questions just fine.

But something felt off, like I wasn’t internalizing anything.

So that’s the whole premise of this new series, one that I decided to call I-should-know-this-but-I-don’t. This series probably isn’t for everyone, since I’m going to get super nit-picky and question everything about how terms are used to describe and explain concepts. However, it is for those of you who are a bit like me, who can’t stand memorizing things word-for-word without having the “gut feeling” that you actually understand them.

Hope this helps,
Ael

  1. source: AWS Docs on MSK 
