Introduction and call for topics

jonmasters | May 3, 2021 | Uncategorized

This blog is a personal pet project of mine. I live and breathe computer architecture, semiconductor technology, and everything in between. I was originally a computer scientist, so I am self-taught when it comes to computer architecture, digital design, and other parts of the “full stack”. I don’t believe in arbitrary barriers between these disciplines. In fact, I believe that hardware and software are better when designed together, and that we as engineers are better when we see ourselves as all being on the same team.

My aim over the coming months is to cover a mixture of topics. Some will be “introductory” (but arbitrarily detailed). For example, I will explain exactly how a program that has been compiled or assembled into machine code is executed in a modern Out-of-Order processor, from before fetch (prefetch, predecode, etc.) through to dispatch and backend dataflow execution. I will explain how speculation works and what kinds of predictors and state are kept. I will walk through what a cache coherency protocol is, how it works, and how it differs from memory consistency. I will explain how modern “heterogeneous” and “composable” systems work. I will attempt to provide useful examples and references to gem5 models and the like.

Some posts will dive into specific topics that are more advanced in nature. For example, I have recently been consumed by memory barriers and how one might speculate right through them using concepts familiar to those working on transactional memory. I might write this up at a high level, and give a related example of how TSO is implemented on x86.

My writing style will be technical in nature, but I intend for posts to be broadly accessible to those who work in the technology field (but not necessarily in microprocessor design). I will not cover anything proprietary or confidential to any one vendor, only general concepts.

Please reach out to me or comment below with ideas for topics you would like covered.

19 thoughts on “Introduction and call for topics”

Indraneil says:

May 3, 2021 at 1:12 am

I always love a good introduction to OOO concepts & Tomasulo’s algorithm!

Bhupesh says:

May 3, 2021 at 2:28 am

I would like to subscribe

Rakshith Ravishankar says:

May 3, 2021 at 3:15 am

>In-memory processing
>Row Hammer in DRAM
>Out-of-order processing concepts
>Advanced branch prediction

Jack Harvard says:

May 3, 2021 at 6:08 am

Maybe “why software engineers should care about computer architecture, a personal journey”?

qlixed (aka ebrizuel) says:

May 3, 2021 at 10:43 am

As a self-taught student, maybe a good way to start is to point out the books and papers about comparch that you found really valuable and informative.

bibrak says:

May 3, 2021 at 2:41 pm

Non-von Neumann architectures

Jordi Gonzalez says:

May 3, 2021 at 2:51 pm

Review of notable Apple patents would be interesting!

SM says:

May 3, 2021 at 4:30 pm

Memory management in general, and virtual addressing in particular (especially from the software perspective)

Gisa says:

May 3, 2021 at 6:00 pm

Domain isolation using ASIDs in ARMv8

RedwoodCoast says:

May 3, 2021 at 7:22 pm

SIMD, MMX, SSE, SSE2, SSE3, SSSE3, GPUs
REGISTERED CONCEPTS VS STREAMED

randomness2014randomness says:

May 3, 2021 at 7:23 pm

SIMD, MMX, SSE, SSE2, SSE3, SSSE3, GPUs
REGISTERED CONCEPTS VS STREAMED

Scott Hamilton says:

May 3, 2021 at 7:47 pm

How about hardware interrupts on NUMA systems? Extremely large systems with a thousand cores struggle with balancing the interrupts coming from hardware. It would be interesting to get a hardware engineer’s take on the issue; coming from the software side, I see the struggles.

Ash says:

May 3, 2021 at 9:02 pm

Differences between a mobile CPU and a conventional one. How does the computer architecture handle power consumption for a mobile processor?

ObscureBug says:

May 3, 2021 at 9:40 pm

Some topic suggestions to add:

– How locks like mutexes are implemented, hence memory ordering
– Different types of caching (e.g. write through, write back, etc.)
– Different types of caches (instruction, data, TLB, TSB, MMUs, etc.)
– How a page fault and hence traps work at the lower levels
– How a context switch works at the lower levels
– How a BMC works and interacts with a system
– Different types of buses and mechanisms for hardware buses (e.g. IPMB, SMBus, LPC Bus, I2C, JTAG, etc.)
– Power control and power states
– Microcode, how it is implemented and how it is used
– Multiprocessor vs multicore, chiplets, multiprocessing, threading, hyperthreading, etc.
– Impacts of word size, cache line lengths, page sizes
– TPMs and the entire secure boot / firmware area
– Virtualisation implementation at the hardware level
– How I/O works, particularly configuring and talking to HBAs, etc.

jh says:

May 3, 2021 at 9:45 pm

Would love to read about & better understand how microcoded CPUs work & are implemented, with relevant examples.

RK says:

May 3, 2021 at 10:59 pm

Exception Handling
Memory Controller
I/O Controller Architecture
Network Processors

Vad says:

May 4, 2021 at 11:43 am

Hi, I would like a post about how SW runs on an NPU, and the main conceptual differences from a regular Cortex CPU.
Thanks Jon!

Bart Smaalders says:

May 4, 2021 at 2:22 pm

A favorite quip is that the hardest software to change is the stuff in people’s heads. What are the trade-offs of new chip features that require people to learn new things (multi-threading, to a lesser extent store buffers, transactional memory, persistent memory) vs features that require little beyond a recompilation?

Tomáš Pospíšek says:

May 4, 2021 at 3:14 pm

tl;dr: if we were able to start both hardware and software from scratch, what could a vastly improved CPU/SW architecture look like? In case such a paradigm switch is utopically imaginable, could it become reality? How?

I teach an OS course. Every year I learn a bit more that makes me go “whoa, only now am I realizing how many resources are wasted on this particular mechanism here”. I have a vague understanding of it all and how it came to be. I’d sum it up as: first there was hardware. At some point Von Neumann style CPU architectures became the mainstream. At some point C tried to be a portable assembler. The Bell Labs gang hacked together a multitasking OS. And that’s the end of it. From then on everything was in C, and the hardware tried to support the C model of how stuff is supposed to work (a heap, friggin pointer arithmetic, a stack) and the OS concept of a syscall/interrupt mechanism and virtual memory (I’m of course overnarrowing the perspective here). It seems like a stinking pile of history caught in an eternal maelstrom around itself, never again able to break free from its historically taken architecture decisions.

So suppose the whole industry/ecosystem could start from scratch, do away with historic decisions, and build everything based on the needs of modern high-level languages, language concepts, and software system architectures. What would such a new CPU look like? Is there any hope we can break free from today’s huge pile of historical liabilities/technical debt and move to something very different? Via which niche could that be achieved?

Maybe I’m just naive, and my perspective that today’s tech is just a huge pile of historical liabilities is rooted in me not having a deep enough understanding of the whole tech stack and of the reasons why there’s no other way to solve the given fundamental problems. In that case this whole request is moot. But is it? So that’s my topic proposal.
