Hacker News
Automata Memory Processor Points to Future Systems (nextplatform.com)
40 points by Katydid on Dec 8, 2015 | hide | past | favorite | 11 comments


What goes around .....

Memory that executes operations is an explored space. It hit a dead end fairly quickly, but that's no reason to dismiss it.

https://en.wikipedia.org/wiki/Content_Addressable_Parallel_P...

The CAPP executes a stream of instructions that address memory based on the content (stored values) of the memory cells. As a parallel processor, it acts on all of the cells containing that content at once. The content of all matching cells can be changed simultaneously.
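A toy sketch of that search-then-write cycle, with a NumPy array standing in for the associative memory (the variable names here are mine, not from any CAPP documentation):

```python
import numpy as np

# Hypothetical CAPP-style memory: a flat array of word-sized cells.
memory = np.array([7, 3, 7, 1, 7, 3], dtype=np.int64)

# Search phase: every cell compares itself to the key in parallel,
# producing a per-cell match flag (the tag bits of a real CAPP).
tags = (memory == 7)

# Write phase: all tagged cells take the new value simultaneously.
memory[tags] = 9

print(memory.tolist())  # [9, 3, 9, 1, 9, 3]
```

In software this is one vectorized masked store; in a CAPP the comparison logic lives in every memory cell, so the whole array answers in one cycle regardless of size.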

One of the early Air Traffic Control computers ran such an architecture in 1972:

"STARAN might be the first commercially available computer designed around an associative memory. The STARAN computer was designed and built by Goodyear Aerospace Corporation."

https://en.wikipedia.org/wiki/STARAN

A paper about it:

http://www.cs.kent.edu/~parallel/papers/p405-batcher.pdf

There's also an interesting book, which I have read, on the subject:

Content Addressable Parallel Processors

Author: Caxton C. Foster

John Wiley & Sons, Inc. New York, NY, USA ©1976

ISBN: 0442224338


> “When I was at Intel, and especially as we were looking at exascale computing as a set of problems, the focus was at first, how to get memory closer to the processor. Now it’s shifted to how to get the processor closer to memory.” With Automata, however, he says Micron is bringing those lessons together but then asking, “What about the role of memory for doing some of the processing”? And herein lies the key. The memory becomes the compute.

This reminds me of all the memristor hype from a few years ago. Is there any overlap between the subjects?


David Patterson (famous for RISK, RAID and computer clusters) tried to make IRAM (RAM chips with on-chip processors) happen 15 years ago. I think the idea is sound. It's just a question of when.

http://iram.cs.berkeley.edu/


See the Venray TOMI processors, which do exactly that. The result is incredible performance at a manyfold lower cost in silicon.


That is correct. Also, even before that, people were putting blit operations, as well as GC ops, into DRAM.


RISC not RISK


At some point your operations have to get atomistic. A mult requires O(n^2) transistors for n bits, so having each machine word carry its own mult unit seems silly. I suppose one could have a single bit with an xor (add) and an and (carry/mult)... But isn't that just an FPGA?


Sorry to be pedantic, but multiplication requires O(n ln(n)), not O(n^2) [1].

[1] https://en.wikipedia.org/wiki/Multiplication_algorithm#Fouri...
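For intuition, here's a toy sketch of the FFT approach from [1]: treat the digits as polynomial coefficients, pointwise-multiply their FFTs, then propagate carries. This uses floating-point numpy.fft, so it only illustrates the asymptotics, not a production big-integer multiply (real implementations use number-theoretic transforms to avoid rounding error).

```python
import numpy as np

def fft_mult(a_digits, b_digits, base=10):
    """Multiply two little-endian digit lists via FFT convolution."""
    n = 1
    while n < len(a_digits) + len(b_digits):
        n *= 2  # pad to a power of two for the FFT
    fa = np.fft.rfft(a_digits, n)
    fb = np.fft.rfft(b_digits, n)
    # pointwise product in the frequency domain == convolution of digits
    conv = np.rint(np.fft.irfft(fa * fb, n)).astype(int)
    # carry propagation turns the raw convolution back into digits
    out, carry = [], 0
    for c in conv:
        carry, d = divmod(int(c) + carry, base)
        out.append(d)
    while carry:
        carry, d = divmod(carry, base)
        out.append(d)
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

# 12 * 34 = 408, digits little-endian
print(fft_mult([2, 1], [4, 3]))  # [8, 0, 4]
```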


Oh! Very cool, thanks! I knew about Toom-Cook, and based on what I do I figured it could be made faster by breaking the operands into larger power-of-2 "digits". Now I know for sure (without having to implement it myself).

Realistically though, you're not gonna implement an FFT mult even for a 64-bit integer/float.

Pedantry is not annoying if it's a learning opportunity (or if it's excellently funny), IMO.


Some more detail & discussion on the platform from a couple of years ago: https://news.ycombinator.com/item?id=7314867


I see its biggest application as an accelerator for any application that's basically pattern matching, especially mining text or network packets.
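The Automata processor is essentially a hardware NFA that advances every active state once per input byte. A rough software analogue of that per-byte parallelism (my own illustrative sketch, not Micron's API) is the bit-parallel Shift-And matcher, where each bit of a machine word tracks one NFA state and all of them update at once:

```python
def shift_and_search(pattern, text):
    """Bit-parallel Shift-And: bit i of `state` means 'pattern[:i+1]
    currently matches'; all bits update together on each input byte."""
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (len(pattern) - 1)
    state, hits = 0, []
    for pos, ch in enumerate(text):
        # advance every partial match by one position, in parallel
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - len(pattern) + 1)
    return hits

print(shift_and_search("aba", "ababa"))  # [0, 2]
```

The hardware version removes the word-size cap on pattern length, which is what makes it attractive for large regex sets over packet streams.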



