Patented Acceleration Technology


What is GigaMACS™?

“GigaMACS™” stands for “Giga Multiply and Accumulates” (plural); it is a commercial-ready AI accelerator.

GigaMACS takes your TensorFlow or other convolutional or deep neural network (CNN or DNN) model, as-is, and uses our patented technology to compile a hyper-optimized bitstream for your FPGA or custom ASIC.

GigaMACS™ does not change, filter, or prune the model; the exact calculations are carried through, preserving full mathematical precision.

What does GigaMACS™ do?

GigaMACS automatically accelerates your model to achieve near-zero latency with no buffering, enabling it to handle full camera HD, 4K, or 8K input at line rate, in real time.

GigaMACS works with all convolutional neural network (CNN) models and delivers an FPGA- or ASIC-ready solution.


How fast can GigaMACS™ process an image?

GigaMACS will accept input pixels as fast as the camera delivers them. Without using RAM, GigaMACS can run models at 240 FPS in high definition and deliver outputs in real time.

The only speed limit for GigaMACS is your camera.
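For a rough sense of scale, here is a short calculation of the pixel rate those figures imply, assuming the 1920x1080 frame size quoted in the SqueezeNet comparison below:

```python
# Pixel throughput implied by the quoted figures: 240 FPS at
# high definition (1920x1080, per the SqueezeNet test below).
width, height, fps = 1920, 1080, 240

pixels_per_second = width * height * fps
print(f"{pixels_per_second / 1e6:.1f} Mpixels/s")  # ~497.7 Mpixels/s
```

In other words, a RAM-free pipeline making this claim must sustain roughly half a gigapixel per second at line rate.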

How well is the current technology performing?

Other solutions cannot process full-frame, high-resolution images without dropping 85% to 90% of the frames.

While GigaMACS™ processes high-definition images at 240 FPS with less than 1 millisecond of latency (GigaMACS™ demo), the NVIDIA A100 can only process 28 FPS with 41 milliseconds of latency; this means the A100 loses 88% of the data, dropping 212 frames every second (A100 demo).
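Those loss figures follow directly from the two frame rates; a quick check of the arithmetic:

```python
# Frame loss implied by the quoted figures: a 240 FPS camera feed
# that the A100 processes at only 28 FPS.
camera_fps, a100_fps = 240, 28

dropped = camera_fps - a100_fps     # 212 frames dropped every second
loss = dropped / camera_fps         # ~0.883, i.e. ~88% of the data lost
print(f"{dropped} frames/s dropped ({loss:.0%} of the data)")
```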


How does GigaMACS™ compare to GPUs?

A test of NVIDIA’s Tesla V100 on AWS running SqueezeNet reached 25 FPS with high-definition frames (1920x1080x3). GigaMACS automatically optimized SqueezeNet in an FPGA hardware test and achieved the full input rate of 80 FPS with the same high-definition frames (1920x1080x3).

The NVIDIA hardware clock runs at 10x the FPGA’s speed, yet GigaMACS still outperforms the V100 several times over. NVIDIA’s latency was also nearly 60 milliseconds, compared to less than 1 millisecond for GigaMACS.
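Taking the quoted numbers at face value, a short calculation shows the advantage per clock cycle:

```python
# Throughput comparison from the quoted SqueezeNet test.
v100_fps, gigamacs_fps = 25, 80
clock_ratio = 10                    # the V100 clock is quoted at 10x the FPGA's

speedup = gigamacs_fps / v100_fps   # 3.2x raw throughput
per_clock = speedup * clock_ratio   # ~32x more work done per clock cycle
print(f"{speedup:.1f}x throughput, ~{per_clock:.0f}x per clock cycle")
```

Clock-for-clock, the pipelined FPGA design does roughly 32 times the work of the GPU on this test.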

GigaMACS will accept high-definition input pixels as fast as the camera can deliver them and produce outputs with near-zero latency.

What is the answer to accelerating CNN and DNN Models?

Adding more memory and faster clocks to GPUs and TPUs is a dead-end remedy for accelerating neural network models.

GigaMACS™ implements every node in the model as a synchronous pipeline stage and dedicates a mass multiplier to each input channel, so all nodes run simultaneously. GigaMACS™ does not use RAM to process input pixels, eliminating memory bottlenecks. Outputs emerge while inputs are still streaming in, which allows GigaMACS™ to achieve near-zero latency.
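As a conceptual illustration only (a minimal Python sketch with hypothetical names, not the actual hardware design), chained streaming stages show why outputs can begin emerging before the input finishes and why no frame buffer is needed:

```python
# Conceptual sketch only: each "node" is a pipeline stage that consumes
# one sample per step and emits one per step, so every stage is active
# at once and no frame buffer is needed between them.
def mac_stage(stream, weight, bias=0):
    """A toy node: one dedicated multiply-accumulate per input sample."""
    for x in stream:
        yield x * weight + bias

def pipeline(stream, weights):
    """Chain stages; outputs begin emerging while input is still arriving."""
    for w in weights:
        stream = mac_stage(stream, w)
    return stream

# Pixels flow straight through; nothing is stored in RAM between stages.
pixels = iter(range(8))                    # stand-in for a camera scanline
for y in pipeline(pixels, weights=[2, 3]): # two chained MAC nodes
    print(y)                               # results stream out immediately
```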

As model complexity grows, GigaMACS™ scales with it but never slows down.


Gigantor transforms your machine learning model into a parallel pipeline that performs as fast as the input.