What is NeuReality?
NeuReality enables AI everywhere by offering a holistic solution for inference deployment that lowers the complexity, cost, and power consumption of AI inference.
NeuReality’s AI-Centric Approach
We’ve developed a revolutionary, AI-centric architecture based on the following elements:
- A new architecture designed to exploit the power of Deep Learning Accelerators (DLAs)
- The world’s first Network Addressable Processing Unit, or NAPU
- Hardware-enabled inference with AI-over-Fabric, an AI-hypervisor, and AI-pipeline offload
This illustration shows all of the functionality that is contained within the NAPU:
AI-centric vs CPU-centric
Traditional, generic, multi-purpose CPUs perform all their tasks in software; our purpose-built components perform those same tasks in hardware that was specifically designed for AI inference.
The following table compares these two approaches:
| | AI-centric NAPU | Traditional CPU-centric |
|---|---|---|
| Architecture approach | Purpose-built for inference workflows | Generic, multi-purpose chip |
| AI pipeline processing | Linear | “Star” model |
| Instruction processing | Hardware based | Software based |
| Management | AI process natively managed by cloud | AI process not managed |
| Pre/post processing | Performed in hardware | Performed in software by CPU |
| System view | Single-chip host | Partitioned (CPU, NIC, PCIe switch) |
| Scalability | Linear scalability | Diminishing returns |
| Total cost of ownership | Low | High |
| Latency | Low | High, due to over-partitioning |
In a traditional CPU-centric system, large portions of the workload still need to be run in software.
Only an AI-centric NAPU runs everything in dedicated, purpose-built hardware.
NR1 Network Addressable Processing Unit
The NeuReality NR1 is a network-attached inference server-on-a-chip with an embedded neural network engine. The NR1 is the world’s first Network Addressable Processing Unit (NAPU). As workflow-optimized hardware devices with specialized processing units, native networking, and virtualization capabilities, NAPUs are ideal building blocks for the heterogeneous data center of the future.
NR1-M Inference Module
The NeuReality NR1-M module is a full-height, double-wide PCIe card containing one NR1 Network Addressable Processing Unit (NAPU) system-on-chip. It functions as a network-attached inference server and can connect to an external Deep Learning Accelerator (DLA).
NR1-S Inference Server
The world’s first AI-centric server, NeuReality’s NR1-S is an optimized inference server design containing NR1-M modules with the NR1 NAPU, enabling truly disaggregated AI service in a scalable, efficient architecture. The system improves cost and power efficiency by up to 50X and doesn’t require IT involvement to deploy for end users.
NeuReality also pairs its AI-centric hardware with automated software tools that don’t require developers to know how things are done at the hardware level.
Software Development Kit
The NeuReality software development kit (SDK) provides a toolchain, a runtime, and a simple UX for data scientists and DevOps engineers, removing barriers to deploying AI. The SDK consists of both cloud and AI-server components. The NR Manager Deployment Tools help with runtime, orchestration, full Kubernetes management, and MLOps integration. The NR Monitor Offline Tools allow offloading of the AI pipeline and compute-graph compilation.
NeuReality’s APIs provide a simplified UX across development and deployment cycles. The inference APIs hide development complexity and simplify the life cycle of AI applications by exposing functionality through a dedicated API per user role.
| NR Toolchain API | NR Provisioner API | NR Client API |
|---|---|---|
| For analyzing, training, and validating data, as well as optimizing, compiling, and emulating | For deploying and monitoring | For connecting, registering, and serving |
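As a rough illustration of the “connect, register, serve” flow the NR Client API describes, here is a minimal Python sketch. All names in it (`NRClient`, `register_model`, `infer`, the endpoint address) are hypothetical stand-ins for illustration only, not the actual NeuReality SDK API.

```python
# Hypothetical sketch of a client-side inference flow, loosely mirroring the
# NR Client API's connect/register/serve description. NRClient and its
# methods are illustrative assumptions, not the real NeuReality SDK.

class NRClient:
    """Toy stand-in for a client of a network-attached inference server."""

    def __init__(self, address):
        self.address = address   # e.g. the endpoint of an NR1-S server
        self.models = {}         # registered model name -> callable

    def register_model(self, name, fn):
        # In a real deployment, registration would upload a compiled
        # compute graph; here we just store a Python callable.
        self.models[name] = fn
        return name

    def infer(self, name, payload):
        # A real client would send the payload over the network and
        # receive the post-processed result; here we call locally.
        return self.models[name](payload)


# Usage: register a trivial "model" and run one inference request.
client = NRClient("nr1s.example.local:9000")   # hypothetical endpoint
client.register_model("toy-classifier", lambda x: "cat" if x > 0.5 else "dog")
print(client.infer("toy-classifier", 0.9))     # -> cat
```

The point of the sketch is the division of labor the table implies: the toolchain prepares and compiles the model, the provisioner deploys it, and the client only needs to connect, register, and send requests.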