Data Direct Networks Japan, Inc. (DDN Japan), which sells storage products for HPC (supercomputers and similar systems), announced on September 9 that it has begun selling its new data platform "DDN Infinia," designed for AI infrastructure, in Japan, and that it is offering a "DDN AI Application Development Support Program."
Infinia is a type of high-speed object storage (DDN calls it an "object platform"). Its features include linear scaling of performance and capacity as nodes are added, efficiency proven in production environments with more than 100,000 GPUs, and containerized support for Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). It provides S3 data services while avoiding the bottlenecks of traditional storage interfaces such as POSIX and S3, and it achieves very high throughput with low latency through direct-to-GPU RDMA zero-copy I/O, with no restrictions on object size or metadata. A "DDN Infinia SDK" is also available.

The product will be offered as an appliance, sold through partners, and is scheduled to begin shipping in November. The DDN Japan Infinia page currently features the "AI2000," a rack-mounted appliance equipped with Infinia.

The DDN AI Application Development Support Program, meanwhile, helps AI application developers make use of Infinia. It includes free provision of Infinia and the SDK, free technical support for development and verification, and joint verification at the DDN Japan Verification Research Center.

On September 9, DDN Japan held a media roundtable to discuss its strategy and initiatives in the AI field, including Infinia. DDN's storage products are widely used in HPC, including on Fugaku, but as a vendor with a strong presence in that specialized field, the company rarely holds press conferences. In recent years, however, DDN storage has been increasingly adopted for generative AI infrastructure (so-called AI supercomputers), which has characteristics similar to HPC, and the roundtable was held to raise awareness of DDN's track record and activities in this area.
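Since Infinia exposes S3-compatible data services, a standard S3 client should in principle be able to talk to it. The Python sketch below uses boto3 against a hypothetical Infinia endpoint, bucket, and credentials (none of which come from the announcement); it is a minimal illustration of S3-compatible access under those assumptions, not DDN-verified sample code.

```python
import boto3

# Hypothetical endpoint and credentials for an Infinia S3 data service;
# the real endpoint, bucket names, and credential scheme depend on the deployment.
s3 = boto3.client(
    "s3",
    endpoint_url="https://infinia.example.local:9000",  # assumed endpoint
    aws_access_key_id="EXAMPLE_KEY",
    aws_secret_access_key="EXAMPLE_SECRET",
)

# Write and read back a training artifact through the S3-compatible interface.
s3.put_object(Bucket="ai-datasets", Key="checkpoints/step-1000.pt", Body=b"...")
obj = s3.get_object(Bucket="ai-datasets", Key="checkpoints/step-1000.pt")
print(len(obj["Body"].read()), "bytes read")
```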
Doubling investment in Japan over three years to support storage for AI application development

First, Robert Trindl, President and CEO of DDN Japan (and Senior Vice President of Global Sales at DataDirect Networks), explained DDN's strategy and outlook. Of DDN's recent history, Trindl said, "Technology that was once a niche HPC field is suddenly becoming the foundation of general data centers." Generative AI now requires massive amounts of computation. "The ultimate resource is not computation, but data. How data is managed, stored, and distributed affects overall efficiency," Trindl said.

Until now, such computation has been used primarily to train generative AI models, with little regard for efficiency, Trindl said. However, as more applications use generative AI models for inference, efficiency in areas such as data access speed and power consumption will become an issue. "Our new product (Infinia) focuses on that, creating a new category of product. It can also support scalable applications," Trindl said.

Unlike typical foreign vendors, DDN has long worked with major vendors in Japan. Going forward, however, as it provides products for AI applications, it will need to move beyond simply selling infrastructure: it will need to work closely with AI application developers, run a cloud-based demonstration program that combines SaaS applications with AI, and partner with Taiwanese hardware ODM vendors.

Specific measures in Japan include, first, increasing investment in generative AI, doubling it over the next three years. The company will also establish a system to support storage in AI application development, launch a cloud-based demonstration program based on actual applications in 2026, and strengthen collaboration with hardware partners. "This will enable us to achieve the same success in generative AI infrastructure as we have achieved in HPC," Trindl concluded.
Two storage platforms for AI infrastructure: the HPC-derived "DDN EXAScaler" and the new "DDN Infinia"

Shuichi Ihara, principal engineer at DDN Japan, explained DDN's approach to the generative AI market. First, Ihara introduced two storage platforms for AI infrastructure: "DDN EXAScaler," a highly efficient and scalable parallel file system developed for HPC, and the new "DDN Infinia."

He began with AI training. According to Ihara, HPC and AI share many common elements, such as simultaneous access from large numbers of nodes and low-latency, high-bandwidth networks. In AI training, storage is also used to periodically save memory snapshots (checkpoints) so that a job spanning days or even months can be recovered if it is interrupted. As data sizes grow, the amount of data written becomes enormous, requiring high-speed writes. In this field, DDN products are used in NVIDIA's AI supercomputers, as well as in AI supercomputers at SoftBank, the Joint Center for Advanced High-Performance Computing (JCAHPC), and Sakura Internet.

Next, he discussed inference. In LLM inference, text is broken down into tokens, which are processed to predict the output. Storing (caching) intermediate calculation results and reusing them shortens the time it takes to return a response. This cache is organized hierarchically across GPU memory, host memory, local disk, and external storage, and the open-source software "LMCache" automatically organizes and manages this hierarchy. DDN is currently testing DDN Infinia as the external storage tier to improve performance. Ihara added that DDN contributes to the LMCache community by providing performance evaluations based on the number of simultaneous users, along with evaluation sets that put pressure on the GPU, memory bandwidth, or memory capacity.
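The tiering Ihara described, with GPU memory, host memory, local disk, and external storage, can be illustrated as a simple lookup pattern that checks fast tiers first and falls back to slower ones. The Python sketch below is a conceptual illustration only; it is not LMCache's actual implementation or API, and the class and tier names are hypothetical.

```python
from collections import OrderedDict

class TieredKVCache:
    """Conceptual sketch of a hierarchical KV cache: small, fast tiers are
    checked first; misses fall through to slower, larger tiers such as
    local disk or external storage (e.g., an object platform)."""

    def __init__(self, tiers):
        # tiers: list of (name, dict-like store), ordered fastest to slowest
        self.tiers = tiers

    def get(self, key):
        for i, (name, store) in enumerate(self.tiers):
            if key in store:
                value = store[key]
                # Promote the entry into all faster tiers for future hits.
                for _, faster in self.tiers[:i]:
                    faster[key] = value
                return value, name
        return None, None

    def put(self, key, value):
        # Write-through to every tier; a real system would apply per-tier
        # capacity limits and eviction policies.
        for _, store in self.tiers:
            store[key] = value

# Usage: four toy tiers standing in for GPU memory, host memory,
# local disk, and external storage.
cache = TieredKVCache([
    ("gpu", OrderedDict()),
    ("host", OrderedDict()),
    ("disk", OrderedDict()),
    ("external", OrderedDict()),
])
cache.put("prompt-prefix-123", b"cached attention KV blocks")
value, hit_tier = cache.get("prompt-prefix-123")
print(hit_tier)  # -> "gpu"
```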
Explaining the features of Infinia

DDN CTO Sven Ohme also spoke about DDN Infinia, focusing on the SDK. First, Ohme compared traditional POSIX (file) and S3 (object) storage with storage for AI workloads. While traditional storage simply reads and writes data, AI workloads require intelligent processing; and while traditional storage locates targets by file or object name, AI workloads also require vector and graph search capabilities, he explained. The Infinia SDK was developed to address requirements that traditional storage cannot meet, Ohme said. It offers high-speed data processing and the ability to act on events occurring in data within the system, while achieving high scalability through technology cultivated in HPC.
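As a rough illustration of the kind of event-driven processing Ohme described, reacting to data as it arrives and labeling it so it can later be found by metadata or vector search, the Python sketch below shows the general pattern. All names are hypothetical; this does not use the actual Infinia SDK, whose interfaces are not detailed in the presentation.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectEvent:
    key: str    # object name
    data: bytes # object payload

@dataclass
class CatalogEntry:
    key: str
    labels: dict = field(default_factory=dict)
    embedding: list = field(default_factory=list)

def toy_embed(data: bytes) -> list:
    # Stand-in for a real embedding model; returns a tiny numeric vector.
    return [len(data) % 7, len(data) % 13]

catalog: dict[str, CatalogEntry] = {}

def on_object_written(event: ObjectEvent) -> None:
    # React to a write event: label the object and compute an embedding
    # so it can later be retrieved by metadata or vector search.
    entry = CatalogEntry(key=event.key)
    entry.labels["size"] = str(len(event.data))
    entry.embedding = toy_embed(event.data)
    catalog[event.key] = entry

# Usage: simulate an incoming write event.
on_object_written(ObjectEvent(key="frames/cam0/0001.jpg", data=b"\x00" * 4096))
print(catalog["frames/cam0/0001.jpg"].labels)
```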
Furthermore, Infinia's distributed data capabilities allow data generated at the edge to be stored at the edge, while only the necessary data is transferred to the data center. Ohme also explained that Infinia can integrate various data sources, such as S3 and POSIX, synchronize their metadata, and feed it into the Infinia event engine.

As an example of Infinia's performance, Ohme presented the response time of a RAG pipeline running on AWS. He compared three implementations: AWS's native RAG, an implementation using AWS S3 Express, and a version in which AWS S3 Express was replaced with Infinia with minimal changes to the application, reporting a 22x increase in speed and a 50% reduction in cost.

On the key to Infinia's speed, Ohme pointed out that whereas typical storage such as S3 is built on multiple layers (block storage, RAID, file system, object storage), the Infinia SDK eliminates these intermediate layers and accesses the drives directly, treating them as a KV store (key-value store).

Finally, Ohme summarized the features of the Infinia SDK: it supports a variety of programming languages, including C++, Python, Go, Rust, and Java; it is software-based and portable; it eliminates the bottlenecks of the POSIX and S3 interfaces; and it performs metadata queries and labeling in parallel at high speed. He also mentioned that it provides acceleration at various levels, including GPUs and KV caches.
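The architectural point about treating drives as a key-value store, together with the parallel metadata queries mentioned in the SDK summary, can be sketched conceptually as follows. This is a plain Python illustration with hypothetical names; it is not the Infinia SDK and does not reflect its real interfaces.

```python
from concurrent.futures import ThreadPoolExecutor

class SimpleKVStore:
    """Conceptual key-value view of storage: objects are addressed directly
    by key, with no file-system, RAID, or block layers in between."""

    def __init__(self):
        self._objects = {}   # key -> bytes
        self._metadata = {}  # key -> dict of labels

    def put(self, key: str, value: bytes, labels: dict | None = None) -> None:
        self._objects[key] = value
        self._metadata[key] = labels or {}

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def query_labels(self, predicate) -> list[str]:
        # Evaluate the predicate over metadata entries in parallel, a toy
        # stand-in for the "parallel metadata queries" described for the SDK.
        keys = list(self._metadata)
        with ThreadPoolExecutor() as pool:
            hits = pool.map(lambda k: k if predicate(self._metadata[k]) else None, keys)
        return [k for k in hits if k is not None]

# Usage: store two objects with labels, then query by label.
store = SimpleKVStore()
store.put("doc/1", b"...", {"lang": "ja", "kind": "manual"})
store.put("doc/2", b"...", {"lang": "en", "kind": "faq"})
print(store.query_labels(lambda m: m.get("lang") == "ja"))  # -> ['doc/1']
```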
Source: Yahoo