Thursday 30 May 2024

IT Operations Needs a New Wardrobe

Like sandals with white socks and jackets with shoulder pads, IT Operations is woefully outdated and desperately needs a wardrobe change. Fashionable IT tools come and go, but one way or another, IT Operations always ends up where it started: siloed, reactive, manual and slow to find root cause of business-impacting incidents.

Fortunately, we are in the midst of an accelerating software revolution to address this: the AIOps (artificial intelligence for IT operations) revolution. Unfortunately, AIOps today is beset with vendors dressing up older-generation software and making unsubstantiated claims about their AI credentials. Dell Technologies, however, has a heritage of building AI and automation into its infrastructure products. Through home-grown innovation and partnerships, Dell is extending that track record with the introduction of APEX AIOps—a unique, deeply AI-enabled, full-stack observability and incident management software-as-a-service that is:

  • Exceptionally tailored to support modern AI workloads and cloud-to-ground/ground-to-cloud service reliability engineering operations.
  • Backed by industry-leading outcomes, generating meaningful improvements in customer satisfaction while revolutionizing the cost of operations.

Without AIOps, Business Isn’t Dressed for Success


Today’s digital infrastructure is so complex and operates at such high scale that managing it is beyond human comprehension without the aid of machine intelligence. Add in AI workloads, and the problem only gets worse. Old tools just don’t cut it. With the continuing shift to multicloud and the extreme demands that AI workloads place on infrastructure, the truly innovative AIOps solutions will be the trend-setters. The only way to ensure the integrity of modern GenAI-aware digital infrastructure is to automate infrastructure operations itself, which in turn is only possible with AI at the core of that activity.

APEX AIOps Software-as-a-Service is Well-Suited for Modern Business


Dell has cleverly integrated AIOps technologies from its own software portfolio (CloudIQ), from its AI partner IBM (Instana) and from its acquisition of Moogsoft, the original AIOps pioneer, in which Dell was an early investor.

Altogether, APEX AIOps’ integrated capabilities are:

  • Infrastructure Observability. AI-driven observability for Dell infrastructure health, cybersecurity and sustainability (based on Dell’s CloudIQ software enhanced with generative AI).
  • Application Observability. AI-driven full-stack observability (based on the integration of Dell’s Infrastructure Observability software and IBM’s Instana Application Observability software) for assuring application reliability.
  • Incident Management. AI-driven incident management to detect incidents and then simplify, automate and accelerate resolution across multivendor/cloud infrastructure (based on Dell’s acquisition of Moogsoft software).

“Dell APEX AIOps Incident Management is a core part of Cognizant’s Neuro IT Operations, which provides organizations with an AI-driven platform designed to reduce the complexity and operating costs of enterprise infrastructure and technology,” said Prasad Sankaran, EVP and Global Head, Cognizant Software and Platform Engineering. “Neuro IT Operations helps IT organizations focus less on infrastructure management and more on business impact through end-to-end AIOps for noise reduction, correlation and automated workflows for faster incident resolution.”

Infrastructure Observability: “Fashions fade, style is eternal,” Yves Saint Laurent


If anything is eternal, it’s the need for rock-solid IT infrastructure: servers, storage, data protection, networks and hyperconverged systems. Physical, virtual, on-premises, core, edge and cloud. With a portfolio of more than 100 patents, APEX AIOps Infrastructure Observability uses AI to help you maximize Dell infrastructure health, cybersecurity and sustainability. Its observability, notifications, recommendations and forecasting help you reduce risk, plan for the future and improve productivity. Infrastructure Observability is based on machine learning, time series correlation, seasonality (i.e., normal behavior over time), ensemble models, Prophet forecasting, reinforcement learning, natural language processing, generative AI and other AI algorithms.

Figure 1: APEX AIOps Infrastructure Observability process.

Based on customer surveys, APEX AIOps Infrastructure Observability delivers up to 10X faster resolution of Dell infrastructure issues and saves an average of one day per week in systems administration. If you are thinking, “But I have more than Dell infrastructure in my environment,” then read on.

Application Observability: “Clothes mean nothing until someone lives in them,” Marc Jacobs


Your business lives in your applications. So, you had better make sure they are making your customers, partners and employees look good and feel right. That means no tolerance for latency. APEX AIOps Application Observability uses AI to yield high-fidelity application call, error and latency metrics with one-second granularity and correlates them with Dell infrastructure health analytics. This helps you determine whether root cause lies in the Dell infrastructure (e.g., servers and storage) or in the application itself. These insights are driven by natural language processing for automation matching, seasonality machine learning algorithms for adaptive threshold smart alerts, forecasting algorithms, machine learning models for log parsing, generative AI and other AI algorithms. Application Observability also monitors non-Dell host servers.
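
To make the adaptive-threshold idea concrete, here is a minimal sketch in Python. It is illustrative only: the window size, the sigma multiplier and the check_latency() helper are our assumptions, not APEX AIOps internals.

```python
# Illustrative only: a toy adaptive-threshold alert on one-second latency
# samples, in the spirit of the "smart alerts" described above. The window
# size and sigma multiplier are arbitrary choices, not product internals.
from collections import deque
import statistics

WINDOW = 300        # five minutes of one-second samples
SIGMAS = 3.0        # how far above recent behavior counts as anomalous

history = deque(maxlen=WINDOW)

def check_latency(sample_ms: float) -> bool:
    """Return True if this one-second latency sample breaches the
    adaptive threshold learned from the recent window."""
    breach = False
    if len(history) >= 30:  # need a minimal baseline first
        mean = statistics.fmean(history)
        stdev = statistics.pstdev(history) or 1e-9
        breach = sample_ms > mean + SIGMAS * stdev
    history.append(sample_ms)
    return breach

# Example: a minute of steady ~50 ms traffic, then a spike.
for t, latency in enumerate([50.0] * 60 + [260.0]):
    if check_latency(latency):
        print(f"t={t}s: latency {latency} ms breached adaptive threshold")
```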

Figure 2: APEX AIOps Application Observability process.

This solution is an innovative integration of Dell’s APEX AIOps Infrastructure Observability and IBM’s Instana application observability software-as-a-service. Based on customer research, this technology yields up to a 70% reduction in mean time to resolution of application issues and a 3X increase in application deployments.

Incident Management: “Fashion is like eating, you shouldn’t stick to the same menu,” Kenzo Takada


The fact is, every digital infrastructure is a multivendor, multicloud mixed menu. You must have a way to manage it all as one entity; otherwise, chaos will prevail. With a portfolio of more than 50 patents, APEX AIOps Incident Management uses AI to automate the entire incident lifecycle for your multivendor/multicloud environment. It ingests operational data from all your multivendor monitoring tools (application, infrastructure, virtualization, database, etc.) to distill hundreds of thousands of events into thousands of alerts, and those into tens of actionable incidents. Next, it differentiates each incident’s earliest alert from symptomatic alerts to identify probable root cause, recommends specific manual remediation and can trigger automated remediation.
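
As a rough illustration of that event-to-incident funnel, here is a toy pipeline in Python. The field names and the correlation rule (grouping alerts that share a service) are assumptions for illustration, not Moogsoft’s actual logic.

```python
# Illustrative only: a minimal event -> alert -> incident pipeline in the
# spirit described above. Field names and the correlation rule (alerts
# sharing a service) are our assumptions, not the product's algorithms.
from collections import defaultdict

events = [
    {"ts": 100, "source": "db01",  "msg": "disk latency high", "service": "checkout"},
    {"ts": 101, "source": "db01",  "msg": "disk latency high", "service": "checkout"},  # duplicate
    {"ts": 105, "source": "app07", "msg": "timeouts to db01",  "service": "checkout"},
    {"ts": 109, "source": "lb02",  "msg": "5xx spike",         "service": "checkout"},
]

# 1. Deduplicate: identical (source, msg) pairs collapse into one alert.
alerts = {}
for e in events:
    key = (e["source"], e["msg"])
    alert = alerts.setdefault(key, {**e, "count": 0})
    alert["count"] += 1

# 2. Correlate: alerts for the same service become one incident.
incidents = defaultdict(list)
for a in alerts.values():
    incidents[a["service"]].append(a)

# 3. Probable root cause: the earliest alert in each incident.
for service, related in incidents.items():
    root = min(related, key=lambda a: a["ts"])
    print(f"incident[{service}]: {len(related)} alerts, "
          f"probable root cause: {root['source']} '{root['msg']}'")
```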

APEX AIOps Incident Management is based on supervised/unsupervised machine learning, natural language processing, deduplication/filtering, correlation, time series analysis, deep learning/neural networks, anomaly detection, causality and other AI algorithms. As the AIOps pioneer (formerly Moogsoft), the platform benefits from more than 50 original, patented inventions. Besides out-of-the-box integrations with many incoming third-party monitoring tools, it is integrated with popular ticketing, IT communications and automation tools to trigger outgoing workflows, and it has an easy-to-use, do-it-yourself integration toolkit.

Figure 3: APEX AIOps Incident Management process.

Customer research shows that APEX AIOps Incident Management yields 99%-plus reduction in event noise, 50%-plus reduction in service tickets and 93% reduction in customer-reported issues.

“Over the next year, ESG expects enterprises to become more sophisticated with AI as a lever for business process acceleration, revenue growth and innovation, but this risks introducing more complexity into their IT environments. With APEX AIOps, Dell delivers a powerful AI-driven observability and incident management solution aimed at the right problems – IT Operations complexity and efficiency.”

– Jon Brown, Senior Analyst – Enterprise Strategy Group

The Future of AIOps


Playing this all forward, the next few years will bring big challenges, and organizations caught with last-generation IT tools—or worse, AIOps solutions that are really last-gen in disguise—will suffer availability embarrassments. These embarrassments could be existential to brand reputation and potentially to the companies themselves. A period of rapid adoption of real AIOps solutions will ensue, and this will improve matters. Failure to adopt proven AIOps solutions, like APEX AIOps, could become a defining choice for businesses.

Source: dell.com

Wednesday 29 May 2024

Dell NativeEdge Accelerates AI Innovation at the Edge

In an era marked by the transformative influence of artificial intelligence, the strategic importance of the edge is more apparent than ever. Today, Dell Technologies is announcing collaborations with NVIDIA, ServiceNow and Microsoft that enhance how businesses harness AI at the edge. We’re also introducing a new release of Dell NativeEdge, our edge operations software platform, which offers an end-to-end, full-stack virtualized solution that streamlines the development, deployment and scaling of AI applications at the edge.

New Dell NativeEdge Collaborations to Simplify AI Application Deployment at the Edge


As part of the latest Dell AI Factory with NVIDIA announcements, we are expanding our strategic partnership with NVIDIA to enable and accelerate AI applications wherever customers need them. Dell NativeEdge is the first edge orchestration platform that automates the delivery of NVIDIA AI Enterprise, an end-to-end software platform that includes NVIDIA NIM and other microservices for the development and deployment of production-grade applications.

For NVIDIA AI Enterprise customers, this collaboration combines NativeEdge, Dell’s edge operations software platform, with NVIDIA AI tools and SDKs. From video analytics with NVIDIA Metropolis and speech and translation with NVIDIA Riva to optimized inferencing at the edge with NVIDIA NIM, we’ve got it covered with new deployment blueprints that automate the delivery of NVIDIA AI frameworks to edge devices and beyond.

This NativeEdge capability makes it easy for developers and IT operators to develop and deploy AI solutions automatically at the edge on NativeEdge Endpoints, powered by NVIDIA-accelerated computing. This innovation spans a wide range of use cases, such as visual analytics, industrial automation and personalized retail experiences—and we’re just getting started.

Dell NativeEdge and ServiceNow Create Industry’s First Closed-Loop Integration for the Edge


Building upon our partner ecosystem, Dell NativeEdge now integrates with ServiceNow’s Now Platform, simplifying AI application development and deployment at the edge. This integration enables businesses to efficiently extend IT operations from the core data center to the edge, offering an automated edge management solution that will span from Day 1 initial deployment to Day 2+ operations and beyond. With this closed-loop automation, our partnership will simplify the orchestration, management and workflow of edge computing resources, providing more efficient, agile and secure operations and service models for AI and other edge workloads across multiple industries.

Accelerating Edge Innovation with Azure Arc and Dell NativeEdge


The momentum at the edge is further amplified with our introduction of Azure Arc enablement automation by Dell NativeEdge. This new solution aims to improve the Azure customer experience and enhance security at the edge. The integration focuses on simplifying edge operations, advancing AI capabilities at the edge, and providing comprehensive protection with Zero Trust security principles. With the automation of Azure Arc enablement, NativeEdge enables customers to integrate Azure services into their environment effortlessly, bringing Azure’s cloud benefits such as automation and rapid deployment to the edge. Finally, NativeEdge optimizes the edge by creating curated deployment blueprints and using Azure Arc to simplify the customer experience – for example, the deployment of Azure IoT Operations on Kubernetes.

Continued Momentum in the Industry ISV Ecosystem


To drive better business outcomes and edge use cases, we have introduced six new NativeEdge solutions for ISVs serving manufacturing, retail and digital cities. In particular, a new Unified Operations Center with Aveva solution enhances data management and citizen services by providing a 360-degree operational view for city planning. It integrates various city systems using Aveva’s expertise, Astrikos AI and Dell NativeEdge, offering a secure, data-driven approach to optimize urban management and infrastructure.

Dell NativeEdge Enhancements Improve Edge Application Performance, Scalability and Security


We’re announcing a new release of Dell NativeEdge that includes application deployment on bare-metal containers delivering better performance, scalability and security. We are introducing REST APIs for Dell NativeEdge to integrate into DevOps workflows and offering new tools (e.g., Visual Studio Code plugin) to help customers develop application integrations. In response to customer demand, we are now offering both NativeEdge software and NativeEdge Endpoints as a single monthly subscription with Dell APEX, providing an OpEx model for consumption.
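
As a purely hypothetical sketch of what driving such a REST API from a DevOps pipeline might look like: Dell has not published these endpoint paths or payload fields here, so every URL, header and field below is an assumption for illustration only.

```python
# Purely hypothetical sketch of calling an edge orchestrator's REST API from
# a DevOps pipeline. The host, path and payload fields below are invented
# for illustration and are not documented NativeEdge endpoints.
import requests

BASE_URL = "https://nativeedge.example.com/api/v1"  # hypothetical host and path
API_TOKEN = "redacted"                              # fetched from a secrets store

def deploy_blueprint(blueprint_id: str, endpoint_id: str) -> str:
    """Ask the (hypothetical) orchestrator to roll a blueprint out to an endpoint."""
    resp = requests.post(
        f"{BASE_URL}/deployments",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"blueprint": blueprint_id, "endpoint": endpoint_id},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["deployment_id"]

# A CI job might call deploy_blueprint("video-analytics-v2", "store-042")
# after its tests pass, then poll a status endpoint until the rollout completes.
```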

Additionally, Dell has expanded edge infrastructure support with new NativeEdge Endpoints, including Dell Precision workstations and additional PowerEdge servers. A special highlight is the new PowerEdge T160 server: at just 17L, 42% smaller than its predecessor, it is ideal for small spaces such as retail stores.

Empowering Edge Transformation: Introducing New Edge Services


Finally, we are launching two new Edge Services—ProConsult Advisory Services for Edge and Infrastructure and Application Design Services for Edge. Dell Services experts will help assess your current state, build an edge strategy to reach your desired state and design an edge environment that maximizes efficiency, performance and ROI.

Source: dell.com

Tuesday 28 May 2024

Three Easy Steps to Delivering GenAI Use Cases Faster

If your organization is like many, you have identified numerous generative AI (GenAI) use cases that can positively impact your business. And you’re continuing to discover more.

You’d like to conduct proofs of concept (PoC) for as many use cases as possible. At the same time, you have a limited number of AI-capable developers on your team.

One thing is certain: you don’t want your critical developer resources spending time comparing technology stacks. Rather, their focus needs to be on getting business functionality out to users.

Now, as part of the Dell AI Factory, we’re making it easy to boost AI developer productivity in three easy steps. This approach combines retrieval augmented generation (RAG) on a powerful AI mobile workstation running developer workbench software, with new professional services to make it easy for you to take advantage of this innovative approach. In turn, you’ll deliver more solutions to your business users faster.

Step 1: Start with RAG to Easily Bring AI to Your Data


The first step is to use retrieval-augmented generation, or RAG, to “augment” the language model’s training knowledge by retrieving context-specific data from a vector database. The model combines the retrieved data with its training data to answer the prompt. As an example, a model has been trained on general customer support interactions, which is augmented with a vector database populated from past tickets and cases from a company’s own support history database.

The vector database used in a RAG model is easy to keep up to date with automated processes to regularly load new data, such as new pricing or technical improvements.
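
Here is a minimal RAG sketch in Python that shows both the retrieval step and the automated upsert path described above. The embed() and answer() helpers are toy stand-ins we invented for illustration; a real deployment would use a proper embedding model, a vector database and an actual LLM endpoint.

```python
# A minimal RAG sketch, assuming a toy in-memory vector store and placeholder
# embed()/answer() helpers; real deployments use a vector database and an LLM.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: hash words into a small dense vector."""
    v = np.zeros(64)
    for w in text.lower().split():
        v[hash(w) % 64] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

store: list[tuple[str, np.ndarray]] = []

def upsert(doc: str) -> None:
    """Automated loaders call this to keep the store current
    (new pricing, new support tickets, etc.)."""
    store.append((doc, embed(doc)))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank stored documents by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda d: -float(d[1] @ q))
    return [doc for doc, _ in ranked[:k]]

def answer(query: str) -> str:
    """Build the augmented prompt; a real system would send it to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

upsert("Ticket 1042: VPN drops fixed by updating the client to 5.2.")
upsert("Price list June: Model X now $1,499.")
print(answer("How was the VPN drop issue resolved?"))
```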

Developers have the skills to build RAG PoC projects, whereas data scientists must be added to the team to fine-tune a model. RAG also requires less GPU and CPU memory, storage and compute for inferencing than fine-tuned models.

An additional benefit is that a base model, such as one for a virtual professional assistant, can be applied to multiple use cases by using it with RAG and different vector databases.

Step 2: Add Precision AI-ready Workstations to Make AI Developers More Agile


IT leaders can equip developers with an environment that lets them work quickly with models, vector embedding and RAG. The environment needs to be fast, contain robust developer tools and protect corporate data from outside exposure.

Running GenAI workloads on a powerful workstation creates a dedicated AI developer environment that promotes efficient development, real-time creativity and improved user experiences. Developers have their own sandbox environment for GenAI experiments and PoC projects, with the ability to quickly test variations in model parameters.

Dell Precision AI-ready workstations have powerful, scalable CPUs and the latest professional NVIDIA RTX™ GPUs to meet the demands of GenAI development. They can simplify deployment and development of complex GenAI workloads out of the box, enabling developers and data scientists to customize and deploy LLMs.

When NVIDIA AI Workbench is added to a Precision workstation, developers get a rich set of tools for data science, machine learning and AI project development. AI Workbench streamlines access to popular repositories like Hugging Face and contains tools for RAG retrieval, model customization, inferencing, moving and scaling workloads, automating workflows and much more.

Step 3: Accelerate Your RAG Momentum with Dell Services


Using RAG with a compact language model and vector database simplifies generative AI development projects. Equipping developers with Dell Precision AI-ready workstations with NVIDIA AI Workbench further reduces complexity.

To accelerate your GenAI projects even more, Dell is introducing Accelerator Services for RAG on Precision AI-ready workstations. This service helps customers jumpstart their journey into GenAI. We provide a ready-to-use mobile lab as a convenient, cost-effective way for customers to explore use cases and improve skills in a low-risk environment. This mobile lab not only enables developers to experiment with and investigate GenAI, but also is an ultra-convenient way to demonstrate the effectiveness and outcomes of GenAI.

Expert consultants will set up a GenAI lab on a mobile Precision workstation and implement a RAG use case with your data. The service includes installation and configuration of NVIDIA AI Workbench. Dell transfers knowledge to your team throughout the process so that each developer is prepared to take on new projects.

Start Exploring Your Use Cases Today


According to IDC, two-thirds of businesses will leverage GenAI and RAG to power domain-specific, self-service knowledge discovery by 2025, improving decision efficiency by 50%.

Dell solution engineering and NVIDIA AI Workbench make it easy to develop GenAI solutions on Dell Precision workstations and then deploy them on a full range of Dell infrastructure—AI servers in the datacenter or at the edge, or in a private cloud.

Experiment with more of your backlog of use cases faster. You’ll deliver solutions to business users sooner, and your AI developers will swiftly ramp up their proficiency.

Source: dell.com

Saturday 25 May 2024

What’s on Tap for 5G at Dell?

If you’re wondering how 5G will impact the businesses of tomorrow, consider Exhibit ‘A.’ The Boston-area brewing company is quickly building up a loyal (and thirsty) following with craftily named beers like Hair Raiser and The Cat’s Meow. As any brewer knows, creating a great beer is a process. More than just choosing the right hops or creating a clever marketing strategy, brewers need a repeatable, measurable process from fermentation through production. It’s a hands-on business that requires diligence, because trouble in the production process is always brewing around the corner.

It was exactly this sort of trouble that led Exhibit ‘A’ Brewing Company to connect with Dell Technologies for a wireless solution. Specifically, Exhibit ‘A’ was having trouble with temperature control in several of its brewing vats. If you’re unfamiliar with beer brewing, it suffices to say that temperature and pressure control are critical to the fermentation process. A few degrees or PSI too high or too low can change a beer’s taste. That change is rarely for the better, often resulting in a lost batch of beer and, thus, thousands of dollars.

Crying Over Spilled Beer

To monitor beer production, Exhibit ‘A’ uses temperature and pressure sensors throughout its production floor. These sensors help the company control and maintain its entire production process, from fermentation to final canning/bottling. However, these physical sensors require a human to read them, which is problematic unless you plan to spend 24 hours on the brewing floor. Exhibit ‘A’ needed a way to “read” all these sensors and send out alerts when a temperature or pressure reading changed. And that’s where Dell Technologies enters the story.

The 5G Open Innovation Lab and Dell are dedicated to helping companies like Exhibit ‘A’ use technology to find innovative solutions to business problems. Also, there may have been the promise of free beer. Just kidding! We worked with Exhibit ‘A’ to build and implement a solution that uses 5G technology to securely collect readings from all the sensors and send alerts to the smartphone of Exhibit ‘A’ co-founder Matthew Steinberg when temperature and pressure readings go above or below the acceptable range. Now, Exhibit ‘A’ has the information it needs, in real time, to quickly address and correct problems on the production floor, minimizing waste and saving up to 3% of operating costs. The best part is that any manufacturer can do the same with our starter kit, a plug-and-play solution for monitoring liquid temperature and pressure over 5G.
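
The heart of that solution is a simple range check over incoming sensor readings. The sketch below is illustrative only: the thresholds are made up, and a print() stands in for the push alert to the brewer’s phone.

```python
# Illustrative only: the range-check-and-alert loop the story describes,
# with made-up thresholds and print() standing in for the SMS/push alert.
READINGS = [  # (sensor_id, kind, value) as they arrive over the 5G link
    ("vat-3", "temp_c", 19.8),
    ("vat-3", "temp_c", 23.1),   # out of range
    ("vat-7", "psi",    14.9),
]

LIMITS = {"temp_c": (18.0, 22.0), "psi": (12.0, 16.0)}  # assumed ranges

def check(sensor_id: str, kind: str, value: float) -> None:
    low, high = LIMITS[kind]
    if not (low <= value <= high):
        # In production this would page the brewer's smartphone.
        print(f"ALERT {sensor_id}: {kind}={value} outside [{low}, {high}]")

for sensor_id, kind, value in READINGS:
    check(sensor_id, kind, value)
```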

After implementing the fermentation tank edge and 5G solution, work began on bringing computer vision and AI to the canning line. Operating the canning line reliably had been an uphill task, with a lot of beer lost along the way. The canning line processes 72 cans per minute, and if just one goes down, the operator must halt production, work backwards, and locate, identify and fix the problem. Halting production not only loses momentum but often means the loss of dozens to hundreds of cans and the cleanup of a significant mess.

By working with the canning line operators, Dell identified four locations in the line that needed camera or sensor monitoring to detect can issues (dents, label misalignment, fallen cans, etc.).

Dell used AI and computer vision to identify and remedy problems proactively. We created a solution that monitors the canning line via sensors and cameras using Dell’s PowerEdge XR4000 paired with an NVIDIA L4 GPU and computer vision software from Telit Cinterion. The AI on the canning line saves 25% of labor costs during a manufacturing run and frees up staff to brew more beer.
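
Conceptually, the canning line logic is a halt-on-defect control loop. In this sketch the looks_defective() heuristic is a trivial placeholder we made up; the real system runs a trained computer vision model on the GPU.

```python
# Illustrative only: a halt-on-defect control loop, with a trivial
# brightness heuristic standing in for the real computer vision model.
import numpy as np

def looks_defective(frame: np.ndarray) -> bool:
    """Placeholder detector: a real system would run GPU inference here
    (dent / label-misalignment / fallen-can classes)."""
    return float(frame.mean()) < 40.0  # e.g., a fallen can darkens the view

def run_line(frames_by_station: dict[str, np.ndarray]) -> None:
    """Check each monitored station; halt the line at the first suspect frame."""
    for station, frame in frames_by_station.items():
        if looks_defective(frame):
            print(f"Halting line: defect suspected at {station}")
            return
    print("All stations clear")

run_line({
    "filler":  np.full((8, 8), 120.0),
    "seamer":  np.full((8, 8), 30.0),   # simulated fault
    "labeler": np.full((8, 8), 115.0),
})
```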

Private 5G Was Made for Manufacturing

Edge solutions like the one Dell built at Exhibit ‘A’ depend on reliable connectivity. If the edge devices, such as cameras or temperature gauges, lose connectivity, the solution quickly stops working properly. Like many manufacturing environments, breweries feature a lot of metal, thick walls and plenty of moving parts. In these conditions, private 5G is typically a better wireless choice than Wi-Fi because it delivers seamless connectivity through solid objects. Private 5G is also more scalable and can easily cover large areas while eliminating connectivity dead zones.

Dell can now bring private 5G connectivity to edge solutions like Exhibit ‘A’ through its recently announced strategic partnership with Nokia. Dell and Nokia have built a demo integrating the Nokia Digital Automation Cloud (NDAC) private 5G solution with the canning line solution built for Exhibit ‘A.’ The Nokia private 5G solution provides the connectivity for the cameras that feed the computer vision capabilities. Dell and Nokia are also working together to integrate Nokia’s NDAC solution with Dell NativeEdge, the edge operations software platform that will provide a comprehensive, scalable solution for enterprises.

If you have a business problem that you think 5G can solve, reach out. If you’re in the Boston area, don’t forget to visit the Exhibit ‘A’ taproom in Framingham, Massachusetts, and check out the Dell solution in action. Tell them Dell sent you.

Source: dell.com

Thursday 23 May 2024

New Machines For A New World

Not since the beginning of the Industrial Revolution nearly 300 years ago has the world experienced technological advancement as rapid and profound as today, at the beginning of the age of artificial intelligence.

Instead of the steam and steel that fired the industrial revolution, the raw materials of the AI revolution are data and information, said Dell Technologies Vice Chairman and COO Jeff Clarke during a keynote Tuesday at Dell Technologies World in Las Vegas.

“The new machines are GPUs capable of massive parallel processing performing trillions of floating-point operations per second,” Clarke said. “That, when coupled with high-speed AI fabrics, AI speed storage, the right models and data tools, transforms data and information into insights, knowledge and wisdom we’ve never had before.”

A new computing architecture is required to handle AI workloads; traditional architectures are simply not equipped for the task, Clarke said. Dell calls this new architecture the Dell AI Factory, and as with any factory, the Dell AI Factory takes raw material (data) and turns it into something useful (insight and knowledge).

The strategy behind the Dell AI Factory is based on several key assumptions: namely, that the vast majority of data sits on-premises rather than in a public cloud, and that half of enterprise data is generated at the edge.

This means that for the sake of efficiency and security, AI must be brought to the data, rather than the other way around, Clarke said. It also means there’s no one-size-fits-all approach to AI, and that AI requires a broad, open ecosystem and open, modular architecture, he said.

Most enterprises will not train large language models themselves but turn to open-source models to use GenAI in their business, Clarke said. Over time, smaller, optimized open-source models will help enterprises achieve better performance and efficiency. Intelligent data pipelines will ensure AI is primed with correct, comprehensive data, and inferencing will be performed anywhere an AI-guided outcome is desired, he said.

Because of this, Clarke said, AI Factories must come in all shapes and sizes, from mobile workstations or a single server to multiple data centers containing hundreds of thousands of GPUs connected as a single, cognitive computer.

The best way to put this new architecture to work is to separate it from legacy systems and optimize each for its respective workload, Clarke said. AI workloads require accelerated compute, optimized high-speed storage, high-throughput, low-latency network fabrics, data protection, integration into a common data pipeline and AI PCs.

This may seem like a lot. These are complex, highly technical, engineering-intensive systems that require high levels of software and networking expertise. Still, Clarke’s advice was simple: “Start getting ready now.”

In the coming years, the compute requirements are expected to rise astronomically. By 2030, only about 10% of compute demand will be earmarked for training. The lion’s share will be used for inferencing. Also by 2030, data center capacity will grow 8X, and by the end of the decade the PC install base will refresh and there will be two billion AI PCs in use, Clarke said.

To dig into the networking side of the AI equation, Clarke welcomed Broadcom President Charlie Kawwas to the stage, who said networking is essential to the communication required to bring large-scale AI to life, and discussed Broadcom components used in new Dell switches and servers.

Clarke was also joined on stage by Arthur Lewis, president of Dell’s Infrastructure Solutions Group, who walked through the company’s AI solutions portfolio in detail.

“The advancements we’ve seen in AI will not only accelerate the value the world’s data will bring to organizations of all sizes. It will forever change the architecture of data centers and data flows,” Lewis said. “Silos of the past will be dismantled. Everything will be connected.”

“We sit at the center of an AI revolution,” Lewis said, “and we have the world’s broadest AI solutions portfolio from desktop to datacenter to cloud, a growing and great ecosystem of partners and a full suite of professional and consulting services.”

Lewis gave a rundown of Dell’s AI portfolio and previewed future solutions. To talk more about that growing ecosystem of AI partners, Lewis brought Dell SVP and CTO for AI, Compute and Networking Ihab Tarazi to the stage. Tarazi was joined by Sy Choudhury, Director of AI partnerships at Meta.

Open-source models like Meta’s Llama 3 are driving “the most impactful innovations” in AI, Tarazi said.

By using open-source models, companies contribute to performance and security improvements, Choudhury said. Meta’s partnership with Dell means the companies can deliver well integrated and highly optimized solutions customers can leverage for a wide variety of use cases.

“And with our permissive license,” Choudhury said, “customers like those represented here today can easily leverage the state-of-the-art nature of the Llama 3 models as-is, or fine-tune them with their own data.”

Tarazi and Choudhury walked through a demo of Llama 3 working on a product development project, and Tarazi welcomed Hugging Face Product Head Jeff Boudier to the stage to talk about the company’s efforts to make it easy for customers to build their own secure, private AI systems using Dell and open-source models. The pair introduced and demonstrated the Dell Enterprise Hub.

Sam Burd, Dell President of Dell’s Client Solutions Group, took the stage to highlight the central role AI PCs will play in the broader AI revolution.

Dell’s AI PCs, Burd said, are “a true digital partner enabling software developers, content creators, knowledge workers, sales makers and everyone in between to be more efficient, solve problems faster, and focus on the more meaningful, strategic work.”

To illustrate the point, Burd welcomed Deloitte Chief Commercial Officer Dounia Senawi to the stage to talk about the increased productivity, speed and cost efficiency the company has realized with Dell AI PCs. He also teamed with Microsoft Corporate Vice President for Windows Matt Barlow to demo the capabilities of AI PCs.

Another of Dell’s high-profile customers is McLaren Racing, and CEO Zak Brown joined Clarke on stage to talk about how the team is putting Dell AI solutions to use in development and on-track.

Technology generally, and AI in particular, are the difference between success and failure for McLaren, Brown said. McLaren uses AI to run simulations, to help make critical decisions very quickly. “We see millions of simulations before we make a decision,” Brown said. “And we work in real time. We have a split second to make a decision.”

All that happens on a backdrop of furious competition. “Take the car that’s on pole for the first race,” Brown said. “If it’s untouched, by the end of the year it would be dead last. That’s the pace of development. We used to be able to spend our way out of problems. Now you have to make sure what starts in the digital world works on the racetrack.”

Dell is bringing this new world to life with a set of familiar strengths, namely its unique operating model, including an end-to-end portfolio, an industry-leading supply chain, the industry’s largest go-to-market engine and world-class services.

“Our strategy, in its simplest form,” Clarke said, “is to accelerate the adoption of AI.”

Tuesday 21 May 2024

Safeguarding AI Infrastructure with Dell Data Protection

As organizations rush to harness the transformative potential of AI, they seek comprehensive strategies to simplify the implementation of these solutions. Dell Technologies is facilitating this journey through the Dell AI Factory, which provides targeted and repeatable success for deploying AI solutions. Built on the industry’s pioneering end-to-end AI-optimized infrastructure portfolio and open partner ecosystem, Dell is propelling AI adoption forward.

Among the advantages of the Dell portfolio is the integration of data protection within the infrastructure stack. This is paramount given Dell’s Global Data Protection Index for 2024, which revealed that an overwhelming 88% of organizations acknowledge the imminent challenge of safeguarding the immense volumes of data generated by AI.

There exists a legitimate concern that components of AI applications, such as large language models (LLMs), could create new attack surfaces for cybercriminals. All organizations must grasp the consequences of compromised AI data that lacks protection. According to research by TechTarget’s Enterprise Strategy Group, Reinventing Backup and Recovery with AI and ML, 65% of organizations admit to regularly backing up no more than 50% of their total volume of AI-generated data.

Be it escalating cyber threats, human errors, system failures or natural disasters, the risk of data loss hangs over AI applications, as it would for any other mission-critical data.

Dell Solution for AI Data Protection and Dell AI Factory


The Dell Solution for AI Data Protection, an integral component of the Dell AI Factory infrastructure portfolio, ensures the security and recoverability of the data powering AI workloads. Organizations can leverage software and appliances, benefiting from the performance, efficiency and scalability that have established Dell as a leader in data protection.

Dell assists organizations in assessing their AI data protection requirements, ensuring data is safeguarded before deployment into production. While AI workloads may vary, there are several critical areas to consider regarding data protection.

AI training data. When training data is retrieved from multiple source locations, including internal sources and public cloud services, and consolidated into a single set of training data, it may be advisable to back up that data set as a single unit tied to the models built from it.

AI models. It may be desirable to back up AI models along with the consolidated data set and its parameters together for consistency and recoverability.

AI compliance. For regulatory compliance and long-term retention, including Personally Identifiable Information (PII), chain of custody and audit purposes, archival backups of all assets related to model creation, including the model and its parameters, as well as output data may be necessary.

AI cyber resilience. Cyber resiliency means maintaining data integrity with immutability through layers of security and controls, separating critical data from attack surfaces through physical or logical data isolation and ensuring data can be recovered safely in the event of an attack, aided by AI-based machine learning and analytics.
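
One way to make the “back up the model and its data set together” consideration above concrete is a backup manifest that pins both artifacts to content hashes, so a restore is provably consistent. This is a hedged sketch with example paths, not a Dell tool:

```python
# A hedged sketch of "back up the model and its data set together": write a
# manifest that pins the model artifact and the consolidated training set to
# content hashes so a restore is provably consistent. Paths are examples.
import hashlib, json, pathlib, time

def sha256(path: pathlib.Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(model: str, dataset: str, params: dict, out: str) -> None:
    manifest = {
        "created": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "model":   {"path": model,   "sha256": sha256(pathlib.Path(model))},
        "dataset": {"path": dataset, "sha256": sha256(pathlib.Path(dataset))},
        "params":  params,  # hyperparameters frozen alongside the artifacts
    }
    pathlib.Path(out).write_text(json.dumps(manifest, indent=2))

# Example (hypothetical file names):
# write_manifest("model.bin", "train.parquet", {"lr": 2e-5}, "backup.json")
```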

Dell Reference Design for AI Data Protection


Recognizing the pivotal role of data protection in preserving AI infrastructure integrity, Dell is introducing the Dell Reference Design for AI Data Protection, which is based on the Dell Scalable Architecture for retrieval-augmented generation (RAG) with NVIDIA Microservices. This blueprint can help guide organizations in seamlessly integrating robust data protection measures into their AI frameworks.

The reference design offers invaluable insights for planning and implementing enterprise-grade solutions to fortify AI workloads on Dell infrastructure. Encompassing backup and recovery as well as cyber resilience, it helps organizations safeguard crucial AI components. The reference design will be available in Q2.

Innovate Securely with Dell Data Protection: Modern, Simple, Resilient


Dell Data Protection offers modern solutions designed to protect all workloads across multicloud, on-premises and edge environments cost-effectively. These solutions are simple to deploy, providing consumption flexibility, ease of deployment and streamlined operations. With resilient solutions, Dell builds the foundation for secure infrastructure, ensuring organizations can recover from destructive cyberattacks swiftly and confidently.

Source: dell.com

Saturday 18 May 2024

Accelerate Modern Workloads with AMD and Dell AI Innovation

Driving value from one of the most disruptive paradigm shifts of our era, generative AI (GenAI), is a critical priority for leaders as they wrestle to bring this technology into their organizations and overcome business challenges. But deploying best-fit AI infrastructure to power the organization while enabling developers is limited by the complexity of planning AI strategies, which often depend on proprietary, closed-system solutions.

Dell Technologies and AMD are making available new “easy button” AI solutions that give developers and IT the flexibility to deploy architecture that enables innovation within an open ecosystem and open AI frameworks. Furthermore, customers can rely on proven methodologies from Dell Services to create a winning strategy that accelerates AI outcomes.

Scale Up with GenAI Value

Scientists and application developers often already have a wealth of experience in AI and ML. But with larger LLMs, getting started with GenAI depends on powerful GPUs, more AI frameworks and integrated tools, often requiring a significant investment in platforms, AI software licensing and more. Furthermore, developers can face significant barriers when integrating home-grown tools and open-market models, customizing drivers and, most importantly, adding company IP, because many all-in-one AI software suites limit custom integrations and can require additional investment.

This is where a suite of standards-based AI tools, open-source software and frameworks can allow developers to integrate and scale their workflows, code and familiar tools into the organization’s value chain to accelerate innovation.

Harness Open-source Flexibility for Your AI Starting Point

Enter Dell Technologies newest open-source AI framework approach to enable custom applications development while taking the limits off GenAI with open-framework foundations.

The new Dell Validated Design for Generative AI with AMD Instinct™ and ROCm™-powered AI frameworks extends the Dell ecosystem to help accelerate outcomes with a multi-node design based on the latest innovations: the AMD Instinct™ MI300X accelerators and the AMD ROCm AI suite.

This solution is built on the fastest-ramping server in Dell history, the Dell PowerEdge XE9680, supporting eight AMD Instinct™ MI300X accelerators. With 192GB of memory per GPU (a total capacity of 1.5 TB per server), the PowerEdge XE9680 with AMD further enables organizations to train larger models, lower TCO and gain a competitive edge. The PowerEdge XE9680 with AMD GPUs is already available for orders, with full availability in June.

Developers can also start from their own desks, building applications on AMD-based Precision workstations such as the Precision 7875 Tower, which features an AMD Ryzen™ Threadripper™ PRO processor with up to 96 cores and scalable professional GPUs.

Building Blocks to Drive Scale

With an on-premises approach, the new GenAI solution delivers a faster experience with optimized AI storage and AI fabric connectivity from the latest Dell PowerScale F710 and Dell PowerSwitch Z9664F-ON.

The new PowerScale F710 delivers faster time to AI insights with massive gains in streaming performance that accelerate all phases of the AI pipeline. It offers double the write throughput per RU over flash-only competitors and features up to 10 NVMe SSD drives in a compact 1U form factor to further enhance storage efficiency and minimize data center footprint. PowerScale leverages OneFS software and features the latest technology to bring multicloud agility with APEX File Storage integrations, federal-grade security and exceptional efficiency for AI infrastructure.

Network performance is critical to support AI operations between GPUs, servers and storage. The Dell PowerSwitch Z9664F-ON, offering 64 ports of 400GbE, delivers low-latency, high-throughput fabrics for modern AI clusters along with Dell’s Enterprise SONiC open-source distribution. Upcoming enhancements this summer and new network interface cards from Broadcom will boost AI fabric performance. Dell’s participation in the Ultra Ethernet Consortium (UEC) ensures organizations can rely on an open, standards-based networking approach to scale out their tailored AI strategy.

Integrate Value with AI Open Software Platform and Tools

Source: Accelerate HPC Innovation and AI Insights with an Open Ecosystem.

Open standards software suites enhance application development and workflow automation. The open-source AMD ROCm suite is designed to unleash the power of AMD GPUs, with greater choice and interoperability across popular software frameworks and tools (including PyTorch, TensorFlow and other AI applications). Built on open standards, AMD ROCm reduces the need for proprietary AI software suites, enabling developers to simplify development and freely customize their workflows. Furthermore, developers can readily build with open-source LLM models from partners including Hugging Face and Meta.
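
One practical consequence is code portability: ROCm builds of PyTorch expose AMD GPUs through the familiar torch.cuda API, so a quick check like the following runs unchanged on AMD or NVIDIA hardware (assuming a ROCm or CUDA build of PyTorch is installed).

```python
# Minimal portability check: ROCm builds of PyTorch expose AMD GPUs through
# the familiar torch.cuda API, so the same code runs on either vendor.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on: {device}",
      f"({torch.cuda.get_device_name(0)})" if device.type == "cuda" else "")

x = torch.randn(4096, 4096, device=device)
y = x @ x  # dispatched to the AMD GPU under ROCm, the NVIDIA GPU under CUDA
print(y.shape)
```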

Scaling GenAI development into production is enabled by Dell Omnia, open-source software that deploys and manages high-performance clusters for AI and other workloads. Omnia installs open-source Kubernetes for managing jobs and services. Developers are continually extending Omnia to speed deployment of new infrastructure into resource pools that can easily be allocated to different workloads. With Omnia, IT can rapidly provision and further optimize its AI infrastructure. In addition, Dell provides enterprise support for Omnia, giving customers the confidence to deploy it in mission-critical environments.

Accelerate Your AI Journey

To quickly put the power of the Dell Generative AI solutions with AMD to work, Dell Services—recently recognized as one of the world’s leading management consulting firms by Forbes—brings deep expertise across every stage of your AI journey, regardless of where development starts, on a dedicated workstation or in servers at the data center.

From aligning a winning AI strategy or getting data ready to power AI projects, to implementing the infrastructure needed to quickly realize a secure, optimized model for key use cases, to fully operating the solution for you, we will meet you wherever you are. By minimizing time-consuming operational efforts and providing your team with skills, best practices and time to focus on innovation, you will maximize ROI now and into the future. Leverage our Accelerator Workshop to start developing a point of view on how your business will maximize benefits from GenAI.

Comprehensive Approach to GenAI with Open Foundations

Dell Technologies and AMD aim to enable developers and IT to accelerate their AI initiatives with a proven, open solutions approach that:

  • Powers AI-assisted use-case and application development
  • Delivers secure, on-premises AI applications at scale
  • Reduces barriers to integrating company IP, custom development processes and tools
  • Lowers TCO and investments with best-fit infrastructure

Source: dell.com

Thursday 16 May 2024

Unlocking Tomorrow: HPC and AI in Higher Education

My friend Maria, who is 22 years old, told me that she often decides if she is going to continue dating someone by asking an artificial intelligence (AI) tool to give her an opinion based on the content of their social media chats. If young, college-age people are asking AI for relationship advice, they’re likely asking AI for advice on which universities to apply to and attend. Their generation has grown up in a world where AI and computer science are not just fascinating futuristic concepts on TV, but instead an integrated part of everyday life. Universities that are not promoting access to AI resources to potential students may soon find themselves in as much jeopardy as Maria’s partner.

What’s the Difference Between HPC and AI?


Most universities have offered researchers high-performance computing (HPC) resources for years. HPC offers unmatched computational power for performance-intensive simulations that demand massive parallelism, such as fluid dynamics, molecular dynamics, climate modeling, astronomy and more. AI thrives on large amounts of labeled data for machine learning models, neural networks and deep learning architectures. AI’s ability to process information efficiently and intelligently to extract patterns and insights from this data drives innovation.

HPC for AI


HPC technologies are pivotal for developing AI models, as they can handle the vast data repositories generated from traditional HPC tasks, turning them into valuable assets for AI. HPC not only aids in creating datasets for AI but also enhances financial simulations like Monte Carlo methods, where AI can step in to offer more precise and timely interpretations of the growing data.

AI for HPC


Conversely, AI can streamline the workload for HPC systems by pre-processing data, narrowing down possibilities for HPC to analyze more thoroughly and even substituting extensive HPC computations with AI-generated approximations. This synergy between HPC and AI leads to a powerful combination that boosts the accuracy and efficiency of both technologies, fostering significant advancements in computational tasks.

Combining the Power of HPC with the Efficiency of AI


The symbiotic relationship between HPC and AI is reshaping the technology landscape at universities, driving innovation at an unprecedented pace. AI is no longer a separate entity; it is now considered one of the essential components of HPC workloads. Organizations need infrastructure that seamlessly supports both HPC and AI workloads to remain competitive in today’s rapidly changing landscape.

HPC and AI Trends in Higher Education


Academic supercomputing centers are hubs for scientific discovery. A 2024 study by industry analyst Intersect360 identifies a trend in amplifying scientific research with the latest in artificial intelligence. According to Intersect360 Research, over 90% of academic HPC centers have made investments to incorporate AI into their research.

This is most evident in the incorporation of graphics processing units (GPUs), the computational elements that have powered the revolution in AI. The study found that 92% of academic HPC centers have incorporated computational accelerators, and 91% of accelerated HPC workloads are leveraging GPUs from NVIDIA.

The major trend driving recent growth has been the promise of generative AI (GenAI)—AI capable of creating original content, whether in writing, art or computer code. The study found that 35% of academic HPC-AI sites are “actively using generative AI” while 37% are “looking at building their own generative AI models.”

The Year of AI at the University of Texas at Austin


The University of Texas at Austin (UT-Austin) is ranked in the top 10 nationally for AI. In February 2024, the University launched the Center for Generative AI and announced that 2024 is the “Year of AI” at the university.

UT-Austin has a rich history of innovation in technology through the Texas Advanced Computing Center (TACC). In fact, the U.S. National Science Foundation (NSF) selected TACC’s supercomputers, including the new AI-centric system Vista with 600 NVIDIA GH200 GPUs, to be part of the National Artificial Intelligence Research Resource pilot. The pilot provides U.S.-based researchers and educators with tools and data to advance AI research. Vista is a precursor to the Horizon system, set to be the largest NSF supercomputer, offering unparalleled computing power for scientific research by 2025.

“AI has become critically important to supercomputing centers, both as the importance of AI in science has grown, and as the infrastructure needed for large scale generative AI and high-performance computing has become virtually identical,” said Dan Stanzione, Executive Director, TACC / Associate Vice President for Research, UT-Austin. “Our newest system at TACC, Vista, has been designed with generative AI systems in mind. With 600 GH200 NVIDIA GPUs, it will vastly expand our capacity to support students and researchers in AI work. We’ve seen explosive demand for these resources, and more from the students than the faculty!”

Breaking Enrollment Records at UT-Austin


While it’s challenging to quantify the direct impact of AI investments on enrollment, UT-Austin’s commitment to AI research, faculty expertise and cutting-edge facilities likely contribute to its appeal among prospective students. UT-Austin received almost 73,000 undergraduate applications for the Fall 2024 semester, breaking the previous enrollment record for the UT System’s flagship university, set in 2022. In 2022, UT-Austin enrolled 66,109 students. The university also set all-time highs for graduation rates. The four-year graduation rate rose to 73.5% in 2022, an increase of 21 percentage points since 2012.

Universities must recognize the opportunity to incorporate HPC and AI, as failure to promote access to these resources could jeopardize their competitiveness in higher education. The exponential growth in AI integration, exemplified by UT-Austin’s commitment, not only amplifies scientific research but also attracts prospective students, evident in the university’s record-breaking enrollment figures.

Download the infographic from Intersect360 Research, “Powering Discovery: The rise of AI in academic supercomputing.”

Source: dell.com

Tuesday 14 May 2024

Dell and Red Hat Transform AI Complexity into Opportunity

In the rapidly evolving landscape of artificial intelligence (AI), seizing opportunities amidst complexity is paramount. Collaboratively engineered with Red Hat, the Dell APEX Cloud Platform for Red Hat OpenShift offers a streamlined and automated turnkey solution that transforms how organizations run Red Hat OpenShift on-premises. Today, we’re thrilled to announce enhancements that further accelerate, transform and optimize how customers harness the power of APEX Cloud Platform for Red Hat OpenShift to address AI use cases.

This starts with offering an integrated, automated and purpose-built infrastructure with powerful GPUs and extends to providing validated designs, which serve as a trusted roadmap for organizations to quickly achieve tangible business outcomes. In addition to the groundbreaking technology, our commitment to driving value encompasses a comprehensive suite of services, ranging from strategic consultation to seamless implementation. 

“The rise of artificial intelligence has led customers to seek out hybrid cloud infrastructure that accelerates AI application development and delivers faster time to value,” said Stefanie Chiras, senior vice president, Partner Ecosystem Success, Red Hat. “Innovation is at the heart of our continued collaboration with Dell and the updates announced today showcase why Dell APEX Cloud Platform for Red Hat OpenShift can empower organizations with a more consistent, reliable, integrated and automated platform for running Red Hat OpenShift AI on-premises.”

Elevate Your AI Infrastructure


APEX Cloud Platform for Red Hat OpenShift is the first fully integrated application delivery platform purpose-built for Red Hat OpenShift. Jointly engineered with Red Hat, the APEX Cloud Platform transforms how organizations deploy, manage and run containers, alongside virtual machines, on-premises. APEX Cloud Platform now supports hosted control planes for Red Hat OpenShift, which helps reduce management costs, optimize cluster deployment time and separate management and workload concerns so customers can focus on their applications.

The APEX Cloud Platform is optimized for running Red Hat OpenShift AI, and we’re expanding the range of AI outcomes it can serve by supporting even more GPUs. This empowers you with the flexibility to tailor your infrastructure precisely to your unique requirements. Notably, we’re introducing the NVIDIA L40S GPU to support even the most demanding AI applications.

Regardless of size, AI solutions have varying storage requirements based on their design, so the APEX Cloud Platform for Red Hat OpenShift supports both Dell PowerFlex and now Dell ObjectScale storage, allowing any AI workload to be deployed. Object storage is critical for both Red Hat OpenShift AI and AI workloads in general: it is designed for scalable and cost-effective data management, which makes it an effective solution for housing massive language models and large datasets.

Unleash Your AI Vision


To illustrate the potential of Red Hat OpenShift AI on the APEX Cloud Platform, we updated our Validated Design for deploying a digital assistant utilizing a large language model (LLM) and the retrieval-augmented generation (RAG) framework from a 7B parameter model to a 13B model. LLMs are highly advanced and can generate unique answers, yet by themselves can lack domain-specific information for your business and do not stay up to date on their own. With RAG, you can augment the LLM with your company’s data, rapidly training it with relevant information that stays current. We utilized a series of open-source operators to make it easy to replicate and tweak the design to fit your specific business’s needs. Red Hat OpenShift AI represents an alternative to prescriptive AI/ML suites, providing a set of collaborative open-source tools and a platform for building models without worrying about the infrastructure or lock-in from public cloud-specific tools. This update underscores our commitment to enabling customers to leverage the latest advancements in technology as both our capabilities and the AI ecosystem evolve.
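
In practice, moving such a design from a 7B to a 13B model is often little more than a configuration change when models are pulled from a hub. The sketch below uses the Hugging Face transformers library with placeholder model IDs; they are not the models used in the Validated Design.

```python
# Hedged sketch: with hub-hosted open models, moving a RAG design from a 7B
# to a 13B model is often a one-line configuration change. The model IDs
# below are placeholders, not the ones used in Dell's Validated Design.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "example-org/chat-13b"   # was "example-org/chat-7b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# The RAG framework fills in retrieved context before generation.
prompt = "Context:\n{retrieved_docs}\n\nQuestion: {user_question}"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```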

We’re also excited to announce a new solution that uses AI to achieve automated speech recognition (ASR) and text-to-speech (TTS) capabilities. This innovative design extends the power of Red Hat OpenShift AI on APEX Cloud Platform with NVIDIA Riva, a microservice for building GPU-accelerated speech AI applications. Through this integration, you can streamline the deployment of your own natural language processing (NLP) solution, unlocking new realms of possibility.

Fast-track Your AI Implementation


Our trusted professional services experts will work with you at every stage to drive tangible business value. Dell Services bring deep expertise to accelerate time to value of the Red Hat OpenShift AI platform. Leveraging Dell’s proven methodology, from ProConsult Advisory Services to Implementation Services, we craft a winning strategy aligned with high-value AI use cases. Utilizing RAG techniques, we tailor models with your data, seamlessly integrating them into AI Avatars, chatbots or other applications, yielding more relevant and impactful results. By minimizing data preparation and LLM training efforts, and equipping your team with essential skills and best practices, we ensure maximized ROI now and in the future, freeing you to focus on innovation.

Join us at Red Hat Summit


We’re delivering a transformative platform for Red Hat OpenShift AI, empowering customers to quickly get started with powerful AI tools on-premises on a fully integrated solution. We invite you to discover the full potential of Dell APEX Cloud Platform for Red Hat OpenShift at Red Hat Summit this week.

Source: dell.com

Monday 13 May 2024

GenAI Showdown: SAP Edition


In this corner, wearing the cloud-colored trunks, we have RISE with SAP S/4HANA, the heavyweight champion of enterprise resource planning. And in the opposite corner, sporting the any-premises jersey, it’s GenAI, the neural network ninja. Let the battle begin!

Round 1: The RISE of S/4HANA


S/4HANA RISE struts into the ring, flexing its cloud muscles. It promises seamless migration, simplified landscapes and a subscription model that makes CFOs do the cha-cha. But wait, what’s this? A challenger approaches—any-premises deployment! Why would anyone choose it when the cloud is throwing a party?

Round 2: The Any-premises Underdog Strikes Back


Listen up, SAP aficionados! Any-premises isn’t just a relic; it’s the secret sauce. Here’s why:

  1. Customization galore. S/4HANA RISE bundles services like a gourmet meal deal. But what if you’re a picky eater? Any-premises lets you mix and match—like a build-your-own pizza. Extra cheese? Sure! Hold the hyperscalers? Absolutely. Need industry-specific tweaks? Any-premises whispers, “I got you.” Finance, manufacturing, retail—customize to your heart’s content.
  2. Data security tango. Imagine S/4HANA RISE as a masked ball. Your data waltzes in, wearing a tuxedo of compliance. GDPR, CCPA and friends nod approvingly. But any-premises? It’s the VIP section—strict access control, encrypted handshakes and no uninvited guests. Finance folks, listen up! Regulated sectors need data security like a Swiss bank vault. Any-premises ensures your secrets stay safe.
  3. Latency Paso Doble. GenAI twirls onto the dance floor. Chatbots, recommendation engines—they crave real-time moves. But the cloud? It’s doing the cha-cha across continents. Latency, darling, latency! Any-premises steps in. No long-distance relationships here. Data pirouettes within your walls, whispering sweet nothings to your servers. Faster response times, fewer dropped calls.

Round 3: Cost Control Tango


Ah, the budget—a tango partner with mood swings. S/4HANA RISE winks, “Pay as you go!” But GenAI raises an eyebrow. “And when usage spikes? Surprise bills?”

  1. Any-premises budget foxtrot. You’re the conductor. Control the orchestra. Any-premises lets you fine-tune resources. No cloud sticker shock. No “unicorn tears” line items. CFOs nod appreciatively. Predictable costs, no mythical creatures involved.
  2. Hybrid waltz. Picture this: S/4HANA RISE starts any-premises, like a cautious debutante. It trains, learns and pirouettes. Then—ta-da!—it waltzes into the cloud for a grand finale. Hybrid magic! Best of both worlds. Like a tech-savvy Cinderella with glass slippers and a backup pair of sneakers.

Final Round: The SAP Symphony


As the crowd roars, SAP leaders take notes. S/4HANA RISE isn’t just a cloud fling—it’s a symphony. Any-premises adds depth, like a cello solo in a Mozart concerto. Together, they harmonize. Compliance, agility, cost control—they dance in perfect rhythm.

So, SAP enthusiasts, choose your partner wisely. Cloud or any-premises? The winner? Innovation. And maybe, just maybe, a sprinkle of ABAP magic.

Source: dell.com

Saturday 11 May 2024

Diving Deep into the Liquid Server Cooling Choices


As Dell Technologies continues to create technologies that drive human progress, obstacles can slow the adoption of these new solutions. In the data center, nowhere are those hurdles more evident than with AI workloads. AI and other demanding workloads mandate the use of the latest GPUs and CPUs to deliver the required application performance, so thermal and power questions often arise during deployment planning. To help, Dell’s server thermal engineering team has been delivering Dell Smart Cooling, a customer-centric collection of innovations, for many years. Triton, for example, was an early liquid-cooled server product from 2016. Fast forward to 2024, and we’re supplying server cooling solutions like the Dell DLC3000 DLC rack that Verne Global is using and the Dell modular data centers that offer up to 115 kW per rack.

Current Cooling Choices


Previous blogs have covered the cooling requirements of the latest CPUs and GPUs and the different cooling options supported by the PowerEdge portfolio. Deploying these latest high-powered servers can mean the amount of heat generated per rack exceeds what traditional air cooling can handle. In addition, customers are looking to be more sustainable and more efficient with power usage in the data center. So, let’s look at the data center cooling methodologies and strategies available to customers today to support these increasing cooling demands.

Here’s a quick overview of the most common technologies used as building blocks when architecting a data center cooling environment.

  • Direct liquid cooling (DLC) uses cold plates in direct contact with internal server elements such as CPUs and GPUs; liquid is then used to cool the cold plate and transport heat away from these processors.
  • In-row cooling solutions are designed to be deployed within a data center aisle alongside racks to cool and distribute chilled air to precise locations.
  • Rear door heat exchangers (RDHx) work by capturing the heat from servers’ hot exhaust air via a liquid-cooled heat exchanger installed on the rear of the server rack.
  • Enclosure refers to the containment of heated exhaust air, cooling it and recirculating it, completely isolated from the rest of the data center’s chilled air.

Each cooling technology supports different rack thermal densities and efficiencies, giving customers choices to match the cooling solution to their requirements. These solutions can be deployed from one rack to multiple aisles. In-row coolers, combined with row or rack containment, capture 100% of the IT-generated heat at the rack(s). This means that the only air conditioning required in the data hall is for human comfort. RDHx also transfers 100% of the IT-generated heat to facility water at the rack and conditions the air in the space at the same time. Because of this air-conditioning function, the facility water temperature provided to RDHx must be cooler (up to approximately 20°C) than what can be used with in-row coolers (up to 32°C). Higher facility water temperatures allow the chillers that cool the water to operate with lower energy, which is desirable, but that is only part of the whole efficiency story.

Combining these 100% heat capture technologies with DLC increases efficiency even more by decreasing the fan power required to cool the IT equipment.

Figure 1. Customer Requirement with Dell Suggested Cooling Solutions.

Server Cooling Efficiency


These different solutions and methods consume differing amounts of power to deliver cooling. Figure 2 highlights annual energy usage for different cooling methods when used to cool a typical rack of dual-CPU servers. The bars show the IT energy and cooling energy for each cooling approach. IT energy consumed includes everything inside the server, including internal fans. Cooling energy represents cooling items outside the server, starting at the CDUs (coolant distribution units) or CRAHs (computer room air handlers) and including an air-cooled chiller outside the data center. This model is specifically for a data center located in the Southern United States.

Figure 2. Energy usage by cooling methods.

The first bar represents a typical data center that uses air handlers stationed around the perimeter of the data hall blowing air towards the servers. Next, adding DLC to cool the CPUs in each server can save about 11% of the total energy consumed by air-cooling only with perimeter air handlers. Replacing the perimeter cooling with rear door heat exchangers (RDHx) on each rack can save 16% annually, and adding DLC saves another 2% beyond that. As noted above, deploying IT in an enclosure with an in-row cooler permits warmer water to be used, and this brings a 19% energy savings over perimeter air handlers. Finally, combining this enclosure with DLC saves 23% of the energy consumed by traditionally cooled racks.
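
Those comparisons are easier to see side by side. Below is a minimal Python sketch that restates the quoted percentages against an arbitrary 100-unit baseline; the baseline value, and the reading of “another 2%” as two further percentage points, are assumptions for illustration only:

```python
# Relative annual energy for each cooling approach, using the savings
# figures quoted above versus a perimeter-air-handler baseline.
# The 100-unit baseline is an illustrative assumption, not measured data.
BASELINE = 100.0  # perimeter air handlers, air cooling only

savings_vs_baseline = {
    "Perimeter air handlers": 0.00,
    "Perimeter air handlers + DLC": 0.11,
    "RDHx": 0.16,
    "RDHx + DLC": 0.18,  # reading "another 2%" as two more points
    "Enclosure + in-row cooler": 0.19,
    "Enclosure + in-row cooler + DLC": 0.23,
}

for method, saving in savings_vs_baseline.items():
    print(f"{method:35s} {BASELINE * (1 - saving):6.1f} units")
```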

The Benefits of Dell Technologies Solutions


There are several alternative cooling methods in the marketplace. For example, some vendors have chosen to use direct liquid cooling on additional internal server components, including memory, network interfaces and storage, meaning the DLC solution is in contact with almost all heat-producing components inside each server. Often these solutions require custom copper cold plates and additional piping internal to the server to put all the components in contact with the liquid. At Dell, we don’t believe costly, complex copper cooling is the best approach. We believe organizations can achieve many benefits by combining both liquid and air cooling into a hybrid server cooling solution, including:

  • Much greater flexibility in server configurations. Customers can decide the server configuration (memory/PCIe cards/storage/etc.) without being bound to one server cold plate design.
  • Designs with far fewer hoses and joints where leaking may occur.
  • Simple on-site service procedures with easy access to replace server components.
  • Selection of a broad range of servers.

Dell’s hybrid approach is less complicated, enabling greater agility in cooling new and different processors and server platforms as they become available.

Analysis using Dell’s in-house models shows that a hybrid air + DLC deployment in a well-designed, well-managed, low-water-temperature solution can use just 3% to 4% more cooling energy than the “cold plate everything” approach used by some other vendors, while bringing the benefits listed above.

Harness the Next Generation of Smart Cooling


Dell continues its cooling strategy of being open and flexible, offering customers choice rather than a one-size-fits-all approach. These advanced data center cooling methods are now moving from high-performance computing clusters to mainstream deployments, enabling the next generation of peak-performing servers that support AI and other intense workloads. Dell Smart Cooling is already helping many PowerEdge customers enhance their overall server cooling, energy efficiency and sustainability. Come and talk with the cooling experts in the PowerEdge Expo area at Dell Technologies World, or ask your account team for a session with a data center cooling subject matter expert.

Source: dell.com

Thursday 9 May 2024

Dell Technologies and Red Hat Drive Joint AI Innovation


In an era defined by rapid change and digital transformation, harnessing the potential of AI has become imperative for organizations across every industry. It’s not just about adopting AI solutions; it’s about leveraging AI to revolutionize how we operate, innovate and create value. At the intersection of Dell’s cutting-edge technology and Red Hat’s open-source expertise lies a realm of endless possibilities.

A Quarter Century of Collaboration


Red Hat and Dell have a long-standing relationship, now 25-plus years, of driving innovation through a shared vision and delivering value to customers. By combining Dell’s innovative technology with Red Hat’s open-source software, together we’re enabling businesses to harness the power of AI in new and exciting ways. Our most recent example of this collaboration is APEX Cloud Platform for Red Hat OpenShift, the only application delivery platform purpose-built for OpenShift.

Joint AI Innovation at Red Hat Summit


This week at Red Hat Summit, we’re excited to unveil our latest joint AI innovations with Red Hat that accelerate, transform and optimize AI initiatives, leveraging our broad portfolio of offerings to:

  • Support the most demanding AI applications. Red Hat OpenShift AI on APEX Cloud Platforms now supports Dell ObjectScale storage and additional GPUs, including the NVIDIA L40S.
  • Advance adoption and deployment with new solutions and services for popular AI use cases.
  • Increase DevOps maturity with automation across your infrastructure lifecycle using Dell Ansible modules that bring GenAI to life (a minimal sketch follows this list).
  • Accelerate your move to open networks and AI.
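
To give a flavor of that lifecycle automation, here is a hedged Python sketch that drives an Ansible playbook with the ansible-runner library. The project directory, playbook name and any use of Dell’s Ansible collections inside it are illustrative assumptions, not a documented workflow:

```python
# Driving an Ansible playbook from Python with ansible-runner.
# The paths and playbook name below are hypothetical placeholders.
import ansible_runner

result = ansible_runner.run(
    private_data_dir="/opt/automation",          # assumed project layout
    playbook="provision_openshift_nodes.yml",    # assumed playbook name
)

print(result.status)  # e.g., "successful" or "failed"
for event in result.events:
    if event.get("event") == "runner_on_failed":
        print(event["event_data"].get("task"))
```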

Don’t miss our multiple breakout sessions where you can learn more about solving complex AI use cases.

  • Tuesday, May 7 at 12:15 PM MDT: Transforming OpenShift operations on-premises with an integrated approach
  • Wednesday, May 8 at 12:20 PM MDT: Maximize GenAI’s Impact with Dell and Red Hat OpenShift
  • Wednesday, May 8 at 2:15 PM MDT: Fine-tuning open GenAI models at scale (Panel: AI Sweden, Stability AI, Red Hat, Intel, Dell)
  • Wednesday, May 8 at 3:30 PM MDT: Implementing AI with APEX Cloud Platform and Red Hat OpenShift AI
  • Thursday, May 9 at 10:30 AM MDT: Transforming OpenShift operations on-premises with an integrated infrastructure approach

Stop by our Dell Technologies booth to explore these innovations firsthand or to sign up for a customer meeting where our Dell team of experts can meet with you one-on-one to tailor our solutions to meet your specific needs.

Network with Your Fellow IT Professionals


Join us at the best networking reception of the week. The Elevate Technology User Group—an independent community of technology enthusiasts—will be hosting a community event on Tuesday, May 7 at Ruth’s Chris Steakhouse—just a three-minute walk from the convention center. Engage with peers and industry experts, hear from Dell Technologies and Red Hat leaders and enjoy delicious food and drink in a relaxed atmosphere. Space is limited.

Source: dell.com