Tuesday, 16 May 2023

Data Data Everywhere, Not a Byte to Use

The global data footprint is growing exponentially. It is expected to reach a few hundred zettabytes in the next few years (one zettabyte is 1,000,000,000,000,000,000,000 bytes). Meanwhile, a typical enterprise today manages petabytes of data, spread across on-prem, cloud and edge environments and across multiple systems, from relational databases and warehouses to data marts and data lakes.

Recognizing that data access is often the first (and most cumbersome) aspect of activating data, Dell recently announced a partnership with Starburst Data, whose industry-leading platform offers answers to many problems organizations are facing with their data.

The Data-driven Enterprise


There is no dearth of tools and platforms that can help you manage your data. Yet the rate at which we create data far exceeds the rate at which we convert it to insights. So, what has changed, and how can you better prepare your enterprise to be truly data-driven?

A modern enterprise data architecture needs to embrace some key tenets:

◉ Data will remain distributed across multiple data centers, public clouds and edge – often resulting in data silos. You need solutions that can help bridge these data silos.
◉ Data movement is expensive and error-prone. It is more efficient to move the consumption tools close to the data than to move the data near the tools.
◉ We must balance the flexibility and ease of democratizing data into the hands of data citizens against the increasing need for security and governance.
◉ An open data architecture, composed of open data formats, decoupled compute and storage, ensures there is no vendor lock-in and provides interoperability with the ever-changing ecosystem of tools and platforms.

The Journey of a Data Use Case


A data use case goes through three broad phases: exploration, engineering and operationalization. The challenges around data access vary for each of these phases. Here’s an example.

◉ Exploration. James is a data analyst who wants to create a live report that shows revenue trends by product, region and week. As James starts his use case, he needs quick access to data across multiple systems to explore and experiment. This involves reaching out to the admins of each of the data sources to get individual access, which is often time-consuming. Once he receives access, there is no easy way to discover data relevant to his use case.
◉ Engineering. Once James knows exactly what data to use and the processing required, Sally from the data engineering team creates data pipelines to ingest this data into a data lake/lakehouse, apply transformations and tune the workflow for high performance. The report is then ready to be published.
◉ Operationalization. Once the report is published, Lee from the operations team ensures the data pipelines work reliably to refresh the data regularly, thus providing the latest insights to the business.

As you can see, activating a single use case often involves multiple user personas, environments, tools and outcomes. A modern data stack should embrace these interplays of people, process and technology and enable the most efficient path to access data and convert it into insights.

The Dell Technologies and Starburst Partnership


Together, Dell and Starburst bring you a Multicloud Data Analytics solution with the following key capabilities:

◉ Built on the highly popular open-source federated query engine Trino, Starburst simplifies data discovery and accelerates data access by enabling in-situ consumption of data wherever it resides: on-prem, in the cloud or at the edge.
◉ With more than 50 data source connectors supported, you can connect to your operational systems, data warehouses and data lakes, and run lightning-fast queries to get instant insights.
◉ When required, data from remote sources can be persisted into a data lake/lakehouse to overcome network/system latency and deliver production-grade SLAs. Dell Elastic Cloud Storage (ECS) provides best-of-breed object storage for a modern data lake/lakehouse.
◉ The solution provides a single point of access to data with unified security and governance to ensure the right authorization and access control.
◉ You can create and publish data products, which can be organized by domains – thus supporting a data mesh architecture.
◉ Power and scale enterprise-wide analytics and AI/ML experiences with industry-best Dell PowerEdge servers.
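
To make the idea of in-situ, federated querying from the list above more concrete, here is a minimal sketch. It does not use Trino or Starburst. As a stand-in, it uses Python's built-in sqlite3 module and SQLite's ATTACH DATABASE to join two separate databases in one query without first copying data between them. The names orders_db, ref_db, orders and products, and all sample rows, are hypothetical and only loosely mirror the revenue-by-product, region and week report from James's use case.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Return a connection with two independent in-memory databases attached.

    Assumption for illustration only: 'orders_db' plays the role of an
    operational system and 'ref_db' the role of a warehouse.
    """
    conn = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates a new, separate in-memory database.
    conn.execute("ATTACH DATABASE ':memory:' AS orders_db")
    conn.execute("ATTACH DATABASE ':memory:' AS ref_db")
    conn.execute(
        "CREATE TABLE orders_db.orders "
        "(product_id INTEGER, region TEXT, week TEXT, amount REAL)"
    )
    conn.execute("CREATE TABLE ref_db.products (product_id INTEGER, name TEXT)")
    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO orders_db.orders VALUES (?, ?, ?, ?)",
        [
            (1, "EMEA", "2023-W19", 100.0),
            (1, "EMEA", "2023-W19", 50.0),
            (2, "APAC", "2023-W19", 75.0),
            (1, "EMEA", "2023-W20", 20.0),
        ],
    )
    conn.executemany(
        "INSERT INTO ref_db.products VALUES (?, ?)",
        [(1, "server"), (2, "storage")],
    )
    return conn


def revenue_report(conn: sqlite3.Connection) -> list:
    """Join across both databases and total revenue by product, region, week."""
    query = """
        SELECT p.name, o.region, o.week, SUM(o.amount) AS revenue
        FROM orders_db.orders AS o
        JOIN ref_db.products AS p ON p.product_id = o.product_id
        GROUP BY p.name, o.region, o.week
        ORDER BY p.name, o.region, o.week
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in revenue_report(build_sources()):
        print(row)
```

In a real federated engine the two sources would typically be different systems reached through connectors, and the query would address them through the engine's own naming scheme. The sketch only shows the general shape: one SQL statement, several sources, no intermediate copy.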

Source: dell.com
