Features of the end-to-end data engineering platform
Data sourcing is the process of getting data from its sources. Digazu provides a range of data connectors for the most common data sources. Its “read once, use multiple times” approach guarantees a full history while placing minimal load on the operational systems: once the data has been read from the source systems, it is persisted in Digazu’s internal storage, so later uses do not need to go back to the source.
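The “read once, use multiple times” idea can be illustrated with a minimal sketch. This is not Digazu's implementation or API: `PersistedLog`, `ingest`, `replay`, and `read_orders` are hypothetical names, and an in-memory list stands in for the internal storage.

```python
from typing import Callable, Dict, Iterable, List


class PersistedLog:
    """Hypothetical append-only store standing in for internal storage."""

    def __init__(self) -> None:
        self._records: List[Dict] = []

    def ingest(self, read_source: Callable[[], Iterable[Dict]]) -> int:
        """Read the source once and persist every record; return the count."""
        before = len(self._records)
        self._records.extend(read_source())
        return len(self._records) - before

    def replay(self, from_offset: int = 0) -> List[Dict]:
        """Serve consumers from the persisted copy, never from the source."""
        return list(self._records[from_offset:])


source_reads = 0  # counts how often the operational system is queried


def read_orders() -> Iterable[Dict]:
    """Hypothetical operational source (assumption for illustration)."""
    global source_reads
    source_reads += 1
    return [{"order_id": 1, "amount": 40.0}, {"order_id": 2, "amount": 15.5}]


log = PersistedLog()
ingested = log.ingest(read_orders)

# Two independent downstream uses, both fed from the persisted history.
reporting_view = log.replay()
ml_features = [record["amount"] for record in log.replay()]
```

Both downstream uses see the full history, while the operational source is queried a single time.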
The data available through Digazu is production-grade and real-time. This avoids the common pitfall of drawing data from systems that were not designed for that level of usage (data warehouses in particular, and in many cases data lakes as well). Working cleanly from the start avoids bottlenecks and technical debt.
Digazu brings heterogeneous data sources together in a common representation. This makes it possible to combine these sources and to transform the data simply: visual transformations cover most cases, and SQL queries handle the more complex ones.
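As a rough illustration of why a common representation helps, the sketch below loads a CSV source and a JSON source into the same tabular form and combines them with one SQL query. It uses Python's standard `sqlite3` module as a stand-in; the table names, columns, and sample data are assumptions made for the example and say nothing about how Digazu stores or queries data.

```python
import csv
import io
import json
import sqlite3

# Two heterogeneous sources (sample data invented for this example).
CUSTOMERS_CSV = "customer_id,name\n1,Alice\n2,Bob\n"
ORDERS_JSON = (
    '[{"customer_id": 1, "amount": 40.0},'
    ' {"customer_id": 1, "amount": 10.0},'
    ' {"customer_id": 2, "amount": 15.5}]'
)

# Common representation: plain relational tables in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")

customers = [
    (int(row["customer_id"]), row["name"])
    for row in csv.DictReader(io.StringIO(CUSTOMERS_CSV))
]
orders = [(o["customer_id"], o["amount"]) for o in json.loads(ORDERS_JSON)]
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)

# Once both sources share a representation, combining them is a single query.
totals = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```

The format-specific work is confined to the loading step; everything after it is ordinary SQL over uniform tables.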
This low-code approach means that all the data engineering can be done with very basic data skills, so organizations can focus their scarce data resources on the high-value downstream activities (machine learning, real-time reporting, and so on).
These flexible, iterative capabilities are particularly relevant for data science, where in many cases you do not know upfront what data your project will need.
Digazu maintains end-to-end lineage of the data to address governance and data privacy concerns. It also provides GDPR-related features such as the right to be forgotten and the ability to flush data that should no longer be used.
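To see how lineage can support a right-to-be-forgotten request, consider the following sketch. It assumes a hypothetical lineage graph held as a dictionary and in-memory datasets keyed by a `subject_id` field; none of these names come from Digazu, and a real platform would also have to handle aggregated or anonymized derivatives.

```python
from typing import Dict, List, Set

# Hypothetical lineage graph: dataset name -> datasets derived from it.
downstream: Dict[str, List[str]] = {
    "crm_customers": ["customer_360"],
    "customer_360": ["churn_features"],
    "churn_features": [],
}

# Hypothetical dataset contents, one record per data subject.
datasets: Dict[str, List[dict]] = {
    "crm_customers": [{"subject_id": "u1"}, {"subject_id": "u2"}],
    "customer_360": [{"subject_id": "u1"}, {"subject_id": "u2"}],
    "churn_features": [{"subject_id": "u1"}, {"subject_id": "u2"}],
}


def affected_datasets(root: str) -> Set[str]:
    """Walk the lineage graph to find the root and everything derived from it."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(downstream.get(name, []))
    return seen


def forget(subject_id: str, root: str) -> Set[str]:
    """Remove a subject's records from the root dataset and all its descendants."""
    touched = affected_datasets(root)
    for name in touched:
        datasets[name] = [
            record for record in datasets[name]
            if record["subject_id"] != subject_id
        ]
    return touched


touched = forget("u1", "crm_customers")
```

Without the lineage graph, the erasure would stop at the source table and leave copies of the subject's data in derived datasets.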