Many IT departments choose to build their own customised data platform to support their digital transformation, usually under pressure from the business lines (retail banking, payment…). They are tempted to build everything themselves, including the data lake, schemas, formats, historisation, scalability and governance processes, and to rethink how data should be acquired, stored, historised, distributed, analysed and governed within the company. While this is certainly possible, most companies underestimate the complexity of the technology underlying a modern data platform because they downplay the following implementation challenges:

Underestimating the above-mentioned challenges leads to the following results:

Some numbers from a real case illustrate the costs of the above-mentioned results. We worked for a banking group that, under pressure from its business lines (retail banking, payment), was rethinking how data should be acquired, stored, historised, distributed, analysed and governed. They launched a data hub program that defined (1) a target data architecture and (2) two initial use cases: clickstream analytics for dynamically targeted web banners, and a customer 360° view for employee applications in the branches.

Like many large banks, they thought they could cover the whole path from design to implementation themselves. They set up a team of 20 developers for a one-year project.

After one year of work, three open-source modules had been developed, but they were not good enough to be put into production. The parent company spent almost €4.5 million on this project:

Given the lack of results and the high costs, the data hub development program was abandoned. Each subsidiary had to develop its own components with a minimalist approach. The main reason for this failure was the complexity of the underlying big data technology (Hadoop ecosystem, HBase, Hive, Spark, Cassandra, Flink, etc.).

What can we learn from it?

To avoid implementation failures, consider the following advice:

How can Digazu help?

Digazu has helped many companies set up their data projects by implementing an off-the-shelf, low-code, end-to-end data platform that orchestrates and automates the collection, storage, transformation and distribution (full streaming) of their data. Data assets are reassembled properly and distributed, in a standardised and fully managed way, to the tools (reporting, data crunching, new apps) that you want to make available to data consumers. The platform will help you by:

Digazu will provide the business value that you want. You will optimise the cost of ownership of your data projects and improve the value of your investments. You will make your work environment more agile, with tools that scale with your growing needs. Get real control and visibility over your data and accelerate the journey from “data” to usable “data assets”.