
Is a Unified Data and AI Platform the Answer to Successful Data Science Projects?

· 5 min read
Rajesh Parikh

Background

More and more enterprises are embracing data science as a function and a capability. But the reality is that many of them have not been able to consistently derive business value from their investments in big data, artificial intelligence, and machine learning. A surprising percentage of businesses fail to obtain meaningful ROI from their data science projects, and numerous articles have been written on the failure rate, its root causes, and how to improve the success of such projects.

info

A few statistics on Data Science Project Failures

  • Failure rate of 85% was reported by a Gartner Inc. analyst back in 2017.

  • 87% was reported by VentureBeat in 2019.

  • 85.4% was reported by Forbes in 2020.

The dichotomy is that enterprises are witnessing these outcomes despite the breakthroughs in data science and machine learning, the many excellent articles and videos sharing experiences, and the innumerable open source and commercial libraries and tools.


Moreover, evidence suggests that the gap is widening between organizations successfully gaining value from data science and those struggling to do so. So what are the top reasons preventing data science projects from succeeding? The reasons for failure can be broadly categorized as below:

Project Planning & Costs

  • Lack of a clearly articulated business problem and documentation of it.

  • Lack of upfront articulation of the business value/outcome expected from the project, and therefore of prioritization.

  • Lack of stakeholder involvement, and of a communication plan with them, right from the beginning.

  • Unstated/undefined deployment planning as part of project planning.

  • Choosing the wrong use case.

  • Cost of experimentation is often prohibitive and inhibits ROI.

People

  • Data Science & ML Skill Shortage.

  • Data scientists are often not trained in design patterns the way programmers are, which leads to sub-optimal, unperformant, and short-lived modeling code.

  • Data scientists are often interested in exploration and experimentation, and shy away from productionization efforts.

  • Lack of cognizance that data science model training and deployment follows all the processes of a software project: deployment, versioning, testing, and iterations to fix quality. The team often does not include experts or trained staff who understand the development, testing, and CI/CD pipeline.

  • Lack or absence of data culture and maturity within the organisation.

Data Management & Data Quality Process/Tools

  • Siloed data in different repositories, with no clear plan for how this will work across successive iterations.

  • Insufficient or unavailable data.

  • Poor data quality.

  • Unregulated/unnoticed changes to schema and data distributions.

Modelling

  • Model training/experimentation is often done outside of the production environment, which leads to completely redoing model training once software engineers take it over for deployment.

  • Lack of a feedback loop from model deployment back to the model learning phase leads to deterioration.

  • Interpretability of the model is compromised for model accuracy.

  • Model trustability with business stakeholders, and often an unarticulated fear of a negative impact of the model on the business. Incidents like Zillow's substantiate the potential damage that a model can do.

  • Lack of trust in, or apprehension (founded or unfounded) about, the model among business stakeholders often leads to the model never making it to deployment.

  • Lack of a process for saving historical model artifacts and the reasons for changes over time leads to poor auditability and lends itself to lack of trust.
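The auditability gap in the last point can be addressed with surprisingly little machinery. As a minimal sketch (all names hypothetical, using only the Python standard library), each model revision can be stored alongside an audit record capturing its content hash, timestamp, the reason for the change, and offline metrics:

```python
import hashlib
import json
import time
from pathlib import Path

def register_model(registry_dir, name, artifact_bytes, reason, metrics):
    """Save a model artifact plus an audit record (hash, timestamp, reason)."""
    registry = Path(registry_dir) / name
    registry.mkdir(parents=True, exist_ok=True)
    version = len(list(registry.glob("v*"))) + 1  # next revision number
    vdir = registry / f"v{version}"
    vdir.mkdir()
    (vdir / "model.bin").write_bytes(artifact_bytes)
    record = {
        "version": version,
        "sha256": hashlib.sha256(artifact_bytes).hexdigest(),
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "reason": reason,    # why this revision exists, for the audit trail
        "metrics": metrics,  # offline evaluation metrics at training time
    }
    (vdir / "record.json").write_text(json.dumps(record, indent=2))
    return record
```

Even this simple trail lets a reviewer answer "which model was live, why was it changed, and how did it score" long after the fact, which is exactly the trust question business stakeholders tend to raise.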

Communication

  • Lack of coordination between business and data science teams on results/outcomes/changes.

Deployment

  • No real-time auditing and logging of model results in actual deployment.

  • No checks and bounds for data and concept drift, and no feeding of production performance back to the data science team and business stakeholders.

  • Integration with online transaction systems and applications, which often form the consumption layer, is frequently not planned. This leads to poor adoption of models.
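One widely used check for the data-drift point above is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time against its distribution in production. A minimal, standard-library-only sketch (thresholds are the common rule of thumb, not a universal standard):

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference (training) sample and a
    production sample. Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift,
    > 0.25 significant drift."""
    lo, hi = min(expected), max(expected)

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / (hi - lo) * bins) if hi > lo else 0
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range production values
            counts[idx] += 1
        # small additive smoothing so empty bins don't blow up the log
        return [(c + 1e-6) / (len(values) + bins * 1e-6) for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Running this per feature on each scoring batch, and alerting the data science team when the index crosses a threshold, is the kind of "checks and bounds" loop the bullet above describes.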

While there are myriad hurdles between a model succeeding through deployment and the longevity of that deployment over the course of production usage, at Cynepia we believe that a unified data and AI automation platform, combined with no-code data science and continuous productionization, can significantly improve the chances of success for modeling use cases by addressing many of the challenges above.

Solution

An end-to-end no code/low code data science platform brings significant advantages, as listed below, and can address many of the data science pitfalls listed above. A unified platform acts as a single hub for all your data, models, and stakeholders, ensuring that communication between the business and data science teams is near real-time, both during the project execution and model monitoring phases.

Integrated Data Catalog and Data Pipelines ensure that data schema changes are notified to, and always visible to, the data science team, so they can understand whether there are any upstream data quality changes.
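The core of such schema-change notification is a diff between successive schema snapshots. As an illustrative sketch (the function and the `{column: dtype}` snapshot format are hypothetical, not a specific catalog's API):

```python
def schema_diff(old, new):
    """Compare two {column: dtype} schema snapshots and report the changes
    worth notifying the data science team about."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retyped = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "type_changed": retyped}

# Example: an upstream table drops a column and changes a type
old = {"id": "int64", "amount": "float64", "country": "object"}
new = {"id": "int64", "amount": "int64", "segment": "object"}
changes = schema_diff(old, new)
```

A catalog that runs this on every pipeline refresh can raise a notification whenever any of the three lists is non-empty, before a silently changed column breaks a downstream model.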

Discovery of newer features/datasets published by data engineering team further helps create synergy on finding new useful features.

Visual Model Building helps data scientists focus on business outcomes and experimentation rather than on learning design patterns, thereby improving the longevity of the modeling effort.

Visual Data Exploration (EDA) and Model Interpretation enable faster socializing of data and model changes before deployment.

Model Catalog further ensures that model revisions are stored.

One click Model Deployment enables faster deployment of approved models to production.

Model Monitoring further helps track data and concept drift in the running phase, helping ensure models are retrained when needed.

Conclusion

An end-to-end no/low code unified data and AI platform offers a promising alternative for reducing data science project failures, both by streamlining projects from implementation through production and monitoring, and by significantly reducing the effort needed to maintain code and data over time. Bringing all stakeholders onto the same page can reduce apprehension and enhance trust among business stakeholders by enhancing collaboration.

Picking the Right Enterprise Data & AI Platform Strategy

· 8 min read
Rajesh Parikh

Background

A modern enterprise data and AI platform has become a non-negotiable need for a company of a certain size.

The central idea behind such a platform is that it serves as a central repository where all the data can be converted into knowledge, which business users can then use to deliver value across various functions.

Therefore, it is imperative that we state clearly and unambiguously that data is a first class citizen in any enterprise and therefore needs a place to thrive and grow. An enterprise data and AI platform is therefore vital to the enterprise transformation journey.

Various data processes such as data engineering, business intelligence, analytics, and machine learning further enhance the value of this data by performing a variety of operations: cleaning and preparing data, unifying data from different sources, and creating new metrics and measures. These help organisations that believe in the power of measurement to streamline their operations, perform root cause analysis, understand the drivers for a given entity, create a holistic understanding of customers, employees, and processes, and measure and improve every aspect of the enterprise.

As one finds out, the canvas of potential business value from data is huge. But enterprises have to traverse the key decision of what kind of modern enterprise data platform suits their needs. There are various ways enterprises can build a modern data and analytics platform today.

Build your own

**Should you build your data and AI platform from the ground up?**

That's potentially a multi-million dollar question, often running into tens of millions, depending on the scope of such a data project endeavour.

A decision to build your own should have a much stronger business value case, since it is coupled with long-term maintenance and support costs, and with keeping up with technological improvements over time to upgrade and modernize a home-grown platform. In addition, newer technical requirements emanating from new kinds of use cases, and a plethora of design choices, have significantly increased the complexity of building such data architectures, often making the execution of a build approach itself risky.

Hence our view is that a "build your own" data and AI platform needs to emanate from a clear, longer-term business strategy and goal, well articulated and with the differentiation spelt out. This needs to be supplemented with a well-defined execution scope, time, human capital, and cost allocated to the initiative.

At Cynepia, we believe the answer to the above question is almost an absolute 'NO' for 99.9% of enterprises, if the larger objective is to use the data and analytics infrastructure to derive operational and decision analytics value.

**Buy various point SaaS solutions and integrate**

Here again there are two main approaches:

  • **Would purchasing various commercial SaaS solutions and integrating them make more sense?**

This has been an approach on a tear for the past several years. Clubbed under the loosely defined term "Modern Data Stack", a lot has been written and discussed on this topic over the last few years. A quick and apt summary of what is clubbed under this terminology is discussed by Approva Padhi from Foundation Capital here: Modern Data Stack: Looking into the Crystal Ball.

The article also summarizes issues and areas of improvement for such a stack. At Cynepia, we believe that there are many issues with the so-called "modern data stack" approach. A few among them are as follows:

  1. Lack of a finished and consistent user experience, with multiple end-application interfaces, leading to sub-optimal user-centric design and productivity gaps.

  2. Lack of thought given to the devolution of data and analytics skills, leading to less scope for democratization.

  3. Higher human cost to keep the modern data stack up and running over time, handling integration issues and dealing with sub-optimal architectural choices made by the different SaaS tool vendors. Often this necessitates adding further layers of software applications.

  4. Cost conundrum, due to the piecemeal approach and dependence on multiple SaaS vendors to deliver the unique SLA needs of your organisation.

  • **Is leveraging data APIs and apps from cloud vendors such as AWS, Azure, or Google Cloud a better bet?**

Another way to build the data and analytics platform is to use the many tools, point services, and programmable interfaces provided by players such as AWS, Azure, and Google Cloud Platform. While the usual benefits of cloud, such as flexibility and scalability, are available here too, in addition to the above-mentioned disadvantages there are a few more:

  1. Price variation: These services are often priced by usage, and cloud bills tend to grow exponentially as data grows. This is seen as a big irritant in the adoption of cloud services beyond basic infrastructure.

  2. Governance: Data security and governance for many of these cloud platforms, tools, and applications was designed to be generic and is therefore quite cumbersome; specialists often need to be hired to keep your data flawlessly secure.

**Buy a modern unified data and AI development platform**

There is yet another class of vendors who provide a bundled data and AI development platform. These platforms are usually a set of data and AI applications built on a common architecture and user experience theme, unlike unbundled SaaS applications.

However, compared to the hype, many of these unified platforms and vendors have lost the ability to shape the market for a variety of reasons:

  • Actual promise vs reality

Often incomplete, or assembled by acquiring different SaaS companies as above and bundling their solutions, completely ignoring a user-centric design approach.

Often costly, and priced for the needs of Fortune 500/1000 companies, making them prohibitive for the larger enterprise market.

  • Missing pieces of the puzzle, which leave customers adding further layers of software applications.

Is a Modern No Code/Low Code UI Unified Platform the Future?

**Unified Low Code/No Code Enterprise Data & AI Platform**

At Cynepia, we see another category that is not just modern and futuristic, but also ensures data and AI are democratized, with the potential to reach far more users, more use cases, and the larger enterprise market. For the sake of a name, we call this category the "Low Code/No Code Enterprise Data and AI Platform".

So what are the advantages of a "Low Code/No Code Enterprise Data and AI Platform"?

First and foremost, to set our orientation right: we strongly believe in a unified data and AI development platform, because that is the best way to make analytics affordable and expand the footprint beyond Fortune 500/1000 companies. Of course, the baseline is that actual promise should match reality.

It transfers the technical debt of architecture, and of integrating and supporting a multitude of SaaS applications, to the vendor instead of to you as the customer.

We are believers in the potential of no code/low code platforms to deliver better value. Why not pro-code? This is not to suggest that pro-code platforms are a bad idea; every no-code platform rides on code, so dismissing code entirely while proposing no-code would be an oxymoron.

Here are the top reasons to go low code/no code:

It is hard to hire the best-of-breed, smartest data architects, software engineers, frontend engineers, data scientists, and project managers at the price one is willing to pay. If one could, one could potentially generate a similar or better outcome than a no code/low code platform. So that leaves you with the next grade of skilled developers, who then need to be trained and upskilled, in the hope that you end up with code output that fits your aspiration. The most likely case, however, is that you are left with more data debt than code you can call an asset.

Most enterprises are not software companies by business. We have often seen that keeping focus on building code over the very long term is hard, since quick business outcomes often supersede the long-term value of such code assets. Historical evidence suggests that the shelf life of a pro-code data pipeline is much shorter than one believes.

While a multitude of best-of-breed SaaS applications still remains an alternative choice, an enterprise focused on business value and ROI has no reason to accept the technical debt of integration, support, and stitching the architecture together being transferred to the customer rather than the vendor. There is also an overall ROI question on the whole data and analytics investment, as discussed earlier.

How do you strategize?

Data technologies are evolving quickly, making the traditional "build your own" approach risky, time-consuming, and inefficient. If you haven't already made an investment in any architecture, we strongly recommend investing in a user-centric Low Code/No Code Enterprise Data and AI Platform.

If you have already invested in one of the architectures given above and are accruing technical debt that feels unmanageable, you may want to re-strategize and rethink, investing in a data and analytics architecture that suits not just today's needs but also the future.

Get the power of a futuristic Data & AI Platform for your enterprise.