Introducing Jane, Data Analyst Agent for Business Teams

· 6 min read
Cynepia Product Marketing

One of the key trends accompanying the rise of generative AI is the emergence of autonomous agents capable of performing multi-step, end-to-end tasks. Besides leveraging generative AI's ability to understand human language, these agents are equipped with tools to interact with the external environment, memory to keep track of short-term and long-term conversations and learnings, and often a planning agent and a workflow engine.

Xceed AI agents are designed to integrate smoothly with existing enterprise systems and workflows, ensuring minimal disruption during implementation. To learn more about Xceed AI Agents, refer to Xceed AI Agents

Jane, AI Data Analyst Agent for business teams

Today we are pleased to announce the first preview of Jane, a personal data analyst agent for business teams. Jane is built to leverage various Xceed Analytics data applications, such as Xceed Data Lake, Xceed SQL Query Engine, Xceed Semantic Engine and Xceed Vector Store, together with the configured LLM provider, to execute multi-step analytics tasks. Today Jane can perform one or more of the following tasks:

  1. Identify relevant datasets for a given user query.
  2. Explore and describe key technical and business metadata associated with a specific dataset.
  3. Generate summary statistics at the dataset or column level.
  4. Answer a user query about a key metric or business question and return the insight in the form of a table.
  5. Generate a visualization report from a natural language question.
  6. Interpret retrieved data and report it as a natural language summary.
  7. Jane's conversational user interface further provides rich capabilities such as: a) topic-based threads; b) asking a new question, updating an existing question, or adding a clarification to an existing question within the conversation; c) pinning the data report to an Xceed Workspace/Dashboard; d) downloading the report along with the user summary.
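As a rough illustration of task 3 above, the sketch below computes summary statistics at the dataset and column level for a small in-memory table. It is not Jane's implementation: the `orders` data, the function names and the choice of statistics are all assumptions made for the example, and only the Python standard library is used.

```python
from statistics import mean, median


def column_summary(rows, column):
    """Summarize one column of a dataset given as a list of dicts."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    # Add numeric statistics only when every present value is a number.
    if present and len(numeric) == len(present):
        summary.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean(numeric),
            median=median(numeric),
        )
    return summary


def dataset_summary(rows):
    """Dataset-level row count plus a per-column summary."""
    columns = sorted({key for row in rows for key in row})
    return {
        "rows": len(rows),
        "columns": {c: column_summary(rows, c) for c in columns},
    }


# Hypothetical sample data, for illustration only.
orders = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": None},
]
report = dataset_summary(orders)
print(report)
```

Text columns get only count, missing and distinct values, while numeric columns also get min, max, mean and median, which mirrors the dataset-level versus column-level split described in the list.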

Jane is a keen learner and adapts continuously

Jane learns from business and data analysts and continuously adapts to your enterprise context, which helps her become reliable and trustworthy.

Business users can tune Jane's responses for specific enterprise contexts by providing trusted examples. A new tab, Trusted Examples, is now part of the dataset details view in Xceed Data Catalog. Business analysts and data analysts can add trusted examples of user questions together with the appropriate semantic query and SQL query. Jane continuously learns from the provided examples and draws on the relevant ones when responding to business user queries, improving the accuracy and reliability of her answers.
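One common way to make use of trusted examples is to retrieve the examples most similar to the incoming question and place them in the prompt sent to the LLM. The sketch below shows that idea under stated assumptions: it uses simple word overlap (Jaccard similarity) as a stand-in for whatever similarity measure Jane actually uses, and the table names, questions, SQL and prompt wording are all hypothetical.

```python
import re


def tokens(text):
    """Lowercase word set used for a crude similarity measure."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(question, trusted_examples, k=2):
    """Return the k trusted examples whose questions best match the query."""
    q = tokens(question)
    ranked = sorted(
        trusted_examples,
        key=lambda ex: jaccard(q, tokens(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, examples):
    """Assemble a few-shot prompt from the selected trusted examples."""
    parts = ["Answer with a SQL query. Trusted examples:"]
    for ex in examples:
        parts.append(f"Q: {ex['question']}\nSQL: {ex['sql']}")
    parts.append(f"Q: {question}\nSQL:")
    return "\n\n".join(parts)


# Hypothetical trusted examples, as an analyst might enter them.
trusted = [
    {"question": "What is total revenue by region?",
     "sql": "SELECT region, SUM(revenue) FROM sales GROUP BY region"},
    {"question": "How many active customers do we have?",
     "sql": "SELECT COUNT(*) FROM customers WHERE active"},
    {"question": "List the top products by units sold",
     "sql": "SELECT product, SUM(units) FROM sales GROUP BY product ORDER BY 2 DESC"},
]

question = "Show total revenue by region"
chosen = select_examples(question, trusted, k=1)
prompt = build_prompt(question, chosen)
print(prompt)
```

A production system would more likely compare embeddings than raw word overlap, but the flow is the same: rank the trusted examples against the question, keep the best few, and let them guide the generated query.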

Availability

Jane, the Xceed AI Data Analyst agent, is now available in private preview.

About Xceed Analytics

Xceed Analytics is an AI-powered, comprehensive intelligent data platform for the enterprise that unifies all your data, analytics, models and AI use cases and products in a single view. A comprehensive data and analytics platform is vital to the success of a business transformation journey as we ride the new wave of generative AI and take advantage of it.

Problem that an Intelligent Data Platform addresses

The emergence of machine learning and AI, alongside the fast pace of digitalization and the explosive growth of data that enterprises are experiencing, has made them realize that an effective approach to managing and harnessing the power of data and AI can create significant competitive advantage. The landscape for data and AI has been constantly evolving over the past decade to address the challenge and opportunity of managing and harnessing the data that enterprises are inundated with. Modernizing the data and AI platform has been a constant throughout the past decade.

The fragmented toolchain and siloed data within enterprises are formidable barriers that hinder the full harnessing of their data assets. When various departments and teams rely on disparate tools and systems that don't communicate effectively, it leads to inefficiencies, duplicated efforts, and a lack of a unified view of the data. Siloed data exacerbates this problem by isolating valuable information within these disparate systems, preventing cross-functional collaboration and inhibiting data-driven decision-making. The result is a missed opportunity for enterprises to extract valuable insights, achieve operational excellence, and remain agile in an increasingly data-driven world.

An intelligent data platform helps organizations unlock the true potential of their data assets and drive AI innovation across use cases.

Benefits of an Intelligent Data Platform

There are numerous benefits of a comprehensive end-to-end Intelligent Data Platform:

  1. Central repository for all the data, workflows and models.

  2. Seamlessly discover, manage data quality and govern all your data products/artifacts through a single pane.

  3. Remove data silos and keep every stakeholder engaged and notified.

  4. Accelerate deriving value from your most valuable asset, which is data.

  5. Cut and optimize costs with a no-integration stack. You no longer need to stitch together individual services from multiple vendors.

  6. The simplicity of the overall architecture helps streamline the overall data and analytics process.

Technical Capabilities

Some of the key data tools included in the Xceed Data and Analytics Platform are:

  1. Versioned, governed and fully integrated data lake based on open standards such as Apache Parquet.

  2. Unified abstraction for all data producers. Supports multiple OLAP and compute engines:

    • DuckDB, Apache Spark, Pandas, Ray
  3. All common access methods supported. Access, configure and monitor with your preferred access method:

    • SQL, DataFrame, CLI or Python SDK
  4. No-code data integration. Supports most common databases, cloud storage services and SaaS applications.

  5. Integrated data catalog with extensive data discovery, governance and data quality test features.

  6. Xceed SQL Workbench enables analysts to carry out exploratory analysis via a visual interface. Supported engines include DuckDB, Apache Drill and Apache Spark.

  7. Xceed Workflows provides a no/low-code interface for data transformation pipelines. Supported engines include Apache Spark, DuckDB and Apache Drill for SQL, and Pandas and PySpark for dataframes.

  8. Xceed AutoML enables onboarding everyday ML use cases across classification, regression and forecasting.

  9. Xceed Business Intelligence & Reporting provides all common dashboarding features to build beautiful data stories/dashboards.

  10. Xceed Notifications ensures all stakeholders are notified.

  11. Xceed Model Catalog is home to all ML and AI models.

  12. Xceed Python SDK/CLI lets data users work via Xceed APIs and a command-line interface, as an alternative to the user interface, for interacting with Xceed Analytics.

  13. Microservices architecture enables scalability while providing seamless integration.

For more details on the Xceed Analytics architecture, refer to Our Architecture Page

About Cynepia Technologies

Cynepia Technologies provides a comprehensive end-to-end data stack to help enterprises organize, connect and make sense of their data, stay connected with their insights, make faster, real-time decisions and ultimately grow their business.

To learn more about Cynepia and Xceed Analytics, visit our website

For demo or product inquiry, write to us at Product Marketing


Announcing NLP and LLM Model support

· 8 min read
Cynepia Product Marketing

Xceed Model Catalog

At Cynepia, we continue to accelerate infusing an intelligence layer into Xceed Analytics and bringing the power of AI to the enterprise data and analytics platform story. We are excited to announce the following features for Xceed Analytics, the Intelligent Data Platform for the enterprise.

  1. Support for large language models in Xceed Model Catalog. Data scientists and AI engineers can now import popular text-based language models from Hugging Face.

  2. Xceed Playground, for debugging and testing models before deploying them in production. Data scientists and AI engineers can now debug models and compare different text models with their specific prompts. Xceed Playground also supports debugging and testing classical ML models.

  3. Xceed Deployment and Model Serving Endpoints further enable users to deploy LLM/text models on a public or private cloud instance.

  4. API support (REST/Python) makes it easy to import OSS models from Hugging Face, register them with Xceed ML Catalog and deploy them programmatically.

  5. Xceed CLI (xacli) further supports using a text-based command-line interface to manage, govern and deploy models.

All of the new features inherit the governance and collaboration features that are already part of Xceed ML Catalog.

As part of this series of AI-related announcements, last quarter we announced support for the following features.

  1. Xceed AI Assistant, your data co-pilot, covers various data use cases including text-to-SQL and auto-generation of table and column metadata, with many more in the pipeline, boosting the productivity of data engineers and analysts.

  2. Xceed Advance Intelligent Search combines vector search (using a powerful OSS embedding model) with full-text search, significantly enhancing search relevance. This, along with a unified control plane, ensures instant access to all your data and model assets. Xceed AI Search allows users to instantly discover and search across the entire data estate, including data connectors, datasets in the data catalog, the model catalog, transformation workflows, SQL models and dashboards.

About Xceed Model Catalog

Xceed Model Catalog now supports accessing, testing, governing and serving text-based models, including open-source LLMs of all sizes (1B to 70B parameters). This is in addition to classical ML models created using Xceed AutoML (classification, regression and forecasting).

Deploy models through model catalog

Xceed Analytics is uniquely positioned to enhance the AI experience through our new Model Catalog and Playground additions, thanks to our unified approach to building an intelligent data platform. This capability aids in democratizing access to enterprise AI adoption while ensuring role-based governance and access.

About Xceed Playground

Xceed Model Playground

You can interact with both classical ML models and text-based large language models using the Xceed Playground. The Xceed Playground enables data scientists and AI engineers to test and compare models right from within the Xceed Analytics workspace.
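To make the idea of comparing models on the same prompts concrete, here is a minimal, generic comparison harness. It is not the Xceed Playground or its API: the two "models" are plain Python functions standing in for real model endpoints, and every name in the block is an assumption made for the example.

```python
import time


def compare_models(models, prompts):
    """Run every prompt through every model, recording output and latency."""
    results = []
    for prompt in prompts:
        for name, generate in models.items():
            start = time.perf_counter()
            output = generate(prompt)
            elapsed = time.perf_counter() - start
            results.append({
                "model": name,
                "prompt": prompt,
                "output": output,
                "seconds": elapsed,
            })
    return results


# Stand-in "models": simple functions, not real LLM endpoints.
models = {
    "echo-upper": lambda p: p.upper(),
    "echo-short": lambda p: p.split()[0],
}

results = compare_models(
    models,
    ["summarize quarterly sales", "classify this ticket"],
)
for row in results:
    print(f"{row['model']:<11} | {row['prompt']:<26} | {row['output']}")
```

Keeping each model behind a plain callable means a real endpoint client could be dropped in later without changing the comparison loop.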

Xceed Analytics is uniquely positioned to enhance the AI experience through our new Model Catalog and Playground additions, which build on our unified approach to an intelligent data platform. This capability helps democratize access to enterprise AI models and accelerate AI deployment while ensuring role-based governance and access.

Availability

Large language model support in Xceed Model Catalog and Xceed Playground is now available in private preview.


Announcing Xceed AI Search and Discovery Capability

· 7 min read
Cynepia Product Marketing

Today we are announcing the Xceed Smart AI Search capability, which brings next-generation AI search to Xceed Analytics, a comprehensive unified data and AI platform for the enterprise. By launching this capability, we are delivering one more milestone on our promise of making Xceed Analytics a unique, intelligent end-to-end data and AI platform.

Amidst all the hype around LLMs and their applications in the enterprise, at Cynepia we are bringing the power of AI to Xceed Analytics in many ways. A couple of weeks ago we announced Xceed AI Assistant, a comprehensive AI assistant across your data use cases. Today I am pleased to announce advanced AI search, which combines vector search (using a powerful open-source embedding model) with full-text search to significantly improve search relevance. This, coupled with a unified control plane, ensures that you can access all your data and model assets instantly. Xceed AI Search allows users to instantly discover and search across the entire data estate, including data connectors, datasets in the data catalog, the model registry, transformation workflows, SQL models and dashboards.

Data users can now instantly discover all their data assets from an enhanced application search interface and save time finding things. For example, for a data scientist or ML engineer who is building a new model and wants to find out whether existing tables or feature tables may be relevant to the requirement, Xceed AI Search is the place to start: it leads quickly to the tables/datasets or existing feature tables in the catalog. Likewise, a business user searching for relevant dashboards linked to a given data asset can hop over to Xceed Search and find the relevant tables/dashboards using the full-text semantic search capability.

Vector search uses embeddings produced by language models to perform semantic retrieval of knowledge, thereby significantly improving the relevance of search results for the end user. This is useful when you are interested in results based on the meaning and context of the search text. It leverages natural language processing and artificial intelligence to interpret the nuances of language and retrieve results that match the user's intent. This capability goes far beyond traditional keyword-based search, enabling users to discover relevant information even when they don't have precise search terms in mind: they can describe what they are looking for in natural language, with only an approximate meaning, and still find the relevant data. However, results from vector search alone are often not optimal, because of terms that are specific to a domain. Xceed AI Search combines the benefits of vector search and full-text search, ensuring that the most relevant results are returned to the user.
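The post does not say how the vector and full-text results are combined, so the sketch below uses reciprocal rank fusion (RRF), one widely used way to merge two rankings, purely as an illustration. The asset names, the tiny hand-written vectors and the query vector are all assumptions; in a real system the vectors would come from an embedding model and the text scores from a full-text index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Count how often the query terms occur in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def ranks(scores):
    """Map each id to its 1-based rank, best score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {doc_id: pos for pos, doc_id in enumerate(ordered, start=1)}


def hybrid_search(query, query_vector, assets, k=60):
    """Fuse vector and full-text rankings with reciprocal rank fusion."""
    vector_ranks = ranks(
        {a["id"]: cosine(query_vector, a["vector"]) for a in assets}
    )
    text_ranks = ranks(
        {a["id"]: keyword_score(query, a["text"]) for a in assets}
    )
    fused = {
        a["id"]: 1 / (k + vector_ranks[a["id"]]) + 1 / (k + text_ranks[a["id"]])
        for a in assets
    }
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical catalog assets with toy 3-dimensional "embeddings".
assets = [
    {"id": "churn_features", "text": "customer churn feature table",
     "vector": [0.9, 0.1, 0.0]},
    {"id": "sales_dashboard", "text": "quarterly sales dashboard",
     "vector": [0.1, 0.9, 0.1]},
    {"id": "attrition_model", "text": "customer attrition model",
     "vector": [0.8, 0.2, 0.1]},
]

results = hybrid_search("customer churn", [1.0, 0.0, 0.0], assets)
print(results)
```

Here the attrition model ranks second even though it never mentions "churn", because its vector is close to the query, while the exact keyword match keeps the churn feature table on top.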

Xceed AI Search & Discovery

Boost analyst, data engineer and data scientist productivity with Xceed AI Search and Discovery

Xceed Analytics is uniquely positioned to improve the AI experience with our new AI Search and Discovery capability, given our unified approach to the enterprise data and AI platform. It helps democratize access to enterprise data while ensuring role-based governance and access.

Availability

The Xceed AI Search and Discovery feature is currently available in public preview.

About Xceed Analytics

Xceed Analytics is an AI-powered, comprehensive enterprise data platform that unifies all your data, analytics and AI use cases and products under a single platform. A comprehensive data and analytics platform is vital to the success of a business transformation journey as we ride the new wave of artificial intelligence and take advantage of this promising new technology.

Problem that a Comprehensive Data & AI Platform addresses

The emergence of machine learning and AI, alongside the fast pace of digitalization and the explosive growth of data that enterprises are experiencing, has made them realize that an effective approach to managing and harnessing the power of data and AI can create significant competitive advantage. The landscape for data and AI has been constantly evolving over the past decade to address the challenge and opportunity of managing and harnessing the data that enterprises are inundated with. Modernizing the data and AI platform has been a constant throughout the past decade.

The fragmented toolchain and siloed data within enterprises are formidable barriers that hinder the full harnessing of their data assets. When various departments and teams rely on disparate tools and systems that don't communicate effectively, it leads to inefficiencies, duplicated efforts, and a lack of a unified view of the data. Siloed data exacerbates this problem by isolating valuable information within these disparate systems, preventing cross-functional collaboration and inhibiting data-driven decision-making. The result is a missed opportunity for enterprises to extract valuable insights, achieve operational excellence, and remain agile in an increasingly data-driven world.

Breaking down these silos and streamlining the toolchain with a comprehensive data and AI platform is essential for organizations to unlock the true potential of their data assets and drive innovation.


Introducing Data Quality Monitor

· 7 min read
Cynepia Product Marketing

Background

In the era of language models and advanced artificial intelligence applications, the need for reliable and accurate data has never been greater. A comprehensive data and analytics platform has become a non-negotiable need for a company of a certain size to achieve the goals and benefits of these formidable new capabilities. The inability to access data and metadata seamlessly through a single pane is a major source of frustration in carrying out data-driven digital transformation. Cobbled-together point solutions, often sold as best of breed, have only added to the challenges of integrating these solutions within one's data platform architecture. A comprehensive data and analytics platform is therefore one of the key elements of success with data-driven digital transformation.

Problem

In addition to platform challenges, data teams face a variety of challenges in ensuring the quality of the data products they build and make available to downstream users throughout the life cycle of the individual data assets/products. These data assets are often accumulated from hundreds of upstream sources, including source databases, SaaS systems accessed via APIs, cloud storage and more. The dynamic nature of the data itself, along with its movement across a variety of systems, has made troubleshooting data issues almost impossible, leading to longer downtimes, frustrated data teams and loss of trust in data products.

One of the key challenges in troubleshooting such issues is the lack of visibility into data changes, which are often caused by upstream changes at source systems or somewhere along the transformation journey. Effective visibility means a baseline reference profile for every data asset and a mechanism to run specific data tests (both syntactic and semantic) on new incoming data, so that data teams can react faster to an impending issue.

Solution

Today we are introducing Xceed Dataset Monitors, right within Xceed Data Catalog, to help data teams regain control over their data challenges. Data engineers can now set data quality monitors for all incoming data and ensure that the necessary checks/tests are carried out every time new data arrives. Data teams can create monitors using an easy-to-use GUI right from within the dataset details page. They can also create multiple suites, one for each downstream data product impacted (for example, dashboards created by a downstream analyst, or data used by a downstream data science team for an ML model).

Real-time monitoring and keeping all the stakeholders informed reduces downtime in the event of upstream changes and ensures that trust in end data products is never broken.

The Data Quality Monitoring Dashboard enables data teams to track trends over time at both the dataset level and the individual test level. This helps spot repeatedly unreliable tables/columns over time, helping stakeholder teams prioritize and take effective action to improve overall quality.
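The kind of trend analysis described above can be sketched in a few lines: given a history of monitoring runs, compute the pass rate of each run and list the monitors that fail repeatedly. The run history format, the monitor names and the failure threshold below are assumptions made for illustration, not the dashboard's actual data model.

```python
from collections import Counter


def pass_rate(run):
    """Fraction of monitors that passed in a single run."""
    results = run["results"]
    return sum(1 for r in results if r["passed"]) / len(results)


def repeat_offenders(history, min_failures=2):
    """Monitors that failed in at least min_failures runs."""
    failures = Counter(
        r["monitor"]
        for run in history
        for r in run["results"]
        if not r["passed"]
    )
    return sorted(name for name, n in failures.items() if n >= min_failures)


# Hypothetical run history for one monitoring suite.
history = [
    {"run": 1, "results": [
        {"monitor": "amount_not_null", "passed": True},
        {"monitor": "amount_range", "passed": False},
    ]},
    {"run": 2, "results": [
        {"monitor": "amount_not_null", "passed": True},
        {"monitor": "amount_range", "passed": False},
    ]},
    {"run": 3, "results": [
        {"monitor": "amount_not_null", "passed": True},
        {"monitor": "amount_range", "passed": True},
    ]},
]

trend = [pass_rate(run) for run in history]
offenders = repeat_offenders(history)
print(trend, offenders)
```

The per-run pass rate gives the suite-level trend, while the failure counts per monitor point to the specific columns or tests that deserve attention first.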

In summary, some of the key benefits of our approach to data observability/monitoring are as follows:

  1. Checks run inline with data arrival, which is critical to reducing actual downtime.

  2. A no-code interface, dropped in right within the data catalog, lowers the bar to add or modify data quality tests/monitoring rules.

  3. The integrated approach ensures you don't need another out-of-band data observability or monitoring tool.

  4. A single interface brings all data users together. Keep everyone informed in real time as data is refreshed.

  5. A 360-degree view of all data artifacts and operations from within a single application interface. Data teams can now monitor datasets/columns with consistent issues.

Key Features

  1. Cynepia Data Quality Monitors are engine independent and work with all the supported engines, including Spark and Pandas.

  2. Leverages the existing data profile for the dataset, thereby optimizing compute usage.

  3. Support for an exhaustive list of monitor rules at both the dataset level and the column level.

  4. Support for multiple notification channels, including in-app notifications, Slack and email.

  5. Run History with data quality metric trends, for monitoring trends at the overall suite level and at the individual monitor/test level.
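To illustrate what dataset-level and column-level monitor rules might look like, the sketch below defines three simple rules and runs them as a suite against a batch of incoming rows. This is a plain-Python illustration under stated assumptions, not the Xceed monitor engine: the rule names, the suite structure and the sample batch are all hypothetical.

```python
def row_count_at_least(minimum):
    """Dataset-level rule: the batch must contain at least `minimum` rows."""
    def check(rows):
        return len(rows) >= minimum, f"row count {len(rows)} (min {minimum})"
    return check


def column_not_null(column):
    """Column-level rule: no missing values allowed."""
    def check(rows):
        nulls = sum(1 for r in rows if r.get(column) is None)
        return nulls == 0, f"{column}: {nulls} null value(s)"
    return check


def column_in_range(column, low, high):
    """Column-level rule: non-null values must lie within [low, high]."""
    def check(rows):
        bad = sum(
            1 for r in rows
            if r.get(column) is not None and not low <= r[column] <= high
        )
        return bad == 0, f"{column}: {bad} value(s) outside [{low}, {high}]"
    return check


def run_suite(monitors, rows):
    """Run every monitor in the suite against a batch of incoming rows."""
    results = []
    for name, check in monitors.items():
        passed, detail = check(rows)
        results.append({"monitor": name, "passed": passed, "detail": detail})
    return results


# Hypothetical suite and incoming batch.
suite = {
    "min_rows": row_count_at_least(3),
    "amount_not_null": column_not_null("amount"),
    "amount_range": column_in_range("amount", 0, 10000),
}
batch = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": -4.0},
]

results = run_suite(suite, batch)
for r in results:
    status = "PASS" if r["passed"] else "FAIL"
    print(f"{status} {r['monitor']}: {r['detail']}")
```

Because each rule is just a function over rows, the same suite could be evaluated on any engine that can hand back the rows or the aggregates the rule needs.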

How It Works

To create a data quality monitoring suite, you first need to define a monitoring suite from the dataset details page in your Data Catalog. Defining a monitoring suite for a dataset is a three-step process, as shown below:

  1. Create a new monitoring suite

Create a New Monitoring Suite

  2. Add individual monitors/tests to the suite

Add Tests to the suite

  3. Add a list of Slack channels/users to notify on every run

Add channels/users to Notify

Click Finish to create the new monitoring suite. You have now successfully created a data quality monitoring suite. You can trigger a fresh run manually from the Existing tab.

Run Tests

Once the run is completed, results are now available via the Run History tab as seen below:

Run History

About Xceed Analytics

Xceed Analytics is an AI powered comprehensive enterprise data platform unifies all your data, analytics and AI use cases and products under a single unified platform. A comprehensive data and analytics Platform is therefore vital to success of business transformation journey as we ride the new wave of Artificial Intelligence and take advantages of this new promising technology in the transformation journey.

Benefits of a Comprehensive Data & AI Platform

There are enumerous benefits of a comprehensive end-to-end Data and AI Platform

  1. Central repository for all the data, workflows and models.

  2. Seamlessly Discover, Manage Data Quality and Govern all your data products/artifacts through a single pane.

  3. Remove data silos and keep every stakeholder engaged and notified.

  4. Accelerate deriving value from your most valuable asset: your data.

  5. Cut or optimize costs with a no-integration stack. You no longer need to stitch together individual services from multiple vendors.

  6. Simplicity of overall architecture helps in streamlining of the overall data and analytics process.

Technical Capabilities

Some of the key data tools included in Xceed Data and Analytics Platform include:

  1. Versioned, Governed and Fully Integrated Data Lake based on open standards such as Apache Parquet.

  2. Unified abstraction for all data producers. Supports multiple OLAP and compute engines:

    • DuckDB, Apache Spark, Pandas, Ray

  3. All common access methods supported. Access, configure and monitor with your preferred access method:

    • SQL, Dataframe, CLI or Python SDK

  4. No-code Data Integration. Supports most common databases, cloud storage services and SaaS applications.

  5. Integrated Data Catalog with Extensive Data Discovery, Governance and Data Quality Test Features.

  6. Xceed SQL Workbench enables analysts to carry out exploratory analysis via a visual interface. Supported engines include DuckDB, Apache Drill and Apache Spark.

  7. Xceed Workflows provides a No/Low Code interface for data transformation pipelines. Supported engines include Apache Spark, DuckDB and Apache Drill for SQL, and Pandas and PySpark for dataframes.

  8. Xceed AutoML enables onboarding of everyday ML use cases across Classification, Regression and Forecasting.

  9. Xceed Business Intelligence & Reporting provides all common dashboarding features to build beautiful data stories and dashboards.

  10. Xceed Notifications ensures all stakeholders are notified.

  11. Xceed Model Registry is home to all ML models.

  12. Xceed Python SDK/CLI lets data users work through Xceed APIs and a Command Line Interface, as an alternative to the user interface, when interacting with Xceed Analytics.

  13. Microservices architecture enables scalability while providing seamless integration.

For more details on the Xceed Analytics architecture, refer to Our Architecture Page

About Cynepia Technologies

Cynepia Technologies provides a comprehensive end-to-end data stack that helps enterprises organize, connect and make sense of their data, stay connected with their insights, make faster, real-time decisions and ultimately grow their business.

To learn more about Cynepia and Xceed Analytics, visit our website

For demo or product inquiry, write to us at Product Marketing


Introducing Xceed AI Assistant

· 4 min read
Cynepia Product Marketing

In the era of Language Models and Generative AI applications, Xceed AI Assistant aims to offer a comprehensive AI Assistant across all the data tasks and functionalities within Xceed Analytics.

Xceed AI Assistant cuts across all roles and tasks, be it a Business Analyst exploring datasets using SQL or creating a report, a Data Engineer updating or exploring the catalog for a given dataset, a Data Scientist/ML Engineer trying to build a new model, or a Business User who has a business question. Some of the common tasks supported in this preview and upcoming releases of Xceed AI Assistant include the following:

  • Auto-generate SQL from a business analyst's English prompt.

  • Semantic search, enabling superior natural language search to discover the most relevant, reliable data assets.

  • Ask data questions in natural language to get answers to one's business question.

  • Create a natural language summary for a given insight.

Boost analyst and data engineer productivity with Xceed AI Assistant

Xceed Analytics is uniquely positioned to improve the user experience with the AI capabilities provided by Language Models, given our unified approach to the enterprise data and AI platform. It helps democratize access to enterprise data while ensuring role-based governance and access.

Availability

The Xceed AI Assistant is currently available in private preview.


Is a Unified Data and AI Platform the Answer to the Success of Data Science Projects?

· 5 min read
Rajesh Parikh

Background

More and more enterprises are embracing data science as a function and a capability. But the reality is that many of them have not been able to consistently derive business value from their investments in big data, artificial intelligence, and machine learning. A surprising percentage of businesses fail to obtain meaningful ROI from their data science projects. Numerous articles have been written on the failure rate, its root causes and how to improve the success of such projects.

A few statistics on Data Science Project Failures

  • Failure rate of 85% was reported by a Gartner Inc. analyst back in 2017.

  • 87% was reported by VentureBeat in 2019.

  • 85.4% was reported by Forbes in 2020.

The irony of those numbers is that enterprises are witnessing this outcome despite the breakthroughs in data science and machine learning, a wealth of articles and videos sharing experiences, and numerous open source and commercial libraries and tools.

Moreover, evidence suggests that the gap is widening between organizations successfully gaining value from data science and those struggling to do so. So what are the top reasons preventing data science projects from succeeding? The reasons for failure can be broadly categorized as below:

Project Planning & Costs

  • Lack of clearly articulated business problem and documentation of it.

  • Lack of upfront articulation of business value/outcome expected from the project and therefore prioritization.

  • Lack of stakeholder involvement and of a communication plan with them right from the beginning.

  • Unstated or undefined deployment planning as part of project planning.

  • Not the right use case.

  • Cost of experimentation is often prohibitive and inhibits ROI.

People

  • Data Science & ML Skill Shortage.

  • Data Scientists are often not trained in design patterns the way programmers are, which leads to sub-optimal, poorly performing and short-lived modeling code.

  • Data Scientists are often interested in exploration and experimentation and stay away from productionizing efforts.

  • Lack of cognizance that Data Science model training and deployment follows all the processes of a software project: deployment, versioning, testing and iterations to fix quality. The team often doesn’t include experts or trained staff who understand the development, testing and CI/CD pipeline.

  • Lack or absence of data culture/maturity within the organisation.

Data Management & Data Quality Process/Tools

  • Siloed data in different repos, with no clear plan for how this will work during successive iterations.

  • Insufficient or Unavailable data

  • Poor Quality of data

  • Unregulated/Unnoticed changes to schema and data distributions

Modelling

  • Model training and experimentation are often done outside of the production environment, which leads to completely redoing model training once the software engineers take over for deployment.

  • Lack of a feedback loop from model deployment to the model learning phase leads to model deterioration.

  • Interpretability of the model is compromised for model accuracy.

  • Lack of model trust among business stakeholders, and often an unknown fear of a negative impact of the model on the business. Instances/articles like Zillow substantiate the potential damage that a model can do.

  • Lack of trust or apprehension (founded or unfounded) about the model among business stakeholders often leads to the model not making it to deployment.

  • Lack of a process for saving historical model artifacts, reasons for changes etc. over time leads to poor auditability and lends itself to lack of trust.

Communication

  • Lack of coordination between business and data science teams on results/outcomes/changes.

Deployment

  • No real time auditing and logging of model results in actual deployment

  • No checks and bounds for data and concept drift, and no feeding of model performance back to the data science team and business stakeholders.

  • Integration with online transaction systems and applications, which often form the consumption layer, is often not planned. This leads to poor adoption of models.

While there are myriad obstacles to a model succeeding through deployment and to the longevity of such a deployment during production usage, at Cynepia we believe that a Unified Data and AI Automation Platform, No Code Data Science and Continuous Productionization can significantly improve the chances of success of modeling use cases by addressing many of the challenges above.

Solution

An End-to-End No Code/Low Code Data Science platform brings significant advantages, as listed below, and can address many of the data science pitfalls listed above. A Unified Platform acts as a single hub for all your data, models and stakeholders, ensuring that communication between the business and data science teams is near real time, both during project execution and during the model monitoring phase.

Integrated Data Catalog and Data Pipelines ensure that data schema changes are notified and always available to the data science team, to understand if there are any upstream data quality changes.

Discovery of newer features/datasets published by data engineering team further helps create synergy on finding new useful features.

Visual Model Building helps data scientists focus on business outcomes and experimentation rather than learning design patterns, thereby improving the longevity of the modeling effort.

Visual Data Exploration (EDA) and Model Interpretation enables faster socializing of data/model changes before deployment

Model Catalog further ensures model revisions are stored.

One click Model Deployment enables faster deployment of approved models to production.

Model Monitoring further helps track data/concept drift in the running phase and helps ensure models are retrained.
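
To make data drift concrete, one common way to quantify it is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The Python sketch below is an illustration only: the source does not say which drift metric the platform's Model Monitoring uses, and the function name `psi`, the bin count and the sample data are assumptions.

```python
import math


def psi(expected: list[float], actual: list[float], bins: int = 5) -> float:
    """Population Stability Index between a baseline sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def shares(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative data: a uniform baseline and the same values shifted upwards.
baseline = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in baseline]

print("no drift:", psi(baseline, baseline))
print("shifted :", psi(baseline, shifted))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though the right threshold depends on the use case. A monitoring job could compute such a score per feature on every run and trigger a notification or a retraining review when it crosses the chosen threshold.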

Conclusion

An End-to-End No/Low Code Unified Data and AI Platform offers a promising way to reduce data science project failures, both by streamlining projects from implementation to production and monitoring and by significantly reducing the effort needed to keep up code and data over time. Bringing all stakeholders onto the same page can reduce apprehensions and enhance trust among business stakeholders by enhancing collaboration.

Picking the Right Enterprise Data & AI Platform Strategy

· 8 min read
Rajesh Parikh

Background

A modern enterprise data and AI platform has become a non-negotiable need for a company of a certain size.

The central idea behind such a platform is that it serves as a central repository where all the data can then be converted into knowledge that can then be used by business users to deliver value across various functions.

Therefore, it is imperative that we state clearly and unambiguously that data is a first class citizen in any enterprise and therefore needs a place to thrive and grow. An enterprise data and AI platform is therefore vital to the enterprise transformation journey.

Various data processes such as data engineering, business intelligence, analytics and machine learning further enhance the value of this data by performing a variety of operations: cleaning and preparing data, unifying data from different data sources and creating new metrics/measures. These help organisations that believe in the power of measurement to streamline their operations, perform root cause analysis and understand the drivers for a certain entity, create a holistic understanding of customers, employees and processes, and measure and improve each and every aspect of the enterprise process.

As one finds out, the canvas for generating potential business value from data is huge. But enterprises have to work through the key decision of what kind of modern enterprise data platform suits their needs. There are various ways enterprises can build a modern data and analytics platform today.

Build your own

**Should you build your data and AI platform from the ground up?**

That's potentially a multi-million dollar question, often running into tens of millions depending on the scope of such a data project endeavour.

A decision to build on your own should have a much stronger business value case, since it is coupled with long-term maintenance and support costs and with keeping up with technological improvements over time to upgrade and modernize a home-grown platform. In addition, newer technical requirements emanating from new forms of use cases and a plethora of design choices have significantly increased the complexity of building such data architectures, often making the project execution of a build approach itself risky.

Hence our view is that a “Build your own” data and AI platform needs to emanate from a clear longer-term business strategy and goal that is articulated well, with the differentiation spelt out. This needs to be further supplemented with a well-defined execution scope, time, human capital and cost allocated to such an initiative.

At Cynepia, we believe the answer to the above question is almost an absolute ‘NO’ for 99.9% of enterprises, if the larger objective is to use the data and analytics infrastructure to derive operational analytics and decision analytics value from the data.

**Buy various point SaaS solutions and integrate.**

Here again there are potentially two main approaches:

  • **Would purchasing various commercial SaaS solutions and integrating them make more sense?**

This approach has been on a tear for the past several years. Under the loosely defined term “Modern Data Stack”, a lot has been written and discussed on this topic over the last few years. A quick and apt summary of what is clubbed under this terminology is reviewed by Approva Padhi from Foundation Capital here: Modern Data Stack: Looking into the Crystal Ball.

The article also summarizes issues and areas of improvement for such a stack. At Cynepia, we believe that there are many issues with the so-called “modern data stack” approach. A few among them are as follows:

  1. Lack of finished and consistent user experience and multiple end application interfaces leading to sub-optimal user centric design and productivity gaps.

  2. Lack of thought given to the devolution of data and analytics skills, leading to lesser scope for democratization.

  3. Higher human cost to keep the modern data stack up and running over time, handling integration issues and dealing with sub-optimal architectural choices made by the different SaaS tool vendors. Often this requires adding further layers of software applications.

  4. Cost conundrum due to the piecemeal approach and dependence on multiple SaaS vendors to deliver the unique SLA needs of your organisation.

  • Is leveraging data APIs and apps from cloud vendors such as AWS, Azure or Google Cloud a better bet?

Another way to build the data and analytics platform is to use the many tools, point services and programmable interfaces provided by players such as AWS, Azure and Google Cloud Platform. The usual benefits of cloud, such as flexibility and scalability, are available here too. However, in addition to the disadvantages mentioned above, there are a few additional ones:

  1. Price variation: These services are often priced by usage, and cloud service bills tend to grow steeply as the data grows. This is seen as a big irritant in the adoption of cloud services beyond basic infrastructure services.

  2. Governance: Data security and governance on many of these cloud platforms, tools and applications was designed to be generic and is therefore quite cumbersome, and often a specialist needs to be hired to keep your data secure.

**Buy a modern unified data and AI development platform**

There is yet another class of vendors who provide a bundled data and AI development platform. These platforms are usually a subset of data and AI applications built on a common architecture and user experience theme, unlike unbundled SaaS applications.

However, compared to the hype, many of these unified platforms/vendors have often lost the ability to shape the market for a variety of reasons:

  • Actual promise vs reality

Often incomplete, or assembled by acquiring different SaaS companies such as the above and bundling the solutions, completely ignoring a user-centric design approach.

Often costly and priced for the needs of Fortune 500/1000 companies, making them prohibitive for the larger enterprise market.

  • Missing pieces of the puzzle, which further require adding more layers of software applications.

Is Modern No Code/Low Code UI Unified Platform the future?

**Unified Low Code/No Code Enterprise Data & AI Platform**

At Cynepia, we see another category, one that is not just modern and futuristic but also ensures data and AI are democratized, and that has the potential to reach far more users, use cases and the larger enterprise market. We call this category “Low Code/No Code Enterprise Data and AI Platform”.

So what are the advantages of a “Low Code/No Code Enterprise Data and AI Platform”?

First and foremost, to set our orientation right, we strongly believe in a unified data and AI development platform, because that's the best way to make analytics affordable and expand its footprint beyond Fortune 500/1000 companies. Of course, the baseline is that the actual promise should match reality.

It transfers technical debt of architecture, integration/support of multitude of SAAS applications to the vendor instead of you as the customer.

We are believers in the potential of No Code/Low Code platforms to deliver better value. Why not for-code platforms? This is not to suggest that a for-code platform is a bad idea. Every no-code platform rides on code, so dismissing code while proposing no-code would be an oxymoron in a way.

Here are the top reasons to go low code/no code:

It’s hard to hire best-of-breed Data Architects, Software Engineers, Frontend Engineers, Data Scientists and Project Managers at the price one is willing to pay. If one can, one could potentially generate an outcome similar to, or better than, a No Code/Low Code platform. Otherwise, that leaves you with the next grade of skilled developers, who then need to be trained and upskilled, in the hope that the code output fits your aspiration. The most likely case, however, is that you are left with more data debt than code you can call an asset.

Most enterprises are not software companies by business. We have often seen that keeping focus on building code over the long term is hard, since quick business outcomes often supersede the long-term value of such code assets. Historical evidence suggests that the shelf life of a for-code data pipeline is much shorter than one believes it is.

A multitude of best-of-breed SaaS applications still remains an alternative choice. But if an enterprise is focused on business value and ROI, there is no reason for it to take on the technical debt of integration, support and stitching the architecture together, which that approach transfers to the customer rather than the vendor. There is also an overall ROI question on the whole data and analytics investment, as discussed earlier.

How do you strategize?

Data technologies are evolving quickly, making the traditional “build your own” approach risky, time-consuming and inefficient. If you haven’t already invested in any architecture, we strongly recommend investing in a user-centric Low Code/No Code Enterprise Data and AI Platform.

If you have already invested in one of the architectures above and are accruing technical debt that feels unmanageable, you may want to restrategize and consider investing in a data and analytics architecture that suits not just today’s needs but also future ones.

Getting Customer Risk Assessment Modeling Right for Micro Lenders

· 5 min read
Rajesh Parikh

Objective

The objective of a good Customer Risk Assessment Model is to keep track of borrower risk during the tenure of the loan, post disbursement, by periodically monitoring changes in the borrower's risk profile.

A robust model enables a lender to track and adequately understand potential changes in the risk profile of a borrower.

This can further lead to a variety of benefits in terms of optimizing various operations, including customer relationship, collection process optimization, cross-sell/up-sell efforts and portfolio risk concentration, thereby minimizing financial losses due to lost capital.

Challenges

Microfinance Lenders face quite a few challenges in achieving a robust customer risk assessment model. But the most important ones can be divided into following sub categories:

  • **Availability of Quality Primary Data**

    • A large segment of customers are first-time borrowers with no credit history.
    • Lack of demographic data, including income, savings and wealth data, since most of the customers are at the bottom of the pyramid and come under priority sector/income generation loans.
    • Lack of an industry-wide framework or data for Group Risk Assessment, whether by center, area or pincode.
    • Unavailability of detailed customer payment data for loans taken from other MFIs/institutions that can be sourced periodically in a cost-efficient manner.
  • **Inefficient Data Management**

    • Inability to keep data from internal/external sources sanitized and validated month on month in a repeatable way, which impacts model quality over time.
    • Lack of a single source of truth for all customer features across various data sources (App, LOS, LMS, Field Feedback Systems, Past Payment Pattern, Credit Bureau data etc.).
    • Challenges of data size/growth.
    • Reliance on cumbersome and inefficient spreadsheet-based tools or developer scripts.
  • **Model Trust, Explainability, Communication and Presentability**

    This is a significantly under-rated challenge, not understood by most companies deploying a model like customer risk assessment. Building trust in the model outcome can be time-consuming and at times draining.

    • Taking all stakeholders along in the process of model thinking and establishing trust in the model requires the ability to produce reproducible results and to explain the factors leading to a particular customer or group level risk assessment in a simple way. Failing to get through this phase leads to the model not being deployed.

    • Ability to iterate, explain the model well to the business and eliminate any bias/errors.

    • Ability to provide model outcomes/updates periodically, directly through BI and Data Visualization, in the hands of relevant stakeholders.

  • **Automation of relevant lending processes impacted by the customer risk assessment outcome**

    • Integrate with various other IT systems and/or impact various downstream processes such as collection, customer relationship, cross-sell/up-sell efforts and portfolio strategy as required.
  • **Model Deployment**

    • Keep various stakeholders informed of periodic model results.
    • Sync outcomes with other IT systems and field-force-facing systems.

Technologies used at an MFI

Cynepia Technologies' flagship product, Xceed Analytics, a unified end-to-end data & AI platform, can make it easy for micro-lenders to overcome some of the above challenges. Key capabilities of Xceed Analytics that come in handy here include seamless data integration across various source IT systems, Data Management including version management/governance capabilities, Data Workflows enabling data preparation/transformation, Data Visualization capabilities including dashboards, and Advanced Predictive/ML Modelling capabilities supporting 35+ algorithms, with a model catalog supporting version management and explainability.

The above capabilities can enable microlenders to achieve the following key implementation benefits for this model.

  • **Data Workflow Automation** enabled the customer to build a Customer 360, which brings in 200-plus loan and customer level attributes from various source systems including manual uploads, Credit Bureau Data, Loan Origination System (LOS), Loan Management System (LMS), other customer-facing IT systems as well as other upstream data workflows, including Customer Payment Behavior Analysis for existing loans.

  • **Data Catalog**, with readily available feature profiles, enabled data modellers to quickly explore and bring together the features needed for this effort, as well as to keep track of data quality month on month.

  • **Cynepia AutoML Subsystem** enabled the following:

    • Automatic feature engineering, feature scaling and hyper-parameter tuning using Bayesian Optimization techniques.
    • Automatic selection of the algorithm from the available options, including Logistic Regression, Random Forest, CatBoost, AdaBoost and Extreme Gradient Boosting (XGBoost).
    • Model Catalog helped keep track of model versions and results over months, enabling comparison of model results across revisions, validation of results by monitoring new delinquencies month on month, and detection of signs of model deterioration leading to retraining.
    • Model explainability using model-agnostic approaches and Shapley values.
  • **Xceed Analytics BI Subsystem** communicated insights to Credit/Business stakeholders.

  • **API Integration** with the credit-facing internal application using **Xceed Catalog APIs**.
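
As an illustration of the Shapley-value idea mentioned in the AutoML list above, the following Python sketch computes exact Shapley values for one prediction of a toy scoring function by averaging each feature's marginal contribution over all feature orderings. The `risk_score` function, its coefficients, the feature names and the zero baseline are made up for illustration and are not the model or the method used in the AutoML subsystem. Practical tools typically rely on approximations, because exact enumeration grows factorially with the number of features.

```python
from itertools import permutations
from typing import Callable, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> list[float]:
    """Exact Shapley values for one prediction by enumerating feature orderings.

    Features absent from a coalition take their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]      # add feature i to the coalition
            new = predict(current)
            phi[i] += new - prev   # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]


# Toy risk score (hypothetical): two linear terms plus one interaction term.
def risk_score(features: Sequence[float]) -> float:
    missed_payments, loan_amount, tenure = features
    return 0.5 * missed_payments + 0.25 * loan_amount + 0.5 * missed_payments * tenure


x = [2.0, 4.0, 1.0]          # the customer being explained
baseline = [0.0, 0.0, 0.0]   # reference customer with all features at zero
phi = shapley_values(risk_score, x, baseline)
for name, value in zip(["missed_payments", "loan_amount", "tenure"], phi):
    print(f"{name}: {value:+.2f}")
```

The contributions always sum to the difference between the prediction for the customer and the prediction for the baseline, which is what makes them convenient for explaining an individual risk assessment to business stakeholders. Here the interaction between missed payments and tenure is split equally between those two features.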

Business Benefits

  • Availability of Robust Risk Assessment for existing customers.
  • Model/Data Catalog and Operations ensure auditability of the model and model outcomes over time.
  • Integrated data and visualization infrastructure from Xceed Analytics made it easy to build and present the model outcome to relevant stakeholders seamlessly, without elaborate manual processes or manual spreadsheets.
  • Integration with credit team facing application enabled collection of field feedback on specific risk assessment providing a feedback loop.

Integrations, use case & future possibilities

Lenders can further redesign and integrate the model outcome with source systems via **API integration** and achieve end-to-end automation of such a model:

  • Real time integration of customer risk with next cycle loan decision and cross sell/up sell
  • Use of generated KPIs in other customer related outcomes such as portfolio risk concentration etc.
  • Redesign of collection processes based on customer risk profile.
  • Redesign of cross-sell/upsell process based on understanding of customer risk profile.
  • Portfolio Optimization/Reallocation based on future risk concentration thereby acting in advance.

Get the power of futuristic Data & AI Platform for your enterprise.