Chronic Kidney Disease Prediction

Background

Chronic kidney disease (CKD) is a long-term condition in which the kidneys do not function properly. It is a common condition that is often associated with ageing, but it can affect anyone. In the early stages of kidney disease there are usually no symptoms, so it may be detected only if a blood or urine test taken for another reason reveals a possible kidney problem. Symptoms of a more advanced stage of the disease include:

  • fatigue
  • swollen ankles, feet or hands
  • breathing difficulty
  • nausea
  • blood in urine

Causes of Chronic Kidney Disease

Chronic kidney disease is frequently the outcome of a combination of issues. Other illnesses that put a strain on the kidneys are often the cause.

Causes:

  • High blood pressure
  • Diabetes
  • High cholesterol
  • Kidney infections
  • Glomerulonephritis
  • Polycystic kidney disease
  • Obstructions in the urine flow
  • Regular use of certain medicines over a long period of time

Blood and urine tests can be used to diagnose CKD. These tests check for abnormally high levels of particular substances in the blood and urine, which indicate that the kidneys are not functioning properly. The test results can also be used to determine the stage of the kidney disease. The stage is a number that indicates severity, with a larger value signifying more severe CKD.

Early identification of CKD is valuable because it allows doctors to begin effective treatment while the disease is still mild, limiting the loss of renal function and delaying or preventing progression to kidney failure. Reliable tools for predicting CKD early are therefore critical, and machine learning techniques can be effective for this kind of prediction.

Objective

This use case is an attempt to determine whether a person has chronic kidney disease based on their health conditions and health report information. The main objective is to understand and analyse the available information, test it using different classification models, evaluate and improve the best model, present the findings, and discuss what needs to be done next.

Relevance of Xceed Analytics

Xceed Analytics provides a single integrated data and AI platform that reduces friction in bringing data together and building machine learning models rapidly. It empowers everyone, including Citizen Data Engineers and Scientists, to bring data together and to build and deliver data and ML use cases rapidly. Its low-code/no-code visual designer and model builder can be leveraged to bridge the gap and expand the availability of key data science and engineering skills.

This use case showcases how to create, train/test, and deploy a classification model for chronic kidney disease detection and prediction. The CKD dataset used for this purpose is obtained from the UCI Machine Learning Repository. Xceed provides a no-code environment for the end-to-end implementation of this project, from uploading datasets from numerous sources to deploying the model at an endpoint. All of these steps, from analyzing the data to constructing a model and deploying it, are built using the Visual Workflow Designer.

Data Requirements

The dataset used here is:

  • Chronic Kidney Disease dataset: contains patients' health report information.

Columns of interest in the dataset

Model Objective

The model objective is to understand trends in chronic kidney disease from the health report data and to predict whether a person is likely to have chronic kidney disease. This is done by analysing the underlying data, constructing a classification machine learning model, and implementing it after defining the model's major features.

Steps followed to develop and deploy the model

  1. Upload the data to Xceed Analytics and create a dataset
  2. Create the Workflow for the experiment
  3. Perform initial exploration of data columns.
  4. Perform Cleanup and Transform operations
  5. Build/Train a classification model
  6. Review the model output and Evaluate the model
  7. Improve the metrics that matter for taking the model to production
  8. Deploy/Publish the model

Upload the data to Xceed Analytics and Create the dataset

  • From the Data Connections Page, upload the dataset to Xceed Analytics. For more information on Data Connections, refer to Data Connections.

  • In the data catalogue, create a dataset from the uploaded data source. Refer to Data Catalogue for more information on how to generate a dataset.

Create the Workflow for the experiment

  • Create a Workflow by going to the Workflows Tab in the Navigation. Refer to Create Workflow for more information.

To navigate to the Workflow Details Page, double-click on the Workflow List Item and then click Design Workflow. Visit the Workflow Designer Main Page for additional information.

  • Click the + icon to add the Input Dataset. The input step will be added to the Step View.

Perform initial exploration of data columns.

  • Examine the output view with Header Profile, paying special attention to the column datatypes. For more information, refer to Output Window.

  • Review the Column Statistics Tab (refer to Column Statistics for more details on each individual KPI).
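For readers who want to see what this kind of first look involves outside the platform, here is a minimal sketch in plain Python. It is not how Xceed Analytics computes its profiles. The column names, the sample rows, and the use of "?" as a missing-value marker are all assumptions made for illustration.

```python
import statistics

# Hypothetical sample rows; "?" is assumed to mark a missing value.
rows = [
    {"age": "48", "bp": "80", "rbc": "normal"},
    {"age": "62", "bp": "?", "rbc": "abnormal"},
    {"age": "?", "bp": "70", "rbc": "normal"},
    {"age": "51", "bp": "90", "rbc": "?"},
]


def is_number(text):
    """Return True if the text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def profile_column(rows, name, missing="?"):
    """Infer the column type and collect missing counts and basic statistics."""
    values = [row[name] for row in rows]
    present = [value for value in values if value != missing]
    numeric = bool(present) and all(is_number(value) for value in present)
    summary = {
        "type": "numeric" if numeric else "categorical",
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    if numeric:
        numbers = [float(value) for value in present]
        summary["mean"] = statistics.mean(numbers)
        summary["min"] = min(numbers)
        summary["max"] = max(numbers)
    return summary


profiles = {name: profile_column(rows, name) for name in rows[0]}
for name, summary in profiles.items():
    print(name, summary)
```

A column that should be numeric but is profiled as categorical usually points to stray text values, which is the kind of issue the cleanup step addresses.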

Perform Cleanup and Transform Operations

  1. Drop Unnecessary Columns.

  2. Find and Replace Column values.

  3. Fill Null values.

  4. Find and Replace target Column values
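The four operations above are configured visually in the workflow. As a rough illustration of what each one does to the data, the following plain-Python sketch applies them to a few made-up rows. The column names (id, age, dm, classification), the stray tab characters, and the choice of median and mode as fill values are assumptions, not details taken from the platform.

```python
import statistics

# Hypothetical rows; None marks a null, and "\t" marks stray whitespace in raw values.
rows = [
    {"id": 0, "age": 48.0, "dm": "yes", "classification": "ckd"},
    {"id": 1, "age": None, "dm": "\tno", "classification": "ckd\t"},
    {"id": 2, "age": 60.0, "dm": "no", "classification": "notckd"},
    {"id": 3, "age": 54.0, "dm": None, "classification": "notckd"},
]


def drop_columns(rows, names):
    """Return rows without the named columns."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


def replace_values(rows, column, mapping):
    """Replace values in one column according to a lookup table."""
    return [{**row, column: mapping.get(row[column], row[column])} for row in rows]


def fill_nulls(rows, column, fill):
    """Replace None in one column with a fixed fill value."""
    return [{**row, column: fill if row[column] is None else row[column]} for row in rows]


# 1. Drop columns that carry no predictive information.
cleaned = drop_columns(rows, {"id"})

# 2. Find and replace malformed category values.
cleaned = replace_values(cleaned, "dm", {"\tno": "no"})

# 3. Fill nulls: median for the numeric column, most frequent value for the categorical one.
ages = [row["age"] for row in cleaned if row["age"] is not None]
cleaned = fill_nulls(cleaned, "age", statistics.median(ages))
dm_values = [row["dm"] for row in cleaned if row["dm"] is not None]
cleaned = fill_nulls(cleaned, "dm", statistics.mode(dm_values))

# 4. Find and replace target values so the label becomes 1 (CKD) or 0 (no CKD).
cleaned = replace_values(cleaned, "classification", {"ckd\t": 1, "ckd": 1, "notckd": 0})

for row in cleaned:
    print(row)
```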

Build/Train a classification Model

  • You now have a cleaned dataset from which to create a classification model. The steps involved in developing the model are listed below.
  1. Feature Selection
  2. Feature Encoding
  3. Choose the algorithm and train the model.

Feature Selection

  1. Go to the Column Profile View and select Multi-variate profile to construct a correlation matrix and manually identify the features of interest. Xceed Analytics shows the Pearson correlation. Select all of the columns that are strongly correlated with the target feature.

  2. Based on the observed correlation, choose the features that can explain the target variable.
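The selection rule in step 1 (keep the columns whose Pearson correlation with the target is strong) can be sketched in a few lines of plain Python. The feature names, the values, and the 0.5 cut-off below are made up for illustration. The document does not state which threshold or which features were used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical numeric features and a 0/1 target (1 = CKD).
features = {
    "hemo": [15.4, 11.3, 9.6, 16.1, 14.8, 10.2],
    "sc": [1.2, 3.8, 4.6, 0.9, 1.1, 5.2],
    "age": [48, 52, 47, 50, 51, 49],
}
target = [0, 1, 1, 0, 0, 1]

correlations = {name: pearson(values, target) for name, values in features.items()}

# Keep features whose absolute correlation meets an arbitrary illustrative threshold.
threshold = 0.5
selected = [name for name, r in correlations.items() if abs(r) >= threshold]

for name, r in correlations.items():
    print(f"{name}: {r:+.3f}")
print("selected:", selected)
```

The absolute value is used because a strong negative correlation is as informative as a strong positive one.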

Feature Encoding

  • Take all of the categorical columns and encode them based on the frequency with which each value occurs. For more information on this processor, refer to Feature Encoding.
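Frequency encoding replaces each category with a number that reflects how often it appears in the column. The sketch below uses the relative frequency (share of rows). Whether the Feature Encoding processor uses raw counts or shares is not stated here, so treat that choice, along with the sample column, as an assumption.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with the share of rows in which it occurs."""
    counts = Counter(values)
    total = len(values)
    return [counts[value] / total for value in values]


# Hypothetical categorical column.
appetite = ["good", "good", "poor", "good", "poor", "good", "good", "good"]
encoded = frequency_encode(appetite)

for original, number in zip(appetite, encoded):
    print(original, number)
```

One caveat of this encoding is that two categories with the same frequency receive the same number and become indistinguishable to the model.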

Choose the algorithm and train the model.

  • The prediction model estimates a categorical variable: whether or not the person has chronic kidney disease. From the Transformer View, select Classification (auto pilot) and fill in the relevant information.
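The auto pilot step trains and compares several algorithms for you. To give a sense of what such a comparison involves, here is a small sketch using scikit-learn on synthetic data. It is not the platform's implementation. The three candidate algorithms, the synthetic dataset, and the use of 5-fold cross-validated accuracy as the ranking metric are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared features and the 0/1 target.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hypothetical set of candidate algorithms.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score each candidate with 5-fold cross-validated accuracy, then rank best first.
leaderboard = sorted(
    (
        (name, cross_val_score(model, X, y, cv=5, scoring="accuracy").mean())
        for name, model in candidates.items()
    ),
    key=lambda item: item[1],
    reverse=True,
)

for rank, (name, score) in enumerate(leaderboard, start=1):
    print(rank, name, round(score, 3))
```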

Review the model output and Evaluate the model

After you finish building the model, it is time to review the model output. First look at the output window to review your predicted results. You will get a new column in the view like the one below.

When you finish building your model, you will see another tab in the view called ML Explainer. Click on it to evaluate your model.

  • The first view you see when you click on ML Explainer is the Summary view.

  • The second view under ML Explainer is the Configuration view.

The Configuration view shows the information you filled in for the Classification step. The view would look like the one below.

The third view under ML Explainer is the Performance view. Here you can see the confusion matrix, the ROC curve, the Precision vs Recall curve, and the Cumulative Gain curve. Review these charts and decide whether the model's performance is good enough for your purpose. The confusion matrix is a good indicator of how well your model was trained.
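To make the confusion matrix concrete, the following sketch counts its four cells by hand and derives accuracy, precision, and recall from them. The actual and predicted labels are made-up values for illustration, with 1 standing for CKD and 0 for no CKD.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true positives, false positives, false negatives, and true negatives."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn


# Hypothetical labels: 1 = CKD, 0 = no CKD.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)

accuracy = (tp + tn) / len(actual)  # share of all predictions that are correct
precision = tp / (tp + fp)          # share of predicted CKD cases that are real
recall = tp / (tp + fn)             # share of real CKD cases that were found

print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)
```

For a screening task such as this one, recall often deserves particular attention, because a false negative means a person with CKD is missed.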

  • The fourth view under ML Explainer is the Leaderboard. In this view you can see the algorithms that were trained and the feature engineering applied for each, along with a ranking that identifies the best algorithm.

  • The last view under ML Explainer is Interpretability. This view lets you interpret your model in simple terms through results such as Feature Importance, PDP Plots, Sub Population Analysis, Independent Explanation, and Interactive Scoring. For more information on these results, refer to Interpretability. The Interpretability tab and the results under this tab would look like the one below.

Conclusion

Get the power of a futuristic Data & AI Platform for your enterprise.