Examination of Diabetic Patient Health Records Using Machine
Learning Algorithms and an Open Source Software
Platform (Hadoop)
ABSTRACT
In recent years, health care industries have been generating large volumes of data. It
is essential to collect the available data, store it, and process it to extract
knowledge and support sound decisions. Diabetes Mellitus (DM) is a Non-Communicable
Disease (NCD), and many people suffer badly from it. In developing countries like India,
DM has become a major challenge for the health care industry and for patients. It is a
serious disease with long-term adverse effects on the human body. With recent advances
in technology, it is important to develop a system that can store and analyze diabetic
data and predict the possible risks to the patient accordingly. Predictive analysis is a
method that combines data mining techniques, algorithms and statistics, using present
and past data sets to find trends and predict future risks. In this work, machine
learning algorithms are implemented in a Hadoop MapReduce environment. The Pima Indian
diabetes data set is used to fill in its missing values and to discover patterns. This
work aims to predict the type of diabetes, the related future risks and the level of
risk for the patient, so that the patient can be treated accordingly.

1. INTRODUCTION
Big Data is emerging as the solution to the problems associated with large amounts of
data. The data generated can now be used to provide an inside view of what is really
taking place and to spot emerging trends. It can also be used in the health sector to
make the system more effective. Big Data refers to extensive amounts of data, structured
or unstructured, that cannot be processed using a relational database model.
Unstructured data is data that cannot be stored in a fixed row-and-column format. Big
Data also goes beyond the processing capacity of conventional database systems. Health
care data is growing beyond the processing capacity of health care organizations and is
expected to increase in the coming years. Most health care data is unstructured and
resides in imaging systems, medical prescription notes, insurance claims data,
Electronic Patient Records, etc. Combining structured and unstructured data for advanced
analytics is critical to improving health care outcomes. Because data is isolated in
different or incompatible formats, and because they lack the processing capability to
load and analyse large data sets in a timely way, health care organizations are not in a
position to leverage the benefits of this large body of health care data. With advanced
computing and Big Data technologies such as Cloud Computing, Hadoop and Machine Learning
algorithms, it is possible to reach high performance at low cost. Such solutions usually
come with a set of advanced data management solutions and analytical tools and, when
successfully executed, can transform health care outcomes.

1.1 PROJECT DESCRIPTION
Because data is isolated in different or incompatible formats, and because they lack the
processing capability to load and query big data sets in a reasonable time, health care
organizations are not well placed to leverage the benefits of their large body of health
care records. By using advanced computing and Big Data technologies such as Cloud
Computing, Hadoop and Machine Learning algorithms, it is possible to achieve high
performance and scalability at a reasonable cost. Such solutions often come with
advanced data management solutions and analytical tools and, when effectively applied,
can transform health care outcomes.
1.1.1 Problem Statement
Diabetes is an illness that results in excess sugar (glucose) in the blood. Its
prevalence is increasing worldwide, particularly in low- and middle-income countries.
Limited access to quality health care for these people makes diabetes a dangerous
disease. The traditional system involves a tedious process of multiple lab assessments
and assumptions made by the doctor based on certain guidelines. The existing testing
techniques do not extensively cover all the aspects required to diagnose diabetes and
are often time consuming.
1.1.2 Objectives of the study
Big Data is emerging as a solution to the problems associated with large amounts of
data. The data generated can now be used to provide an inside view of what is really
taking place and to spot emerging trends. Big Data can also be used in the field of
healthcare to make the system more effective. Unstructured data is data that cannot be
stored in a fixed row-and-column format. Big Data also goes beyond the processing
capacity of conventional database systems. Using Big Data, it is possible to predict the
risk to a patient from his or her previous medical history. Healthcare providers are
digitising their databases, which paves the way for Big Data analytics. Algorithms such
as Naive Bayes and k-means can be used to predict the risk involved. The prediction
would enable healthcare providers to assess the patient's situation quickly and would
give an insight into the patient's future if the current situation persists, since
diabetes is a disease with long-term effects on the patient. Doctors can assess the risk
involved and base their treatment on it, and the patient can be advised on lifestyle
changes. The main goal of this study is to predict diabetes, compare the algorithms to
find which one provides the highest accuracy, and finally select the best algorithm for
predicting diabetes at an early stage. The study also examines how patients'
characteristics and measurements affect diabetes cases.
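To make the idea of Naive Bayes risk prediction and accuracy comparison concrete, the following is a minimal sketch of a Gaussian Naive Bayes classifier. It is written in Python purely for illustration and is not the project's implementation; the two features (plasma glucose and BMI), the toy records and all function names are assumptions introduced here.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for every class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            # A small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / n + 1e-9
            stats.append((mean, var))
        model[label] = (n / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one record."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows, labels):
    """Fraction of records whose predicted class matches the true class."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


# Hypothetical toy records: (plasma glucose, BMI); 1 = diabetic, 0 = not diabetic.
TRAIN_X = [(85, 22.0), (90, 24.5), (100, 26.0), (150, 33.0), (165, 35.5), (180, 38.0)]
TRAIN_Y = [0, 0, 0, 1, 1, 1]
MODEL = fit_gaussian_nb(TRAIN_X, TRAIN_Y)
```

The same `accuracy` helper could be applied to any other classifier's predictions on a held-out set, which is one simple way to compare algorithms as the objectives describe.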
1.1.3 Scope of the study
Healthcare data is growing beyond the processing capacity of health care organizations
and is projected to increase in the coming years. Most healthcare data is unstructured
and resides in imaging systems, prescription notes, insurance claims data, Electronic
Patient Records, etc. Combining structured and unstructured data for advanced analytics
is vital to improving healthcare outcomes. Because data is isolated in different or
incompatible formats, and because they lack the processing capability to load and query
large data sets in a reasonable time, healthcare organizations are not in a position to
leverage the benefits of their large body of healthcare data. With advanced computing
and Big Data technologies such as Cloud Computing, Hadoop and Machine Learning
algorithms, it is possible to achieve good performance at a reasonable cost. These
solutions often come with a set of advanced data management solutions and analytical
tools and, when effectively executed, can transform healthcare outcomes.

1.1.4 Methodology used
Modern healthcare providers are equipped with Electronic Healthcare Records and
automation tools, which has enabled the generation of enormous amounts of data. This
collected data can be used for big data analytics. Apart from basic data, modern systems
also collect complex data from clinical trials, research and diagnostic tests. MapReduce
is a programming model for the parallel processing of large volumes of data. The data
can be of any kind, but the model is specifically designed to process lists of data. The
key idea of MapReduce is to transform lists of input data into lists of output data. It
offers the flexibility to write procedures in virtually any programming language, and
jobs are completed according to an optimised scheduling priority.
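The map and reduce steps described above can be sketched in a few lines. This is a single-process Python simulation for illustration only, not Hadoop code; the input layout ("glucose,outcome" per line), the sample records and the function names are assumptions made for this example.

```python
from collections import defaultdict


def map_record(line):
    """Map step: emit (outcome, (glucose, 1)) for one input record."""
    glucose, outcome = line.split(",")
    yield outcome.strip(), (float(glucose), 1)


def reduce_group(key, values):
    """Reduce step: combine every value emitted for one key into a mean."""
    total = sum(v for v, _ in values)
    count = sum(c for _, c in values)
    return key, total / count


def run_job(lines):
    """Simulate the shuffle phase by grouping mapper output by key."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            groups[key].append(value)
    return dict(reduce_group(k, vs) for k, vs in sorted(groups.items()))


# Hypothetical input lines in the assumed "glucose,outcome" layout.
RECORDS = ["148,1", "85,0", "183,1", "89,0", "137,1"]
MEANS = run_job(RECORDS)  # mean glucose per outcome class
```

In a real cluster the framework performs the grouping between the two steps and runs many mappers and reducers in parallel; only the two small functions would need to be supplied.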
1.2 COMPANY PROFILE
V.K. Computers is an IT education institute whose vision is to train students. We
strive to make our students career-ready and highly competitive worldwide. By teaching
the latest curriculum, we ensure that our students learn industry-relevant programs.
Since 1995, we have had the largest student strength and the best faculty available. We
are pioneers in new concepts and computer education. The institute is established in
Gulbarga as well as in Bangalore to provide education support services and to offer
advanced courses such as Java, ASP, .NET, Maya and 3D Max Animation.
We have trained 5000+ students from courses such as B.E, M.C.A, B.Sc, PGDCA, BCA,
Polytechnic, etc., in various fields of the emerging IT industry. Students trained in
our organization work in many multinational and IT companies, including Infosys, IBM,
TCS and Satyam. We have trained many students especially in Java, .NET, MATLAB and ASP.
At present we maintain the software developed for our clients in different market
segments such as Hotels, Dall Industries, Finance, Pharmaceutical Distributors,
Departmental Stores, etc. We are also leading web developers, maintaining the websites
of many organizations.

2. LITERATURE SURVEY
2.1 CURRENT AND PROPOSED SYSTEM
Current System
Diabetes is a major disease that results in excess sugar in the blood, that is, high
blood glucose. Its prevalence is increasing worldwide, particularly in developing or
poor countries. Limited access to quality health care for these people makes diabetes a
dangerous disease. The traditional system involves a tedious process of multiple lab
assessments and assumptions made by the doctor based on certain guidelines. The existing
testing techniques do not extensively cover all the aspects required to diagnose
diabetes and are often time consuming.
Proposed System
In this paper we propose analysing a new patient record against a large set of other
patients' records, and predicting the risk, using Big Data and Machine Learning
algorithms such as Naive Bayes and k-means. Based on the severity of the risk, doctors
can advise the further course of action.

Advantages of using machine learning in health care:
• More accurate diagnosis.
• Early intervention to prevent disease.
• If the predicted risk is high, necessary steps can be taken to avoid the disease.
• Patients can use the system to obtain information for themselves.
Modules
Information Extraction: Patients' health care records are collected from various
sources in hospitals and organised as a Structured Electronic Healthcare Record (S-EHR).
Feature Selection: From the collected patient records, the features most important for
diabetes prediction and modelling are extracted.
Predictive Modelling: A predictive model is constructed to predict whether or not the
patient has diabetes. A regression model is used to estimate the effect and dosage of
insulin. Patients are clustered into groups based on similarity, so that it is easier
for doctors to analyse them and recommend medicine.
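The clustering step in the Predictive Modelling module can be illustrated with a small k-means sketch. It is written in Python for illustration and is not the project's code; the two features, the sample patients and the choice of seeding the centroids with the first k points are assumptions made to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


# Hypothetical patients: (plasma glucose, BMI).
PATIENTS = [(85, 22.0), (160, 34.0), (90, 23.5), (170, 36.0), (95, 24.0), (165, 35.0)]
CENTROIDS, GROUPS = kmeans(PATIENTS, k=2)
```

Here `GROUPS` holds one cluster index per patient, so patients with similar measurements end up in the same group. On real data the features would normally be scaled first, since glucose and BMI have very different ranges.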
2.2 FEASIBILITY STUDY
A feasibility study examines the viability of the project and the likelihood that the
system will be useful. Its main objective is to test the technical, operational and
economic feasibility of adding new modules and debugging the old running system. Any
system is feasible given unlimited resources and infinite time. The feasibility study is
a management-oriented activity. Its objective is to determine whether an information
system project can be done and to suggest possible alternative solutions. Three aspects
are considered in the feasibility study portion of the preliminary investigation:

• Technical feasibility
• Operational feasibility
• Economic feasibility
2.3 HARDWARE AND SOFTWARE REQUIREMENTS
Hardware
• Processor: Pentium Dual Core or higher
• RAM: 1 GB or more
• Hard disk: 20 GB or more
• Monitor: 15 inch colour monitor
• Keyboard: 102/104 keys
• Mouse: Optical mouse
Software
• Operating system: Windows XP/7/8/10
• Front end: MATLAB
• Back end: MySQL
3. SOFTWARE SPECIFICATION
3.1 USERS
3.2 FUNCTIONAL REQUIREMENT
Functional requirements include:
• Descriptions of the data to be entered into the system
• Descriptions of the operations performed by each screen
• Descriptions of the workflows performed by the system
• Descriptions of system reports or other outputs
• Who can enter the data
• How the system meets applicable requirements

3.3 NON-FUNCTIONAL REQUIREMENTS
In addition to the obvious features and functions that the system will provide, there
are other requirements that do not actually DO anything but are nevertheless important
characteristics. These are known as "non-functional requirements", for example
attributes such as performance and security.
Each requirement must be stated simply in plain English and must be objective: there
should be some measurable way to assess whether the requirement has been met.
4. SYSTEM DESIGN (High Level or Architectural design)
4.1 SYSTEM PERSPECTIVE
The system has been designed on the basis of the existing system and an understanding
of the user requirements. The design covers the architectural design, component design
and interface design of the system, and is organised as follows:
1. Architectural design
2. Logical design
3. Physical design

1) Architectural design: The architectural design is based on the behaviour of the
system, its structure and the analysis of the system.
Fig 1: System Architectural Design
2) Logical design: The logical design is the abstract model of the system. It
represents the data flow, the inputs to the system and the outputs of the
system. Diagrams such as the ER diagram show the entities and their
relationships.
3) Physical design: The physical design describes how the system takes input and
produces output, and how each field is verified and validated. It must fulfil
system requirements such as input/output, storage and other requirements.
4.2 CONTEXT DIAGRAM

The Context Diagram describes the system as a single high-level process and then shows
the relationships the system has with external entities (which may be organizational
groups, systems, external data stores, etc.).
The Context Diagram is also known as the Context-Level Data Flow Diagram or Level-0
Data Flow Diagram. As the Context Diagram is a specialised version of a Data Flow
Diagram, an understanding of Data Flow Diagrams is helpful.
A Data Flow Diagram (DFD) is a graphical representation of the flow of data through an
information system. DFDs are one of the three essential components of the Structured
Systems Analysis and Design Method (SSADM). A DFD is process-centric and depicts four
main components:
• Processes (circle)
• External entities (rectangle)
• Data stores (two horizontal, parallel lines, or sometimes an ellipse)
• Data flows (curved or straight line with an arrowhead indicating the flow direction)
Figures: DFD Level 1 Diagram; DFD Level 2 Diagram

5. DETAILED DESIGN
5.1 USE CASE DIAGRAM
A use case is a set of scenarios describing an interaction between a user and the
system. A use case diagram shows the relationships between actors and use cases. The two
main components of a use case diagram are use cases and actors.
Fig 5: Use Case Diagram
5.2 SEQUENCE DIAGRAMS
Sequence diagrams are also known as event diagrams or event scenarios. A sequence
diagram shows, as parallel vertical lines, the different processes or objects that live
simultaneously and, as horizontal arrows, the messages exchanged between them in the
order in which they occur.

Fig 6: Sequence Diagram
5.3 COLLABORATION DIAGRAM
A collaboration diagram, also known as a communication diagram or interaction diagram,
illustrates the relationships and interactions among software objects in the Unified
Modeling Language (UML).

Fig 7: Collaboration Diagram

5.4 ACTIVITY DIAGRAM
The activity diagram is another important diagram. It describes the dynamic aspects of
the system. An activity diagram is essentially a flowchart that represents the flow from
one activity to another, where an activity is an operation of the system.

Fig 8: Activity Diagram
5.5 Class diagram
Fig 9: Class Diagram
5.7 DATABASE DESIGN
Each database table stores the system data as attributes, each with its data type. The
database tables used in this project are listed below.
5.7.1 Database Table for Add Hospital

Sl No. | Variable | Data Type | Description
01 | Hosp_ID | nvarchar(MAX) | Hospital identity
02 | Hosp_Name | nvarchar(MAX) | Hospital name
03 | Address | nvarchar(MAX) | Address
04 | Contact | int(10) | Contact
05 | Website | nvarchar(MAX) | Website
06 | Email | nvarchar(MAX) | Email
07 | Country | nvarchar(MAX) | Country
08 | Hosp_Type | nvarchar(MAX) | Hospital type
09 | Un | nvarchar(MAX) | User name
10 | Pw | nvarchar(MAX) | Password

Fig 15: Database Table for Add Hospital
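As an illustration of how such a table definition maps to a schema, the sketch below creates the hospital table in an in-memory SQLite database and inserts one row. SQLite (through Python's standard `sqlite3` module) is used here only as a stand-in for the project's back end; the `TEXT`/`INTEGER` types, the primary key, the `NOT NULL` constraint, the helper name and the sample record are all assumptions, not part of the original design.

```python
import sqlite3

# Column names follow the table above; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE hospital (
    Hosp_ID   TEXT PRIMARY KEY,
    Hosp_Name TEXT NOT NULL,
    Address   TEXT,
    Contact   INTEGER,
    Website   TEXT,
    Email     TEXT,
    Country   TEXT,
    Hosp_Type TEXT,
    Un        TEXT,
    Pw        TEXT
)
"""


def add_hospital(conn, record):
    """Insert one hospital row using named placeholders."""
    conn.execute(
        "INSERT INTO hospital VALUES (:Hosp_ID, :Hosp_Name, :Address, :Contact,"
        " :Website, :Email, :Country, :Hosp_Type, :Un, :Pw)",
        record,
    )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Hypothetical sample record.
add_hospital(conn, {
    "Hosp_ID": "H001", "Hosp_Name": "City Hospital", "Address": "Main Road",
    "Contact": 1234567890, "Website": "example.org", "Email": "info@example.org",
    "Country": "India", "Hosp_Type": "General", "Un": "admin", "Pw": "secret",
})
ROW = conn.execute(
    "SELECT Hosp_Name, Country FROM hospital WHERE Hosp_ID = ?", ("H001",)
).fetchone()
```

Named placeholders keep the values separate from the SQL text, which avoids building queries by string concatenation.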
5.7.2 Database Table for Add Receptionist

Sl No. | Variable | Data Type | Description
01 | Fname | nvarchar(MAX) | First name
02 | Contact | int(10) | Contact
03 | Email | nvarchar(MAX) | Email address
04 | Hname | nvarchar(MAX) | Hospital name
05 | Stime | nvarchar(MAX) | Shift time
06 | Un | nvarchar(MAX) | User name
07 | Pw | nvarchar(MAX) | Password
Fig 16: Database Table for Add Receptionist
5.7.3 Database Table for Doctor
Sl No. | Variable | Data Type | Description
01 | Fname | nvarchar(MAX) | First name
02 | Contact | int(10) | Contact
03 | Sp | nvarchar(MAX) | Specialist
04 | Hname | nvarchar(MAX) | Hospital name
05 | Stime | nvarchar(MAX) | Shift time
06 | Un | nvarchar(MAX) | User name
07 | Pw | nvarchar(MAX) | Password
Fig 17: Database Table for Doctor
5.7.4 Database Table for Doct temp
Sl No. | Variable | Data Type | Description
01 | Fname | nvarchar(MAX) | First name
02 | Lname | nvarchar(MAX) | Last name

Fig 18: Database Table for Doct temp
5.7.5 Database Table for Hosp temp
Sl No. | Variable | Data Type | Description
01 | Hospid | nvarchar(MAX) | Hospital identity
02 | Hospname | nvarchar(MAX) | Hospital name
Fig 19: Database Table for Hosp temp
5.7.6 Database Table for Rect temp
Sl No. | Variable | Data Type | Description
01 | Fname | nvarchar(MAX) | First name
02 | Lname | nvarchar(MAX) | Last name
Fig 20: Database Table for Rect temp
5.7.7 Database Table for Patient register
Sl No. | Variable | Data Type | Description
01 | Pid | int | Patient ID
02 | Fname | nvarchar(MAX) | First name
03 | Address | nvarchar(MAX) | Address
04 | Contact | nvarchar(MAX) | Contact
05 | Email | nvarchar(MAX) | Email ID
06 | Dob | nvarchar(MAX) | Date of birth
07 | Age | nvarchar(MAX) | Age
08 | Gender | nvarchar(MAX) | Gender
09 | MS | nvarchar(MAX) | Marital status
10 | Height | nvarchar(MAX) | Height
11 | Weight | nvarchar(MAX) | Weight
12 | H1 | nvarchar(MAX) | Habit
13 | Refdoctor | nvarchar(MAX) | Referring doctor
14 | Hosp_Name | nvarchar(MAX) | Hospital name
15 | Dis1 | nvarchar(MAX) | Disease
16 | H2 | nvarchar(MAX) | Habit
17 | H3 | nvarchar(MAX) | Habit
18 | H4 | nvarchar(MAX) | Habit
19 | Dis2 | nvarchar(MAX) | Disease
20 | Dis3 | nvarchar(MAX) | Disease
Fig 21: Database Table for Patient register
5.7.8 Database Table for P temp
Sl No. | Variable | Data Type | Description
01 | Pid | nvarchar(MAX) | Patient ID
02 | Pname | nvarchar(MAX) | Patient name

Fig 22: Database Table for P temp
5.7.9 Database Table for test for patient
Sl No. | Variable | Data Type | Description
01 | Pid | int | Patient ID
02 | Pname | nvarchar(MAX) | Patient name
03 | Tname | nvarchar(MAX) | Test name
04 | Ndata | nvarchar(MAX) | Normal data
05 | Tdata | nvarchar(MAX) | Test data
Fig 23: Database Table for test for patient
6. IMPLEMENTATION
Coding
Main
function varargout = main(varargin)
% MAIN MATLAB code for main.fig
% MAIN, by itself, creates a new MAIN or raises the existing
% singleton*.

%
% H = MAIN returns the handle to a new MAIN or the handle to
% the existing singleton*.
%
% MAIN('CALLBACK',hObject,eventData,handles,...) calls the local
% function named CALLBACK in MAIN.M with the given input arguments.
%
% MAIN('Property','Value',...) creates a new MAIN or raises the
% existing singleton*. Starting from the left, property value pairs are
% applied to the GUI before main_OpeningFcn gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to main_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES

% Edit the above text to modify the response to help main

% Last Modified by GUIDE v2.5 23-May-2018 20:17:29

% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @main_OpeningFcn, ...
                   'gui_OutputFcn',  @main_OutputFcn, ...
                   'gui_LayoutFcn',  [], ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    % A leading string argument names the callback to dispatch to.
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT

% --- Executes just before main is made visible.
function main_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to main (see VARARGIN)

% Choose default command line output for main
handles.output = hObject;

% Update handles structure
guidata(hObject, handles);

% UIWAIT makes main wait for user response (see UIRESUME)
% uiwait(handles.figure1);
axes(handles.axes1)
matlabImage = imread( 'back.jpg' );
image(matlabImage)
axis off
axis image

%% K-means Segmentation (option: K Number of Segments)
% Alireza Asvadi
% http://www.a-asvadi.ir
% 2012
% Questions regarding the code may be directed to [email protected]
%% initialize

% --- Executes on button press in pushbutton3.
function pushbutton3_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton3 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
menu  % open the Menu screen

6.1 SCREEN SHOTS
Main

Admin login

Menu

Patient data loaded

Input

Missing values filled

Result

7. SOFTWARE TESTING
7.1 SOFTWARE TESTING APPROACHES
Software testing is a critical element of software quality assurance and represents the
final review of specification, design and coding.
TESTING GOALS
1. Testing is the process of executing a program with the intent of finding an error.
2. A good test case is one that has a high probability of finding an as yet
undiscovered error.
3. A successful test is one that uncovers an as yet undiscovered error.
These goals imply a marked change in viewpoint.
Testing cannot show the absence of defects; it can only show that errors are
present.
TEST CASE DESIGN
There are two methods of testing a product.
1. White box testing: Knowing the internal workings of a product, tests can be
conducted to ensure that internal operations perform according to specification
and that all internal components have been adequately exercised. It is a test
case design method that uses the control structure of the procedural design to
derive test cases. Basis path testing is a white box technique.
Basis Path Testing:
i. Flow graph notation
ii. Cyclomatic complexity
iii. Deriving test cases
iv. Graph matrices
Control Structure Testing:
i. Condition testing
ii. Data flow testing
iii. Loop testing
2. Black box testing: This focuses on the functional requirements of the
software. Knowing the specified function that a product has been designed to
perform, tests can be conducted that demonstrate each function is fully
operational while at the same time searching for errors in each function.
The main techniques of this type are:
i. Graph-based testing methods
ii. Equivalence partitioning
iii. Boundary value analysis
iv. Comparison testing
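Boundary value analysis, one of the black box techniques listed above, can be sketched as follows. The example is in Python for illustration; the validator `is_valid_glucose` and its accepted range of 40 to 400 mg/dL are hypothetical and are not taken from the project.

```python
# Hypothetical accepted range for a glucose input field, in mg/dL.
LOW, HIGH = 40, 400


def is_valid_glucose(value):
    """Hypothetical validator under test: accept values inside [LOW, HIGH]."""
    return LOW <= value <= HIGH


def boundary_values(low, high):
    """Values just below, on and just above each edge of the valid range."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


# Each case pairs an input with the validator's verdict.
CASES = [(v, is_valid_glucose(v)) for v in boundary_values(LOW, HIGH)]
```

The six inputs concentrate on the edges of the range because off-by-one mistakes in comparisons such as `<` versus `<=` show up there rather than in the middle of the range.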

7.3 Test Cases
Step | Input | Expected result | Actual result | Pass/Fail
FORM1: Login
1 | Enter User Name | Username should be entered | Actual result to be tested on a machine | Pass
2 | Enter Password | Password should be entered | |
3 | Click on Login Button | License Page should be opened | | Pass
  | Verification of User | If user already logged in then it should not open | | Fail
FORM2: Register
1 | Enter First Name | First Name should be entered | Actual result to be tested on a machine | Pass
2 | Enter Last Name | Last Name should be entered | |
3 | Enter Contact | Address should be entered | |
4 | Enter User Name | User Name should be entered | |
5 | Enter Password | Password should be entered | |
7 | Enter Email ID | Email should be entered | |
8 | Enter Type | Server or client should be entered | |
FORM3: Send Message
1 | Enter Message | Entering message to send | Actual result to be tested on a machine | Pass
2 | Enter Last Name | Enter Last Name | |
3 | Enter First Name | Enter Email ID to whom the message should be sent | |
4 | Detection method | Select detection method | |
5 | Encrypted message | Encrypt message | |
6 | Date | Date of sending message | |
FORM4: Receive Message
1 | Select M | Selecting Mail ID | Actual result to be tested on a machine | Pass
2 | Enter Key | Entering Key | |
3 | Decrypt | Decrypting message and getting back original message | |
8. CONCLUSION
Predictive analysis is a process that combines statistical methods, machine learning
algorithms and data, using both present and past records to extract knowledge and help
predict future events. We have implemented a Hadoop MapReduce based system for the Pima
Indian diabetes data set to find its missing values and to discover patterns in it.
This work suggests that the applied algorithms are able to impute missing values and to
identify patterns in the data set. In future work, pattern matching will be carried
out by applying these discovered patterns to the testing data set to predict diabetes
prevalence and the risk level associated with it.
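The missing-value step can be illustrated with a simple mean-imputation sketch. The text does not state which imputation technique the system uses, so column-mean imputation is an assumption here, as are the three features, the sample records and the use of `None` to mark a missing value. The code is Python, for illustration only.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumes every column has at least one observed value.
    """
    means = []
    for column in zip(*rows):
        observed = [x for x in column if x is not None]
        means.append(sum(observed) / len(observed))
    return [
        tuple(m if x is None else x for x, m in zip(row, means))
        for row in rows
    ]


# Hypothetical records: (glucose, blood pressure, BMI); None marks a missing value.
RAW = [(148, 72, 33.6), (85, None, 26.6), (None, 64, 23.3), (89, 66, None)]
FILLED = impute_column_means(RAW)
```

Observed values are left untouched and only the gaps are filled, so the completed records can then be passed to the pattern-discovery step.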

9. FUTURE ENHANCEMENTS
