
Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks

Pranav Rajpurkar PRANAVSR@CS.STANFORD.EDU
Awni Y. Hannun AWNI@CS.STANFORD.EDU
Masoumeh Haghpanahi MHAGHPANAHI@IRHYTHMTECH.COM
Codie Bourn CBOURN@IRHYTHMTECH.COM
Andrew Y. Ng ANG@CS.STANFORD.EDU

arXiv:1707.01836v1 [cs.CV] 6 Jul 2017

Abstract

We develop an algorithm which exceeds the performance of board-certified cardiologists in detecting a wide range of heart arrhythmias from electrocardiograms recorded with a single-lead wearable monitor. We build a dataset with more than 500 times as many unique patients as previously studied corpora. On this dataset, we train a 34-layer convolutional neural network which maps a sequence of ECG samples to a sequence of rhythm classes. Committees of board-certified cardiologists annotate a gold standard test set on which we compare the performance of our model to that of 6 other individual cardiologists. We exceed the average cardiologist performance in both recall (sensitivity) and precision (positive predictive value).

Figure 1. Our trained convolutional neural network correctly detecting the sinus rhythm (SINUS) and Atrial Fibrillation (AFIB) from this ECG recorded with a single-lead wearable heart monitor.
Authors contributed equally.
Project website at https://stanfordmlgroup.github.io/projects/ecg

1. Introduction

We develop a model which can diagnose irregular heart rhythms, also known as arrhythmias, from single-lead ECG signals better than a cardiologist. Key to exceeding expert performance is a deep convolutional network which can map a sequence of ECG samples to a sequence of arrhythmia annotations, along with a novel dataset two orders of magnitude larger than previous datasets of its kind.

Many heart diseases, including Myocardial Infarction, AV Block, Ventricular Tachycardia and Atrial Fibrillation, can all be diagnosed from ECG signals, with an estimated 300 million ECGs recorded annually (Heden et al., 1996). We investigate the task of arrhythmia detection from the ECG record. This is known to be a challenging task for computers but can usually be determined by an expert from a single, well-placed lead.

Arrhythmia detection from ECG recordings is usually performed by expert technicians and cardiologists given the high error rates of computerized interpretation. One study found that of all the computer predictions for non-sinus rhythms, only about 50% were correct (Shah & Rubin, 2007); in another study, only 1 out of every 7 presentations of second degree AV block was correctly recognized by the algorithm (Guglin & Thatai, 2006). To automatically detect heart arrhythmias in an ECG, an algorithm must implicitly recognize the distinct wave types and discern the complex relationships between them over time. This is difficult due to the variability in wave morphology between patients as well as the presence of noise.

We train a 34-layer convolutional neural network (CNN) to detect arrhythmias in arbitrary length ECG time-series. Figure 1 shows an example of an input to the model. In addition to classifying noise and the sinus rhythm, the network learns to classify and segment twelve arrhythmia types present in the time-series. The model is trained end-to-end on a single-lead ECG signal sampled at 200 Hz and a sequence of annotations for every second of the ECG as supervision. To make the optimization of such a deep model tractable, we use residual connections and batch-normalization (He et al., 2016b; Ioffe & Szegedy, 2015). The depth increases both the non-linearity of the computation as well as the size of the context window for each classification decision.

We construct a dataset 500 times larger than other datasets of its kind (Moody & Mark, 2001; Goldberger et al., 2000). One of the most popular previous datasets, the MIT-BIH corpus, contains ECG recordings from 47 unique patients. In contrast, we collect and annotate a dataset of about 30,000 unique patients from a pool of nearly 300,000 patients who have used the Zio Patch monitor¹ (Turakhia et al., 2013). We intentionally select patients exhibiting abnormal rhythms in order to make the class balance of the dataset more even and thus the likelihood of observing unusual heart-activity high.

We test our model against board-certified cardiologists. A committee of three cardiologists serves as gold-standard annotators for the 336 examples in the test set. Our model exceeds the individual expert performance on both recall (sensitivity) and precision (positive predictive value) on this test set.
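The 34-layer figure and the 200 Hz input quoted in the Introduction can be tied to the block structure that Section 2 spells out: 16 residual blocks with 2 convolutions each, every alternate block subsampling by 2, followed by a fully connected layer. The sketch below is only bookkeeping under stated assumptions. It assumes one initial convolution outside the residual blocks (which is consistent with the stated total of 33 convolutional layers), and the exact indexing of which blocks subsample and when k increments is hypothetical.

```python
SAMPLE_RATE_HZ = 200   # sampling rate stated in the text
NUM_BLOCKS = 16        # residual blocks stated in Section 2
CONVS_PER_BLOCK = 2    # convolutional layers per residual block
BASE_FILTERS = 64      # the layers have 64k filters


def block_schedule(num_blocks=NUM_BLOCKS):
    """Return a (filters, subsample_factor) pair for each residual block."""
    schedule = []
    for i in range(num_blocks):
        # ASSUMPTION: k is 1 for blocks 0-3, 2 for blocks 4-7, and so on.
        k = 1 + i // 4
        # ASSUMPTION: the odd-indexed blocks are the ones that subsample.
        subsample = 2 if i % 2 == 1 else 1
        schedule.append((BASE_FILTERS * k, subsample))
    return schedule


schedule = block_schedule()

# ASSUMPTION: one initial convolution precedes the residual blocks.
conv_layers = 1 + NUM_BLOCKS * CONVS_PER_BLOCK
total_layers = conv_layers + 1  # plus the fully connected layer

total_subsample = 1
for _, factor in schedule:
    total_subsample *= factor

seconds_per_prediction = total_subsample / SAMPLE_RATE_HZ

print("convolutional layers:", conv_layers)
print("total layers:", total_layers)
print("overall subsampling factor:", total_subsample)
print("seconds of signal per prediction:", seconds_per_prediction)
```

Under these assumptions the eight subsampling blocks give an overall factor of 2^8 = 256 samples per output, which at 200 Hz is 1.28 seconds of signal per prediction, in line with the paper's description of a prediction approximately once per second.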
¹ iRhythm Technologies, San Francisco, California

2. Model

Problem Formulation

The ECG arrhythmia detection task is a sequence-to-sequence task which takes as input an ECG signal $X = [x_1, \ldots, x_k]$ and outputs a sequence of labels $r = [r_1, \ldots, r_n]$, such that each $r_i$ can take on one of $m$ different rhythm classes. Each output label corresponds to a segment of the input. Together the output labels cover the full sequence.

For a single example in the training set, we optimize the cross-entropy objective function

$$L(X, r) = \frac{1}{n} \sum_{i=1}^{n} \log p(R = r_i \mid X)$$

where $p(\cdot)$ is the probability the network assigns to the $i$-th output taking on the value $r_i$.

Model Architecture and Training

We use a convolutional neural network for the sequence-to-sequence learning task. The high-level architecture of the network is shown in Figure 2. The network takes as input a time-series of raw ECG signal, and outputs a sequence of label predictions. The 30 second long ECG signal is sampled at 200 Hz, and the model outputs a new prediction once every second. We arrive at an architecture which is 33 layers of convolution followed by a fully connected layer and a softmax.

Figure 2. The architecture of the network. The first and last layer are special-cased due to the pre-activation residual blocks. Overall, the network contains 33 layers of convolution followed by a fully-connected layer and a softmax.

In order to make the optimization of such a network tractable, we employ shortcut connections in a similar manner to those found in the Residual Network architecture (He et al., 2015b). The shortcut connections between neural-network layers optimize training by allowing information to propagate well in very deep neural networks. Before the input is fed into the network, it is normalized using a robust normalization strategy. The network consists of 16 residual blocks with 2 convolutional layers per block. The convolutional layers all have a filter length of 16 and have 64k filters, where k starts out as 1 and is incremented every 4-th residual block. Every alternate residual block subsamples its inputs by a factor of 2, thus the original input is ultimately subsampled by a factor of $2^8$. When a residual block subsamples the input, the corresponding shortcut connections also subsample their input using a Max Pooling operation with the same subsample factor.

Before each convolutional layer we apply Batch Normalization (Ioffe & Szegedy, 2015) and a rectified linear activation, adopting the pre-activation block design (He et al., 2016a). The first and last layers of the network are special-cased due to this pre-activation block structure. We also apply Dropout (Srivastava et al., 2014) between the convolutional layers and after the non-linearity. The final fully connected layer and softmax activation produce a distribution over the 14 output classes for each time-step.

We train the networks from scratch, initializing the weights of the convolutional layers as in He et al. (2015a). We use the Adam (Kingma & Ba, 2014) optimizer with the default parameters and reduce the learning rate by a factor of 10 when the validation loss stops improving. We save the best model as evaluated on the validation set during the optimization process.

3. Data

Training

We collect and annotate a dataset of 64,121 ECG records from 29,163 patients. The ECG data is sampled at a frequency of 200 Hz and is collected from a single-lead, non-invasive and continuous monitoring device called the Zio Patch which has a wear period up to 14 days (Turakhia et al., 2013). Each ECG record in the training set is 30 seconds long and can contain more than one rhythm type. Each record is annotated by a clinical ECG expert: the expert highlights segments of the signal and marks each as corresponding to one of the 14 rhythm classes.

The 30 second records were annotated using a web-based ECG annotation tool designed for this work. Label annotations were done by a group of Certified Cardiographic Technicians who have completed extensive training in arrhythmia detection and a cardiographic certification examination by Cardiovascular Credentialing International. The technicians were guided through the interface before they could annotate records. All rhythms present in a strip were labeled from their corresponding onset to offset, resulting in full segmentation of the input ECG data. To improve labeling consistency among different annotators, specific rules were devised regarding each rhythm transition.

We split the dataset into a training and validation set. The training set contains 90% of the data. We split the dataset so that there is no patient overlap between the training and validation sets (as well as the test set described below).

Testing

We collect a test set of 336 records from 328 unique patients. For the test set, ground truth annotations for each record were obtained by a committee of three board-certified cardiologists; there are three committees responsible for different splits of the test set. The cardiologists discussed each individual record as a group and came to a consensus labeling. For each record in the test set we also collect 6 individual annotations from cardiologists not participating in the group. This is used to assess performance of the model compared to an individual cardiologist.

Rhythm Classes

We identify 12 heart arrhythmias, sinus rhythm and noise for a total of 14 output classes. The arrhythmias are characterized by a variety of features. Table 2 in the Appendix shows an example of each rhythm type we classify. The noise label is assigned when the device is disconnected from the skin or when the baseline noise in the ECG makes identification of the underlying rhythm impossible.

The morphology of the ECG during a single heart-beat as well as the pattern of the activity of the heart over time determine the underlying rhythm. In some cases the distinction between the rhythms can be subtle yet critical for treatment. For example, two forms of second degree AV Block, Mobitz I (Wenckebach) and Mobitz II (here referred to as AVB TYPE2), can be difficult to distinguish. Wenckebach is considered benign and Mobitz II is considered pathological, requiring immediate attention (Dubin, 1996).

Table 2 in the Appendix also shows the number of unique patients in the training (including validation) set and test set for each rhythm type.

4. Results

Evaluation Metrics

We use two metrics to measure model accuracy, using the cardiologist committee annotations as the ground truth.

Sequence Level Accuracy (F1): We measure the average overlap between the prediction and the ground truth sequence labels. For every record, a model is required to make a prediction approximately once per second (every 256 samples). These predictions are compared against the ground truth annotation.

Set Level Accuracy (F1): Instead of treating the labels for a record as a sequence, we consider the set of unique arrhythmias present in each 30 second record as the ground truth annotation. Set Level Accuracy, unlike Sequence Level Accuracy, does not penalize for time-misalignment within a record. We report the F1 score between the unique class labels from the ground truth and those from the model prediction.

In both the Sequence and the Set case, we compute the F1 score for each class separately. We then compute the overall F1 (and precision and recall) as the class-frequency weighted mean.

Model vs. Cardiologist Performance

We assess the cardiologist performance on the test set. Recall that each of the records in the test set has a ground truth label from a committee of three cardiologists as well as individual labels from a disjoint set of 6 other cardiologists. To assess cardiologist performance for each class, we take the average of all the individual cardiologist F1 scores using the group label as the ground truth annotation.

Table 1 shows the breakdown of both cardiologist and model scores across the different rhythm classes. The model outperforms the average cardiologist performance on most rhythms, noticeably outperforming the cardiologists in the AV Block set of arrhythmias which includes Mobitz I (Wenckebach), Mobitz II (AVB Type2) and complete heart block (CHB). This is especially useful given the severity of Mobitz II and complete heart block and the importance of distinguishing these two from Wenckebach which is usually considered benign.

Table 1 also compares the aggregate precision, recall and F1 for both model and cardiologist compared to the ground truth annotations. The aggregate scores for the cardiologist are computed by taking the mean of the individual cardiologist scores. The model outperforms the cardiologist average in both precision and recall.

                            Seq                  Set
                        Model   Cardiol.     Model   Cardiol.
Class-level F1 Score
AFIB                    0.604   0.515        0.667   0.544
AFL                     0.687   0.635        0.679   0.646
AVB TYPE2               0.689   0.535        0.656   0.529
BIGEMINY                0.897   0.837        0.870   0.849
CHB                     0.843   0.701        0.852   0.685
EAR                     0.519   0.476        0.571   0.529
IVR                     0.761   0.632        0.774   0.720
JUNCTIONAL              0.670   0.684        0.783   0.674
NOISE                   0.823   0.768        0.704   0.689
SINUS                   0.879   0.847        0.939   0.907
SVT                     0.477   0.449        0.658   0.556
TRIGEMINY               0.908   0.843        0.870   0.816
VT                      0.506   0.566        0.694   0.769
WENCKEBACH              0.709   0.593        0.806   0.736
Aggregate Results
Precision (PPV)         0.800   0.723        0.809   0.763
Recall (Sensitivity)    0.784   0.724        0.827   0.744
F1                      0.776   0.719        0.809   0.751

Table 1. The top part of the table gives a class-level comparison of the expert to the model F1 score for both the Sequence and the Set metrics. The bottom part of the table shows aggregate results over the full test set for precision, recall and F1 for both the Sequence and Set metrics.

Figure 3. Evaluated on the test set, the model outperforms the average cardiologist score on both the Sequence and the Set F1 metrics.

5. Analysis

The model outperforms the average cardiologist score on both the sequence and the set F1 metrics. Figure 4 shows a confusion matrix of the model predictions on the test set. Many arrhythmias are confused with the sinus rhythm. We expect that part of this is due to the sometimes ambiguous location of the exact onset and offset of the arrhythmia in the ECG record.

Often the mistakes made by the model are understandable. For example, confusing Wenckebach and AVB Type2 makes sense given that the two rhythms in general have very similar ECG morphologies. Similarly, Supraventricular Tachycardia (SVT) and Atrial Fibrillation (AFIB) are often confused with Atrial Flutter (AFL), which is understandable given that they are all atrial arrhythmias. We also note that Idioventricular Rhythm (IVR) is sometimes mistaken for Ventricular Tachycardia (VT), which again makes sense given that the two only differ in heart-rate and are difficult to distinguish close to the 100 beats per minute delineation.
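The Sequence and Set F1 metrics of Section 4, together with the class-frequency weighted mean, can be illustrated with a small sketch. This is one plausible reading, not the authors' evaluation code: the Sequence metric is taken here as a per-class F1 over aligned output steps, the Set metric as a per-class F1 over the sets of unique labels per record, and "class frequency" is assumed to mean how often a class occurs in the ground truth. All function names are hypothetical.

```python
from collections import Counter


def _f1(tp, fp, fn):
    """F1 from raw counts; 0.0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def class_f1(truth_items, pred_items):
    """Per-class F1 and ground-truth support over parallel lists of label sets."""
    tp, fp, fn, support = Counter(), Counter(), Counter(), Counter()
    for truth, pred in zip(truth_items, pred_items):
        for c in truth & pred:
            tp[c] += 1
        for c in pred - truth:
            fp[c] += 1
        for c in truth - pred:
            fn[c] += 1
        for c in truth:
            support[c] += 1
    classes = set(tp) | set(fp) | set(fn)
    scores = {c: _f1(tp[c], fp[c], fn[c]) for c in classes}
    return scores, support


def sequence_f1(truth_seqs, pred_seqs):
    """Sequence metric: every output step is compared with its aligned label."""
    truth_items = [{t} for seq in truth_seqs for t in seq]
    pred_items = [{p} for seq in pred_seqs for p in seq]
    return class_f1(truth_items, pred_items)


def set_f1(truth_seqs, pred_seqs):
    """Set metric: only the unique labels present in each record are compared."""
    return class_f1([set(s) for s in truth_seqs], [set(s) for s in pred_seqs])


def weighted_f1(scores, support):
    """ASSUMPTION: weights are ground-truth class frequencies (support)."""
    total = sum(support.values())
    if not total:
        return 0.0
    return sum(scores[c] * support[c] for c in scores) / total


# Toy data: two records, four output steps each (illustrative only).
truth = [["SINUS", "SINUS", "AFIB", "AFIB"], ["SINUS"] * 4]
pred = [["SINUS", "AFIB", "AFIB", "AFIB"], ["SINUS"] * 4]

seq_scores, seq_support = sequence_f1(truth, pred)
set_scores, set_support = set_f1(truth, pred)
print("sequence F1:", weighted_f1(seq_scores, seq_support))
print("set F1:", weighted_f1(set_scores, set_support))
```

In the toy data the AFIB onset is predicted one step early. That misalignment lowers the Sequence scores, while the Set scores are unaffected because both records contain exactly the right set of rhythms.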
One of the most common confusions is between Ectopic Atrial Rhythm (EAR) and sinus rhythm. The main distinguishing criterion for this rhythm is an irregular P wave. This can be subtle to detect especially when the P wave has a small amplitude or when noise is present in the signal.

[Figure 4: confusion matrix over the 14 rhythm classes (AFIB, AFL, AVB_TYPE2, BIGEMINY, CHB, EAR, IVR, JUNCTIONAL, NOISE, SINUS, SVT, TRIGEMINY, VT, WENCKEBACH); vertical axis: True label; horizontal axis: Predicted label; color scale from 0.0 to 1.0.]

Figure 4. A confusion matrix for the model predictions on the test set. Many of the mistakes the model makes are not surprising. For example, confusing second degree AV Block (Type 2) with Wenckebach makes sense given the often similar expression of the two arrhythmias in the ECG record.

6. Related Work

Automatic high-accuracy methods for R-peak extraction have existed at least since the mid 1980s (Pan & Tompkins, 1985). Current algorithms for R-peak extraction tend to use wavelet transformations to compute features from the raw ECG followed by finely-tuned threshold based classifiers (Li et al., 1995; Martínez et al., 2004). Because accurate estimates of heart rate and heart rate variability can be extracted from R-peak features, feature-engineered algorithms are often used for coarse-grained heart rhythm classification, including detecting tachycardias (fast heart rate), bradycardias (slow heart rate), and irregular rhythms. However, such features alone are not sufficient to distinguish between most heart arrhythmias since features based on the atrial activity of the heart as well as other features pertaining to the QRS morphology are needed.

Much work has been done to automate the extraction of other features from the ECG. For example, beat classification is a common sub-problem of heart-arrhythmia classification. Drawing inspiration from automatic speech recognition, Hidden Markov models with Gaussian observation probability distributions have been applied to the task of beat detection (Coast et al., 1990). Artificial neural networks have also been used for the task of beat detection (Melo et al., 2000). While these models have achieved high accuracy for some beat types, they are not yet sufficient for high-accuracy heart arrhythmia classification and segmentation. For example, Artis et al. (1991) train a neural network to distinguish between Atrial Fibrillation and Sinus Rhythm on the MIT-BIH dataset. While the network can distinguish between these two classes with high accuracy, it does not generalize to noisier single-lead recordings or classify among the full range of 15 rhythms available in MIT-BIH. This is in part due to insufficient training data, and because the model also discards critical information in the feature extraction stage.

The most common dataset used to design and evaluate ECG algorithms is the MIT-BIH arrhythmia database (Moody & Mark, 2001) which consists of 48 half-hour strips of ECG data. Other commonly used datasets include the MIT-BIH Atrial Fibrillation dataset (Moody & Mark, 1983) and the QT dataset (Laguna et al., 1997). While useful benchmarks for R-peak extraction and beat-level annotations, these datasets are too small for fine-grained arrhythmia classification. The number of unique patients is in the single digit hundreds or fewer for these benchmarks. A recently released dataset captured from the AliveCor ECG monitor contains about 7000 records (Clifford et al., 2017). These records only have annotations for Atrial Fibrillation; all other arrhythmias are grouped into a single bucket. The dataset we develop contains 29,163 unique patients and 14 classes with hundreds of unique examples for the rarest arrhythmias.

Machine learning models based on deep neural networks have consistently been able to approach and often exceed human agreement rates when large annotated datasets are available (Amodei et al., 2016; Xiong et al., 2016; He et al., 2015c). These approaches have also proven to be effective in healthcare applications, particularly in medical imaging where pretrained ImageNet models can be applied (Esteva et al., 2017; Gulshan et al., 2016). We draw on work in automatic speech recognition for processing time-series with deep convolutional neural networks and recurrent neural networks (Hannun et al., 2014; Sainath et al., 2013), and techniques in deep learning to make the optimization of these models tractable (He et al., 2016b;c; Ioffe & Szegedy, 2015).

7. Conclusion

We develop a model which exceeds the cardiologist performance in detecting a wide range of heart arrhythmias from single-lead ECG records. Key to the performance of the model is a large annotated dataset and a very deep convolutional network which can map a sequence of ECG samples to a sequence of arrhythmia annotations.

On the clinical side, future work should investigate extending the set of arrhythmias and other forms of heart disease which can be automatically detected with high accuracy from single or multiple lead ECG records. For example, we do not detect Ventricular Flutter or Fibrillation. We also do not detect Left or Right Ventricular Hypertrophy, Myocardial Infarction or a number of other heart diseases which do not necessarily exhibit as arrhythmias. Some of these may be difficult or even impossible to detect on a single-lead ECG but can often be seen on a multiple-lead ECG.

Given that more than 300 million ECGs are recorded annually, high-accuracy diagnosis from ECG can save expert clinicians and cardiologists considerable time and decrease the number of misdiagnoses. Furthermore, we hope that this technology coupled with low-cost ECG devices enables more widespread use of the ECG as a diagnostic tool in places where access to a cardiologist is difficult.

Acknowledgements

We thank Geoffrey H. Tison MD, MPH of UCSF for helpful feedback on the experiments and references.

References

Amodei, Dario, Anubhai, Rishita, Battenberg, Eric, Case, Carl, Casper, Jared, Catanzaro, Bryan, Chen, JingDong, Chrzanowski, Mike, Coates, Adam, Diamos, Greg, et al. Deep speech 2: End-to-end speech recognition in english and mandarin. In Proceedings of The 33rd International Conference on Machine Learning, pp. 173-182, 2016.

Artis, Shane G, Mark, RG, and Moody, GB. Detection of atrial fibrillation using artificial neural networks. In Computers in Cardiology 1991, Proceedings., pp. 173-176. IEEE, 1991.

Clifford, GD, Liu, CY, Moody, B, Lehman, L, Silva, I, Li, Q, Johnson, AEW, and Mark, RG. Af classification from a short single lead ecg recording: The physionet computing in cardiology challenge 2017. 2017.

Coast, Douglas A, Stern, Richard M, Cano, Gerald G, and Briller, Stanley A. An approach to cardiac arrhythmia analysis using hidden markov models. IEEE Transactions on biomedical Engineering, 37(9):826-836, 1990.

Dubin, Dale. Rapid Interpretation of EKGs. USA: Cover Publishing Company, 1996.

Esteva, Andre, Kuprel, Brett, Novoa, Roberto A, Ko, Justin, Swetter, Susan M, Blau, Helen M, and Thrun, Sebastian. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639):115-118, 2017.

Goldberger, Ary L, Amaral, Luis AN, Glass, Leon, Hausdorff, Jeffrey M, Ivanov, Plamen Ch, Mark, Roger G, Mietus, Joseph E, Moody, George B, Peng, Chung-Kang, and Stanley, H Eugene. Physiobank, physiotoolkit, and physionet components of a new research resource for complex physiologic signals. Circulation, 101(23):e215-e220, 2000.

Guglin, Maya E and Thatai, Deepak. Common errors in computer electrocardiogram interpretation. International journal of cardiology, 106(2):232-237, 2006.

Gulshan, Varun, Peng, Lily, Coram, Marc, Stumpe, Martin C, Wu, Derek, Narayanaswamy, Arunachalam, Venugopalan, Subhashini, Widner, Kasumi, Madams, Tom, Cuadros, Jorge, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA, 316(22):2402-2410, 2016.

Hannun, Awni Y., Case, Carl, Casper, Jared, Catanzaro, Bryan, Diamos, Greg, Elsen, Erich, Prenger, Ryan, Satheesh, Sanjeev, Sengupta, Shubho, Coates, Adam, and Ng, Andrew Y. Deep speech: Scaling up end-to-end speech recognition. abs/1412.5567, 2014. URL http://arxiv.org/abs/1412.5567.

He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. CoRR, abs/1502.01852, 2015a. URL http://arxiv.org/abs/1502.01852.

He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015b. URL http://arxiv.org/abs/1512.03385.

He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pp. 1026-1034, 2015c.

He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Identity mappings in deep residual networks. CoRR, abs/1603.05027, 2016a. URL http://arxiv.org/abs/1603.05027.

He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016b.
He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Identity mappings in deep residual networks. In European Conference on Computer Vision, pp. 630-645. Springer, 2016c.

Heden, Bo, Ohlsson, Mattias, Holst, Holger, Mjoman, Mattias, Rittner, Ralf, Pahlm, Olle, Peterson, Carsten, and Edenbrandt, Lars. Detection of frequently overlooked electrocardiographic lead reversals using artificial neural networks. The American journal of cardiology, 78(5):600-604, 1996.

Ioffe, Sergey and Szegedy, Christian. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.

Kingma, Diederik and Ba, Jimmy. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

Laguna, Pablo, Mark, Roger G, Goldberg, A, and Moody, George B. A database for evaluation of algorithms for measurement of qt and other waveform intervals in the ecg. In Computers in Cardiology 1997, pp. 673-676. IEEE, 1997.

Li, Cuiwei, Zheng, Chongxun, and Tai, Changfeng. Detection of ECG characteristic points using wavelet transforms. IEEE Transactions on biomedical Engineering, 42(1):21-28, 1995.

Martínez, Juan Pablo, Almeida, Rute, Olmos, Salvador, Rocha, Ana Paula, and Laguna, Pablo. A wavelet-based ECG delineator: evaluation on standard databases. IEEE Transactions on biomedical engineering, 51(4):570-581, 2004.

Melo, SL, Caloba, LP, and Nadal, J. Arrhythmia analysis using artificial neural network and decimated electrocardiographic data. In Computers in Cardiology 2000, pp. 73-76. IEEE, 2000.

Moody, George B and Mark, Roger G. A new method for detecting atrial fibrillation using RR intervals. Computers in Cardiology, 10(1):227-230, 1983.

Moody, George B and Mark, Roger G. The impact of the MIT-BIH arrhythmia database. IEEE Engineering in Medicine and Biology Magazine, 20(3):45-50, 2001.

Pan, Jiapu and Tompkins, Willis J. A real-time QRS detection algorithm. IEEE transactions on biomedical engineering, (3):230-236, 1985.

Sainath, Tara N, Mohamed, Abdel-rahman, Kingsbury, Brian, and Ramabhadran, Bhuvana. Deep convolutional neural networks for lvcsr. In Acoustics, speech and signal processing (ICASSP), 2013 IEEE international conference on, pp. 8614-8618. IEEE, 2013.

Shah, Atman P and Rubin, Stanley A. Errors in the computerized electrocardiogram interpretation of cardiac rhythm. Journal of electrocardiology, 40(5):385-390, 2007.

Srivastava, Nitish, Hinton, Geoffrey E, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929-1958, 2014.

Turakhia, Mintu P, Hoang, Donald D, Zimetbaum, Peter, Miller, Jared D, Froelicher, Victor F, Kumar, Uday N, Xu, Xiangyan, Yang, Felix, and Heidenreich, Paul A. Diagnostic utility of a novel leadless arrhythmia monitoring device. The American journal of cardiology, 112(4):520-524, 2013.

Xiong, Wayne, Droppo, Jasha, Huang, Xuedong, Seide, Frank, Seltzer, Mike, Stolcke, Andreas, Yu, Dong, and Zweig, Geoffrey. Achieving human parity in conversational speech recognition. arXiv preprint arXiv:1610.05256, 2016.
Appendix

Class         Description                                  Train + Val Patients   Test Patients
AFIB          Atrial Fibrillation                          4638                   44
AFL           Atrial Flutter                               3805                   20
AVB TYPE2     Second degree AV Block Type 2 (Mobitz II)    1905                   28
BIGEMINY      Ventricular Bigeminy                         2855                   22
CHB           Complete Heart Block                         843                    26
EAR           Ectopic Atrial Rhythm                        2623                   22
IVR           Idioventricular Rhythm                       1962                   34
JUNCTIONAL    Junctional Rhythm                            2030                   36
NOISE         Noise                                        9940                   41
SINUS         Sinus Rhythm                                 22156                  215
SVT           Supraventricular Tachycardia                 6301                   34
TRIGEMINY     Ventricular Trigeminy                        2864                   21
VT            Ventricular Tachycardia                      4827                   17
WENCKEBACH    Wenckebach (Mobitz I)                        2051                   29

[Example column: one ECG trace image per class.]

Table 2. A list of all of the rhythm types which the model classifies. For each rhythm we give the label name, a more descriptive name and an example chosen from the training set. We also give the total number of patients with each rhythm for both the training and test sets.
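Table 2 counts patients rather than records, and Section 3 states that the training and validation sets share no patients. A minimal sketch of a patient-grouped split is given below. It is an assumption about how such a split could be done, not the authors' procedure: the function name and record layout are hypothetical, and applying the 90% fraction to patients (so that the share of records is only approximately 90%) is a simplification.

```python
import random


def split_by_patient(records, train_fraction=0.9, seed=0):
    """Split (record_id, patient_id) pairs so that no patient is on both sides.

    Patients, not records, are shuffled and partitioned; every record then
    follows its patient into the training or the validation set.
    """
    patients = sorted({patient_id for _, patient_id in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(patients)

    n_train = round(train_fraction * len(patients))
    train_patients = set(patients[:n_train])

    train = [r for r in records if r[1] in train_patients]
    val = [r for r in records if r[1] not in train_patients]
    return train, val


# Toy data: 250 records spread over 100 hypothetical patients.
records = [(f"rec{i}", f"pat{i % 100}") for i in range(250)]
train, val = split_by_patient(records)

train_ids = {patient_id for _, patient_id in train}
val_ids = {patient_id for _, patient_id in val}
print("patients in train:", len(train_ids))
print("patients in validation:", len(val_ids))
print("shared patients:", len(train_ids & val_ids))
```

Splitting at the patient level matters because one patient can contribute several records; a record-level shuffle could place recordings of the same heart on both sides of the split.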
