Interesting article on big data in diagnosis

Jackson, Brian brian.jackson at ARUPLAB.COM
Wed Jun 8 16:10:00 UTC 2016

I'm disappointed. The NYTimes is usually one of the good guys in terms of high-quality science journalism, but this article is the sort of uncritical hype that I usually see from lesser sources. I suspect this is because the reporter in this case, John Markoff, specializes in covering the IT industry. I doubt any of the NYTimes' excellent health and medicine reporters would have touched this story, because there's really not much there in terms of newsworthy science:

1. The authors basically just demonstrated that web search logs contain a signal linking diagnosis-related searches to earlier symptom-related searches. That is sort of interesting, in the sense that you could use this approach to study consumer health behaviors, etc.

2. The initial symptoms used in the study were cherry-picked to be somewhat specific for pancreatic cancer, esp. when seen in combination. In principle you could do the same for other cancers, e.g. "large hard lump in my breast" or "worsening constipation and blood in stool", but by the time such symptoms become diagnostically specific, the cancers are often quite advanced.

3. The study provides zero new information about pancreatic cancer diagnosis, let alone ways to improve diagnosis.

4. It's not clear to me whether they tested their model on a different data set than the one they used to train it. If they did not, and unless I'm missing something in their methodology, that would be a pretty blatant violation of standard validation practice, and would invalidate their calculated performance measures (true-positive rate, etc.).

5. Cancer screening only works under a pretty narrow set of assumptions. These assumptions are satisfied for cervical and colorectal cancer, but not for the vast majority of cancers at this point in time. Framing the journal article around screening discredits the authors, and the same could be said about the news article and journalist.
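
To make point 4 concrete, here is a minimal Python sketch of why performance measured on the training data can't be trusted. It is not the authors' method or data: the labels are coin flips, so there is no real signal to find, and the names `make_rows` and `Memorizer` are hypothetical, made up for this illustration. A model that simply memorizes its training rows still looks perfect when scored on those same rows, and only a held-out set shows that it has learned nothing.

```python
import random


def make_rows(n, start_id, rng):
    # Each row is (user_id, label). Labels are coin flips, so by
    # construction there is nothing for any model to learn.
    return [(start_id + i, rng.randint(0, 1)) for i in range(n)]


class Memorizer:
    """Toy model: recalls training labels, otherwise guesses the majority class."""

    def fit(self, rows):
        self.seen = dict(rows)
        ones = sum(label for _, label in rows)
        self.default = 1 if ones * 2 >= len(rows) else 0
        return self

    def predict(self, user_id):
        return self.seen.get(user_id, self.default)


def accuracy(model, rows):
    hits = sum(model.predict(uid) == label for uid, label in rows)
    return hits / len(rows)


rng = random.Random(0)            # fixed seed so the run is repeatable
train = make_rows(200, 0, rng)    # user ids 0..199
test = make_rows(200, 1000, rng)  # disjoint user ids 1000..1199

model = Memorizer().fit(train)
train_acc = accuracy(model, train)
test_acc = accuracy(model, test)

print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on held-out data: {test_acc:.2f}")
```

The training-data accuracy is perfect because every training row is simply looked up, while the held-out accuracy should land near chance. Any true-positive or false-positive rate computed only on the training data would be inflated in the same way.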

--Brian Jackson

From: David L Meyers [mailto:dm0015 at COMCAST.NET]
Sent: Tuesday, June 07, 2016 8:51 PM
Subject: [IMPROVEDX] Interesting article on big data in diagnosis
David L Meyers, MD FACEP
Listserv Moderator/Board member
Society to Improve Diagnosis in Medicine
Save the Date: Diagnostic Error in Medicine, November 6-8, 2016, Los Angeles, CA
Save the Date: DEM-Europe, June 30-July 1, 2016, Rotterdam, The Netherlands


