data scientists

See the following -

Big Data Systems Are Making A Difference In The Fight Against Cancer

Ben Lorica | Forbes | January 17, 2014

As open source, big data tools enter the early stages of maturation, data engineers and data scientists will have many opportunities to use them to “work on stuff that matters”. Along those lines, computational biology and medicine are areas where skilled data professionals are already beginning to make an impact. [...]

Continuum Analytics Teams Up with Intel for Python Distribution Powered by Anaconda

Press Release | Continuum Analytics | September 8, 2016

Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python, is pleased to announce a technical collaboration with Intel resulting in the Intel® Distribution for Python powered by Anaconda. Intel Distribution for Python powered by Anaconda was recently announced by Intel and will be delivered as part of Intel® Parallel Studio XE 2017 software development suite. With a common distribution for the Open Data Science community that increases Python and R performance up to 100X, Intel has empowered enterprises to build a new generation of intelligent applications that drive immediate business value...

Data Scientists Need Their Own GitHub. Here are Four of the Best Options

Jordan Novet | Venture Beat | March 1, 2016

Imagine if a company’s three highly valued data scientists can happily work together without duplicating each other’s efforts and can easily call up the ingredients and results of each other’s previous work. That day has come. As the data scientist arms race continues, data scientists might want to join forces. Crazy idea, right?...

DocGraph Launches Linea

Press Release | DocGraph | June 1, 2015

DocGraph is launching a new web-based portal, Linea (http://www.docgraph.org/linea), to enable the health data science community to discover, aggregate and enrich new open healthcare datasets. DocGraph Linea is based on technology developed and contributed by Merck (known as MSD outside the United States and Canada). DocGraph Linea will provide data scientists with a socially enabled community open data platform that collects details about disparate healthcare datasets and allows the community to extend what data is available. Users will be able to search datasets, understand data lineage, view relationship matrices, add metadata, and see community algorithms.

How open source software is fighting COVID-19

Since the end of January, the [open source] community has contributed to thousands of open source repositories that mention coronavirus or COVID-19. These repositories consist of datasets, models, visualizations, web and mobile applications, and more, and the majority are written in JavaScript and Python. Previously, we shared information about several open hardware makers helping to stop the spread and suffering caused by the coronavirus. Here, we're sharing four (of many) examples of how the open source software community is responding to coronavirus and COVID-19, with the goal of celebrating the creators and the overall impact the open source community is making on the world right now.

IBM Announces Major Commitment to Advance Apache® Spark™, Calling it Potentially the Most Significant Open Source Project of the Next Decade

Press Release | IBM | June 15, 2015

IBM today announced a major commitment to Apache® Spark™, potentially the most important new open source project in a decade that is being defined by data. At the core of this commitment, IBM plans to embed Spark into its industry-leading Analytics and Commerce platforms, and to offer Spark as a service on IBM Cloud. IBM will also put more than 3,500 IBM researchers and developers to work on Spark-related projects at more than a dozen labs worldwide; donate its breakthrough IBM SystemML machine learning technology to the Spark open source ecosystem; and educate more than one million data scientists and data engineers on Spark.

Machine Learning in Healthcare: Part 3 - Time for a Hands-On Test

Every inpatient and outpatient EHR could theoretically be integrated with a machine learning platform to generate predictions, in order to alert clinicians about important events such as sepsis, pulmonary emboli, etc. This approach may become essential when genetic information is also included in the EHR, which would mandate more advanced computation. However, using machine learning and artificial intelligence (AI) in every EHR will be a significant undertaking: not only do subject matter experts and data scientists need to create and validate the models, but the models must also be re-tested over time and tested in a variety of patient populations. Models could change over time and might not work well in every healthcare system. Moreover, the predictive performance must be clinically, and not just statistically, significant; otherwise, the models will be another source of “alert fatigue.”

The Appeal of Graph Databases for Health Care

A lot of valuable data can be represented as graphs. Genealogical charts are a familiar example: they represent people as boxes, connected by lines that represent parent/child or marriage relationships. In mathematics and computer science, graphs have become a discipline all their own. Now their value for health care is emerging. Graph computing made a significant advance this past February in the form of a Graph Data Science (GDS) library for the free and open source Neo4j graph database. Graph databases are proving their value in clinical research and public health; I wonder whether they can also boost analytics for providers. This article explains what's special about graph databases, and some applications in health care highlighted by recent webinars offered by the Neo4j company.

University of Chicago Awarded $20 Million To Host COVID-19 Medical Imaging Center

Press Release | University of Chicago | August 7, 2020

A new center hosted at the University of Chicago, co-led by the largest medical imaging professional organizations in the country, will help tackle the ongoing COVID-19 pandemic by curating a massive database of medical images to help better understand and treat the disease. Led by Prof. Maryellen Giger of UChicago Medicine, the Medical Imaging and Data Resource Center (MIDRC) will create an open-source database with medical images from thousands of COVID-19 patients. The center will be funded by a two-year, $20 million contract from the National Institute of Biomedical Imaging and Bioengineering at the National Institutes of Health (NIH).

Why Data Scientists Love Kubernetes

Let's start with an uncontroversial point: Software developers and system operators love Kubernetes as a way to deploy and manage applications in Linux containers. Linux containers provide the foundation for reproducible builds and deployments, but Kubernetes and its ecosystem provide essential features that make containers great for running real applications... What you may not know is that Kubernetes also provides an unbeatable combination of features for working data scientists. The same features that streamline the software development workflow also support a data science workflow! To see why, let's first see what a data scientist's job looks like...
