R Language

See the following -

12 Open Source Tools for Natural Language Processing

Natural language processing (NLP), the technology that powers all the chatbots, voice assistants, predictive text, and other speech/text applications that permeate our lives, has evolved significantly in the last few years. There are a wide variety of open source NLP tools out there, so I decided to survey the landscape to help you plan your next voice- or text-based application. For this review, I focused on tools that use languages I'm familiar with, even though I'm not familiar with all the tools. (I didn't find a great selection of tools in the languages I'm not familiar with anyway.) That said, I excluded tools in three languages I am familiar with, for various reasons.

3 Open Source Alternatives to MATLAB

For many students in mathematics, physical sciences, engineering, economics, and other fields with a heavy numeric component, MATLAB is their first introduction to programming or scientific computing in general. It can be a good tool for learning, although in my experience many of the things that students and researchers alike use MATLAB for are not particularly demanding calculations and could easily be conducted with any number of basic scripting tools, with or without statistical or math-oriented packages. However, it does have a near ubiquity in many academic settings, bringing with it a large community of users familiar with the language, plugins, and capabilities in general...

5 Eclipse Tools for Processing and Visualizing Data

Gone are the days of scientists processing data by hand. Scientific tools are rapidly scaling to meet the increasing demands of their users, both in terms of complexity and sheer volumes of data. In various domains, highly sophisticated scientific workbenches have been developed to enable scientists and researchers to quickly make sense of their data in a reproducible way. Several scientific workbenches have been built on top of the Eclipse Rich Client Platform (RCP) framework and offer up open source environments for processing and visualizing data. The companies and institutions behind these workbenches got together to collaborate on these tools, and so the Eclipse Science Working Group was born...

Big Data Right Now: Five Trendy Open Source Technologies

Tim Gasper | Tech Crunch News | October 28, 2012

Big Data is on every CIO’s mind this quarter, and for good reason. Companies will have spent $4.3 billion on Big Data technologies by the end of 2012. Big Data is presently synonymous with technologies like Hadoop, and the “NoSQL” class of databases including Mongo (document stores) and Cassandra (key-values).

Data Science Jobs Report 2019: Python Way Up, Tensorflow Growing Rapidly, R Use Double SAS

In my ongoing quest to track The Popularity of Data Science Software, I've just updated my analysis of the job market. To save you from reading the entire tome, I'm reproducing that section here. One of the best ways to measure the popularity or market share of software for data science is to count the number of job advertisements that highlight knowledge of each as a requirement. Job ads are rich in information and are backed by money, so they are perhaps the best measure of how popular each software is now. Plots of change in job demand give us a good idea of what is likely to become more popular in the future.

How SAP Embraced R.

Ajay Ohri | Jigsaw Academy | April 17, 2012

SAP has joined the list of big companies embracing the R language. SAP has committed its latest products, including the in-memory device HANA and the newly launched Business Objects Predictive Analytics, to be tightly integrated with the algorithms and statistical libraries available in R...

Is Scholarly Use of R Beating SPSS Already?

One of us (Muenchen) has been tracking The Popularity of Data Science Software using a variety of different approaches. One approach is to use Google Scholar to count the number of scholarly articles found each year for each software. He chose Google Scholar since it searches "across many disciplines and sources: articles, theses, books, abstracts, and court opinions, from academic publishers, professional societies, online repositories, universities, and other web sites." Figure 1 shows the results from 1995 through 2016. Data collected in 2018 showed that while SPSS use dropped 39% from 2017 to 2018, its use was still 66% higher than R in 2018.

Machine Learning in Healthcare: Part 1 - Learn the Basics

This article is the first in a three-part series that will discuss how machine learning impacts healthcare. The first article will be an overview defining machine learning and explaining how it fits into the larger fields of data science and artificial intelligence. The second article will discuss machine learning tools available to the average healthcare worker. The third article will use a common open source machine learning software application to analyze a healthcare spreadsheet. Part I was written to help healthcare workers understand the fundamentals of machine learning and to make them aware that there are simple and affordable programs available that do not require programming skills or mathematics background...

Machine Learning in Healthcare: Part 2 - Tools Available to the Average Healthcare Worker

A variety of machine learning tools are now available that can be part of the armamentarium of many industries, including healthcare. Users can choose from expensive commercial applications such as Microsoft Azure Machine Learning Studio, SAS Artificial Intelligence Solutions, or IBM SPSS Modeler. Academic medical centers and universities commonly have licenses for commercial statistical/machine learning packages, so this may be their best choice. The purpose of this article is to discuss several free open source programs that should be of interest to anyone trying to learn more about machine learning, without the need to know a programming language or higher math.

Open Source Among Top 10 Insurance Technology Trends in Health IT for 2016

Press Release | X by 2 | February 2, 2016

Healthcare technology is shaking things up faster than ever before. Whether it’s the quicker pace or technology-resistant providers, it’s crucial for leaders to stay educated and up-to-speed on the industry’s top developments. Here are 10 insurance technology trends that should be top of mind for 2016... Open-source will continue to make inroads: Microsoft's recent acceptance of open-source technologies such as Hadoop, Spark and D3.js in its DBMS and BI offerings is a clear indication that vendors are having a hard time keeping closed-source software competitive.

Open Source Libraries for Health Analytics

Andy Oram | EMR & HIPAA | December 19, 2016

According to Health Catalyst’s Director of Data Science Levi Thatcher, the main author of the project, these tools are tried and tested. Many of them are based on popular free software libraries in the general machine learning space: he mentions in particular the Python Scikit-learn library and the R language’s caret and data.table libraries. The contribution of Health Catalyst is to build on these general tools to produce libraries tailored for the needs of health care facilities, with their unique populations, workflows, and billing needs. The company has used the libraries to deploy models related to operational, financial, and clinical questions. Eventually, Thatcher says, most of Health Catalyst’s applications will use predictive analytics based on healthcare.ai, and now other programmers can too...

Robert A. Muenchen

Robert A. Muenchen is the author of R for SAS and SPSS Users, and co-author of R for Stata Users and Introduction to Biomedical Data Science. He is also the creator of r4stats.com, a popular web site devoted to analyzing trends in data science software, reviewing such software, and helping people learn the R language.

Share Your Genetic Story with openSNP

With personal genomics services like 23andMe and deCODEme, we can ship away a cotton swab with some spit on it, and explore our genetic connections even more closely. If we open up and share that genetic data with one another, there's a lot we could discover about human phenotypes: how our height, eye color, and preferences for certain foods connect us and shape our lives and health.

The Open-Source Answer to Big Data

Brian Bloom | PCWorld | May 29, 2012

Open-source platforms for big data have exploded in popularity. And in the past few months, it seems like nearly everyone is feeling the fallout.

The Radical Potential Of Open Source Programming In Healthcare

Nicholas Filler | Healthcare IT News | May 21, 2015

...electronic health records pose interesting problems related to sorting through vast amounts of patient data. This is where open source programming languages come in, and they have the ability to radically change the medical landscape. So why aren’t EHRs receiving the same care that patients expect from their doctor? There are a variety of answers, but primarily it comes down to how the software interprets certain types of data within each record. There are a variety of software languages designed to calculate and sort through large amounts of data that have been out for years, and one of the most prominent languages is referred to as “R”.
