Open Source Projects Are Transforming Machine Learning and AI

Machine learning and artificial intelligence have quickly gained traction with the public through applications such as Apple’s Siri and Microsoft’s Cortana. The true promise of these disciplines, though, extends far beyond simple speech recognition performed on our smartphones.  New, open source tools are arriving that can run on affordable hardware and allow individuals and small organizations to perform prodigious data crunching and predictive tasks.

Case in Point: H2O.ai

H2O.ai, formerly known as 0xdata, has carved out a unique niche in the machine learning and artificial intelligence arena because its primary tools are free and open source. You can get the main H2O platform and Sparkling Water, a package that works with Apache Spark, by simply downloading them.

These tools operate under the Apache 2.0 license, one of the most flexible open source licenses available, and you can even run them on clusters powered by Amazon Web Services (AWS) and others for just a few hundred dollars. Never before has this kind of data sifting power been so affordable and easy to deploy.

H2O.ai’s Vinod Iyengar oversees product strategy at the company. In an interview, he discussed how we have reached a tipping point where anyone can wield the same kind of machine learning and artificial intelligence muscle that is used for everything from drug discovery to deep data analytics.

Iyengar emphasizes that hardware trends, not just software development, are making machine learning and artificial intelligence applications accessible for everyone. “In the last five years the cost of storage has come down dramatically, as has the cost of memory,” he said. “Additionally, anyone can leverage an advanced computing cluster on, say, Amazon Web Services, for a few hundred dollars. All of this means that organizations or individuals can take a whole lot of data and produce powerful predictions and insights from the large data sets without facing huge costs.”

“We are working to bring the power of AI to businesses,” Iyengar added. “Our machine learning platform features advanced algorithms that can be applied to specialized use cases and the wide variety of problems that organizations face. We really want to enable business transformation for our customers by building smart applications. Smart applications will require a platform that can lubricate the entire data science workflow.”

As an example of how the H2O platform is working in the field, Cisco uses it to analyze its huge data sets that track when customers have bought particular products — such as routers — and when they might logically be due for an upgrade or checkup.

Iyengar noted that H2O.ai is also working on a data science hub called Steam, which will eliminate all the DevOps work required to build and deploy artificial intelligence models. With Steam, developers and data scientists will be encouraged to compare models across teams and take them into production without the need for heavy engineering work on the backend.

Tech Giants Are Delivering Free, Open Tools

H2O.ai is definitely not the only company delivering free, open source machine learning and artificial intelligence tools. In fact, both Facebook CEO Mark Zuckerberg and Google CEO Sundar Pichai have been vocal about their recent contributions of open source artificial intelligence and machine learning tools.

In Google’s annual Founders’ Letter to stockholders, Pichai said, “[Artificial Intelligence] can help us in everything from accomplishing our daily tasks and travels to eventually tackling even bigger challenges like climate change and cancer diagnosis.”

Here are a few of the most notable recent open source contributions in this space made by companies including Google and Facebook:

  • Facebook has open sourced its machine learning system designed for artificial intelligence tasks at large scale. It's a proven platform in use at Facebook.
  • Google has open sourced a program called TensorFlow that it has spent years developing to support its AI software and other predictive and analytics programs. You can find out more about TensorFlow at its site, and it is the engine behind several Google tools you may already use, including Google Photos and the speech recognition found in the Google app.
  • Yahoo has released its key artificial intelligence (AI) software under an open source license. Its CaffeOnSpark tool is based on deep learning, a branch of artificial intelligence particularly useful in helping machines recognize human speech, or the contents of a photo or video.
  • IBM has announced that its proprietary machine learning program known as SystemML is freely available to share and modify through the Apache Software Foundation.
  • Microsoft has open sourced the artificial intelligence framework it uses to power speech recognition in its Cortana digital assistant and Skype Translator applications. It released its Computational Network Toolkit (CNTK) as an open source project on GitHub.

“The biggest thing that we’re focused on with artificial intelligence is building computer services that have better perception than people,” said Zuckerberg on a recent conference call. “I think it’s possible to get to that point in the next five to 10 years.”

To learn more about the promise of machine learning and artificial intelligence, watch a video featuring David Meyer, Chairman of the Board at OpenDaylight, a Collaborative Project at The Linux Foundation.

Open Source Projects Are Transforming Machine Learning and AI was authored by Sam Dean and published in It is being republished by Open Health News under the terms of the Creative Commons Attribution 3.0 License (CC BY 3.0). The original copy of the article can be found here.