The Pistoia Alliance, a global, not-for-profit alliance that works to lower barriers to innovation in life sciences R&D, is calling on the industry to improve collaborative efforts to use patient data to its full effect. In a series of keynote speeches delivered at The Pistoia Alliance's annual member conference in London, speakers from Amgen, Accenture and AstraZeneca discussed the need to connect outcomes data more closely with the R&D process, to help pharmaceutical companies focus their research efforts and deliver real benefits to patients. Building machine learning and deep learning systems and incorporating data from therapeutic interventions or diagnostics into R&D are technologically challenging, and would benefit significantly from industry-wide pre-competitive collaboration...
In early July 2023, the World Health Organization (WHO) issued its 2023 report on Emerging Technologies and Scientific Innovations: A Global Public Health Perspective. This insightful and detailed report is the result of strategic engagement with a panel of global health experts, using an online Delphi method, roundtable discussions, and key informant interviews. The report's purpose is to identify innovations in research and emerging technologies that have the potential to impact global health in the next five to ten years.
NumFOCUS, a nonprofit supporting better science through open code, and Tidelift today announced a partnership to support open source libraries critical to the Python data science and scientific computing ecosystem. NumPy, SciPy, and pandas, sponsored projects within NumFOCUS, are now part of the Tidelift Subscription. Working in collaboration with NumFOCUS, Tidelift financially supports the work of project maintainers to provide ongoing security updates, maintenance and code improvements, licensing verification and indemnification, and more to enterprise engineering and data science teams via a managed open source subscription from Tidelift.
Machine learning has been around for a long time. But in late 2022, advancements in deep learning and large language models started to change the game and come into the public eye. And people started thinking, "We love Open Source software, so let's have Open Source AI, too." But what is Open Source AI? And the answer is: we don't know yet. Machine learning models are not software. Software is written by humans, like me. Machine learning models are trained; they learn automatically from the input data provided by humans. When programmers want to fix a computer program, they know what they need: the source code. But if you want to fix a model, you need a lot more: the software used to train it, the data it was trained on, the plan for training it, and so forth. It is much more complex. And reproducing it exactly ranges from difficult to nearly impossible.
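To make that point concrete, here is a minimal, hypothetical sketch (not from the original piece; all file names, data, and hyperparameters are illustrative) of the three ingredients that together make up the "source" of a trained model: the training plan, the training data, and the training software. Sharing the resulting weights alone reveals none of them.

```python
# Minimal sketch: the "source" of a model is more than code.
# Reproducing the model requires all three pieces below, plus the exact
# environment; any one alone is not enough. (Illustrative only.)

import json
import numpy as np
from sklearn.linear_model import LogisticRegression

# 1. The training plan: hyperparameters and a fixed random seed.
plan = {"C": 1.0, "max_iter": 200, "seed": 42}

# 2. The training data: synthetic here, but in practice a dataset that
#    must itself be shared for the run to be repeatable.
rng = np.random.default_rng(plan["seed"])
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# 3. The training software: the code that turns data + plan into weights.
model = LogisticRegression(C=plan["C"], max_iter=plan["max_iter"])
model.fit(X, y)

# The learned weights are an artifact of all of the above.
print(json.dumps(plan))
print(model.coef_)
```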
In this article, I review some of the top open source business intelligence (BI) and reporting tools. In economies where the roles of big data and open data are ever-increasing, where do we turn to have our data analysed and presented in a precise and readable format? This list covers tools that help solve this problem. Two years ago I wrote about the top three. In this article, I will expand that list with a few more tools that were suggested by our readers. Note that this list is not exhaustive, and it is a mix of both business intelligence and reporting tools...
Artificial intelligence (AI) technologies are quickly transforming almost every sphere of our lives. From how we communicate to how we get around, we seem to be growing increasingly dependent on them. Because of these rapid advancements, massive amounts of talent and resources are dedicated to accelerating the growth of these technologies. Here is a list of the eight best open source AI technologies you can use to take your machine learning projects to the next level.
In 1998, I was part of SGI when we started moving to open source and open standards after having long been a proprietary company. Since then, other companies have also moved rapidly to working with open source, and the use and adoption of open source technologies have skyrocketed over the past few years. Today, company involvement in open source technologies is fairly mature and can be seen in the following trends...
One of the most dramatic shifts of recent years, one that is empowering epidemiologists to be more effective at their jobs, stems from improvements in data technologies. In the past, the old "relational" data model dictated that data had to be highly structured and, as a result, treated in distinct silos. This made it difficult, if not impossible, to analyze data from multiple sources to find correlations. Epidemiologists would spend many minutes or even hours waiting for each query to return results, which is unacceptable when you need to test dozens of hypotheses to understand and contain a fast-moving outbreak. (Imagine how you would feel if each of your Google searches took 45 minutes to return!) By contrast, using newer technologies, the same queries on the same hardware can run in seconds.
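As a rough illustration of the kind of cross-silo analysis described here, this is a minimal sketch (with made-up case-registry and lab-feed tables, not real data or any particular agency's schema) of joining records from two separate sources in memory and testing a simple hypothesis against the combined view.

```python
# Minimal sketch (illustrative data): joining records from two separate
# systems, a case registry and a lab feed, with pandas -- the kind of
# cross-source query that siloed, rigidly structured systems made slow.

import pandas as pd

cases = pd.DataFrame({
    "case_id": [1, 2, 3, 4],
    "region":  ["north", "north", "south", "south"],
    "onset":   pd.to_datetime(["2020-03-01", "2020-03-02",
                               "2020-03-02", "2020-03-05"]),
})

labs = pd.DataFrame({
    "case_id": [1, 2, 3, 4],
    "result":  ["positive", "negative", "positive", "positive"],
})

# Join the two sources and test a simple hypothesis: positives per region.
merged = cases.merge(labs, on="case_id")
print(merged[merged["result"] == "positive"]
      .groupby("region")["case_id"].count())
```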
Whether it's Google's headline-grabbing DeepMind AlphaGo victory or Apple's weaving of "using deep neural network technology" into iOS 10, deep learning and artificial intelligence are all the rage these days, promising to take applications to new heights in how they interact with us mere mortals. To go deeper (yes, I went there) on the subject, I reached out to Josh Patterson and Adam Gibson of the deep learning-focused company Skymind, creators of Deep Learning For Java (DL4J) and authors of the recently released O'Reilly book Deep Learning: A Practitioner's Approach...
Today, researchers and leaders from the Allen Institute for AI, Chan Zuckerberg Initiative (CZI), Georgetown University's Center for Security and Emerging Technology (CSET), Microsoft, and the National Library of Medicine (NLM) at the National Institutes of Health released the COVID-19 Open Research Dataset (CORD-19) of scholarly literature about COVID-19, SARS-CoV-2, and the Coronavirus group. Requested by The White House Office of Science and Technology Policy, the dataset represents the most extensive machine-readable Coronavirus literature collection available for data and text mining to date, with over 29,000 articles, more than 13,000 of which have full text.
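A dataset built for data and text mining invites a first exploratory pass over its metadata. The following is a minimal sketch, assuming the release ships a metadata CSV with "title" and "abstract" columns (the file and column names are assumptions for illustration, not guaranteed by the announcement).

```python
# Minimal sketch of a first text-mining pass over the CORD-19 metadata.
# Assumes a metadata CSV with "title" and "abstract" columns is available
# locally; check the actual release files before relying on these names.

import pandas as pd

meta = pd.read_csv("metadata.csv", usecols=["title", "abstract"])

# Keep only records that actually carry an abstract, then run a simple
# keyword filter as a stand-in for more serious text mining.
with_abstract = meta.dropna(subset=["abstract"])
hits = with_abstract[with_abstract["abstract"]
                     .str.contains("transmission", case=False)]

print(f"{len(with_abstract)} abstracts, {len(hits)} mention transmission")
```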
The value of open and interoperable metadata of scientific articles is increasingly being recognized, as demonstrated by the work of organizations such as Crossref, DataCite, and OpenCitations and by initiatives such as Metadata 2020 and the Initiative for Open Citations. At the same time, scientific articles are increasingly being made openly accessible, stimulated for instance by Plan S, AmeliCA, and recent developments in the US, as well as by the need for open access to coronavirus literature. In this post, we focus on a key issue at the interface of these two developments: the open availability of abstracts of scientific articles. Abstracts provide a summary of an article and are part of an article's metadata. We first discuss the many ways in which abstracts can be used, and we then explore their availability. The open availability of abstracts is surprisingly limited. This creates important obstacles to scientific literature search, bibliometric analysis, and automatic knowledge extraction.
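By way of illustration, an abstract counts as openly available when it appears in the article's openly accessible metadata record, for example the record Crossref exposes through its public REST API when the publisher has deposited the abstract. A minimal sketch of checking one record follows; the DOI is a placeholder to be replaced with a real one.

```python
# Minimal sketch: checking whether an abstract is openly available in an
# article's Crossref metadata record. The DOI below is a placeholder.

import requests

doi = "10.xxxx/example"  # placeholder; substitute a real DOI
resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)

if resp.status_code == 404:
    print("DOI not found in Crossref.")
else:
    resp.raise_for_status()
    # The "abstract" field is present only if the publisher deposited it.
    abstract = resp.json()["message"].get("abstract")
    if abstract:
        print("Abstract deposited:", abstract[:200], "...")
    else:
        print("No abstract in the Crossref record for this DOI.")
```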