The ‘Unsolved’ Problems in Machine Learning

2022-09-23 20:57:28 By : Ms. Anne zhang

While artificial intelligence and machine learning are solving many real-world problems, several “unsolved” problems in these fields remain poorly understood because of fundamental limitations that have yet to be resolved. Developers dig deep into various domains of machine learning and come up with small, incremental improvements, but challenges to further advancement persist.

A recent discussion on Reddit brought together several developers from the AI/ML landscape to talk about some of these “important” and “unsolved” problems which, when solved, are likely to pave the way for significant improvements in these fields.

Arguably, the most important aspect of creating a machine learning model is gathering information from reliable and abundant sources. Beginners in machine learning who previously worked as computer scientists face the difficulty of working with imperfect or incomplete information, which is inevitable in the field.

“Given that many computer scientists and software engineers work in a relatively clean and certain environment, it can be surprising that machine learning makes heavy use of probability theory,” said Andyk Maulana in his book series, ‘Adaptive Computation and Machine Learning’.

Three major sources of uncertainty in machine learning are:

For more on this topic, see the research paper by Francesca Tavazza on uncertainty prediction for machine learning models.
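One common way to put a number on a model's uncertainty is to train several copies of it on resampled data and look at how much their predictions disagree. The sketch below is only an illustration of that idea, not the method from the paper mentioned above: the data generator `make_data`, the through-the-origin linear model, and all parameter values are assumptions made for the example.

```python
import random
import statistics


def make_data(n=30, seed=1):
    """Hypothetical noisy data: y = 2x + Gaussian noise, x in [0, 1]."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 0.5)) for x in xs]


def fit_slope(points):
    """Least-squares slope of a line through the origin (y = slope * x)."""
    num = sum(x * y for x, y in points)
    den = sum(x * x for x, _ in points)
    return num / den


def bootstrap_predictions(points, x_new, n_models=200, seed=0):
    """Fit one model per bootstrap resample and predict at x_new."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        # Resample the dataset with replacement, then refit.
        sample = [rng.choice(points) for _ in points]
        preds.append(fit_slope(sample) * x_new)
    return preds


if __name__ == "__main__":
    data = make_data()
    for x_new in (0.5, 5.0):
        preds = bootstrap_predictions(data, x_new)
        print(x_new, statistics.mean(preds), statistics.pstdev(preds))
```

The spread of the predictions grows as `x_new` moves away from the range covered by the training data, which is one way incomplete information shows up as uncertainty in a model's output.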

Training a model and then running inference with it requires a large amount of resources. The goals of reducing the convergence time of neural networks and of running on low-resource systems work against each other. Developers might be able to build tech that is groundbreaking in its applications but requires huge amounts of resources such as hardware, power, and storage.

For example, language models require vast amounts of data. The ultimate goal of human-level interaction requires training on a massive scale, which means longer convergence times and greater resource requirements for training.

A key factor in the development of machine learning algorithms is scaling up the amount of input data, which arguably increases the accuracy of a model. But the recent success of deep learning models shows that achieving this depends on stronger processors and more resources, so developers end up continuously juggling the two problems.
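To make "convergence time" concrete, it helps to count how many update steps an optimiser needs before it gets close to the minimum, since every step costs compute. The following is a minimal sketch on a toy one-dimensional loss, (w - target)^2, which is an assumption chosen for illustration; real neural networks behave far less predictably, and the function name and tolerance are hypothetical.

```python
def steps_to_converge(learning_rate, target=3.0, start=0.0,
                      tol=1e-6, max_steps=100_000):
    """Count gradient-descent steps until w is within tol of the minimum
    of the toy loss (w - target) ** 2."""
    w = start
    for step in range(1, max_steps + 1):
        grad = 2.0 * (w - target)   # derivative of (w - target) ** 2
        w -= learning_rate * grad   # plain gradient-descent update
        if abs(w - target) < tol:
            return step
    return max_steps                # did not converge within the budget


if __name__ == "__main__":
    for lr in (0.01, 0.1, 0.4):
        print(lr, steps_to_converge(lr))
```

On this toy problem a larger learning rate reaches the tolerance in fewer steps, but that shortcut does not carry over directly to large models, where data volume and model size dominate the cost of each step.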


Recent text-to-image generators like DALL-E or Midjourney showcase what overfitting to input and training data can look like.

DALL-E 2 🤯 "Teddy bears working on new AI research on the moon in the 1980s" Deep learning continues to overfit… https://t.co/AVPpJv9hPz pic.twitter.com/lDXmr9Eg8O

Overfitting, which can also result from noise in the data, occurs when a learning model picks up random fluctuations in the training data and treats them as genuine concepts, which causes errors and impairs the model’s ability to generalise.
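The definition above can be seen in a small experiment. The sketch below is illustrative only and rests on assumptions: the data are synthetic, 20% of the labels are flipped at random to act as noise, and the model is a hand-written nearest-neighbour classifier. With k=1 the model memorises every training point, noise included, so its training accuracy is perfect while its accuracy on held-out data is what actually reflects generalisation.

```python
import random


def make_noisy_data(n, rng, flip_prob=0.2):
    """Hypothetical data: label is 1 when x > 0, with random label flips."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        label = 1 if x > 0 else 0
        if rng.random() < flip_prob:
            label = 1 - label       # inject label noise
        data.append((x, label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > k else 0


def accuracy(train, data, k):
    correct = sum(1 for x, y in data if knn_predict(train, x, k) == y)
    return correct / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    train = make_noisy_data(200, rng)
    held_out = make_noisy_data(200, rng)
    for k in (1, 15):
        print(k, accuracy(train, train, k), accuracy(train, held_out, k))
```

Comparing the training and held-out columns is the point of the exercise: a large gap between them is a typical sign that the model has fitted the noise rather than the underlying pattern.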

To counter this problem, most non-parametric and non-linear models include techniques and parameters that limit how much detail the model learns. Even then, in practice, fitting a model perfectly to a dataset is a difficult task. Two suggested techniques to limit overfitting are:

Causal inference comes naturally to humans. Machine learning algorithms like deep neural networks are great at analysing patterns in huge datasets but struggle to make causal inferences. This shows up in fields like computer vision, robotics, and self-driving cars, where models, though capable of recognising patterns, do not comprehend the physical properties of objects in their environment. As a result, they make predictions about situations but cannot actively deal with novel ones.
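The gap between recognising a pattern and understanding a cause can be shown with a simulated confounder. In the sketch below, which is a toy construction and not taken from any of the cited work, a hidden variable `z` drives both `x` and `y`. The two are strongly correlated in observational data, yet setting `x` from outside (an intervention) has no effect on `y`. All variable names and noise levels are assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, rng, intervene=False):
    """Sample (x, y) pairs that share a hidden common cause z."""
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)              # hidden common cause
        if intervene:
            x = rng.gauss(0, 1)          # x is set from outside, ignoring z
        else:
            x = z + rng.gauss(0, 0.5)    # x is driven by z
        y = z + rng.gauss(0, 0.5)        # y depends on z only, never on x
        xs.append(x)
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(0)
    print("observational:", pearson(*simulate(5000, rng)))
    print("interventional:", pearson(*simulate(5000, rng, intervene=True)))
```

A model trained only on the observational data would learn to predict `y` from `x` quite well, but that prediction breaks as soon as `x` is manipulated, which is the kind of novel situation pattern recognition alone does not handle.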

Researchers from the Max Planck Institute for Intelligent Systems, along with Google Research, published a paper, Towards Causal Representation Learning, which discusses the challenges that machine learning algorithms face due to the lack of causal representation. According to the researchers, developers try to counter the absence of causality in machine learning models by increasing the amount of data on which the models are trained, but fail to understand that this eventually leads to models that recognise patterns rather than independently “thinking”.

The introduction of “inductive bias” into models is believed to be a step towards building causality into machines. But that, arguably, can be counterproductive in building AI that is free of bias.

Help! What precisely is "inductive bias"? Some ML researchers are in the opinion that the machine learning category of ‘inductive biases’ can allow us to build a causal understanding of the world. My Ladder of Causation says: "This is mathematically impossible". Who is right? 1/

Because AI/ML is seen as the most promising tool in almost all fields, many newcomers dive straight into it without fully grasping the intricacies of the subject. While the lack of reproducibility or replication is a combined outcome of the problems mentioned above, it still poses great challenges for newly developed models.

Due to a lack of resources and a reluctance to conduct extensive trials, many algorithms fail when tested and implemented by other expert researchers. Big companies offering hi-tech solutions do not always publicly release their code, which leaves new researchers to experiment on their own and propose solutions to large problems without rigorous testing, so the results lack reliability.
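One small, practical ingredient of reproducibility is controlling randomness, so that someone else who reruns an experiment gets the same numbers. The sketch below uses a hypothetical `run_experiment` function that stands in for a training run; it only illustrates seeding and does not address the larger issues of unreleased code or limited trials described above.

```python
import random


def run_experiment(seed, n=1000):
    """Hypothetical 'experiment': the mean of n noisy scores.

    All randomness comes from a generator created from the given seed,
    so the same seed always yields the same result.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(0.8, 0.05) for _ in range(n)]
    return sum(scores) / n


if __name__ == "__main__":
    print(run_experiment(42) == run_experiment(42))   # same seed, same result
    print(run_experiment(42), run_experiment(43))     # different seeds differ
```

Reporting results across several seeds, rather than one favourable run, also gives other researchers a more realistic picture of what to expect when they repeat the work.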



© Analytics India Magazine Pvt Ltd 2022