AI that sees with sound, learns to walk and predicts seismic physics

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers — particularly in, but not limited to, artificial intelligence — and explain why they matter.

This month, engineers at Meta detailed two recent innovations from the depths of the company’s research labs: an AI system that compresses audio files and an algorithm that can accelerate protein-folding predictions by 60x. Elsewhere, scientists at MIT showed how they are using spatial acoustic information to help machines better picture their environments, simulating how a listener would hear a sound from any point in a room.

Meta’s compression work doesn’t exactly break new ground. Last year, Google announced Lyra, a neural audio codec trained to compress low-bitrate speech. But Meta claims its system is the first to work for CD-quality stereo audio, making it useful for commercial applications like voice calls.

Architectural drawing of Meta’s AI audio compression model. Image Credits: Meta

Using AI, Meta’s compression system, called Encodec, can compress and decompress audio in real time on a single CPU core at rates of around 1.5 kbps to 12 kbps. Compared to MP3, Encodec can achieve a compression rate of about 10x at 64 kbps with no noticeable quality loss.

The researchers behind Encodec say that human evaluators preferred the quality of Encodec-processed audio to Lyra-processed audio, suggesting that Encodec could eventually be used to deliver better-sounding audio in situations where bandwidth is limited or comes at a premium.
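
For a sense of what this looks like in practice, here’s a minimal sketch of compressing and reconstructing a waveform with Meta’s released encodec Python package. The class and method names below follow the facebookresearch/encodec README at the time of writing; treat the exact API as an assumption if your installed version differs.

```python
# A minimal sketch of neural audio compression with Meta's encodec
# package (API as published in the facebookresearch/encodec README;
# verify against your installed version).
import torch
import torchaudio
from encodec import EncodecModel
from encodec.utils import convert_audio

# Load the 24 kHz model and pick a target bitrate in kbps; lower
# bandwidth means stronger compression.
model = EncodecModel.encodec_model_24khz()
model.set_target_bandwidth(6.0)

# Read an input file and resample/remix it to the model's format.
wav, sr = torchaudio.load("speech.wav")
wav = convert_audio(wav, sr, model.sample_rate, model.channels)

with torch.no_grad():
    frames = model.encode(wav.unsqueeze(0))   # compact discrete codes
    reconstructed = model.decode(frames)[0]   # decoded waveform

torchaudio.save("speech_roundtrip.wav", reconstructed, model.sample_rate)
```

At 6 kbps, the codes are roughly a tenth the size of a 64 kbps MP3 stream, which is the comparison Meta draws above.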

As for Meta’s protein folding work, it has less immediate commercial potential. But it could lay the groundwork for important scientific research in the field of biology.

Protein structures predicted by the Meta system. Image Credits: Meta

Meta says its AI system, ESMFold, has predicted the structures of around 600 million proteins from bacteria, viruses and other microbes that have yet to be characterized. That’s nearly three times the 220 million structures that Alphabet-backed DeepMind managed to predict earlier this year, which covered nearly every protein from known organisms in DNA databases.

The Meta system is not as accurate as DeepMind’s. Of the ~600 million predictions it generated, only about a third were “high quality.” But it is 60 times faster at predicting structures, which lets it scale structure prediction to far larger databases of proteins.
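
For those who want to try it, structure prediction with ESMFold is available through Meta’s fair-esm package. The sketch below follows the repo’s published example; treat the exact call names as assumptions for your installed version, and note that the test sequence here is an arbitrary stand-in.

```python
# A brief sketch of single-sequence structure prediction with ESMFold,
# per the fair-esm README (names are assumptions if your version differs).
import torch
import esm

model = esm.pretrained.esmfold_v1()
model = model.eval()  # add .cuda() if a GPU is available

# A hypothetical test sequence; any amino-acid string works here.
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)  # predicted 3-D structure

with open("prediction.pdb", "w") as f:
    f.write(pdb_string)
```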

Staying with Meta, the company’s AI division also detailed this month a system designed to perform mathematical reasoning. Researchers at the company say their “neural problem solver” learned from a dataset of successful mathematical proofs to generalize to new, different kinds of problems.

Meta is not the first to build such a system. OpenAI developed its own neural theorem prover, which writes proofs in the Lean proof assistant and which it announced in February. Separately, DeepMind has experimented with systems that can solve challenging mathematical problems in the study of symmetry and knots. But Meta claims that its neural problem solver solved five times more International Mathematical Olympiad problems than any previous AI system, and that it outperformed other systems on widely used math benchmarks.

Meta notes that math-solving AI could benefit the fields of software verification, cryptography and even aerospace.
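
To make the setting concrete, here’s a tiny example of the kind of machine-checkable statement these provers are trained to close, written in the Lean proof assistant (the environment OpenAI’s system targeted). It illustrates the problem format only; it is not output from Meta’s solver.

```lean
-- A formal statement and its proof term: commutativity of addition on
-- the natural numbers. A neural prover's job is to search for the term
-- that closes goals like this; the checker then verifies it mechanically.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```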

As for the MIT work, researchers there developed a machine learning model that can predict how sounds in a room will travel through space. By modeling the acoustics, the system can learn a room’s geometry from audio recordings, which can then be used to build visual renderings of the room.

The researchers say the technology could be applied to virtual and augmented reality software or to robots that have to navigate complex environments. In the future, they plan to improve the system so that it can generalize to new and larger scenes, such as entire buildings or even whole towns and cities.
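
A rough way to picture the approach: a network that maps an emitter position and a listener position to the sound received at that point, in the spirit of neural radiance fields. The sketch below is a minimal stand-in with made-up layer sizes, loss and placeholder data, not the MIT model itself.

```python
# A toy "acoustic field": an MLP mapping (emitter, listener) positions
# to a short impulse-response envelope. All sizes, the loss and the data
# here are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn

class AcousticField(nn.Module):
    def __init__(self, ir_len: int = 128):
        super().__init__()
        # Input: 3-D emitter position concatenated with 3-D listener position.
        self.net = nn.Sequential(
            nn.Linear(6, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ir_len),  # predicted impulse-response envelope
        )

    def forward(self, emitter: torch.Tensor, listener: torch.Tensor):
        return self.net(torch.cat([emitter, listener], dim=-1))

model = AcousticField()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# One training step against measured (here: random placeholder) responses.
emitter = torch.rand(32, 3)
listener = torch.rand(32, 3)
target_ir = torch.rand(32, 128)  # stand-in for recorded impulse responses
loss = ((model(emitter, listener) - target_ir) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()
```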

At Berkeley’s robotics department, two separate teams are accelerating the rate at which a quadrupedal robot can learn to walk and perform other tricks. One team sought to combine best-of-breed work from many other advances in reinforcement learning to let a robot go from a blank slate to robust walking on uncertain terrain in just 20 minutes of real time.

“Perhaps surprisingly, with some careful design decisions regarding task layout and algorithm implementation, it is possible for a quadruped robot to learn to walk from scratch with deep RL in less than 20 minutes, across a range of different environments and types of surfaces. Crucially, this does not require novel algorithmic components or any other unexpected innovation,” the researchers wrote.

Instead, they selected and combined several state-of-the-art approaches and got remarkable results. You can read the paper here.
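
One ingredient common to such fast-learning systems is taking many gradient updates per step of real-world experience. The toy loop below illustrates that pattern on an invented one-dimensional task; it is a simplified stand-in for the idea, not the Berkeley team’s code or hyperparameters.

```python
# A toy off-policy RL loop with a high update-to-data ratio: several
# gradient steps per environment step. Environment, networks and
# hyperparameters are all illustrative stand-ins.
import collections
import random
import torch
import torch.nn as nn

env_actions = [-1.0, 0.0, 1.0]            # discrete toy action set
q = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 3))
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
buffer = collections.deque(maxlen=10_000)

state = 0.0
for step in range(2000):
    # Act (epsilon-greedy) in a 1-D toy world: reward for staying near 0.
    if random.random() < 0.1:
        a_idx = random.randrange(3)
    else:
        a_idx = q(torch.tensor([[state]])).argmax().item()
    next_state = state + 0.1 * env_actions[a_idx]
    reward = -abs(next_state)
    buffer.append((state, a_idx, reward, next_state))
    state = next_state

    # The key ingredient: several gradient updates per environment step.
    if len(buffer) >= 64:
        for _ in range(4):
            s, a, r, s2 = zip(*random.sample(buffer, 64))
            s = torch.tensor(s).unsqueeze(1)
            s2 = torch.tensor(s2).unsqueeze(1)
            target = torch.tensor(r) + 0.99 * q(s2).max(dim=1).values.detach()
            pred = q(s).gather(1, torch.tensor(a).unsqueeze(1)).squeeze(1)
            loss = ((pred - target) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
```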

A robot dog demonstration from EECS Professor Pieter Abbeel’s lab in Berkeley, California in 2022. (Photo courtesy of Philipp Wu/Berkeley Engineering)

Another locomotion-learning project, from the lab of Pieter Abbeel (friend of TechCrunch), was described as “training in imagination.” The team equipped the robot with the ability to predict how its actions will play out and, though it starts out helpless, it quickly gains knowledge about the world and how it works. That leads to better predictions, which lead to better knowledge, and so on in a feedback loop until the robot is walking in under an hour. It also quickly learns to recover from being pushed or otherwise “perturbed,” as the lingo has it. Their work is documented here.
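
Here’s a toy sketch of that “imagination” idea: fit a dynamics model on real transitions, then improve the policy entirely on rollouts inside the learned model, in the style of Dreamer-like agents. The dynamics, networks and objective below are invented for illustration, not the lab’s actual system.

```python
# Toy "training in imagination": learn a dynamics model of a 1-D point
# mass, then improve a policy purely on imagined rollouts through it.
import torch
import torch.nn as nn

def real_step(state, action):
    # Ground truth (unknown to the agent): next state after applying force.
    pos, vel = state[..., 0], state[..., 1]
    vel = vel + 0.1 * action.squeeze(-1)
    pos = pos + 0.1 * vel
    return torch.stack([pos, vel], dim=-1)

dynamics = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 2))
policy = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1), nn.Tanh())
dyn_opt = torch.optim.Adam(dynamics.parameters(), lr=1e-3)
pol_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(2000):
    # 1) Collect real experience and fit the world model to it.
    s = torch.randn(256, 2)
    a = torch.rand(256, 1) * 2 - 1
    s_next = real_step(s, a)
    pred = dynamics(torch.cat([s, a], dim=-1))
    model_loss = ((pred - s_next) ** 2).mean()
    dyn_opt.zero_grad(); model_loss.backward(); dyn_opt.step()

    # 2) Improve the policy inside the learned model ("imagination"):
    # roll out a few steps and minimize distance from the origin.
    s = torch.randn(256, 2)
    cost = 0.0
    for _ in range(5):
        a = policy(s)
        s = dynamics(torch.cat([s, a], dim=-1))
        cost = cost + (s[..., 0] ** 2).mean()
    pol_opt.zero_grad(); cost.backward(); pol_opt.step()
```

As the model’s predictions improve, the policy gets a better “imagined” world to practice in, which is the feedback loop described above.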

Work with potentially more immediate application came earlier this month from Los Alamos National Laboratory, where researchers developed a machine learning technique to predict the friction that occurs during earthquakes — providing a means of forecasting them. Using a language model, the team says it was able to analyze the statistical features of the seismic signals emitted from a fault in a laboratory earthquake machine to project the timing of the next quake.

“The model is not constrained by physics, but it predicts the physics, the actual behavior of the system,” said Chris Johnson, one of the research leads on the project. “Now we are making a future prediction from past data, which goes beyond describing the instantaneous state of the system.”
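
As a rough illustration of the general recipe in lab-quake work, one can featurize windows of the continuous acoustic signal and regress the time remaining until the next slip event. Everything below, including the synthetic data and the choice of regressor, is an assumption for illustration; the team’s new model is language-model-based, not a forest.

```python
# Hedged sketch: statistical features of acoustic-signal windows used to
# regress time-to-failure, on synthetic stand-in data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in: signal variance rises as failure approaches.
n_windows, window = 2000, 512
time_to_failure = rng.uniform(0, 10, n_windows)
signals = rng.normal(0, 1 + 1.0 / (time_to_failure[:, None] + 0.1),
                     size=(n_windows, window))

# Statistical features per window: variance, fourth moment, peak amplitude.
feats = np.stack([
    signals.var(axis=1),
    ((signals - signals.mean(axis=1, keepdims=True)) ** 4).mean(axis=1),
    np.abs(signals).max(axis=1),
], axis=1)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(feats[:1500], time_to_failure[:1500])
print("R^2 on held-out windows:",
      model.score(feats[1500:], time_to_failure[1500:]))
```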

Applying the technique in the real world will be challenging, the researchers say, because it’s not clear whether there’s enough data to train the forecasting system on. But they’re optimistic about its applications, which could include anticipating damage to bridges and other structures.

Last up is a cautionary note from MIT researchers, who warn that neural networks used to simulate biological neural networks should be examined carefully for training bias.

Neural networks are, of course, modeled on the way our own brains process and signal information, reinforcing certain connections and combinations of nodes. But that doesn’t mean the synthetic and the real ones work the same way. Indeed, the MIT team found that neural network-based simulations of grid cells (part of the brain’s navigation system) produced similar activity only when their creators carefully constrained them to do so. When allowed to regulate themselves, the way actual cells do, they did not produce the desired behavior.

That doesn’t mean deep learning models are useless in this area – far from it, they’re very valuable. But, as professor Ila Fiete said in the school’s news post: “they can be a powerful tool, but one must be very circumspect in interpreting them and in determining whether they are truly making de novo predictions, or even shedding light on what it is that the brain is optimizing.”
