
Force, mass, acceleration. These are the three variables that make up Newton’s iconic equation F=MA. But how did Newton know about these concepts in the first place? A precursor step to understanding physics, any physics, is identifying the relevant variables. Without the concepts of mass, force, and acceleration, not even Newton could have discovered the laws of mechanical motion. Can machines discover such variables automatically? This is the question that we posed to a new AI program.
The program was designed to observe physical phenomena through a video camera, then try to search for a minimal set of fundamental variables that fully describe the observed dynamics. Previous AI programs could assemble laws from variables, but finding the variables themselves remained the purview of humans. 
We began by feeding the system raw video footage of phenomena for which we already knew the answer. For example, we fed it a video of a swinging double pendulum (a sort of chaotic pendulum hanging off another pendulum). The double pendulum is known to have exactly four “state variables”: the angle and angular velocity of each of the two arms. After a few hours of analysis, the AI output its answer: 4.7. We thought this answer was close enough, especially since all the AI had was raw video footage, without any knowledge of physics or geometry.
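Estimators of intrinsic dimension generally return real numbers rather than integers, which is one way a fractional answer such as 4.7 can arise. The sketch below is illustrative only and is not the program's actual code. It applies one standard technique, the Levina-Bickel maximum-likelihood estimator (an assumption chosen for this example), to synthetic data in which four hidden variables are observed through eight measured channels. The names `mle_intrinsic_dimension`, `hidden`, and `observed` are hypothetical.

```python
import numpy as np

def mle_intrinsic_dimension(points, k=20):
    """Levina-Bickel maximum-likelihood estimate of intrinsic dimension."""
    # Pairwise Euclidean distances between all samples (n x n matrix).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero distance from a point to itself.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    # Log-ratio of the k-th neighbour distance to each closer neighbour.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    # Per-point estimate is the inverse of the mean log-ratio.
    per_point = (k - 1) / log_ratios.sum(axis=1)
    return float(per_point.mean())

rng = np.random.default_rng(0)
# Four hidden variables (assumed synthetic stand-in for a physical state).
hidden = rng.normal(size=(600, 4))
# Eight observed channels that all depend smoothly on the four hidden ones.
observed = np.concatenate([hidden, 0.5 * np.sin(hidden)], axis=1)

estimate = mle_intrinsic_dimension(observed)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The estimate should land near four even though eight channels are measured, and it will generally not be a whole number. Real video is far higher-dimensional than this toy data, so a practical system would need to compress the frames first; that step is outside the scope of this sketch.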
We then proceeded to visualize the actual variables that the program identified. Extracting the variables themselves was not easy, since the program cannot describe them in any intuitive way that would be understandable to humans. The AI cannot simply give a descriptive text such as "the first variable is kinetic energy," but it can provide the value of each variable for each frame. So we plotted the variables against each other and against conventional variables. After some probing, it appeared that two of the variables the program chose loosely corresponded to the angles of the arms. The other two remain a mystery, however. We tried correlating the other variables with anything and everything we could think of: angular and linear velocities, kinetic and potential energy, and various combinations of known quantities, but nothing seemed to match perfectly. We knew the AI had found a valid set of four variables since it was making good predictions, but we don’t yet understand the mathematical language it is speaking.
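The kind of probing described above can be sketched as a correlation search: for each discovered variable, compare its per-frame values against each known physical quantity and report the closest match. This is a minimal sketch on synthetic data, not the analysis code used in the project. The function `best_matches`, the candidate names, and the synthetic "discovered" variables are all assumptions made for illustration.

```python
import numpy as np

def best_matches(discovered, candidates):
    """For each discovered variable, find the most correlated known quantity."""
    results = []
    for column in discovered.T:
        # Absolute Pearson correlation against every candidate quantity.
        scores = {name: abs(np.corrcoef(column, values)[0, 1])
                  for name, values in candidates.items()}
        best = max(scores, key=scores.get)
        results.append((best, scores[best]))
    return results

rng = np.random.default_rng(1)
n_frames = 500
# Hypothetical per-frame physical quantities for a double pendulum.
candidates = {
    "theta1": rng.uniform(-1.0, 1.0, n_frames),
    "theta2": rng.uniform(-1.0, 1.0, n_frames),
    "omega1": rng.normal(size=n_frames),
    "omega2": rng.normal(size=n_frames),
}
# Synthetic stand-ins for the program's output: two variables track the angles,
# the third is a nonlinear mixture that matches no single quantity well.
discovered = np.column_stack([
    2.0 * candidates["theta1"] + 0.1 * rng.normal(size=n_frames),
    -candidates["theta2"] + 0.1 * rng.normal(size=n_frames),
    np.sin(3.0 * candidates["omega1"]) * np.cos(2.0 * candidates["omega2"]),
])

matches = best_matches(discovered, candidates)
for index, (name, score) in enumerate(matches):
    print(f"variable {index}: best match {name} (|r| = {score:.2f})")
```

A linear correlation only detects linear relationships, which is one reason a variable can be perfectly valid and still match nothing on the list, as the third synthetic variable does here.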
After validating the program on a number of other physical systems with known solutions, we fed it videos of systems for which we did not know the explicit answer. We first fed it videos of an “air dancer” undulating in front of a local used car lot. After a few hours of analysis, the program returned 8 variables. We fed it a video clip of flames from a holiday fireplace loop, and the program returned 24 variables. A video of a lava lamp produced 8.
A particularly interesting question was whether the variable set was unique, or whether a different set was produced each time the program was restarted. Scientists have long wondered: if we ever met an intelligent alien race, would they have discovered the same laws of physics as we have, or might they describe the universe in a different way? Perhaps some phenomena seem overly complex to us because we are trying to understand them using the wrong set of variables. As it turned out, the number of variables was the same each time the AI restarted, but the specific variable set was different each time. It seems that there are many different ways to describe the universe.
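A toy example shows how two different variable sets can describe the same system with the same number of variables. The sketch below assumes the simplest possible case, a linear and invertible change of variables applied to an idealised oscillator; the variable sets the program finds may be related in far more complicated, nonlinear ways. The names `states`, `mix`, and `alternative` are hypothetical.

```python
import numpy as np

# Analytic trajectory of an idealised oscillator: state = (angle, angular velocity).
t = np.linspace(0.0, 10.0, 200)
states = np.column_stack([np.cos(t), -np.sin(t)])

# A second, equally valid set of variables: an invertible mix of the first set.
# The matrix is an arbitrary choice with non-zero determinant (-7).
mix = np.array([[1.0, 2.0],
                [3.0, -1.0]])
alternative = states @ mix.T

# Both descriptions need the same number of variables...
rank_original = np.linalg.matrix_rank(states)
rank_alternative = np.linalg.matrix_rank(alternative)

# ...and each can be recovered exactly from the other.
recovered = alternative @ np.linalg.inv(mix).T
print(rank_original, rank_alternative, np.allclose(recovered, states))
```

Neither set is more correct than the other: both carry the same information about the system, and only the count of variables is fixed.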
We believe that this sort of AI can help scientists uncover complex phenomena for which theoretical understanding is not keeping pace with the deluge of data – areas ranging from biology to cosmology. While we used video data in this work, any kind of array data source could be used. You could use radar arrays, or DNA arrays, for example.
This work is part of our decades-long effort to create algorithms that can distill data into scientific laws. Past software systems, such as “Eureqa”, could distill free-form physical laws from experimental data, but only if the variables were identified in advance. What if we don’t know the variables in the first place?
Scientists may be misinterpreting or failing to understand many phenomena simply because they don’t have a good set of variables to describe them. Choosing the right variables is not as straightforward as it seems. For millennia, people knew about objects moving quickly or slowly, but only when the notions of velocity and acceleration were formally quantified could Newton discover his famous law of motion, F=MA. Variables describing temperature and pressure had to be identified before the laws of thermodynamics could be formalized, and so on for every corner of the scientific world. The variables are a precursor to any theory. What other laws are we missing simply because we don’t have the variables?
Technical Abstract
All physical laws are described as relationships between state variables that give a complete and non-redundant description of the relevant system dynamics. However, despite the prevalence of computing power and AI, the process of identifying the hidden state variables themselves has resisted automation. Most data-driven methods for modeling physical phenomena still assume that observed data streams already correspond to relevant state variables. A key challenge is to identify the possible sets of state variables from scratch, given only high-dimensional observational data. Here we propose a new principle for determining how many state variables an observed system is likely to have, and what these variables might be, directly from video streams. We demonstrate the effectiveness of this approach using video recordings of a variety of physical dynamical systems, ranging from elastic double pendulums to fire flames. Without any prior knowledge of the underlying physics, our algorithm discovers the intrinsic dimension of the observed dynamics and identifies candidate sets of state variables. We suggest that this approach could help catalyze the understanding, prediction and control of increasingly complex systems.
Learn more
Source Code: GitHub Visual Intrinsic Variables
Columbia Press: Columbia Engineering Roboticists Discover Alternative Physics | Mechanical Engineering
Related Publications 
Chen B, Huang K, Raghupathi D, Chandratreya I, Du Q, Lipson H (2022) Discovering State Variables Hidden in Experimental Data. Nature Computational Science, Vol. 2, pp. 433–442.
