I am currently working as an undergraduate research intern at Talmo's Lab in the Salk Institute. We are working on the VNL project, a collaboration between the Salk Institute and Harvard University that aims to use advanced goal-directed deep reinforcement learning methods, such as inverse kinematics imitation learning, to create imitation pipelines for computational models of the brain, using GPU-accelerated Brax & JAX and multi-node distributed CPU training with Acme & Ray.
Visit Our Lab Website
Perhaps a smart update rule can capture some innate characteristics of nature. The brain embodies this kind of information processing, having settled on ways that just happen to work. We are not trying to mimic the brain, but to capture this natural way of processing by drawing inspiration from it.
Abstract Deep Imitation Learning Idea (borrowed from Talmo's Lab VNL slides)
For an organism to exist today, it must have been able to survive and pass on its genes. We believe that behavior, in the ethological sense, is what the brain evolved to produce, and that it is guided by survival, because one major function of neural mechanisms is to produce action, to interact with the world, and to output. It is about what we do. We think that the capability to describe behavior is really powerful. Motion is an integration of all computations, from sensory to supervisory signals. The brain does not output logits like neural networks do, but neurons do produce motion, which in simple cases is muscle activation or torques.
We believe that inverse kinematics imitation provides a base layer of alignment with biology that supports more abstract exploration of the brain's representations in artificial agents. Thus, we aim to create pipelines with architectures and learning algorithms capable of generalization and continual learning of low-level skills that transfer across multiple higher-level, task-driven goals, getting closer to what the "real brain" is capable of doing.
Deep imitation learning illustration using encoder/decoder structure (borrowed from VNL Research Strategy)
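To make the encoder/decoder idea concrete, here is a minimal JAX sketch of one way such a policy could be arranged. It is not the VNL implementation: the single-layer networks, the dimension names (`ref_dim`, `obs_dim`, `latent_dim`, `act_dim`), and the choice to feed the decoder both the latent and the agent's own observation are all illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def init_linear(key, n_in, n_out):
    """Small random weights and zero bias for one dense layer."""
    w = jax.random.normal(key, (n_in, n_out)) * 0.1
    return {"w": w, "b": jnp.zeros(n_out)}

def init_policy(key, ref_dim, obs_dim, latent_dim, act_dim):
    """Hypothetical two-part policy: an encoder and a decoder (sizes are assumptions)."""
    k_enc, k_dec = jax.random.split(key)
    return {
        "encoder": init_linear(k_enc, ref_dim, latent_dim),
        "decoder": init_linear(k_dec, latent_dim + obs_dim, act_dim),
    }

def policy(params, reference, observation):
    # Encoder: compress the expert reference motion into a low-dimensional latent.
    enc = params["encoder"]
    latent = jnp.tanh(reference @ enc["w"] + enc["b"])
    # Decoder: combine the latent with the agent's own state to produce bounded actions.
    dec = params["decoder"]
    decoder_input = jnp.concatenate([latent, observation])
    action = jnp.tanh(decoder_input @ dec["w"] + dec["b"])
    return action, latent

params = init_policy(jax.random.PRNGKey(0), ref_dim=12, obs_dim=8, latent_dim=4, act_dim=3)
action, latent = policy(params, jnp.ones(12), jnp.zeros(8))
print(action.shape, latent.shape)
```

In a layout like this, the decoder is the part one would hope to reuse across tasks, since it maps a compact latent plus body state to motor output, while the source of the latent can be swapped for a higher-level, task-driven controller.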
In particular, we are interested in the motion of a bio-mechanically realistic rodent model, as we can use it (along with a large amount of motion-capture CV data) to perhaps form a neural representation of how control is done in the brain. For illustration, here are some recent results that our team produced.
A PPO-trained goal-oriented deep reinforcement learning agent (bio-mechanically realistic rodent).
A DMPO-trained inverse kinematics imitation learning agent (CMU humanoid) with 3e7 actor steps (white is the expert trajectory and yellow is the learned agent's control).
And a DMPO-trained inverse kinematics imitation learning agent (bio-mechanically realistic rodent) with a reused representation retrained on the jump gap task.
For speed and computational feasibility, we use GPU-based data-parallel training with JAX/Brax or CPU-based multi-node task-parallel distributed training with Ray/Acme.
Distributed Training Implementation of VNL
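The GPU data-parallel side of this setup can be sketched in a few lines of JAX. This is a toy illustration under stated assumptions, not the VNL training code: `rollout_loss` is a hypothetical stand-in for a real Brax rollout, the "policy" is a single linear map, and `jax.vmap` is used so the example runs on a single device (a multi-device setup would typically spread the same batch across accelerators, for example with `jax.pmap`).

```python
import jax
import jax.numpy as jnp

def rollout_loss(params, init_state, target):
    """Toy stand-in for one environment rollout: a linear 'policy' tracking a target."""
    action = init_state @ params
    return jnp.mean((action - target) ** 2)

# Per-environment gradients, vectorized across a batch of environments.
# Parameters are shared (in_axes=None); states and targets are batched (axis 0).
batched_grad = jax.vmap(jax.grad(rollout_loss), in_axes=(None, 0, 0))
batched_loss = jax.vmap(rollout_loss, in_axes=(None, 0, 0))

@jax.jit
def train_step(params, states, targets, lr=0.1):
    grads = batched_grad(params, states, targets)  # shape: (num_envs, obs_dim, act_dim)
    mean_grad = jnp.mean(grads, axis=0)            # average over the environment batch
    return params - lr * mean_grad

# Synthetic data: 64 parallel "environments" (all sizes are arbitrary assumptions).
key_s, key_p = jax.random.split(jax.random.PRNGKey(0))
num_envs, obs_dim, act_dim = 64, 8, 3
states = jax.random.normal(key_s, (num_envs, obs_dim))
true_params = jax.random.normal(key_p, (obs_dim, act_dim))
targets = states @ true_params

params = jnp.zeros((obs_dim, act_dim))
loss_before = jnp.mean(batched_loss(params, states, targets))
for _ in range(200):
    params = train_step(params, states, targets)
loss_after = jnp.mean(batched_loss(params, states, targets))
print(float(loss_before), float(loss_after))
```

The point of the pattern is that every environment runs the same computation on different data, so one set of parameters is updated from the averaged gradient. Task parallelism with Ray/Acme differs in that separate actor and learner processes run different roles on different nodes.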