In the sizzle reel, the early waterdrop demos are beautiful but seem staged; the later robotics demos look more plausible and very impressive. But referring to all of this as "4D dynamical worlds" sounds overhyped / scammy - everyone else calls 3D space simulated through time a 3D world.
> Genesis's physics engine is developed in pure Python, while being 10-80x faster than existing GPU-accelerated stacks like Isaac Gym and MJX. ... Nvidia brought GPU acceleration to robotic simulation, speeding up simulation speed by more than one order of magnitude compared to CPU-based simulation. ... Genesis pushes up this speed by another order of magnitude.
I can believe that setting up some kind of compute pipeline in a high-level language such as Python could be fast, but the marketing materials don't explain any of the "how". If it's real it must be GPU-accelerated, yet they almost imply that it isn't. Looks neat, hope it works great!
It is a nice physics engine: it uses Taichi (https://github.com/taichi-dev/taichi) to compile Python code to CUDA/GPU kernels (similar to what NVIDIA's Warp does, https://github.com/NVIDIA/warp).
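For anyone curious what that pattern looks like, here is a minimal Taichi sketch (not Genesis's actual code): plain Python syntax that Taichi JIT-compiles into a parallel GPU kernel, falling back to CPU if no GPU is present.

```python
import taichi as ti

ti.init(arch=ti.gpu)  # JIT-compiles kernels for CUDA/Vulkan/Metal, falls back to CPU

n = 1_000_000
pos = ti.field(dtype=ti.f32, shape=n)  # one float per particle
vel = ti.field(dtype=ti.f32, shape=n)

@ti.kernel
def step(dt: ti.f32):
    # Written as ordinary Python, but compiled so that each loop
    # iteration runs as one GPU thread.
    for i in pos:
        vel[i] += -9.81 * dt
        pos[i] += vel[i] * dt

for _ in range(600):
    step(1.0 / 60.0)
```

That compile-the-hot-loop approach is how "pure Python" and "GPU-accelerated" can both be true at the same time.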
> "4D dynamical worlds"
It's a feature of that field of science. I'm currently working in a lab doing a bunch of things that in papers are described as $adjective-AI. In practice it's just a slightly hyped, but vaguely agreed-upon-by-consensus, weird science-paper-English term or set of terms. (In the same way that Gaussian splats are really just point clouds with efficient alpha blending [only slightly more complex, please don't just take my word for it].)
You probably understand what this term is meant to describe, but spelling it out gives a bit of insight into _why_ it's got such a shite name.
o "4D": because it's doing things over time. Normally that's a static scene with a camera flying through it (3D). When you have stuff other than the camera moving, you get an extra dimension, hence 4D.
o "dynamical" (god I hate this): dynamic means that objects in the video are moving around. So you can't just use the multiple camera locations to build up a single view of an object or room; you need to account for the movement of things in the scene.
o "worlds": to highlight that it's not just one room being re-used over and over; it's a generator (well, it's not, but that's for another post) of diverse scenes that can represent many locations around the world.
I saw this on twitter and actually came to HN to see if there was a thread with more details. The demo on twitter was frankly unbelievable. Show me a water droplet falling... okay... now add a live force diagram that is perfectly rendered just by asking for it? What? Doesn't seem possible/real. And yet it seems reputable: the docs/tech look legit, they just "haven't released the generative part yet".
What is going on here? Is the demo just some researchers getting carried away and overpromising, hiding some major behind-the-scenes work that went into making that video?
My understanding is that they built a performant suite of simulation tools from the ground up and then exposed those tools via an API to an "agent" that can compose them to accomplish the user's ask. It's probably less general than the prompt interface implies, but still seems incredibly useful.
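To make that concrete, here's a hypothetical sketch of the tool-composition pattern described above; none of these tool names or the `run_plan` helper are Genesis's real API.

```python
# Hypothetical tool registry an "agent" could compose from a user prompt.
SIM_TOOLS = {
    "add_fluid_emitter": lambda scene, **kw: scene.append(("fluid", kw)),
    "add_rigid_body":    lambda scene, **kw: scene.append(("rigid", kw)),
    "attach_camera":     lambda scene, **kw: scene.append(("camera", kw)),
}

def run_plan(plan):
    """Execute a list of (tool_name, kwargs) calls, e.g. produced by an LLM."""
    scene = []
    for tool_name, kwargs in plan:
        SIM_TOOLS[tool_name](scene, **kwargs)
    return scene

# "A water droplet falling, viewed from the side" might become:
scene = run_plan([
    ("add_fluid_emitter", {"shape": "droplet", "height": 1.0}),
    ("attach_camera",     {"azimuth": 90, "distance": 2.0}),
])
```

The prompt interface then only has to map natural language onto a finite tool vocabulary, which is much narrower than generating arbitrary simulations from scratch.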
[dupe]
Earlier project page: https://news.ycombinator.com/item?id=42456802
Maybe I missed it, but are there any performance numbers? It being 100% implemented in Python makes me very suspicious that this won’t scale to any kind of large robot.
There is enough space on large robots to add in beefier compute if needed (at the expense of power consumption). Python is run all the time on robots. Compute usually becomes more of a problem as the robot gets smaller, but it should still be possible to run the intensive parts of a program on the cloud and stream the results back.
This looks neat. Single-stepping is available; as far as I can tell, though, there's no LIDAR and no wheels? Very arm/vision focused. There's nothing wrong with that, but robotics encompasses a huge space to simulate, which is why I haven't yet written my own simulator. Would love a generic simulation engine to plug my framework into, but this is missing a few things I need.
Twitter announcement: https://x.com/zhou_xian_/status/1869511650782658846
GitHub: https://github.com/Genesis-Embodied-AI/Genesis
Academic project page: https://genesis-embodied-ai.github.io
HN: https://news.ycombinator.com/item?id=42456802
Any roboticists here? Is this impressive/what is the impact of this?
What method is Genesis using for JIT compilation? What subset of Python syntax / operations will be supported?
The automatic differentiation seems to be intended for compatibility with PyTorch (a rough sketch of that kind of bridge is below). Will Genesis be able to interface with JAX as well?
The project looks interesting, but the website is somewhat light on details. In any case, all the best to the developers! It's great to hear about various efforts in the space of differentiable simulators.
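On the PyTorch side of that question: the usual way a differentiable simulator plugs into PyTorch is a custom `torch.autograd.Function` whose backward pass calls the engine's adjoint. A minimal sketch with a toy stand-in "engine" (not Genesis's API) looks like this:

```python
import torch

# Toy stand-ins for a simulator's forward step and its hand-written adjoint.
def engine_forward(state, action):
    return state + 0.01 * action

def engine_backward(state, action, grad_out):
    return grad_out, 0.01 * grad_out

class SimStep(torch.autograd.Function):
    """Wraps one simulator step so its gradients flow into a PyTorch graph."""

    @staticmethod
    def forward(ctx, state, action):
        ctx.save_for_backward(state, action)
        return engine_forward(state, action)

    @staticmethod
    def backward(ctx, grad_next_state):
        state, action = ctx.saved_tensors
        return engine_backward(state, action, grad_next_state)

state = torch.zeros(3)
action = torch.ones(3, requires_grad=True)
SimStep.apply(state, action).sum().backward()
print(action.grad)  # tensor([0.0100, 0.0100, 0.0100])
```

A JAX bridge would need the analogous `jax.custom_vjp` wrapper, which is a separate integration; whether Genesis provides one isn't clear from the site.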
I suspect that the actual generation and simulation/rendering takes several minutes for each step.
I was mildly impressed with the water demo, but that robot thing is kinda crazy, really. This finally looks like a framework for an AI that can do my laundry.
What does it mean that gs.generate() is missing from the project?
"Currently, we are open-sourcing the underlying physics engine and the simulation platform. Access to the generative framework will be rolled out gradually in the near future."
The GitHub claims:
> Genesis delivers an unprecedented simulation speed -- over 43 million FPS when simulating a Franka robotic arm with a single RTX 4090 (430,000 times faster than real-time).
That math works out to… 23.26 nanoseconds per frame. Uhh… no, they don't simulate a robot arm in 23 nanoseconds? That's literally twice as fast as a single cache miss?
They may have an interesting platform. I’m not sure. But some of their claims scream exaggeration which makes me not trust other claims.
It's possible they're executing many simulations in parallel and counting the aggregate. 16k robot arms executing at 3k FPS each is much more reasonable on a 4090 (see the back-of-the-envelope numbers at the end of the thread). If you're effectively fuzzing for edge cases, this would have value.
Yeah, it's gotta be something like that. The whole claim comes across as rather dishonest. If you're simulating 16,000 arms at 3,000 FPS each, then say that. That's great. Be clear and concise with your claims.
Agreed.
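To put rough numbers on the parallel-simulation reading above (all values assumed for illustration, not taken from Genesis's benchmarks): the headline "FPS" only makes sense as an aggregate across a large batch of environments.

```python
# Back-of-the-envelope: aggregate FPS = parallel environments x per-env step rate.
num_envs = 16_384                  # assumed batch size on one GPU
steps_per_sec_per_env = 2_600      # assumed per-environment step rate

aggregate_fps = num_envs * steps_per_sec_per_env
print(f"{aggregate_fps:,} aggregate steps/s")              # 42,598,400 (~43M)
print(f"{aggregate_fps / 100:,.0f}x real-time at 100 Hz")  # 425,984x (~430,000x)
```

With those assumptions the advertised ~43 million FPS and ~430,000x real-time figures line up, but each individual arm is still stepping on the order of hundreds of microseconds per frame, not 23 nanoseconds.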