comments (10)

  • I think open-ended simulation for agents will be a key component of training and planning, much as human dreams simulate different scenarios in our heads. The biggest challenge will be simulating more abstract and complex systems.

    A few months ago I experimented with an open-ended world simulation for an AI agent, where the simulated world progressively built itself from each of the agent's actions. The idea was to give the agent unlimited possibilities for tool calling: each tool call would be approved by an adjudicator, and the world state would change accordingly. The key issues with the PoC were:

      - World decoherence (I tried to solve that with a poor graph implementation)
      - World flatness: the high level of abstraction did not account for small events that would compound in the real world
      - Starting with an empty context made it really hard to get the agent to explore the world
      
    
    Anyway, the project turned out to be really funny: you could watch the agent struggling in desperation to perform actions that would be impossible in the real world. The main observation was that showing the agent its current action budget modulated how creative and how desperate its actions were.
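
    A minimal sketch of the loop described above, under loud assumptions: the original PoC's code is not shown, so every name here (`World`, `adjudicate`, `run_episode`) is hypothetical, and the adjudicator is a plain precondition check standing in for whatever model approved tool calls in the real project. It only illustrates the shape: the agent proposes arbitrary tool calls, an adjudicator approves or rejects each one, approved calls change the world state, and a fixed action budget is spent along the way.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Hypothetical world state: a set of facts that grows as the agent acts."""
    facts: set = field(default_factory=set)


def adjudicate(world, tool, requires, produces):
    """Stand-in adjudicator: approve a tool call only if its preconditions hold.

    A plain precondition check is used so the sketch runs offline; the
    adjudicator in the project described above is not specified.
    """
    if requires <= world.facts:
        world.facts |= produces  # approved: the world state changes
        return True
    return False


def run_episode(world, proposals, budget):
    """Spend at most `budget` actions; every proposal costs one action."""
    log = []
    for tool, requires, produces in proposals:
        if budget == 0:
            break
        budget -= 1
        approved = adjudicate(world, tool, requires, produces)
        # The remaining budget is recorded with each step; in an LLM setup it
        # would be shown to the agent before its next proposal.
        log.append((tool, approved, budget))
    return log


world = World(facts={"has_axe"})
proposals = [
    ("build_boat", {"has_wood"}, {"has_boat"}),  # rejected: no wood yet
    ("chop_tree", {"has_axe"}, {"has_wood"}),    # approved
    ("build_boat", {"has_wood"}, {"has_boat"}),  # approved now
    ("sail_away", {"has_boat"}, {"escaped"}),    # never reached: budget spent
]
log = run_episode(world, proposals, budget=3)
for entry in log:
    print(entry)
```

    In this toy run the fourth proposal is never evaluated because the budget is exhausted after three actions, rejected ones included.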

    Xx_crazy420_xX

  • I understand what the model is doing, but I am struggling to understand where it fits in a workflow. I understand a big gap is that an LLM-based AI agent isn't aware of the consequences of its actions because it barely understands the future state those actions will produce, hence this model, which can.

    So, is this like a bolt-on, where you have an agent powered by an LLM, the world model reviews the action it wants to take, and the agent confirms this is the intention? Is this meant to augment an existing agent with additional capabilities?

    androiddrew

  • This might be pretty big. One of my biggest frustrations with smaller models (especially MoE) is their failure to track workflow state at a high level. I'm constantly reminding them what we decided on or asking them to revisit, and reminding them eats context.

    Seems like this might make that a lot less painful, if not right off the bat, then with some minimal tuning or even just good prompting.

    blurbleblurble

  • I'm a fan of this direction. For me the most interesting use case for these world models isn't even training, it's verification. If this thing or some idealized version of it can actually reliably simulate state transitions, could you use it to verify an agent's execution path against hard constraints and replace/eclipse LLMs-as-a-judge?
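
    A rough sketch of what that verification could look like, assuming the world model can be wrapped as a state-transition function. Nothing here comes from the released model: `verify_trace` and `toy_transition` are hypothetical names, and the transition function is a hand-written toy standing in for a learned simulator. The point is only the shape: replay the agent's actions through the simulator and check hard constraints on every intermediate state.

```python
def verify_trace(initial_state, actions, transition, constraints):
    """Replay `actions` through `transition` and check every constraint
    on every state reached after an action.

    Returns (True, None) if all states pass, otherwise
    (False, (step_index, constraint_name)) for the first violation.
    """
    state = dict(initial_state)
    for step, action in enumerate(actions):
        state = transition(state, action)
        for name, holds in constraints.items():
            if not holds(state):
                return False, (step, name)
    return True, None


def toy_transition(state, action):
    """Stand-in for a learned world model: a hand-written file-system toy."""
    files = set(state["files"])
    kind, path = action
    if kind == "write":
        files.add(path)
    elif kind == "delete":
        files.discard(path)
    return {"files": files}


# Hard constraint (hypothetical): the config file must never disappear.
constraints = {
    "config_must_exist": lambda s: "config.yaml" in s["files"],
}

start = {"files": {"config.yaml"}}
ok_trace = [("write", "out.txt"), ("delete", "out.txt")]
bad_trace = [("write", "out.txt"), ("delete", "config.yaml")]

print(verify_trace(start, ok_trace, toy_transition, constraints))
print(verify_trace(start, bad_trace, toy_transition, constraints))
```

    Unlike an LLM judge, the check itself is deterministic; any unreliability would come from how faithfully the simulator predicts state transitions.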

    dippogriff

  • I think the next move is toward multi-model orchestration.

    https://developer.nvidia.com/blog/train-small-orchestration-...

    pulkas

  • The smaller of the two models is open weights and available on Hugging Face:

    https://huggingface.co/Qwen/Qwen-AgentWorld-35B-A3B

    adrian_b

  • trilogic

  • ELI5? How does this compare to a regular LLM assistant model like the base Qwen?

    psc007

  • I thought that in this day and age "world model" also included robot arm training data and robot arm benchmarks.

    singularity2001

  • The benchmarks here are confusing at best. Am I reading correctly that this model is essentially as good as or better than all frontier models right now?

    aliljet