Browsing: chain-of-thought reinforcement learning