Make the machine learn by trial and error
Reinforcement learning applies when an agent has to act in a dynamic environment. Some examples:
Reinforcement learning works by letting the agent make decisions in an environment (often a simulated one) and rewarding or punishing it according to the results. This is repeated many times, often tens of thousands of episodes or more. Eventually, the agent learns a policy (a rule for choosing actions) that maximizes rewards and minimizes punishments, thus becoming "intelligent".
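To make the reward-and-punish loop concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm (the text above does not name a specific one). Everything in it is an assumption chosen for illustration: the five-cell corridor environment, the reward of 1.0 for reaching the goal, the penalty of -0.01 per extra step, and the hyperparameters `alpha`, `gamma` and `epsilon`.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells.
# The agent starts in cell 0 and the goal is cell 4.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    if next_state == GOAL:
        return next_state, 1.0, True   # reward for reaching the goal
    return next_state, -0.01, False    # small punishment for every other step


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning and return the learned table of action values."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: occasionally try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick the best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """For each state, return the action with the highest learned value."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = train()
policy = greedy_policy(q_table)
print(policy[:GOAL])  # learned action for each non-goal cell
```

After training, the learned policy for every non-goal cell should be "move right", which is the shortest way to the reward. Real problems use far larger state spaces and usually replace the table with a neural network, but the loop of acting, receiving a reward, and updating stays the same.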
This type of learning can be very effective, even compared with more traditional approaches to artificial intelligence. In chess, AlphaZero (RL based) convincingly defeated Stockfish (a traditional search-based engine) in a 2017 match after only a few hours of self-play training.
OpenAI Hide and Seek simulation