AlphaGo Zero (1) was the first robot that could learn to play Go without historical data from human games. In fact, AlphaGo Zero did not truly start from "scratch": it was given the rules of the game and learned by playing against another robot, a copy of itself.
Like a child who learns to play football by learning the rules and imitating adults, the robot learned by reinforcement: from the feedback of another robot.
It is exactly how we learn, with reward and punishment. Our educational systems (yes, even the stupid ones) are based on these two pillars of social education, with teachers acting more or less like AlphaGo's robot.