The joint investigation with Israeli-Palestinian publication +972 Magazine and Hebrew-language outlet Local Call has found Unit 8200 trained the AI model to understand spoken Arabic using large volumes of telephone conversations and text messages, obtained through its extensive surveillance of the occupied territories.
According to sources familiar with the project, the unit began building the model to create a sophisticated chatbot-like tool capable of answering questions about people it is monitoring and providing insights into the massive volumes of surveillance data it collects.
“We tried to create the largest dataset possible [and] collect all the data the state of Israel has ever had in Arabic,” the former official, Chaked Roger Joseph Sayedoff, told an audience at a military AI conference in Tel Aviv last year. The model, he said, required “psychotic amounts” of data.