
Wikipedia in the Age of AI and Bots
Visit our website to learn more about the event agenda, speakers, and other details.
Wikipedia has been a source of text data for natural language processing and machine learning for almost 25 years. We’d like to cover how data scientists and computer scientists can help foster the health of what is widely agreed to be an essential public good in the age of AI, by giving context on how its datasets are created. Historically, access to the site via scraping and bots has caused multiple issues for Wikimedia Foundation infrastructure, but recent changes in traffic behavior and volume, driven by the growth of large language models (LLMs), have increased the number of incidents. Managing this expansion has created unique challenges for the organization, given Wikimedia’s free knowledge mission and the need to continue fostering human traffic growth. Maintaining the sustainability of the platform while prioritizing human and mission-oriented access has required nuanced approaches to identifying and responding to observed trends.
Details:
Time: 12:00 pm - 1:15 pm PT
Location: Gates Computer Science Building, Room 119, 353 Jane Stanford Way, Stanford, CA 94305.