Microsoft’s reported use of Office documents for AI training has sparked concerns about data privacy and intellectual property rights, as users learn that their content may be included in AI model training by default, without explicit consent.
Key discovery: A cybersecurity expert from Cyberciti.biz has reported that Microsoft’s Connected Experiences feature automatically collects data from Word and Excel files for AI training purposes, and that the feature is enabled by default.
- The feature reportedly allows Microsoft to use many types of content, including articles, novels, and commercial works, for AI training
- This data collection occurs through Microsoft’s Connected Experiences functionality within Office applications
- The company has not officially confirmed or denied these claims regarding data usage for AI training
Privacy implications: Microsoft’s approach raises significant concerns for businesses and content creators who use Office products for proprietary or sensitive work.
- Users must navigate through seven different menu levels to disable the data collection feature
- The opt-out process requires accessing File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences
- Critics argue that an opt-out buried this deep in the settings is hard for ordinary users to find
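The menu path above can be turned into a numbered checklist for anyone walking a colleague through the opt-out. The following is a minimal Python sketch; the names `OPT_OUT_PATH`, `parse_menu_path`, and `format_steps` are illustrative and not part of any Microsoft API. The menu labels are copied from the path as reported and may differ between Office versions.

```python
# Menu labels copied from the reported opt-out path (assumption: they may
# differ depending on the Office version and platform).
OPT_OUT_PATH = (
    "File > Options > Trust Center > Trust Center Settings > "
    "Privacy Options > Privacy Settings > Optional Connected Experiences"
)


def parse_menu_path(path: str) -> list[str]:
    """Split an 'A > B > C' breadcrumb into its individual menu levels."""
    return [level.strip() for level in path.split(">") if level.strip()]


def format_steps(levels: list[str]) -> str:
    """Render the menu levels as a numbered checklist, one step per line."""
    lines = []
    for number, level in enumerate(levels, start=1):
        # The final level is the setting to turn off, not a menu to open.
        verb = "Turn off" if number == len(levels) else "Open"
        lines.append(f"{number}. {verb} {level}")
    return "\n".join(lines)


if __name__ == "__main__":
    levels = parse_menu_path(OPT_OUT_PATH)
    print(f"{len(levels)} menu levels")
    print(format_steps(levels))
```

Counting the parsed levels is also a quick way to check the "seven menu levels" figure against the path as written.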
Legal framework: Microsoft’s Services Agreement includes specific language that provides the company with broad rights to user content.
- The agreement grants Microsoft a “worldwide and royalty-free intellectual property license” to use customer content
- This license allows Microsoft to copy, retain, transmit, reformat, and display user content
- The company states these rights are necessary for providing services, protecting users, and improving Microsoft products
Industry context: This practice aligns with a broader trend in the technology sector where companies leverage user-generated content for AI development.
- The approach has raised ethical questions about consent and data usage
- Similar practices are becoming increasingly common among major tech companies
- The situation highlights the growing tension between technological advancement and user privacy
Looking ahead: The discovery of this default data collection setting may prompt closer scrutiny of tech companies’ data practices and lead to calls for more transparent opt-in procedures for AI training data collection.
Microsoft Word and Excel AI data scraping slyly switched to opt-in by default — the opt-out toggle is not that easy to find