Vast amounts of data, including websites, books, forums, code repositories, image libraries and social media, have provided the raw material for training large models. That approach is now under pressure from lawsuits, licensing demands and privacy complaints. As the easy supply of public data becomes more contested, companies are looking for another source of training material. Increasingly, that source is human behavior itself.
While AI might seem like magic to many, it is at bottom a large pattern recognition and generation system. Whether generating code, text, images or video, the power of AI comes from learning patterns in large amounts of data and then translating human requests into outputs that best match those patterns. It goes without saying that these systems are data-hungry.
Your Typing And Clicking Behavior Now Under Watch
Meta is installing software on work computers used by U.S. employees to capture mouse movements, keystrokes, clicks and some screen snapshots, according to a recent Reuters report. According to internal documents reviewed by Reuters, the goal is to train AI systems that can understand how people move through software and complete office tasks. Meta said the data will not be used for performance reviews; rather, the company wants to build models that can observe digital work and learn from it.
The industry is shifting its attention from people's digital outputs to their inputs: how people actually work. This shift reflects where the AI market is heading. The next commercial battle is not just about generating code, text, videos or images. It is about building systems that can take action inside software. OpenAI's computer-use documentation describes tools that let a model inspect screenshots and produce interface actions. Anthropic's documentation for Claude's computer-use tool describes a similar setup, built around screenshots, mouse actions, keyboard inputs and interface navigation. These products point toward a world where AI is asked to complete tasks, not just answer prompts.
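For a sense of the shape these products take, here is a minimal sketch of the observe-act loop that both vendors' documentation describes: capture the screen, ask a model for the next interface action, execute it, repeat. The model call is stubbed out and the action format is invented for illustration; neither OpenAI's nor Anthropic's actual API looks exactly like this.

```python
import pyautogui  # library for screenshots, clicks and keystrokes


def propose_action(screenshot, goal):
    """Stub for the model call. A real implementation would send the
    screenshot and goal to a vision-capable model and parse the action
    it returns. The dict format here is hypothetical."""
    return {"type": "click", "x": 200, "y": 120}


def run_agent(goal, max_steps=10):
    for _ in range(max_steps):
        shot = pyautogui.screenshot()        # observe the current screen
        action = propose_action(shot, goal)  # model chooses the next step
        if action["type"] == "click":
            pyautogui.click(action["x"], action["y"])
        elif action["type"] == "type":
            pyautogui.write(action["text"])  # send keystrokes
        elif action["type"] == "done":
            break
```

The loop itself is trivial; the hard part is the model's ability to map a screenshot to the right click or keystroke, which is exactly where behavioral training data comes in.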
To build those systems well, compiling zettabytes of published data is not enough. A model needs examples of what people actually do on a screen. Which menu they choose and which field they click first. Which keyboard shortcut they use and where they pause, retry or correct a mistake. Those traces are useful training material for systems meant to act like junior workers inside digital tools.
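For concreteness, a hypothetical sketch of one such trace follows. The schema and field names are invented for illustration; no company has published its actual telemetry format.

```python
# One recorded work session, as a time-ordered list of interface events.
# Pauses, shortcuts and corrections are signal, not noise: they show how
# an experienced user actually navigates the software.
trace = [
    {"t": 0.00, "event": "click",    "target": "File menu"},
    {"t": 0.42, "event": "click",    "target": "Export"},
    {"t": 1.10, "event": "keypress", "keys": "Ctrl+Shift+E"},  # shortcut in place of the menu
    {"t": 3.85, "event": "pause"},                             # hesitation before a field
    {"t": 4.20, "event": "type",     "target": "filename", "text": "report_q3"},
    {"t": 5.02, "event": "undo"},                              # a correction mid-task
]
```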
The Privacy Risk
Also this week, Reuters reported that AI company Clarifai deleted 3 million OkCupid user photos and the facial-recognition models trained on them after scrutiny tied to an FTC action against OkCupid and Match Group. The FTC said OkCupid had given unauthorized third-party access to personal data from millions of users, in conflict with its own privacy promises. Clarifai was not accused of wrongdoing by the FTC, but the episode shows that data collected in one setting can easily become training material in another.
The implication is that AI model developers still need new sources of data to power the next model release that can do ever more powerful things. That is what makes this new phase more sensitive than the older debates over scraping the public web. A public blog post is one thing. A record of someone's work habits, or of dating profile photos, is another.
Workplaces are now becoming training grounds. The reason is that behavioral data is hard to fake. It shows how software is actually used in the wild and what makes one application more or less usable than another. Trained on it, models can move beyond generating outputs to producing behavior that seems more human, and evokes more human-like sentiment, because it is grounded in the way people actually interact.
Anyone who has spent time inside a large company knows that real work rarely follows the ideal process shown in a product demo. Employees improvise. They jump between tabs, copy values into spreadsheets, reopen forms and use shortcuts no manual ever documents. Those habits are messy, but they are valuable. They show where digital workflows break and how experienced workers get around the friction.
That makes employee telemetry attractive to any company trying to build AI agents for enterprise work. If a model is supposed to schedule meetings, update CRM records, review dashboards, route requests or complete basic internal tasks, there is obvious value in watching how skilled employees already do those things. Reuters reported that Meta’s internal materials described the effort as a way to improve AI understanding of human-computer behavior, including menu selection and keyboard shortcuts.
Since employers already control the devices, the software stack and much of the policy environment, the enterprise offers a friendlier setting for data collection. The barrier to collection is lower there than in many consumer settings. However, worker monitoring has always been contentious. When that monitoring is linked to model training, the stakes rise.
The UK Information Commissioner’s Office is already taking a position on the use of workforce monitoring for training AI systems, and has warned that employee monitoring must be necessary, proportionate and transparent, particularly when employers collect detailed data about worker activity. Reuters reported that labor experts see Meta’s approach as a likely source of legal concern in Europe, where data protection law sets tighter limits on workplace surveillance. A company may describe the program as AI development. Regulators may still ask whether the collection itself was excessive.
While workers might welcome AI systems that make them more effective and efficient, employees may also see this as surveillance dressed up as innovation. The tradeoff between convenience and security has always been with us; now the same tradeoff is emerging between convenience and privacy. Reports of employee alarm over Meta's program are already mounting, including frustration over the lack of an opt-out on company laptops.
As AI companies keep searching for larger volumes of more detailed training data, the real question is where they draw the line when that search moves from public content into the private mechanics of daily life and work. Companies that want to build more capable systems may see this as a logical next step. Workers, regulators and consumers are far less likely to view it so casually.