
Security operations teams face a paradox: build more sensitive detection systems and drown analysts in false positives, or tune for precision and miss real attacks. Sanchit Mahajan, a Software Development Manager in Amazon’s Security organization, lives in this tension daily. He leads teams that develop platforms to process massive volumes of telemetry data, enabling the detection of sophisticated threats without overwhelming incident responders.
With over 15 years spanning payments, e-commerce, and cybersecurity, Sanchit has seen how alert fatigue compounds as organizations scale. His work focuses on a different approach: using AI to automate signal generation, correlate billions of events, and turn raw threat intelligence into actionable insights. The goal isn’t to eliminate human judgment but to free security analysts from triaging noise so they can focus on threats that matter.
In this conversation, Sanchit explains how AI helps distinguish between attacks and anomalies, why automated signal generation alters the economics of threat coverage, and what happens when both attackers and defenders begin using machine learning. For anyone working in security operations or curious about AI’s practical application in high-stakes environments, this interview offers a ground-level view of what works at scale.
Sanchit, what does “alert fatigue” actually mean in security operations, and why has it become such a critical problem at scale?
Incident response is a tier 1 operation in the security world: every security alert is assessed manually to determine whether it reflects a genuine attack or just noise. Alert fatigue happens when Incident Response Teams are paged about potential security problems (Security Alerts) to investigate, but the volume and frequency of those alerts is too high for manual investigation. Responders end up spending most of their focus triaging a queue that exceeds their available bandwidth, which is taxing in itself. A second form of alert fatigue sets in when most of the alerts they triage turn out not to be True Positives: attention is spent on problems that never needed it, and focus is pulled away from the issues that do. That in turn leads to missed True Positives (i.e., False Negatives), because after a long run of benign alerts, every new one looks similar and is easy to dismiss. For comparison, imagine you install a camera in your backyard to detect movement and call you whenever something happens, but the camera picks up everything, dust particles, insects, waving trees and grass, and calls you every hour, especially at night. Would you pick up every call, or silence your phone and risk ignoring the notification for a real intruder?
You process massive volumes of telemetry data at Amazon. How has AI changed the way security teams separate real threats from noise?
AI has really accelerated the way alerts are triaged and separated from noise. For instance, models are getting better at distinguishing new threat activity from existing, already-understood activity: a model trains on data that is already marked as noise and predicts whether new incoming alerts follow a similar pattern. If they do, it either auto-suppresses them or recommends suppression with a score, and an LLM layered on top makes it easier for Incident Responders to understand why the AI considers something noise, considerably reducing triage time. Another use of AI is grouping alerts based on shared data. For instance, a single user violating multiple detection rules may create 10 different alerts for 10 different machines, but AI can assess the pattern dynamically and map it to a single user, creating one alert instead of 10 and reducing the total number of alerts to be triaged.
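As a rough sketch of that grouping idea (not Amazon's actual pipeline; the alert fields and grouping key are illustrative), alerts that share a common actor can be collapsed into one consolidated item for triage:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Alert:
    rule_id: str
    user: str        # actor that triggered the rule (hypothetical field)
    host: str        # machine the rule fired on
    timestamp: float

def group_by_actor(alerts: list[Alert]) -> dict[str, list[Alert]]:
    """Collapse per-machine alerts that share the same actor into one group."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert.user].append(alert)
    return dict(groups)

# Ten alerts from one user on ten machines become a single group to triage.
alerts = [Alert("brute-force", "alice", f"host-{i}", 1000.0 + i) for i in range(10)]
for user, grouped in group_by_actor(alerts).items():
    print(f"{user}: 1 consolidated alert covering {len(grouped)} raw alerts")
```

In practice the grouping key is learned or chosen dynamically rather than hard-coded to a single attribute, but the effect on alert volume is the same.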
Traditional security relies on rules; if X happens, alert. What can AI-driven detection do that rules-based systems can’t when facing sophisticated attackers?
The ground truth is that an attack never happens in isolation; there is always a trail leading up to it. Rule-based detections are static, designed to catch one particular behavior, and they often miss the other ways a similar attack can be carried out. This is where AI-based detections differ: AI-driven detection looks for commonalities across multiple detection outputs and matches patterns dynamically, without a pre-configured set of rules, which makes it more accurate against variations. For instance, a user trying to log in to multiple accounts and getting denied might never cross a traditional detection's threshold, but an ML-based detection can relate the login failures through dynamic attributes, e.g., IP, user, or source machine, and flag the anomalous behavior.
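A minimal sketch of that login-failure example, assuming simple event dictionaries rather than any real log schema: instead of a per-account threshold, failures are keyed by source IP, so low-and-slow attempts spread across many accounts still surface.

```python
from collections import defaultdict

def flag_spray_attempts(events, window_seconds=3600, min_accounts=5):
    """Flag source IPs whose failed logins span many distinct accounts in a window.

    Each event is assumed to look like:
        {"ts": 1700000000, "src_ip": "203.0.113.7", "account": "alice", "outcome": "failure"}
    A static per-account rule (e.g. 10 failures on one account) would miss this,
    because each individual account only sees one or two failures.
    """
    failures_by_ip = defaultdict(list)
    for e in events:
        if e["outcome"] == "failure":
            failures_by_ip[e["src_ip"]].append(e)

    flagged = []
    for src_ip, fails in failures_by_ip.items():
        fails.sort(key=lambda e: e["ts"])
        for i, start in enumerate(fails):
            in_window = [e for e in fails[i:] if e["ts"] - start["ts"] <= window_seconds]
            accounts = {e["account"] for e in in_window}
            if len(accounts) >= min_accounts:
                flagged.append((src_ip, sorted(accounts)))
                break
    return flagged
```

A real ML-based detection would learn which attributes to pivot on rather than fixing the key to `src_ip`, but the correlation principle is the same.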
Walk us through “automated signal generation.” How does it help security analysts catch threats they’d otherwise miss?
The base theory that “attacks never happen in isolation” applies here. A signal is not an attack on its own but an indicator of malicious intent; when combined or correlated with other signals, it becomes stronger evidence of an attack. This framing establishes two things: a single signal doesn’t have to be perfectly accurate, it only has to mark an event of interest, and a signal is never treated as an issue on its own. That frees security engineers from having to build perfectly accurate detections as step one; instead, the first goal is simply reducing the event stream to the subset they are interested in.
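One way to make the signal-combination idea concrete (a toy sketch, not the production scoring model; the signal names, weights, and threshold are invented for illustration): individual signals carry modest weight, and only a combination observed on a single entity crosses the escalation bar.

```python
# Hypothetical per-signal weights; any one signal alone stays below the bar.
SIGNAL_WEIGHTS = {
    "unusual_geo_login": 0.3,
    "bulk_data_read": 0.4,
    "external_upload": 0.4,
}
ESCALATION_THRESHOLD = 0.8

def score_entity(signals_for_entity: set[str]) -> float:
    """Sum the weights of distinct signals observed on one entity (user, host, ...)."""
    return sum(SIGNAL_WEIGHTS.get(s, 0.0) for s in signals_for_entity)

def should_escalate(signals_for_entity: set[str]) -> bool:
    return score_entity(signals_for_entity) >= ESCALATION_THRESHOLD

print(should_escalate({"unusual_geo_login"}))                # False: one weak signal
print(should_escalate({"unusual_geo_login", "bulk_data_read",
                       "external_upload"}))                  # True: correlated evidence
```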
Because per-signal precision is no longer the primary concern, automating signal generation becomes much more effective: security research, internal reports, and external threat reports all become the foundation of AI-based signal generation. AI can scan the current security catalog alongside the latest industry news (threat reports, internal research) and automatically translate them into signals from its knowledge base. Doing this manually is resource-intensive and doesn’t scale at the speed the threat landscape is growing, which often creates blind spots in threat coverage.
For comparison, hundreds of new threat intelligence reports become available for review every month. Reviewing them manually is not feasible: much of the data is noise that does not apply to the company’s current scope, while other reports need deeper dives to mine for signals. Once the signals are identified, AI helps develop the detection logic against the company’s proprietary logs. The process is not fully automated and still requires review from engineers, but it significantly accelerates the work.
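The extraction step could be prototyped along these lines. This is purely illustrative: `call_llm` is a placeholder for whichever model endpoint a team actually uses, and the prompt and output schema are assumptions, not the system described above.

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (an internal or hosted LLM endpoint)."""
    raise NotImplementedError("wire this to your LLM client of choice")

EXTRACTION_PROMPT = """\
You are helping convert a threat intelligence report into detection signals.
Return JSON: a list of objects with fields
  "behavior"  - the attacker behavior described,
  "telemetry" - which log source would show it,
  "signal"    - a one-line description of a detectable event of interest.
Report:
{report_text}
"""

def extract_candidate_signals(report_text: str) -> list[dict]:
    """Draft candidate signals from a report; a security engineer still reviews them."""
    raw = call_llm(EXTRACTION_PROMPT.format(report_text=report_text))
    return json.loads(raw)
```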
Security AI faces a unique challenge: you can’t train on millions of examples of novel attacks because they don’t exist yet. How do you build detection models when the threats are constantly evolving?
There are a couple of key approaches: Threat Research and Learning through Feedback.
For threat research, security engineers ingest reports from inside and outside of Amazon and assess whether the threats they describe map to Amazon’s ecosystem. AI assesses the information in these reports alongside the threats that are already addressed internally, which gives the models the ability to predict whether a newly described threat applies to Amazon’s ecosystem. That lets us stay ahead of emerging threats that are relevant to us and react to them accordingly.
The second part is feedback learning. Based on feedback from existing security alerts that have already been analyzed by experts and marked as false positives over time, we understand which behaviors don’t apply to Amazon or are expected activities that should not be flagged. This trains the ML model to ignore well-known occurrences and focus only on anomalies seen for the first time, as well as those that map to the findings generated from threat reports.
Together, these two approaches let the ML models train effectively. The idea is to train models on our data and on industry trends, and to understand whether those trends apply to Amazon’s ecosystem. If they do, we dive deeper, which keeps us ahead of unknown attacks.
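As a simplified illustration of the feedback-learning loop (using scikit-learn here purely for the sketch; the features, labels, and model choice are not the internal system), alerts already dispositioned by responders train a model that scores new alerts instead of auto-closing them.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: alert features plus the responder's verdict (1 = true positive).
past_alerts = [
    ({"rule": "geo-anomaly", "asset_tier": "low",  "off_hours": False}, 0),
    ({"rule": "geo-anomaly", "asset_tier": "high", "off_hours": True},  1),
    ({"rule": "bulk-export", "asset_tier": "high", "off_hours": True},  1),
    ({"rule": "bulk-export", "asset_tier": "low",  "off_hours": False}, 0),
]
X = [features for features, _ in past_alerts]
y = [label for _, label in past_alerts]

model = make_pipeline(DictVectorizer(), LogisticRegression())
model.fit(X, y)

# New alerts get a probability that helps responders prioritize, not an auto-close.
new_alert = {"rule": "geo-anomaly", "asset_tier": "high", "off_hours": True}
print(model.predict_proba([new_alert])[0][1])
```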
Threat intelligence feeds dump enormous amounts of data—IPs, attack patterns, indicators of compromise. How does AI turn this flood into something security teams can actually use?
Every day, security teams get bombarded with thousands of IP addresses, domain names, file hashes, and attack signatures from dozens of different sources. Without AI, it’s like trying to find a needle in a haystack while the haystack keeps getting bigger. AI doesn’t just dump everything on analysts’ desks; it learns what’s actually relevant to your specific environment. At Amazon, our systems automatically filter out indicators that don’t apply to our infrastructure or threat landscape. So instead of getting 10,000 random IP addresses, analysts get maybe 50 that are actually worth investigating, because they’re targeting our specific services or using techniques we’re vulnerable to. The systems have also been trained on past false positives to filter out certain feeds.

AI then takes a basic IP address and automatically enriches it with context: where it’s located, what malware families it’s associated with, which threat actors use it, and how it connects to other indicators. The real impact comes from behavior and pattern recognition and correlation. AI can identify patterns across thousands of indicators that would never be found at manual scale. Maybe five different IP addresses seem unrelated, but AI notices they all resolve to domains registered with the same email address, use similar SSL certificates, and target the same vulnerabilities. The output is a complete picture of a threat actor’s infrastructure instead of a pile of disconnected data points.

The system automatically analyzes the relationships between indicators and scores them based on confidence, relevance, and potential impact. Instead of treating every indicator equally, analysts get a prioritized list where the most critical threats bubble to the top, and all this processing happens automatically and feeds directly into security tools in real time. Lastly, IOCs and IPs do not indicate an attack on their own without being combined with an actual signal, in keeping with the theory that an attack never happens in isolation.
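A stripped-down sketch of that prioritization step, with made-up field names and weights; in a real system the enrichment would come from threat intel feeds and internal asset data rather than being supplied by hand.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Indicator:
    value: str                             # e.g. an IP address or domain
    targets_our_stack: bool                # targets services/tech we actually run?
    linked_actor: Optional[str] = None     # known threat actor association, if any
    sightings: int = 0                     # related activity observed internally

def score(ind: Indicator) -> float:
    """Combine relevance, attribution, and internal sightings into a priority score."""
    s = 0.0
    if ind.targets_our_stack:
        s += 0.5
    if ind.linked_actor:
        s += 0.3
    s += min(ind.sightings, 5) * 0.04   # cap the contribution of raw volume
    return s

feed = [
    Indicator("198.51.100.9", targets_our_stack=False),
    Indicator("203.0.113.44", targets_our_stack=True, linked_actor="GroupX", sightings=3),
]
for ind in sorted(feed, key=score, reverse=True):
    print(f"{ind.value}: priority {score(ind):.2f}")
```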
Security teams talk about “correlation”—connecting different signals to spot attack patterns. How are you using AI to discover connections that humans couldn’t find across billions of events?
This brings us back to our belief that “attacks never happen in isolation.” There must therefore be a trail of the attack path and how it was executed. While manual correlation techniques do produce high-quality, low-volume alerts, they are limited by human capacity and therefore constrained. AI/ML-based detections work differently. One example is anomaly-based detection, which identifies patterns that haven’t been seen before and alerts on them, e.g., a surge in failed logins across services or a single IP downloading files in bulk. What makes this especially powerful is that AI doesn’t just find known attack patterns; it discovers new ones by identifying statistical anomalies in how events cluster together. It’s constantly learning what “normal” correlation looks like and flagging when event relationships deviate significantly from those patterns, even if we’ve never seen that specific attack sequence before. That confidence is continuously fed back from a pool of false positives to keep improving the ML/AI detections.
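For the failed-login surge example, a very basic statistical version (real systems use far richer baselines) could flag an hour whose count deviates sharply from recent history:

```python
from statistics import mean, stdev

def is_surge(hourly_counts: list[int], current: int, z_threshold: float = 3.0) -> bool:
    """Flag the current hour if it sits far above the recent baseline."""
    if len(hourly_counts) < 2:
        return False
    mu, sigma = mean(hourly_counts), stdev(hourly_counts)
    if sigma == 0:
        return current > mu
    return (current - mu) / sigma > z_threshold

baseline = [40, 52, 47, 38, 45, 50, 41, 44]   # failed logins per hour, recent history
print(is_surge(baseline, 49))    # False: within normal variation
print(is_surge(baseline, 400))   # True: anomalous spike worth correlating further
```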
The second type of ML/AI detection looks for behaviors, which is harder because the behavior isn’t defined anywhere up front; the AI first has to identify the behavior and then look for telemetry data that exhibits it, especially when volumes are huge. For these, we develop pattern matching on indicators like TTPs over resources: each signal looks for a specific TTP, but when the signals are indexed by resource, we see a graph of multiple TTPs occurring on a single resource, which now looks like a high-confidence attack. The system doesn’t just look at individual events; it builds massive relationship graphs that connect seemingly unrelated activities across time, users, systems, and networks. For example, AI might notice that a user logged in from an unusual location, then 20 minutes later there was an API call to export customer data, followed by a file upload to an external service three hours later. To a human analyst looking at alerts, those might seem like three separate, low-priority events. But AI sees the timeline and recognizes it as a classic data exfiltration pattern.
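A toy version of that timeline reasoning, with hypothetical event types: events are grouped per user and checked for the exfiltration-like ordering described above. A production system would learn sequences and build full relationship graphs rather than matching one hand-written pattern.

```python
from collections import defaultdict

# The suspicious ordering from the example above; each step may look benign alone.
EXFIL_SEQUENCE = ["unusual_location_login", "customer_data_export", "external_upload"]

def find_exfil_timelines(events, max_span_seconds=6 * 3600):
    """Return users whose events contain the exfil sequence (in order) within the span.

    Events are assumed to be dicts like {"user": "bob", "type": "...", "ts": 1700000000}.
    """
    by_user = defaultdict(list)
    for e in events:
        by_user[e["user"]].append(e)

    suspicious = []
    for user, evts in by_user.items():
        evts.sort(key=lambda e: e["ts"])
        step, first_ts = 0, None
        for e in evts:
            if e["type"] == EXFIL_SEQUENCE[step]:
                first_ts = e["ts"] if first_ts is None else first_ts
                step += 1
                if step == len(EXFIL_SEQUENCE):
                    if e["ts"] - first_ts <= max_span_seconds:
                        suspicious.append(user)
                    break
    return suspicious
```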
Security analysts are burning out from alert overload. How do you balance building sensitive detection systems with the need to cut false positives?
This is a challenge that is not fully solved yet. There is always pressure to tolerate more false positives rather than risk missing a real attack (a false negative), but that tolerance is exactly what causes fatigue for incident responders. There are, however, best practices to address it. Known or expected lists (allow lists) are a mechanism where Incident Response teams mark certain combinations as safe, so the same alerts are not sent again for a period of time. Every time an IR analyst marks something as a false positive, that information feeds back into the AI models to improve future filtering. Existing detections that produce false positives are considered for tuning based on the number of FPs they generated in the last month; this continuous tuning ensures we don’t repeat the same problems. Detection code that is poorly written, or likely to generate mass false positives, is dry-run on historical data before it is actually deployed, which protects the stability of detections. Lastly, grouping similar alerts by a common factor (powered by AI) is very useful for collating similar alerts and taking bulk action, and it avoids two different IR folks working on the same item. We also track what we call “alert-to-incident conversion rates,” basically measuring how often our alerts turn into real TPs. This helps us tune not just individual detection rules, but the entire prioritization logic. The goal isn’t to eliminate false positives entirely; that’s impossible without missing real attacks. It’s to make sure that when a responder gets an alert, it comes with enough context and confidence scoring that they can quickly determine whether it’s worth investigating. This approach has reduced analyst burnout while actually improving our detection capabilities, because analysts can focus their expertise on the threats that matter most rather than getting lost in the noise.
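Two of those practices are easy to sketch in isolation (the field names and TTL are illustrative, not the internal tooling): time-bounded allowlisting of responder-confirmed benign combinations, and per-rule alert-to-incident conversion rates.

```python
import time
from collections import Counter

class Allowlist:
    """Responder-approved (rule, entity) pairs that suppress repeat alerts for a while."""

    def __init__(self, ttl_seconds: float = 7 * 24 * 3600):
        self.ttl = ttl_seconds
        self._entries = {}  # (rule, entity) -> time the pair was marked benign

    def mark_benign(self, rule: str, entity: str) -> None:
        self._entries[(rule, entity)] = time.time()

    def is_suppressed(self, rule: str, entity: str) -> bool:
        added = self._entries.get((rule, entity))
        return added is not None and time.time() - added < self.ttl

def conversion_rates(dispositions: list[tuple[str, bool]]) -> dict[str, float]:
    """Fraction of alerts per rule that responders confirmed as real incidents."""
    alerts, incidents = Counter(), Counter()
    for rule, was_incident in dispositions:
        alerts[rule] += 1
        incidents[rule] += int(was_incident)
    return {rule: incidents[rule] / alerts[rule] for rule in alerts}
```

Rules whose conversion rate stays near zero become candidates for tuning or retirement, while the allowlist keeps known-benign combinations from paging anyone again during the TTL.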
You’ve worked in payments, e-commerce, and security—all areas where missing a threat is costly, but so are false alarms. What have you learned about deploying AI where both precision and recall matter?
What I have learned here varies from domain to domain. In payments and e-commerce, more than in security, the problem translates into fraud detection, which is very specific to that line of business. For instance, flagging $10 transactions can be very taxing given the sheer scale of such transactions and the blast radius of losing customer trust, while false positives on $10K transactions can get escalated very quickly. The solution I have learned lies in thresholding on the amounts and the trend of the attack, which works well in payments. E-commerce is another complex challenge: DDoS attacks on sales days can be very damaging, and catching them requires ML models that quickly distinguish organic volume from coordinated volume intended to bring services down, and that quickly isolate those accesses. What has worked for me is accepting that producing mass alerts is unavoidable; at Amazon scale, these systems are meant to behave that way. AI/ML models help by quickly grouping similar attacks into smaller sets that can then be triaged effectively, and AI further enriches those grouped alerts, so by the time one lands for human intervention, all the details and analysis, alongside LLM-generated summaries, are served to responders who can quickly move on. Again, continuous feedback loops are critical: every time an analyst marks something as a false positive, that information feeds back into the AI models to improve future filtering. I would say AI in security is still too young to predict or catch attacks precisely on its own; at Amazon scale it acts as a helper and efficiency booster. These models can be extremely accurate for small businesses, but at large scale they often run into a high cost of operations for the accuracy trade-offs, which means they need a great deal of tuning.
How do you see AI transforming security operations over the next five years? Will automated systems handle most threat detection and response, or will security always need human judgment?
I believe AI will have a much greater impact over the next five years, given the current pace of development. But I don’t feel it will replace humans; rather, it will complement their skill set and increase efficiency. At the scale Amazon is growing, current human resources will be outnumbered, so the role of AI and ML-powered detection rules is critical. How I see it playing out: AI will handle the heavy lifting of data processing, correlation, and initial threat assessment, but humans will still make the critical decisions about response and remediation. AI has its own challenges to overcome, like token usage and hallucinations, but I believe the future is AI/ML models running cheaper and faster than ever, which would be game-changing. Human judgment will remain critical for several reasons. Sophisticated attackers will increasingly use AI themselves, creating an arms race where human creativity and intuition become competitive advantages. The business context and risk decisions around security incidents require understanding organizational priorities, regulatory requirements, and strategic implications that AI can’t fully grasp, and when things go wrong you need humans who understand both the technology and the business to make rapid decisions under pressure. I don’t feel these basics can be replaced. Just as I don’t expect AI to make everyone rich, I don’t feel AI is a silver bullet to 100% automate security operations.


