Rules of War in the Age of Algorithms

How International Law Fails to Govern AI

Part 2 of Israel’s AI Revolution

Introduction: When Machines Decide Who Dies

In the fall of 2023, the Israel Defense Forces began running a targeting operation in Gaza unlike anything in the history of modern warfare. At the center of it were two artificial intelligence systems. The first, known as Lavender, scanned the data profiles of virtually every person in the Gaza Strip—phone records, social media activity, known associations, movement patterns—and assigned each one a numerical score indicating their suspected affiliation with Hamas or Palestinian Islamic Jihad. By some reports, the system flagged approximately 37,000 Palestinians as potential targets.[1] The second system, called The Gospel, operated on a parallel track, processing surveillance data to recommend physical targets—buildings, tunnels, private homes—for airstrikes. Where human analysts in previous conflicts might generate fifty targets in a year, The Gospel could produce over a hundred in a single day.[2]

Officers reportedly approved AI-generated targets in as little as twenty seconds.[3] The system’s own designers acknowledged an error rate of roughly ten percent, meaning that for every ten people Lavender flagged as militants, one was likely a civilian.[4] At the scale these systems operated—hundreds of strikes per day, sustained over weeks and then months—that margin of error did not represent an occasional mistake. It represented a policy.

What those numbers looked like, on the ground, was this: night after night, residential buildings collapsed into rubble on live television. Families were pulled from the wreckage. Neighborhoods that had stood for generations were reduced to gray expanses of concrete dust. It happened so many times that identifying any single strike as exceptional became impossible—not because the individual horrors were ordinary, but because there were so many of them that they blurred together into an unbroken stream of destruction. Anyone watching the footage in real time understood, even without knowing the details of the systems behind it, that something fundamental had changed about the way this war was being fought.

By March 2026, more than 72,000 Palestinians had been confirmed killed, including over 20,000 children. Independent epidemiological studies suggest the true toll is significantly higher—one population-representative survey published in The Lancet Global Health estimated 75,200 violent deaths through January 2025 alone, a figure more than a third higher than official counts for the same period.[5] Roughly ninety percent of Gaza’s civilian infrastructure was destroyed. Famine was declared in parts of the Strip. A UN Independent Commission of Inquiry concluded, in September 2025, that Israel had committed genocide.[6]

That change is the subject of this essay. International Humanitarian Law—the body of rules that has governed armed conflict since the Geneva Conventions—was designed for wars in which human beings identified targets, weighed risks, and bore personal responsibility for the consequences. It was not designed for wars in which an algorithm generates kill lists faster than any human can meaningfully review them. The question is no longer whether AI will reshape the conduct of war. It already has. The question is whether the legal architecture meant to limit war’s worst consequences can survive the encounter.

The Rules That Were Supposed to Hold

International Humanitarian Law, also known as the laws of war or the law of armed conflict, is the product of a simple and terrible insight: that wars will happen regardless of our efforts to prevent them, and that some restraint on how they are fought is better than none. The modern framework rests primarily on two pillars: the Geneva Conventions of 1949, along with their Additional Protocols of 1977, which establish protections for civilians, prisoners of war, and the wounded; and the Rome Statute of 1998, which created the International Criminal Court and defined the categories of war crimes, crimes against humanity, and genocide.

The entire system is built on three foundational principles, each of which assumes that a human being stands at the point of decision.

Distinction

The principle of distinction requires combatants to differentiate between military targets and civilians. Deliberately targeting civilians is a war crime. In practice, this means that the person authorizing a strike must make a judgment—based on available intelligence, observation, and context—that the target is in fact a combatant or a legitimate military objective. The principle does not demand perfection, but it demands a genuine effort at discernment.

Proportionality

Even when a target is legitimately military, the expected civilian harm must not be excessive relative to the concrete and direct military advantage anticipated. This is not a formula; it is a judgment call, and it is meant to be a difficult one. A commander who levels an apartment block to kill a single low-ranking operative has violated proportionality even if the operative was a legitimate target. The principle exists precisely because wars are fought in places where civilians live, and it imposes on the attacker the burden of weighing one kind of destruction against another.

Precaution

Parties to a conflict must take all feasible precautions to avoid or minimize civilian harm. This includes verifying that targets are what they appear to be, choosing weapons and methods of attack that reduce collateral damage, and—critically—canceling or suspending an attack if it becomes apparent that the target is not a military objective or that the expected civilian harm would be disproportionate.

In conventional warfare, accountability for violations of these principles falls on individuals and command structures. A soldier who fires on a clearly marked hospital can be prosecuted. A commander who orders a disproportionate strike bears legal responsibility. The chain of causation runs from decision to action to consequence, and at each link, a human being is answerable.

That chain is what AI disrupts.

The Machine in the Loop

The phrase most often used to describe the role of humans in AI-assisted military systems is “human in the loop.” It is meant to be reassuring: the algorithm recommends, but a person decides. In practice, as the evidence from Gaza suggests, the loop can become so compressed that the human element is reduced to a formality—a rubber stamp applied in seconds to a decision the system has already made.[7]

This is not a theoretical concern. It is the operational reality of how AI has been deployed in at least one major armed conflict, and it challenges each of IHL’s core principles in ways that the existing legal framework is not equipped to address.

Distinction Under Pressure

Lavender’s targeting decisions are derived from data inputs: phone records, social media activity, surveillance footage, known associations. These inputs are proxies. They may correlate with militant activity, but they are not the same thing as verified intelligence. A person who communicates frequently with a known Hamas member may be a militant—or a relative, a journalist, a doctor. The algorithm cannot make this distinction with certainty. It can only assign a probability.

A ten percent error rate, applied to 37,000 flagged individuals, implies that roughly 3,700 people were misidentified—marked for potential targeting despite having no militant affiliation.[8] IHL requires combatants to verify their targets. But when the verification step consists of a twenty-second review of an algorithmically generated recommendation, the principle of distinction has not been applied. It has been performed.
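The arithmetic deserves to be made explicit, because scale is the point. The sketch below, in Python, does two things: the first lines simply restate the reported figures, and the second half uses entirely hypothetical numbers to illustrate the base-rate problem that afflicts any classifier searching a large population for a rare condition. None of it reconstructs how Lavender actually works; it only shows what the statistics of this kind of system look like.

```python
# Illustrative arithmetic, not a reconstruction of any real system.
# The first two figures come from the reporting cited in this essay;
# everything after them is a hypothetical assumption for illustration.

flagged = 37_000      # individuals reportedly flagged by the system
error_rate = 0.10     # reported share of flagged people with no militant affiliation
print(f"Implied misidentifications: {flagged * error_rate:,.0f}")  # ~3,700

# The base-rate problem: a classifier that looks "90% accurate" in testing
# behaves far worse when swept across a population in which the condition
# it hunts for is rare. All four numbers below are assumptions.
population = 2_300_000       # assumed size of the population scanned
base_rate = 0.01             # assumed share actually affiliated
sensitivity = 0.90           # assumed chance a true affiliate is flagged
false_positive_rate = 0.01   # assumed chance a civilian is misflagged

true_pos = population * base_rate * sensitivity
false_pos = population * (1 - base_rate) * false_positive_rate
share_civilian = false_pos / (true_pos + false_pos)
print(f"Hypothetical flags: {true_pos + false_pos:,.0f}, "
      f"of which {false_pos:,.0f} ({share_civilian:.0%}) are civilians")
```

Under those assumptions, more than half the people flagged would be civilians, even though the classifier misjudges only one percent of them. A ten percent error rate on the flagged list, in other words, is not a worst-case reading of how such systems can fail; it may be a generous one.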

Proportionality at Scale

The proportionality assessment is inherently case-by-case: each strike must be evaluated on its own terms, weighing the anticipated military advantage against the expected civilian cost. AI-driven targeting, by generating targets at industrial volume, makes this individualized assessment functionally impossible at the pace the system demands.[9]

Reports indicate that for junior Hamas operatives, the IDF accepted a collateral damage threshold of up to fifteen or twenty civilians per strike. For senior commanders, no upper limit was specified.[10] Whatever one’s view of these thresholds, the sheer number of strikes—hundreds per day, sustained over months in one of the most densely populated areas on earth—raises the question of whether proportionality can mean anything at all when applied to an operation of this scale and pace.
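One way to grasp the difficulty is to multiply the reported thresholds by the reported tempo. The sketch below is back-of-the-envelope arithmetic, not a casualty estimate: the fifteen-civilian ceiling comes from the reporting cited above, the strike rate from the accounts of "hundreds per day," and the remaining figures are assumptions chosen only for illustration.

```python
# Back-of-the-envelope arithmetic, not a casualty estimate.
strikes_per_day = 200       # "hundreds of strikes per day" per reporting; exact rate assumed
days = 30                   # assumption: one month of sustained operations
ceiling = 15                # reported accepted collateral threshold for junior operatives
share_near_ceiling = 0.25   # assumption: fraction of strikes approaching that ceiling

permitted = strikes_per_day * days * ceiling * share_near_ceiling
print(f"Civilian deaths the thresholds alone would permit in one month: {permitted:,.0f}")
# ~22,500: each strike can clear its own proportionality bar while the
# campaign-level total resembles nothing the principle was meant to allow.
```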

Precaution and the Accelerated Kill Chain

The precaution principle assumes that there is a moment—between the identification of a target and the execution of a strike—in which a human being can pause, reconsider, and if necessary, call it off. AI systems are designed, by their nature, to compress exactly this moment. The value they offer to a military is speed: the ability to move from intelligence to action faster than an adversary can adapt.

But speed and precaution are in direct tension. A system that generates targets faster than humans can review them does not merely risk violating the precaution principle. It structurally undermines it. The twenty-second approval window reported in Gaza is not a failure of the system. It is the system working as intended.[11]
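The time budget makes the point concrete. A rough sketch, using the reported twenty-second window and The Gospel's reported output of over a hundred targets per day; the one-hour figure for what a genuine review might require is an assumption, not a reported standard.

```python
# Rough time-budget arithmetic; the review-depth figure is an assumption.
targets_per_day = 100     # reported output: "over a hundred in a single day"
seconds_per_review = 20   # reported approval window

total_minutes = targets_per_day * seconds_per_review / 60
print(f"Human review time for a full day of targets: {total_minutes:.0f} minutes")  # ~33

minutes_per_genuine_review = 60  # assumption: one analyst-hour per target
hours_needed = targets_per_day * minutes_per_genuine_review / 60
print(f"Analyst-hours genuine review would require: {hours_needed:.0f} per day")  # 100
# The pipeline's throughput is achievable only if review stays perfunctory:
# slow the human down to the pace of real scrutiny, and the system's
# central advantage, speed, disappears.
```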

The Accountability Gap

When a strike goes wrong in conventional warfare, the chain of responsibility is at least theoretically traceable: the intelligence analyst who identified the target, the commander who authorized the strike, the pilot who executed it. When an AI system generates the target, that chain fractures. Who is responsible if Lavender’s algorithm misidentifies a civilian as a militant? The software developer who designed it? The intelligence officer who set its parameters? The commander who approved the strike in twenty seconds? The institution that deployed the system at this scale?

The answer, under current international law, is unclear—and that ambiguity is not incidental. It is structural.[12] Opacity compounds the problem: these systems are classified, their training data is secret, their decision-making processes are opaque even to many of the people who use them. Investigating a potential war crime requires understanding how and why a target was selected. When the answer is “the algorithm determined it,” investigation hits a wall that IHL was never designed to scale.

The Systems: A Case Study in Algorithmic War

The Israeli military’s use of AI in Gaza is not the only instance of algorithmic targeting in modern warfare, but it is the most extensively documented. Investigations by +972 Magazine, Human Rights Watch, and others have provided an unusually detailed picture of how these systems operate in practice—and what they produce.[13][14]

Lavender: Who to Kill

Lavender functions as a scoring engine for human targets. It ingests data from multiple surveillance streams and assigns each individual in Gaza a rating reflecting their estimated probability of affiliation with Hamas or Palestinian Islamic Jihad. At the operation’s peak, the system’s database encompassed virtually the entire adult population of the Strip. Individuals who crossed a threshold score were added to kill lists and passed to operational units for execution.[15]
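The threshold mechanism deserves emphasis, because it means the length of a kill list is a policy choice, not a discovery. A minimal sketch with entirely synthetic scores shows how moving the cutoff manufactures targets from the same unchanged data; nothing here reflects Lavender's actual scoring model, which remains classified.

```python
import random

# Entirely synthetic: a million hypothetical "affiliation scores" drawn from
# a skewed distribution, standing in for a population where true affiliation
# is rare. No relation to any real system's model or data.
random.seed(0)
scores = [random.betavariate(1.2, 8.0) for _ in range(1_000_000)]

for threshold in (0.8, 0.6, 0.4):
    flagged = sum(s >= threshold for s in scores)
    print(f"cutoff {threshold}: {flagged:,} people flagged")

# Each step down the cutoff multiplies the list; the people behind the
# scores have not changed, only the dial the operators chose to set.
```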

The IDF has characterized Lavender and similar tools as “auxiliary tools that assist officers” and insists that every target undergoes individual human assessment consistent with international law.[16] Critics, including former intelligence personnel who spoke to +972 Magazine, describe a reality in which human review was perfunctory at best—a brief confirmation that the algorithmically generated target was male, followed by authorization to strike.[17]

The Gospel: What to Destroy

While Lavender identifies people, The Gospel identifies places. It processes surveillance data, signals intelligence, and geospatial information to recommend physical targets: buildings assessed to house military infrastructure, tunnel entrances, weapons storage sites, and—controversially—private residences.[18]

The Gospel’s output volume represents a qualitative shift in how targeting works. In previous operations, human analysts produced targets at a pace measured in dozens per year. The Gospel generates them by the hundred per day. The IDF has pointed to this capability as evidence that AI increases precision by filtering irrelevant data and focusing analyst attention on verified threats.[19] But the volume itself creates a problem: when a system produces targets faster than any human team can independently verify them, the claim that each one has been individually assessed becomes difficult to sustain.

Where’s Daddy?: When to Strike

The third system in this triad is the one that most directly tests the limits of what international law can absorb.

“Where’s Daddy?” is a real-time tracking tool. Its function, as described in investigative reporting, is to monitor the movements of individuals flagged by Lavender and alert operators when the target has returned to their family home.[20][21] The operational logic is straightforward: rather than striking a target in the field, where they might be surrounded by other combatants or in a location that is difficult to hit, the system identifies the moment when the target is at home—surrounded not by fighters, but by their family.

The implications require no legal expertise to understand. A system designed to time strikes for when a target is in their home is a system designed to strike when civilians—spouses, children, elderly parents—are most likely to be present. Whatever military advantage is gained by killing the target in a known, stationary location must be weighed against the near-certainty of killing the people who live with them.

This is where the precaution principle does not merely fail under pressure. It inverts. The system is not failing to minimize civilian harm. It is optimizing for a condition—the target’s presence at home—that maximizes it. The IDF has not, to date, provided a detailed public response to reporting on this specific system.

The IDF’s Position

It is important to present the Israeli military’s stated position fairly. The IDF maintains that all AI tools used in targeting operations are advisory in nature, that every strike is authorized by a human officer who conducts an independent legal and operational assessment, and that its operations in Gaza comply with the principles of distinction, proportionality, and precaution as defined under international law.[22]

These claims are not implausible on their face. Militaries around the world use automated systems to assist in targeting, and the presence of AI in the process does not automatically constitute a legal violation. The question is whether the scale, speed, and systemic design of these particular systems—especially the volume of strikes, the brevity of human review, and the operational logic of tools like “Where’s Daddy?”—are compatible with the meaningful human oversight that IHL requires. The evidence available to date suggests, at minimum, serious grounds for concern.

The Infrastructure Deepens

It is also worth noting that the AI architecture underpinning Israeli military operations has not contracted since the original reporting on Lavender and The Gospel. In March 2025, a joint investigation by The Guardian, +972 Magazine, and Local Call revealed that Unit 8200 had been building a large language model—similar in design to ChatGPT—trained on millions of intercepted Palestinian phone conversations and text messages obtained through blanket surveillance of the occupied territories.[23] The system is designed to answer questions about individuals under monitoring, drawing connections across vast surveillance datasets that would be impossible for human analysts to process manually. Former western intelligence officials described the project as going further than what would be acceptable in allied agencies with stronger oversight of surveillance powers.

Separately, the IDF deployed an internal AI chatbot called “Genie” across all military command centers in early 2025, capable of pulling real-time operational data and providing natural-language answers to commanders in the field.[24] These systems represent the next layer: not just AI that selects targets, but AI that processes, interprets, and contextualizes the entire intelligence picture on which targeting decisions are based. The trajectory is toward deeper integration, not less.

Why the Law Isn’t Catching Up

The standard policy response to the challenges described above follows a familiar script: draft new treaties, require transparency, mandate human oversight, hold corporations accountable. These proposals are not wrong in principle. They are, at present, functionally irrelevant—and the reasons why tell us more about the problem than the proposals themselves.

Every Major Power Wants These Tools

The United States, China, Russia, the United Kingdom, France, and a growing number of middle powers are all investing heavily in AI-enabled military systems. The incentive structure is straightforward: AI targeting offers a decisive operational advantage, and no state with the capacity to develop it is going to forgo that advantage on the basis of a legal framework that its competitors may not respect. This is the same dynamic that has stalled meaningful arms control in every domain from nuclear weapons to cyberwarfare.

Binding international treaties on autonomous weapons have been under discussion at the United Nations Convention on Certain Conventional Weapons since at least 2014. Progress has been, by any honest assessment, negligible. In November 2025, the UN General Assembly passed a historic resolution calling for a legally binding agreement on lethal autonomous weapons by 2026. It passed overwhelmingly—156 nations in favor. The United States and Russia were among the five that voted against.[25]

The states with the most advanced AI military programs remain precisely the ones least willing to constrain them. For its 2026 fiscal year, the U.S. Department of Defense requested over thirteen billion dollars for autonomous weapons and systems—while simultaneously gutting the oversight offices responsible for assessing civilian harm. Unlike landmines or cluster munitions—where the Ottawa Treaty succeeded in part because the weapons were seen as strategically marginal—AI targeting systems are viewed by military planners as the future of warfare itself.[26]

The Speed Mismatch

International law moves at the speed of diplomacy: years of negotiation, ratification, implementation. AI capability moves at the speed of software deployment. By the time a treaty addressing current AI targeting systems could be drafted, negotiated, and ratified, the technology it addresses would be two or three generations obsolete. This is not merely a practical obstacle. It is a structural mismatch between the pace of the problem and the pace of the available remedy.

Classification as Shield

Transparency requirements—the most frequently proposed reform—run directly into national security classification. No military is going to publish the training data, error rates, or decision logic of its targeting algorithms. The operational details that would be necessary to assess legal compliance are precisely the details that states will classify. The result is a system in which violations are, by design, nearly impossible to investigate from the outside—and the states responsible for oversight are the same states deploying the systems.

The Accountability Shell Game

Proposals for corporate accountability face a different version of the same problem. The companies that develop military AI systems—whether defense contractors or dual-use technology firms—operate under government contracts, often classified, in jurisdictions where the legal framework for holding them accountable for downstream military use is thin at best. A tech company whose facial recognition algorithm is incorporated into a targeting system did not authorize any particular strike. The commander who authorized the strike relied on a recommendation from a system whose internal workings they may not fully understand. The developer who trained the algorithm worked on a general-purpose tool that was later adapted for military use. Responsibility disperses at every level.

The Deeper Failure

But beneath all of these obstacles—the arms race dynamics, the speed mismatch, the classification barriers, the dispersal of corporate accountability—lies a more fundamental problem. The enforcement architecture of international law was not merely slow to address AI in warfare. It was actively prevented from addressing the conflict in which AI targeting was most extensively documented.

The legal tools to respond to what happened in Gaza already existed, and they were used. South Africa brought a case to the International Court of Justice alleging violations of the Genocide Convention. The United Nations General Assembly voted repeatedly for ceasefire resolutions. Multiple nations filed formal legal challenges and demanded investigations through established international channels. In September 2025, a UN Independent Commission of Inquiry—after a two-year investigation—concluded that Israel had committed genocide, finding it responsible for four of the five acts defined under the Genocide Convention.[27] In November 2024, the International Criminal Court issued arrest warrants for Prime Minister Netanyahu and former Defense Minister Gallant on charges of war crimes.[28] The machinery of international law was activated—correctly, procedurally, by the book.

None of it mattered. The United States, exercising its veto power as a permanent member of the Security Council, blocked binding resolutions that would have imposed consequences. When the ICC issued its arrest warrants, the U.S. responded not by supporting enforcement but by sanctioning the court’s judges—targeting, among others, the two who had voted to continue the investigation into Gaza.[29] The pattern was not new—Washington has shielded Israel from Security Council action for decades—but its significance in this context is acute. The nations that sought legal remedy were not lacking better statutes. They were not undone by gaps in IHL’s coverage of algorithmic targeting. They were overridden by the structural reality that a single state, acting in its own strategic interest, can neutralize the entire enforcement mechanism of international law—and punish the institutions that attempt to apply it.

This is the fact that no discussion of legal reform can afford to ignore. The AI-specific gaps in IHL are real and they matter. But they are layered on top of an enforcement architecture that was already broken—one in which the most powerful states are functionally exempt from the rules they helped write. Closing the legal gaps around algorithmic targeting, autonomous weapons, and accountability for AI-driven strikes would be meaningful progress. But it would not have changed the outcome in Gaza, because the obstacle was never the absence of applicable law. It was the absence of any power willing and able to enforce it against the interests of those who hold the veto.

This Is Not an Israel Problem

It is tempting, and in some quarters politically convenient, to frame the challenges of AI in warfare as specific to the Israeli-Palestinian conflict. They are not. The United States has used algorithmic targeting in drone operations across Yemen, Somalia, and Pakistan for years, with well-documented patterns of civilian casualties and minimal accountability. Ukraine and Russia are both deploying autonomous drone systems with increasing sophistication. China is developing AI-enabled military capabilities at a scale that dwarfs any individual theater.

What makes the Gaza case significant is not that it is unique, but that it is the most visible. The investigative reporting that has documented Lavender, The Gospel, and “Where’s Daddy?” exists in part because Israeli civil society, despite enormous political pressure, has produced journalists and whistleblowers willing to make these systems public. In many other contexts—the U.S. drone program, Chinese military AI—the details are far less accessible. The structural problems are the same.

The Weight of the Problem

There is a version of this essay that ends with a call to action: demand reform, support treaties, hold governments accountable. Those things are worth doing, and the people doing them—the human rights organizations, the investigative journalists, the whistleblowers who risked their safety to make these systems public—deserve more than a rhetorical nod in a concluding paragraph.

But honesty requires acknowledging that the trajectory of AI in warfare is not bending toward accountability. Every major military power is developing these tools. The legal frameworks meant to constrain them were designed for a different era and are not adapting at the speed required. The incentive structures that drive states to develop AI targeting are stronger than the incentive structures that might lead them to restrain it.

What happened in Gaza—the algorithmic generation of kill lists, the twenty-second approvals, the system that tracked people to their homes so they could be struck alongside their families—is not an aberration. It is a preview. The technology will get faster, cheaper, and more widely available. The conflicts in which it is deployed will multiply. And unless something fundamental changes—not a treaty, not a regulation, but a shift in the political willingness of states to accept constraints on their own power—the gap between the law and the reality of modern warfare will continue to widen.

The rules of war were written by human beings, for human beings, on the assumption that human judgment would remain at the center of the most consequential decisions in armed conflict. That assumption is no longer operative. What replaces it will define not just the future of warfare, but the future of the principle—fragile, imperfect, but indispensable—that even in war, there are things we do not do.


[1] +972 Magazine, “‘Lavender’: The AI machine directing Israel’s bombing spree in Gaza,” April 3, 2024.

[2] +972 Magazine, “‘A mass assassination factory’: Inside Israel’s calculated bombing of Gaza,” November 30, 2023.

[5] Gaza Ministry of Health figures as reported by OCHA and UNRWA, current through March 2026. Independent estimates, including a population-representative study published in The Lancet Global Health, suggest significantly higher figures.

[6] UN Independent International Commission of Inquiry on the Occupied Palestinian Territory, “Legal analysis of the conduct of Israel in Gaza pursuant to the Convention on the Prevention and Punishment of the Crime of Genocide,” A/HRC/60/CRP.3, September 16, 2025.

[7] Lieber Institute for Law and Land Warfare (West Point), “The Gospel, Lavender, and the Law of Armed Conflict,” June 28, 2024.

[9] Human Rights Watch, “Questions and Answers: Israeli Military’s Use of Digital Tools in Gaza,” September 10, 2024.

[12] Verfassungsblog, “Gaza, Artificial Intelligence, and Kill Lists,” May 16, 2024.

[16] IDF response to claims about use of ‘Lavender’ AI database in Gaza, as published by The Guardian, April 3, 2024.

[23] The Guardian, +972 Magazine, and Local Call, “Revealed: Israeli military creating ChatGPT-like tool using vast collection of Palestinian surveillance data,” March 6, 2025.

[24] Ynet News, reporting on the IDF’s “Genie” system, April 2025.

[25] UN General Assembly First Committee resolution calling for a legally binding agreement on lethal autonomous weapons systems, November 2025. The resolution passed 156 in favor, with the United States and Russia among the five nations voting against.

[26] The Ottawa Treaty (Mine Ban Treaty), 1997.

[28] ICC arrest warrants for Prime Minister Benjamin Netanyahu and former Defense Minister Yoav Gallant issued November 2024. ICC Appeals Chamber rejected Israel’s challenge to the investigation, December 15, 2025.

[29] U.S. sanctions against ICC judges Gocha Lordkipanidze and Erdenebalsuren Damdin, December 18, 2025, as reported by Al Jazeera and others. The sanctions followed earlier rounds targeting ICC prosecutors and staff throughout 2025.