AI could soon tackle projects that take humans weeks

reuters

By Lala Hajiyeva

March 22, 2025 23:00

AI is rapidly advancing and could soon handle tasks that typically take humans weeks to complete. A recent analysis shows that AI models are closing the performance gap, with the potential to transform industries by taking on complex, time-consuming projects more efficiently.

Artificial intelligence (AI) systems are making significant strides and may soon be able to tackle complex tasks that typically take humans weeks to complete, according to recent analysis. While current AI models still lag behind humans in handling lengthy tasks, their rapid improvement suggests that they could close this gap sooner than anticipated.

METR, a non-profit organization based in Berkeley, California, has been evaluating AI models’ performance across nearly 170 real-world tasks, including coding, cybersecurity, general reasoning, and machine learning. The team measured how long it typically takes expert programmers to complete these tasks and established a ‘human baseline.’

To assess AI’s progress, METR introduced a new metric called the ‘task-completion time horizon.’ A model's time horizon is the length of task, measured by how long human experts take to complete it, that the model can finish at a specified success rate.
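One way to make the metric concrete is to fit a logistic curve of success against the logarithm of human task length and read off where it crosses 50%. The sketch below does this with a small hand-written Newton's-method fit. The results table is entirely made up for illustration, and the names fit_logistic and horizon_50 are hypothetical; this is not METR's code or data.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Fit P(success) = sigmoid(a + b*x) by Newton's method; returns (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p           # gradient of the log-likelihood
            g_b += (y - p) * x
            h_aa += w              # entries of the negative Hessian
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Hypothetical results: human minutes per task -> successes out of 10 attempts
results = {1: 10, 2: 10, 4: 9, 8: 8, 16: 5, 32: 2, 64: 1, 128: 0, 256: 0}

xs, ys = [], []
for minutes, successes in results.items():
    for attempt in range(10):
        xs.append(math.log2(minutes))   # x is the log2 of task length
        ys.append(1 if attempt < successes else 0)

a, b = fit_logistic(xs, ys)
horizon_50 = 2 ** (-a / b)   # task length where predicted success is 50%
print(f"50% time horizon: about {horizon_50:.0f} human-minutes")
```

Because the made-up success counts are symmetric around the 16-minute tasks, the fitted curve crosses 50% at 16 minutes, which is the time horizon for this toy model.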

According to a preprint released by METR on arXiv this week, earlier AI models such as GPT-2, released by OpenAI in 2019, were unable to complete tasks that took human experts more than a minute. More recent models, such as Anthropic's Claude 3.7 Sonnet, launched in February, completed 50% of tasks that would typically take humans 59 minutes.

The study reveals a striking pattern: the time horizon of the 13 leading AI models has doubled approximately every seven months since 2019. The growth rate accelerated in 2024, with the latest models doubling their time horizon roughly every three months. If this trend continues, METR predicts that AI models will be able to tackle tasks requiring one month of human expertise with 50% reliability by 2029, or possibly even sooner.
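The arithmetic behind such an extrapolation is simple enough to check by hand. The sketch below counts how many doublings separate a 59-minute horizon from a month of work and converts that into calendar time at each doubling rate. It assumes one month of work is about 167 working hours, and the function name months_until is hypothetical; it is a back-of-the-envelope check, not METR's forecasting method.

```python
import math

def months_until(target_minutes, current_minutes, doubling_months):
    """Months for an exponentially growing horizon to reach target_minutes."""
    doublings_needed = math.log2(target_minutes / current_minutes)
    return doublings_needed * doubling_months

CURRENT_HORIZON_MIN = 59        # Claude 3.7 Sonnet's 50% horizon, as reported above
WORK_MONTH_MIN = 167 * 60       # assumption: one month of work is about 167 hours

estimates = {}
for doubling_months in (7, 3):  # long-run pace and the faster 2024 pace
    estimates[doubling_months] = months_until(
        WORK_MONTH_MIN, CURRENT_HORIZON_MIN, doubling_months
    )
    years = estimates[doubling_months] / 12
    print(f"doubling every {doubling_months} months: about {years:.1f} years")
```

Under these assumptions the seven-month pace takes a little over four years from early 2025, broadly in line with the 2029 estimate, while the three-month pace takes under two years.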

The implications of this rapid progress are vast. Tasks that take a month of human effort could range from starting a new company to making groundbreaking scientific discoveries. However, some experts are cautious about extrapolating these findings too far. Joshua Gans, a professor at the University of Toronto, argues that while extrapolations are tempting, there remains considerable uncertainty about how AI will actually be deployed in real-world scenarios.

Assessing AI’s Capabilities

METR’s analysis focuses on a 50% success rate for task completion because that threshold is more stable than the alternatives: minor fluctuations in the distribution of tasks do not significantly change the results. The study also explored the impact of raising the reliability threshold to 80%, which reduced the average time horizon by a factor of five, though the overall doubling trend remained similar.
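A stricter threshold can shrink the horizon without changing the trend because, if success follows a logistic curve in the logarithm of task length, raising the threshold moves the crossing point by a fixed ratio. The sketch below illustrates this with a hypothetical curve: the slope of -0.6 per doubling of task length is chosen only so that the ratio comes out near five, and the function name horizon is an assumption, not part of METR's analysis.

```python
import math

def horizon(p, a, b):
    """Task length in minutes where sigmoid(a + b*log2(minutes)) equals p."""
    logit = math.log(p / (1 - p))
    return 2 ** ((logit - a) / b)

b = -0.6                                 # hypothetical slope per doubling of length
a_now = -b * math.log2(59)               # intercept giving a 59-minute 50% horizon
a_later = -b * math.log2(118)            # same curve after the horizon doubles

ratio_now = horizon(0.5, a_now, b) / horizon(0.8, a_now, b)
ratio_later = horizon(0.5, a_later, b) / horizon(0.8, a_later, b)

print(f"80% horizon now: about {horizon(0.8, a_now, b):.0f} minutes")
print(f"50%/80% ratio now: {ratio_now:.2f}, after one doubling: {ratio_later:.2f}")
```

The ratio is the same before and after the doubling, so in this toy model the 80% horizon grows at exactly the same pace as the 50% horizon, only from a lower starting point.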

AI’s improvements over the past five years can largely be attributed to advances in model scale, such as the amount of training data and the number of parameters used in the models. METR credits the progress in time horizon specifically to advancements in AI’s logical reasoning, tool usage, error correction, and self-awareness.

The time horizon metric helps overcome some of the limitations of traditional AI benchmarks, which often fail to accurately reflect real-world work and quickly reach a plateau. As AI continues to improve, the metric provides a continuous, intuitive measure that better captures long-term progress.

While current AI models excel in specific benchmarks, their broader economic impact has been relatively limited. According to METR, the best models currently operate within a time horizon of around 40 minutes, a timeframe that doesn’t necessarily align with highly valuable economic tasks. However, AI researcher and entrepreneur Anton Troynikov argues that AI could have a more significant economic impact if organizations were more willing to experiment with and invest in these models.

The Future of AI and Economic Impact

The ongoing advancements suggest that AI models may eventually handle tasks that would otherwise require weeks of human effort. Realizing this potential, however, will require both technological progress and a shift in how organizations deploy and integrate AI into their operations.

As AI continues to evolve, its role in shaping industries and driving innovation could become even more profound. While the path forward remains uncertain, one thing is clear: the future of AI holds immense potential, and its rapid progress could soon reshape how we approach complex tasks.
