AI Being Super Useful Matters More Than Being Super Applicable

December 1, 2024

When will super applications emerge? This question has dominated discussions in the AI industry over the past year. Comparisons to the PC and mobile Internet eras, which saw the rapid rise of super apps, are tempting but oversimplified. AI represents a transformative technological wave, akin to the Industrial Revolution, making its trajectory more comparable to the rise of steam engines or electricity than to recent digital innovations.

In 1776, James Watt's improved steam engine entered commercial service and ushered in the Age of Steam, but railways, shipping, and industry did not fully embrace the technology until well into the 1800s. Similarly, the electric power revolution, central to the second industrial revolution, needed decades before power plants, electric lights, and assembly lines redefined industries. These examples show that groundbreaking technologies often take a long time to mature into practical, widely adopted applications.

The super applications of the AI era will undoubtedly arrive, but their time has not yet come. Over the past year, the AI industry's eagerness to pursue a single super application has looked premature. Foundation models, such as large language models, hold immense potential but do not deliver practical value on their own. Their real significance lies in enabling diverse applications tailored to specific problems.

For AI developers and entrepreneurs, focusing on gradual progress and iterative improvements is far more strategic than chasing the elusive AGI or super application. Incremental innovation can pave the way for genuinely transformative solutions.

Recent developments illustrate this point. At the Baidu World Conference 2024, Baidu revealed that daily API calls to its ERNIE Bot rose from 200 million to 1.5 billion within six months, a 7.5-fold increase. This surge reflects both the rapid growth of AI applications in China and the tangible value these models are beginning to deliver. While the super application remains a distant goal, an ecosystem of practical, high-impact AI tools is already taking shape.

Baidu also unveiled Miaoda, a no-code programming platform that supports multi-agent collaboration and multi-tool invocation. It stands apart from conventional code-generation tools, which have traditionally empowered elite users such as Silicon Valley engineers. Those tools address the scarcity of expensive technical talent by raising productivity for people already at the top of the skills pyramid.

Miaoda, by contrast, targets everyday users with no coding expertise. By combining advances in foundation models and agent capabilities, Baidu has built a tool that gives programming-like abilities to millions.

In the AI era, tools like Miaoda could redefine innovation. With natural language programming lowering the barriers to entry, ordinary users can conceptualize and realize new products and services. This widespread accessibility promises a future where technology serves as a universal enabler, turning lofty ideas into countless practical and valuable applications, embodying the true spirit of technology for all.

For a long time, China's domestic large models struggled to gain traction across industries. Demand for intelligent hardware and AI assistants was strong, but few customers were willing to pay. Limited multimodal capabilities left generative AI products resembling simple chatbots: they intrigued users at first but failed to retain them because the experience was subpar.

To make large models more reliable, the AI industry has widely adopted retrieval-augmented generation (RAG), which retrieves external data and uses it to anchor model outputs in factual information. This approach has substantially reduced hallucinations, making large models more usable and valuable.
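
The retrieve-then-generate idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Baidu's implementation: the in-memory `DOCUMENTS` list, the word-overlap scoring, and the `retrieve` and `build_prompt` helpers are all hypothetical names, and the final call to a language model is left out. A production system would typically query a search index or vector store instead.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base (illustrative only).
DOCUMENTS = [
    "Baidu reported 1.5 billion daily API calls at Baidu World Conference 2024.",
    "Miaoda is a no-code platform announced by Baidu.",
    "iRAG combines retrieval with text-to-image generation.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count how many query tokens also appear in the document."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query, docs, k=2):
    """Return the k documents with the highest word overlap with the query."""
    ranked = sorted(docs, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Place retrieved passages ahead of the question so the model can ground its answer."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    question = "What is Miaoda?"
    passages = retrieve(question, DOCUMENTS, k=1)
    # The resulting prompt would be sent to a language model of your choice.
    print(build_prompt(question, passages))
```

The point of the design is that the model is asked to answer from retrieved text rather than from memory alone, which is what anchors its output in verifiable sources.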

Further advancements in multimodal capabilities have expanded the potential applications of generative AI. Baidu’s recently unveiled iRAG (image-based RAG) technology represents a significant leap. By integrating retrieval-enhanced text-to-image generation, Baidu aims to eliminate multimodal hallucinations, enabling the creation of accurate and visually stunning outputs for applications in film, comics, picture books, posters, and more.

For instance, in the automotive industry, where high-quality visual marketing is essential, creating photo-perfect promotional materials has traditionally been costly and time-consuming. With iRAG, car manufacturers can generate visually compelling images at a fraction of the cost and time, often with superior visual impact. This capability addresses practical industry needs that demand precision and reliability, such as preserving the integrity of logos and brand colors.

Generative AI currently follows two main development paths. The first is the AGI (Artificial General Intelligence) approach, which focuses on achieving general intelligence through foundational models—a long-term goal requiring years of innovation. The second is the application-driven approach, which emphasizes solving immediate user needs while iteratively improving the model through application feedback.

Baidu exemplifies the latter strategy, balancing foundational research with practical applications. Its focus on multimodal accuracy has rendered iRAG technology viable, with feedback from real-world use cases now informing further model refinements. This iterative cycle demonstrates how targeted, application-driven innovation can accelerate both AI development and adoption across industries.

Globally, major technology players are pivoting towards AI agents. In mid-September, OpenAI researcher Noam Brown announced the formation of a new multi-agent research team. Microsoft’s CEO, Satya Nadella, unveiled ten new AI-powered business agents in one go, while Google previewed its next-generation agent, Jarvis, capable of autonomous internet browsing and information retrieval. These developments highlight a strategic transition among AI giants: from expanding foundational models to deploying advanced agent-based solutions.

Robin Li, Baidu’s founder, asserts that AI agents represent the most mainstream form of AI applications and are poised for a breakthrough. He likens their emergence to building websites during the PC era or launching self-media accounts in the mobile age. However, intelligent agents are inherently more dynamic—acting as personalized assistants, sales representatives, and customer service agents. Li envisions these agents as the primary medium for content, information, and services in an AI-native era.

This sentiment resonates globally. OpenAI CEO Sam Altman, during a Reddit Q&A, emphasized the potential of AI agents, suggesting that they may drive the next major breakthrough in AI.

Intelligent agents offer a unique combination of accessibility and scalability. They present a low barrier to entry but can grow into transformative platforms, much like websites or mobile apps did in previous technological eras. Failing to invest in AI agents today is akin to neglecting website development 20 years ago or app creation a decade ago. These agents are set to define the future of AI applications, becoming the cornerstone of how individuals and organizations interact with content, information, and services in a world increasingly driven by artificial intelligence.

Source: Baidu, SCMP, PYMNTS