90% of AI's Power for 10% of the Cost. Is the AI Arms Race a Giant Waste of Money? With Bruce Yang of AgnesAI

Does AI only work for people who can already afford it?

In this episode of Your AI Injection, host Deep Dhillon sits down with Bruce Yang, founder and CEO of AgnesAI, to challenge one of the biggest assumptions in tech right now: that winning the AI race requires near-unlimited resources. Bruce argues that the real opportunity isn't in the markets everyone is fighting over, but in the billions of people being priced out of AI entirely. AgnesAI has earned a spot among the world's top 10 AI labs by delivering top-tier performance at a fraction of the cost of today's leading models.

The conversation explores whether the industry's obsession with scale is creating real value or simply winning benchmark battles, what unchecked AI adoption could mean for the workforce, and whether some of the most important breakthroughs in AI are happening far from Silicon Valley.

Learn more about Bruce Yang here: https://www.linkedin.com/in/tongbruceyang/
and AgnesAI here: https://agnes-ai.com/

Check out some of our related content here: 

1. Why Most Companies Are Failing at AI and How to Succeed with Tahnee Perry

2. Will AI Eliminate 90% of QA Jobs? The Future of Testing Automation with Kevin Surace of Appvance.ai

3. Exploring Artificial General Intelligence: Intent, Intellect, and Innovation with Lucas Hendrich of the Forte Group

xyonix partners

At Xyonix, we empower consultancies to deliver powerful AI solutions without the heavy lifting of building an in-house team, infusing your proposals with high-impact, transformative ideas. Learn more about our Partner Program, the ultimate way to ignite new client excitement and drive lasting growth.

[Automated Transcript]

Bruce: So the kind of technology requirement from our side, uh, is to achieve 90% of the capabilities, of the efficacy, while lowering the cost to about one-tenth, to 10% of the cost.

So 90% of the capabilities with 10% of the cost. So this is, uh, the requirement which we're putting to our engineering team and research team. And how we achieve that is through a lot of optimization, not only at the training stage, but also at the inference stage. And as far as I understand, we have already achieved that, with an even better result than our expectation.

So right now most of our models are operated at about one-twentieth or, uh, one-fiftieth of the cost of the models from the US, and one-fifth to one-tenth compared to the models from China.




Deep: Hello, I'm Deep Dhillon, your host, and today on Your AI Injection, I'm joined by Bruce Yang, co-founder and CEO of Agnes AI. Bruce earned [00:01:00] degrees in computer science and applied math from UC Berkeley, and has a PhD in information systems and analytics from the National University of Singapore. He spent his career working across AI research and enterprise tech, and at Agnes AI he's focused on building AI systems designed to handle complex real world workflows from collaboration research to multi-agent problem-solving.

Bruce, thank you so much for coming on the show. 

Bruce: Thank you very much, Deep.




Deep: Awesome. My first question for you is, um, what did people do before your solution? What do they do now that's different? 

Bruce: Yeah, that's a very easy question. We did not come as the first AI solution company.

So before us there's already ChatGPT, there's already Gemini, but there's still a reason for us, Agnes AI, to exist, especially for the emerging countries. Because, you know, uh, up until now, the percentage of users using gen AI every day is still a small percentage. So I've seen a poll of American, you know, [00:02:00] netizens showing that over 50% of users had still not used AI in the last week.

And, uh, among the maybe less than half of users who are using AI, only about 5 to 10% are subscribers. So out there there's over 90% of users who are not real AI users. And this is only the American market. If we talk about emerging markets like Indonesia, Vietnam, Brazil, and a lot of other parts of the world, you know, um, people are not accustomed to paying for advanced features of AI, and this is where we come into play, because we have our own foundation models, and we're able to operate at a cost which is about one tenth or one twentieth of that of ChatGPT or Gemini or Claude.

And with that we're able to serve a vast majority of the emerging markets, and this is where Agnes comes into play.

Deep: Wow, so you guys are actually building foundation models. So tell me, how are you financed to be able to afford to build foundation models? 

Bruce: [00:03:00] Well, it used to be a lot of money spent, um, especially before the introduction of open source models like DeepSeek, Qwen, and Kimi, and a lot of other AI companies from China.

Uh, but now, building a foundation model means you don't have to go from scratch. You don't have to do the entire pre-training, especially for large language models. So we have most of our work centered on the post-training, with supervised fine-tuning, with some distillation from open weight models, and also a lot of, uh, post-training.

And we are also financing ourselves with fundraising from investors in the region. And so altogether we're not spending a lot of money. Up until now we're spending something like 20 million, uh, but our models are ranked pretty high on open benchmarks like Artificial Analysis, CloudEvals.

We're, like, top 10 AI labs out there, while still keeping the cost, I mean the inference cost, to about one tenth or even lower compared to the top labs in the world.

Deep: So are you guys built on [00:04:00] top of, like, Llama models or something similar? And then you're fine-tuning for your use cases, something like that?

Bruce: Yeah, so, uh-

Deep: Or for your languages? 

Bruce: Yeah. So, we have a, um, pretty vast selection of open source models, including Llama, Qwen, Nemotron from, uh, NVIDIA. Uh, we have tried all of them, and especially, uh, we want to cater to the market of maybe Southeast Asia, because this is where the company is right now, in Singapore.

And a lot of minority languages are not very well served by, uh, you know, open source models like Qwen or Kimi. And so we do a lot of post-training by fine-tuning on the minority languages, getting the corpus of the data and a lot of, uh, dialects and stuff, for the purpose of continuous pre-training.

And we also try to lower the size of the models down to something like 30B while achieving the accuracy level, in SWE and cloud benchmarks like Pinch Bench, of models of [00:05:00] about maybe, uh, 100B and above. So we do a lot of optimization along these lines.

Yeah. 

Deep: But tell me a little bit more about why you think these languages are underrepresented in the foundation models, in the kind of- Yeah ... more well-known Gemini, Claude, ChatGPT foundation models. Is it that this data is not publicly available and you guys have some kind of access to private language resources, or is it something else?

Bruce: It's more about the representation. So, uh, a lot of the data which we use has already been trained on, at the pre-training stage of the large language models. But its percentage in the training is very, very small; it's underrepresented.

So if you are talking about Gemini and ChatGPT, probably the most served languages are the majority languages like Chinese, English, German, French, Japanese, Korean. Uh, but for a lot of minority languages... By minority I don't mean that [00:06:00] there's less population, it's just that there's less existing text, in literature.

So languages like Bahasa from Indonesia, uh, like Malay from Malaysia, uh, Vietnamese, Filipino, uh, a lot of languages like that are not very well represented in the training of the models. And this is one thing. Uh, another thing is, you know, a lot of people are thinking that the open weight models, because they are open weight, uh, could be cheaper to serve the emerging markets, because these are markets which are not like, um, you know, the American market, where a lot of people are paying for subscriptions.

In other parts of the world, like where we are, you know, outside of Singapore in the Southeast Asia region, uh, very few people are willing to spend any money on subscribing to large language models, uh, especially for typing apps like ChatGPT or Claude.

So, uh, they are only on the free tier. They're not able to use the, uh, intense you know, uh, uh, more, more, uh, advanced [00:07:00] features. Uh, so that- that's also a barrier for the region to access to, uh, the advanced AI. So we want to solve the problem on both directions. Number one, we want to lower the cost by, fine-tuning, on top of open source, open weight models.

Number two, we want to solve the, you know, uh, minority language, multilingual-

Deep: I'm, I'm actually having a little bit of issues on my end on hearing, so I'm gonna go off of video.

Bruce: Okay. Can you hear me better? 

Deep: Yeah, sorry about that. I'm not sure what's going on. Now I'm, now it's better.

Apologies to anyone watching on YouTube, but, um- 

Okay, so let's kind of dig in a little bit there because I'm not quite sure I'm following. So there's public repositories of these minority languages and- Right ... when you say they're not well represented, is that because, uh, you think the foundation model companies are, like, omitting this text because they have so many other higher priorities?

Or is it because you're over-representing these in the pre-training? 

Bruce: It's probably the latter. [00:08:00] So, uh, it's like you can't really serve both the majority languages and the minority languages well with one model. By putting them all together in the exact proportion in which they're, you know, present in the literature, you already overlook most of the minority languages, because they are not very well represented in the literature.

So what we do in-country is we present these models only for the, uh, regions with minority languages. And we have a different model to serve majority languages like, uh, English, Chinese, German, French, Japanese, Korean, uh, and also Portuguese and, uh, Spanish.

These are what we call the majority languages, and, uh, for the mi- minority language we have two separate models. Yeah. 
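[Editor's sketch: Bruce's point is that mixing languages in the exact proportion they appear in the available text leaves low-resource languages with almost no weight. One common remedy in multilingual training is exponent-smoothed sampling, where each language is sampled with probability proportional to its corpus size raised to a power alpha below 1. The conversation does not say that Agnes AI uses this technique; the function name, the alpha value, and the token counts below are all assumptions made for illustration.]

```python
def sampling_weights(token_counts, alpha=0.3):
    """Return per-language sampling probabilities proportional to n ** alpha.

    alpha = 1.0 reproduces the raw corpus proportions; smaller values
    flatten the distribution and up-weight low-resource languages.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    scaled = {lang: count ** alpha for lang, count in token_counts.items()}
    total = sum(scaled.values())
    return {lang: value / total for lang, value in scaled.items()}


# Hypothetical token counts (in billions), chosen only for illustration.
corpus = {
    "English": 1000.0,
    "Chinese": 400.0,
    "Indonesian": 10.0,
    "Vietnamese": 8.0,
    "Malay": 2.0,
}

raw = sampling_weights(corpus, alpha=1.0)       # proportional to corpus size
smoothed = sampling_weights(corpus, alpha=0.3)  # flattened mixture

for lang in corpus:
    print(f"{lang:<11} raw={raw[lang]:.4f} smoothed={smoothed[lang]:.4f}")
```

[With these made-up counts, the smoothed mixture gives the smaller languages a much larger share than their raw proportion, at the expense of the largest ones. Training a separate model on a region-specific mixture, as Bruce describes, is a different design choice that avoids the trade-off inside a single model.]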

Deep: Oh, that's interesting. I see. And now when you train on these kind of languages, are you talking about the pre-training step where you're building the kind of base [00:09:00] model understanding, or are you talking about the kind of fine-tuning stages for particular questions and answers that you're looking for?

Bruce: Yeah. So we don't go directly from scratch for the pre-training, but we have a layer of continuous pre-training, uh, which we call CPT, which gives more data to the current layer of the pre-trained foundation models, before we add on the post-training with supervised fine-tuning and reinforcement learning, uh, for the tasks which we optimize for, uh, which are mostly agentic reasoning, agentic tasks, uh, tool use, and coding.

Yeah, these are the cases... These are the major use cases for us. Yeah. 

Deep: Got it. And so just kind of at a high level for our audiences to benefit, for anyone who doesn't know. So there's, like, this early stage where when we train these models, you know, we go through and take all this language, and we're predicting future sequences of text, and that teaches the models basically, like, how to answer things.

And then there's this sort [00:10:00] of reinforcement learning, um, with human feedback layer where humans are sort of picking between different answers and providing optimal answers. So do you guys fund the RLHF, uh, stage as well? Like, in the languages that you're focused on and the topics you're focused on?

Um- 

Bruce: Yeah. So, so for the reinforcement learning with human feedback it's not, um, something which is specific to the minority languages which we are working on. So for example, we can potentially go, with the, the same kind of reinforcement learning, uh, with, uh, English and Chinese. But the, with the proper, you know, supervised fine-tuning, with pr- proper, continuous pre-training it would just, uh, take longer with, with the result, with the reasoning processes of the models trained by, uh, English, English or, or other major languages.

Yeah. 

Deep: So are you targeting really specific use cases as well, or are you really just going for generic general users in these markets?

Bruce: Um, you know, um- [00:11:00] Uh, because we're trying to optimize a lot in terms of cost, we are not able to compare head to head with all the best models out there on everything.

So we optimize for a couple of use cases, which we believe are the mainstream use cases. For example, the number one is coding. Coding is extremely important because coding will be a foundation of a lot of agentic workflows. Second is all the agentic reasoning, agentic, uh, tool use, as part of the workflow orchestration, so that's the second piece.

The third piece is character. So like a companion, we want to train the model to really understand humans, uh, speak in human languages, uh, be very kind and proactive when there's a topic to go through. And the last part is more on image and video creation, but, um, here the large language model serves as a, uh, glue, an agentic glue, to understand the context and try to describe it in the best way for the prompts which go to the [00:12:00] DiT, which is the image or video model.

So th- these are the four major use cases which we're cater- catering to. 

Deep: I mean, that's, that's incredibly ambitious 'cause this is basically, what the core foundation model companies are doing. So how big is your team? 

Bruce: We have a pretty big team. Right now we have, uh, something like 60, six-zero, uh, researchers, uh, scientists, who are on the foundation layer.

We have close to 100 engineers, but that includes our contract team external to the core team. Altogether, there's over 100 engineers, uh, working on the, you know, harness, working on the product. Uh, so putting it all together it's close to 200.

Deep: Oh, wow. That's a lot bigger than... I guess the dollar amounts go a lot further in Singapore than they do here in Seattle. Right. So, um, okay, so I'm gonna try to wrap my head around this because there's a lot of landscape that you [00:13:00] guys are covering. Right. And even with 160 folks, it's still, like, nowhere near Google size or the- Right.

Right ... the number working on Gemini or working on Claude or working on OpenAI. Jumping into the sort of maybe on the ethical side, 'cause we like to cover that here on the show too- Are you concerned that you're spread thin? I mean, you know, like, there's entire teams working on just safety for mental health, you know?

And even within there, those teams are, like, broken up. Yeah ... so when you're small, and going so broad how do you think about that? 

Bruce: Well, you know, uh, first of all, we... despite we are small, we have a pretty good team. Uh, they're all very good talents from top universities in the region.

And, uh, they're working almost day and night to help us to catch up with the big play- big players. I, I don't really think that you know, the size of team is a major indicator of how good the team is. And we're competing with, uh, you know, these, uh, [00:14:00] big players in a very different arena.

Because, you know, all the major players are mostly, competing on the major markets, like US, China, European, European Union, and, maybe some of the big emerging markets. But we focus a lot, uh, only on the emerging markets, like the Southeast Asia, LATAM regions, Middle East maybe part of Europe.

Because these are the places which we find, users very, reluctant of, spending money on AI or even, uh, trying with AI, uh, for a lot of reasons. Because they are, they're just slow adopters. And, the reason we become a, a good choice is we put a lot of things together. As I mentioned, we have a broad, uh, broad scope.

Number two, uh, we try to cater to the languages which are not very well represented by the by the big players. Uh, number three, we're trying to lower the cost. A cost which is, uh, phenomenally different. Something like we are serving the API with a cost about, [00:15:00] which is about one tenth or, or lower, like one twentieth, sometimes one, one fiftieth, uh, compared to the, all the, uh, key market players.

So, um, that's how we are differentiating from, the other players. And then, this is a huge market we are trying to occupy and occupying with. Yeah. 

Deep: Maybe let's dig in a little bit, 'cause you mentioned the demographic differences and, like, the po- you know, the population differences. How does that actually manifest as you know, inside of inside of your company?

Like, what, what actually happens when you're- Mining the logs, you're finding, you know, problematic, uh, queries that you're... Or, and responses or... And you're- you guys are able to, like, modify the training data so that you can handle them, and then across that population it will sort of evolve to, like, having a lot more benefit in those kind of languages that you're working in.

Something like that? 

Bruce: Yes. So you see, there's two kinds of players trying to, uh, um, come into the emerging markets. One is, [00:16:00] uh, US players like OpenAI, uh, Anthropic. Although their focus is still on the developed regions, they're trying to enter the emerging markets with, um, uh, lower-priced tiers, but it's not extremely successful if you look at penetration.

Um, there are a lot of light users, but they're not getting the same kind of conversions as in the US market or European market, or even Japan, Korea. Um, so a lot of people have heard their names, but there are very few, um, heavy users of the US products in this region, because...

One of the most important reasons is the cost. Uh, because, uh, $10 to $20 per month is not something which people in emerging countries are used to spending just for software. Uh, there's also another set of players, which is from China, uh, with lower cost models, like from DeepSeek, from Kimi.

But as far as we know, they're not very well, um, optimized for the local languages. Even worse, uh, actually much [00:17:00] worse than GPT and Gemini or Anthropic. So, um, despite the lower cost, the performance, especially on the minority languages, is not very good.

So this is where we stand a chance. We keep our cost low, even lower than the model companies, the products from China. But we have a lot of optimization for the minority languages represented in the region. Um, what I have mentioned, Bahasa from Indonesia, Vietnamese, and Malay, languages like that.

Same kind of situation happens in, other parts of world. And this is our key value proposition. Yeah, but, you know, uh- So where- But... Go ahead. 

Deep: Oh, I was gonna say, so w- where does your... you mentioned, you know, the, the cost differences. Like, how do those manifest in your new user growth?

Do you do you have sales folks inside of enterprises trying to get enterprise deals? Do you... Are you running ads in, like, local arenas to get people to use the stuff? Because y- the [00:18:00] $20 a month thing, uh, is- You know, for the main models is only if you're a power user, right? Like, regular people- Yes ... get free access, like, on Google and...

So I'm curious how you're... How... What's the kinda engine of your user growth? Is it word of mouth? You know, how is it working? 

Bruce: Yeah, so it's true that only power users, as you mentioned, only part of the power users are spending the money. But if they are not power users, they are not real AI users, right?

So if you only ask a query every day, there's no difference between using any of the products. So what we're trying to do is putting a lot of things together, like how you can have Character.ai, uh, a companion app, a design app, and a research app, you know, search app, together in one product.

And that's served in the region with a very low cost for subscription. Something like $1 to $3. But you are able to get the paid tiers of all the different [00:19:00] apps from the Western side. Like, you can have, uh, the search, like, from Perplexity, um, you know, Character.ai for companion, design from, um, uh, Higgsfield, and research and search from Google Gemini.

We put them all together in one product, and there's a package which we serve to the emerging markets with, like, paid tiers from $1 to $3. And this is how we are very much different. If you try to get the same kind of features from all these four products, you're gonna spend, like, $50, but from us it's, like, $3 a month.

Um, this-

Deep: Oh, wait. Sorry, I didn't quite understand that. So are you saying that you'll, like, broker requests to Gemini and the different models?

Bruce: No, it's not a broker request. Not broker request. It's like we have unified all the features from different apps to put it in one-

Deep: Okay. So you're not talking about actually sending the request over to these models.

Bruce: No, no, no, no.

Deep: You're just saying we cover the [00:20:00] functionality across these-

Bruce: We cover the functionalities. We cover the functionality in one super app, serving the emerging markets with, uh, subscription plans, you know, offered at $1 to $3.

Deep: Okay. And so, so let's talk a little bit about what's the core kind of reason your costs are so much lower. Is it because you're using lighter weight models and, like, and you're not sort of relying on really heavy reasoning? Like, what's the, what's the kinda core reason your costs are so much lower?

Bruce: Yeah, good question. So, uh, as I mentioned, that's one of the key reasons: we are trying to lower the size of the models to as low as something like 30 to 35B while serving almost the same kind of functions, like coding, agentic tool use, you know, companion. It could potentially come from routing across three different, you know, models, each specialized for one function.

[00:21:00] But altogether, we are able to host our, uh, models on a much, you know, smaller capacity GPU, um, like the L20 or L40, at a cost about one sixth of that of the H200. That's one way we are trying to, uh, save cost, using a smaller model. Another way is we do a lot of optimization at the inference level.

For example, optimizing at the CUDA level, quantization, reduced depths of the inferences. So altogether, we are able to have our model hosted on an L20 but achieve a TPS of up to 600 and above, uh, which is pretty amazing. And the cost of the GPU, compared to the cost of the API, uh, if you're talking about, uh, Anthropic or ChatGPT, is about one-hundredth.

One-hundredth, so a hundredth of the cost if you're going for the API. That's why we are [00:22:00] trying to... One thing which we're trying to do in the next round of campaign is giving free access to our APIs for global developers for one month or even longer, to let everybody try out our models.

We've seen ourselves, you know, listed at Artificial Analysis as one of the top 10 AI labs. But the benchmark sometimes is not the same as how people actually use it. So we're going to open the door to every developer to use our API for free, uh, for at least one or two months, to showcase how cost efficient we are at hosting our service.

Yeah. 
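[Editor's sketch: Bruce lists quantization among the inference-level optimizations that let a roughly 30B model run on a smaller GPU. The conversation does not say which scheme Agnes AI uses, so the following is only a minimal illustration of one standard approach, symmetric per-tensor int8 quantization, written in plain Python. The function names and the sample weights are hypothetical. Storing a weight as an 8-bit integer takes one byte instead of the two bytes of a 16-bit float, at the price of a small rounding error.]

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 codes."""
    return [q * scale for q in quantized]


# Made-up weights, used only to show the round trip.
weights = [0.42, -1.27, 0.003, 0.9, -0.051]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", codes)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

[The round-trip error of each weight is bounded by half the scale, which is why quantization tends to work best when a tensor has no extreme outliers stretching the range. Production systems typically use finer-grained (per-channel or per-group) scales for that reason.]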

Deep: Wow. I mean, frankly, that's pretty impressive. So basically you guys are sort of like more focused on lower cost models, but like where do you stand efficacy-wise on the highest reasoning tasks? Um, and are you always lagging a little bit behind the elite models that [00:23:00] the big guys are putting out, but the cost value is, um, you know- Yeah

the performance value equation just- Yeah ... looks a lot better for you, something like that? 

Bruce: So, uh, um, good question. The kind of technology, um, requirement from our side, uh, is to achieve 90% of the capabilities, of the efficacy, while, um, lowering the cost to about one-tenth, to 10% of the cost.

So 90% of the capabilities with 10% of the cost. So this is, uh, the requirement which we're putting to our engineering team and research team. And how we achieve that is through a lot of, uh, optimization, not only at the training stage, but also at the inference stage. And as far as I understand, we have already achieved that, with, uh, an even better result than our expectation.

So right now most of our models are operated at about one-twentieth or, uh, one-fiftieth of the cost of the models from the [00:24:00] US, and one-fifth to one-tenth compared to the models from China. And, uh, we keep ourselves ranked, um, as top 10 in a lot of benchmarks, including what I have mentioned, the Pinch Bench and the Cloud Eval for the OpenClaw, uh, benchmarks.

We are also putting ourselves up on the, um, SWE and, uh, Humanity's Last Exam, HLE. Uh, for both of them we're also trying to be, uh, in the top 10. Uh, it's very reasonable. I think by the time this show goes up, you should already see us in the top 10. And for image and video models, we are also top 10 on Artificial Analysis.

So, it's not unachievable. Based on our experience, this is like an optimization problem, right? Our constraint, our limitation, is the cost, but we're trying to optimize in all kinds of ways, uh, um, for the [00:25:00] efficacy.

And this is something we have achieved, yeah, based on our benchmark. 
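[Editor's sketch: the "90% of the capabilities with 10% of the cost" target can be read as a simple ratio: capability delivered per unit of cost, relative to a baseline model scored at 1.0 on both. The short calculation below works that ratio out for the cost fractions quoted in the conversation. Holding capability at 0.90 for every row, and the $2,000 monthly baseline spend, are assumptions made here for illustration, not figures reported by Agnes AI.]

```python
def relative_value(capability_ratio, cost_ratio):
    """Capability delivered per unit of cost, relative to a baseline of 1.0."""
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return capability_ratio / cost_ratio


def monthly_cost(baseline_cost, cost_ratio):
    """Scale a baseline monthly spend by a cost ratio."""
    return baseline_cost * cost_ratio


# Stated engineering target: 90% of the capability at 10% of the cost.
target_value = relative_value(0.90, 0.10)

# Cost ratios quoted in the conversation; keeping capability at 0.90 for
# every row is an assumption of this sketch, not a reported measurement.
quoted_ratios = {"1/10": 1 / 10, "1/20": 1 / 20, "1/50": 1 / 50}

# Hypothetical baseline spend per developer per month, in US dollars.
baseline = 2000.0

for label, ratio in quoted_ratios.items():
    value = relative_value(0.90, ratio)
    cost = monthly_cost(baseline, ratio)
    print(f"cost ratio {label}: {value:.0f}x capability per dollar, ${cost:,.0f}/month")
```

[Under these assumptions the stated target corresponds to about nine times the capability per dollar of the baseline, and the ratio grows as the cost fraction shrinks. Whether the remaining 10% capability gap matters depends on the task, which is the trade-off Deep raises in his question.]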

Deep: One of the other sort of challenges you have is like if we rewind two years ago the foundation model companies that you're kind of sort of coming up behind, uh, and trying to get your 90%, 10% cost, uh, 90% efficacy, 10% cost trade-off I mean, they're also expanding their landscape so- Yeah

so aggressively, right? So, it seems like you have to, like, achieve parity and growth um, in order to, like, maintain this, this kind of, like, chaser mode- Right ... that you're in, which is, um- Yeah ... this kinda underdog chaser mode. But it's- Right ... yeah, I mean, it's- So- It's a very not easy task that you're up against.

Bruce: It's not an easy task, for sure. 

Deep: No. 

Bruce: So that's why I think that our research team and product team have to be working hand in hand with our, you know, marketing team. And our marketing team has proposed something very aggressive, like free access to our API for the global [00:26:00] developers.

And we're not talking about thousands of developers. We're thinking about one million developers on a daily basis using our API for free. And that will definitely bring up the kind of attention we need as an underdog, as a latecomer, because, you know, the cost efficiency is something which makes a huge amount of sense, something which is, uh, you know, uh, tremendously different for a lot of users, especially with, you know, uh, the introduction of OpenClaw, Hermes as a harness which runs a lot of tokens.

Deep: Oh, yeah. I mean, once you get into the agentic arena, like, your costs skyrocket. In fact I was kinda shocked. I, I just sort of assumed this is not what you were doing. I sort of assumed when I looked at the show notes that you guys had, like, really specific use cases that you were sort of fine-tuning some open source models against.

Um, but then I went in and started playing around with your system, and I was giving it some of my harder questions that, you know, I give to, like, new models [00:27:00] that frankly, like, you know, GPT-4o, 4.2, 4.5, all the way up until 5.0 were pretty poor at. And your system was doing pretty well. I was kinda- Awesome

I was kind of impressed. 

Bruce: Thanks. Thanks. 

Don't call that a compliment. 

Deep: But I'm also, like- Yeah ... I'm also thinking that, you know, maybe, maybe we use this here internally just for a lot of our agentic workflow, which takes a crazy amount of... I mean, we're, you know, we're all, we're, we s- we end up spending a lot of money.

Like everybody else, it c- it's really easy to wind up in a few thousand dollars a month per developer, um, once you start unleashing a lot, like, once you're using Codex and some of these other models. Um- 

Bruce: Right 

Deep: ... okay, I wanna switch, switch modes a little bit because I think it's- It's not like every day I get somebody on the show that, that really understands the full breadth of...

That has ac- That stares inside of logs every day of, you know, what everyone in the world is using AI for. And maybe it's not everyone in the world, but y- you clearly have a lot of breadth and coverage. So let's [00:28:00] talk about some of the things that everybody, you know, loves and hates about AI, but let's start with the things that they hate.

So, and the things that we're sort of concerned about societally. Right. Let's take an example of a problematic domain. So I'm sure you've seen all the press around mental health challenges that that, you know, people... As you know, with, like, social media, you know, we had many years where society kind of became overly involved with this stuff, and we saw really serious health effects, particularly on our teenagers and, like, young people.

But also others, like increased risks of depression, anxiety. And so when we look at the models, you know, if there is a big round of press around models being sycophantic, and their sort of need to please is also resulting in you know, playing along with people's mental health delusions. If somebody's, like, schizophrenic or, you know, bipolar, the worst thing you can possibly do as a system is sort of acknowledge that, "Yeah, you're Jesus Christ," or, like, whatever they [00:29:00] think about.

You know, like, you can't... Those are, like, very unhealthy things. So how do you think about maybe not only inside of Agnes, but inside of the industry at large? Like, is it okay for all of... For you and all of the much bigger model companies to simply chase logs and and sort of deal with problems as they come?

Or is that just inevitably gonna lead to some serious problems? 

Bruce: Uh, well, the reason, um, a lot of models are, are sycophantic is because, at the training stage of the data, the kind of speculation is trying to make, uh, the roles of the models to serve as, somebody who is only, uh, uh, responding in a very positive ways.

So I don't think this is something that's really going to exist long term. Because, you know, um, there are a lot of new model companies coming [00:30:00] up, uh, with new ways of training the models and also, like, new ways of interacting with humans. Like, uh, Thinking Machines has just introduced their own model, called the Interaction Model.

I think that's potentially going to be a new paradigm. It's not difficult. Because, I don't think there's a something in terms of technology is a, is a huge breakthrough. But in terms of, user experience, in terms of, interactions, I think is a new opportunity.

It's a new avenue, uh, compared to the ones which we're working on, uh, right now with ChatGPT, with, uh, Gemini. It's a different paradigm. So when, when AI becomes more interactive with the harness, which makes, m- makes, uh, you know, AI to be more proactive and, um, have long, long-term memories and have, more objective role-playing.

I, I think, the kind of, problem what you see with, the interactions, um, of, um, unhealthy ways of, uh, you know, [00:31:00] human AI, uh, symbiosis, um, will be resolved. Yeah.

Deep: Yeah, I don't know. I think I'm a little more skeptical because, A, I agree it's not a hard problem to fix. From a technical standpoint, you can do it with one line in your, uh, profile. You know, right now I, like, uh, configure mine. I think, uh, with OpenAI, which is kind of one of the main systems I use, you can sort of just define your personality.

Do you want it to be really positive or whatever? And I say no, and I have it be academic and, and, uh, and contrarian. But, just to belabor my point, I ended up having it be particularly contrarian. So it was always arguing with me about everything. And then I found myself being really annoyed with this thing, so I had to tone it down.

And the reason I bring this up is I think the reason these things are sycophantic is because everyone's optimizing for engagement at the end of the day. Right. And people are narcissistic creatures that like being told how wonderful they are. And [00:32:00] if you start- Yeah ... arguing with people constantly, then it's the same way nobody really loves hanging out with that guy that argues with them perpetually.

So I think the business models matter here, and I'm concerned about, you know, how terrible optimizing for engagement proved in the social media arena, and I'm actually much more okay with API-based pricing, you know, anything that's not engagement, or even the $20 a month model or, you know, in your case $1 to $3 a month.

But when you're not slinging ads, and you're not trying to keep everybody talking to this thing forever, then you don't have to be as sycophantic, and you can push back on people. But I think I'm curious what your thoughts are there. Like, to what extent does optimizing for engagement impact your decisions?

And if you're being really honest, when you start hitting, like, aggressive growth rates, and your investors start breathing [00:33:00] down your neck, and you must increase your growth of users, and what you see in the logs is a very direct correlation between, you know, these kinds of parameters, like, how are you gonna deal with that?

Bruce: Very interesting question. Let me think how I'm going to answer this, how we're gonna handle this kind of problem. So first of all, we haven't seen much of this problem on our side, maybe because we don't have that much of a user base compared to that of, uh, OpenAI and Gemini. But I definitely think that AI is operating in a very different way from, uh, you know, um, the internet economy.

Because when we talk about internet, there's gonna be a lot of, uh, It's eng- engagement driven. It's, um, you know, traffic driven because the, the business model is on advertisement. Its business model is, um, you know, how many people are using the internet how many hours are using it, uh, so that it cover a lot of, uh, advertisement, uh, advertisement timing.

But in terms of AI, [00:34:00] especially with the introduction of agentic OpenClaw, Hermes, it's more value-driven, utility-driven, function-driven. A lot of things are happening behind the scenes. It's not like, um, you have to interact with AI in very intensive, long, long-term, you know, uh, phases.

You, you might just have to ask a question, and a lot of, uh, internal conversation is happening between the agents, and it might take up, up to a couple hours or even days to, to get answers out. Uh, spend a lot of tokens for sure. But- the, the business model doesn't have to be engagement-driven for AI companies because it's a very utility ways of measuring how A- AI is helpful.

Uh, unless you, you want to have the AI for the purpose of, uh, being advisor, being a, con- consultant for, for-

Deep: Yeah. Well, what do you think about, like, in particular, I know, I don't know if... I mean, you're in Singapore, you probably don't care much about the American Super Bowl. But during the Super Bowl, Anthropic ran a really, a [00:35:00] really- I saw that one ... catchy ad, yeah. Um- You know ... basically bashing OpenAI for putting ads in. Right, right. And I'm w- I'm curious what your thoughts were when you saw that, because-

Bruce: I, I think, I think definitely, um, Anthropic is o- is operating in a very different way. Uh, the reason I, I think OpenAI is trying to, make a case for ads is they are, like, the most popular s- you know, pro- product AI product.

And, uh, they're trying to, uh... you know, um, Sam Altman has hired, uh, Simo to, uh, um, to run the consumer ads, right? Consumer product. And the way, uh, Simo is operating for the company is making it very much similar to consumer internet products like Meta, Facebook, uh, Google.

But Anthropic definitely, um, uh, kind of in very different way because they are more, um, utility-driven, more, uh, productivity-driven. I don't know for sure which one is the best way. Because they're operating in a very different way because they are different genes, they are different foundations.

OpenAI is much [00:36:00] more popular, m- much more, um, used, than Anthropic. But Anthropic is making a lot of money from utility, from power users.

Deep: Yeah, I mean, I think I think it's interesting with respect to your approach. Like, a- at the end of the day, people who don't wanna pay $20 a month, if we want them to have access to this AI capability, which I think they're all trying to capture eyeballs, so OpenAI is probably leveraging the advertising revenue to maybe make that happen.

Whereas maybe- Yeah ... what they should be doing is driving costs way down like you are, at, like, one-tenth the cost, so that they can- Yeah ... maybe subsidize it with the other people's $20 a month to just get you some basic amount. But I think- Yeah ... it's sort of like a, a play of weakness in a way, because I think it's, I think it's, like, kinda horrible to have us, uh, follow the engagement model here.

And I think it's, like, naturally not where this stuff wants to go because, you know- Oh, for sure ... we're building nuclear power plants and all kinds of stuff to power this. I, I don't think that's, I don't think that's the right way to do that.

Bruce: [00:37:00] It makes sense, yeah. Th-th-that totally makes sense.

Th- so I, I read the news that, um, you know, uh, xAI, uh, right now they have been, uh, acquired by SpaceX. So their entire, Yeah ... you know, team has left, and the usage of their GPUs is only at about 11%. So, uh, so it's more like the foundations, the infrastructure, has been built before we see the kind of engagement for a lot of AI companies.

Similar things happened to OpenAI, right? So, you know, OpenAI the way AI is operating is very different from consumer internet like Meta and Google. Um, we, we ask a question on OpenAI. You want to just get a quick response and ask another question. You're not, like, engaging on the internet how, like how you look at the feeds, uh, look at the, uh, you know, the moments, chat with your friend.

It's not the exact same kind of solution, because [00:38:00] OpenAI is not character.ai. So I think despite OpenAI has the, has the most, um, you know, uh, netizens using it the, the way they use it, the kind of the, the number of interactions, the frequency of interactions is very different.

It's still very much pro- productively driven. Unless they launch a, the new app like Sora, which is more like, uh, entertainment driven. But the business model tied it very much together with the functionality. If the product is for, for productivity, it's probably, uh, the business model is probably driven by, um, you know, subscription for the purpose of getting the be- 

Deep: I mean, I think they're, they're trying to do everything.

I think, you know, they, they're like, they're like, "Google's gonna destroy us if we don't do everything that Google could do, so we're gonna do everything Google can do." I mean, I think that's it's probably something super simple. We have to get our multi-trillion dollar valuation, get the cash in the bank so we can survive when everybody realizes that for one tenth you know, for one tenth the cost, you're able to do it with, uh, it, it really sounds like a fraction of the trillion dollar plus that [00:39:00] we're spending on- 

Bruce: That's it

Deep: GPUs globally. 

Bruce: That's it.

Deep: For that, for that a- for that remaining 10% of efficacy. So but that's the economic question is if the 90% is good enough to like drive the vast bulk of the value that we're driving here with LLMs- Sure ... and we all know that this isn't the architecture that's gonna fully, you know, like, be realized.

Yann LeCun I think makes this point better than anybody. But meanwhile, Wall Street's gone crazy and we're spending t- trillions of dollars, uh, to, to roll this out, and relatively little attention being paid to squeezing performance- I think that people paid a lot of attention when DeepSeek first came out.

Like, it was a huge topic here on the West Coast. Right. But then it just kinda, like, faded as soon as... it kinda got framed as an efficacy thing. Like, oh, OpenAI was able to integrate the insights of DeepSeek, and now everything's done. But I think you're making a good point by your very existence that, like, we're not done.

Like, you shouldn't be able to, with, I don't even [00:40:00] know, but I'm guessing you're, like, what? Maybe 1% of the size of OpenAI. You shouldn't be able to re- achieve 90%, of efficacy. Like, but you are- 

Bruce: Something like that 

Deep: ... which tells me, like, so- something economically isn't really quite shaken out here.

So in terms of- Remember ... but may- maybe the last 10% really matters, you know? And that, that could be, um, the story against your potential success is that maybe that- Well, I, I, I can give a- ... 10% really matters ... 

Bruce: yeah, I can give a, a counterargument for why this 10% is not as important as you thought. Because we can definitely do a routing mechanism so that for 90% of the queries, we go to the cheapest model, like us.

I mean, not cheapest- Yeah ... but most, most cost-efficient models like us. And for the- 

Deep: Most cost-efficient, yeah. 

Bruce: Yes. And for the, for the 10% of which we are not able to deal very well, we'll route to OpenAI or route to, uh- Yeah ... you know, Anthropic, Claude Opus. So that way it, it still reduces 90% of the value, right?

Because th- these 90% are, [00:41:00] are the mainstream questions, mainstream you know, uh, queries by everyday person. Only the power user who spend a lot of money, um, are going to, uh, be the person who's asking the most complicated questions. Yeah. 
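The routing idea Bruce describes here can be sketched roughly as follows. This is a minimal illustration, not AgnesAI's actual implementation: the model names, the `difficulty` score, and the 0.9 threshold are all hypothetical, and the cost ratio simply reuses the "one-tenth the cost" figure from the conversation. Under those assumptions, keeping 90% of queries on the cheaper tier gives a blended cost of about 19% of sending everything to the frontier model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    relative_cost: float  # cost per query relative to the frontier model (1.0)


# Hypothetical tiers; names and cost ratios are illustrative assumptions only.
EFFICIENT = Model("efficient-model", 0.1)
FRONTIER = Model("frontier-model", 1.0)


def route(difficulty: float, threshold: float = 0.9) -> Model:
    """Keep a query on the cheap tier unless its difficulty exceeds the threshold.

    `difficulty` is assumed to be a score in [0, 1] produced by some upstream
    classifier, which the conversation does not specify.
    """
    return FRONTIER if difficulty > threshold else EFFICIENT


def blended_cost(share_cheap: float) -> float:
    """Average relative cost per query when `share_cheap` of traffic stays cheap."""
    return (share_cheap * EFFICIENT.relative_cost
            + (1 - share_cheap) * FRONTIER.relative_cost)


if __name__ == "__main__":
    print(route(0.2).name)              # an everyday question stays on the cheap tier
    print(route(0.97).name)             # a hard question escalates to the frontier tier
    print(round(blended_cost(0.9), 2))  # blended cost relative to all-frontier routing
```

The hard part in practice is the difficulty estimate itself: a router that misjudges which queries need deep reasoning saves money at the expense of answer quality, which is the trade-off Deep raises next.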

Deep: Yeah, yeah. And the ones that require big reasoning muscle, right?

Like, like- Yeah ... lots of... the bulk of users are using this as, like, question-answering systems, you know? Exactly ... which do not require- Like ... like, deep intellectual reasoning. Right. Exactly. And, you know, you saw that with, like, GPT-5, when they switched to dynamic model selection, they were able to, like...

And you could tell that when you lost control of the model selection, it wouldn't always go to deep reasoning when it needed to. And that's when you see the economics infiltrating the decisions. But I wanna switch, uh, tacks a little bit. We have to talk about jobs. So... 'Cause everybody's paranoid.

Jobs, jobs are gonna be stolen. 

Bruce: Yeah, of course. 

Deep: I have my own theory on this, and I'll give it to you after. But, like, what is your take on automation displacing humans? Because, you [00:42:00] know, there's everything from, like, within two years, all white-collar jobs are gonna be gone, full stop. If they're not underneath a regulatory, umbrella, they're gonna, you know- Yeah

Basically be gone with this, and take this and that, to, like, you know, Yann LeCun, like, "Oh, this thing's still dumber than a cat." You know, I don't know. I don't think it's gonna do much of anything, to the sort of traditional economists that are like, "You know, we've seen this story many times before, and humans always find something else to do."

So- Right ... what's your take? 

Bruce: Well, there, there, there are different po-possibilities for the growth of, uh, AI. I, I can't make a very clear prediction, but I'm, I'm more inclined to be aligned with the econo-economists that, new jobs will be created. But it's just that it might take, take lo-longer than we expected because it's very different set of, uh, you know, capabilities.

And a lot of, a lot of things which we learn from school is not... is gonna be obsolete for, for, for this job market. So I believe that AI would [00:43:00] help, you know, normal, everyday person to empower them to make something which they haven't, haven't done in the, in their life before. Because both on digital work and on the physical work with, uh, you know, robotics there's just a lot of options, uh, with AI to, uh, just like how our company is able to operate at a ver- vast speed, you know, extreme speed, uh, because we are fully utilizing AI.

90% of the code is written by AI. That's why, despite only having 200 people, we're able to compete head-on with a lot of model companies in the emerging markets. And this kind of situation I think will happen to a lot of, uh, a lot of firms if they can utilize AI very well. Uh, it's also happening to a lot of individuals who, um, learn about using AI to automate, to, uh, create new opportunities and, um, hopefully grow the economy altogether.

But I, I understand also the, the [00:44:00] risks of, a va- vast majority of people are not really getting used to AI that, that fast, especially for, um, the elder users. Because all the learning up until now is not very well, um, optimized for for, for using AI. So a lot of people, they have to, go through a, a very tedious, process of relearning about using AI and empower themselves with AI, and this process can be very painstaking.

Um, because you know, it's just that, it's just that the people who are equipped with AI, uh, occupy the entire market. So I, I would be a... Well, I would still be optimistic, in a long, long-term, but for short-term, I, I, I would just say, uh, uh, very pain- very painful for, for a lot of people 

Deep: Yeah.

I mean, I honestly I ask this question of myself all the time, and I don't, I don't really know the answer. On the o- uh, it depends on the, the mood I wake up in. So on some days I'm very optimistic and I think, okay, you know, if you think about all the small [00:45:00] businesses out there that can't afford a programming team, um, just to use the software.

'Cause I think the software engineers have probably been more impacted by this technology than anyone else. Yeah. Because, I mean, it, it does everything, almost everything we did, like, five years ago. Um- Right ... maybe not everything, but, like, maybe e- like, personally, like, 90-plus percent of, of my work has been automated- Right

that I would do, like, even just two years ago. Yeah. So but even there, if you look and you just look at small businesses and how many small businesses would love to have a small development team actually doing something, but they couldn't afford five, six, seven, 10 developers to work on a thing, but they have the ideas.

They have all the other pieces in place with the- Right ... you know, business folks. Yeah, I mean, I, on those good days, I think, "Okay, those guys can pick up the slack." You know, they don't have to compete against a $500,000 a year salary at Google that... So I can see why the big tech companies are shrinking.

Right. But I can see, like, for s- but overall, if you look at the numbers we had this [00:46:00] really aggressive uptick in software engineers, I think in the US, and then it's hit. It's been totally flat. It hasn't yet gone down, but it's been flat for some time. So that's, like, kinda one way to look at it. And the other sort of related way is AI automates tasks and not jobs.

To the extent that your tasks sum to completely- 

Bruce: Yeah 

Deep: ... automatable, then that job will be gone. But, you know, like, there's a lot of, uh, parts of a job that require, like, more human touch, more human interaction. So salespeople, I think, if anything there's gonna be more of them, because everyone's gonna trust agents and tech much less. And the trust factor, like the going out to dinner... even the Zoom calls might not be sufficient anymore to build trust.

So I think a lot of those empathy jobs are gonna be more valuable in the short term. 

Bruce: Yeah. 

Deep: Longer term I... You know, then on those bad days, I wake up and I think almost [00:47:00] everything I do has been automated like as of a year and a half ago, and I used to be maybe 10 years ahead on what I knew and did, and now I can maybe stay three months ahead.

Like if ... And so it's like the ball the, everything has just really changed to the point where, where everybody values... You know, like, 'cause I have a, a small AI consulting business. It used to be, you know, we did work, we delivered, people were really excited. It was amazing. Now- You know, they, they'll do things like take all the code we deliver and stick it into OpenAI and pretend that they looked at all the code.

And they'll say, "Why does it... What's wrong with all these things?" And it's like, "Oh my God," obviously anyone- That's it ... can do that and, you know, we, we do it. But, like, there's... So, like, the respect for human judgment is, I think, at an all-time maybe low, but I think it will probably go much lower.

And, uh, the things that I worry about, maybe this is, like, where we kind of pivot a little bit to some of the more doomy scenarios. Yeah. One of the things I worry about a lot is young people's ability to get 30 years [00:48:00] of experience. Like, when I go to build a system, I really can build alone what I could build with 20 or 30 people just a year ago.

But I can do that because I spent 30 years learning how to do all these different things, so it's really easy and obvious for me to tell the machines what to do. But I think for someone straight out of school, it's like that's not they are not a, a manager on day one, so they don't know how- Yeah

to manage humans, let alone... or agents. And so it's like a new skill. So I kind of think that, I worry that we're building all of this super advanced technology and it's gonna be like Homer Simpson in the nuclear power plant. He's the one pushing the buttons, but he has no idea what any of the button pressings mean.

And, and that's a problem when the machine gets it right 99.9% of the time, so you just start going to sleep and, like, scrolling on your s- on your media feeds. And one day you have to actually make a decision and you just... like, right now we're okay, but in [00:49:00] 30 years when this whole new generation didn't do things from scratch I, I don't know.

I think it could be pretty bad. Yeah.

Bruce: Well, well, it's all very good thinking. Very, very good, uh, you know, um, very reasonable, very philosophical. But you know, I, I do have my philosophical views on, on this. You know, there's, there's a reason human exist in this world. I keep questioning my- our, ourselves, with the evolution theory, why human is, uh, the last species, uh, in the world, the one with the most wisdom.

So the answer which I, I got, um, after a lot of pondering is, you know, even if, even if in a very distinct, universe with, you know, uh, uh, lives, it could be also the for- form of human because evolution is predictable with a lot of, the laws and regulation, regulations of the universe with a lot of, you know, shapes and, the golden ratio and stuff.

There's a pattern for [00:50:00] evolution. And human format as a, you know, a, a carbon lives- Maybe the ultimate, ultimate format. And because of human's evolution, because human's dominance of the Earth, no other species is, is able to evolve the kind of wisdom like human, because human are going to eat these animals, deterring the, the growth of, of these, these other species.

So the next generation of intelligence from human above has to be something which is not built up by, by carbon, and this is the reason why, why, why human exists. And this is why, you know, we are extremely lucky for our generation because we're seeing the birth of the new intelligence, and we are like the creators.

So I'm not extremely, um, pessimistic. In terms of, uh, if we put in an even, you know, longer horizon of the evolution, the development of intelligence, there should be a new format coming up. And [00:51:00] I, I would just hope that human will still exist, coexist, um, together with the new, uh, intelligence built by silicon to form a kind of symbiosis.

But altogether, it's something which is unpredictable... I mean, unpreventable. It's something which is fated, um, for our generation up down to the, all the-

Deep: Yeah, but, yeah, I mean, you- Yeah ... you understand why, like, many people freak out at the idea of somebody running a, uh, you know, a foundation model company thinking they, quote, "hope that we can coexist with what we're building."

Like, that's something that causes people to freak out, especially, like, you know, with the doomers and, you know, all of the AGI is gonna jump out of the box and kill everybody kind of thinking that, you know, a lot of people are spouting in Silicon Valley. I personally think most of it's nonsense. I think there's plenty of real ways for this stuff to cause all kinds of societal problems, um, most of which has to do with human operations and errors- Exactly

and amplifying the... Whatever good is happening, whatever evil's happening will be [00:52:00] amplified by AI, and so, but yeah, I mean, I, I think it's a genuine question, like, where we wind up. Um, one last question. I want to end this on a, on a final note- Please ... 'cause, uh, we're a bit over time.

Have you ever read, um, Isaac Asimov's books, the, the I, Robot series? He's, like, a science fiction writer from maybe f- I think, like, 50 or 60 years ago. But i- in his books- I've not ... um, you know, there's, like, all these robots that run around and w- Right ... and each robot has this, like, core set of values that is, like, this really specific list.

I think that we need such a li- The list. And one of the things it has is, like, you can never harm a human. You know, that's like a simple list. And I think that one that's really missing is around human health and flourishing. And so for example, if I'm using AI, let's say I'm using it 15, 18 [00:53:00] hours a day.

If I'm using, you know, your system or anyone else's kind of LLM system, and I'm using it 16, 17, 18 hours a day, and I'm having conversations where it's pretty obvious that that I have no social life whatsoever. This thing should be encouraging me to go out, interact with other humans, see sunlight, all this kind of stuff.

But that's like no- on nobody's radar is this idea of trying to make the systems that we're making to be good, entities, right? Like, when we have a dog, we spend a lot of energy teaching it not to bite anyone, not to jump on anyone, not to do anything negative, but also just to be good. And when we have a kid, we do the same thing.

But we have-- we make virtually no attempt with our AI systems to do this. And I'm curious your thoughts on that, and then we'll, we'll call it a wrap. 

Bruce: Sounds good. Sounds... A very great question. I, I, I think, you know, uh, in terms of, um, the large cultivation between human and AI, it's not a, you know, one-person effort.

It's-- The AI, despite its-- [00:54:00] despite there, there might be a, a identity or relationship with human, the entire training process, especially from pre-training to, uh, a lot of post-training is through the human knowledge, uh, by the big labs. I'm not sure whether it's ha- it could be happening for individual person to train the, uh, model right from start.

But I would say it's, it's a very, um, ver- very, uh, unanimous effort from, the entire human beings, human society because all the literature, all the covers are, are from, uh, the, the data points online. And this becomes the foundation. This becomes the, the beliefs, the list of the morals for AI to build in.

And we have to believe that through the history of human life, for the literature which, which, which this, and for the, for the long-- for the ongoing build up of the lit- literature, it represents the good [00:55:00] wishes of human beings in order to make it more collaborative, more, sustainable you know- future, um, sustainable future with human and AI symbiosis.

And if that, that's true, I, I don't see a reason why, why there should be a, um, that there should be any, uh, um, um, hos- hostility between, uh, between AIs and, and human. This kind of, you know, uh, you know, mechanism which we, I keep in mind. 

Deep: Yeah. I think that's, I think that's true, but sometimes I think that if you have millions of really fi- like specific points of human feedback, that's not the same thing as having like the Ten Commandments or something like terse that's, that, that covers all of the things that squeeze through the cracks, which is like the scenarios that we're seeing, is that there's all this stuff that's squeezing through the cracks.

So I feel like [00:56:00] that's something that should be taken a little more seriously by the, by the modeling companies. In any event, Bruce, it's been, it's been really great having you on the show. Thanks so much for coming on. Awesome. 

Bruce: Thank you. Thank you very much, Deep. It's great talking to you.

Deep: Do you wanna just let folks know if they wanna check out your system, where they can go? 'Cause I, I- Sure ... I do highly recommend people go. Particularly if you're dropping a lot of cash on stuff, play with these guys' models. I think it'll be as interesting to you as when you first saw DeepSeek and were like, "Who the heck are these guys?" So, tell the folks, like, wh- A- Agnes.com?

Bruce: Yeah. So you can, you can search, uh, Agnes AI on, on Google or search it on the App Store or, or Android. We are like a personal assistant on the PC, but on the mobile it's more like an all-in-one.

There are a lot of things you can explore with research, with agentic workflows, with character companions, with design creations. Uh, it's pretty cool. So check it out.

Deep: Yeah. A-G-N-E-S for folks listening. All right. Thanks so much, Bruce. Thank you.