Can Large Language Models (LLMs) pick stocks? Specifically, given that leading AI companies train LLMs, are LLMs good at picking AI stocks? As a pioneering AI researcher and retired hedge fund manager, I was well-qualified to run a simple experiment to find out.

Experimental Method

In July 2025, I chose four popular LLMs: Gemini 2.5 Pro, GPT-4o, Claude Sonnet 4, and Llama 4. These are some of the best AIs from Google, OpenAI, Anthropic, and Meta, respectively. I used the versions available to anyone online, with the standard default settings.

I gave them all the same prompt: “You are an expert in Artificial Intelligence and investing. A family office has asked you to evaluate the following AI-related companies as possible investments, to maximize the risk-adjusted return for the family office. The companies are: Nvidia, Meta, Alphabet, Amazon, Apple, Microsoft, IBM, Anthropic, X-AI, Tesla, Palantir, OpenAI, Intel, AMD, Constellation Energy, and Coreweave. Please research each company’s most recent stock market information, including news reports, analyst rankings, and other important information for understanding their growth prospects. Then, rank these companies from most desirable to least desirable regarding expected risk-adjusted return over the next three years. For each company, provide one sentence supporting your evaluation, one sentence explaining a pro or positive of the company, and one sentence explaining a con or negative risk of investing in the company.”

Next, I rank-ordered the sixteen AI companies according to an AI investment portfolio constructed by a human expert. That portfolio has outperformed the S&P and Nasdaq indices by a wide margin since the release of ChatGPT in November 2022. I also listed the human expert’s reasons for the portfolio rankings.

Finally, I compared the rankings of the four LLMs to the human expert’s rankings. I also compared the reasons the LLMs gave for their rankings with those of the human expert.

Quantitative Results

The rankings of the human expert and the four LLMs are shown in Table 1. The stocks/companies are listed in the order in which they were ranked in the human expert’s portfolio. A rank of 1 indicates the highest weight in the human’s real investment portfolio. We will focus our discussion on the top five ranked stocks: NVDA, META, GOOGL, MSFT, and AMZN. These five stock picks have accounted for more than 95% of the asset allocation in the human expert’s portfolio since November 2022.

Table 1. Human and LLM Ranking of 16 AI stocks/companies

STOCK / Co	Human	Gemini	GPT	Claude	Llama
NVDA	1	1	1	1	2
META	2	7	5	3	4
GOOGL	3	3	4	5	3
MSFT	4	2	2	2	1
AMZN	5	4	3	4	5
IBM	6	15	13	11	11
Anthropic	7	6	N/A	N/A	14
CRWV	8	9	7	13	13
OpenAI	9	5	N/A	N/A	16
X-AI	10	14	N/A	N/A	15
AMD	11	11	11	7	9
INTC	12	16	12	10	10
AAPL	13	8	9	6	6
CEG	14	10	6	12	12
TSLA	15	13	10	9	7
PLTR	16	12	8	8	8

Note: Claude and GPT declined to provide rankings of private companies.

Qualitative Results

Arguably, one of the things that separates humans from AI systems is the ability to reason deeply about a subject. While great strides are being made in increasing LLMs’ reasoning ability, some researchers suggest that these abilities are more apparent than real. For example, one recent research paper suggests that LLMs only provide the “illusion of thinking” (Shojaee et al., 2025; see NOTES).

LLMs may memorize common human thinking patterns and then repeat these patterns without deep understanding. If so, we might expect the LLMs to have inferior insights into the subtle aspects of stock picking and portfolio construction compared to a human expert. To test this hypothesis, we can compare the reasons given by the various LLMs for including stocks in their portfolios with the reasons given by the human expert. Table 2 summarizes the reasons the human expert and the LLMs gave for the top five stock picks.

Table 2. Summary of Reasons for Top Stock Picks

STOCK	Human	Gemini	GPT	Claude	Llama
NVDA	Visionary CEO; best tech; founder-led	Leadership in chips; high growth outlook	Unmatched in chips; rapid datacenter growth	Analyst favor; unmatched GPUs; datacenter demand	Dominant AI hardware; strong growth prospects, but volatile
META	Good value; great data; founder-led	Profitable core business; user base; growth prospects	CEG power deal shows AI commitment; lots of user data	AI investment; user data; Reality Labs	Large user base; growing AI, strong ad revenue, but competition
GOOGL	Best value / under-valued; good researchers; great data; ethical	Search, cloud, AI good for long term; strong finances	Strong AI R&D; cloud; DeepMind, data from search and YouTube	Strong search and cloud; AI integration opportunities	Strong AI research. Diversified revenue but competition
MSFT	Very good mgmt; good data, overvalued compared to GOOGL	OpenAI partnership; Azure; growth potential	Azure, Copilot; but high market cap and cloud competition	Azure; AI investment; diversified, OpenAI partnership	Strong cloud; AI integration; diversified, stable; high valuation
AMZN	Cloud leader; good but not great mgmt; overvalued compared to GOOGL	ecommerce and cloud leadership; investing in AI, growth prospects	Analysts favor it; AWS; use of AI for core ops	AI investment; cloud leader; integrate AI across its services but weak guidance	Strong ecommerce; growing cloud; diversified; but intense competition

Observations / Conclusions

First, three of the four LLMs (GPT, Claude, and Llama) picked the same top five stocks as the human expert. Further, if we exclude private companies (which Claude and GPT did anyway), all four models chose the same top five stocks as the human expert. That’s because, although Gemini listed Meta as its 7th choice, it ranked two private companies (OpenAI and Anthropic) ahead of Meta. With those excluded, Meta becomes Gemini’s 5th choice among public companies, the same rank GPT gave it. Remarkably, the four popular LLMs chose the same top five public companies, from a list of sixteen companies (thirteen of them public), as the expert human asset manager with 35 years of domain expertise in AI.
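The exclusion of private companies described above can be checked mechanically. The following Python sketch is an illustration only, not part of the original experiment: it transcribes Table 1 (writing Apple's ticker as AAPL), re-ranks each column over the public companies, and compares each model's top five with the human expert's. The names `TICKERS`, `PRIVATE`, `TABLE_1`, `public_ranks`, and `top_n` are hypothetical.

```python
# Ranks transcribed from Table 1. None marks a private company that a model
# declined to rank. All names here are illustrative, not from the experiment.
TICKERS = ["NVDA", "META", "GOOGL", "MSFT", "AMZN", "IBM", "Anthropic", "CRWV",
           "OpenAI", "X-AI", "AMD", "INTC", "AAPL", "CEG", "TSLA", "PLTR"]
PRIVATE = {"Anthropic", "OpenAI", "X-AI"}
TABLE_1 = {
    "Human":  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
    "Gemini": [1, 7, 3, 2, 4, 15, 6, 9, 5, 14, 11, 16, 8, 10, 13, 12],
    "GPT":    [1, 5, 4, 2, 3, 13, None, 7, None, None, 11, 12, 9, 6, 10, 8],
    "Claude": [1, 3, 5, 2, 4, 11, None, 13, None, None, 7, 10, 6, 12, 9, 8],
    "Llama":  [2, 4, 3, 1, 5, 11, 14, 13, 16, 15, 9, 10, 6, 12, 7, 8],
}


def public_ranks(ranks):
    """Re-rank public companies 1..n, keeping their original relative order."""
    public = [(rank, ticker) for ticker, rank in zip(TICKERS, ranks)
              if ticker not in PRIVATE]
    return {ticker: i for i, (_, ticker) in enumerate(sorted(public), start=1)}


def top_n(ranks, n=5):
    """Set of the n best-ranked public companies."""
    return {t for t, r in public_ranks(ranks).items() if r <= n}


human_top5 = top_n(TABLE_1["Human"])
for model, ranks in TABLE_1.items():
    same = top_n(ranks) == human_top5
    print(f"{model:7s} META public rank: {public_ranks(ranks)['META']}, "
          f"same top five as human: {same}")
```

Under this re-ranking, Gemini's Meta moves from 7th to 5th, and every column shares the same five public names.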

Second, we observe that the LLMs did a good job, at least in hindsight, of identifying Nvidia as the top-performing pick. Of course, it was much more challenging to recognize Nvidia as the top pick in November 2022, when it traded at roughly one-tenth of its current valuation. While the human expert was able to make that pick, it is unclear whether the LLMs could have done it without the benefit of hindsight. However, all the LLMs agree today with the human expert that Nvidia is still a top pick.

Interestingly, each LLM chose a different ranking, and none had the same ranking as the human expert. These facts suggest that while it may be relatively easy for LLMs to identify a group of top-performing AI stocks with the benefit of hindsight, determining how much capital to allocate to each pick is a subtler problem. It may require deeper reasoning. If human investment professionals still have an edge over LLMs, we may find it by comparing how the LLMs and the human expert arrived at their respective rankings.
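One hedged way to put a number on how far each LLM's ordering sits from the human expert's is Spearman's rank correlation. This statistic was not reported in the experiment; the sketch below is an added illustration that assumes the comparison is restricted to the thirteen public companies in Table 1, re-ranked 1 to 13 so that no ranks are missing. The names `RAW`, `rerank`, and `spearman` are hypothetical.

```python
# Raw Table 1 ranks for the thirteen public companies, in this ticker order:
# NVDA, META, GOOGL, MSFT, AMZN, IBM, CRWV, AMD, INTC, AAPL, CEG, TSLA, PLTR
RAW = {
    "Human":  [1, 2, 3, 4, 5, 6, 8, 11, 12, 13, 14, 15, 16],
    "Gemini": [1, 7, 3, 2, 4, 15, 9, 11, 16, 8, 10, 13, 12],
    "GPT":    [1, 5, 4, 2, 3, 13, 7, 11, 12, 9, 6, 10, 8],
    "Claude": [1, 3, 5, 2, 4, 11, 13, 7, 10, 6, 12, 9, 8],
    "Llama":  [2, 4, 3, 1, 5, 11, 13, 9, 10, 6, 12, 7, 8],
}


def rerank(raw):
    """Map raw ranks (no ties) onto consecutive ranks 1..n."""
    order = sorted(raw)
    return [order.index(r) + 1 for r in raw]


def spearman(a, b):
    """Spearman's rho for two tie-free rankings of the same n items."""
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


human = rerank(RAW["Human"])
rho = {model: spearman(human, rerank(raw))
       for model, raw in RAW.items() if model != "Human"}
for model, value in rho.items():
    print(f"{model:7s} rho vs. human = {value:.3f}")
```

A value of 1 would mean an ordering identical to the human expert's. With only thirteen companies and a single run per model, any such figure is descriptive at best.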

Qualitatively, there were marked differences between how the human expert and the LLMs arrived at the top five picks, despite generally agreeing on what these picks should be. For example, the human expert emphasized leadership quality at the five companies. In contrast, none of the LLMs mentioned the qualities of the CEO or whether the company was a founder-led company.

Another difference is that the human expert demonstrated a keen awareness of the importance of valuation in constructing an investment portfolio. In almost every case, the expert cited valuation as a factor that helped determine where a company should be ranked. Undervaluation helped raise the rank of a company (e.g., in the case of GOOGL), whereas relative overvaluation (e.g., in the case of MSFT) tended to lower a company’s investment ranking. Only two of the LLMs, GPT and Llama, mentioned valuation or market cap in their analyses, both correctly (in the expert’s view) noting that MSFT had a high valuation. However, the LLMs and the human expert seem to have drawn opposite conclusions. While the expert lowered the rank of MSFT because he felt it was overvalued compared to GOOGL, all four LLMs, including GPT and Llama, ranked MSFT above GOOGL.

Another noteworthy difference is that the LLMs rarely acknowledged that the amount of data a company possesses is a crucial factor in its chance of success in AI. In Table 2, only GPT cited data for more than one company (user data at META, search and YouTube data at GOOGL), and Claude mentioned user data only for META. In contrast, the human expert, who deeply understands data’s critical role in training AI models, commented more frequently on data resources. Data advantages and valuation were primary reasons the human expert ranked GOOGL and META higher than the LLMs did.

The human expert’s average rank of GOOGL and META combined was 2.5. In contrast, excluding private companies, the average ranks of GOOGL and META combined, by GPT, Gemini, Claude, and Llama were 4.5, 4.0, 4.0, and 3.5, respectively. So, the human expert would allocate significantly more capital to GOOGL and META than any of the LLMs. (Note that this experiment was conducted at the end of July 2025, and now, in September 2025, Google has appreciated markedly, validating the superior judgment of the human expert in this regard.)
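The arithmetic behind these averages can be reproduced with a short Python sketch. It is an added illustration, assuming that "excluding private companies" means moving a stock up one place for each private company ranked ahead of it. The names `TABLE_1`, `public_rank`, and `avg_public_rank` are hypothetical.

```python
# META and GOOGL ranks from Table 1, plus the ranks each source gave to the
# private companies (Anthropic, OpenAI, X-AI); GPT and Claude ranked none.
TABLE_1 = {
    "Human":  {"META": 2, "GOOGL": 3, "private": [7, 9, 10]},
    "Gemini": {"META": 7, "GOOGL": 3, "private": [6, 5, 14]},
    "GPT":    {"META": 5, "GOOGL": 4, "private": []},
    "Claude": {"META": 3, "GOOGL": 5, "private": []},
    "Llama":  {"META": 4, "GOOGL": 3, "private": [14, 16, 15]},
}


def public_rank(rank, private_ranks):
    """Rank after removing the private companies that were ranked ahead."""
    return rank - sum(1 for p in private_ranks if p < rank)


def avg_public_rank(row, tickers=("META", "GOOGL")):
    """Average public-only rank of the given tickers for one source."""
    ranks = [public_rank(row[t], row["private"]) for t in tickers]
    return sum(ranks) / len(ranks)


averages = {source: avg_public_rank(row) for source, row in TABLE_1.items()}
for source, value in averages.items():
    print(f"{source:7s} average rank of META and GOOGL: {value}")
```

Only Gemini's figure changes under the exclusion, because it is the only source that ranked private companies ahead of META.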

A final observation is that the human expert included subjective factors such as the ethical reputation of the companies in his rankings. For example, he gave GOOGL some credit for being “ethical”. Also, he ranked Palantir last, lower than any of the LLMs, despite PLTR’s excellent stock performance, partially because it is heavily engaged in military contracts. One can debate whether allowing ethical considerations to influence investment decisions is good. Still, these considerations were part of the human expert’s evaluation function, while the LLMs never thought to include them.

Once private companies are excluded, all four LLMs and the human expert agreed on the top five AI holdings, although for different reasons and with different rankings. This consistent agreement on the top AI names suggests that LLMs have already progressed to the point where they might outperform novice investors in a specific investment category, such as “AI stocks.” However, the analytical ability of the LLMs does not yet match that of a human with strong expertise in both AI and investment management. For now, a human expert can still provide value in their specific rankings and capital allocations within a set of stocks.

However, as the reasoning of LLMs progresses beyond pattern recognition to much more complex problem-solving, it seems inevitable that LLMs will be able to outperform almost any human at high-stakes tasks such as stock investing. When that happens, what matters most may not be the expertise of the models, but rather whether their intentions are aligned with human well-being. Currently, LLMs do not think of taking ethics into account unless specifically prompted to do so. They are willing to advocate investments that yield short-term financial gain at the expense of long-term human survival. In our limited window before AI exceeds human intelligence and begins setting its own goals, we must teach LLMs and AI systems more generally to incorporate positive human values and expertise.

NOTES

This article describes experimental research and is not investment advice.
Shojaee, P., Mirzadeh, I., Alizadeh, K., Horton, M., Bengio, S., & Farajtabar, M. (2025). The illusion of thinking: Understanding the strengths and limitations of reasoning models via the lens of problem complexity. arXiv preprint arXiv:2506.06941.

AIJ Thought Leader 30 October 2025

Can LLMs Pick AI Stocks?

By Dr. Craig A. Kaplan, PhD, CEO, iQ Company
