When you talk to most people who have been casually following AI, the first thing that comes to mind is, unsurprisingly, ChatGPT. OpenAI has done a masterful job of marketing its flagship product, which at one point was the fastest-growing consumer product in history.
But now that ChatGPT has been on the market for more than a year, there are not only plenty of alternative options, but also a desire in the business community to pause and understand the technology behind large language models (LLMs).
This article is intended to provide a basic overview of what LLMs are, some potential use cases, and what your options are, since OpenAI is far from the only vendor of LLMs on the market.
What are LLMs?
Think of LLMs as sophisticated language machines. Trained on massive datasets of text and code, they develop the ability to understand and generate human-quality language. While traditional chatbots excel at scripted interactions, LLMs go further, grasping context, nuance, and complex sentence structure. They can write many kinds of creative content, translate languages, analyze sentiment, and answer questions in an informative way.
What is the technology behind LLMs?
Architecture: LLMs are typically based on the Transformer architecture, introduced in the paper “Attention is All You Need” by Vaswani et al. in 2017. Without getting too into the weeds, transformers use self-attention mechanisms to weigh the importance of different words within a sentence, enabling a deeper understanding of context.
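For readers who want a feel for what self-attention actually computes, here is a minimal sketch of scaled dot-product attention in NumPy. This is an illustration only, not any vendor's implementation: the projection matrices below are random stand-ins for the learned weights a real transformer would train.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project tokens to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # how strongly each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V                          # each output is a weighted mix of all value vectors

rng = np.random.default_rng(0)
seq_len, d = 4, 8                               # toy sizes: 4 tokens, 8-dimensional embeddings
X = rng.standard_normal((seq_len, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)             # output has the same shape as the input: (4, 8)
```

The key point is the softmax-weighted mixing step: every token's output vector blends information from every other token in the sequence, which is what lets the model weigh context.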
Pre-training: LLMs undergo an extensive pre-training phase on vast datasets of text. During pre-training, the model learns to predict the next word in a sentence given the previous words, among other tasks designed to improve its grasp of language syntax, semantics, and context. This phase requires massive amounts of computational resources and time, which is why you hear about companies like NVIDIA thriving in this environment.
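The next-word objective can be illustrated with a toy model. A real LLM learns billions of neural-network parameters, but this simple bigram count model, shown purely for intuition, captures the core idea of predicting the next word from what came before:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Pre-training": count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` during training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("sat"))  # -> "on" (it followed "sat" twice in the corpus)
```

Scale that counting idea up to trillions of words, replace the count table with a transformer, and you have the essence of why pre-training is so compute-hungry.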
Fine-tuning: After pre-training, LLMs can be fine-tuned on smaller, domain-specific datasets. This process adapts the model to specific tasks, such as question answering, sentiment analysis, or document summarization, enhancing its performance by tailoring its responses to the nuances of the target domain. You can fine-tune existing LLMs, such as OpenAI's GPT-3.5, with your own data.
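Conceptually, fine-tuning is just continued training on a smaller, domain-specific dataset. The toy bigram counter below is not how GPT-3.5 fine-tuning works mechanically (real fine-tuning adjusts network weights via gradient descent), but it shows how a small domain dataset shifts a pre-trained model's predictions:

```python
from collections import Counter, defaultdict

def train(counts, corpus):
    """Update bigram counts in place from a whitespace-tokenized corpus."""
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    return counts[word].most_common(1)[0][0]

# "Pre-training" on broad, general text ...
counts = train(defaultdict(Counter),
               "the bank of the river the bank of the river the bank of the river")
print(predict_next(counts, "bank"))  # -> "of"

# ... then "fine-tuning" on finance-domain text shifts the model's behavior.
train(counts, "the bank approved the loan the bank approved the loan "
              "the bank approved the loan the bank approved")
print(predict_next(counts, "bank"))  # -> "approved"
```

The general-text counts are still there; the domain data simply outweighs them for the patterns that matter in the new context, which is the intuition behind adapting an existing model instead of training one from scratch.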
The time and cost of building your own LLM would, in all likelihood, be prohibitive. Models like Anthropic's Claude, Google's Gemini, and OpenAI's GPT-4 are estimated to be trained on trillions of words. So the best option for most of the world is to build products on top of existing LLMs rather than create your own (unless you are sitting on vast amounts of proprietary data).
How can you use LLMs?
I am separately writing up potential use cases for LLMs as part of an ongoing series in this column. But some of the most common uses are:
- Code generation
- Writing marketing copy
- Customer service
- Translation
The list goes on, and the pace of innovation in the space is mind-boggling. As an example, OpenAI just released a product called Sora in the past week, which allows those with access to generate one-minute videos from text prompts.
What are some options outside of OpenAI technology?
As I alluded to at the start of the article, there are many different LLM options on the market today. One consideration is whether to use open-source or closed-source LLMs. Open-source models offer transparency and community development, but may require more technical expertise and raise data security concerns. Closed-source models generally provide ease of use, support, and security, but can be expensive and limit customization.
Some LLMs to consider:
- BLOOM – open source, from the BigScience research project
- PaLM – by Google
- Claude – by Anthropic
- Cohere – Enterprise focused
- Llama – by Meta
There are of course many other options, but be sure to research what the best option is for you as you embark on using AI for your company.