

Key takeaways from our research: The rise of Large Language Models – transforming AI and beyond

Author:

Rhona Asgari

AI Research Lead

Date: Apr. 01, 2025

Large language models (LLMs) have redefined artificial intelligence (AI), pushing the boundaries of natural language processing (NLP) and enabling machines to understand, generate, and manipulate human-like text. From chatbots and content creation to legal and medical applications, LLMs are transforming industries at an unprecedented pace. But what makes these models so powerful? How do they work? And what challenges do they pose? In this blog, we explore the evolution, applications, training methodologies, and ethical considerations of LLMs, summarized from our recent research published in the peer-reviewed journal Computers, Materials & Continua.

The evolution of Large Language Models

The journey of NLP began with early models like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, which could process sequential data. However, these models struggled with long-range dependencies and suffered from vanishing gradients. The introduction of the Transformer architecture in 2017, in the "Attention Is All You Need" paper by Vaswani et al., revolutionized language modeling. Transformers leverage self-attention mechanisms, allowing models to analyze entire input sequences simultaneously rather than processing words one at a time.
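To make the self-attention mechanism concrete, here is a minimal NumPy sketch of scaled dot-product attention. The matrix names and dimensions are illustrative only, not taken from any particular model.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Minimal scaled dot-product self-attention over a token sequence.

    X: (seq_len, d_model) token embeddings; W_q/W_k/W_v project them to
    queries, keys, and values. Every position attends to every other
    position at once, which is what lets Transformers capture long-range
    dependencies in parallel.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # pairwise token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over each row
    return weights @ V                                  # weighted mix of values

# Toy example: 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```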

This breakthrough led to the development of powerful models such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer). With each iteration, these models grew in size and complexity. The latest versions, like GPT-4 and PaLM-2, boast hundreds of billions to over a trillion parameters, unlocking new capabilities in text generation, translation, and semantic understanding.

How are Large Language Models used?

LLMs have found applications in a wide range of fields. Below are some of the key areas where they are making a significant impact:

  • Healthcare – medical documentation, patient summaries, clinical decision support
  • Finance – automated reporting, fraud detection, sentiment analysis
  • Legal and compliance – contract review, legal document analysis, compliance monitoring
  • Education – personalized tutoring, automated grading, research paper summarization
  • Software development – code generation, bug detection, documentation assistance
  • Marketing and advertising – social media content generation, personalized marketing campaigns
  • Entertainment & gaming – NPC dialogue creation, scriptwriting assistance
  • Customer service – AI chatbots, automated email responses
  • Translation & linguistics – real-time translation, dialect preservation

How are LLMs trained?

LLMs go through two major training phases:

  1. Pre-training: The model is exposed to massive datasets, learning linguistic patterns, sentence structures, and contextual relationships in a self-supervised manner. This phase equips the model with broad language understanding but lacks task-specific fine-tuning.
  2. Fine-tuning: The model is further trained on domain-specific datasets to improve its performance on particular tasks. Common techniques include instruction fine-tuning and parameter-efficient fine-tuning (PEFT). The former trains LLMs on structured datasets of prompts and human-written responses to improve how well they follow instructions, while the latter adapts models to new tasks with minimal computational overhead by updating only a small subset of parameters (see the sketch after this list).
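As an illustration of the PEFT idea, the sketch below wraps a small base model with LoRA adapters using the Hugging Face peft library. The checkpoint name and hyperparameters are assumptions chosen for the example, not recommendations from the paper.

```python
# Hypothetical PEFT sketch: LoRA adapters via the Hugging Face peft library.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM checkpoint

# LoRA freezes the base weights and learns small low-rank update matrices,
# so only a tiny fraction of the parameters is actually trained.
config = LoraConfig(
    r=8,               # rank of the low-rank adapters
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of the full model

# The wrapped model can now be fine-tuned on a domain-specific dataset
# with a standard training loop or transformers.Trainer.
```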


Learning from context: Zero-shot and few-shot learning

LLMs can adapt dynamically to new tasks using in-context learning, where the model generates responses based on provided examples without explicit re-training.

  • Zero-shot learning: The model performs a task from the instruction alone, without any examples.
  • One-shot learning: The model learns from a single example.
  • Few-shot learning: The model generalizes from a small set of examples.

These techniques enable LLMs to be flexible and adaptable, even with limited labeled data.
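The following sketch shows what this looks like in practice: the task is "learned" purely from examples placed in the prompt, with no weight updates. The task, labels, and helper name are illustrative.

```python
# Illustrative in-context learning: zero-shot vs. few-shot prompt construction.
examples = [
    ("The movie was fantastic!", "positive"),
    ("I want my money back.", "negative"),
]

def build_prompt(query, shots):
    lines = ["Classify the sentiment of each review as positive or negative.\n"]
    for text, label in shots:                      # few-shot: demonstrations in context
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {query}\nSentiment:")   # the model completes this line
    return "\n".join(lines)

zero_shot = build_prompt("The plot dragged on forever.", shots=[])        # no examples
few_shot = build_prompt("The plot dragged on forever.", shots=examples)   # two examples
print(few_shot)
```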

Aligning AI with human preferences

One of the biggest challenges with LLMs is ensuring they generate useful and ethical responses. This is where reinforcement learning from human feedback (RLHF) comes in. RLHF refines a model’s outputs by incorporating human preferences, reducing bias, and improving factual accuracy.

How does it work?

  • A reward model is trained to evaluate LLM responses.
  • The model is then optimized using reinforcement learning algorithms like proximal policy optimization (PPO).
  • The updated LLM generates more aligned and ethical outputs.

Before applying RLHF, LLMs undergo supervised fine-tuning (SFT), where they learn from human-annotated responses to align with desired behaviors. RLHF further refines these models by training a reward model to rank outputs based on quality and alignment with human preferences. However, this process involves trade-offs, such as reward hacking, where models optimize for high reward scores rather than true correctness, potentially reinforcing biases or generating overly cautious responses. Balancing alignment with diversity and creativity remains a key challenge in RLHF implementation.
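The loop described above can be summarized in a short conceptual sketch. Here, policy, reward_model, and ppo_update are hypothetical stand-ins rather than any specific library's API.

```python
# Conceptual RLHF loop; all callables are hypothetical stand-ins.

def rlhf_step(policy, reward_model, prompts, ppo_update):
    # 1. The current policy (the SFT'd LLM) generates candidate responses.
    responses = [policy.generate(p) for p in prompts]

    # 2. The reward model, trained on human preference rankings, scores them.
    rewards = [reward_model.score(p, r) for p, r in zip(prompts, responses)]

    # 3. A reinforcement learning algorithm such as PPO nudges the policy toward
    #    higher-reward responses, typically with a KL penalty that keeps it close
    #    to the original model and limits reward hacking.
    ppo_update(policy, prompts, responses, rewards)
    return policy
```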

Addressing AI hallucinations: Retrieval-Augmented Generation (RAG)

While LLMs are impressive, they sometimes hallucinate, generating false but plausible-sounding responses. One solution is Retrieval-Augmented Generation (RAG), which allows models to pull in real-time information from external sources like databases and websites.

For example, instead of relying solely on pre-trained knowledge, a RAG-powered AI assistant could retrieve live stock market data before generating a financial report, ensuring factual accuracy.
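A minimal sketch of the retrieve-then-generate pattern is shown below. The toy keyword retriever stands in for a real vector search, and llm_generate is a hypothetical placeholder for any text-generation call.

```python
# Minimal RAG sketch: retrieve supporting text, then ground the answer in it.

documents = {
    "earnings": "ACME Corp reported Q1 revenue of $1.2B, up 8% year over year.",
    "guidance": "ACME Corp raised its full-year guidance to $5.0B.",
}

def retrieve(query, docs, top_k=1):
    # Real systems use vector similarity search; keyword overlap keeps the idea visible.
    scored = sorted(
        docs.values(),
        key=lambda d: len(set(query.lower().split()) & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def answer_with_rag(query, llm_generate):
    context = "\n".join(retrieve(query, documents))
    prompt = (
        f"Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    # Grounding the generation in retrieved text reduces hallucinated facts.
    return llm_generate(prompt)
```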

Ethical concerns in AI

As LLMs become more integrated into daily life, ethical concerns arise:

  • Bias & fairness: Models trained on biased data can perpetuate stereotypes.
  • Privacy risks: Using publicly available data for training raises concerns about data security.
  • Misinformation: AI-generated content can be misleading or entirely false.
  • Environmental impact: Training large models consumes vast amounts of energy.

To combat these issues, researchers are developing bias-mitigation techniques, fact-checking mechanisms, and energy-efficient model architectures to make LLMs more responsible and sustainable. Moreover, federated learning enables LLMs to train on distributed data without transferring sensitive information, enhancing privacy. It’s especially useful in sectors like healthcare and finance, where data security and compliance are crucial.
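The privacy benefit of federated learning comes from aggregating model updates rather than raw data. The sketch below shows the federated-averaging idea in its simplest form; local_train and the site data are hypothetical placeholders.

```python
# Conceptual federated averaging: each site trains locally, only weights move.
import numpy as np

def federated_average(global_weights, local_train, site_datasets):
    # Each site adapts the shared model on its own private data.
    local_weights = [local_train(global_weights, data) for data in site_datasets]
    # The server averages the returned weights; the raw records never leave the site.
    return np.mean(local_weights, axis=0)
```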

The future of LLMs

The next generation of LLMs is expected to introduce:

  • More efficient architectures: Reducing computational costs while maintaining high performance.
  • Better context understanding: Recognizing sarcasm, idioms, and cultural nuances.
  • Multimodal AI: Combining text, images, and audio for richer AI interactions.
  • Personalization: Tailoring AI experiences to individual user preferences.
  • Explainability & transparency: Improving model interpretability for safer AI deployments.

Conclusion

LLMs have already reshaped AI-powered applications, from business automation to creative writing. However, challenges related to bias, misinformation, and ethical AI use remain at the forefront of research. By refining model architectures, enhancing transparency, and integrating real-time retrieval systems, LLMs will continue evolving toward more accurate, ethical, and efficient AI solutions.

As AI continues to push boundaries, one thing is certain: LLMs are here to stay, and their impact will only grow in the years to come.
