Participants
25 (aged 18+)
Duration
3 months
Year
2023
Tools used
Miro
Pencil & Paper
Figma
Notion
Role:
Responsibilities:
Research Planning
User Research
Overall Process
Discover
Created a Research Plan
Semi-structured Interviews
Group Discussions
Literature Review
Context-of-Use Analysis
Define
Empathy Mapping
User Requirements Generation
MoSCoW Prioritization
User Requirements Specification
User Journey Mapping
Design
Hand-Drawn Sketches
Low-Fidelity Prototypes
High-Fidelity Prototypes
Design System Creation
Deliver
Heuristic Evaluation
First-Click Testing
System Usability Scale (SUS) Testing
Iterative Refinement
Research and Discovery
I started with a literature review to identify gaps in existing language acquisition methods and to highlight the potential of integrating VR and AI for immersive, contextual learning. The review justified the project’s focus on user challenges such as language anxiety, a lack of practical spaces in which to practise, and low engagement, and it underscored the importance of contextual scenarios. These insights informed a research plan centred on observational studies and semi-structured interviews with 10 non-native English speakers to explore user interactions and perceptions. Ethical considerations included obtaining informed consent, anonymizing data, and addressing cybersickness by offering breaks or screen-based alternatives.
Research Plan & Ethics Approval
The research adhered to ethical guidelines. Participants gave consent through forms outlining the study's purpose, procedures, and their rights. They were also briefed before the study began, and their data was kept confidential and anonymised. Ethical approval was secured from the faculty. The study explored Virtual Reality (VR) and Artificial Intelligence (AI) as future technologies for solving societal issues, which made ethical considerations especially important. Participants were warned about potential cybersickness caused by VR and advised to report any discomfort. The study was designed as a seated experience for safety, and the age requirement (18+) was enforced, as younger children are more prone to confusing VR with reality.
Literature Review
The literature review aimed to explore existing research on language acquisition, the application of Virtual Reality (VR) and Artificial Intelligence (AI) in learning, and the potential of combining these technologies for immersive language-learning experiences. The review was critical in identifying gaps and limitations in current methods, which the project sought to address.
📚 Language Acquisition Theories
Key Finding: The literature suggests that a move from passive learning (memorization) to active language acquisition (practical use) is crucial for fluency. This insight shaped the project’s emphasis on creating an immersive and interactive VR environment where users could practice language in context.
Design Implication: The project aimed at creating an active language acquisition platform that allows users to practice their language skills in a contextual environment.
😖 Language Acquisition Challenges
🕶️ Virtual Reality
🤖 Artificial Intelligence (AI)
⁉️ Challenges and Considerations in Using VR and AI
Competitor Analysis
The competitor analysis was essential for identifying gaps in the market. Analyzing Duolingo, ELSA, Rosetta Stone, Google Translate, and Glossika revealed the following strengths and weaknesses of existing language-learning platforms:
Duolingo excels at engagement but lacks immersive, real-world language practice.
ELSA focuses on pronunciation but doesn't provide complete language immersion.
Rosetta Stone offers solid foundational learning but lacks real-time interaction or situational practice.
Google Translate is useful for quick translation but doesn’t support structured learning or immersive practice.
Glossika enhances fluency but lacks interaction and conversation.
The competitor analysis highlighted the need for an immersive VR-AI solution:
The tools reviewed mostly provide static lessons or translations but don't simulate real-world, dynamic language practice environments.
None offer the contextual, interactive immersion that VR and AI can provide, which was a core insight for the project's design.
Define
The Problem
Language learners, particularly non-native speakers, face significant challenges such as language anxiety, lack of immersive practice environments, and limited engagement due to traditional and non-interactive learning methods. Current tools, like mobile apps and classroom settings, fail to provide contextual and judgment-free spaces where learners can practice language fluency in real-world scenarios. Additionally, these tools often lack personalization and dynamic interactions, which are essential for practical language acquisition.
Setting Design Goals
Based on research insights, the following design requirements were identified to create an effective and immersive language-learning experience:
Context-Based Learning Environment
Research by Kang (1995) emphasizes that learning in real-life situations enhances recall, listening comprehension, and knowledge transfer. A restaurant scenario was selected as the virtual environment, as it provides a relatable and practical context for language learners. This setting ensures that learners can practice vocabulary and phrases directly applicable to daily life.
Focus on Practical Vocabulary
Drawing from Maslow’s hierarchy of needs (1943), which highlights food as a basic human necessity, the restaurant scenario allows learners to acquire vocabulary related to physiological needs (e.g., ordering food and drinks, and other basic interactions). This aligns with foundational language needs and supports functional communication skills.
Interactivity and Social Dynamics
The restaurant setting provides natural opportunities for interactivity through props (e.g., menus, tables, utensils) and multiple characters. Learners can practice diverse dialogues, such as ordering from an AI-driven waiter or engaging in casual social conversations. This dynamic environment fosters both structured and spontaneous language practice.
Relevance and Immersion
Restaurants are universally recognized as social spaces, making them ideal for learners to build confidence in conversational scenarios. The inclusion of interactive elements and multiple NPCs allows for realistic role-play and simulation of real-world challenges, such as multi-character conversations and active listening.
User Persona
Creating the user persona helped the project by focusing design decisions on the needs of the target audience, such as reducing language anxiety and providing immersive, real-world practice scenarios. It ensured the VR environment and AI interactions aligned with users' goals, like preparing for social interaction, and addressed pain points like lack of conversational opportunities.
Design
Moodboarding
The moodboarding phase helped define the visual aesthetics and ambiance of the virtual restaurant environment. It focused on elements crucial for immersion, such as seating, a counter, a counter server, lighting, a point-of-sale (POS) system, and food, using copyright-free images from Unsplash compiled in Figma. The moodboard guided the design of the AI counter server’s appearance and placement so that they matched real-world expectations. Feedback from the pilot run highlighted the need for additional elements such as people, commotion, and a menu, which were incorporated into the final design to enhance realism.

Prototyping
Challenges in Prototyping
Usability Test Takeaways
Theme 1: Language Anxiety
Description: Participants expressed confusion regarding the boundaries between interactable and non-interactable NPCs.
Quotes from the research participants
“I wasn’t sure if I could talk to the NPC, they looked like they were just part of the environment.”
“Some NPCs were not clearly marked, and I ended up approaching them thinking I could interact.”
Insights / Design Implications: The lack of clear signifiers for interactable NPCs led to user confusion and hesitation in initiating interactions. This suggests a need for clearer affordances or visual cues to distinguish interactable NPCs from non-interactable ones.
Theme 2: Contextual Engagement and Cognitive Demand
Description: Cognitive demand is how mentally taxing an interaction is. In contextual engagement, it depends on how much the user has to think about, or adapt to, the context and the interaction itself. The restaurant setting supplies a ready-made framework (menu items, restaurant-related questions), so users do not have to spend mental effort coming up with topics. Cognitive demand is therefore lower when users talk to NPCs about food, drinks, or other restaurant items.
Quotes from the research participants
"Yeah, I think yes. For me it’s the situation is very important for me to practise."
Insights / Design Implications: The restaurant context naturally reduces cognitive effort in topic generation. To improve this, adding more context-specific cues or prompts could further reduce cognitive load and enhance engagement.
Theme 3: Trust in AI
Description: Participants had mixed feelings about trusting the AI, with some expressing concern about glitches, while others were neutral or trusted the technology.
Quotes from the research participants
“I don’t trust the AI 100%, because I’m worried it might glitch.”
“I trust it to translate things just like Google Translate, so I’m fine with it.”
Insight: Trust in the AI depended on its reliability, particularly consistent performance. Most participants felt neutral or positive toward the AI, though some expressed concerns about potential glitches.
Theme 4: Realism and Immersion
Description: Realism played a significant role in motivating users to engage with NPCs, but there were varying levels of immersion based on NPC features.
Quotes from the research participants
“I wasn’t sure if I could talk to the NPC, they looked like they were just part of the environment.”
“Some NPCs were not clearly marked, and I ended up approaching them thinking I could interact.”
Insight: Users were more engaged with the NPC that exhibited realistic behaviour (e.g., speech) but expressed a desire for further realism in non-interactive elements.
Theme 5: Initial Uncertainty About Interaction Boundaries
Description: Participants expressed confusion regarding the boundaries between interactable and non-interactable NPCs.
Quotes from the research participants
“I wasn’t sure if I could talk to the NPC, they looked like they were just part of the environment.”
“Some NPCs were not clearly marked, and I ended up approaching them thinking I could interact.”
Insight: The lack of clear signifiers for interactable NPCs led to user confusion and hesitation in initiating interactions.
Theme 6: Ease of Controls
Description: Participants found the controls difficult to use, particularly having to press the space key to initiate conversations, which was awkward with the VR headset on.
Quotes from the research participants
“I couldn’t press the space key easily with the VR headset on. It was awkward.”
Insight: Difficulty reaching the space bar interrupted the flow of interactions. The controls were not well suited to VR, and more intuitive alternatives (e.g., proximity-based triggers) would improve usability.
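To illustrate the proximity-based alternative mentioned in this insight, here is a minimal sketch of how a conversation could start when the learner approaches an NPC instead of pressing a key. It is not the prototype's implementation: the class name `ProximityTrigger`, the `Vec3` type, and the radius values are all hypothetical, and a real VR engine would supply its own vector and collision types.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    """Hypothetical 3D position in metres (a real engine has its own type)."""
    x: float
    y: float
    z: float


def distance(a: Vec3, b: Vec3) -> float:
    """Straight-line distance between two positions."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


class ProximityTrigger:
    """Starts a conversation when the player comes close to an NPC.

    Two radii are used (assumed values): the conversation starts inside
    enter_radius and only ends beyond the larger exit_radius, so small
    head movements at the boundary do not toggle it on and off.
    """

    def __init__(self, npc_position: Vec3,
                 enter_radius: float = 1.5, exit_radius: float = 2.0):
        if enter_radius > exit_radius:
            raise ValueError("enter_radius must not exceed exit_radius")
        self.npc_position = npc_position
        self.enter_radius = enter_radius
        self.exit_radius = exit_radius
        self.active = False

    def update(self, player_position: Vec3) -> bool:
        """Call once per frame; returns True while the conversation is active."""
        d = distance(player_position, self.npc_position)
        if not self.active and d <= self.enter_radius:
            self.active = True    # player stepped close enough: start talking
        elif self.active and d > self.exit_radius:
            self.active = False   # player walked away: end the conversation
        return self.active


waiter = ProximityTrigger(Vec3(0.0, 0.0, 0.0))
far_away = waiter.update(Vec3(3.0, 0.0, 0.0))    # outside both radii
approached = waiter.update(Vec3(1.0, 0.0, 0.0))  # inside enter radius
lingering = waiter.update(Vec3(1.8, 0.0, 0.0))   # between the two radii
left = waiter.update(Vec3(2.5, 0.0, 0.0))        # beyond exit radius
```

Using a larger exit radius than enter radius is a design choice rather than a requirement. It keeps the conversation stable when a seated user leans slightly in or out of range.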
Accessibility through audio and text
Theme 7: Multimodal Learning
Description: We discovered that combining text (visual) and audio (auditory) content enhanced user comprehension and engagement in the VR language acquisition experience. Participants, especially those with low fluency or accessibility needs (e.g., glasses wearers), found this dual delivery method helpful in improving understanding. This insight emphasized the effectiveness of multimodal learning in creating a more inclusive and accessible VR environment for diverse users.
Quotes from the research participants
“Having both text and audio was a game-changer. When I didn’t understand the audio, the text made everything clear."
"The combination helped me retain more vocabulary because I could hear it and read it at the same time."
"I wear glasses, so the auditory feedback was perfect. I didn’t have to strain to see the text while wearing the headset."
Insight: Incorporating multimodal learning proved to be a simple yet highly effective solution for improving both understanding and accessibility in the VR language acquisition process.
Impact
This case study explored how VR and AI can create an immersive and effective language acquisition platform. By examining interaction design factors, user experiences, and the integration of multimodal learning, the study identified key guidelines for enhancing language acquisition. The findings demonstrate the potential of immersive environments to support contextual learning, reduce language anxiety, and provide dynamic, personalized language practice.
Looking ahead, the case study highlights areas for future exploration, such as gamification, expanded AI interactions, and the inclusion of diverse language settings. These advancements could foster more engaging, inclusive, and effective language learning experiences. This research lays the foundation for designing innovative platforms that blend technology with pedagogy, offering learners worldwide an accessible and powerful tool for language acquisition.