Jangho Seo

Case study · Recommendation & Knowledge Tracing

Personalized Learning at Scale

AI for more equal learning: estimate what each learner knows so they get problems at the right difficulty, not ones so hard they give up, and recommend the courses they would actually care about. Built for an AI digital textbook and a learning platform serving millions.

Role
AI Engineer · TmaxEduAI
Timeline
2023 – 2024
Stack
Python · PyTorch · Airflow · SQL · Java
Focus
Knowledge tracing · Recommendation systems · MLOps

01. Problem

One-size-fits-all content fails learners in two opposite ways. Give a struggling student problems that are too hard and they lose motivation — the fun drains out. Drop a motivated learner into a sea of courses with no guidance and they can't find what's actually relevant to them.

Closing both gaps means answering two questions well: what does this learner currently know (so you can serve the right difficulty), and what would they actually want next (so the path stays relevant)? Both have to be answered live, at scale, and even for new learners with little history (the cold-start case).

02. The AI digital textbook (math)

For a new math digital textbook, the goal was adaptive practice — the right problems for each student. The obvious tool, our team's deep knowledge-tracing model, needs a lot of per-student solving history that a brand-new product simply doesn't have yet. So I designed a rule-based system that works from day one and quietly accumulates the data to make ML possible later.

  1. A knowledge state with no cold start

    Every sub-unit has a few common problems (3–5) every student solves, so there's never zero history. From correctness and item difficulty, the system estimates per-concept mastery.

  2. Adaptive problem sets

    Based on that state, it assigns a tailored number and difficulty of problems, advancing through sub-unit review → unit → final assessment. Do well and it serves more, harder items; struggle and it serves more concept-reinforcing, easier ones — and a missed foundational item triggers a problem from the prerequisite concept in the knowledge map.

  3. Signals beyond right and wrong

    Mastery rises and falls differently depending on item difficulty, and solving time counts too — rushing or guessing too fast is penalized rather than rewarded.

  4. Built for teachers, in Java

    The deployment target supported only Java: SQL pulls difficulty-appropriate items, and Java assembles each problem set and caps repeats so the same problem doesn't keep reappearing. Teachers can also enter parameters (target difficulty, class size) to auto-generate a workbook from the problem bank.
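The four rules above can be sketched in a few lines. This is a minimal illustration in Python rather than the Java and SQL that shipped, and every threshold, step size, field name, and the tiny `KNOWLEDGE_MAP` are assumptions made for the example, not the tuned production values.

```python
from dataclasses import dataclass
from typing import Optional

# Every constant and name here is an illustrative assumption, not a production value.
BASE_STEP = 0.10         # base mastery change per answered item
MIN_PLAUSIBLE_SEC = 5.0  # faster answers are treated as rushed or guessed

# Hypothetical slice of a knowledge map: concept -> prerequisite concept.
KNOWLEDGE_MAP = {"fraction_division": "fraction_multiplication"}


@dataclass
class Attempt:
    concept: str
    difficulty: float          # 0.0 (easiest) to 1.0 (hardest)
    correct: bool
    seconds: float             # solving time
    foundational: bool = False


def update_mastery(mastery: float, a: Attempt) -> float:
    """Return the updated mastery, clamped to [0, 1], after one attempt."""
    if a.seconds < MIN_PLAUSIBLE_SEC:
        delta = -0.5 * BASE_STEP                    # rushed: penalized even if correct
    elif a.correct:
        delta = BASE_STEP * (0.5 + a.difficulty)    # harder items earn more
    else:
        delta = -BASE_STEP * (1.5 - a.difficulty)   # missing easy items costs more
    return min(1.0, max(0.0, mastery + delta))


def plan_next_set(mastery: float) -> dict:
    """Map mastery to a problem count and difficulty band."""
    if mastery >= 0.7:
        return {"count": 8, "difficulty": "hard"}    # doing well: more, harder
    if mastery >= 0.4:
        return {"count": 5, "difficulty": "medium"}
    return {"count": 8, "difficulty": "easy"}        # struggling: more, easier


def prerequisite_to_review(a: Attempt) -> Optional[str]:
    """Return the prerequisite concept when a foundational item is missed."""
    if a.foundational and not a.correct:
        return KNOWLEDGE_MAP.get(a.concept)
    return None
```

Because each rule is a small pure function, a teacher-facing explanation ("mastery dropped because an easy item was missed") can be read straight off the branch that fired, which is the kind of transparency a rule-based design makes possible.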

Why rules, not a model? In education, teachers can't blindly trust a black-box estimate for high-stakes assessments: the estimate has to be understandable and respect curriculum order. Rules earned that trust early, and I tuned them with field teachers, by grade level. Just as important, I designed the database to double as an ML foundation by logging problems served, correctness, difficulty, and achievement. The product ships as rules today while accumulating the labeled history needed to move to ML later (a Deep Knowledge Tracing–based diagnosis model I also researched). Everything is delivered via RESTful APIs on the internal platform.
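To make the "ML foundation" idea concrete, here is a hedged sketch of how logged rows could later be turned into the per-student, time-ordered (problem, correctness) sequences that knowledge-tracing models typically train on. The `LogRow` fields and the `to_sequences` helper are hypothetical names for illustration; the actual schema is not shown on this page.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRow:
    """One served problem. All field names are hypothetical."""
    student_id: str
    served_at: int       # assumed: Unix timestamp when the problem was served
    problem_id: str
    difficulty: float
    correct: bool
    achievement: float   # assumed: mastery estimate recorded after the attempt


def to_sequences(rows):
    """Group rows per student, ordered by time, as (problem_id, correct) pairs."""
    by_student = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r.student_id, r.served_at)):
        by_student[row.student_id].append((row.problem_id, int(row.correct)))
    return dict(by_student)


rows = [
    LogRow("s1", 20, "p2", 0.6, False, 0.41),
    LogRow("s2", 5, "p1", 0.3, True, 0.55),
    LogRow("s1", 10, "p1", 0.3, True, 0.52),
]
sequences = to_sequences(rows)
```

The point of the sketch is that nothing extra has to be collected later: if every served problem is logged with its outcome from day one, the training set for a sequence model is a sort and a group-by away.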

03. The learning platform at scale

On a corporate learning platform serving millions, the question flips from "what should this student practice next" to "which of thousands of courses would this learner actually want" — running live, across many tenants.

  1. A two-layer recommender

    An LSTM over each learner's course and content history, plus their profile, predicts what they'll engage with, and a BERT model matches their interests and role to course titles. I added cold-start handling (skipping learners and tenants with too little history) and filtering to drop mandatory and duplicate courses.

  2. Production & MLOps

    I shipped it through Airflow, re-running inference daily on fresh activity and retraining weekly on the full history, with pre- and post-processing to keep recommendations valid across a multi-tenant platform. Results are served via RESTful APIs.
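The cold-start gate and the filtering step can be sketched as a small post-processing function. How the two models' outputs are actually combined is not described here, so the weighted blend below is purely an assumption, as are `MIN_HISTORY`, the parameter names, and the reading of "duplicate" as "already in the learner's history". The sketch covers the learner-level gate only, not the tenant-level one.

```python
MIN_HISTORY = 5  # assumption: fewer interactions than this counts as cold start


def recommend(history, scores_seq, scores_text, mandatory, top_k=3, weight=0.5):
    """Blend two score sources, then filter and rank.

    history:     course ids the learner already took
    scores_seq:  course id -> score from the history-based model
    scores_text: course id -> score from the text-matching model
    mandatory:   course ids the learner is assigned regardless
    """
    if len(history) < MIN_HISTORY:
        return []  # cold start: skip rather than guess

    seen = set(history)
    blended = {}
    for course_id in set(scores_seq) | set(scores_text):
        if course_id in seen or course_id in mandatory:
            continue  # drop duplicates of past courses and mandatory ones
        blended[course_id] = (
            weight * scores_seq.get(course_id, 0.0)
            + (1 - weight) * scores_text.get(course_id, 0.0)
        )

    # Highest score first; ties broken by id so the output is deterministic.
    ranked = sorted(blended, key=lambda c: (-blended[c], c))
    return ranked[:top_k]


picks = recommend(
    history=["a", "x", "y", "z", "w"],
    scores_seq={"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.7},
    scores_text={"a": 0.1, "b": 0.6, "c": 0.9, "e": 0.5},
    mandatory={"d"},
    top_k=2,
)
```

Returning an empty list for cold-start learners mirrors the "skip" behavior described above: showing nothing is treated as safer than showing a low-confidence guess.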

Architecture

Data
Sequential interaction history + user profiles
Model
LSTM history + BERT profile · knowledge tracing
Pipeline
Airflow: weekly retrain + daily inference
Serving
REST API to millions of learners

The recommendation flow: interaction data feeds the models, which are retrained and run on a schedule through Airflow, then served to learners via APIs.
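The cadence itself is simple enough to show in plain Python. This is only a sketch of the scheduling logic, not an Airflow DAG: the retrain weekday, the job names, and the choice to retrain before inferring on the same day are all assumptions.

```python
from datetime import date

RETRAIN_WEEKDAY = 6  # assumption: Sunday (date.weekday() counts Monday as 0)


def jobs_for(day: date) -> list:
    """Return the jobs to run on `day`, in execution order."""
    jobs = []
    if day.weekday() == RETRAIN_WEEKDAY:
        jobs.append("retrain")  # weekly, on the full history
    jobs.append("infer")        # daily, on fresh activity
    return jobs


sunday_jobs = jobs_for(date(2024, 1, 7))
monday_jobs = jobs_for(date(2024, 1, 8))
```

In a scheduler this would typically become two schedules (one daily, one weekly) with a dependency between them, so that inference on a retrain day uses the freshly trained model.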

04. Results

+5% course enrollment (CVR)

+25% content completion rate

0.88 knowledge-tracing AUC (PoC)
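For readers unfamiliar with the last metric: AUC is the probability that a randomly chosen correct answer was given a higher predicted score than a randomly chosen incorrect one, so 0.5 is chance and 1.0 is perfect ranking. A minimal pairwise sketch follows; the four labels and scores are toy values for illustration, not data from the project.

```python
def auc(labels, scores):
    """Pairwise AUC: chance a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = answered correctly, score = predicted probability of correct.
toy_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Here three of the four positive-negative pairs are ranked correctly, so the toy value is 0.75.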

05. Beyond the platform

I shaped the AI approach for a national workforce-reskilling initiative — applying the same ideas (knowledge tracing, recommendation, and personalized coaching) to help people retrain for software roles as digital skills shift. Same conviction underneath: learning should meet people where they are.
