Case study · Recommendation & Knowledge Tracing
Personalized Learning at Scale
AI for more equal learning: estimate what each learner knows so they get problems at the right difficulty, not ones so hard they give up, and recommend the courses they'd actually care about. Built across an AI digital textbook and a learning platform serving millions.
- Role: AI Engineer · TmaxEduAI
- Timeline: 2023 – 2024
- Stack: Python · PyTorch · Airflow · SQL · Java
- Focus: Knowledge tracing · Recommendation systems · MLOps
01. Problem
One-size-fits-all content fails learners in two opposite ways. Give a struggling student problems that are too hard and they lose motivation — the fun drains out. Drop a motivated learner into a sea of courses with no guidance and they can't find what's actually relevant to them.
Closing both gaps means answering two questions well: what does this learner currently know (so you can serve the right difficulty), and what would they actually want next (so the path stays relevant)? Both have to be answered live and at scale, including for new learners with little history (cold start).
02. The AI digital textbook (math)
For a new math digital textbook, the goal was adaptive practice — the right problems for each student. The obvious tool, our team's deep knowledge-tracing model, needs a lot of per-student solving history that a brand-new product simply doesn't have yet. So I designed a rule-based system that works from day one and quietly accumulates the data to make ML possible later.
- A knowledge state with no cold start
  Every sub-unit has a few common problems (3–5) every student solves, so there's never zero history. From correctness and item difficulty, the system estimates per-concept mastery.
- Adaptive problem sets
  Based on that state, it assigns a tailored number and difficulty of problems, advancing through sub-unit review → unit → final assessment. Do well and it serves more, harder items; struggle and it serves more concept-reinforcing, easier ones. A missed foundational item also triggers a problem from the prerequisite concept in the knowledge map.
- Signals beyond right and wrong
  Mastery rises and falls by different amounts depending on item difficulty, and solving time counts too: an answer submitted implausibly fast is treated as a rush or a guess and is penalized rather than rewarded.
- Built for teachers, in Java
  The deployment environment was Java-only: SQL pulls difficulty-appropriate items, and Java assembles each problem set and caps repeats so the same problem doesn't keep reappearing. Teachers can also enter parameters (target difficulty, class size) to auto-generate a workbook from the problem bank.
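The rules in the list above can be sketched roughly as follows. This is a minimal illustration in Python rather than the production Java, and every name, threshold, and step size in it (`Response`, `KNOWLEDGE_MAP`, `MIN_PLAUSIBLE_SECONDS`, `BASE_STEP`, the mastery bands) is an assumption made up for the example, not the tuned rule set.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative constants only; the real values were tuned with teachers.
MIN_PLAUSIBLE_SECONDS = 5.0   # faster answers are treated as guesses
BASE_STEP = 0.2               # largest mastery change from one response

# Hypothetical knowledge map: concept -> prerequisite concept.
KNOWLEDGE_MAP = {"fraction_addition": "common_denominators"}


@dataclass
class Response:
    concept: str
    correct: bool
    difficulty: float           # 0.0 (easiest) to 1.0 (hardest)
    seconds: float              # time spent on the item
    foundational: bool = False  # marks a foundational item


def update_mastery(mastery: float, r: Response) -> float:
    """Return the new per-concept mastery, clamped to [0, 1]."""
    if r.correct:
        if r.seconds < MIN_PLAUSIBLE_SECONDS:
            # Too fast to be a real solve: small penalty instead of credit.
            delta = -0.25 * BASE_STEP
        else:
            # Harder items earn more credit.
            delta = BASE_STEP * (0.5 + 0.5 * r.difficulty)
    else:
        # Missing an easy item costs more than missing a hard one.
        delta = -BASE_STEP * (0.5 + 0.5 * (1.0 - r.difficulty))
    return min(1.0, max(0.0, mastery + delta))


def plan_next_set(mastery: float) -> dict:
    """Map mastery to a problem count and a target difficulty band."""
    if mastery >= 0.7:
        return {"count": 8, "difficulty": (0.6, 1.0)}   # more, harder items
    if mastery >= 0.4:
        return {"count": 5, "difficulty": (0.3, 0.7)}
    return {"count": 8, "difficulty": (0.0, 0.4)}       # more, easier items


def remediation_concept(r: Response) -> Optional[str]:
    """Return the prerequisite concept to revisit after a missed foundational item."""
    if r.foundational and not r.correct:
        return KNOWLEDGE_MAP.get(r.concept)
    return None
```

Keeping each rule as a small pure function is one way to make the behavior easy to explain to teachers and to tune per grade level, since every constant is visible in one place.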
Why rules, not a model? In education, teachers can't blindly trust a black-box estimate for high-stakes assessments: it has to be understandable and respect curriculum order. Rules earned that trust early, and I tuned them with field teachers, grade by grade. Just as important, I designed the database to double as an ML foundation by logging problems served, correctness, difficulty, and achievement. The product ships as rules today while accumulating the labeled history needed to move to ML later, specifically a Deep Knowledge Tracing–based diagnosis model I also researched. Delivered via RESTful APIs on the internal platform.
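To illustrate how such logs could later feed a knowledge-tracing model, here is a hedged sketch that groups logged rows into per-student sequences. The `LogRow` fields beyond those named above (`student_id`, `served_at`, `concept_id`) and the encoding `concept_id + num_concepts * correct` are assumptions; the latter is a common convention in Deep Knowledge Tracing implementations, not necessarily the one used here.

```python
from collections import defaultdict
from typing import NamedTuple


class LogRow(NamedTuple):
    student_id: str
    served_at: int       # assumed: sortable timestamp of when the item was served
    concept_id: int      # assumed: 0-based index of the item's concept
    difficulty: float
    correct: bool
    achievement: float   # assumed meaning: achievement level recorded with the row


def to_sequences(rows, num_concepts):
    """Group rows by student and encode each interaction as one integer.

    An interaction on concept c becomes c if answered incorrectly and
    c + num_concepts if answered correctly, in chronological order.
    """
    by_student = defaultdict(list)
    for row in rows:
        by_student[row.student_id].append(row)

    sequences = {}
    for student, events in by_student.items():
        events.sort(key=lambda e: e.served_at)
        sequences[student] = [
            e.concept_id + num_concepts * int(e.correct) for e in events
        ]
    return sequences


rows = [
    LogRow("s1", 2, 1, 0.5, False, 0.4),
    LogRow("s1", 1, 0, 0.3, True, 0.3),
    LogRow("s2", 1, 2, 0.9, True, 0.7),
]
print(to_sequences(rows, num_concepts=3))  # {'s1': [3, 1], 's2': [5]}
```

Difficulty and achievement are carried in the row but not used by this encoding; a richer model could consume them as extra features.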
03. The learning platform at scale
On a corporate learning platform serving millions, the question flips from "what should this student practice next" to "which of thousands of courses would this learner actually want" — running live, across many tenants.
- A two-layer recommender
  An LSTM over each learner's course and content history plus profile predicts what they'll engage with, and a BERT model matches their interests and role to course titles. I added cold-start handling (skipping learners and tenants with too little history) and filtering to drop mandatory and duplicate courses.
- Production & MLOps
  Shipped through Airflow, with daily re-inference on fresh activity and weekly retraining on the full history. Pre- and post-processing keep recommendations valid across the multi-tenant platform, and results are served via RESTful APIs.
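The cold-start gate and the post-processing filter described above might look something like the sketch below. The threshold `MIN_HISTORY`, the function names, and the tenant-catalog check are assumptions for illustration; the source only states that learners with too little history are skipped and that mandatory and duplicate courses are dropped.

```python
MIN_HISTORY = 5  # assumed minimum number of interactions before recommending


def has_enough_history(history, min_history=MIN_HISTORY):
    """Cold-start gate: skip learners whose history is too short."""
    return len(history) >= min_history


def postprocess(ranked_course_ids, mandatory, tenant_catalog, k):
    """Filter a ranked list down to at most k valid recommendations.

    Drops repeated entries, mandatory courses, and (as an assumed reading of
    "valid across tenants") courses outside the learner's tenant catalog.
    Ranking order is preserved.
    """
    seen = set()
    result = []
    for course_id in ranked_course_ids:
        if course_id in seen:
            continue  # duplicate entry
        seen.add(course_id)
        if course_id in mandatory or course_id not in tenant_catalog:
            continue
        result.append(course_id)
        if len(result) == k:
            break
    return result


ranked = ["a", "b", "a", "c", "d", "e"]
print(postprocess(ranked, mandatory={"b"}, tenant_catalog={"a", "c", "d"}, k=2))
# ['a', 'c']
```

Running the filter after scoring, rather than masking inside the model, keeps tenant and policy rules editable without retraining.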
- Data: Sequential interaction history + user profiles
- Model: LSTM history + BERT profile · knowledge tracing
- Pipeline: Airflow: weekly retrain + daily inference
- Serving: REST API to millions of learners
04. Results
- Course enrollment (CVR): +5%
- Content completion rate: +25%
- Knowledge-tracing AUC (PoC): 0.88
- CVR and completion gains were measured over a 3-month post-launch monitoring period on a platform serving millions of learners.
- For the knowledge-tracing PoC, I also surfaced the model's explainability limits and proposed directions to make it interpretable enough for real deployment.
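For readers unfamiliar with the AUC metric reported above, the following sketch shows what it measures: the probability that a randomly chosen correct response receives a higher predicted score than a randomly chosen incorrect one. The function and the toy data are illustrative only and are not the PoC's evaluation code or results.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Runs in O(P * N), which is fine for a demo.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = answered correctly, score = predicted probability of correct.
print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.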
05. Beyond the platform
I shaped the AI approach for a national workforce-reskilling initiative — applying the same ideas (knowledge tracing, recommendation, and personalized coaching) to help people retrain for software roles as digital skills shift. Same conviction underneath: learning should meet people where they are.