Sosun Yim thtjs0076

Hi, I'm Sosun Yim👋

🎓 Data Science @ UC Berkeley (expected May 2027)
🔬 Interested in machine learning, data analysis, and building data-driven projects
🌱 Currently studying SQL, statistics, and machine learning
📫 Open to Data Science / Data Analyst internships

🔭 Featured Project

Spam Classifier — Text classification on the Enron email dataset (~33K emails)

Compared a baseline logistic regression (LR), TF-IDF + LR, and a fine-tuned DistilBERT
Best F1: 0.9927 (DistilBERT), with a latency vs. accuracy trade-off analysis
Investigated duplicate emails (9.6% of the data) as a potential leakage source and re-validated the results
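
A minimal sketch of the kind of duplicate check described above, assuming "duplicate" means an exact match after lowercasing and collapsing whitespace. The function names (`normalize`, `fingerprint`, `duplicate_rate`, `drop_test_leaks`) are hypothetical and are not taken from the project's code; the actual analysis may define duplicates differently.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def fingerprint(text: str) -> str:
    """Stable hash of the normalized text (assumed duplicate criterion)."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def duplicate_rate(texts):
    """Fraction of texts that repeat an earlier text after normalization."""
    seen = set()
    dupes = 0
    for t in texts:
        fp = fingerprint(t)
        if fp in seen:
            dupes += 1
        else:
            seen.add(fp)
    return dupes / len(texts) if texts else 0.0


def drop_test_leaks(train_texts, test_texts):
    """Keep only test texts whose fingerprint does not appear in the train set."""
    train_fps = {fingerprint(t) for t in train_texts}
    return [t for t in test_texts if fingerprint(t) not in train_fps]
```

Re-scoring a model on the output of `drop_test_leaks` is one way to see whether duplicates shared between the train and test splits were inflating the reported F1.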

🛠️ Tech Stack

📊 GitHub Stats