Shipaton 2025 - Day 30 - Building a Lithuanian Vocabulary Dataset
Shipaton 2025 Day 30: Built the initial Learn Lithuanian vocabulary dataset with Lithuanian and English words organized in a clean CSV, ready for upcoming features. Try the app on Google Play and share feedback.
Mantas Butenas
8/31/2025 · 1 min read
Day 30: A Milestone in Shipaton 2025
Today was all about words - the core ingredient of any language app. I spent the day gathering Lithuanian vocabulary, mapping clean English translations, and organizing everything into a consistently formatted CSV. It’s not flashy work, but it’s the foundation the whole learning experience will stand on.
What I've Accomplished Today
✔️ Collected a list of high-frequency, everyday Lithuanian words.
✔️ Paired each item with a clear, beginner-friendly English translation.
✔️ Standardized spelling, diacritics (ą, č, ę, ė, į, š, ų, ū, ž) and capitalization.
✔️ Organized everything in Excel first for quick visual QA, then exported to CSV.
🎯 Result: A clean initial CSV vocabulary dataset with consistent columns and formatting - ready to plug into the Android app.
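For anyone who wants to script the same cleanup instead of doing it in Excel, here is a minimal Python sketch. It is not the process used for the app's dataset: the column names (`lithuanian`, `english`), the helper names, and the choice to lowercase every entry are all assumptions made for illustration. Lowercasing in particular would need an exception for proper nouns.

```python
import csv
import io
import unicodedata


def clean_entry(lithuanian, english):
    """Normalize one word pair (hypothetical helper, not from the app).

    NFC normalization makes sure a letter such as "č" is stored as one
    code point rather than "c" plus a combining caron, so identical
    words compare as equal.
    """
    lt = unicodedata.normalize("NFC", " ".join(lithuanian.split())).lower()
    en = unicodedata.normalize("NFC", " ".join(english.split())).lower()
    return lt, en


def write_vocab_csv(pairs, out):
    """Write cleaned, de-duplicated pairs to a file-like object.

    Returns the number of data rows written. The header names are an
    assumption; use whatever columns the app actually expects.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lithuanian", "english"])
    seen = set()
    written = 0
    for lithuanian, english in pairs:
        row = clean_entry(lithuanian, english)
        # Skip rows with a missing side and exact duplicate pairs.
        if not row[0] or not row[1] or row in seen:
            continue
        seen.add(row)
        writer.writerow(row)
        written += 1
    return written


if __name__ == "__main__":
    buffer = io.StringIO()
    write_vocab_csv([("Labas", "hello"), (" ačiū ", "Thank  you")], buffer)
    print(buffer.getvalue())
```

When writing to a real file, opening it with `open(path, "w", encoding="utf-8", newline="")` keeps the diacritics intact and avoids doubled line endings on Windows.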
Why this matters
A well-structured dataset makes a real difference once features start depending on it. Because everything lives in a plain CSV, the same file can be imported into various apps and tools built to boost vocabulary retention, which makes it a reusable resource for learners.
A tidy dataset means:
✔️ Faster iteration when building quizzes and spaced repetition.
✔️ Less friction adding audio, example sentences, and images later.
✔️ Clear upgrade path to levels (A1 → A2) and themed packs (travel, food, small talk).
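To illustrate the upgrade path, here is a rough Python sketch of how levels and themed packs could be selected from the CSV once it grows extra columns. The `level` and `topic` columns, the `load_pack` helper, and the sample rows are hypothetical; the current dataset only holds the Lithuanian and English words.

```python
import csv
import io

# Illustrative rows only, not taken from the app's dataset.
SAMPLE_CSV = """lithuanian,english,level,topic
labas,hello,A1,small talk
ačiū,thank you,A1,small talk
duona,bread,A1,food
oro uostas,airport,A2,travel
"""


def load_pack(source, level=None, topic=None):
    """Return rows matching the given level and/or topic.

    `source` is any file-like object containing CSV text with a header
    row. A filter left as None is ignored.
    """
    reader = csv.DictReader(source)
    return [
        row
        for row in reader
        if (level is None or row["level"] == level)
        and (topic is None or row["topic"] == topic)
    ]


if __name__ == "__main__":
    for row in load_pack(io.StringIO(SAMPLE_CSV), level="A1", topic="food"):
        print(row["lithuanian"], "=", row["english"])
```

Keeping level and topic as plain columns means a new pack is just another filter over the same file, with no change to the file format.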
Conclusion
If you’re following along with Shipaton 2025, thanks for the support! Got suggestions for must-know beginner words or topics? Drop them in the comments - your ideas might make it into the next build.