
Yosia Putra Hartono

Data Scientist

I am a data scientist with a full-stack data background spanning analysis, pipelines, and ML deployment. I translate complex data into clear insights using cloud platforms.


Let's keep in touch!

Mail me

Yosia Hartono

deliver production-ready solutions. fast.
  1. MAY 2024 – APR 2025

    Artificial Intelligence and Machine Learning Co-op · JDS Energy & Mining

    • Co-developed ChatJDS, an OpenAI-powered chatbot on Azure with a Cosmos DB backend, serving 150+ employees and improving average project-query efficiency by 70%.
    • Engineered a technical-report processing pipeline using PyMuPDF, spaCy, and a fine-tuned LLM, automating mining metadata extraction and reducing manual effort by 97%.
    • Fine-tuned an OpenAI embeddings model on 400 mining technical reports with supervised engineer feedback, achieving 85% retrieval accuracy.
    • Loaded 10,000+ projects into an internal database with autogenerated file-preview links.
    • Python
    • Azure
    • Azure App Service
    • Azure Container Registry
    • Cosmos DB
    • Docker
    • OpenAI API
    • Azure Cognitive Search
    • PyMuPDF
    • spaCy
    • SharePoint API
    • Microsoft Graph API
    • Git
  2. MAY 2023 – AUG 2023

    Data Analyst · Bukit Uluwatu Villa

    • Developed a Python-based predictive algorithm using stochastic modeling and Monte Carlo simulations, generating tailored stock-price scenarios for multiple investment risk profiles.
    • Collaborated with financial analysts to validate assumptions and refine projections through iterative reviews.
    • Built a Power BI projection dashboard through 2027, supporting board-level decision-making.
    • Python
    • NumPy
    • Pandas
    • SciPy
    • Matplotlib
    • Monte Carlo Simulation
    • Stochastic Modeling
    • Power BI
    • Power Query
    • DAX
    • Excel
  3. SEP 2020 – JUN 2021

    Social Media Content Designer · TEDx Youth

    • Partnered with TEDx Youth to manage the @tedxyouthsmakone social media presence, supporting content planning and day-to-day execution.
    • Designed and produced 11 branded assets (posts, stories, event promos) using Adobe Photoshop and Adobe Illustrator, maintaining consistent visual identity.
    • Coordinated a team of 5 junior designers through templates, feedback loops, and deadlines, shipping 5 assets/week on schedule.
    • Adobe Photoshop
    • Adobe Illustrator
    • Social Media Management
    • Content Design
    • Branding

All Projects

  1. SEP 2025 – Present

    Urban Green Space Analysis · Course / Research

    • Employed a decision tree in scikit-learn, guided by an Analytic Hierarchy Process (AHP).
    • Found that increasing tree coverage by 75% could improve the Air Quality Index by 8%.
    • Integrated GIS spatial layers in ArcGIS to map tree coverage and air-quality hotspots.
    • Python
    • scikit-learn
    • AHP
    • GIS
    • ArcGIS
  2. SEP 2025 – DEC 2025

    Canadian Unemployment Time-Series Analysis · Course / Research

    • Modeled Canadian unemployment using ARIMA/SARIMA to forecast trends and evaluate model stability around COVID.
    • Validated model fit with residual diagnostics, confirming that residuals behaved approximately like white noise.
    • Benchmarked a seasonal MA model against a non-seasonal baseline for Jan – Oct 2025, achieving 30% lower RMSE/MAE.
    • R
    • Time-Series Analysis
    • Model Diagnostics
    • Forecast Evaluation Metrics
  3. MAY 2025 – AUG 2025

    News-Driven Stock Price Movement Prediction · Course / Research

    • Fine-tuned a pre-trained BERT model in TensorFlow on headlines to generate embeddings.
    • Optimized a dual-stream Random Forest that beat a FinBERT benchmark by 12%.
    • Implemented a 60/20/20 time-series split with 5-fold time-series cross-validation.
    • Python
    • TensorFlow
    • BERT
    • FinBERT
    • Kaggle
  4. JAN 2025 – APR 2025

    Greater Victoria Region House Price Prediction · Course / Research

    • Built a Tableau choropleth using GeoJSON multipolygons that highlights city-level affordability for Greater Victoria.
    • Trained and compared linear and log-transformed regression models in R.
    • Reported a test RMSE of $707,446, which affects forecast reliability for real-dollar home price estimates.
    • R
    • Tableau
    • GeoJSON