Liam Frank

About Me

Hi, I'm Liam Frank, a Junior Data Scientist with hands-on experience in machine learning, deep learning, and real-time data systems. I'm currently pursuing a Master's in Data Science through Graceland University's 4+1 program, where I also completed dual undergraduate degrees in Data Science and Mathematics, with a minor in Computer Science. At SAIM, I’ve worked on production-grade AI and analytics projects in the commercial aviation sector, building models for predictive maintenance, anomaly detection, and asset identification.

Experience

Junior Data Scientist

SAIM | Overland Park, KS

January 2025 - Present

  • Work on a cross-functional team with Software, IoT, and Database Engineers within the B2B SaaS space
  • Designed and deployed a transfer learning-based convolutional neural network to label fuel facility assets in AutoCAD drawings, improving accuracy over the previous model by more than 20% to 98% and automating data labeling that was previously done manually
  • Developed an autoencoder model to detect anomalies in high-frequency centrifugal pump vibration distributions, enabling predictive maintenance and early fault detection
  • Deployed an Azure Stream Analytics pipeline to over 50 commercial aviation fuel farms nationwide to monitor real-time sensor data for anomalies and quality issues, improving system reliability and detecting potentially faulty sensors
  • Designed and implemented an automated hiring forecast model and reporting system that analyzed historical project data, discipline-specific utilization rates, and future projects to generate a strategic hiring plan and determine optimal staffing levels for each engineering discipline

Data Science Intern

SAIM | Overland Park, KS

May 2024 - August 2024

  • Developed an ensemble time series forecast composed of ARIMA, Holt's exponential smoothing, and definite integration to predict the remaining useful life of fuel filter vessels from differential pressure data
  • Researched and tested unsupervised machine learning algorithms for anomaly detection in centrifugal fuel pumps
  • Worked alongside Mechanical Engineers to develop and revise pump train sensor requirements that support the predictive maintenance initiatives of SAIM and Argus Consulting

Education

Master's in Data Science and Analytics

Graceland University | Lamoni, IA

August 2023 - December 2025

  • 4.0 GPA

Bachelor of Science in Data Science

Bachelor of Arts in Mathematics

Graceland University | Lamoni, IA

August 2021 - December 2024

  • 3.9 GPA
  • 4x President's List
  • 2x Honor List
  • 1x Dean's List
  • Freeman Award for Student Athletes
  • 4x Heart of America Conference Scholar Award
  • Academic Honors Scholarship

Projects

Deep Learning

  • Deep Convolutional Neural Network built using transfer learning and the VGG16 architecture to classify satellite images of military aircraft
  • Dataset included nearly 7,000 satellite images of 11 distinct aircraft types, as well as satellite images of bare land
  • Employed image augmentation techniques at each training epoch to combat data size limitations and prevent overfitting
  • Accuracy of 96% on testing data
  • Deployed the model to a Streamlit app hosted on a Hugging Face Space for real-time classification of user-uploaded aircraft images

View Project

  • Locally hosted, PDF-based RAG (retrieval-augmented generation) application running on Ollama
  • Built with LangChain, a ChromaDB vector database, and Llama 3.2

View Project

  • Deep Feed-Forward Artificial Neural Network and XGBoost to classify default status of home-equity loans
  • Trained on just 6,000 observations and 10 features, the ANN reached a test accuracy of 80% and the XGBoost model a test accuracy of 93%

View Project

Natural Language Processing and Text Classification

  • Performed cleaning of raw text data through punctuation removal, case standardization, stopword removal, spell checking, lemmatization, and whitespace tokenization.
  • Tokenized words were transformed into numerical features using unigram (1-gram) text vectorization and the bag-of-words model, resulting in a matrix of dimensions 50,422 x 20,818.
  • Principal Component Analysis was employed for dimensionality reduction, shrinking the dataset from over 20,000 features to just 1,000 while retaining 44% of total variance.
  • Synthetic Minority Oversampling Technique was utilized to address class imbalances in training data.
  • Numerous supervised machine learning classification models were tested including Naive Bayes, Decision Tree, Random Forest, XGBoost, and Support Vector Machines with both linear and non-linear kernel functions.
  • The best-performing models were a Random Forest of 200 trees and an XGBoost model of 400 trees. Both achieved an accuracy of 97% and an F1 score of 0.97.

View Project

Exploring Regression Models to Predict Certified General Aviation Aircraft Metrics

  • Multivariate Linear Regression to predict aircraft stall speed
  • Multivariate Linear Regression to predict aircraft horsepower
  • Multivariate Linear Regression to predict aircraft wingspan
  • Logistic Regression to classify number of engines

View Project

NYC Flights

  • EDA data visualizations
  • Logistic Regression to classify significant departure delays
  • K-means Clustering

View Project

The ETL Process of FIFA 23 Player Ratings

  • Web Scraping
  • Data Cleaning and Feature Engineering
  • LASSO Regression to predict player rating
  • Principal Component Analysis for dimensionality reduction

View Project

Cancer Crude Prevalence Prediction and Origin Classification

  • Univariate Linear Regression to predict cancer crude prevalence
  • Multivariate Linear Regression to predict cancer crude prevalence
  • Support Vector Machine to classify an 8-level factor of cancer crude prevalence
  • K-means Clustering

View Project

Skills

Contact Me

Personal Email: liamdanielfrank@gmail.com

Student Email: liam3@sting.graceland.edu

Work Email: lfrank@saim.com