jshinm/project-binder

Introduction

This is a collection of research projects and software portfolios, categorized by topic within the fields of artificial intelligence and machine learning. Also included are coding exercises that demonstrate algorithmic problem-solving skills.

Table of Contents

  • About Me
  • Mostly Used Languages
  • AI/ML Research Projects
    • Extrapolative behavior of ML models
      • This project compares the extrapolative behavior of various ML models and reports quantitative and qualitative measures of their performance against that of humans on simple non-linear simulation datasets.
    • Automated end-to-end causal inference application
      • This project aims to bring exploratory data analytics and causal inference to general users who are unfamiliar with data science, particularly with in-depth machine learning models and the intricacies of causal inference.
    • Human extrapolative behavioral experiment via web application
      • The human counterpart of the extrapolation project, which provides comparative measures for investigating human-like behavior in ML algorithms such as random forest and deep neural networks. To deliver the behavioral experiment to 150 participants recruited on AWS, a web app was created and hosted on Heroku.
    • Probabilistic linkage of real-world clinical data
      • To link multiple datasets acquired from different sections of the JHU medicine network, this data management pipeline was built in support of the collective mission of CDEM. The pipeline uses the fastLink R package, which is based on the expectation-maximization algorithm, to automate probabilistic linkage of multiple datasets.
    • Multivariate time-series hologram biometric signal parsing
      • Real-world data often contains significant noise that hampers extraction of the signal of interest. This project aims to parse biometric signals from multivariate time-series hologram data in which the EEG readout coincides with other physiological responses such as heartbeat.
  • Coding

About Me

[Add contents here]

Mostly Used Languages

Top Langs

AI/ML Research Projects

The projects that involve machine learning and artificial intelligence are listed here.

  1. Extrapolative behavior of ML models

  2. Github Repo: https://github.com/jshinm/inductive-bias-experiment

    One purpose of machine learning models is to predict trends from patterns in the given data. The issue is that these ML models are interpolative by nature and do not perform well as extrapolators. Despite that, ML models are widely used to forecast uncharted territory. This project examines the extrapolative behavior of ML algorithms such as random forest, neural networks, and support vector machines, and measures their performance against that of humans.

    These ML algorithms are trained on non-linear simulation datasets (Gaussian XOR shown above), and their posterior probability distributions are evaluated on a grid for comparison. The line plot shows that, moving away from the origin, the Hellinger distance grows more for neural net posteriors than for both random forest and humans, which suggests that random forest behaves more like humans than neural nets do in this experiment.

    Because these non-linear datasets are not space-invariant, the posterior can be assessed piecewise. Evaluating along a line as a function of angle reveals a more drastic difference between neural nets and random forest: neural nets reach the limit of the posterior much faster than random forest, indicating that neural nets not only misrepresent the spiral simulation but are also more confident in their decisions.
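    As a rough illustration of the comparison described above, the following sketch computes the Hellinger distance between two binary-class posteriors evaluated on the same grid. The function names `hellinger` and `posterior_distance`, and the representation of a posterior as a list of P(class = 1 | x) values, are assumptions made for this example and are not taken from the linked repository.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def posterior_distance(post_a, post_b):
    """Per-grid-point Hellinger distance between two binary posteriors.

    Each argument is a list of P(class = 1 | x) values on the same grid
    (a hypothetical layout chosen for this sketch).
    """
    return [hellinger((a, 1.0 - a), (b, 1.0 - b)) for a, b in zip(post_a, post_b)]


if __name__ == "__main__":
    # Toy posteriors at three grid points, not real experimental data.
    model_a = [0.5, 0.9, 1.0]
    model_b = [0.5, 0.6, 0.0]
    print(posterior_distance(model_a, model_b))
```

    The distance is 0 when two posteriors agree at a grid point and 1 when they are fully confident in opposite classes, so it can be plotted against distance from the origin to compare models.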

  3. Automated end-to-end causal inference application

  4. Github Repo: --

    < content here >

  5. Human extrapolative behavioral experiment via web application

  6. Github Repo: https://github.com/jshinm/deepnet-behavioral

    < content here >

  7. Probabilistic linkage of real-world clinical data

  8. Github Repo: https://github.com/jshinm/probabilistic-linkage

    < content here >

  9. Multivariate time-series hologram biometric signal parsing

  10. Github Repo: https://github.com/jshinm/hologram-biometric-signal-parsing

    < content here >

Software Development

Current and past software projects.

| Project | Description | Code |
| --- | --- | --- |
| WebApp for machine versus human extrapolation experiment | Desc1 | Code1 |
| Web Scraper for data mining on GitHub repository | Desc2 | Code2 |
| P&L generator for tax report | Desc2 | Code2 |
| Pandarize | Desc2 | Code2 |
| Amortized loan simulator | Desc2 | Code2 |
| Omega Messenger | Desc2 | Code2 |
| FlipScope | Desc2 | Code2 |
| KeeWee | Desc2 | Code2 |

Kaggle Projects

Some data science side projects for Kaggle challenges

| Project | Code |
| --- | --- |
| Item1 | Code1 |
| Item2 | Code2 |

Coding

The following programming exercises cover various algorithms and data structures, with a dedicated section for sorting. Also included are SQL and Bash/shell coding challenges.

Algorithm & Data Structure

| Problem Name | Platform | Note |
| --- | --- | --- |
| Two Sum | LeetCode | Blind 75 Qs |
| Two Sum II - Input Array Is Sorted | LeetCode | |
| Two Sum IV - Input is a BST | LeetCode | |
| Reverse Integer | LeetCode | |
| Palindrome Number | LeetCode | |
| Roman to Integer | LeetCode | |
| Longest Common Prefix | LeetCode | |
| Longest Substring Without Repeating Characters | LeetCode | Blind 75 Qs |
| Container With Most Water | LeetCode | Blind 75 Qs |
| Three Sum | LeetCode | Blind 75 Qs |
| Remove Nth Node From End of List | LeetCode | Blind 75 Qs |
| Valid Parentheses | LeetCode | Blind 75 Qs |
| Merge Two Sorted Lists | LeetCode | Blind 75 Qs |
| Merge k Sorted Lists | LeetCode | Blind 75 Qs, only TLE Solution |
| Combination Sum | LeetCode | Blind 75 Qs |
| Search in Rotated Sorted Array | LeetCode | Blind 75 Qs |
| Group Anagrams | LeetCode | Blind 75 Qs |
| Rotate Image | LeetCode | Blind 75 Qs |
| Maximum Subarray | LeetCode | Blind 75 Qs |
| Climbing Stairs | LeetCode | Blind 75 Qs |
| Spiral Matrix | LeetCode | Blind 75 Qs |
| Jump Game | LeetCode | Blind 75 Qs |
| Merge Intervals | LeetCode | Blind 75 Qs |
| Insert Interval | LeetCode | Blind 75 Qs |
| Unique Paths | LeetCode | Blind 75 Qs, only TLE Solution |
| Sort Characters By Frequency | LeetCode | |
| Set Matrix Zeroes | LeetCode | Blind 75 Qs |
| Minimum Window Substring | LeetCode | Blind 75 Qs |
| Word Search | LeetCode | Blind 75 Qs |
| Decode Ways | LeetCode | Blind 75 Qs |
| BinaryGap | Codility | |
| CyclicRotation | Codility | |
| OddOccurrencesInArray | Codility | |
| FrogJmp | Codility | |
| PermMissingElem | Codility | |
| TapeEquilibrium | Codility | |
| FrogRiverOne | Codility | |
| PermCheck | Codility | |
| MaxCounters | Codility | |
| MissingInteger | Codility | |
| PassingCars | Codility | |
| CountDiv | Codility | |
| GenomicRangeQuery | Codility | |
| MinAvgTwoSlice | Codility | 50% Solution |
| Distinct | Codility | |
| MaxProductOfThree | Codility | |
| Triangle | Codility | |
| NumberOfDiscIntersections | Codility | |
| Brackets | Codility | |
| Fish | Codility | 87% Solution |
| Nesting | Codility | |
| StoneWall | Codility | 92% Solution |
| Dominator | Codility | |
| EquiLeader | Codility | 33% Solution |
| MaxProfit | Codility | |
| MaxSliceSum | Codility | |
| MaxDoubleSliceSum | Codility | |
| CountFactors | Codility | |

Data Structure Examples

| Name | Example |
| --- | --- |
| Heap | Note |

Sorting

| Name | Best Time | Average Time | Worst Time | Worst Space | Stability |
| --- | --- | --- | --- | --- | --- |
| Bubble Sort | Ω(N) | Θ(N^2) | O(N^2) | O(1) | Stable |
| Selection Sort | Ω(N^2) | Θ(N^2) | O(N^2) | O(1) | |
| Insertion Sort | Ω(N) | Θ(N^2) | O(N^2) | O(1) | Stable |
| Shell Sort | Ω(N log N) | Θ(N log^2 N) | O(N log^2 N) | O(1) | |
| Heap Sort | Ω(N log N) | Θ(N log N) | O(N log N) | O(1) | |
| Merge Sort | Ω(N log N) | Θ(N log N) | O(N log N) | O(N) | |
| Quick Sort | Ω(N log N) | Θ(N log N) | O(N^2) | O(log N) | |
| Counting Sort | Ω(N+K) | Θ(N+K) | O(N+K) | O(K) | |
| Tree Sort | Ω(N log N) | Θ(N log N) | O(N^2) | O(N) | |
| Tim Sort | Ω(N) | Θ(N log N) | O(N log N) | O(N) | |
| Smooth Sort | Ω(N) | Θ(N log N) | O(N log N) | O(1) | |
| Radix Sort | Ω(NK) | Θ(NK) | O(NK) | O(N+K) | |
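
To make the insertion sort entries above concrete (Ω(N) best case, O(N^2) worst case, stable), here is a minimal sketch that counts key comparisons. The function name `insertion_sort` and its `key` parameter are choices made for this illustration and are not taken from the linked exercise code.

```python
def insertion_sort(items, key=lambda x: x):
    """Return a sorted copy of items and the number of key comparisons made."""
    out = list(items)
    comparisons = 0
    for i in range(1, len(out)):
        current = out[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            # Stop at the first element that is <= current; using <= rather
            # than < keeps equal keys in their original order (stability).
            if key(out[j]) <= key(current):
                break
            out[j + 1] = out[j]  # shift the larger element one slot right
            j -= 1
        out[j + 1] = current
    return out, comparisons


if __name__ == "__main__":
    n = 5
    # Already sorted input: one comparison per element after the first.
    print(insertion_sort(list(range(n))))
    # Reversed input: every pair is compared, N(N-1)/2 comparisons.
    print(insertion_sort(list(range(n, 0, -1))))
    # Equal keys keep their input order.
    print(insertion_sort([(1, "a"), (0, "b"), (1, "c")], key=lambda t: t[0]))
```

On already sorted input of length N the sketch makes N-1 comparisons, while reversed input needs N(N-1)/2, which matches the best and worst time bounds listed for insertion sort.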

Database [SQL (syntax note) || Python]

| Problem Name | Platform | Language |
| --- | --- | --- |
| Combine Two Tables | LeetCode | SQL |
| Second Highest Salary | LeetCode | SQL |
| Nth Highest Salary | LeetCode | SQL |
| Rank Scores | LeetCode | SQL |
| Consecutive Numbers | LeetCode | SQL |
| Employees Earning More Than Their Managers | LeetCode | SQL |
| Duplicate Emails | LeetCode | SQL |
| Customers Who Never Order | LeetCode | SQL |
| Department Highest Salary | LeetCode | SQL |
| Department Top Three Salaries | LeetCode | SQL |
| Delete Duplicate Emails | LeetCode | SQL |
| Rising Temperature | LeetCode | SQL |
| Trips and Users | LeetCode | SQL |
| Big Countries | LeetCode | SQL |
| Classes More Than 5 Students | LeetCode | SQL |
| Human Traffic of Stadium | LeetCode | SQL |
| Not Boring Movies | LeetCode | SQL |
| Exchange Seats | LeetCode | SQL |
| Swap Salary | LeetCode | SQL |
| Reformat Department Table | LeetCode | SQL |
| SqlEventsDelta | Codility | SQL |
| SqlWorldCup | Codility | SQL |
| Weather Observation | HackerRank | SQL |
| SQL Project Planning | HackerRank | SQL |
| Interviews | HackerRank | SQL |
| 15 Days of SQL | HackerRank | SQL |
| Japanese Population | HackerRank | SQL |
| Aggregation | HackerRank | SQL |
| Acceptance Rate By Date | StrataScratch | Python |
| Highest Energy Consumption | StrataScratch | Python |
| Finding User Purchases | StrataScratch | Python |
| Popularity Percentage | StrataScratch | Python |
| Highest Cost Orders | StrataScratch | Python |
| Users By Avg Session time | StrataScratch | Python |
| Top 5 States With 5 Star Businesses | StrataScratch | Python |
| Finding Updated Records | StrataScratch | Python |
| Risky Projects | StrataScratch | Python |
| Number Of Bathrooms And Bedrooms | StrataScratch | Python |
| Customer Details | StrataScratch | Python |
| SMS Confirmations From Users | StrataScratch | Python |
| Customer Revenue In March | StrataScratch | Python |
| Find the rate of processed tickets for each type | StrataScratch | Python |
| Find the overall friend acceptance count for a given date | StrataScratch | Python |
| Daily Interactions By Users Count | StrataScratch | Python |
| Successfully Sent Messages | StrataScratch | Python |
| Popularity of Hack | StrataScratch | Python |
| Most Active Users On Messenger | StrataScratch | Python |
| Average Salaries | StrataScratch | Python |
| Spam Posts | StrataScratch | Python |
| Total Cost Of Orders | StrataScratch | Python |
| Classify Business Type | StrataScratch | Python |
| Top Cool Votes | StrataScratch | Python |
| Order Details | StrataScratch | Python |
| Workers With The Highest Salaries | StrataScratch | Python, SQL |
| Reviews of Categories | StrataScratch | Python, SQL |
| Highest Salary in Department | StrataScratch | Python, SQL |
| Distances Traveled | StrataScratch | Python, SQL |
| Gender with Generous Reviews | StrataScratch | Python, SQL |
| Rank Variance Per Country | StrataScratch | SQL |
| Users By Average Session Time | StrataScratch | SQL |
| Finding User Purchases | StrataScratch | SQL |
| Highest Cost Orders | StrataScratch | SQL |
| Total Cost of Orders | StrataScratch | SQL |
| Ranking Most Active Guests | StrataScratch | SQL |
| Algorithm Performance | StrataScratch | SQL |

Bash & Shell

| Problem Name | Platform |
| --- | --- |
| Comparing Numbers | HackerRank |
| Comparing Strings | HackerRank |
| Loop and Skip | HackerRank |
| Arithmetic Operations | HackerRank |
| Compute Average | HackerRank |
| Cut Command | HackerRank |
| Text Processing | HackerRank |
