Let's reproduce NanoGPT with Jax (Part 2) 175k->1350k tokens/sec in Single GPUenable Mixed Precision Training, Gradient Checkpointing, Gradient Clipping, and Gradient Accumulation etc for 1350k tokens/secAug 41Aug 41
Published inTowards Data ScienceLet’s reproduce NanoGPT with JAX!(Part 1)Part 1: Build 124M GPT2 with JAX. Part 2: Optimize the training speed in Single GPU. Part 3: Multi-GPU Training in Jax.Jul 21Jul 21
Published inAnalytics VidhyaSmart Water: Data Labeling with Active Learning And H2O.aiIntuitionMay 16, 2021May 16, 2021
Published inAnalytics VidhyaML Code Dive: K-Nearest NeighborsML Code Dive is a series of articles to deep dive machine learning algorithms without any ML libraries. It not only allows you to better…Apr 9, 2021Apr 9, 2021
Published inDataDrivenInvestorTraining the Lunar Lander Agent With Deep Q-Learning and Its variantsAbstractMar 22, 20211Mar 22, 20211
Published inDataDrivenInvestorML System Design Case: Recommend Restaurant in UberEatsTL;DR: the article shows a framework to solve a ML System design Case Study, And we use UberEats ReCommendation as an Example to show the…Mar 22, 2021Mar 22, 2021
Published inDataDrivenInvestorRethinking the right metrics for fraud detectionThe precision, recall, F1-score may not really work…Aug 6, 20192Aug 6, 20192
Published inDataDrivenInvestorRecommendation System: A problem statement viewThree types of problem statement on recommendation system.Jun 17, 2019Jun 17, 2019
Published inDataDrivenInvestorEmbeddings at eCommercePower personalization with deep learning and pre-trained product embeddingsMay 1, 20193May 1, 20193
What determines a good data scientist?In the big data world, more and more people hope to be a data scientist, utilizing the power of data and machine learning to solve real…Mar 16, 2019Mar 16, 2019