RLHF vs. RLAIF: Fine-Tuning LLMs for Better Alignment (OTS, SFT, PPO, Jailbreak) Large Language Models (LLMs) like GPT-4, LLaMA 3, and Claude are redefining natural language processing. Despite their advancements…
Optimizing Azure OpenAI Service: Base Model Deployment, Fine-Tuning, and Decoding Parameters Azure OpenAI Service offers powerful tools to deploy, fine-tune, and interact with GPT models, making it essential to understand the…
RAG vs. Fine-Tuning: When to Use, Combine, and Optimize for Best Results When building or optimizing AI models, two powerful techniques often come into play: Fine-tuning and RAG (Retrieval-Augmented Generation)…
Paper Review — Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution… Debugging programs is essential yet challenging, even for advanced Large Language Models (LLMs). In their ACL 2024 paper, “Debug like a…
IMG2TEXT-Part2. OFA, CLIP Interrogator and ViT Continuing from Part 1, we are going to look into the CLIP Interrogator, OFA model, and ViT model and ensemble them. Most of the code is…
IMG2TEXT-Part1. Background (Stable Diffusion, CLIP, Prompt) In this article, I’d like to cover the background needed to implement CLIPInterrogator+OFA+ViT_LB0.568. Part 2 will cover the…
Google ISLR Transformer with W&B (Part 2) In this article, I’ll be showing you how to create and train a model for the Kaggle ASL (American Sign Language) recognition competition…
Google ASL 1. Process Data with W&B 🐝 Today, I’m going to explain the dataset and how to process it for a Kaggle competition on ASL (American Sign Language), Google - Isolated…
Paper Review — Strided Transformer (TMM 2022) Strided Transformer is a monocular 3D pose estimation model which lifts a long sequence of 2D joint locations to a single 3D pose.
Paper Review — VideoPose3D (CVPR 2019) 3D human pose estimation in video with temporal convolutions and semi-supervised training
[PyTorch] Simple 3D Pose Baseline implementation (ICCV’17) In this post, I review Simple 3D Pose Baseline (A simple yet effective baseline for 3d human pose estimation, also called SIM) which is…
HRNet: Code Explained HRNet (Deep High-Resolution Representation Learning for Human Pose Estimation) is a state-of-the-art algorithm in the field of semantic…
Simple ConvNet with NumPy In this post, we create a simple convolutional neural network (SimpleConvNet) using only NumPy to classify MNIST images. The code is from the book ‘Deep Learning from Scratch’. Let’s first check the architecture of SimpleConvNet and the notation. Architecture N: the number of
Training Basic Two Layer Network with Numpy In this post, we develop and train a two-layer network to classify the MNIST dataset. There are mainly two parts…
Simple Affine Layer with Numpy This post explains the Affine layer based on my understanding. The outline of the post is as follows:
Optimizers Optimizers are algorithms or methods used to minimize an error function (loss function) or to maximize the efficiency of production. The…