Research

AutoTimes: Study the Source Code

AutoTimes Overview: We continue to explore the fusion of LLMs and time series. This more recent work reaches new state-of-the-art performance on time series forecasting compared to the previous two (TimeLLM and TEST). The repository can be found here. Compared with the previous approaches, the main difference is that AutoTimes leverages the autoregressive nature of LLMs, so it can generate predictions of arbitrary length. It uses a similar patch-based encoding for the time series, but also treats the timestamp text as positional encodings....
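To make the two ideas in this excerpt concrete (patch-based encoding and autoregressive generation of arbitrary length), here is a minimal sketch in plain Python. It is not the AutoTimes code: `to_patches`, `rollout`, and `persistence` are hypothetical names, and `persistence` is only a stand-in for the LLM that repeats the last patch.

```python
from typing import Callable, List

Patch = List[float]


def to_patches(series: List[float], patch_len: int) -> List[Patch]:
    """Split a series into non-overlapping patches, dropping any ragged tail."""
    n_patches = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]


def rollout(patches: List[Patch],
            predict_next: Callable[[List[Patch]], Patch],
            horizon: int) -> List[float]:
    """Autoregressively predict patches until `horizon` values are available."""
    context = [list(p) for p in patches]
    forecast: List[float] = []
    while len(forecast) < horizon:
        next_patch = predict_next(context)  # one "next token" = one patch
        context.append(next_patch)          # feed the prediction back in
        forecast.extend(next_patch)
    return forecast[:horizon]


def persistence(context: List[Patch]) -> Patch:
    """Hypothetical stand-in for the LLM: repeat the most recent patch."""
    return list(context[-1])


series = [float(i % 4) for i in range(12)]
patches = to_patches(series, patch_len=4)
forecast = rollout(patches, persistence, horizon=10)
```

Because each predicted patch is appended to the context, the horizon is not tied to the patch length or to a fixed output head; the loop simply runs until enough values exist.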

Text Prototype Aligned Embedding for Time Series: Study the Source Code

As in the previous post, I will analyze the source code of the Text Prototype Aligned Embedding for Time Series. The repository can be found here. The main idea of this paper is very similar to that of TimeLLM, which is to integrate temporal information into the LLM training process. The difference lies in how the alignment between text and time series is established. As in the previous post on TimeLLM, I will first analyze the input data and the overall forward pass architecture....

TimeLLM: Study the Source Code

Time series forecasting has been a hot topic in deep learning, and many interesting models have been proposed in the last few years. TimeLLM is a novel approach that integrates temporal information into the LLM training process. In this post, I will link the paper’s content with the code and discuss the implementation details. This is helpful for me since I plan to use LLMs for time series forecasting in the future....

Learning Volume with Neural Numerical Integration

Implementation of Neural Operators

Here I take notes on the PyTorch implementations of different types of neural operators: Fourier Neural Operator (FNO), Spherical FNO (SFNO), Markov Neural Operator (MNO), Geometry-Informed Neural Operator (GINO), and Physics-Informed Neural Operator (PINO). Fourier Neural Operator (FNO): The now classical work on the Fourier Neural Operator has been quite influential in the line of work known as neural partial differential equations (Neural PDEs). Spherical Neural Operator (SFNO): The spherical neural operator is a particular adaptation of the FNO to a spherical domain....

Writing Auto-regressive DataLoader for Lightning

Learning Stochastic Dynamics using Diffusion Models

In this blog post, I talk about the project Learning Stochastic Dynamics using Diffusion Models. The repository of this project will be open-sourced soon. Problem Definition: In this section we discuss the exact mathematical formulation of the learning problem. We start with the definition of the dataset: we assume that we observe $N$ 3-D trajectories of length $L$, e.g., a data sample is a tuple $\{(x_i,y_i,z_i), t_i\} \in \mathbb{R}^3 \times [0,T]$, where $[0,T]$ is discretized by the time stamps $\{t_i\}_{i=1}^L$....

Score SDE: Dissect

The discrete versions of diffusion models (DDPM, DDIM) are easier to understand and implement, but the same may not be said about their continuous counterpart. Here I attempt to take a dive into the official implementation of the paper on score-based generative modeling through stochastic differential equations, and try to map the implementation to the paper. This would help me (and potentially others) to adapt and modify the code base for my own use....

Spectral Methods Mode Visualization

In this post, I study and review spectral methods in PyTorch, with special focus on visualizing the frequency modes, both for the Discrete Fourier Transform (DFT) / Fast Fourier Transform (FFT) on a regular grid and the Spherical Fourier Transform (SFT). Discrete and Fast Fourier Transform for 2D and 3D signals: Mathematically, these two implement the standard Fourier transform in $\mathbb{C}^d$. In torch, there is a family of transform functions dealing with different scenarios:...
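As a small illustration of what a "frequency mode" is, the sketch below computes a naive 1D DFT with the Python standard library and checks where a pure cosine puts its energy. It is only a didactic stand-in (the function `dft` and the values of `N` and `k0` are chosen for illustration); in PyTorch, the same transform would be computed with the `torch.fft` family, e.g. `torch.fft.fft` or `torch.fft.fft2`.

```python
import cmath
import math
from typing import List


def dft(x: List[float]) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform: X[k] = sum_t x[t] e^{-2 pi i k t / N}."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


N = 16   # number of grid points (illustrative choice)
k0 = 3   # frequency of the test signal (illustrative choice)
signal = [math.cos(2 * math.pi * k0 * t / N) for t in range(N)]

magnitudes = [abs(c) for c in dft(signal)]

# Indices of the two largest coefficients: a real cosine of frequency k0
# shows up at bin k0 and at its mirror bin N - k0.
dominant = sorted(range(N), key=lambda k: magnitudes[k], reverse=True)[:2]
```

The mirrored bin is the reason real-input transforms such as `torch.fft.rfft` only return about half of the coefficients: for real signals the other half is redundant.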

PDE-Refiner: Implementation

Quick Overview of Main Result: PDE-Refiner is about ensuring stable long-time rollouts of dynamical systems. In this work, the authors identified empirically that neglecting non-dominant spatial frequency information (the high-frequency modes in the PDE solution) is the primary reason existing models fail at long temporal rollouts. To address this issue, they force the model to learn these modes by actively injecting noise at each step and predicting that noise, just like the mechanism in diffusion models....

Diffusion Transformer: End to End

There are two parts here: the diffusion model implementation, and the Vision Transformer based diffusion model implementation. For the diffusion model implementation (DDPM), see the other blog post. In this blog post, we look at the implementation of a latent diffusion model with a transformer backbone, in particular from the DiT paper with its GitHub repository. In the notation here, the shapes are by convention: $B$: batch size. $H$: height of image. $W$: width of image....

DDIM: Implementation

Note: I found a more detailed code walkthrough tool, annotated deep learning implementations. However, it seems that as a researcher, it is better to read the paper and the code yourself before looking for illustrations on that site. In this post I go over the implementation of the DDIM sampling method (for implicit models) and also some variants of the implicit diffusion model paradigm. It is advised to first read the DDPM post about the general training setup of the diffusion model; DDIM changes the inference / sampling process to make it more efficient....
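For orientation, a single deterministic DDIM update (the $\eta = 0$ case) can be sketched in plain Python as follows. This is not code from any repository: `ddim_step` is a hypothetical helper, samples are plain lists of floats, and `eps_pred` stands for whatever a trained noise-prediction network would output. `abar_t` and `abar_prev` denote the cumulative products $\bar\alpha_t$ and $\bar\alpha_{t-1}$ of the noise schedule.

```python
import math
from typing import List, Tuple


def ddim_step(x_t: List[float], eps_pred: List[float],
              abar_t: float, abar_prev: float) -> Tuple[List[float], List[float]]:
    """One deterministic DDIM update (eta = 0) from step t to the previous step."""
    sqrt_abar_t = math.sqrt(abar_t)
    sqrt_one_minus_abar_t = math.sqrt(1.0 - abar_t)
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = [(x - sqrt_one_minus_abar_t * e) / sqrt_abar_t
               for x, e in zip(x_t, eps_pred)]
    # Move the estimate to the (lower) noise level of the previous step,
    # reusing the same predicted noise instead of drawing fresh noise.
    a = math.sqrt(abar_prev)
    s = math.sqrt(1.0 - abar_prev)
    x_prev = [a * x0 + s * e for x0, e in zip(x0_pred, eps_pred)]
    return x_prev, x0_pred


# Toy check with a "perfect" noise prediction (values are illustrative).
x0 = [0.5, -1.0, 2.0]
eps = [0.3, -0.7, 1.1]
abar_t, abar_prev = 0.2, 0.6
x_t = [math.sqrt(abar_t) * a + math.sqrt(1.0 - abar_t) * e for a, e in zip(x0, eps)]
x_prev, x0_pred = ddim_step(x_t, eps, abar_t, abar_prev)
```

Since no fresh noise is drawn, the update only depends on the two noise levels, which is what allows sampling on a much shorter sub-sequence of time steps than the one used in training.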

DDPM: Implementation

A diffusion model’s standard implementation contains two parts: the backbone model (in the standard DDPM setting, this is the UNet), and the diffusion process (in the standard setting, this is Gaussian diffusion). The goal of this post is to gauge the standard practice in implementing the basic Denoising Diffusion Probabilistic Model (DDPM), from their official GitHub implementation. Unconditional DDPM: This is the basic version of DDPM, not used for conditional generation. The repository is structured so that each Python module under the denoising_diffusion_pytorch folder is self-contained....
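The "diffusion process" half can be illustrated without any backbone. The sketch below, in plain Python, builds a linear $\beta$ schedule and samples $x_t$ from $x_0$ in closed form, $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$. The function names are hypothetical and not taken from the repository; the schedule values ($T = 1000$, $\beta$ from $10^{-4}$ to $0.02$) are the common DDPM defaults and are assumed here.

```python
import math
import random
from typing import List, Tuple


def linear_betas(T: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Linearly spaced noise variances beta_1, ..., beta_T."""
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def alpha_bars(betas: List[float]) -> List[float]:
    """Cumulative products abar_t = prod_{s <= t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0: List[float], t: int, abar: List[float],
             rng: random.Random) -> Tuple[List[float], List[float]]:
    """Draw x_t ~ q(x_t | x_0) in closed form; also return the noise used."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = math.sqrt(abar[t])
    s = math.sqrt(1.0 - abar[t])
    x_t = [a * x + s * e for x, e in zip(x0, eps)]
    return x_t, eps


T = 1000
abar = alpha_bars(linear_betas(T))
rng = random.Random(0)  # fixed seed so the sketch is reproducible
x0 = [1.0, -2.0, 0.5]
x_t, eps = q_sample(x0, t=500, abar=abar, rng=rng)
```

The returned `eps` is the regression target when training a noise-prediction network: the backbone sees `x_t` and `t` and is trained to recover `eps`.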

ERA 5 Dataset: An Introduction from Deep Learning Perspective

In this post, I go through the steps to prepare data from the ERA5 weather dataset for machine learning (in particular, weather forecasting and climate modeling). Obtain ERA5 Dataset: The official website for the ERA5 dataset is located here; it contains weather data from the 1940s to the present. The data is obtained, at the most granular level, at 1 hour interval and with $0.25 \times 0.25$ degree for atmosphere and $0....

FUSE: Measure-theoretic Compact Fuzzy Set Embedding

The paper is officially accepted! See our OpenReview link. I will update this page with a brief intro very soon. 😂