Projects

A list of small data science and coding projects, TILs (Today I Learned), and ongoing research projects, with links to supplementary resources (e.g., code, models, and online appendices).

2023

Fuzzy Name Matcher

Gradio app that performs fuzzy matching on entity names to merge financial datasets that lack unique keys. Supports Docker deployment.
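To illustrate the general idea of merging on fuzzy name matches, here is a minimal sketch using only Python's standard library. It is not the app's actual implementation: the use of `difflib`, the helper names (`normalize`, `similarity`, `fuzzy_merge`), the list of legal suffixes, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc", "plc", "ag"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    tokens = [tok for tok in cleaned.split() if tok not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized entity names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_merge(left, right, key="name", threshold=0.85):
    """Join each row in `left` to its best-scoring row in `right`.

    Rows are plain dicts. A left row is kept only if its best match
    reaches `threshold`; values from the left row win on key collisions.
    """
    merged = []
    for l_row in left:
        best_row, best_score = None, 0.0
        for r_row in right:
            score = similarity(l_row[key], r_row[key])
            if score > best_score:
                best_row, best_score = r_row, score
        if best_row is not None and best_score >= threshold:
            merged.append({
                **best_row,
                **l_row,
                "match_name": best_row[key],
                "match_score": round(best_score, 3),
            })
    return merged
```

The pairwise comparison is quadratic in the number of rows, which is fine for small datasets; larger ones would typically need a blocking step (e.g., comparing only names that share a first token) before scoring.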

2022

The DreamBooth Technique

DreamBooth is a fine-tuning technique for large, pretrained text-to-image models (e.g., DALL-E 2, Imagen, Stable Diffusion). Based on a small reference set of training images of a given subject or object (henceforth concept), the DreamBooth technique learns a custom identifier for the given concept and implants the concept embedding into the model’s output domain. This enables the model to synthesize high-quality images of the underlying concept in different contexts and settings.

December 30, 2022

Holbox, Mexico


By Simon Schölzel in Project

Ungreenwash

This project uses OpenAI’s LLMs and publicly available data, including ESG reports, SEC 10-K filings, and earnings call transcripts, to build an app that searches and summarizes these sources, helping users who need ESG-related information invest responsibly.

November 12 – 13, 2022

Cyclops Mountains, Papua


By Simon Schölzel in Project

GitHub

SEC EDGAR Scraper

CLI tool for downloading various types of SEC filings from the EDGAR database.

June 24, 2022

Gondwana, Ediacaran


By Simon Schölzel in Project

GitHub

CLIP-Guided Image Synthesis

A write-up summarizing what I learned from experimenting with CLIP-guided image synthesis. It covers VQGAN, CLIP, and inference-by-optimization, as well as various text-to-image and image-to-image experiments.

June 18, 2022

Atlantis, Atlantic Ocean


By Simon Schölzel in Project