Hi, I’m Simon. 🪄
I am a data scientist at Provinzial, where I work on RAG systems and domain-specific AI applications. I am driven by curiosity and approach my work with a scientific mindset. I enjoy deep work and continuously learning new things.
I hold a PhD in Accounting. In my research, I have broadly focused on measurement problems in accounting. I have applied traditional quantitative methods as well as novel techniques from machine learning, natural language processing, interpretable machine learning, and causal machine learning to evaluate the prediction of accounting estimates, the analysis of corporate narratives, and the estimation of causal effects.
- Data Ingestion & Processing
  - Web Scraping: rvest, Beautiful Soup, Selenium
  - Streaming: Apache Kafka
  - Query Languages: SQL
- Data Analysis & Visualization
  - Data Wrangling: pandas, tidyverse
  - Plotting Libraries: ggplot2
  - Dashboards & Visual Analytics: Kibana
- Machine Learning & Statistical Modeling
  - Machine Learning: tidymodels, scikit-learn, DoubleML
  - Model Interpretability: DALEX
  - Data Labeling: Prodigy
- Deep Learning & Natural Language Processing
  - DL Frameworks: PyTorch, Transformers
  - NLP: spaCy
- Experimentation & Prototyping
  - Experiment Tracking: Weights & Biases
  - Interactive Prototyping: Gradio
- Reproducibility & Documentation
  - Literate Programming: Jupyter Notebooks, R Markdown
- Software Engineering & APIs
  - API Development: FastAPI
  - Version Control: Git
- MLOps & Deployment
  - CI/CD & Orchestration: Kubernetes, ArgoCD
- Infrastructure & Observability
  - Monitoring & Logging: Dynatrace, Kibana
  - Search & Analytics Databases: Elasticsearch