AI Engineering for Unstructured Financial Data

Vector Databases · Signal Extraction · Model Finetuning

Hybrid alt-data engineer and risk modeler with a background in fintech underwriting, financial data science, and full-stack ML deployment.

LinkedIn GitHub

About Me

I'm a hybrid AI engineer and quantitative modeler with a background that bridges fintech infrastructure, machine learning research, and full-stack AI systems.

At Katapult, I helped build and scale real-time underwriting pipelines that powered over $250M in annual originations. My work combined applied machine learning, fraud detection, and dynamic pricing — with direct financial impact. As Risk Infrastructure Lead, I also led the buildout of production tooling and model deployment architecture, driving both velocity and rigor across the ML lifecycle.

Alongside industry work, I've contributed to applied ML research during my M.S. at NYU Courant, with projects spanning natural language processing, vector representations, and optimization. That academic foundation — in linear algebra, probability, and numerical methods — still informs how I design and interpret models in high-stakes, signal-rich environments.

Today, I'm focused on systematic signal extraction from unstructured data — building full-stack applications that turn transcripts, reviews, and filings into real-time trading intelligence. I'm fluent in modern vector DBs, LLM pipelines, retrieval-augmented generation (RAG), and custom API integrations.

Technical Stack

Production-proven technologies for building scalable, low-latency systems that process financial data and generate alpha.

AI Infrastructure & Cloud Deployment

Scalable infrastructure for AI applications and cloud deployment.

Python ecosystem (NumPy, Pandas, PyTorch, Scikit-learn)

SQL (PostgreSQL, MySQL)

Docker containers & GitHub Actions CI/CD

AWS (S3, Lambda, Step Functions, CloudWatch)

Infrastructure-as-code with Terraform

Machine Learning & Risk Modeling

Advanced ML for risk assessment and predictive modeling.

Credit-risk & fraud scoring pipelines

Tree ensembles (XGBoost, Random Forest, LightGBM)

Logistic/GLM calibration & probability scoring

Time-series cross-validation

Feature engineering & importance analysis

Model governance & A/B evaluation

NLP, LLMs & Vector Retrieval

Harnessing NLP and LLMs for intelligent applications.

Text preprocessing (spaCy, regex, NLTK)

Sentence & transformer embeddings

Vector DBs: FAISS, pgvector, Pinecone

LangChain orchestration

OpenAI API, Gemini, Hugging Face Transformers

Retrieval-augmented generation (RAG) & semantic search

Let's Connect

I'm always open to discussing new projects, collaborations, or opportunities. Feel free to get in touch.

Location

Miami, Florida

Email

levente@journeymanai.io

Send a Message

Have a question or proposal? I'd love to hear from you.