About
AI Infrastructure Engineer | Inference Optimization Specialist
Hi, I'm Efrat Bdil 👋
I work in AI infrastructure, with a deep focus on the inference stage: the moment a trained model has to work in production, handle real workloads, and respond with stable, predictable latency.
My work sits at the intersection of software, models, and infrastructure. I develop and optimize AI inference pipelines on top of the NR1 platform, our company's dedicated chip.
This includes working with PyTorch-based models — in both Vision (like YOLO) and NLP/LLMs (like BERT) — and integrating them into full pipelines, from input to output.
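To make the "from input to output" idea concrete, here is a minimal sketch of such a pipeline in plain Python. The model call is stubbed out, and all the function names (`preprocess`, `run_model`, `postprocess`) are illustrative, not part of the NR1 stack; in a real pipeline the middle step would invoke a compiled PyTorch model.

```python
# Illustrative inference pipeline: input -> preprocess -> model -> postprocess.
# The "model" here is a stub; in practice it would be a PyTorch model
# (e.g. BERT) compiled for the target accelerator.

def preprocess(raw_text: str) -> list[str]:
    # Whitespace tokenization as a stand-in for a real tokenizer
    # (e.g. BERT's WordPiece).
    return raw_text.lower().split()

def run_model(tokens: list[str]) -> list[float]:
    # Stub "model": score each token by its length.
    return [len(t) / 10.0 for t in tokens]

def postprocess(scores: list[float]) -> dict:
    # Reduce raw per-token scores into a response payload.
    return {"num_tokens": len(scores), "max_score": max(scores)}

def pipeline(raw_text: str) -> dict:
    return postprocess(run_model(preprocess(raw_text)))
```

The value of structuring it this way is that each stage can be profiled and swapped independently, which matters when only the model step moves to dedicated hardware.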
What I Do Day-to-Day
🔍 Performance Analysis
A key part of my work is understanding why a system is slow, not just where. I profile systems, debugging and analyzing latency, throughput, and bottlenecks. I compare runs across different hardware to understand what changes when moving from general-purpose to dedicated hardware.
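As a rough sketch of what "measuring latency and throughput" looks like in practice, here is a minimal timing harness. It is a generic illustration, not tooling from my actual stack; the warmup count, run count, and percentile choices are arbitrary placeholders.

```python
import statistics
import time

def measure_latency(fn, n_warmup: int = 10, n_runs: int = 100) -> dict:
    """Time repeated calls to fn and summarize latency and throughput.

    Warmup runs are excluded so one-time costs (caches, JIT compilation,
    allocator behavior) don't skew the steady-state numbers.
    """
    for _ in range(n_warmup):
        fn()
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples) * 1000,
        "p99_ms": samples[int(0.99 * (len(samples) - 1))] * 1000,
        "throughput_rps": len(samples) / sum(samples),
    }
```

Running the same harness on two hardware targets and diffing the p50 vs. p99 gap is often the first hint of where a bottleneck lives: a wide gap usually points at contention or queuing rather than raw compute.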
🛠️ Infrastructure Development
Day-to-day, I work heavily with Python and Linux: writing Bash scripts, building Docker images, running containers, and maintaining complex benchmarking environments.
⚙️ Automation & CI/CD
There's also quite a bit of automation: internal tools, CI/CD processes with Jenkins, working with Artifactory, and infrastructure that ensures every change is measured, tested, and analyzed.
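One small example of "every change is measured, tested, and analyzed": a latency regression gate that a CI step can run after each build. This is an illustrative sketch, not my team's actual tooling, and the 5% tolerance is an arbitrary placeholder.

```python
def check_regression(baseline_ms: float, current_ms: float,
                     tolerance: float = 0.05) -> bool:
    """Return True if current latency stays within tolerance of the baseline.

    A change that slows latency by more than `tolerance` (here 5%)
    fails the gate, turning a performance regression into a failed build
    instead of a surprise in production.
    """
    return current_ms <= baseline_ms * (1 + tolerance)
```

Wiring a check like this into Jenkins means performance is treated the same way as correctness: a number that every commit is held accountable to.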
🤝 Team Collaboration
The work is not isolated. I regularly collaborate with hardware, compiler, and backend teams to turn models into code that actually runs in production. That means understanding integrated hardware-software systems and diagnosing issues that may originate in drivers, configuration, or communication interfaces.
Technologies & Skills
🧠 AI & ML
💻 Programming & Tools
🚀 DevOps & CI/CD
📊 Performance & Optimization
About This Blog
This blog was created to document what I learn and share knowledge about:
- AI Inference — How to run models in production efficiently
- Performance Optimization — Techniques to improve speed and response time
- Infrastructure & DevOps — Docker, CI/CD, automation
- Hardware Acceleration — GPUs, dedicated accelerators, and NR1
- Profiling & Debugging — Tools and methods for performance analysis
Much of the learning happens through doing: understanding how small decisions affect performance, and how a complete system behaves when it meets real workloads.
Get in Touch
Feel free to read, share, and comment on the posts. Comments and community feedback help all of us learn more!