🎯 ML/Data Science Engineer

Machine Learning & Data Science Technical Interview

👤 Candidate information

1
Easy
0 / 2 points
What is k-Fold Cross-Validation?
2
Easy
0 / 2 points
What is the difference between supervised and unsupervised learning?
3
Medium
0 / 3 points
What is reinforcement learning? What is it used for?
4
Medium
0 / 3 points
What is data/class imbalance? What consequences does it have, and how can they be mitigated?
5
Easy
0 / 2 points
What is feature engineering?
6
Medium
0 / 3 points
What are transformer-based models?
7
Hard
0 / 4 points
What is RAG (retrieval-augmented generation) and how does it relate to your expertise?
8
Medium
0 / 3 points
How is fine-tuning different from regular training?
9
Hard
0 / 4 points
How do the fine-tuning options and limitations differ between closed-source models (e.g., OpenAI GPT-4) and open-source models that you can self-host?
10
Hard
0 / 4 points
What parameter-efficient fine-tuning techniques exist for open models (e.g., LoRA, QLoRA), and how do they work conceptually?
11
Easy
0 / 2 points
What is the difference between a list and a tuple in Python? When would you use each?
12
Medium
0 / 3 points
What are decorators in Python?
13
Easy
0 / 2 points
What is the purpose of the .groupby() method in the Pandas library?
14
Medium
0 / 3 points
Explain the difference between a generator and a normal function that returns a list. Why are generators (which use the yield keyword) particularly beneficial when processing very large, potentially memory-intensive datasets in data science?
15
Critical
0 / 5 points
Design a complete MLOps pipeline for a production-grade recommendation system handling 1M+ daily predictions. Include model versioning, A/B testing, monitoring, and automated retraining.
16
Critical
0 / 5 points
You're building a fraud detection system with 99.5% legitimate transactions. Explain how you would handle extreme class imbalance, choose appropriate metrics, and ensure the model doesn't just predict "legitimate" for everything.
17
Critical
0 / 5 points
Design a distributed training strategy for a large-scale deep learning model (BERT-scale) across multiple GPUs and nodes. Include parallelism strategies, gradient synchronization, and optimization techniques.
18
Critical
0 / 5 points
Explain feature engineering strategies for time-series data in a production forecasting system. How do you handle seasonality, trends, and create lag features while avoiding data leakage?

📊 Total Score

0%
🟢 Easy
0/10
🟡 Medium
0/18
🔴 Hard
0/12
🔥 Critical
0/20