MODERN DATA SCIENTIST
-
MATH & STATISTICS
-
Machine learning
-
Statistical modeling
-
Experiment design
-
Bayesian inference
-
Supervised learning: decision trees, random forests, logistic regression
-
Unsupervised learning: clustering, dimensionality reduction
-
Optimization: gradient descent and variants
-
DOMAIN KNOWLEDGE & SOFT SKILLS
-
Passionate about the business
-
Curious about data
-
Influence without authority
-
Hacker mindset
-
Problem solver
-
Strategic, proactive, creative, innovative and collaborative
-
PROGRAMMING & DATABASE
-
Computer science fundamentals
-
Scripting language, e.g. Python
-
Statistical computing package, e.g. R
-
Databases: SQL and NoSQL
-
Relational algebra
-
Parallel databases and parallel query processing
-
MapReduce concepts
-
Hadoop and Hive/Pig
-
Custom reducers
-
Experience with XaaS like AWS
-
COMMUNICATION & VISUALIZATION
-
Able to engage with senior management
-
Storytelling skills
-
Translate data-driven insights into decisions and actions
-
Visual art design
-
R packages like ggplot or lattice
-
Knowledge of any visualization tools, e.g. Flare, D3.js, Tableau