MODERN DATA SCIENTIST

MATH & STATISTICS

  • Machine learning

  • Statistical modeling

  • Experiment design

  • Bayesian inference

  • Supervised learning: decision trees, random forests, logistic regression

  • Unsupervised learning: clustering, dimensionality reduction

  • Optimization: gradient descent and variants
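
As a rough illustration of the gradient descent item above, here is a minimal sketch in plain Python. The function name `gradient_descent`, its parameters, and the toy objective f(x) = (x - 3)^2 are assumptions made for this example; they do not come from the original list.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move a small distance in the direction that decreases the objective.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)  # very close to 3, the minimizer of f
```

The "variants" mentioned in the list (for example stochastic or momentum-based updates) change how each step is computed, but they keep this same loop structure.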

DOMAIN KNOWLEDGE & SOFT SKILLS

  • Passionate about the business

  • Curious about data

  • Influence without authority

  • Hacker mindset

  • Problem solver

  • Strategic, proactive, creative, innovative and collaborative

PROGRAMMING & DATABASE

  • Computer science fundamentals

  • Scripting language, e.g. Python

  • Statistical computing package, e.g. R

  • Databases: SQL and NoSQL

  • Relational algebra

  • Parallel databases and parallel query processing

  • MapReduce concepts

  • Hadoop and Hive/Pig

  • Custom reducers

  • Experience with XaaS such as AWS
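
To make the MapReduce and custom reducer items above a little more concrete, the following is a single-process sketch of the map, shuffle, and reduce phases in plain Python. It is only an illustration of the concept, not how Hadoop, Hive, or Pig are actually invoked; every name here (`run_mapreduce`, `word_mapper`, `sum_reducer`, `initial_mapper`, `unique_sorted_reducer`) is hypothetical.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Apply mapper to each record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):   # map phase
            groups[key].append(value)       # shuffle: group values by key
    return {key: reducer(key, values) for key, values in groups.items()}  # reduce phase


def word_mapper(line):
    """Emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Standard word-count reducer: add up the emitted counts."""
    return sum(values)


def initial_mapper(line):
    """Emit (first letter, word) for every word in a line."""
    for word in line.lower().split():
        yield word[0], word


def unique_sorted_reducer(key, values):
    """A custom reducer: keep each word once, in sorted order."""
    return sorted(set(values))


lines = ["big data", "data science"]
counts = run_mapreduce(lines, word_mapper, sum_reducer)
index = run_mapreduce(lines, initial_mapper, unique_sorted_reducer)
print(counts)
print(index)
```

Swapping the reducer while keeping the same driver is the point of the sketch: the grouping logic stays fixed, and only the per-key aggregation changes.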

COMMUNICATION & VISUALIZATION

  • Able to engage with senior management

  • Storytelling skills

  • Translate data-driven insights into decisions and actions

  • Visual art design

  • R packages like ggplot or lattice

  • Knowledge of visualization tools, e.g. Flare, D3.js, Tableau
