Data Mining - Data Scientist


Data scientists (see Data Science) are engineers charged with generating insights from data.

Ultimately, the data scientist’s job is to analyze massive amounts of data, interpret “what the data say”, and distill the bits into actionable insights that steer the direction of the company:

  • what web site elements to refine,
  • what features to develop,
  • what markets to pursue, etc.

To accomplish this, data visualization is an indispensable tool. At a mundane level, data scientists provide dashboards that let stakeholders browse large amounts of multidimensional data, including interactive “drill downs” and “roll ups”.
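As a rough illustration of what a “roll up” and a “drill down” do behind a dashboard, here is a minimal sketch in plain Python. The rows, the field names (region, country, product, revenue) and the aggregate helper are all hypothetical; a real dashboard would typically delegate this to a database or an OLAP engine.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, country, product, revenue).
FIELDS = ("region", "country", "product", "revenue")
ROWS = [
    ("EMEA", "FR", "web", 120.0),
    ("EMEA", "FR", "mobile", 80.0),
    ("EMEA", "DE", "web", 200.0),
    ("APAC", "JP", "web", 150.0),
    ("APAC", "JP", "mobile", 50.0),
]


def aggregate(rows, dims, where=None):
    """Sum revenue grouped by `dims`, keeping only rows matching `where`."""
    where = where or {}
    totals = defaultdict(float)
    for row in rows:
        record = dict(zip(FIELDS, row))
        # Skip rows that fall outside the slice being inspected.
        if any(record[name] != value for name, value in where.items()):
            continue
        key = tuple(record[d] for d in dims)
        totals[key] += record["revenue"]
    return dict(totals)


# Roll up: collapse country and product, keeping only the region level.
by_region = aggregate(ROWS, dims=("region",))

# Drill down: open the EMEA total into its per-country components.
emea_by_country = aggregate(ROWS, dims=("country",), where={"region": "EMEA"})
```

Rolling up removes dimensions from the grouping key, while drilling down fixes a value on a coarse dimension and groups by a finer one.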

Beyond simple dashboards, data scientists often build one-off visualizations to answer a specific task, usually a business question.

All these visualizations, from simple line graphs to complex interactive browsing interfaces, share one common feature: although the ultimate product consists only of a few hundred to a few thousand data points, they are the distillation of gigabytes, and in some cases, terabytes of raw data.
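The distillation described above can be sketched as a streaming reduction: many raw records go in, a handful of plottable points come out. This is only an illustrative sketch; the synthetic raw_events generator stands in for a large log, and the function names and the one-week time range are assumptions made for the example.

```python
import random
from collections import Counter
from datetime import datetime, timedelta, timezone


def raw_events(n, start, seed=0):
    """Yield n synthetic (timestamp, user_id) events, one at a time.

    Hypothetical stand-in for reading a large log; events are generated
    lazily, so the full data set is never held in memory.
    """
    rng = random.Random(seed)
    for _ in range(n):
        offset = timedelta(seconds=rng.randrange(7 * 24 * 3600))
        yield start + offset, rng.randrange(10_000)


def events_per_day(events):
    """Reduce an event stream to one (date, count) point per calendar day."""
    counts = Counter()
    for timestamp, _user_id in events:
        counts[timestamp.date()] += 1
    return sorted(counts.items())


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
# 200,000 raw events are reduced to the few points of a line graph.
series = events_per_day(raw_events(200_000, start))
```

Memory use depends on the number of output points (days), not on the number of input events, which is what makes the same pattern workable when the raw data runs to gigabytes.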

Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.

Data Scientist

A person who builds data products.

Data Engineer:

  • Data and systems architecture
  • Hadoop, Pig/Hive, MapReduce, Mahout, Java, Python, Perl, SQL, C++, etc.
  • NoSQL (HBase, Cassandra, Mongo)

Applied Scientist:

  • Statistics, machine learning, text processing, NLP, R, Matlab, SAS, SQL, scripting
  • Visualisation, telling the story

