Data science is a hot topic in the market, and everyone is talking about it. There are many different definitions of data science, and each reflects a different point of view.
Data science is a multidisciplinary blend of data inference, algorithm development, and technology in order to solve analytically complex problems.
At the core is data: troves of raw information, streaming in and stored in enterprise data warehouses. There is much to learn by mining it, and there are advanced capabilities we can build with it. Data science is ultimately about using this data in creative ways to generate business value.
Data science, also known as data-driven science, is an interdisciplinary field of scientific methods, processes, and systems for extracting knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.
Data science is a “concept to unify statistics, data analysis and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, and visualization.
One way to consider data science is as an evolutionary step in interdisciplinary fields like business analysis that incorporate computer science, modeling, statistics, analytics, and mathematics.
At its core, data science involves using automated methods to analyze massive amounts of data and to extract knowledge from them. With such automated methods turning up everywhere from genomics to high-energy physics, data science is helping to create new branches of science, and influencing areas of social science and the humanities. The trend is expected to accelerate in the coming years as data from mobile sensors, sophisticated instruments, the web, and more, grows. In academic research, we will see an increasingly large number of traditional disciplines spawning new sub-disciplines with the adjective “computational” or “quantitative” in front of them. In industry, we will see data science transforming everything from healthcare to media.
Source: Wikipedia
Technology & Tools for Data Scientists:
The number one data science programming language was Python, owing to its inherent friendliness to data analysis and its supporting libraries such as NumPy, SciPy, and pandas. The second most popular language was Java, followed by C++, Perl, Ruby, and C#.
Data science also requires statistical analysis, and the most in-demand statistical tool is R, followed by SAS, MATLAB, SPSS, Stata, and Minitab.
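As a rough illustration of the kind of statistical analysis mentioned above, here is a minimal sketch in Python. It uses only the standard library `statistics` module rather than R or the libraries named earlier, and the sample numbers and function names (`summarize`, `pearson_correlation`) are made up for this example.

```python
import statistics

# Hypothetical sample data, invented purely for illustration.
nightly_rates = [120, 95, 150, 110, 130, 105, 140]
occupancy_days = [20, 25, 14, 22, 18, 24, 16]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson_correlation(xs, ys):
    """Compute the Pearson correlation coefficient of two equal-length lists."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


summary = summarize(nightly_rates)
corr = pearson_correlation(nightly_rates, occupancy_days)
print(summary)
print(round(corr, 3))
```

In practice the same summary would more likely be produced with pandas or R; the point here is only to show what a descriptive summary and a correlation check look like in code.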
The skills required of data scientists differ by industry. Below are some examples:
Health Insurance Company – Data scientists use tools and methodologies specific to big data (R, Hadoop, Python, Hive).
Vacation Rentals Company – Data scientists are required to do a lot of statistical analysis, so the role calls for statistical programming languages (SAS or R), open source machine learning packages (e.g. R’s caret, Python’s scikit-learn), SQL, and the ability to code in a general-purpose programming language such as C/C++, C#, Java, or Python.
Biotech / Pharma Company – Data scientist tasks are based on the following: Bash, HTML5, Perl, Processing, Python, and R.
Online Advertising Company – Data scientists perform in-depth analysis drawing on statistics, machine learning, or a related field, and also program in Python, C#, or Java.
Online Gaming Company – The gaming industry requires scripting in languages such as Perl, Python, or Ruby, and running analysis in R, Octave, or Weka.