IBM Watson Data Platform (WDP) is an integrated platform of tools, services and data that helps companies accelerate their shift to become data-driven organizations.
For more information, visit the official IBM Watson Data Platform page or contact an IBM salesperson.
The purpose of this document is to provide a quick start guide to the platform. More specifically, it focuses on one user experience of WDP: Data Science Experience (DSX). Keep in mind that this guide does not cover every feature; it is only an introduction.
This is not an official guide.
The best way to follow the guide is to download the PDF from this link.
You can also go to the /pdf-guide folder and click on the PDF to view it online.
IBM Data Science Experience is an integrated development environment offering a suite of tools and capabilities that enable data scientists to accelerate their productivity.
DSX lets you analyze data using RStudio and Jupyter notebooks in a configured, collaborative environment that includes IBM value-adds, such as managed Spark. RStudio is integrated into the offering and provides a development environment for working with R. DSX also provides Jupyter notebooks, a web-based environment for interactive computing.
Solve your toughest data challenges with the best tools and the latest expertise in a social environment built by data scientists.
- Community: Harness the power of the community. Check out the shared data sets, notebooks, and articles in our growing set of resources.
- Jupyter Notebooks: Create and collaborate on Python, R, and Scala notebooks that contain code and visualizations.
- RStudio: Jumpstart your R experience with a free, open-source RStudio tool.
- Machine Learning (Coming Soon): Create, train and deploy machine learning models.
My First Notebook: A simple example to get started with the notebooks on DSX.
The goal is to create a new notebook from scratch and add the following cells.
- On DSX (datascience.ibm.com), go to Projects.
- Click on create a new notebook.
- Write the name "MyFirstNotebook".
- Write a short description.
- Select Python 2.7 as the language, and Spark 2.0.
- Click on the create button.
- Open the following ipynb file to see the final notebook you should obtain.
- For help, see the plain text file.

- MyFirstNotebook.ipynb: The notebook you should end up with. ipynb is the Jupyter notebook format.
- MyFirstNotebook-in-plain-text.py: A plain text version, from which it is easy to copy and paste code and text.
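Because an ipynb file is a JSON document, its cells can be flattened into plain text with a few lines of Python. The following is only a minimal sketch, not necessarily how the plain text file in this repository was produced. It assumes the notebook uses nbformat 4 (a top-level "cells" list), and the helper names `notebook_to_plain_text` and `load_notebook` are hypothetical.

```python
import io
import json


def notebook_to_plain_text(nb):
    """Flatten a parsed nbformat-4 notebook (a dict) into plain text.

    Code cells are kept as-is; markdown cells become comment lines.
    """
    chunks = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows "source" to be a string or a list of strings
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "code":
            chunks.append(source)
        elif cell.get("cell_type") == "markdown":
            chunks.append("\n".join("# " + line for line in source.splitlines()))
    return "\n\n".join(chunks)


def load_notebook(path):
    """Read an .ipynb file from disk and parse its JSON (hypothetical helper)."""
    with io.open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `print(notebook_to_plain_text(load_notebook("MyFirstNotebook.ipynb")))` would print the code cells with the markdown cells turned into comments, assuming the file is in the current directory.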
- The cars.csv file in the /data folder is a common example data set in R (known there as mtcars.csv). It was downloaded from: https://vincentarelbundock.github.io/Rdatasets/datasets.html
- The UNdata_population_total.csv file in the /data folder was downloaded from: https://apsportal.ibm.com/exchange/public/entry/view/889ca053a19986a4445839358a91963e
  Terms and Conditions of use: http://data.un.org/Host.aspx?Content=UNdataUse
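To take a quick look at one of these files outside DSX, a sketch along the following lines may help. It uses only the Python 3 standard library and assumes the file has a header line; the helper name `read_csv_rows` is hypothetical. On a Python 2.7 kernel the `open` call would need adjusting (for instance, opening the file in 'rb' mode without the `newline` and `encoding` arguments).

```python
import csv


def read_csv_rows(path):
    """Return (column_names, rows) for a CSV file with a header line.

    Each row is a dict mapping column name to the raw string value.
    """
    # newline="" lets the csv module handle line endings itself
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return reader.fieldnames, rows
```

Calling `columns, rows = read_csv_rows("data/cars.csv")` (the relative path is an assumption about where the repository was cloned) gives the column names and one dict per record; values are strings and need converting before any numeric work.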
IBM DataFirst Launch event (Oct. 2016) URL here
- Part 1. Root Cause Analysis (Interactive Analytics) URL here
- Part 2. Create Resolution (Machine Learning) URL here
Spark Summit Demo (Jun. 2016) URL here