Step 1. Publish a dataset
Welcome to your very own OpenDataSoft domain! You’re here to work with data, so let’s dive right in!
Retrieve the dataset
For this tutorial, we’ll work with a nice and clean dataset of the winners and losers of the Super Bowl. To follow this tutorial with the exact same dataset, click the button below.
Done? Awesome! You’re now ready to publish your first dataset in your domain and play with it.
Upload the dataset
- If you’re not already there, click the Back office button in the top bar of your domain.
- Click the New dataset button.
- Click the Add a source button.
- Choose the Super Bowl dataset file you just downloaded.
Wham! In a matter of seconds, your dataset is uploaded to your domain!
Rename your dataset
We can surely find a better title than super-bowl for our dataset. This title is displayed in your dataset catalog, so give each of your datasets an explicit title.
- Select the current title.
- Rename your dataset with a pretty and clear title.
History of the Super Bowl is a much better title, isn’t it?
Define facets
Facets define the filters of your datasets, which in turn allow you to create charts afterward. Setting up fields as facets means telling the platform which columns (and their related fields) of a dataset are relevant as filters.
- Click on the Processing tab.
- Click in the column you want to set up as a facet, in order to define it as a filter.
And that’s it! In our Super Bowl dataset, we set up the following columns as facets: Winner, Loser, City and State. These columns can now be used as filters.
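To make the idea of a facet more concrete, here is a minimal Python sketch of what filtering on a facet amounts to: counting the distinct values of a column, then keeping only the rows that match a selected value. It does not use the platform at all. The three rows are a small hand-typed sample rather than the tutorial's file, and the helper names facet_counts and refine are hypothetical.

```python
from collections import Counter

# Hand-typed sample rows for illustration only (not the tutorial's file).
# Column names follow the facets chosen above: Winner, Loser, City, State.
ROWS = [
    {"Winner": "Green Bay Packers", "Loser": "Kansas City Chiefs",
     "City": "Los Angeles", "State": "California"},
    {"Winner": "Green Bay Packers", "Loser": "Oakland Raiders",
     "City": "Miami", "State": "Florida"},
    {"Winner": "New York Jets", "Loser": "Baltimore Colts",
     "City": "Miami", "State": "Florida"},
]


def facet_counts(rows, column):
    """Count how many rows carry each distinct value of one facet column."""
    return Counter(row[column] for row in rows)


def refine(rows, **selected):
    """Keep only the rows whose facet columns equal the selected values."""
    return [
        row for row in rows
        if all(row[col] == value for col, value in selected.items())
    ]


# The values a filter panel would list for the State facet.
state_counts = facet_counts(ROWS, "State")

# The rows left once the State filter is set to Florida.
florida_games = refine(ROWS, State="Florida")
```

A column with few distinct values (such as State) makes a useful facet, whereas a column where every row is unique would produce one filter entry per row.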
Apply a processor
OpenDataSoft lets you modify your data in several ways, directly from the platform. Check the documentation for the complete list of processors and how each of them works.
In this tutorial, we are going to use a processor that adds geographical coordinates to the dataset. The dataset contains state names, but not their corresponding geographical coordinates. This is where the Join Dataset processor comes into play!
- Click the Add a processor button.
- In the Generic Operations section, choose the Join Dataset processor.
- Fill in the different fields of the processor:
- Select a source dataset: in the All available data section, choose the Natural Earth - US States and Provinces 1:110m dataset.
This dataset contains the geographical coordinates of every US state. By joining it to our Super Bowl dataset, we can retrieve the coordinates of each state that appears in our data.
- Local key (meaning the Super Bowl dataset’s column we are interested in for this operation): enter or select state.
- Remote key (meaning the Natural Earth dataset’s column whose values match those of the local key): enter or select name.
- Output field (meaning the Natural Earth dataset’s column we want to retrieve and add to the Super Bowl dataset): enter geo_point_2d.
- Press Enter or click outside the processor box area for your changes to be taken into account.
There you go! You have applied your very first processor.
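For readers who think in code, the join configured above can be sketched in plain Python. This is only an illustration of the local key, remote key and output field roles, not the platform's implementation. The function join_dataset is hypothetical, the coordinates are rough placeholders rather than Natural Earth values, and keeping unmatched rows with an empty output field is an assumption made for this sketch.

```python
def join_dataset(local_rows, remote_rows, local_key, remote_key, output_field):
    """Copy output_field from the matching remote row into each local row."""
    # Index the remote dataset by its key so each lookup is a dict access.
    remote_index = {row[remote_key]: row for row in remote_rows}

    joined = []
    for row in local_rows:
        match = remote_index.get(row[local_key])
        new_row = dict(row)  # copy, so the input rows stay untouched
        # Assumption: unmatched rows are kept, with an empty output field.
        new_row[output_field] = match[output_field] if match else None
        joined.append(new_row)
    return joined


# Local dataset (Super Bowl): only the join column is shown.
super_bowl = [
    {"state": "California"},
    {"state": "Florida"},
    {"state": "Unknown state"},  # deliberately has no match
]

# Remote dataset: placeholder coordinates, NOT the real Natural Earth values.
us_states = [
    {"name": "California", "geo_point_2d": [37.0, -120.0]},
    {"name": "Florida", "geo_point_2d": [28.0, -82.0]},
]

joined = join_dataset(
    super_bowl, us_states,
    local_key="state", remote_key="name", output_field="geo_point_2d",
)
```

The sketch shows why the two keys must hold the same kind of value: a row only receives coordinates when its state value is spelled exactly like a name value in the remote dataset.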
Add metadata
Data need context for people to understand them. Let's add some basic, yet very important, metadata to our dataset.
- Click on the Information tab.
- Add a description.
- Choose a theme from the list.
- Add keywords. After entering each keyword, press Enter for it to be taken into account.
- Choose the right license for your dataset from the list.
Save and publish
Now that we have processed our dataset and described it with metadata, we can publish it.
- Click the Save button.
- Click the Publish button.
- Click the Explore button. It’s high time we played with our data!