
Installing OGDI DataLab version 5 on Azure

MiroJ edited this page Feb 13, 2013 · 86 revisions

Configuring a Windows Azure Account

The following is a walkthrough for setting up the Windows Azure account you will use to install the OGDI DataLab platform. It explains how to set up a Cloud Service, which will host the OGDI web and worker roles, and how to create the two **Storages** needed for the configuration tables and the data storage itself.

To install OGDI DataLab on Azure you need to set up one Cloud Service and two Storages. If you do not have an Azure account, you need to sign up for one using a Windows Live ID.

For our walkthrough we will use a trial account of Windows Azure, which provides the same functionality as a paid one, except it is time-limited. All screens below are shown as examples and you should use your own information to create the required services and accounts. Please note that all screenshots are based on Visual Studio 2010 and Windows Azure Management Portal released in 2012: https://manage.windowsazure.com/

Windows Azure Preview

Creating the Cloud Service

The Cloud Service will run the data provider accessible to the public, which will expose the catalogues (datasets) to any consumer – client app or a direct http request.

To create a Cloud Service:

  1. Start the Windows Azure Management Portal.
  2. Select Cloud Services on the left sidebar.
  3. Click New on the left corner of the bottom toolbar. Creating a Cloud Service
  4. In the New menu, select Cloud Service, Quick Create, and then type a URL of your new site. Note: This will be the public URL users will see unless you have a domain name to assign it to.
  5. Select a region or affinity group so that all your storages and cloud services are in the same geographic location. This will decrease data traffic and speed up your services. If you do not have an affinity group already, you will not see any in the drop-down selection.
  6. Click Create Cloud Service and wait for the process to complete. After a few seconds you will see the newly created Cloud Service in the list. Newly created Cloud Service

Creating the Storage

For OGDI DataLab v.5, you need two storages:

  • A configuration storage, where the OGDI project keeps the data endpoints;
  • A data storage, where the actual data is stored, which will be available for the public.

To create a Storage:

  1. Start the Windows Azure Management Portal.
  2. Select Storage.
  3. Click on New on the left corner of the bottom toolbar.
  4. Select Storage and then Quick Create. Creating a Storage
  5. Enter a URL prefix that will be used to connect to the account. Note: We suggest using the word config in it, so you can easily recognize this account later.
  6. Select the same region or affinity group in which you put the Cloud Service.
  7. Click Create Storage Account.

Perform the same steps for creating Storage for the data, which will hold the publicly exposed catalogues. However, use data as the URL prefix instead (as in step 5). In our example we used myopencityconfig and myopencitydata.

These are public URLs and if the names are already taken, the Management Portal will notify you and ask to enter a different name.
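
Before typing a prefix into the portal, it can help to check it locally. The sketch below assumes the general Azure rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits; that rule comes from Azure, not from this walkthrough. The helper names `is_valid_storage_name` and `suggest_names` are hypothetical, and the check cannot tell you whether a name is already taken.

```python
import re

# Assumption: Azure storage account names are 3-24 characters,
# lowercase letters and digits only.
_NAME_RE = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rules."""
    return _NAME_RE.fullmatch(name) is not None


def suggest_names(base: str) -> dict:
    """Build the config/data pair suggested in this walkthrough from one base name."""
    return {"config": base + "config", "data": base + "data"}


if __name__ == "__main__":
    # "myopencity" mirrors the example names used in this article.
    for role, name in suggest_names("myopencity").items():
        print(role, name, is_valid_storage_name(name))
```

A name that passes this check can still be rejected by the Management Portal if someone else already uses it.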

When completed, the Storage panel will display the two new accounts (config and data) you have just created. Newly created Storages

Installing Dependencies

The following will prepare the OGDI DataLab for compiling and publishing on the Windows Azure instance. When you download the files, you will be missing the 3rd party dependencies for some of the projects in the package.

Prerequisite: you need to have completed all the steps on setting up the Windows Azure account, storages, and service.

To install the OGDI DataLab dependencies

  1. Once you have downloaded OGDI DataLab – version 5, you can decompress it in a folder and open it in Visual Studio 2010. Note: Run VS 2010 as administrator (right-click the VS 2010 icon while holding the Shift key and select Run as administrator from the context menu).
  2. Some components may be out of date. To get the latest versions of these 3rd party components, use NuGet (see Installing NuGet).
  3. Right-click on the DataBrowser.WebRole project and select Manage NuGet Packages.
  4. In the NuGet window, search for OGDI DataLab and you will see the four dependency packages that are named according to the project names they relate to. Manage NuGet Packages Window
  5. Select the OGDI DataLab DataBrowser.WebRole Dependencies item and click Install.
  6. Once the installation is complete, follow the same process to install: OGDI DataLab DataService.WebRole Dependencies and OGDI DataLab DataLoader Dependencies.
  7. To verify that all dependencies were installed correctly, build the solution and make sure there are no errors. If you notice any error messages, ensure you install the necessary dependency packages for each of the errors.

Configuring the Roles

The following walkthrough configures the web role and the worker role projects so you can publish them on your Windows Azure Cloud Service. This configuration is necessary to connect the catalogue data with the metadata of the endpoint. Once published, the roles will provide the data to any client that visits your site.

Prerequisite: you need to have completed all the steps on setting up the Windows Azure account, storages, and service, as well as installed all 3rd party dependencies.

To configure the web role

  1. Locate the Solution explorer, and find the DataBrowser.Cloud project.
  2. Expand the branches under the project and look for the Roles folder, where you will find the two roles: DataBrowser.WebRole and DataBrowser.WorkerRole. The Solution Explorer
  3. Double-click on DataBrowser.WebRole to open the configuration table.
  4. Select the **Settings** tab on the left.
  5. In the newly opened window, locate the Service Configuration drop-down box and select Cloud. This way, all changes you make will affect only the configurations for the Azure cloud installation. Service Configuration Dropdown
  6. Click DataConnectionString and select ConnectionString in the column Type. This will display a browse button at the end of the same row.
  7. Click the browse […] button to open a Connection String form.
  8. Once it opens, select Enter storage account credentials. These settings will allow you to connect to the storages you created.
  9. In the Account name field type the name of the storage account you chose to use for the configuration database.
  10. In the Account key field enter the access key for the config storage. You can get it from the Windows Azure Management Portal: locate the Storage panel and select the configuration storage.
  11. On the bottom toolbar of the portal you will see Manage Key. Clicking it will display the Primary access key and the Secondary access key. Note: Both keys grant access, but we suggest using the secondary one for the OGDI DataLab installation and keeping the primary one for administration. That way you can change the secondary key later and cancel access for everyone who is not an administrator. Manage Access Keys
  12. Copy the key, return to Visual Studio, and paste it into the Storage Account Connection String form. Manage Access Keys
  13. Click OK to save the settings and close the Storage Account Connection String form.
  14. Repeat the same process for the DiagnosticsConnectionString setting of the web role project (starting from step 5), using the same storage name and key.
  15. Select the Cloud Services screen on the Windows Azure Platform portal. Cloud Services
  16. Click the service you created in the first part of this article and locate the URL column to the right.
  17. Copy the DNS Prefix.
  18. Paste the DNS Prefix followed by .cloudapp.net into the serviceURI field in the Settings tab in Visual Studio. Note: All standard Azure DNS names end with .cloudapp.net, so the service URI starts with the DNS prefix you copied and ends with .cloudapp.net:8080/v1/.
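
The rule in the last step (DNS prefix, then .cloudapp.net, then :8080/v1/) can be sketched as a small helper. This is only an illustration: the function name `build_service_uri` and the prefix `myopencity` are hypothetical, and the `http://` scheme is an assumption, since the walkthrough does not state which scheme the setting expects.

```python
def build_service_uri(dns_prefix: str, port: int = 8080, version: str = "v1") -> str:
    """Compose a serviceURI value from the Cloud Service DNS prefix.

    Assumption: the URI uses plain http; adjust the scheme if your
    deployment differs.
    """
    suffix = ".cloudapp.net"
    prefix = dns_prefix.strip().lower()
    # Tolerate a full host name pasted from the portal's URL column.
    if prefix.endswith(suffix):
        prefix = prefix[: -len(suffix)]
    if not prefix:
        raise ValueError("DNS prefix must not be empty")
    return f"http://{prefix}{suffix}:{port}/{version}/"


if __name__ == "__main__":
    print(build_service_uri("myopencity"))
```

Stripping a pasted `.cloudapp.net` suffix guards against the common mistake of ending up with the domain twice in the setting.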

After you have edited the three settings, the table with settings should look similar to this: Web role settings

To configure the worker role

Configure the same three settings for DataBrowser.WorkerRole: follow the same steps as above (starting from step 3), but double-click DataBrowser.WorkerRole in the Solution Explorer instead. The values for all settings are the same.

Once you are done, the screen will look similar to this: Worker role settings

Publishing the DataBrowser Project

The following will show you how to upload the DataBrowser.Cloud project, which includes the web role and the worker role, to the Windows Azure instance you have created. These roles will become your webpages where users will have access to the data in your catalogue.

Prerequisite: you need to have completed all the steps on setting up the Windows Azure account, storages, and service, installed all 3rd party dependencies, and configured the web and worker roles.

To publish the DataBrowser Project

  1. After you have set the connection strings as described above, go to the Solution Explorer, right-click on the DataBrowser.Cloud project, and click Publish. Solution explorer
  2. If you are publishing the project for the first time, you will see no subscription in the drop-down box. In the Publishing Wizard, click Sign in to download credentials. The wizard will open a web browser window and ask you to sign in with the Windows Live ID you used to open your Windows Azure account. If you are already signed in, you will not see the login screen.
  3. After a few seconds, the browser will download a configuration file with a .publishsettings extension. Save it on your hard drive. Windows Azure Publish Sign In
  4. In the Publishing wizard, click Import and browse to the downloaded settings file. When you open it, you will see the drop-down box change to the name of your subscription.
  5. Click Next.
  6. The next screen will show all settings you received through the settings file you imported. Verify that you have selected the Release build configuration and the Cloud service configuration. Click Next. Windows Azure Publish Settings
  7. The Summary page will show you the way the project will be published on your hosting account. Click Publish to start the automatic upload process.

Windows Azure Publish Summary

This process will take several minutes. During this time, you will be able to see the progress in the Windows Azure Activity Log screen in Visual Studio.

Windows Azure Activity Log

You can also follow the upload in the management panel.

Management panel

When publishing is complete, the hosting service with the DataBrowser roles will be started.

DataBrowser

Note: At this point you do not have any data in the database and no catalogues are defined in it, so the Data Browser might show an error or a blank catalogue. Loading datasets is covered in the next parts. For now, check that your Cloud Service is running on Azure.

Defining Endpoints

The following shows how to define the OGDI catalogues necessary for the connections between the data storage and the data service. The configuration of the endpoints uses the ConfigTool project from the OGDI DataLab solution. Its use is quite simple.

Prerequisite: you need to have completed all the steps on setting up the Windows Azure account, storages, and service, installed all 3rd party dependencies, configured the web and worker roles, and published the DataBrowser project. You will also need access to the Windows Azure Platform panel or have the necessary storage account name and key.

The ConfigTool project is executed on your local computer. You need to run it only once, and it will not be visible to your clients and users. Before you can start it, though, you have to configure the storage account information (for the table where you store catalogue configuration data). It is exactly the same account information that you entered in "To configure the web role".
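
To make the account information concrete, here is a minimal sketch of how an Azure storage connection string is usually laid out: `DefaultEndpointsProtocol`, `AccountName`, and `AccountKey` pairs separated by semicolons. It assumes the DataConnectionString value in your copy of ConfigTool follows that layout, so check the actual file. The helper names are hypothetical and the key shown is a placeholder, not a real key.

```python
def build_connection_string(account_name: str, account_key: str,
                            protocol: str = "https") -> str:
    """Compose a storage connection string from an account name and key."""
    return (f"DefaultEndpointsProtocol={protocol};"
            f"AccountName={account_name};AccountKey={account_key}")


def parse_connection_string(value: str) -> dict:
    """Split a connection string into its key/value pairs."""
    parts = {}
    for segment in value.split(";"):
        if not segment.strip():
            continue  # ignore a trailing semicolon
        # Split on the first "=" only: base64 keys often end in "=" padding.
        key, sep, val = segment.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        parts[key.strip()] = val.strip()
    return parts


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    cs = build_connection_string("myopencityconfig", "abc123==")
    print(parse_connection_string(cs)["AccountName"])
```

Parsing the value back is a quick way to confirm that the account name is the config storage and that no padding characters were lost from the key when pasting.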

To define endpoints

  1. Locate the Web.Config file in the ConfigTool project and open it. Web.config
  2. On the DataConnectionString line, find the StorageName after the AccountName attribute and change it to the real storage name you will use. Again, you should use the storage that you created to keep the configurations in. You can find it in the Windows Azure Platform Management Panel. Windows Azure Platform Management Panel
  3. On the same line, locate the StorageKey after the AccountKey attribute and replace it with the secondary access key from the same config storage account. Storage Account Connection String
  4. Select Debug as a solution configuration in Visual Studio, right-click on the ConfigTool project, point to Debug, and click on Start new instance to start the project. It will start a new instance of your web browser and show you a simple form. Note: you need to have IIS installed on your machine.
  5. Fill out the fields as follows:
  • Alias – a short name that describes what information is included in the catalogue. For example: NewYorkStreetParking, TorontoDayCareFacilities, MyExperimentalCatalogue.
  • Description – a short user-friendly description of the catalogue. For example, New York Parking Facilities.
  • TableStorageAccount – the name of the second (data) storage account. This is the only time you will be using the second storage account name and access key. The storage account name can be found in the Windows Azure Platform Management Panel.
  • TableStorageAccountKey – use the secondary access key of the same data storage account.

Enabled Storage Accounts

Once you confirm all of the information, click the Add button and the information for the new catalogue will be added to the config storage. A confirmation of the information will appear under the form once the webpage reloads.

Updated Enabled Storage Accounts

Uploading Data in the Catalogue

The following instructions cover uploading data to the catalogue we have created and configured. The OGDI DataLab v.5 solution has two projects that can help with that: a console application and a Windows application. We will focus on the more visual and easier to use one, DataLoaderGuiApp.

To complete this step, you will need a CSV or a KML file that contains data for your catalogue. If you do not have one and you are experimenting with OGDI, you can always search the web for the terms "open data" and even include a city name.

Most large municipalities already have their data published in plain files. For instance the Region of Peel in Ontario, Canada has data online as of 26 June 2012. In the following example, we will use a CSV file with parking facilities in New York.
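
Because the loader expects well-formed rows, it can be worth checking a CSV file before opening it. The following is a rough sketch, not part of OGDI DataLab: it treats a row as bad when its column count differs from the header or when a cell is empty, which is only one possible reading of "valid information in each column". The function name `find_bad_rows` and the sample data are made up for illustration.

```python
import csv
import io


def find_bad_rows(csv_text: str) -> list:
    """Return (line_number, reason) pairs for rows that look malformed.

    Assumption: the first line is a header, and every data row must have
    the same number of columns with no empty cells.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return [(1, "file is empty")]
    problems = []
    for row in reader:
        line = reader.line_num  # line of the source text just read
        if len(row) != len(header):
            problems.append(
                (line, f"expected {len(header)} columns, found {len(row)}"))
        elif any(cell.strip() == "" for cell in row):
            problems.append((line, "empty cell"))
    return problems


if __name__ == "__main__":
    # Hypothetical sample; replace with the contents of your own file.
    sample = "name,lat,lon\nLot A,40.7,-74.0\nLot B,40.8\nLot C,,-73.9\n"
    for line, reason in find_bad_rows(sample):
        print(f"line {line}: {reason}")
```

To check a real file, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text to `find_bad_rows`; the encoding is an assumption and may differ for your data source.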

To upload data to the catalogue

  1. Start the DataLoaderGuiApp project. The OGDI Data Loader Windows application will open. OGDI Data Loader
  2. Click on the Settings tab and click the Connection button to configure connection settings.
  3. In the Endpoint Settings form, enter the name of the data storage account in the Account name field and its secondary key in the Account key field. You can find this information in the Windows Azure Platform Management Panel; look for the storage account with the word data at the end. Endpoint Settings
  4. Click Save.
  5. Click Open in the File tab.
  6. Select a CSV or KML file to upload. Note: Ensure it is really a comma separated value file and all rows have valid information in each column.
  7. Click Open to load the file. It will not go in the storage yet.
  8. Fill in the Dataset Metadata tab with descriptive information for end users. Dataset Metadata
  9. Fill in the Dataset Properties tab. Use New.Guid to generate random and unique values for the Dataset Primary Key and Dataset RowKey. The Data Source Timezone defines a reference to the location this data is published. Dataset Properties
  10. Define the data type and geographic location data in the Dataset Columns tab. The OGDI Metadata Designer will look for latitude and longitude information if they exist.
  11. Select Bing to Map if you have geolocation data to map your data. Dataset Column
  12. Define extra information about the catalogue on the Dataset Columns Metadata. Dataset Column Metadata
  13. Click Save to save all configuration data in a file next to the CSV or the KML file.
  14. The OGDI Data Loader will display some information about the data catalogue you just configured. You are ready to upload it. IMPORTANT: Select the Upload Method from the dropdown menu and decide if you need to preserve the original data as well using the similarly named checkbox. Dataset Column Metadata
  15. Click Start and monitor the progress of the upload. Based on the size of your file and your internet connection, you may need to wait several minutes. You will see information about the upload and errors that may occur. Even if you experience errors, you may have some of your data uploaded. The loader will ignore the bad records from your file and let you know about it on the log screen.

And that's it! You are now ready to see your data on Windows Azure. The URL should look like this: http://yourcloudservicename.cloudapp.net.

More detailed information can be found at My Open City