Learn Parabola

Welcome to Learn Parabola! You'll find comprehensive guides and documentation to help you start working with Parabola as quickly as possible, as well as support if you get stuck. Let's jump right in!

Get Started    

Hands on guide to connecting to an API

Step-by-step guide to accessing API's and working with their responses in Parabola

In order to get you comfortable working with API's, we are going to walk through an example using a real API (that you can also use). We'll touch on the process from start to finish, all using Parabola to create a report that we care about.

This example uses the Ticketmaster API. The first step is to sign up for a developer account here and then retrieve your API key to follow along in your own flow.

Have you done your homework?

This document assumes that you are somewhat familiar with API terminology from Reading API docs and WTF is an API? I'm scared 👻. It also assumes that you are familiar with JSON to the extent that Working with JSON data in Parabola explains.

Who are we?

In this walkthrough, we will be imagining ourselves as a reporter covering the local events in Berkeley, CA.

Most of our day is spent logging into websites like Ticketmaster and scouring them for new events. The task feels so repetitive that we want to automate it, and free up time for more thorough reporting, not just data gathering.

Although we don't know how to code, we do know that Ticketmaster has an API that would give us all the information we need. With Parabola we can both access the API, and automate the full data gathering process.

Define our goal

For our reporting duties, we care about the next two weeks of events that will be happening in Berkeley, as well as any future events that have tickets on sale.

Explore the API docs

Heading over to the Ticketmaster API docs, we can try to figure out what endpoints may have the data we care about.

Ticketmaster API docs

Ticketmaster API docs

Many times in API docs, there will be an index of the calls you can make in the left side bar. In the Ticketmaster docs, we see an endpoint for Event Search - that looks promising.

Let's click into the item in the side bar to check if they show us any sample data about what will be returned, or list out possible ways to use this endpoint.

The Event Search endpoint detail page

The Event Search endpoint detail page

Once on an endpoint's detail page, there are a few sections to call out:

  1. All endpoints have one or more request methods. In this case, the endpoint only accepts GET requests. Keep in mind that the API Import we will be using in Parabola can only make GET requests.
  2. This is the endpoint URL slug - we will need to add the base URL onto it it create the full request URL.
  3. There's a long table of query parameters, which we can use as filters for what data is returned. Because services like Ticketmaster have so much data, using query parameters are a good way to only retrieve a digestible amount of data.
  4. Most API docs have a section showing code examples for each endpoint. This one is in cURL, and shows us what the full URL should look like.

Find the authentication section

We need to poke around to find the API authentication section. People who write API docs like to hide this section in strange places, because they have boring jobs :)

In the case of Ticketmaster, we can find it in the overview section of the Discover API.

Discovery API overview section, looking at authentication details

Discovery API overview section, looking at authentication details

In the Authentication section, the docs call out that all requests must be made with the API key attached as a URL parameter called apikey

Looking at our developer account, we see our key here:

Ticketmaster developer center

Ticketmaster developer center

In this case, we will set our apikey value in the request URL to the Consumer Key.

In each request URL, we will add ?apikey=OurAPIKey to the end.

Understand query parameters

Look again at that string that we want to add to all of our request URL's - ?apikey=OurAPIKey

That is a request parameter, also known as a query parameter. Parameters are appended to a URL by a question mark (?) and combined with an ampersand (&).

Parameters help us narrow down what we want the API to respond back from this URL, acting as filter criteria. We are looking only for events in Berkeley, so we should add a parameter for Berkeley. That would make a query string that looks like ?city=Berkeley&apikey=OurAPIKey. If there are any other needed filters, we can just keep adding more to the query string.

Make our first request

In Parabola, to GET data from a single endpoint, we should use the API Import, which looks like this:

API Import source

API Import source

Opening it up, we see some options that should look familiar:

API Import settings

API Import settings

On the left sidebar, near the top, is a field for our request URL. We know that we are looking for the events in Berkeley, so our URL should look like this https://app.ticketmaster.com/discovery/v2/events?city=Berkeley&apikey=OurAPIKey

Notice that its comprised of the base URL defined on the Overview page we saw, the slug that we saw on the Search Events page, the apikey from the authentication section, and a request parameter that narrows the search to the city of Berkeley.

Test your knowledge of API's by replacing the city parameter

Use the latlong parameter with radius and units for even more accurate geolocation targeting with the API.

After entering this URL into the API Import, and clicking Update Settings, this is what we see:

Initial response from the API

Initial response from the API

There are a few things to note when looking at any data you pull from an API:

  • The data itself is in a JSON format, and we can see an array in the _embedded column. If that doesn't make sense to you, read about JSON here.
  • There's a column called page that appears to have some information about needed pagination
  • A new option has appeared on the left sidebar called Add Top Level Key

Page through the API

We have a choice, now. We could expand the cell that contains our data into a full table using a Top Level Key, or we could get our pagination set up.

Because we will no longer see the page column when we expand any other column, its probably best to set up pagination now. As it will be easier to see that it's working.

A refresher on what pagination actually is

The general paradigm that we have been using with API's thus far is request and response. We ask the API in a specific way for something, and we get that something back. We can always expect the same type of thing from the API if we ask it in the same way every time.

The API designers know that certain requests will result in a lot of information being returned in the response. To keep their API stable and fast, and to follow best practices, they don't always send back the complete dataset. They may break the data into pages.

When an API sends back data, it may indicate that more pages are ready to be sent as well, we just need to ask for each one. The docs of the API will tell us exactly how to request all of the pages. A common way to do this is to include a parameter that looks like page=0. We can then just increment that number on each request, until we stop getting data back. In this way, we can get the full response from the API, no matter how large.

Back to our set up.

We should examine that page column to see if it holds any clues about how pagination works in this API. The cell has this data: {"size":20,"totalElements":48,"totalPages":3,"number":0}

Looks like the current page brought back 20 events, there are 48 total, spread across 3 pages, and we are currently on page 0.

Many times the first page of an API response is page 0, not page 1

Computers, unlike most humans, begin counting things at 0. So be sure to try to start your pagination at page=0. Many times the docs or the first API request will clue you in to how the API indexes its pages.

We have some basic information about how the pagination for this API works, but nothing in there indicates decisively how to specify the page we want in the request. So let's go back to the docs and see what we can find.

Looking at the available query parameters for the Event Search endpoint, two stand out as useful for pagination:

Pagination fields in the query parameters section

Pagination fields in the query parameters section

This API has a parameter for the page number, and for the size of each page. Usually the size parameter will let you know what the default size is, and what the max size that you can request is. This API shows neither of those details...

From our investigation above, we know that we need to send the page parameter, increment it by 1 each time, and it starts at 0. We can adjust the page size if we want as well.

In Parabola, these fields are in the Advanced Settings of the API Import:

API Import Advanced Settings

API Import Advanced Settings

That's a lot of options! Let's define them all:

Field
Example Value
Description

Pagination Key

page

The key specified by the API to send the page data in

Pagination Start Value

0

What page to start on

Increment Pagination Value

1

How much to add to the page to get to the next page

Limit Key

size

The key specified by the API to send the page size data in

Limit Size

50

How many entries requested per page

Cursor Key

page.next

The location of a key in the JSON that contains a value representing the next page

Max Requests per Minute

60

How many requests Parabola should make per minute. Dictated by the API docs

Max Requests

10

Maximum requests to make before stopping

One field that hasn't been previously touched on is the Cursor Key. Instead of using something like page=2 to fetch a page, an API with cursor based pagination uses a hash key to find the page, and sends back the key for the next page. So it may look like page=Ki847GHud7329G91e and then the response may have a a column called page with a JSON key in it called next that has a value of 789asGU8298yGH8jdsfo, which can be used to get the next page.

Whoa that's kind of a lot of information, maybe we should get back to getting this set up.

With that understanding of pagination, its clear that we need our settings to look like this:

Our filled out pagination settings

Our filled out pagination settings

Those settings can be translated to the instructions: start with page=0, and increment it by 1 each time. Only make 3 requests.

We could set the pagination limit higher if we were unsure how many pages that there might be.

The results after adding these settings and clicking Update Settings is:

Results of our paginated calls

Results of our paginated calls

This is looking good, we have one row for each call that was made, and they all seem to contain data that we want. Time to get our hands on it.

Use top level and nested keys

Top level and nested keys are discussed in detail in the Working with JSON data in Parabola document.

In essence, they let us expand that one cell into an entire table, which is way more useful. If you see the option to add a top level key and nested key, its almost always the case that you should do it. You can always remove the key if it did not work as expected, so there is no harm in trying.

In this case, when we add the top level key, we can immediately see an option for a Top Level Key:

Top Level Key menu

Top Level Key menu

It looks like the data that we want is in _embedded. We can tell that is the case because the visible part of the cell is showing: {"events":[{"name":"Brandi Carlile","type":"ev

Notice how there is an array with the key events. We are trying to pull events data, and arrays represent tables, so this one is a safe bet. We can always change our mind and change the key once we see the results of unpacking it.

After choose _embedded and click Update Settings, we should see a full table of results:

Results from applying a Top Level Key

Results from applying a Top Level Key

Well that doesn't look quite like it was unpacked into a table. The data is still all locked in a few cells. On the left sidebar, we see that another option has shown up beneath the Top Level Key for a Nested Key.

Clicking that, we can see that there is only one option (in this case), which is to expand the events column:

2nd Level Key Options

2nd Level Key Options

Once we add the events key as a 2nd level key, and click Update Settings, we can see the results that we were looking for:

Expanded results from the API

Expanded results from the API

Now this looks like data that we can use! Notice at the top that it says 48 rows and remember that the API told us this request has 48 results. We saw that when looking at the page column earlier. Looks like we got the full response.

Look for columns with arrays

Take a closer look at that data. Looks like we are not out of the JSON woods yet. That images column looks like it contains an array. Scrolling to the right we can see that there are quite a few JSON columns still, some arrays and some not:

JSON columns to deal with

JSON columns to deal with

As we learned in Working with JSON data in Parabola, we would like to parse those JSON columns into normal columns or extra tables.

The non-array JSON columns can be simply flattened, and the JSON array columns can be expanded into new rows to create a new table.

Looking through the columns and knowing our goal, it's clear that we don't need all of the tables (arrays) that are present, so we don't need to deal with all of them.

Flatten and expand JSON

We need to pull the JSON arrays that we care about out of this data and create new tables with them. Our result should be a set of tables that can relate to each other via key columns. With structured data like that, we will be able to do anything in Parabola.

Scanning the data, we can see that besides the "default table" that we already have, we want to break out the classifications and the _embeddable arrays. Looks like venue information is inside of _embeddable.

To go about this, we can create branches off of our API Import, each one holding the data that we will turn into a table. Drag some column filters off of the right sidebar, and send data from the API Import into them. It should look like this:

Branching to create additional tables

Branching to create additional tables

In the top column filter, we can keep the default data as it is received from our API Import. The only restriction here is that we should not bring in any data in an array column. In this example, we are going to keep the following columns seen here:

Keeping only the columns that we need in a table

Keeping only the columns that we need in a table

Notice how we did not bring any array columns, but we do have some JSON columns that are not arrays. Those we can easily flatten with another transform. But first, let's set up our other column filters.

The other tables, which can be thought of as auxiliary tables to the default events one, will have a single array based column, an ID column that is shared amongst the other tables, and any other additional columns needed. In this case, it will just be the event ID and the array we are interested in.

Only keep one array column per branched out table

When you expand the array, it will be easiest to understand and work with if there is only one array per table to expand. Other columns can be included as well, as long as they are not array columns.

Branched classifications table

Branched classifications table

Branched _embedded table

Branched _embedded table

Notice how our other tables have an ID to tie the data between all the tables, and one column that is the array to be expanded.

Now that we have our branches, we can convert all of the JSON into our tabular data. To do so, we use a JSON Flattener step and target all of the JSON columns. We can also specify that we want to expand the Array columns, as opposed to flattening them.

In general, you will want to expand array columns into new rows

If you only flatten a JSON array column, you can end up with an unknown amount of columns in future runs, and it will be difficult to use your data in a relational way.

The JSON flatteners should look like this:

The two flatteners that are expanding arrays have values in the Expand List Into New Rows field, targeting the exact array.

After flattening a column with JSON in it, the original column remains next to the new expanded data. A simple column filter on each branch can remove those JSON columns so that we are left with the data we want, looking like this:

Join our tables

Now that we have tables of data we care about, we can join it together to create a complete picture, looking like this:

Notice how the default table is first sorted (just for ease of reading the data) and is contributing the first arrow to the Join.

Pay attention to the numbers on the arrows when using a Join

The number on an arrow tells you what number input it is for the steps its leading to. In a Left Join, the first input arrow should be your primary data, and all subsequent arrows should be the auxiliary data that you want to add to your primary data.

When we join the data, we can use the ID to match up the rows, and now we have a complete table.

Request extra information

Now we want some extra information about ticket availability. It looks like there is an endpoint where we can pass an id of an event and get that availability. The endpoint looks like this: https://app.ticketmaster.com/inventory-status/v1/availability?events={id}&apikey=OurAPIKey

When we want to get extra data from an API to fill out our table more we can use the API Enrichment step. This step takes in data from your table, makes a request per row of data, and adds that data to the row. Because we have event id's in the id column, we can request the availability of the tickets for each row by merging the id value into the URL we got from the API via the API Enrichment settings. Just replace the part of the URL that needs the column value with the column name in curly brackets (i.e. {id}).

Notice how the data that came back from that endpoint is appended to the table with columns that start with api. to indicate they are values brought in by the API Enrichment step.

Format and schedule our flow

At this point, we can do whatever formatting is leftover to make the exact report that we need. We will branch off again to create two views of the data.

The first, which is sent out via an Email Attachment step, is a list of all of the events that have tickets available. The second, which is updated in a Google Sheet, is a list of the events in the next 7 days.

Now that we are happy with how our flow functions and looks, we can publish it.

Publish a flow in order to run in or schedule it to run

A flow must be published in order to run it, or to set it up to run on a schedule.

After clicking the publish button, we will see that the right sidebar has been swapped out with a log of all runs for the flow, and a tab to view the schedule. When we click on the schedule tab, we can add a new schedule rule, which looks like this:

We can select the days, hours, and frequency of the runs of this flow. Once the schedule is set, we are done! And now we've saved ourselves countless hours by automating this portion on our daily work.

Updated 10 months ago

Hands on guide to connecting to an API


Step-by-step guide to accessing API's and working with their responses in Parabola

Suggested Edits are limited on API Reference Pages

You can only suggest edits to Markdown body content, but not to the API spec.