Learn Parabola

Welcome to Learn Parabola! You'll find comprehensive guides and documentation to help you start working with Parabola as quickly as possible, as well as support if you get stuck. Let's jump right in!

Get Started    

Working with JSON data in Parabola

How to translate JSON data into relational, tabular data

JSON is a widely used data structure, and if you are interacting with API's, there is a good chance that you will come across JSON that you need to work with.

What is JSON and how to recognize its parts

It's not necessary to understand exactly how JSON works in order to use it in Parabola. There are a few patterns to remember, convenient ways to work with JSON in Parabola.

Traditional JSON may look like this:

{  
   "data":[  
      {  
         "id":1,
         "color":"blue",
         "material":"rubber",
         "size":5,
         "squeaks":true
      },
      {  
         "id":2,
         "color":"yellow",
         "material":"tennis",
         "size":9,
         "squeaks":false
      },
      {  
         "id":3,
         "color":"green",
         "material":"tennis",
         "size":3,
         "squeaks":false
      },
      {  
         "id":4,
         "color":"red",
         "material":"plastic",
         "size":13,
         "squeaks":true
      }
   ],
   "length":4,
   "has_more":false
}

Maybe this is what your API returns to Parabola. If so, it will likely be in a table that looks like this:

datalengthhas_more
[{"id":1,"color":"blue","material":"rubber","size":5,"sq...4false

Notice how that JSON blob was translated into 3 columns - data, length, and has_more. JSON is hierarchical, which allows you to infer relationships between the data present in the blob. The columns were created from all of the top level keys. Their values then stored in the cell.

If all of these terms don't make sense, then don't worry. We can define them.

When talking about JSON, there are a few terms to know:

TermDefinitionExample
BlobA chunk of JSONSee above code block - that is a blob
KeyThe name of the attribute that holds the value, like a column name{"key": "value"}
ValueThe value associated with that attribute, like a cell value. A value could be a simple bit of data or another object{"key": "value"}
Key:value pairA pair that represents a data point{"key": "value"}
ObjectAnything surrounded by {curly braces} - JSON is built out of nested objects{"key": "value","key": "value"}
ArrayA series of values or objects, comma separated, surrounded by [square brackets]. Represents a row of values.[{"id":1},{"id":2},{"id":3}]
Top level keyThe key(s) at the top level of the JSON blob.{"data": "value","length":4,"has_more":false}

When JSON is imported into Parabola, each top level key from the blob is set as a column header, and its values are set as the cell values.

This is why many times, during the initial import of JSON into Parabola, it will show up all in one cell, or all the relevant data in one cell with meta data in adjacent cells, just like our example.

datalengthhas_more
[{"id":1,"color":"blue","material":"rubber","size":5,"sq...4false

The keys called data, length, and has_more are all the top level keys in our example blob.

Notice that data contains the information we are most likely after - this can be discerned from the title of the column, as well as seeing that there is a lot of data held in that column.

Also, notice that the cell under data contains an array. The goal with importing JSON data into Parabola is to convert each array into its own table.

Using JSON from API's in Parabola

To help illustrate this process, you can use an API Import inside of Parabola to follow along with this example by placing this URL into the URL field https://slashtheapidog.com/api/balls and following the steps.

When you access that API endpoint for the first time, you will see the same example from this article - a table with 1 row and 3 columns.

datalengthhas_more
[{"id":1,"color":"blue","material":"rubber","size":5,"sq...4false

We want that table in the data column. You'll notice that now there is an option on the settings bar of your API Import that says "Add Top Level Key". Click that, and select "data". Adding that top level key will tell the import to move its frame of reference down one layer in the JSON blob's hierarchy. The result will look like this:

idcolormaterialsizesqueaks
1bluerubber5true
2yellowtennis9false
3greentennis3false
4redplastic13true

Our table has been formed from the JSON array! Now that looks like something usable.

This is fairly simply JSON, and you may have to work with it more to get what you need. Sometimes you need a top level key, and a nested key. That just takes you deeper than the top level key by one layer per key so that you can drill into the data you need.

What if, after using a top level key, there are still JSON objects in the data? Then we can use a JSON flattener to move the data into the format that we want. For example, what if you had a table that looked like this:

idball_data
1{"color":"blue","size":5}
2{"color":"yellow","size":9}
3{"color":"green","size":3}
4{"color":"red","size":13}

Notice that there are values inside the ball_data column that should be other column headers. A top level key would not be appropriate here since we already have reached and unpacked the array that we want. There is just some nested data that we want to break out - or flatten.

Using the JSON flattener, we can simply target the ball_data column and let it do the rest. It will take all keys, set them as new column headers, and fill in the values to the cells. Resulting in a table like this:

idcolorsize
1blue5
2yellow9
3green3
4red13

Principles of converting JSON to relational tables

When presented with an arbitrary JSON blob, there are a few generic rules that you can apply to understand how it should be represented as a table. Most of this is handled by Parabola, but it can be helpful to understand what is happening under the hood.

As discussed, a table in Parabola is equivalent to an array in JSON. That table is most likely composed of key:value pairs - the value equating to a cell, and the key equating to a column name, also known as an attribute.

Knowing that, we can take an array of JSON objects, and know what the column headers will be, what the cell values will be, and how many rows there will be.

JSON with many arrays

Because of the flexibility of the JSON spec, you may encounter arrays that contain other arrays, or multiple arrays on the same level in the blob. How do we handle multiple arrays, if a single array constitutes an entire table.

When you see multiple arrays, it's time to create some branches in your flow. Each array can be its own table.

👍

Break all arrays out into their own tables

Using column filters to remove the other columns containing arrays, create branches that that can be expanded into tables

Consider this example data that has 3 array columns, and a few normal columns. This is the result from an API that was imported and a top level key was added. So we have our table, but we have some additional arrays to deal with.

idprimary_emailother_emailsaliasesaccounts
1[email protected][{"email":"[email protected][{"alias":"BG","alias...[{"account":1123,"acc...
2[email protected][{"email":"[email protected][{"alias":"JB","alias...[{"account":4354,"acc...
3[email protected][{"email":"[email protected][{"alias":"MB","alias...[{"account":8798,"acc...

Because we've already used our top level key to get to the primary array of the table, I want to keep the scope of this table where it is. But the columns for other_emails, aliases, and accounts are not in a usable state right now.

Because we don't know how many entries are in each of those arrays, we need to split them into new tables. They are, however, related to my primary data. The cell containing the other emails for id 1 are related to id 1, to [email protected], and to the other arrays in that row. I need to retail a unique identifier that I can use to tie this array back to my original table after I have moved it.

📘

Always keep a unique identifying column on all tables when breaking apart arrays

Because JSON is hierarchical, and therefore has relationships between the data present, your final tables should be able to relate to each other using keys that are common between the tables.

To break apart this table, we can branch off of the API Import with 4 column filters - one for each array column plus one for the original table. It would look like this:

Branching off of the API Import to accommodate all of the JSON arrays

The Original table filter should only keep the ID and the primary email - essentially it keeps all columns that are not array columns.

The others would keep a single array column, such as other emails, and the ID column. That ID will be what we use to relate these tables later when working in Parabola.

Taking the Other Emails branch as our example, the resulting table will look like this:

idother_emails
1[{"email":"[email protected]
2[{"email":"[email protected]
3[{"email":"[email protected]

We know that we want to do something to the data to make it more usable, and we know that it's an array of JSON objects. In the past, we have used top level keys to access further arrays, but that won't work here. Also in the past we have used the JSON flattener to manage JSON objects - let's try that.

If you apply the JSON flattener to the other_emails column, you would get an odd looking result. You will end up with many email columns, which is not as helpful as 1 email column.

Each cell holds an array of emails, and each cell could have 1 email in its array or 100 emails. The JSON flattener creates a new column for each key, so if any of the arrays has 100 emails, we will get 100 new email columns that look like: email [0], email [1], email [2], and so on.

🚧

Look for column names with a number inside [square brackets]

A sure sign that you have flattened an array is to find that your columns have repeating names followed by numbers in [square braces]. When you see this, use the setting in your JSON flattener to create rows from the array

After the JSON flattener flattens an array, a new option will show up in the settings side bar. This setting will allow you to select the array and tell it to create new rows instead of new columns. The resulting table will look like this:

Look closely - we started out with only 1 row for each ID, and now we have duplicate ID's, but with unique email addresses. This is a table of other email addresses that our other table relates to by ID, and its all contained in one column.

The JSON flattener took each array, and created new rows for each key:value pair within the array, and duplicated the non-array information (only ID in this case) when it created each row.

Repeat this process for all of your branches, and you will have a grouping of 4 tables, all of which are usable and free of JSON, and all of which relate to each other.

JSON best practices in Parabola

Use these maxims to ensure that you are handling your JSON data appropriately in Parabola:

  • Always use a top level key and any nested keys needed in your API Import to create a table from its results.
  • Never leave raw JSON in your data. Use the JSON flattener to convert remaining JSON objects into columns and cells.
  • Never expand multiple array columns in the same table, always branch them off first.
  • Always expand arrays into new columns with the JSON flattener
  • Never use tables that have flattened arrays (columns names with [0] [1] [2] etc. in them)

Updated about a year ago

Working with JSON data in Parabola


How to translate JSON data into relational, tabular data

Suggested Edits are limited on API Reference Pages

You can only suggest edits to Markdown body content, but not to the API spec.