Spark and the Minor Planet Center data part 2

In the last post we read the Minor Planet Center orbit file, which was a JSON text file. This time we are going to look at a file that is a bit more complex to process. If you haven’t read the first post in this series, I recommend starting there before reading this. In this post we are going to be looking at the Observation file. There are two parts to this file: one for the numbered objects and the other for the un-numbered objects....

2017 December 3 · Emily Selwood

Spark and the Minor Planet Center data

Introduction A few weeks ago I saw comments between @Sondy and @JLGalache talking about getting a list of asteroids with their date of discovery. The main data file lists the year of discovery but not the actual date. I thought there was a way to get this information by looking at the observation file and joining it to the main data file. To do this I decided to use Apache Spark. In this post I’ll go through setting up the Spark environment and reading the JSON object file....

2017 December 2 · Emily Selwood

Introduction

Hello World I’m Emily Selwood. Every so often I get the urge to try and start a blog again. Here is iteration 235. What do I do? I build systems for a living. Mostly data processing, but I’ll happily get my hands dirty doing anything that needs to get done. I’ve worked on many things, from Unity and C# to C, Groovy, Go, JavaScript, and big data things like Accumulo and Spark....

2017 December 2 · Emily Selwood